Media Streaming
MPEG-DASH
Arnav Kumar
Senior Software Engineer, BrowserStack
Media Streaming
History
MPEG Transport Stream (TS)
- Digital Video Broadcasting (DVB), IPTV
- Differs from the MPEG Program Stream (PS), which targets reliable storage media
- Packetized Elementary Stream (PES)
- Error correction and synchronization pattern
- Constant Bitrate (CBR) to maintain a consistent broadcast rate
TS Problems
- Broadcaster Controlled - Clients Join, Wait, Buffer
- For managed networks with controlled Quality of Service (QoS)
- Failure to sustain bandwidth: video would stutter, stop, and then start over
- Rigidity towards codecs
- CBR
- Tightly Coupled
RTMP/Flash to Rescue
- Proprietary protocol by Macromedia (Adobe) in conjunction with Flash
- Application layer - TCP (Port 1935)
- Encapsulates MP3/AAC audio and MP4/FLV video (& RPC) using Action Message Format (AMF)
- Stream fragment size negotiated dynamically between client and server
- RTMP Tunneled (RTMPT), RTMP data is encapsulated and exchanged via HTTP
Proprietary!
- Bandwidth must be larger than the video bit rate
- No easy way to switch bitrates
- Protocol was reverse engineered before Adobe released spec
- Cannot add support for other formats/codecs
- DRM Limitations
- Married to FLASH!!
Wait a minute 🤔
- What if I
- serve "web optimized" `video/mp4` or `video/webm`
- via webserver with `Accept-Ranges: bytes`
- and voila, it works as media playback in browsers is sophisticated enough to download video in chunks
- The metadata (moov), which describes the whole container including its size, sits at the beginning of a "web optimized" media file, so, dang, it won't work with Live!
- Still no solution for bitrate switching
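To make the chunked download idea concrete, here is a minimal sketch of how a client could split a file into HTTP `Range` header values once the server advertises `Accept-Ranges: bytes`. The function name `plan_byte_ranges` and the chunk size are illustrative assumptions; real browsers choose their own ranges.

```python
from typing import List


def plan_byte_ranges(total_size: int, chunk_size: int) -> List[str]:
    """Return Range header values that cover a file of total_size bytes.

    Hypothetical helper for illustration only.
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    for start in range(0, total_size, chunk_size):
        # Byte ranges are inclusive on both ends, hence the "- 1".
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


# A 10-byte file fetched in 4-byte chunks needs three requests.
print(plan_byte_ranges(10, 4))
```

Each value would be sent as the `Range` header of a separate GET request. Note that this only splits the download; it does nothing for live content or bitrate switching, which is the point of the last two bullets.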
Enter Adaptive bitrate streaming (ABR)
- Builds on segmenting media in several bitrates, and delivering a timestamped manifest (playlist) of gapless segments
- Single-vendor controlled solutions
- Independent Implementations
- Apple HTTP Live Streaming (HLS)
- Microsoft Smooth Streaming (MSS)
- Adobe HTTP Dynamic Streaming (HDS)
- Flexible CDN
- Coupled codecs, DRM, segmentation
- Playback (clients) limited to ecosystems
- Independent Implementations
ABR Overview
MPEG-DASH
- Dynamic Adaptive Streaming over HTTP
- Independent, open and international standard
- Phases out Flash for HTML5 Media Source Extensions (MSE) of W3C
- Standard DRM via HTML5 Encrypted Media Extensions (EME)
- Uses TCP, delivered over conventional HTTP servers
- Codec-agnostic
Secret Sauce
- manifest file (.mpd)
- the segments
- (short) individual files - Livestream
- byte-range on a static file
- MSE
download segments of video at an appropriate bitrate, and feed them to a video element when it gets hungry — using existing HTTP infrastructure
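Picking "an appropriate bitrate" can be sketched as a simple throughput rule: choose the highest advertised bitrate that fits within a safety fraction of the measured bandwidth. This is an assumption-laden illustration, not the algorithm used by dash.js or any particular player; `pick_bitrate` and the `safety` factor of 0.8 are hypothetical.

```python
from typing import Sequence


def pick_bitrate(available_bps: Sequence[int], measured_bps: float,
                 safety: float = 0.8) -> int:
    """Pick the highest bitrate that fits within a share of measured bandwidth.

    Falls back to the lowest bitrate when nothing fits, so playback can
    continue at reduced quality instead of stalling.
    """
    if not available_bps:
        raise ValueError("at least one bitrate is required")
    budget = measured_bps * safety
    fitting = [b for b in available_bps if b <= budget]
    return max(fitting) if fitting else min(available_bps)


ladder = [300_000, 1_500_000, 3_000_000]   # made-up bitrate ladder
print(pick_bitrate(ladder, 2_000_000))     # budget is 1.6 Mbps
print(pick_bitrate(ladder, 100_000))       # nothing fits, use the lowest
```

A player would re-run such a rule before each segment request, using the download speed of recent segments as `measured_bps`.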
MPD - Media Presentation Description
- XML maps out
- dynamic (live) / static
- all available stream URLs, bitrates, dimensions & codecs
- start time, segment timings (as delta to the start time)
- min buffer time, mpd update time (avoiding manifest download with every segment), timescale
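As a rough illustration of what a client reads out of the MPD, the sketch below parses a tiny hand-written manifest with Python's standard `xml.etree.ElementTree`. The sample manifest, its codec strings, and the helper name `summarize_mpd` are made up for this example; real manifests carry many more attributes.

```python
import xml.etree.ElementTree as ET

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hand-written sample, not taken from a real stream.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic"
     availabilityStartTime="2020-01-01T00:00:00Z"
     minBufferTime="PT2S" minimumUpdatePeriod="PT10S">
  <Period id="0" start="PT0S">
    <AdaptationSet mimeType="video/mp4">
      <SegmentTemplate timescale="1000" duration="2000" startNumber="1"
                       media="video/$RepresentationID$/segment$Number$.mp4"/>
      <Representation id="240p" bandwidth="400000" width="426" height="240"
                      codecs="avc1.42c015"/>
      <Representation id="720p" bandwidth="3000000" width="1280" height="720"
                      codecs="avc1.64001f"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def summarize_mpd(xml_text: str) -> dict:
    """Extract the presentation type, timing hints and representations."""
    root = ET.fromstring(xml_text)
    representations = []
    for aset in root.iterfind("mpd:Period/mpd:AdaptationSet", NS):
        for rep in aset.iterfind("mpd:Representation", NS):
            representations.append({
                "id": rep.get("id"),
                "mimeType": aset.get("mimeType"),
                "bandwidth": int(rep.get("bandwidth")),
                "codecs": rep.get("codecs"),
            })
    return {
        "type": root.get("type", "static"),  # "static" is the default
        "minBufferTime": root.get("minBufferTime"),
        "minimumUpdatePeriod": root.get("minimumUpdatePeriod"),
        "representations": representations,
    }


summary = summarize_mpd(SAMPLE_MPD)
print(summary["type"], [r["id"] for r in summary["representations"]])
```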
./audio/
├── ./esperanto/
|   ├── ./128kbps/
|   |   ├── segment0.mp4
|   |   ├── segment1.mp4
|   |   └── segment2.mp4
|   └── ./320kbps/
|       ├── segment0.mp4
|       ├── segment1.mp4
|       └── segment2.mp4
└── ./french/
    ├── ./128kbps/
    |   ├── segment0.mp4
    |   ├── segment1.mp4
    |   └── segment2.mp4
    └── ./320kbps/
        ├── segment0.mp4
        ├── segment1.mp4
        └── segment2.mp4
./video/
├── ./240p/
|   ├── segment0.mp4
|   ├── segment1.mp4
|   └── segment2.mp4
└── ./720p/
    ├── segment0.mp4
    ├── segment1.mp4
    └── segment2.mp4
Segment Hierarchy
Live MPD Example
Media Source Extensions (MSE)
- We can load, decode and play media simply by providing a src URL to an HTMLMediaElement:
<video src='foo.webm'></video>
- The Media Source API is an extension to HTMLMediaElement
- It enables more fine-grained control over the source of media by allowing JavaScript to build streams for playback from 'chunks' of video
- This in turn enables adaptive streaming and time shifting
DRM Implementation - Widevine
Production Strategy
- Using ffmpeg to generate segments and descriptors from either a media file or various live inputs
- Using any HTTP server to serve them statically
# Capture 720p MJPEG video (v4l2) and stereo audio (ALSA), encode to VP9 and Vorbis,
# and write live WebM chunks plus one header file per stream.
# VP9_LIVE_PARAMS is a shell variable holding extra encoder options; it is not defined on this slide.
ffmpeg \
  -f v4l2 -input_format mjpeg -r 30 -s 1280x720 -i /dev/video0 \
  -f alsa -ar 44100 -ac 2 -i hw:2 -map 0:0 -pix_fmt yuv420p \
  -c:v libvpx-vp9 -s 1280x720 -keyint_min 60 -g 60 ${VP9_LIVE_PARAMS} -b:v 3000k \
  -f webm_chunk -header "/var/www/webm_live/glass_360.hdr" -chunk_start_index 1 \
  /var/www/webm_live/glass_360_%d.chk \
  -map 1:0 \
  -c:a libvorbis -b:a 128k -ar 44100 \
  -f webm_chunk \
  -audio_chunk_duration 2000 \
  -header "/var/www/webm_live/glass_171.hdr" \
  -chunk_start_index 1 \
  /var/www/webm_live/glass_171_%d.chk
Sample ffmpeg - Header Generation
# Combine the video and audio header files into one live DASH manifest,
# with one adaptation set per stream (id=0 video, id=1 audio).
ffmpeg \
  -f webm_dash_manifest -live 1 \
  -i /var/www/webm_live/glass_360.hdr \
  -f webm_dash_manifest -live 1 \
  -i /var/www/webm_live/glass_171.hdr \
  -c copy \
  -map 0 -map 1 \
  -f webm_dash_manifest -live 1 \
  -adaptation_sets "id=0,streams=0 id=1,streams=1" \
  -chunk_start_index 1 \
  -chunk_duration_ms 2000 \
  -time_shift_buffer_depth 7200 \
  -minimum_update_period 7200 \
  /var/www/webm_live/glass_live_manifest.mpd
Sample ffmpeg - Generate MPD
Consumption Strategy
- For browsers, use dash.js from DASH Industry Forum
- Fetch the MPD continually, respecting the min update time
- Maintain the segment tree and timestamp association from the start time
- Mux the required audio & video segments
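The timestamp association from the start time can be sketched as follows: given the stream's start time and a fixed segment duration, compute the newest fully available segment and expand a URL template for it. This is a simplified sketch that assumes constant-duration, number-based segments and ignores period offsets and clock drift; both helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def live_edge_segment(availability_start: datetime, now: datetime,
                      segment_duration_s: float,
                      start_number: int = 1) -> Optional[int]:
    """Return the number of the newest complete segment, or None if none yet."""
    elapsed = (now - availability_start).total_seconds()
    completed = int(elapsed // segment_duration_s)
    if completed < 1:
        return None  # the first segment is still being produced
    return start_number + completed - 1


def expand_template(media: str, representation_id: str, number: int) -> str:
    """Fill in the $RepresentationID$ and $Number$ template identifiers."""
    return (media.replace("$RepresentationID$", representation_id)
                 .replace("$Number$", str(number)))


start = datetime(2020, 1, 1, tzinfo=timezone.utc)
edge = live_edge_segment(start, start + timedelta(seconds=7), 2.0)
print(edge)  # segments 1..3 are complete after 7 s of 2 s segments
print(expand_template("video/$RepresentationID$/segment$Number$.mp4", "720p", edge))
```

Between manifest refreshes, a client can keep deriving new segment URLs this way, which is why the manifest does not need to be downloaded with every segment.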
Misc. Ops.
- Mirroring
- Fetch the manifest similar to consumption
- Instead of selectively muxing segments, fetch all available segments, mirror them
- Update the .mpd segment URLs to point to the mirrored segments
- Ad Insertion
- Fetch source manifest & mirror it with the source segments
- When ad needs to be inserted, stop serving the mirrored manifest, and insert the ad manifest
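For the mirroring step, one possible way to repoint a manifest is to set its top-level `BaseURL`, sketched below with the standard library. This assumes the segment URLs inside the manifest are relative, so that changing `BaseURL` redirects all of them; manifests with absolute segment URLs or nested `BaseURL` elements would need more work. The function name and the example hosts are hypothetical.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"


def point_mpd_at_mirror(xml_text: str, mirror_base_url: str) -> str:
    """Return the manifest with its top-level BaseURL set to the mirror."""
    # Keep the DASH namespace as the default one when serializing.
    ET.register_namespace("", DASH_NS)
    root = ET.fromstring(xml_text)
    tag = "{%s}BaseURL" % DASH_NS
    existing = root.findall(tag)  # direct children of <MPD> only
    if existing:
        for element in existing:
            element.text = mirror_base_url
    else:
        base = ET.Element(tag)
        base.text = mirror_base_url
        root.insert(0, base)
    return ET.tostring(root, encoding="unicode")


source = ('<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic">'
          '<BaseURL>https://origin.example/live/</BaseURL>'
          '<Period id="0"/></MPD>')
print(point_mpd_at_mirror(source, "https://mirror.example/live/"))
```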
Clipping
- Keep a map of timestamps and segments corresponding to available qualities
- Identify the series of segments that need to be merged, per the clipping request's time range and quality
- Crop the start and end segment as per requirement
- Merge & mux them appropriately
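The segment selection and cropping steps above can be sketched as simple arithmetic, assuming all segments share one fixed duration and are numbered from `first_number` (0 here, matching `segment0.mp4`). The helper `plan_clip` is hypothetical; the actual cropping and muxing would be done by a tool such as ffmpeg.

```python
import math


def plan_clip(clip_start_s: float, clip_end_s: float,
              segment_duration_s: float, first_number: int = 0) -> dict:
    """Work out which segments a clip needs and how much to crop at each end."""
    if segment_duration_s <= 0 or not (0 <= clip_start_s < clip_end_s):
        raise ValueError("need 0 <= start < end and a positive segment duration")
    first = int(clip_start_s // segment_duration_s)
    last = math.ceil(clip_end_s / segment_duration_s) - 1
    return {
        "segments": [first_number + i for i in range(first, last + 1)],
        # Seconds to drop from the beginning of the first segment.
        "trim_head_s": clip_start_s - first * segment_duration_s,
        # Seconds to drop from the end of the last segment.
        "trim_tail_s": (last + 1) * segment_duration_s - clip_end_s,
    }


# Clip from 3 s to 9 s out of 2 s segments.
print(plan_clip(3, 9, 2))
```

For the 3 s to 9 s request, segments 1 through 4 are needed, with one second cropped from each end.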
FIN
arnav@arnav.at