Cheatsheets

Personal collection of cheatsheets.

Streaming

Streaming is the process of transmitting audio and video data in a continuous flow over a wired or wireless internet connection.

Applications

Streaming applications are software programs that let users play streams or stream their own content over the internet. They handle the transmission and playback of audio and video data, so users can watch audiovisual content from anywhere with an internet connection.

There are many streaming applications available, including:

Media Servers

Media servers are software programs that deliver video and audio content to clients who request it. The most common use of media servers is to deliver video on demand (VOD), in which the media server retrieves prerecorded video content from storage and delivers it across the Internet. Live streaming media servers deliver content as it is generated in real time or with only a slight delay.

There are many streaming media servers available, including:

Codecs

Codecs are devices or computer programs that encode or decode data streams or signals. Lossy codecs rely on quantization, which maps input values from a large set (often continuous) to output values in a smaller countable set (often finite). The greater the quantization step, the lower the quality of the encoded video (lower peak signal-to-noise ratio, PSNR) and the lower the bitrate. Greater quantization also comes with lower computational complexity.
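The trade-off between quantization step and quality can be sketched with a uniform quantizer. This is only an illustration, not how any particular codec quantizes: the sample values and step sizes are made up, and real encoders quantize transform coefficients rather than raw samples.

```python
import math

def quantize(samples, step):
    """Map each sample to the nearest multiple of `step` (uniform quantization)."""
    return [round(s / step) * step for s in samples]

def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; higher means closer to the original."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical 8-bit sample values, chosen only for illustration.
samples = [12, 57, 83, 101, 146, 190, 203, 251]

for step in (4, 16, 64):
    q = quantize(samples, step)
    # Fewer distinct output levels means fewer bits are needed to code them.
    print(f"step={step:2d} levels={len(set(q))} PSNR={psnr(samples, q):.1f} dB")
```

A larger step leaves fewer distinct levels to encode, which lowers the bitrate, at the cost of a larger reconstruction error.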

AVC/H.264

Advanced Video Coding (AVC), also known as H.264, is a video compression standard based on block-oriented, motion-compensated coding. It is the most commonly used format for the recording, compression, and distribution of video content, but it is not well suited to 4K streaming, where its compression efficiency leads to high bandwidth demands compared with newer codecs. It defines many profiles and levels, and not every encoder or decoder supports every profile and level.

profile-level-id

Constrained Baseline
Decoders conforming to the Constrained Baseline profile at a specific level shall be capable of decoding all bitstreams in which all of the following are true:

Examples
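As one hedged example, the `profile-level-id` value that appears in SDP for H.264 is a string of six hexadecimal digits that packs three bytes: `profile_idc`, the constraint flags (profile-iop), and `level_idc`. The sketch below splits it and checks for Constrained Baseline, assuming the usual rule that Constrained Baseline is `profile_idc` 66 with `constraint_set1_flag` set. The function names are illustrative, and the special case of level 1b is ignored.

```python
def parse_profile_level_id(value):
    """Split a 6-hex-digit profile-level-id into its three bytes."""
    if len(value) != 6:
        raise ValueError("profile-level-id must be 6 hex digits")
    profile_idc, profile_iop, level_idc = bytes.fromhex(value)
    return {
        "profile_idc": profile_idc,
        # constraint_set0_flag is the most significant bit of profile-iop,
        # so constraint_set1_flag is the next bit down (0x40).
        "constraint_set1_flag": bool(profile_iop & 0x40),
        # level_idc is ten times the level number (31 means level 3.1).
        "level": level_idc / 10,
    }

def is_constrained_baseline(value):
    """True when the id signals profile_idc 66 with constraint_set1_flag set."""
    fields = parse_profile_level_id(value)
    return fields["profile_idc"] == 66 and fields["constraint_set1_flag"]

print(parse_profile_level_id("42e01f"))
print(is_constrained_baseline("42e01f"))
```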

HEVC/H.265

High Efficiency Video Coding (HEVC), also known as H.265, is a video compression standard designed as a successor to the widely used AVC. In comparison to AVC, HEVC offers from 25% to 50% better data compression at the same level of video quality, or substantially improved video quality at the same bit rate. It supports resolutions up to 8192x4320, including 8K UHD.

Transport

Transport protocols are standardized methods of delivering different types of media over the internet. They send chunks of content from one endpoint to another and define the method for reassembling these chunks into playable content on the other endpoint.

RTP

Real-time Transport Protocol (RTP) is a network protocol used in communication and entertainment systems that involve streaming media.
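Every RTP packet starts with a 12-byte fixed header that carries the sequence number and timestamp a receiver uses to reorder and play out media. A minimal sketch of parsing it, following the RFC 3550 layout; the sample packet values are made up, and CSRC entries and header extensions are not handled.

```python
import struct

def parse_rtp_header(packet):
    """Parse the 12-byte fixed RTP header from the start of `packet`."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

# Hypothetical packet: version 2, dynamic payload type 96, no payload bytes.
sample = struct.pack("!BBHII", 0x80, 96, 1000, 90000, 0x1234ABCD)
print(parse_rtp_header(sample))
```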

RTSP

Real Time Streaming Protocol (RTSP) is an application-level network protocol designed for multiplexing and packetizing multimedia transport streams. The transmission of the streaming data itself is not a task of RTSP; most media servers use RTP in conjunction with RTCP for media stream delivery. Clients of media servers issue commands such as PLAY, RECORD, and PAUSE to control the media stream in real time. The well-known TCP port for RTSP traffic is 554. The most common use case of RTSP is streaming from IP cameras.
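RTSP requests are text messages that resemble HTTP: a request line, a `CSeq` header that numbers the request, optional headers, and a blank line. The sketch below only builds the request text; the camera URL and session identifier are hypothetical, and a real client would also need to send it over a TCP socket and parse the responses.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Return the text of an RTSP/1.0 request with CRLF line endings."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"

# Hypothetical camera URL on the well-known RTSP port.
url = "rtsp://camera.example.com:554/stream"

print(build_rtsp_request("OPTIONS", url, 1))
print(build_rtsp_request("PLAY", url, 2, {"Session": "12345678"}))
```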

RTMP

Real-Time Messaging Protocol (RTMP) is a communication protocol for streaming audio, video, and data over the Internet that works on top of TCP and uses port number 1935 by default.
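Because RTMP URLs usually omit the port, a client has to fall back to 1935 when resolving the endpoint. A small sketch with the standard library; the host name and stream path are hypothetical.

```python
from urllib.parse import urlsplit

RTMP_DEFAULT_PORT = 1935

def rtmp_endpoint(url):
    """Return (host, port) for an rtmp:// URL, applying the default port."""
    parts = urlsplit(url)
    if parts.scheme != "rtmp":
        raise ValueError(f"not an RTMP URL: {url}")
    return parts.hostname, parts.port or RTMP_DEFAULT_PORT

print(rtmp_endpoint("rtmp://live.example.com/app/stream-key"))
print(rtmp_endpoint("rtmp://live.example.com:1936/app/stream-key"))
```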

HLS

HTTP Live Streaming (HLS) is an HTTP-based adaptive bitrate streaming communications protocol. It resembles DASH in that it works by breaking the overall stream into a sequence of small HTTP-based file downloads, each containing one short chunk of an overall, potentially unbounded, transport stream. A list of available streams, encoded at different bit rates, is sent to the client using an extended M3U playlist.
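To illustrate how a client might use that playlist, the sketch below parses the `#EXT-X-STREAM-INF` entries of a made-up master playlist and picks the highest-bitrate variant that fits the available bandwidth. The playlist contents, URIs, and selection rule are assumptions for illustration; real players use more elaborate heuristics.

```python
import re

# Hypothetical master playlist; paths and bitrates are illustrative only.
MASTER_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2"
high/index.m3u8
"""

# Attribute values are either quoted strings (which may contain commas) or bare tokens.
ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

def parse_variants(text):
    """Return one dict per variant stream listed in a master playlist."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    variants = []
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:") and i + 1 < len(lines):
            attrs = dict(ATTRIBUTE.findall(line.split(":", 1)[1]))
            variants.append({
                "bandwidth": int(attrs["BANDWIDTH"]),
                "resolution": attrs.get("RESOLUTION"),
                "uri": lines[i + 1],  # the URI follows the tag line
            })
    return variants

def pick_variant(variants, available_bps):
    """Pick the best variant that fits, or the smallest one if none fits."""
    fitting = [v for v in variants if v["bandwidth"] <= available_bps]
    if fitting:
        return max(fitting, key=lambda v: v["bandwidth"])
    return min(variants, key=lambda v: v["bandwidth"])

variants = parse_variants(MASTER_PLAYLIST)
print(pick_variant(variants, 3_000_000)["uri"])
```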

DASH

Dynamic Adaptive Streaming over HTTP (DASH), also known as MPEG-DASH, is an adaptive bitrate streaming technique that enables high quality streaming of media content over the Internet delivered from conventional HTTP web servers. Similar to HLS, DASH works by breaking the content into a sequence of small segments, which are served over HTTP.
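DASH manifests commonly describe segments with a template rather than listing every URL, using identifiers such as `$RepresentationID$` and `$Number$`. A minimal sketch of expanding such a template into segment URLs; the template string, representation name, and durations are hypothetical, and other identifiers (for example `$Time$` or width-formatted numbers) are not handled.

```python
import math

def segment_urls(template, representation_id, start_number, duration_s, segment_s):
    """Expand a SegmentTemplate-style string into the list of segment URLs."""
    count = math.ceil(duration_s / segment_s)  # the last segment may be shorter
    return [
        template.replace("$RepresentationID$", representation_id)
                .replace("$Number$", str(number))
        for number in range(start_number, start_number + count)
    ]

# Hypothetical 10-second clip cut into 4-second segments.
for url in segment_urls("video/$RepresentationID$/seg-$Number$.m4s", "720p", 1, 10, 4):
    print(url)
```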

SRT

Secure Reliable Transport (SRT) is an open source transport protocol that provides connection management, control, and reliable transmission at the application layer, using UDP as the underlying transport layer. It supports packet recovery while maintaining low latency (120 ms by default) and also supports encryption using AES. It has three working modes:

NDI

Network Device Interface (NDI) is a royalty-free software standard developed by NewTek to enable video-compatible products to communicate, deliver, and receive high-definition video over a computer network in a high-quality, low-latency manner that is frame accurate and suitable for switching in a live production environment.

Topologies

Mesh

In a mesh topology each node is directly connected to every other node. Each node sends its streams to every single node and downloads the streams from every node.

For a session with N nodes the total number of connections is O(N²).

Nodes: N
Uplinks: N(N-1)
Downlinks: N(N-1)
Uplinks per node: N-1
Downlinks per node: N-1

Pros:

Cons:

MCU

In a Multipoint Conferencing Unit (MCU) topology each node is connected to the MCU server. With an MCU, each node uploads its stream once; the server decodes the streams of all the nodes, mixes them into one, and encodes the result to send it back to each node.

For a session with N nodes the total number of connections is O(N).

Nodes: N
Uplinks: N
Downlinks: N
Uplinks per node: 1
Downlinks per node: 1

Pros:

Cons:

SFU

In a Selective Forwarding Unit (SFU) topology each node is connected to the SFU server. With an SFU, each node uploads its stream once and the server forwards that stream, without decoding or mixing it, to every other node.

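For a session with N nodes the total number of streams is O(N²): each node holds a single connection to the server, but the server forwards N(N-1) downlink streams in total.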

Nodes: N
Uplinks: N
Downlinks: N(N-1)
Uplinks per node: 1
Downlinks per node: N-1
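The stream counts listed for the mesh, MCU, and SFU topologies can be compared side by side with a small helper. This is only a sketch of the formulas from this section; the function name and the session size are illustrative.

```python
def stream_counts(topology, n):
    """Return uplink and downlink stream counts for a session of n nodes."""
    per_node = {
        "mesh": (n - 1, n - 1),  # every node sends to and receives from every other node
        "mcu": (1, 1),           # one upload, one mixed download
        "sfu": (1, n - 1),       # one upload, one forwarded stream per other node
    }
    if topology not in per_node:
        raise ValueError(f"unknown topology: {topology}")
    up, down = per_node[topology]
    return {
        "uplinks": n * up,
        "downlinks": n * down,
        "uplinks_per_node": up,
        "downlinks_per_node": down,
    }

for topology in ("mesh", "mcu", "sfu"):
    print(topology, stream_counts(topology, 4))
```

The comparison shows where each topology puts the load: mesh on every node's uplink, MCU on the server's CPU, and SFU on each node's downlink.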

Pros:

Cons:

Bandwidth Strategies

In video streaming, bandwidth usage directly impacts the resolution, clarity, and overall viewing experience. Higher resolutions such as high definition (HD) require more bandwidth for smooth playback than lower ones such as standard definition (SD).

For example:

Simulcast

Simulcast allows peers to publish multiple versions of the same stream with different spatial or temporal encodings, effectively sending more data.
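On the receiving side, a forwarding server can then pick, per subscriber, the version that fits that subscriber's bandwidth. The sketch below shows one simple selection rule; the layer names, resolutions, and bitrates are hypothetical values chosen for illustration, not measurements.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Layer:
    name: str
    width: int
    height: int
    bitrate_kbps: int

# Hypothetical simulcast layers published by one peer.
LAYERS = [
    Layer("low", 320, 180, 150),
    Layer("medium", 640, 360, 500),
    Layer("high", 1280, 720, 2500),
]

def publisher_uplink_kbps(layers):
    """Total uplink cost for the publisher: every layer is sent at once."""
    return sum(layer.bitrate_kbps for layer in layers)

def select_layer(layers, subscriber_kbps):
    """Pick the highest-bitrate layer that fits, or the lowest if none fits."""
    fitting = [layer for layer in layers if layer.bitrate_kbps <= subscriber_kbps]
    if fitting:
        return max(fitting, key=lambda layer: layer.bitrate_kbps)
    return min(layers, key=lambda layer: layer.bitrate_kbps)

print(publisher_uplink_kbps(LAYERS))
print(select_layer(LAYERS, 1000).name)
```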

Spatial

With spatial scalability the lower resolution layers consume less bandwidth than the high resolution ones.

For example:

The peer uses just 17% more bandwidth to publish the three layers.

Temporal

With temporal scalability it is possible to lower a stream's bitrate by dynamically reducing the stream's frame rate.

Streams contain mostly delta frames, which encode only the changes relative to earlier frames and ultimately depend on a previous key frame. If a frame that a delta frame refers to was dropped, the decoder can't render the subsequent frames.

When temporal layers are used, frames from the base layer only reference other base layer frames.

For a subscriber with limited bandwidth, it is possible to send only the frames of a specific temporal layer, effectively reducing bandwidth.
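A minimal sketch of this idea, assuming a dyadic pattern with three temporal layers (a common arrangement, but an assumption here): layer 0 carries every fourth frame, layer 1 adds the frames halfway between, and layer 2 adds the rest. Forwarding only layers up to a chosen maximum halves the frame rate for each layer dropped.

```python
def temporal_layer(frame_index):
    """Assign a temporal layer id using an assumed dyadic 3-layer pattern."""
    if frame_index % 4 == 0:
        return 0  # base layer: references only other base layer frames
    if frame_index % 2 == 0:
        return 1
    return 2

def frames_for_subscriber(frame_count, max_layer):
    """Indices of the frames forwarded when layers above max_layer are dropped."""
    return [i for i in range(frame_count) if temporal_layer(i) <= max_layer]

for max_layer in (2, 1, 0):
    print(max_layer, frames_for_subscriber(8, max_layer))
```

Because no forwarded frame references a frame from a higher layer, the decoder can still render every frame it receives.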

Scalable Video Coding

Scalable Video Coding (SVC) is a video compression standard that defines encoding of a high-quality video bitstream that also contains one or more subset bitstreams (a form of layered coding). A subset video bitstream is derived by dropping packets from the larger video to reduce the bandwidth required for the subset bitstream. The subset bitstream can represent a lower spatial resolution (smaller screen), lower temporal resolution (lower frame rate), or lower quality video signal.
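Deriving a subset bitstream by dropping packets can be sketched as a filter over packets tagged with layer identifiers. The `Packet` type, its fields, and the sample sizes are hypothetical; real SVC bitstreams signal layer ids in codec-specific headers.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    spatial_id: int   # 0 is the lowest resolution
    temporal_id: int  # 0 is the lowest frame rate
    size_bytes: int

def extract_substream(packets, max_spatial, max_temporal):
    """Keep only the packets at or below the requested layers."""
    return [
        p for p in packets
        if p.spatial_id <= max_spatial and p.temporal_id <= max_temporal
    ]

# Hypothetical packets from a stream with 2 spatial and 2 temporal layers.
packets = [
    Packet(0, 0, 400), Packet(1, 0, 1200),
    Packet(0, 1, 200), Packet(1, 1, 600),
]

full = sum(p.size_bytes for p in packets)
subset = sum(p.size_bytes for p in extract_substream(packets, 0, 1))
print(full, subset)
```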
