Cheatsheets

Personal collection of cheatsheets.

WebRTC

Web Real-Time Communication (WebRTC) is a free and open-source project providing web browsers and mobile applications with real-time peer-to-peer communications. Its specification is a cooperative effort between the World Wide Web Consortium (W3C), which defines the APIs, and the Internet Engineering Task Force (IETF), which standardizes the protocols.

Signaling

WebRTC uses a peer-to-peer distributed architecture. Although the public APIs and protocols are standardized, the initial negotiation and connection establishment are left to the application. This initial handshake covers simple tasks, such as letting one peer know that the other is calling, and more complex ones, such as establishing a unique session between two peers and sharing offers, answers and candidates. The part of the application in charge of this is called the signaling server.

A signaling server should handle:

It is up to the application to ensure that this out-of-band communication is secure and reachable by both peers. However, there is a draft proposing a signaling protocol for media ingestion called WebRTC-HTTP ingestion protocol (WHIP). This protocol aims to give the broadcast industry a standard WebRTC signaling protocol for stream ingestion into media servers.
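Since signaling is left to the application, the following is only a minimal in-memory sketch of what such a relay might do. The `SignalingRelay` class, its method names and the message shape are hypothetical, and a real server would sit behind a secure transport such as WebSockets over TLS.

```python
from collections import defaultdict, deque


class SignalingRelay:
    """In-memory stand-in for a signaling server (hypothetical, for illustration)."""

    def __init__(self):
        # session id -> peer id -> queue of messages waiting for that peer
        self._sessions = defaultdict(dict)

    def join(self, session_id, peer_id):
        """Register a peer in a session; this sketch allows two peers per session."""
        peers = self._sessions[session_id]
        if peer_id not in peers and len(peers) >= 2:
            raise ValueError("session is full")
        peers.setdefault(peer_id, deque())

    def send(self, session_id, sender_id, kind, payload):
        """Queue an offer, answer or candidate for the other peer in the session."""
        if kind not in ("offer", "answer", "candidate"):
            raise ValueError(f"unknown message kind: {kind}")
        peers = self._sessions[session_id]
        if sender_id not in peers:
            raise KeyError("sender has not joined the session")
        for peer_id, inbox in peers.items():
            if peer_id != sender_id:
                inbox.append({"from": sender_id, "kind": kind, "payload": payload})

    def receive(self, session_id, peer_id):
        """Return and clear all messages waiting for a peer."""
        inbox = self._sessions[session_id][peer_id]
        messages = list(inbox)
        inbox.clear()
        return messages
```

Both peers call `join` with the same session id, the caller uses `send(..., "offer", sdp)`, and the callee picks the offer up with `receive` before replying with an answer.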

Connectivity

Interactive Connectivity Establishment (ICE) is a protocol for Network Address Translator (NAT) traversal used in computer networking to find ways for two computers to talk to each other as directly as possible in peer-to-peer networking.

In a real-world scenario, establishing a WebRTC connection between two peers, caller and callee, using ICE involves the following steps:

1. Address discovery

Each peer is located in a LAN behind a NAT and only knows its private address. To discover its public address, each peer queries a Session Traversal Utilities for NAT (STUN) server.
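To make address discovery more concrete, here is a rough sketch of the two STUN messages involved: the Binding Request a peer sends and the XOR-MAPPED-ADDRESS attribute it reads from the response. The function names are made up for this cheatsheet, only IPv4 is handled, and no networking is done; a real client would send the request over UDP to a STUN server of its choice.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value in every STUN header
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding Request with no attributes."""
    transaction_id = transaction_id or os.urandom(12)
    # Header: message type, message length (0, no attributes), magic cookie.
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_binding_response(packet, transaction_id):
    """Extract the public (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute."""
    if len(packet) < 20:
        raise ValueError("STUN messages have a 20-byte header")
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding Success Response")
    if packet[8:20] != transaction_id:
        raise ValueError("transaction ID mismatch")
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[offset:offset + 4])
        value = packet[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS:
            _, family, xport = struct.unpack("!BBH", value[:4])
            if family != 0x01:
                raise ValueError("only IPv4 is handled in this sketch")
            # Port and address are XORed with the magic cookie.
            port = xport ^ (MAGIC_COOKIE >> 16)
            (xaddr,) = struct.unpack("!I", value[4:8])
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        # Attributes are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    raise ValueError("no XOR-MAPPED-ADDRESS attribute found")
```

The address returned by `parse_binding_response` is the one the peer later advertises as its server reflexive candidate.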

2. Caller relay allocation

The caller allocates a relay address on the Traversal Using Relays around NAT (TURN) server. The TURN server relays the data between the two peers when a direct connection is not possible.

3. Caller sends offer

The caller sends a connection offer to the callee using a signaling server (both peers are already registered in the signaling server).

4. Callee relay allocation

The callee receives the offer and allocates a relay address on the TURN server.

5. Callee sends answer

The callee sends a connection answer to the caller using the signaling server.

6. Candidate exchange

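During the offer/answer process, each peer gathers candidates to be used for ICE. Each candidate is a potential address/port pair on which to receive data. There are 3 types of candidates: host (a local interface address), server reflexive (the public address discovered through STUN) and relay (an address allocated on the TURN server).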

After each candidate is gathered, it is sent to the other peer, either inside the offer/answer or on its own using trickle ICE.
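As an illustration of what a candidate carries, the sketch below parses a candidate line and computes a candidate priority with the formula and recommended type preferences from the ICE specification (RFC 8445). The helper names and the sample address are assumptions made for this example.

```python
# Recommended type preferences from RFC 8445: direct paths rank above relayed ones.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, component_id=1, local_preference=65535):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_preference << 8) + (256 - component_id)


def parse_candidate(line):
    """Parse 'candidate:<foundation> <component> <transport> <priority> <ip> <port> typ <type>'."""
    if line.startswith("a="):
        line = line[2:]
    fields = line.split()
    if len(fields) < 8 or not fields[0].startswith("candidate:") or fields[6] != "typ":
        raise ValueError("not an ICE candidate line")
    return {
        "foundation": fields[0].split(":", 1)[1],
        "component": int(fields[1]),
        "transport": fields[2].lower(),
        "priority": int(fields[3]),
        "ip": fields[4],
        "port": int(fields[5]),
        "type": fields[7],
    }


if __name__ == "__main__":
    line = "a=candidate:1 1 udp 2130706431 192.168.1.10 54321 typ host"
    print(parse_candidate(line))
    print(candidate_priority("host"), candidate_priority("srflx"), candidate_priority("relay"))
```

With the default local preference, host candidates always outrank server reflexive ones, which in turn outrank relay candidates.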

7a. Check direct connection

Each peer has an ICE agent making connectivity checks:

This process may produce additional candidates known as peer reflexive candidates. This happens when there is a symmetric NAT in between peers. During the connectivity check process, a STUN request is sent directly to the peer, which can generate a brand new binding. If it does, the STUN response is sent back informing the originating peer that a new binding was formed. This allows peers to have a direct media path between them, even in the presence of a symmetric NAT.

| NAT type | STUN support |
| --- | --- |
| Full Cone NAT | Yes |
| Address Restricted Cone NAT | Yes |
| Port Restricted Cone NAT | Yes |
| Symmetric NAT | No |
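The order in which an ICE agent tries candidate pairs can be sketched with the pair priority formula from RFC 8445. The `build_checklist` helper and the candidate names are hypothetical, and real agents also prune redundant pairs and group pairs by foundation, which this sketch leaves out.

```python
from itertools import product


def pair_priority(controlling_priority, controlled_priority):
    """pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)."""
    g, d = controlling_priority, controlled_priority
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def build_checklist(local, remote, controlling=True):
    """Pair every local candidate with every remote one, highest priority first.

    `local` and `remote` map a candidate name to its candidate priority.
    """
    pairs = []
    for (l_name, l_prio), (r_name, r_prio) in product(local.items(), remote.items()):
        # G is always the controlling agent's candidate, D the controlled agent's.
        g, d = (l_prio, r_prio) if controlling else (r_prio, l_prio)
        pairs.append((pair_priority(g, d), l_name, r_name))
    return sorted(pairs, reverse=True)


if __name__ == "__main__":
    candidates = {"host": 2130706431, "srflx": 1694498815, "relay": 16777215}
    for priority, local_name, remote_name in build_checklist(candidates, candidates):
        print(priority, local_name, remote_name)
```

Because the smaller of the two candidate priorities dominates the result, any pair that involves a relay candidate sorts below every direct pair, so relays are only checked last.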

7b. Use relay connection

When a direct connection is not possible, the relay candidates are used. Relay candidates work in almost every network because the TURN server is publicly reachable, unless a NAT or firewall is specifically configured to block it.

A complete message flow of a peer to peer connection establishment is shown in the diagram below:

Media

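WebRTC establishes a baseline set of codecs which all compliant applications are required to support. Applications may choose to allow other codecs as well. The minimum codecs required are VP8 and H.264 (Constrained Baseline profile) for video, and Opus and G.711 (PCMA and PCMU) for audio.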

Media streams (audio and video) are delivered through Real-time Transport Protocol (RTP). RTP does not guarantee delivery; instead, each packet carries a sequence number and a timestamp so that the receiver can reorder packets, detect losses and play the media out on time, tolerating data loss due to unreliable channels. RTP is usually used in conjunction with Real-time Transport Control Protocol (RTCP), which provides statistics, quality-of-service and synchronization data to the participants of the session.
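For reference, the 12-byte fixed RTP header can be decoded as in the sketch below. The `RtpHeader` type and `parse_rtp_header` function are names chosen for this example; CSRC entries and header extensions are not decoded.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed part of an RTP header (first 12 bytes of the packet)."""
    if len(packet) < 12:
        raise ValueError("RTP packets have a fixed header of at least 12 bytes")
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,              # 2 bits, always 2 for RTP
        padding=bool(first & 0x20),
        extension=bool(first & 0x10),
        csrc_count=first & 0x0F,
        marker=bool(second & 0x80),
        payload_type=second & 0x7F,      # matches the number in the SDP rtpmap
        sequence_number=seq,             # used to reorder packets and detect loss
        timestamp=timestamp,             # used to schedule playout
        ssrc=ssrc,                       # identifies the stream source
    )
```

A receiver would typically sort or buffer packets by `sequence_number` and schedule playout from `timestamp`.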

Some of the packets sent using RTCP are:

Session Description Protocol (SDP) is the protocol used to represent the media capabilities of each peer. SDP is already used in other protocols like Real Time Streaming Protocol (RTSP) or Session Initiation Protocol (SIP) in streaming applications such as voice over IP (VoIP).

An SDP is generated and sent by each peer during the offer/answer process. An SDP has the following structure:

Session

v=  (protocol version number, currently only 0)
o=  (originator and session identifier: username, id, version number, network address)
s=  (session name: mandatory with at least one UTF-8-encoded character)
i=* (session title or short information)
u=* (URI of description)
e=* (zero or more email addresses with optional name of contacts)
p=* (zero or more phone numbers with optional name of contacts)
c=* (connection information—not required if included in all media)
b=* (zero or more bandwidth information lines)
One or more time descriptions ("t=" and "r=" lines; see below)
z=* (time zone adjustments)
k=* (encryption key)
a=* (zero or more session attribute lines)
Zero or more Media descriptions (each one starting with an "m=" line; see below)

Time

t=  (time the session is active)
r=* (zero or more repeat times)

Media

m=  (media name and transport address)
i=* (media title or information field)
c=* (connection information — optional if included at session level)
b=* (zero or more bandwidth information lines)
k=* (encryption key)
a=* (zero or more media attribute lines — overriding the Session attribute lines)

Example 1

v=0
o=- 0 0 IN IP4 10.47.16.5
s=session9000
c=IN IP4 224.2.17.12/127
t=0 0
m=audio 8080 RTP/AVP 111
a=rtpmap:111 OPUS/48000/2
m=video 9090 RTP/AVP 96
a=rtpmap:96 VP8/90000

Example 2

v=0
o=jdoe 2890844526 2890842807 IN IP4 224.2.17.12
s=-
c=IN IP4 224.2.17.12
t=2873397496 2873404696
m=video 5004 RTP/AVP 96 97
a=rtpmap:96 VP8/90000
a=rtpmap:97 H264/90000

Example 3

v=0
o=- 0 0 IN IP4 127.0.0.1
s=-
c=IN IP4 127.0.0.1
t=0 0
m=audio 5006 RTP/AVP 111
a=rtpmap:111 OPUS/48000/2
a=fmtp:111 minptime=10;useinbandfec=1
m=video 5004 RTP/AVP 96 98 102
a=rtcp:54321
a=rtpmap:96 VP8/90000
a=rtpmap:98 VP9/90000
a=rtpmap:102 H264/90000
a=fmtp:102 profile-level-id=42001f
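The structure above can be explored with a small parser. This is a simplified sketch, not a complete SDP implementation: `parse_sdp` is a hypothetical helper that splits a description into session-level lines and media sections and collects the `rtpmap` entries of each section. It is shown here with Example 3.

```python
def parse_sdp(text):
    """Split an SDP into session-level lines and a list of media sections."""
    session = {"lines": [], "media": []}
    current = session["lines"]
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        if key == "m":
            # m=<media> <port> <proto> <fmt> ... starts a new media section.
            kind, port, proto, *formats = value.split()
            section = {"kind": kind, "port": int(port), "proto": proto,
                       "formats": formats, "rtpmap": {}, "lines": []}
            session["media"].append(section)
            current = section["lines"]
        elif key == "a" and value.startswith("rtpmap:") and session["media"]:
            # a=rtpmap:<payload type> <codec>/<clock rate>[/<channels>]
            payload_type, codec = value[len("rtpmap:"):].split(" ", 1)
            session["media"][-1]["rtpmap"][payload_type] = codec
        current.append((key, value))
    return session


EXAMPLE_3 = """
v=0
o=- 0 0 IN IP4 127.0.0.1
s=-
c=IN IP4 127.0.0.1
t=0 0
m=audio 5006 RTP/AVP 111
a=rtpmap:111 OPUS/48000/2
a=fmtp:111 minptime=10;useinbandfec=1
m=video 5004 RTP/AVP 96 98 102
a=rtcp:54321
a=rtpmap:96 VP8/90000
a=rtpmap:98 VP9/90000
a=rtpmap:102 H264/90000
a=fmtp:102 profile-level-id=42001f
"""

if __name__ == "__main__":
    for media in parse_sdp(EXAMPLE_3)["media"]:
        print(media["kind"], media["port"], media["rtpmap"])
```

Each payload type listed on an `m=` line is resolved to a codec through the `rtpmap` attribute lines that follow it.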

Data

WebRTC lets you send text or binary data to a peer over an active connection through so-called data channels. The underlying data streams are delivered through Stream Control Transmission Protocol (SCTP). SCTP is a message-oriented transport protocol with congestion control whose reliability and ordering can be configured per channel, with reliable, in-sequence delivery as the default. It differs from UDP and TCP in providing multi-homing and redundant paths to increase resilience and reliability.

| | UDP | TCP | SCTP |
| --- | --- | --- | --- |
| Reliability | Unreliable | Reliable | Configurable |
| Delivery | Unordered | Ordered | Configurable |
| Transmission | Message-oriented | Byte-oriented | Message-oriented |
| Flow control | No | Yes | Yes |
| Congestion control | No | Yes | Yes |
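The "Configurable" cells in the table correspond to the options an application passes when it creates a data channel. The sketch below models them in Python. `DataChannelOptions` is a hypothetical class that loosely mirrors the browser's `ordered`, `maxRetransmits` and `maxPacketLifeTime` options, of which the last two are mutually exclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataChannelOptions:
    ordered: bool = True                            # deliver messages in order
    max_retransmits: Optional[int] = None           # give up after N retransmissions
    max_packet_life_time_ms: Optional[int] = None   # give up after this many milliseconds

    def __post_init__(self):
        # Only one partial-reliability limit may be set at a time.
        if self.max_retransmits is not None and self.max_packet_life_time_ms is not None:
            raise ValueError("set max_retransmits or max_packet_life_time_ms, not both")

    @property
    def reliable(self):
        """A channel is fully reliable when no retransmission limit is set."""
        return self.max_retransmits is None and self.max_packet_life_time_ms is None

    def describe(self):
        reliability = "reliable" if self.reliable else "partially reliable"
        delivery = "ordered" if self.ordered else "unordered"
        return f"{reliability}, {delivery}"


if __name__ == "__main__":
    print(DataChannelOptions().describe())                                  # TCP-like
    print(DataChannelOptions(ordered=False, max_retransmits=0).describe())  # UDP-like
```

The defaults behave like TCP with message boundaries, while `ordered=False` with `max_retransmits=0` approximates UDP.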

Security

Secure Real-time Transport Protocol (SRTP) and Secure Real-time Transport Control Protocol (SRTCP) secure the transmission of RTP and RTCP. SRTP adds authentication and encryption to RTP; these features can be disabled individually if desired, without falling back to plain RTP.

Media and data are secured with Datagram Transport Layer Security (DTLS), which is based on Transport Layer Security (TLS): the DTLS handshake provides the keys used by SRTP and SRTCP, and SCTP data is sent over DTLS itself. DTLS preserves the semantics of the underlying SRTP, SRTCP and SCTP while providing authentication, symmetric cryptography, privacy and integrity.

Profiling

Webcam

SDP

Connectivity

Bandwidth/Bitrate

  1. Run Google Chrome and go to chrome://webrtc-internals

  2. Select read stats from Legacy Non-Standard.

  3. Look for Stats graphs for bweforvideo (VideoBwe).
