Overview

RTP (Real-time Transport Protocol), defined in RFC 3550, provides services for transferring real-time streams between two clients. There are two related streams: the RTP stream carries payload data (e.g. audio packets), and the RTCP (RTP Control Protocol) stream carries control data used to monitor the quality of the RTP stream. RTCP is not mandatory and is rarely used. For bi-directional transfer there is a pair of streams (RTP+RTCP) in each direction. RTP typically runs on top of UDP; the RTP stream is assigned an even port number and the RTCP stream the port one higher. The opposite direction uses the same ports.
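The port convention described above can be sketched in a few lines. The function name is hypothetical and only illustrates the even/odd pairing; it is not part of any RTP library.

```python
def rtp_rtcp_ports(rtp_port: int) -> tuple:
    """Return (rtp_port, rtcp_port) following the even/odd convention."""
    # The RTCP port is rtp_port + 1, so rtp_port itself must stay below 65535.
    if not 0 < rtp_port < 65535:
        raise ValueError("RTP port out of range")
    if rtp_port % 2 != 0:
        raise ValueError("RTP port is expected to be even")
    return rtp_port, rtp_port + 1
```

For client A in the diagrams, rtp_rtcp_ports(10000) gives the pair 10000 (RTP) and 10001 (RTCP).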

If both clients have public IP addresses, the case is trivial.

 +----------------+             +----------------+
 | Client A       |     UDP     | Client B       |
 | IP     1.2.3.4 |   streams   | IP     5.6.7.8 |
 | RTP      10000 |------------>| 20000  RTP     |
 |                |<------------|                |
 | RTCP     10001 |------------>| 20001  RTCP    |
 |                |<------------|                |
 +----------------+             +----------------+

When at least one client is hidden behind NAT, however, there is no easy way for streams to get from a public to a private IP address. Although it is highly probable that two neighbouring ports of the private IP will be translated to neighbouring ports, we should also anticipate the worse case. Packets coming from the public client cannot reach the private client until a keyhole in the NAT is opened by a packet coming from the private client.

Client B is sending packets but they are blocked at the NAT box; in fact, in this simple example client B does not even know the IP address to send them to.

 +----------------+           +----------------+           +----------------+
 | Client A       |    UDP    | NAT box        |    UDP    | Client B       |
 | IP 192.168.1.1 |  streams  | IP     1.2.3.4 |  streams  | IP     5.6.7.8 |
 | RTP      10000 |..........>|               X|..........>| 20000  RTP     |
 |                |<..........|                |<----------|                |
 | RTCP     10001 |..........>|               X|..........>| 20001  RTCP    |
 |                |<..........|                |<----------|                |
 +----------------+           +----------------+           +----------------+

Client A sends packets to 5.6.7.8:20000 and 5.6.7.8:20001, and the NAT box translates the source IP and ports. The diagram shows the worst case; in the real world the NAT most probably translates them to 1.2.3.4:10000 and 1.2.3.4:10001.

 +----------------+           +----------------+           +----------------+
 | Client A       |    UDP    | NAT box        |    UDP    | Client B       |
 | IP 192.168.1.1 |  streams  | IP     1.2.3.4 |  streams  | IP     5.6.7.8 |
 | RTP      10000 |---------->|          30000 |---------->| 20000  RTP     |
 |                |<..........|                |<..........|                |
 | RTCP     10001 |---------->|          35000 |---------->| 20001  RTCP    |
 |                |<..........|                |<..........|                |
 +----------------+           +----------------+           +----------------+

Client B learns the address to send packets to (1.2.3.4:30000 and 1.2.3.4:35000). Now the clients can communicate with each other (until the NAT keyhole is closed).

 +----------------+           +----------------+           +----------------+
 | Client A       |    UDP    | NAT box        |    UDP    | Client B       |
 | IP 192.168.1.1 |  streams  | IP     1.2.3.4 |  streams  | IP     5.6.7.8 |
 | RTP      10000 |---------->|          30000 |---------->| 20000  RTP     |
 |                |<----------|                |<----------|                |
 | RTCP     10001 |---------->|          35000 |---------->| 20001  RTCP    |
 |                |<----------|                |<----------|                |
 +----------------+           +----------------+           +----------------+
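The keyhole behaviour shown in the diagrams above can be imitated with a toy model. This is only a sketch under simplifying assumptions: the class and method names are made up, the NAT is treated as address-and-port restricted, and keyhole timeouts are ignored.

```python
import itertools


class ToyNat:
    """Toy model of a port-translating NAT (not a real implementation)."""

    def __init__(self, public_ip, first_port=30000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._out = {}  # (private_addr, remote_addr) -> public port
        self._in = {}   # (public port, remote_addr) -> private_addr

    def outbound(self, src, dst):
        """A private host sends src -> dst; opens a keyhole if needed.

        Returns the translated (public) source address seen by dst.
        """
        key = (src, dst)
        if key not in self._out:
            port = next(self._next_port)
            self._out[key] = port
            self._in[(port, dst)] = src
        return (self.public_ip, self._out[key])

    def inbound(self, src, dst_port):
        """A public host sends src -> public_ip:dst_port.

        Returns the private address the packet is delivered to,
        or None when no keyhole exists and the packet is dropped.
        """
        return self._in.get((dst_port, src))
```

With a = ("192.168.1.1", 10000) and b = ("5.6.7.8", 20000), an inbound packet from b is dropped until a has called outbound(a, b) once; after that, packets from b to the translated port reach a.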

Of course, both clients may be behind NAT as well. The mechanism used to establish a session is called signalling; what follows discusses its use in VoIP telephony with SIP (RFC 3261) signalling. STUN is not considered.

A real example follows (SIP signalling is simplified). The caller, Alice, sends an INVITE request to the SIP router at 1.1.1.1:5060, in which she provides her own private addresses. Media streams are described in the message body using SDP (RFC 2327).

                                                           +------------------+
 +----------------+           +----------------+           | SIP proxy        |           +----------------+
 | Client A       |           | NAT box        |           | IP     1.1.1.1   |           | Client B       |
 | IP 192.168.1.1 |  UDP/TCP  | IP     1.2.3.4 |  UDP/TCP  |                  |  UDP/TCP  | IP     5.6.7.8 |
 | SIP      5060  |==========>|          5566  |==========>| 5060       5060  |           | 5060   SIP     |
 |                |           |                |           |                  |           |                |
 |                |           |                |           +------------------+           |                |
 |                |           |                |           | RTP proxy        |           |                |
 |                |    UDP    |                |    UDP    | IP     ???????   |    UDP    |                |
 |                |  streams  |                |  streams  |   A          B   |  streams  |                |
 | RTP      10000 |           |              ? |           | ????? RTP  ????? |           | ?????  RTP     |
 |                |           |              X |           |                  |           |                |
 | RTCP     10001 |           |              ? |           | ????? RTCP ????? |           | ?????  RTCP    |
 |                |           |              X |           |                  |           |                |
 +----------------+           +----------------+           | .....            |           +----------------+
                                                           +------------------+

  INVITE sip:bob@biloxi.com SIP/2.0
  Via: SIP/2.0/UDP 192.168.1.1;branch=z9hG4bK776asdhds
  Max-Forwards: 70
  To: Bob 
  From: Alice ;tag=1928301774
  Call-ID: a84b4c76e66710@pc33.atlanta.com
  CSeq: 314159 INVITE
  Contact: 
  Content-Type: application/sdp
  Content-Length: 142

  v=0
  o=user1 53655765 2353687637 IN IP4 test
  s=-
  c=IN IP4 192.168.1.1
  t=0 0
  m=audio 10000 RTP/AVP 0
  a=rtpmap:0 PCMU/8000
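To see which media address Alice advertises, the SDP body can be read line by line. The following is a minimal sketch, not a full SDP parser: the function name is hypothetical, and it assumes a single session-level c= line and a single audio m= line, as in this example.

```python
def parse_sdp_audio_endpoint(sdp: str):
    """Return (connection_ip, audio_rtp_port) from a simple SDP body.

    Takes the first c= line and the first m=audio line; anything more
    elaborate (media-level c= lines, several streams) is ignored.
    """
    conn_ip = None
    audio_port = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c=") and conn_ip is None:
            parts = line[2:].split()  # e.g. ["IN", "IP4", "192.168.1.1"]
            if len(parts) == 3:
                conn_ip = parts[2]
        elif line.startswith("m=audio ") and audio_port is None:
            # The port field may carry a "/count" suffix; keep the port only.
            audio_port = int(line.split()[1].split("/")[0])
    return conn_ip, audio_port
```

Applied to the INVITE body above, this yields 192.168.1.1 and port 10000, i.e. the private address that a client outside the NAT cannot reach.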

The SIP router receives the packet from 1.2.3.4:5566, which does not correspond to the IP in the Via header, so the SIP router decides that Alice's UE is behind NAT, allocates RTP proxy resources and rewrites the addresses. The request is then sent to Bob's UE (its contact is known from registration).

                                                           +------------------+
 +----------------+           +----------------+           | SIP proxy        |           +----------------+
 | Client A       |           | NAT box        |           | IP     1.1.1.1   |           | Client B       |
 | IP 192.168.1.1 |  UDP/TCP  | IP     1.2.3.4 |  UDP/TCP  |                  |  UDP/TCP  | IP     5.6.7.8 |
 | SIP      5060  |<--------->|          5566  |<--------->| 5060       5060  |==========>| 5060   SIP     |
 |                |           |                |           |                  |           |                |
 |                |           |                |           +------------------+           |                |
 |                |           |                |           | RTP proxy        |           |                |
 |                |    UDP    |                |    UDP    | IP     1.1.1.1   |    UDP    |                |
 |                |  streams  |                |  streams  |   A          B   |  streams  |                |
 | RTP      10000 |           |                |           | 50000 RTP  50500 |           | ?????  RTP     |
 |                |<...........................X...........|                  |           |                |
 | RTCP     10001 |           |                |           | 50001 RTCP 50501 |           | ?????  RTCP    |
 |                |<..........|................X...........|                  |           |                |
 +----------------+           +----------------+           | .....            |           +----------------+
                                                           +------------------+
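The NAT decision made by the SIP router can be sketched as a comparison between the topmost Via header and the packet's real source address. The helper names below are hypothetical, only IPv4 literals are handled, and a real SIP stack would parse the header far more carefully.

```python
def via_sent_by(via_value: str):
    """Extract (host, port) from a Via value such as
    'SIP/2.0/UDP 192.168.1.1;branch=z9hG4bK776asdhds'."""
    sent_by = via_value.split()[1].split(";")[0]
    host, _, port = sent_by.partition(":")
    return host, int(port) if port else 5060  # 5060 is the default SIP port


def is_behind_nat(via_value: str, source_ip: str, source_port: int) -> bool:
    """True when the address in Via differs from where the packet came from."""
    return via_sent_by(via_value) != (source_ip, source_port)


def add_received_rport(via_value: str, source_ip: str, source_port: int) -> str:
    """Record the real source address in the Via header."""
    protocol, rest = via_value.split(None, 1)
    sent_by, sep, params = rest.partition(";")
    return f"{protocol} {sent_by};rport={source_port};received={source_ip}{sep}{params}"
```

For Alice's INVITE arriving from 1.2.3.4:5566, is_behind_nat returns True, and add_received_rport produces the Via value with rport=5566 and received=1.2.3.4.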

RTP session:

  A (Alice)        ports                   ports
  Source IP     RTP    RTCP  L  Gate IP  RTP    RTCP
  192.168.1.1   10000  10001 Y  1.1.1.1  50000  50001
  B (Bob)          ports                   ports
  Source IP     RTP    RTCP  L  Gate IP  RTP    RTCP
  ???????????   ?????  ????? N  1.1.1.1  50500  50501

Note: L = source address learning timeout

  INVITE sip:bob@biloxi.com SIP/2.0
  Via: SIP/2.0/UDP 1.2.3.4;branch=z9hG4bK776asbbhfru
  Via: SIP/2.0/UDP 192.168.1.1;rport=5566;received=1.2.3.4;branch=z9hG4bK776asdhds
  Max-Forwards: 69
  To: Bob 
  From: Alice ;tag=1928301774
  Call-ID: a84b4c76e66710@pc33.atlanta.com
  CSeq: 314159 INVITE
  Contact: 
  Content-Type: application/sdp
  Content-Length: 142

  v=0
  o=user1 53655765 2353687637 IN IP4 test
  s=-
  c=IN IP4 1.1.1.1
  t=0 0
  m=audio 50500 RTP/AVP 0
  a=rtpmap:0 PCMU/8000
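The address rewriting applied to the forwarded INVITE amounts to replacing the connection address and the media port with those of the gate facing the other party. A minimal sketch follows; the function name is an assumption, it touches only IPv4 c= lines and the audio m= line, and it leaves the Content-Length header for the caller to recompute.

```python
def rewrite_sdp(sdp: str, gate_ip: str, gate_rtp_port: int) -> str:
    """Point the SDP connection address and audio port at an RTP proxy gate."""
    out = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            line = f"c=IN IP4 {gate_ip}"
        elif line.startswith("m=audio "):
            fields = line.split(" ")
            fields[1] = str(gate_rtp_port)  # second field is the RTP port
            line = " ".join(fields)
        out.append(line)
    # SDP lines are terminated by CRLF.
    return "\r\n".join(out) + "\r\n"
```

Calling rewrite_sdp(body, "1.1.1.1", 50500) on Alice's original body produces the body shown above: Bob is told to send media to gate B of the RTP proxy rather than to 192.168.1.1.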

Bob replies with "200 OK" and provides his own contact and RTP address.

                                                           +------------------+
 +----------------+           +----------------+           | SIP proxy        |           +----------------+
 | Client A       |           | NAT box        |           | IP     1.1.1.1   |           | Client B       |
 | IP 192.168.1.1 |  UDP/TCP  | IP     1.2.3.4 |  UDP/TCP  |                  |  UDP/TCP  | IP     5.6.7.8 |
 | SIP      5060  |<--------->|          5566  |<--------->| 5060       5060  |<==========| 5060   SIP     |
 |                |           |                |           |                  |           |                |
 |                |           |                |           +------------------+           |                |
 |                |           |                |           | RTP proxy        |           |                |
 |                |    UDP    |                |    UDP    | IP     1.1.1.1   |    UDP    |                |
 |                |  streams  |                |  streams  |   A          B   |  streams  |                |
 | RTP      10000 |           |                |           | 50000 RTP  50500 |           | 20000  RTP     |
 |                |<...........................X...........|                  |<..........|                |
 | RTCP     10001 |           |                |           | 50001 RTCP 50501 |           | 20001  RTCP    |
 |                |<...........................X...........|                  |<..........|                |
 +----------------+           +----------------+           | .....            |           +----------------+
                                                           +------------------+

  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 1.2.3.4;branch=z9hG4bK776asbbhfru
  Via: SIP/2.0/UDP 192.168.1.1;rport=5566;received=1.2.3.4;branch=z9hG4bK776asdhds
  To: Bob ;tag=a6c85cf
  From: Alice ;tag=1928301774
  Call-ID: a84b4c76e66710@pc33.atlanta.com
  CSeq: 314159 INVITE
  Contact: 
  Content-Type: application/sdp
  Content-Length: 184

  v=0
  o=- 53655765 2353687637 IN IP4 test
  s=-
  c=IN IP4 5.6.7.8
  t=0 0
  m=audio 20000 RTP/AVP 8 0 18
  a=rtpmap:8 PCMA/8000
  a=rtpmap:0 PCMU/8000
  a=rtpmap:18 G729/8000

The SIP router can look up the data related to the transaction and rewrite the SDP media description to point to the RTP proxy session allocated while processing the INVITE.

                                                           +------------------+
 +----------------+           +----------------+           | SIP proxy        |           +----------------+
 | Client A       |           | NAT box        |           | IP     1.1.1.1   |           | Client B       |
 | IP 192.168.1.1 |  UDP/TCP  | IP     1.2.3.4 |  UDP/TCP  |                  |  UDP/TCP  | IP     5.6.7.8 |
 | SIP      5060  |<==========|          5566  |<==========| 5060       5060  |<--------->| 5060   SIP     |
 |                |           |                |           |                  |           |                |
 |                |           |                |           +------------------+           |                |
 |                |           |                |           | RTP proxy        |           |                |
 |                |    UDP    |                |    UDP    | IP     1.1.1.1   |    UDP    |                |
 |                |  streams  |                |  streams  |   A          B   |  streams  |                |
 | RTP      10000 |           |                |           | 50000 RTP  50500 |..........>| 20000  RTP     |
 |                |<...........................X...........|                  |<..........|                |
 | RTCP     10001 |           |                |           | 50001 RTCP 50501 |..........>| 20001  RTCP    |
 |                |<...........................X...........|                  |<..........|                |
 +----------------+           +----------------+           | .....            |           +----------------+
                                                           +------------------+

RTP session:

  A (Alice)        ports                   ports
  Source IP     RTP    RTCP  L  Gate IP  RTP    RTCP
  192.168.1.1   10000  10001 Y  1.1.1.1  50000  50001
  B (Bob)          ports                   ports
  Source IP     RTP    RTCP  L  Gate IP  RTP    RTCP
  5.6.7.8       20000  20001 N  1.1.1.1  50500  50501

  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 192.168.1.1;rport=5566;received=1.2.3.4;branch=z9hG4bK776asdhds
  To: Bob ;tag=a6c85cf
  From: Alice ;tag=1928301774
  Call-ID: a84b4c76e66710@pc33.atlanta.com
  CSeq: 314159 INVITE
  Contact: 
  Content-Type: application/sdp
  Content-Length: 184

  v=0
  o=- 53655765 2353687637 IN IP4 test
  s=-
  c=IN IP4 1.1.1.1
  t=0 0
  m=audio 50000 RTP/AVP 8 0 18
  a=rtpmap:8 PCMA/8000
  a=rtpmap:0 PCMU/8000
  a=rtpmap:18 G729/8000

The call is established and both sides have sufficient information.

                                                           +------------------+
 +----------------+           +----------------+           | SIP proxy        |           +----------------+
 | Client A       |           | NAT box        |           | IP     1.1.1.1   |           | Client B       |
 | IP 192.168.1.1 |  UDP/TCP  | IP     1.2.3.4 |  UDP/TCP  |                  |  UDP/TCP  | IP     5.6.7.8 |
 | SIP      5060  |<--------->|          5566  |<--------->| 5060       5060  |<--------->| 5060   SIP     |
 |                |           |                |           |                  |           |                |
 |                |           |                |           +------------------+           |                |
 |                |           |                |           | RTP proxy        |           |                |
 |                |    UDP    |                |    UDP    | IP     1.1.1.1   |    UDP    |                |
 |                |  streams  |                |  streams  |   A          B   |  streams  |                |
 | RTP      10000 |...........|          ????? |..........>| 50000 RTP  50500 |..........>| 20000  RTP     |
 |                |<...........................X...........|                  |<..........|                |
 | RTCP     10001 |...........|          ????? |..........>| 50001 RTCP 50501 |..........>| 20001  RTCP    |
 |                |<...........................X...........|                  |<..........|                |
 +----------------+           +----------------+           | .....            |           +----------------+
                                                           +------------------+

Once the call is established, Bob and Alice can start talking, i.e. sending RTP stream data. Say Bob speaks first: "Hello, this is Bob" :-). His RTP data should come from the address provided in the SDP because there is no NAT in the path. Data coming from a different address are silently dropped unless a learning timeout is set. The data are forwarded to Alice's source address; in our case that is still her local address, but it does not matter. We must send them anyway so as not to cause a deadlock when two RTP proxies are used.

                                                           +------------------+
 +----------------+           +----------------+           | SIP proxy        |           +----------------+
 | Client A       |           | NAT box        |           | IP     1.1.1.1   |           | Client B       |
 | IP 192.168.1.1 |  UDP/TCP  | IP     1.2.3.4 |  UDP/TCP  |                  |  UDP/TCP  | IP     5.6.7.8 |
 | SIP      5060  |---------->|          5566  |---------->| 5060       5060  |---------->| 5060   SIP     |
 |                |           |                |           |                  |           |                |
 |                |           |                |           +------------------+           |                |
 |                |           |                |           | RTP proxy        |           |                |
 |                |    UDP    |                |    UDP    | IP     1.1.1.1   |    UDP    |                |
 |                |  streams  |                |  streams  |   A          B   |  streams  |                |
 | RTP      10000 |...........|          ????? |..........>| 50000 RTP  50500 |..........>| 20000  RTP     |
 |                |<...........................X<==========|                  |<==========|                |
 | RTCP     10001 |...........|          ????? |..........>| 50001 RTCP 50501 |..........>| 20001  RTCP    |
 |                |<...........................X<==========|                  |<==========|                |
 +----------------+           +----------------+           | .....            |           +----------------+
                                                           +------------------+

  A (Alice)        ports                   ports
  Source IP     RTP    RTCP  L  Gate IP  RTP    RTCP
  192.168.1.1   10000  10001 Y  1.1.1.1  50000  50001
  B (Bob)          ports                   ports
  Source IP     RTP    RTCP  L  Gate IP  RTP    RTCP
  5.6.7.8       20000  20001 N  1.1.1.1  50500  50501

Alice responds "Hello, this is Alice". The RTP proxy receives her RTP data, which have traversed the NAT, and records their source IP addresses in the session. Unless the always_learn feature is applied, learning of the source address is then disabled: data coming to ports 50000/50001 from an address other than 1.2.3.4:30000/35000 will be silently dropped. The data are forwarded to Bob with source address 1.1.1.1:50500/50501.

                                                           +------------------+
 +----------------+           +----------------+           | SIP proxy        |           +----------------+
 | Client A       |           | NAT box        |           | IP     1.1.1.1   |           | Client B       |
 | IP 192.168.1.1 |  UDP/TCP  | IP     1.2.3.4 |  UDP/TCP  |                  |  UDP/TCP  | IP     5.6.7.8 |
 | SIP      5060  |<--------->|          5566  |<--------->| 5060       5060  |<--------->| 5060   SIP     |
 |                |           |                |           |                  |           |                |
 |                |           |                |           +------------------+           |                |
 |                |           |                |           | RTP proxy        |           |                |
 |                |    UDP    |                |    UDP    | IP     1.1.1.1   |    UDP    |                |
 |                |  streams  |                |  streams  |   A          B   |  streams  |                |
 | RTP      10000 |==========>|          30000 |==========>| 50000 RTP  50500 |==========>| 20000  RTP     |
 |                |<..........|                |<..........|                  |<----------|                |
 | RTCP     10001 |==========>|          35000 |==========>| 50001 RTCP 50501 |==========>| 20001  RTCP    |
 |                |<..........|                |<..........|                  |<----------|                |
 +----------------+           +----------------+           | .....            |           +----------------+
                                                           +------------------+

  A (Alice)        ports                   ports
  Source IP     RTP    RTCP  L  Gate IP  RTP    RTCP
  1.2.3.4       30000  35000 N  1.1.1.1  50000  50001
  B (Bob)          ports                   ports
  Source IP     RTP    RTCP  L  Gate IP  RTP    RTCP
  5.6.7.8       20000  20001 N  1.1.1.1  50500  50501
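The learning rule walked through above can be summarised in a small sketch. It is a simplification, not the module's code: the learning timeout is reduced to a boolean flag, only one stream per side is tracked, and the names Side and accept_packet are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Addr = Tuple[str, int]


@dataclass
class Side:
    source: Optional[Addr]  # expected source address, None if unknown
    learning: bool          # True while the source may still be learned


def accept_packet(side: Side, packet_source: Addr, always_learn: bool = False) -> bool:
    """Decide whether a packet arriving at this side's gate is forwarded.

    While learning is on, the packet's source replaces the stored one.
    Learning is then switched off unless always_learn is requested.
    Otherwise only packets from the stored source are accepted.
    """
    if side.learning:
        side.source = packet_source
        if not always_learn:
            side.learning = False
        return True
    return packet_source == side.source
```

Starting from Alice's entry (192.168.1.1:10000, learning on), her first packet from 1.2.3.4:30000 is accepted and stored, which matches the table above; a later packet from any other address is dropped. Bob's entry has learning off from the start, so only 5.6.7.8:20000 is accepted.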

Now the bi-directional RTP/RTCP channel is established.

                                                           +------------------+
 +----------------+           +----------------+           | SIP proxy        |           +----------------+
 | Client A       |           | NAT box        |           | IP     1.1.1.1   |           | Client B       |
 | IP 192.168.1.1 |  UDP/TCP  | IP     1.2.3.4 |  UDP/TCP  |                  |  UDP/TCP  | IP     5.6.7.8 |
 | SIP      5060  |<--------->|          5566  |<--------->| 5060       5060  |<--------->| 5060   SIP     |
 |                |           |                |           |                  |           |                |
 |                |           |                |           +------------------+           |                |
 |                |           |                |           | RTP proxy        |           |                |
 |                |    UDP    |                |    UDP    | IP     1.1.1.1   |    UDP    |                |
 |                |  streams  |                |  streams  |   A          B   |  streams  |                |
 | RTP      10000 |---------->|          30000 |---------->| 50000 RTP  50500 |---------->| 20000  RTP     |
 |                |<----------|                |<----------|                  |<----------|                |
 | RTCP     10001 |---------->|          35000 |---------->| 50001 RTCP 50501 |---------->| 20001  RTCP    |
 |                |<----------|                |<----------|                  |<----------|                |
 +----------------+           +----------------+           | .....            |           +----------------+
                                                           +------------------+

An RTP session is valid until the call is terminated (BYE) or until no packets have passed through for the expiration timeout (by then the NAT keyhole will have closed). Such a session is marked as destroyed or expired and cannot be used for another call until the resurrection timeout elapses. This feature should decrease the probability that delayed packets from the previous session confuse the RTP proxy.

Note that deciding whether the callee's UE (Bob) is behind NAT based on the "200 OK" response is not so easy, because there is no Via header describing the previous hop. We can presume, for example, that a private address (RFC 1918) in the Contact means NAT in the path, or we may know it from registration. If we put the RTP proxy in the media path, we must pass the information about the RTP proxy back to the callee (re-INVITE).
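The RFC 1918 heuristic mentioned above is easy to express with Python's standard ipaddress module. The function name is hypothetical; the three networks are the private ranges listed in RFC 1918.

```python
import ipaddress

# Private address blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(host: str) -> bool:
    """True if host is an IPv4 literal inside one of the RFC 1918 blocks."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a host name rather than an IP literal
    return any(addr in net for net in RFC1918_NETWORKS)
```

A Contact host of 192.168.1.1 would suggest NAT in the path, while 5.6.7.8 would not. A host name in the Contact tells us nothing either way, so the sketch treats it as not private.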

Note that more streams per call session (e.g. audio and video) may be required; we'll allocate one RTPPROXY session per stream.

Switchboard

The RTP proxy has a dedicated port range on one or two interfaces that is used for RTP traversal/relay. The range of dedicated ports at a particular IP is called a 'gate'. Because there are two clients in an RTP/RTCP exchange, we must also have two gates. The two gates are permanently connected at the 'switchboard'. One or two IP addresses may be used for the connected gates. Each RTP/RTCP client sends packets to its gate's IP:port and receives packets from the same gate's IP:port. This is called a 'session'. The switchboard is responsible for routing to the opposite client.
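The routing role of the switchboard can be illustrated with a small lookup table keyed by gate address. This is a sketch only: the class and field names are assumptions, just the RTP port of each session is modelled, and source checking and timeouts are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Addr = Tuple[str, int]


@dataclass
class Session:
    gate_a: Addr                     # gate address facing client A
    gate_b: Addr                     # gate address facing client B
    client_a: Optional[Addr] = None  # known source address of A
    client_b: Optional[Addr] = None  # known source address of B


class Switchboard:
    def __init__(self):
        self._by_gate: Dict[Addr, Session] = {}

    def add(self, session: Session) -> None:
        self._by_gate[session.gate_a] = session
        self._by_gate[session.gate_b] = session

    def route(self, dst: Addr) -> Optional[Tuple[Addr, Addr]]:
        """For a packet sent to gate address dst, return the rewritten
        (source, destination) pair, or None if it cannot be forwarded."""
        s = self._by_gate.get(dst)
        if s is None:
            return None
        if dst == s.gate_a:
            peer, out_gate = s.client_b, s.gate_b
        else:
            peer, out_gate = s.client_a, s.gate_a
        if peer is None:
            return None  # opposite client's address is not known yet
        return out_gate, peer
```

With the session from the example, a packet arriving at 1.1.1.1:50000 leaves with source 1.1.1.1:50500 and destination 5.6.7.8:20000, and the reverse direction is handled symmetrically.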

Incoming packets must be passed to lib_RTPPROXY in the PREROUTING (and OUTPUT) phase to change the destination address, and outgoing packets in the POSTROUTING (and INPUT) phase to change the source address.

Session states

The state approximately describes the current condition of a session. RTPPROXY provides statistics on how many sessions are in a particular state. Note that a state is set if at least one of the streams fits the condition. The state is changed by forwarded RTP packets or by an alloc/update/delete session command.

NONE

ready for allocation

INIT1

session allocated; the source address of A or B (caller or callee) is known or its learning timeout is set

INIT2

source addresses of both A (caller) and B (callee) are known or learning timeouts are set

FORWARD1

at least one packet was forwarded from A->B or B->A

FORWARD2

packets were forwarded in both directions

EXPIRED

no packet was forwarded within the switchboard expiration timeout, or the learning timeout elapsed

STALED

Maximum duration of session has expired.

DESTROYED

session was destroyed in the regular way using a sockopt command

A session changes from EXPIRED/DESTROYED to NONE after the switchboard resurrection timeout. This is forced by the switchboard audit, globally for all sessions, or when searching for a free session to allocate.
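The state list can be read as a small classification. The sketch below is one possible interpretation, not the module's logic: function and parameter names are invented, forwarding states are assumed to take precedence over the INIT states, and an allocated session is assumed to have at least one side ready.

```python
from enum import Enum


class State(Enum):
    NONE = "NONE"
    INIT1 = "INIT1"
    INIT2 = "INIT2"
    FORWARD1 = "FORWARD1"
    FORWARD2 = "FORWARD2"
    EXPIRED = "EXPIRED"
    STALED = "STALED"
    DESTROYED = "DESTROYED"


def live_state(a_ready: bool, b_ready: bool, a_to_b: bool, b_to_a: bool) -> State:
    """State of an allocated session that has not ended yet.

    a_ready / b_ready: source address known or learning timeout set.
    a_to_b / b_to_a: at least one packet was forwarded in that direction.
    """
    if a_to_b and b_to_a:
        return State.FORWARD2
    if a_to_b or b_to_a:
        return State.FORWARD1
    if a_ready and b_ready:
        return State.INIT2
    return State.INIT1


def after_audit(state: State, seconds_since_end: float, resurrection_timeout: float) -> State:
    """EXPIRED and DESTROYED sessions become NONE once the resurrection
    timeout has elapsed; every other state is left alone."""
    if state in (State.EXPIRED, State.DESTROYED) and seconds_since_end >= resurrection_timeout:
        return State.NONE
    return state
```

In the SIP example, the session is INIT1 after the INVITE (only Alice's side has learning set), INIT2 after the "200 OK", FORWARD1 once Bob's first packet is forwarded and FORWARD2 once Alice answers.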

HA support

Two techniques:

  1. The RTP proxy replicates session state to the standby machine. TODO
  2. The two RTP proxies allocate sessions in two non-overlapping port ranges, one for the active machine (50000-51000) and one for the standby machine (51000-52000). At the moment the standby machine becomes active, it makes all sessions in 50000-51000 learn the source addresses of incoming packets, while new sessions are allocated from the 51000-52000 range. All calls can continue uninterrupted as long as both sides are sending data.
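The second technique boils down to a decision based on the gate port a packet arrives at after takeover. The following sketch uses invented names and assumes half-open ranges, so that port 51000 belongs only to the standby range.

```python
# Half-open ranges (an assumption): 50000..50999 and 51000..51999.
ACTIVE_PORTS = range(50000, 51000)   # owned by the machine that failed
STANDBY_PORTS = range(51000, 52000)  # owned by the machine taking over


def takeover_action(gate_port: int) -> str:
    """What the newly active machine does with a packet for gate_port."""
    if gate_port in ACTIVE_PORTS:
        return "learn"   # inherited session: learn source addresses again
    if gate_port in STANDBY_PORTS:
        return "lookup"  # own session: normal source check
    return "drop"        # not a gate port at all


def next_free_port(in_use: set) -> int:
    """Pick an even RTP port for a new session from the standby range."""
    for port in range(STANDBY_PORTS.start, STANDBY_PORTS.stop, 2):
        if port not in in_use:
            return port
    raise RuntimeError("no free gate port")
```

A packet for port 50500 is therefore handled in learning mode, whereas a new call gets its ports from 51000 upwards.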

Vulnerability

The RTP proxy is vulnerable for a while when it is waiting for data to learn the source address. We should keep this period as short as possible to decrease the probability of call hijacking. The only "safe" ways to run SIP behind NATs require either encryption (e.g. SRTP), some NAT traversal mechanism on the clients (e.g. ICE) or an ALG within the client's own NAT (Rémi Denis-Courmont).

Netfilter

Why netfilter? Because the RTP proxy need not manipulate the data at all; it only needs to forward them as fast as possible so as not to increase latency. Why create a userspace proxy and copy the data twice between kernel space and user space?

The RTP proxy is implemented as the xt_RTPPROXY kernel module, the libipt_RTPPROXY library for iptables, which also provides an API to other applications (e.g. a SIP router), and the iptrtpproxy tool for session manipulation.

Note that RTPPROXY is not NAT: NAT uses one connection tracking session, while RTPPROXY has two tracking sessions per call and also applies learning and expiration logic. Packets are manipulated in the mangle table. The module is written as a patch-o-matic-ng patch.

Note that RTPPROXY currently does not support proxying from a local IP to a local IP on the same machine. This is due to a netfilter connection tracking bug (or the lack of an API for such a crazy case).

Performance

I compared a closed-source RTP proxy working in user space with xt_RTPPROXY. One machine generated RTP traffic that saturated a 100 Mbit network; a second machine ran the RTP proxy and routed the traffic to a third machine, where I could see about 95 Mbit/s of incoming traffic.

Key parameters were captured on the RTP proxy machine: load average (top) and the delay between incoming and outgoing packets (tcpdump). The differences are as follows.

  Parameter                           user space proxy         xt_RTPPROXY
  us (User CPU time)                  1.2%                     0%
  sy (System CPU time)                4.7%                     0%
  si (Software Interrupts)            5.2%                     2.8-3.3%
  1min load average (2 CPU machine)   up to 0.20               0
  Delay measured at RTPPROXY          4-62mms, average 27mms   4-14mms, average 5.8mms

Jitter is probably slightly higher with the userspace proxy, because several packets may be processed together in one timeslice, which causes a random delay (microseconds).

See also

SIP Express Router (SER), experimental module "iptrtpproxy"