Internet Engineering Task Force
Audio/Video Transport Working Group
Internet Draft
Schulzrinne/Casner/Frederick/Jacobson
ietf-avt-rtp-new-01.txt
Columbia U./Cisco/Xerox/LBNL
August 7, 1998
Expires: February 7, 1999

[ietf-avt-rtp-new-00.txt: Columbia U./Precept/Xerox/LBNL, December 5, 1997, Expires: June 5, 1998]

RTP: A Transport Protocol for Real-Time Applications

STATUS OF THIS MEMO

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress''.

To view the entire list of current Internet-Drafts, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

[ietf-avt-rtp-new-00.txt: To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).]

Distribution of this document is unlimited.

ABSTRACT

This memorandum is a revision of RFC 1889 in preparation for advancement from Proposed Standard to Draft Standard status. Readers are encouraged to use the PostScript form of this draft to see where changes from RFC 1889 are marked by change bars. The revision process is not yet complete; some changes which have been discussed and tentatively accepted in meetings of the Audio/Video Transport working group have not yet been incorporated into this draft.

This memorandum describes RTP, the real-time transport protocol.
RTP provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services. The data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks, and to provide minimal control and identification functionality. RTP and RTCP are designed to be independent of the underlying transport and network layers. The protocol supports the use of RTP-level translators and mixers.

This specification is a product of the Audio/Video Transport working group within the Internet Engineering Task Force. Comments are solicited and should be addressed to the working group's mailing list at rem-conf@es.net and/or the authors.

Schulzrinne/Casner/Frederick/Jacobson [Page 1]
Internet Draft RTP August 7, 1998

Resolution of Open Issues

[Note to the RFC Editor: This section is to be deleted when this draft is published as an RFC but is shown here for reference during the Last Call.]

Readers are directed to Appendix B, Changes from RFC 1889, for a listing of the changes that have been made in this draft. The changes are marked with change bars in the PostScript form of this draft.

The revisions in this draft are mostly complete for Working Group last call; the open issues have been addressed:

o A fudge factor has been added to the RTCP unconditional reconsideration algorithm to compensate for the fact that it settles to a steady state bandwidth that is below the desired level.

o A new "bin" mechanism has been added to the algorithm for sampled storage of SSRC identifiers to avoid a temporary underestimate in group size when the group size is decreasing.

o The "reverse reconsideration" algorithm does not prevent the group size estimate from incorrectly dropping to zero for a short time when most participants of a large session leave at once but some remain. This has just been noted as only a secondary concern.
o Scaling of the minimum RTCP interval inversely proportional to the session bandwidth parameter has been added, but only in the direction of smaller intervals for higher bandwidth. Scaling to longer intervals for low bandwidths would cause a problem because this is an optional step. Some participants might be timed out prematurely if they scaled to a longer interval while others kept the nominal 5 seconds. The benefit of scaling longer was not considered great in any case.

o No change was specified for the jitter computation for media with several packets with the same timestamp. There is not a clear answer as to what should be done, or that any change would make a significant improvement.

o As proposed without objection at the Los Angeles IETF, definition of additional SDES items such as PHOTO URL and NICKNAME will be deferred to subsequent registration through IANA since that method has been established. This is in the spirit of minimizing changes to the protocol in the transition from Proposed to Draft.
o Nothing was added about allowing a translator to add its own random offsets to the sequence number and timestamp fields because it would likely cause more trouble than good.

1 Introduction

This memorandum specifies the real-time transport protocol (RTP), which provides end-to-end delivery services for data with real-time characteristics, such as interactive audio and video. Those services include payload type identification, sequence numbering, timestamping and delivery monitoring. Applications typically run RTP on top of UDP to make use of its multiplexing and checksum services; both protocols contribute parts of the transport protocol functionality. However, RTP may be used with other suitable underlying network or transport protocols (see Section 10). RTP supports data transfer to multiple destinations using multicast distribution if provided by the underlying network.

Note that RTP itself does not provide any mechanism to ensure timely delivery or provide other quality-of-service guarantees, but relies on lower-layer services to do so. It does not guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying network is reliable and delivers packets in sequence. The sequence numbers included in RTP allow the receiver to reconstruct the sender's packet sequence, but sequence numbers might also be used to determine the proper location of a packet, for example in video decoding, without necessarily decoding packets in sequence.

While RTP is primarily designed to satisfy the needs of multi-participant multimedia conferences, it is not limited to that particular application. Storage of continuous data, interactive distributed simulation, active badge, and control and measurement applications may also find RTP applicable.

This document defines RTP, consisting of two closely-linked parts:

o the real-time transport protocol (RTP), to carry data that has real-time properties.

o the RTP control protocol (RTCP), to monitor the quality of service and to convey information about the participants in an on-going session. The latter aspect of RTCP may be sufficient for "loosely controlled" sessions, i.e., where there is no explicit membership control and set-up, but it is not necessarily intended to support all of an application's control communication requirements. This functionality may be fully or partially subsumed by a separate session control protocol, which is beyond the scope of this document.

RTP represents a new style of protocol following the principles of application level framing and integrated layer processing proposed by Clark and Tennenhouse [1].
That is, RTP is intended to be malleable to provide the information required by a particular application and will often be integrated into the application processing rather than being implemented as a separate layer. RTP is a protocol framework that is deliberately not complete. This document specifies those functions expected to be common across all the applications for which RTP would be appropriate. Unlike conventional protocols in which additional functions might be accommodated by making the protocol more general or by adding an option mechanism that would require parsing, RTP is intended to be tailored through modifications and/or additions to the headers as needed. Examples are given in Sections 5.3 and 6.4.3.

Therefore, in addition to this document, a complete specification of RTP for a particular application will require one or more companion documents (see Section 12):

o a profile specification document, which defines a set of payload type codes and their mapping to payload formats (e.g., media encodings). A profile may also define extensions or modifications to RTP that are specific to a particular class of applications. Typically an application will operate under only one profile. A profile for audio and video data may be found in the companion RFC 1890.

o payload format specification documents, which define how a particular payload, such as an audio or video encoding, is to be carried in RTP.

A discussion of real-time services and algorithms for their implementation as well as background discussion on some of the RTP design decisions can be found in [2].

[Text of ietf-avt-rtp-new-00.txt removed in ietf-avt-rtp-new-01.txt:]

Several RTP applications, both experimental and commercial, have already been implemented from draft specifications. These applications include audio and video tools along with diagnostic tools such as traffic monitors. Users of these tools number in the thousands. However, the current Internet cannot yet support the full potential demand for real-time services. High-bandwidth services using RTP, such as video, can potentially seriously degrade the quality of service of other network services. Thus, implementors should take appropriate precautions to limit accidental bandwidth usage. Application documentation should clearly outline the limitations and possible operational impact of high-bandwidth real-time services on the Internet and other network services.

1.1 Changes

Most of this draft is identical to RFC 1889. The changes are listed below and are marked with change bars in the PostScript form of this draft. This section may become an appendix when the draft is published as an updated RFC, but it is included here at the front of the document at this point to encourage feedback on these changes.

o The algorithm for calculating the RTCP transmission interval specified in Sections 6.2 and 6.3 and illustrated in Appendix A.7 is augmented to include "reconsideration" to minimize transmission over the intended rate when many participants join a session simultaneously, and "reverse reconsideration" to reduce the incidence and duration of false participant timeouts when the number of participants drops rapidly.

o Section 6.3.7 specifies new rules controlling when an RTCP BYE packet should be sent in order to avoid a flood of packets when many participants leave a session simultaneously. Sections 7.2 and 7.3 specify that translators and mixers should send BYE packets for the sources they are no longer forwarding.

o An algorithm is specified in Sections 6.3.3 and 6.3.4 to allow storage of only a sampling of the participants' SSRC identifiers to allow scaling to very large sessions.

o Rule changes for layered encodings are defined in Sections 2.4, 6.3.9, 8.3 and 10.

o An indentation bug in the RFC 1889 printing of the pseudo-code for the collision detection and resolution algorithm in Section 8.2 is corrected, and the algorithm has been modified to remove the restriction that both RTP and RTCP must be sent from the same source port number.

o For unicast RTP sessions, distinct port pairs may be used for the two ends (Sections 3 and 7.1).

o It is specified that a receiver MUST ignore packets with payload types it does not understand.

o The reference for the UTF-8 character set was changed to be RFC 2044.

o Small clarifications of the text have been made in several places in response to questions from readers. In particular:

- A definition for "RTP media type" is given in Section 3 to allow the explanation of multiplexing RTP sessions in Section 5.2 to be more clear regarding the multiplexing of multiple media.

- The description of the session bandwidth parameter is expanded in Section 6.2.

- The method for padding RTCP packets is clarified in Section 6.4.

- The method for terminating and padding a sequence of SDES items is clarified in Section 6.5.

1.2 Open Issues

The revisions in this draft are not yet complete; first, there are some open issues regarding the changes that have been made:

o The RTCP timer reconsideration algorithm settles to a steady state bandwidth that is below the desired level. Can the algorithm compensate for this using a fudge factor?

o The algorithm for sampled storage of SSRC identifiers results in a temporary underestimate in group size (and an increase in the RTCP rate) by a factor of 1/2 or more when the group size is decreasing such that the mask size also decreases. This may require some mechanism to compensate.

o The "reverse reconsideration" algorithm does not prevent the group size estimate from incorrectly dropping to zero for a short time when most participants of a large session leave at once but some remain. The algorithm does make the estimate return to the correct value more rapidly. It may be possible to use a filter to slow the decrease in the estimate and prevent this problem, but that would also slow down the increase in the estimate for simultaneous joins, which is a problem. The incorrect drop to zero may be deemed only a secondary concern.

Second, there are also some changes which have been discussed and tentatively accepted in meetings of the Audio/Video Transport working group but have not yet been incorporated into this draft:

o Allowing RTCP sender and receiver bandwidths to be separate parameters of the session rather than a strict percentage of the session bandwidth. The defaults would retain the current values of 1.25% and 3.75%. This change would allow rate-adaptive applications to set an RTCP bandwidth consistent with a "typical" data bandwidth that is lower than the maximum bandwidth specified by the session bandwidth parameter. It would also allow RTCP reception reports to be turned off entirely for operation on unidirectional links. Correspondingly, the text requiring transmission of RTCP for multicast sessions needs to be generalized.

o Scaling the minimum RTCP interval inversely proportional to the session bandwidth parameter:

- to a larger value to help reduce the spike size on a step join when access links are slow (and the session bandwidth is therefore low);

- to provide sufficient time for a packet to arrive for conditional reconsideration;

- to a smaller value for high-rate multicast sessions to allow for faster inter-media synchronization. Since the simultaneous join flood is largely a function of the ratio of network delays to the minimum interval, the value should not be scaled much below the current 5 second minimum for receivers. However, senders could be allowed to transmit a higher RTCP bandwidth while still using the 5 second value when computing the interval for timeouts to avoid timing out receivers. A smaller value is also appropriate for unicast sessions.

o The text should consistently use the terms MUST, SHOULD, MAY as defined in RFC 2119.

Third, since the publication of RFC 1889, the following changes have been suggested but not yet discussed within the working group:

o For media with several packets with the same timestamp, the jitter computation should be done only for one packet (the first?).

o Define a photo URL item in SDES, which might be constrained to use by senders only. Such an addition could cause severe web server overload by triggering many simultaneous requests if used in a large multicast session.

o The specification of the NTP timestamp in the RTCP SR section says that when "relative" NTP timestamps are used they should be based on elapsed time from the start of the session. However, if the start times for the audio and video sessions are not the same, then the NTP timestamps won't be usable for synchronization. Should the base be changed to "system uptime," and if so, how should that be defined?

o The padding mechanism for RTCP packets is not exactly the same as for RTP packets because of the compound packet structure. This was not explained clearly enough, resulting in incorrect implementations. It is suggested that the current padding mechanism for RTCP packets (only) be deprecated. In its place, a new RTCP packet type "PAD" could be defined that is always to be ignored. That packet can take whatever length (in 32-bit words) is required for padding, assuming there is no need to pad to odd boundaries. The new mechanism would be backward compatible because older implementations should ignore the unknown PAD packet type.

o It is specified that sources should add random offsets to the sequence number and timestamp fields to make known-plaintext attacks on encryption more difficult, even if the source itself does not encrypt, because the packets may flow through a translator that does. However, the translator cannot depend upon the source to do this. Should the translator be allowed to add its own random offsets to these fields and the corresponding fields in RTCP packets?

o The discussion of security issues may need to be expanded. In particular, it has been recommended that the confidentiality mechanisms defined in this document should follow the same overall format as the IPSEC ESP work, unless there is some compelling reason not to.

[End of removed text]

1.1 Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [3] and indicate requirement levels for compliant RTP implementations.

2 RTP Use Scenarios

The following sections describe some aspects of the use of RTP. The examples were chosen to illustrate the basic operation of applications using RTP, not to limit what RTP may be used for. In these examples, RTP is carried on top of IP and UDP, and follows the conventions established by the profile for audio and video specified in the companion RFC 1890 (updated by Internet-Draft draft-ietf-avt-profile-new).

2.1 Simple Multicast Audio Conference

A working group of the IETF meets to discuss the latest protocol draft, using the IP multicast services of the Internet for voice communications. Through some allocation mechanism the working group chair obtains a multicast group address and pair of ports. One port is used for audio data, and the other is used for control (RTCP) packets. This address and port information is distributed to the intended participants. If privacy is desired, the data and control packets may be encrypted as specified in Section 9.1, in which case an encryption key must also be generated and distributed. The exact details of these allocation and distribution mechanisms are beyond the scope of RTP.

The audio conferencing application used by each conference participant sends audio data in small chunks of, say, 20 ms duration. Each chunk of audio data is preceded by an RTP header; RTP header and data are in turn contained in a UDP packet. The RTP header indicates what type of audio encoding (such as PCM, ADPCM or LPC) is contained in each packet so that senders can change the encoding during a conference, for example, to accommodate a new participant that is connected through a low-bandwidth link or react to indications of network congestion.

The Internet, like other packet networks, occasionally loses and reorders packets and delays them by variable amounts of time. To cope with these impairments, the RTP header contains timing information and a sequence number that allow the receivers to reconstruct the timing produced by the source, so that in this example, chunks of audio are contiguously played out the speaker every 20 ms. This timing reconstruction is performed separately for each source of RTP packets in the conference. The sequence number can also be used by the receiver to estimate how many packets are being lost.

Since members of the working group join and leave during the conference, it is useful to know who is participating at any moment and how well they are receiving the audio data. For that purpose, each instance of the audio application in the conference periodically multicasts a reception report plus the name of its user on the RTCP (control) port. The reception report indicates how well the current speaker is being received and may be used to control adaptive encodings. In addition to the user name, other identifying information may also be included subject to control bandwidth limits. A site sends the RTCP BYE packet (Section 6.6) when it leaves the conference.

2.2 Audio and Video Conference

If both audio and video media are used in a conference, they are transmitted as separate RTP sessions; RTCP packets are transmitted for each medium using two different UDP port pairs and/or multicast addresses. There is no direct coupling at the RTP level between the audio and video sessions, except that a user participating in both sessions should use the same distinguished (canonical) name in the RTCP packets for both so that the sessions can be associated.

One motivation for this separation is to allow some participants in the conference to receive only one medium if they choose. Further explanation is given in Section 5.2. Despite the separation, synchronized playback of a source's audio and video can be achieved using timing information carried in the RTCP packets for both sessions.

2.3 Mixers and Translators

So far, we have assumed that all sites want to receive media data in the same format. However, this may not always be appropriate. Consider the case where participants in one area are connected through a low-speed link to the majority of the conference participants who enjoy high-speed network access. Instead of forcing everyone to use a lower-bandwidth, reduced-quality audio encoding, an RTP-level relay called a mixer may be placed near the low-bandwidth area. This mixer resynchronizes incoming audio packets to reconstruct the constant 20 ms spacing generated by the sender, mixes these reconstructed audio streams into a single stream, translates the audio encoding to a lower-bandwidth one and forwards the lower-bandwidth packet stream across the low-speed link. These packets might be unicast to a single recipient or multicast on a different address to multiple recipients. The RTP header includes a means for mixers to identify the sources that contributed to a mixed packet so that correct talker indication can be provided at the receivers.

Some of the intended participants in the audio conference may be connected with high bandwidth links but might not be directly reachable via IP multicast. For example, they might be behind an application-level firewall that will not let any IP packets pass. For these sites, mixing may not be necessary, in which case another type of RTP-level relay called a translator may be used. Two translators are installed, one on either side of the firewall, with the outside one funneling all multicast packets received through a secure connection to the translator inside the firewall. The translator inside the firewall sends them again as multicast packets to a multicast group restricted to the site's internal network.

Mixers and translators may be designed for a variety of purposes. An example is a video mixer that scales the images of individual people in separate video streams and composites them into one video stream to simulate a group scene. Other examples of translation include the connection of a group of hosts speaking only IP/UDP to a group of hosts that understand only ST-II, or the packet-by-packet encoding translation of video streams from individual sources without resynchronization or mixing. Details of the operation of mixers and translators are given in Section 7.

2.4 Layered Encodings

Multimedia applications should be able to adjust the transmission rate to match the capacity of the receiver or to adapt to network congestion. Many implementations place the responsibility of rate-adaptivity at the source. This does not work well with multicast transmission because of the conflicting bandwidth requirements of heterogeneous receivers. The result is often a least-common denominator scenario, where the smallest pipe in the network mesh dictates the quality and fidelity of the overall live multimedia "broadcast".

Instead, responsibility for rate-adaptation can be placed at the receivers by combining a layered encoding with a layered transmission system. In the context of RTP over IP multicast, the source can stripe the progressive layers of a hierarchically represented signal across multiple RTP sessions each carried on its own multicast group. Receivers can then adapt to network heterogeneity and control their reception bandwidth by joining only the appropriate subset of the multicast groups.

Details of the use of RTP with layered encodings are given in Sections 6.3.9, 8.3 and 10.
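The receiver-side choice described in Section 2.4 can be illustrated with a small sketch. It assumes that layers are cumulative (an enhancement layer is only useful if all lower layers are also received) and that the receiver has some bandwidth budget in kb/s; the group addresses, layer rates and the function name are hypothetical and are not defined by this draft.

```python
from typing import List, Tuple

# Hypothetical layered stream: (multicast group, layer bandwidth in kb/s),
# ordered from the base layer upward. Addresses and rates are made up.
LAYERS: List[Tuple[str, int]] = [
    ("224.2.0.1", 32),   # base layer
    ("224.2.0.2", 64),   # first enhancement layer
    ("224.2.0.3", 128),  # second enhancement layer
]


def groups_to_join(layers: List[Tuple[str, int]], budget_kbps: int) -> List[str]:
    """Return the groups for the longest prefix of layers that fits the budget."""
    joined: List[str] = []
    used = 0
    for group, rate in layers:
        if used + rate > budget_kbps:
            # Assumed cumulative layering: stop at the first layer that
            # does not fit, since higher layers depend on it.
            break
        joined.append(group)
        used += rate
    return joined
```

For example, `groups_to_join(LAYERS, 100)` selects only the first two groups, because adding the third layer would exceed the budget. A real receiver would more likely drive this decision from observed loss than from a fixed budget.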
3 Definitions

RTP payload: The data transported by RTP in a packet, for example audio samples or compressed video data. The payload format and interpretation are beyond the scope of this document.

RTP packet: A data packet consisting of the fixed RTP header, a possibly empty list of contributing sources (see below), and the payload data. Some underlying protocols may require an encapsulation of the RTP packet to be defined. Typically one packet of the underlying protocol contains a single RTP packet, but several RTP packets may be contained if permitted by the encapsulation method (see Section 10).

RTCP packet: A control packet consisting of a fixed header part similar to that of RTP data packets, followed by structured elements that vary depending upon the RTCP packet type. The formats are defined in Section 6. Typically, multiple RTCP packets are sent together as a compound RTCP packet in a single packet of the underlying protocol; this is enabled by the length field in the fixed header of each RTCP packet.
Port: The "abstraction that transport protocols use to distinguish among multiple destinations within a given host computer. TCP/IP protocols identify ports using small positive integers." [4] The transport selectors (TSEL) used by the OSI transport layer are equivalent to ports. RTP depends upon the lower-layer protocol to provide some mechanism such as ports to multiplex the RTP and RTCP packets of a session.

Transport address: The combination of a network address and port that identifies a transport-level endpoint, for example an IP address and a UDP port. Packets are transmitted from a source transport address to a destination transport address.

RTP media type: An RTP media type is the collection of payload types which can be carried within a single RTP session. The RTP Profile assigns RTP media types to RTP payload types.

RTP session: The association among a set of participants communicating with RTP.
For each participant, the session is defined by a particular pair of destination transport addresses (one network address plus a port pair for RTP and RTCP). The destination transport address pair may be common for all participants, as in the case of IP multicast, or may be different for each, as in the case of individual unicast network addresses and port pairs. In a multimedia session, each medium is carried in a separate RTP session with its own RTCP packets. The multiple RTP sessions are distinguished by different port number pairs and/or different multicast addresses.

Synchronization source (SSRC): The source of a stream of RTP packets, identified by a 32-bit numeric SSRC identifier carried in the RTP header so as not to be dependent upon the network address. All packets from a synchronization source form part of the same timing and sequence number space, so a receiver groups packets by synchronization source for playback.
Examples of synchronization sources include the sender of a stream of packets derived from a signal source such as a microphone or a camera, or an RTP mixer (see below). A synchronization source may change its data format, e.g., audio encoding, over time. The SSRC identifier is a randomly chosen value meant to be globally unique within a particular RTP session (see Section 8). A participant need not use the same SSRC identifier for all the RTP sessions in a multimedia session; the binding of the SSRC identifiers is provided through RTCP (see Section 6.5.1). If a participant generates multiple streams in one RTP session, for example from separate video cameras, each must be identified as a different SSRC.

Contributing source (CSRC): A source of a stream of RTP packets that has contributed to the combined stream produced by an RTP mixer (see below). The mixer inserts a list of the SSRC identifiers of the sources that contributed to the generation of a particular packet into the RTP header of that packet. This list is called the CSRC list.
An example application is audio conferencing where a mixer indicates all the talkers whose speech was combined to produce the outgoing packet, allowing the receiver to indicate the current talker, even though all the audio packets contain the same SSRC identifier (that of the mixer).

End system: An application that generates the content to be sent in RTP packets and/or consumes the content of received RTP packets. An end system can act as one or more synchronization sources in a particular RTP session, but typically only one.

Mixer: An intermediate system that receives RTP packets from one or more sources, possibly changes the data format, combines the packets in some manner and then forwards a new RTP packet. Since the timing among multiple input sources will not generally be synchronized, the mixer will make timing adjustments among the streams and generate its own timing for the combined stream. Thus, all data packets originating from a mixer will be identified as having the mixer as their synchronization source.

Translator: An intermediate system that forwards RTP packets with their synchronization source identifier intact. Examples of translators include devices that convert encodings without mixing, replicators from multicast to unicast, and application-level filters in firewalls.

Monitor: An application that receives RTCP packets sent by participants in an RTP session, in particular the reception reports, and estimates the current quality of service for distribution monitoring, fault diagnosis and long-term statistics. The monitor function is likely to be built into the application(s) participating in the session, but may also be a separate application that does not otherwise participate and does not send or receive the RTP data packets.
These are called third party monitors.

Non-RTP means: Protocols and mechanisms that may be needed in addition to RTP to provide a usable service. In particular, for multimedia conferences, a conference control application may distribute multicast addresses and keys for encryption, negotiate the encryption algorithm to be used, and define dynamic mappings between RTP payload type values and the payload formats they represent for formats that do not have a predefined payload type value. For simple applications, electronic mail or a conference database may also be used. The specification of such protocols and mechanisms is outside the scope of this document.

4 Byte Order, Alignment, and Time Format

All integer fields are carried in network byte order, that is, most significant byte (octet) first. This byte order is commonly known as big-endian.
The transmission order is described in detail in [5]. Unless otherwise noted, numeric constants are in decimal (base 10).

All header data is aligned to its natural length, i.e., 16-bit fields are aligned on even offsets, 32-bit fields are aligned at offsets divisible by four, etc. Octets designated as padding have the value zero.

Wallclock time (absolute date and time) is represented using the timestamp format of the Network Time Protocol (NTP), which is in seconds relative to 0h UTC on 1 January 1900 [6]. The full resolution NTP timestamp is a 64-bit unsigned fixed-point number with the integer part in the first 32 bits and the fractional part in the last 32 bits. In some fields where a more compact representation is appropriate, only the middle 32 bits are used; that is, the low 16 bits of the integer part and the high 16 bits of the fractional part. The high 16 bits of the integer part must be determined independently.

5 RTP Data Transfer Protocol

5.1 RTP Fixed Header Fields

The RTP header has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The first twelve octets are present in every RTP packet, while the list of CSRC identifiers is present only when inserted by a mixer. The fields have the following meaning:

version (V): 2 bits
This field identifies the version of RTP. The version defined by this specification is two (2). (The value 1 is used by the first draft version of RTP and the value 0 is used by the protocol initially implemented in the "vat" audio tool.)

padding (P): 1 bit
If the padding bit is set, the packet contains one or more additional padding octets at the end which are not part of the payload. The last octet of the padding contains a count of how many padding octets should be ignored, including itself. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit.

extension (X): 1 bit
If the extension bit is set, the fixed header is followed by exactly one header extension, with a format defined in Section 5.3.1.

CSRC count (CC): 4 bits
The CSRC count contains the number of CSRC identifiers that follow the fixed header.

marker (M): 1 bit
The interpretation of the marker is defined by a profile. It is intended to allow significant events such as frame boundaries to be marked in the packet stream.
A profile may define additional marker bits or specify that there is no marker bit by changing the number of bits in the payload type field (see Section 5.3).

payload type (PT): 7 bits
This field identifies the format of the RTP payload and determines its interpretation by the application. A profile specifies a default static mapping of payload type codes to payload formats. Additional payload type codes may be defined dynamically through non-RTP means (see Section 3). An initial set of default mappings for audio and video is specified in the companion RFC 1890 (updated by Internet-Draft draft-ietf-avt-profile-new), and may be extended in future editions of the Assigned Numbers RFC [7].
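The fields described so far (V, P, X, CC, M and PT) occupy the first two octets of the header. A minimal sketch of unpacking them follows, assuming the default layout with one marker bit and a 7-bit payload type; a profile may redefine the second octet, in which case the last two entries would not apply. The function name is illustrative only.

```python
def parse_first_two_octets(header: bytes) -> dict:
    """Decode V, P, X, CC, M and PT from the first two octets of an RTP header."""
    if len(header) < 2:
        raise ValueError("need at least two octets")
    b0, b1 = header[0], header[1]
    return {
        "version": b0 >> 6,            # 2 bits, most significant
        "padding": bool(b0 & 0x20),    # 1 bit
        "extension": bool(b0 & 0x10),  # 1 bit
        "csrc_count": b0 & 0x0F,       # 4 bits
        "marker": bool(b1 & 0x80),     # 1 bit, most significant bit of octet 2
        "payload_type": b1 & 0x7F,     # 7 bits
    }
```

A receiver would typically check that the version equals 2 before trusting any of the remaining fields.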
An RTP sender emits a single RTP payload type at any given time; this field is not intended for multiplexing separate media streams (see Section 5.2). A receiver MUST ignore packets with payload types that it does not understand.

sequence number: 16 bits
The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence. The initial value of the sequence number SHOULD be random (unpredictable) to make known-plaintext attacks on encryption more difficult, even if the source itself does not encrypt according to the method in Section 9.1, because the packets may flow through a translator that does. Techniques for choosing unpredictable numbers are discussed in [8].

timestamp: 32 bits
The timestamp reflects the sampling instant of the first octet in the RTP data packet. The sampling instant must be derived from a clock that increments monotonically and linearly in time to allow synchronization and jitter calculations (see Section 6.4.1). The resolution of the clock must be sufficient for the desired synchronization accuracy and for measuring packet arrival jitter (one tick per video frame is typically not sufficient).
The clock frequency is dependent on the format of data carried as payload and is specified statically in the profile or payload format specification that defines the format, or may be specified dynamically for payload formats defined through non-RTP means. If RTP packets are generated periodically, the nominal sampling instant as determined from the sampling clock is to be used, not a reading of the system clock. As an example, for fixed-rate audio the timestamp clock would likely increment by one for each sampling period. If an audio application reads blocks covering 160 sampling periods from the input device, the timestamp would be increased by 160 for each such block, regardless of whether the block is transmitted in a packet or dropped as silent.

The initial value of the timestamp is random, as for the sequence number. Several consecutive RTP packets may have equal timestamps if they are (logically) generated at once, e.g., belong to the same video frame. Consecutive RTP packets may contain timestamps that are not monotonic if the data is not transmitted in the order it was sampled, as in the case of MPEG interpolated video frames. (The sequence numbers of the packets as transmitted will still be monotonic.)

SSRC: 32 bits
The SSRC field identifies the synchronization source. This identifier is chosen randomly, with the intent that no two synchronization sources within the same RTP session will have the same SSRC identifier. An example algorithm for generating a random identifier is presented in Appendix A.6.
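The sender-side rules above for the sequence number, timestamp and SSRC can be sketched as follows. This is an illustration only: it draws random values from the operating system through Python's `secrets` module rather than using the Appendix A.6 algorithm, and the class name and the default of 160 samples per block are assumptions taken from the audio example in the text.

```python
import secrets


class SenderState:
    """Tracks per-source header values for one RTP stream (hypothetical helper)."""

    def __init__(self, samples_per_block: int = 160):
        self.samples_per_block = samples_per_block
        self.ssrc = secrets.randbits(32)       # random SSRC identifier
        self.seq = secrets.randbits(16)        # random initial sequence number
        self.timestamp = secrets.randbits(32)  # random initial timestamp

    def next_block(self, send: bool):
        """Advance by one block; return (seq, timestamp, ssrc) if the block is sent."""
        fields = (self.seq, self.timestamp, self.ssrc) if send else None
        # The timestamp advances for every block, sent or suppressed as silence.
        self.timestamp = (self.timestamp + self.samples_per_block) % 2**32
        if send:
            # The sequence number counts only packets actually sent.
            self.seq = (self.seq + 1) % 2**16
        return fields
```

After a suppressed block, the next packet sent carries a sequence number one higher than the previous packet but a timestamp 320 higher, which is how a receiver can tell a silence gap from packet loss.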
Although the probability of multiple sources choosing the same identifier is low, all RTP implementations must be prepared to detect and resolve collisions. Section 8 describes the probability of collision along with a mechanism for resolving collisions and detecting RTP-level forwarding loops based on the uniqueness of the SSRC identifier. If a source changes its source transport address, it must also choose a new SSRC identifier to avoid being interpreted as a looped source (see Section 8.2).

CSRC list: 0 to 15 items, 32 bits each
The CSRC list identifies the contributing sources for the payload contained in this packet.
The number of identifiers is given by the CC field. If there are more than 15 contributing sources, only 15 may be identified. CSRC identifiers are inserted by mixers, using the SSRC identifiers of contributing sources. For example, for audio packets the SSRC identifiers of all sources that were mixed together to create a packet are listed, allowing correct talker indication at the receiver.

5.2 Multiplexing RTP Sessions

For efficient protocol processing, the number of multiplexing points should be minimized, as described in the integrated layer processing design principle [1]. In RTP, multiplexing is provided by the destination transport address (network address and port number), which defines an RTP session. For example, in a teleconference composed of audio and video media encoded separately, each medium should be carried in a separate RTP session with its own destination transport address.
It is not intended that the audio and video streams be carried in a single RTP session and demultiplexed based on the payload type or SSRC fields. Interleaving packets with different RTP media types but using the same SSRC would introduce several problems:

1. If, say, two audio streams shared the same RTP session and the same SSRC value, and one were to change encodings and thus acquire a different RTP payload type, there would be no general way of identifying which stream had changed encodings.

2. An SSRC is defined to identify a single timing and sequence number space. Interleaving multiple payload types would require different timing spaces if the media clock rates differ and would require different sequence number spaces to tell which payload type suffered packet loss.

3. The RTCP sender and receiver reports (see Section 6.4) can only describe one timing and sequence number space per SSRC and do not carry a payload type field.

4. An RTP mixer would not be able to combine interleaved streams of incompatible media into one stream.

5. Carrying multiple media in one RTP session precludes: the use of different network paths or network resource allocations if appropriate; reception of a subset of the media if desired, for example just audio if video would exceed the available bandwidth; and receiver implementations that use separate processes for the different media, whereas using separate RTP sessions permits either single- or multiple-process implementations.

Using a different SSRC for each medium but sending them in the same RTP session would avoid the first three problems but not the last two.

5.3 Profile-Specific Modifications to the RTP Header

The existing RTP data packet header is believed to be complete for the set of functions required in common across all the application classes that RTP might support. However, in keeping with the ALF design principle, the header may be tailored through modifications or additions defined in a profile specification while still allowing profile-independent monitoring and recording tools to function.

o The marker bit and payload type field carry profile-specific information, but they are allocated in the fixed header since many applications are expected to need them and might otherwise have to add another 32-bit word just to hold them. The octet containing these fields may be redefined by a profile to suit different requirements, for example with more or fewer marker bits. If there are any marker bits, one should be located in the most significant bit of the octet since profile-independent monitors may be able to observe a correlation between packet loss patterns and the marker bit.

o Additional information that is required for a particular payload format, such as a video encoding, should be carried in the payload section of the packet. This might be in a header that is always present at the start of the payload section, or might be indicated by a reserved value in the data pattern.

o If a particular class of applications needs additional functionality independent of payload format, the profile under which those applications operate should define additional fixed fields to follow immediately after the SSRC field of the existing fixed header. Those applications will be able to quickly and directly access the additional fields while profile-independent monitors or recorders can still process the RTP packets by interpreting only the first twelve octets.

If it turns out that additional functionality is needed in common across all profiles, then a new version of RTP should be defined to make a permanent change to
the fixed header.

5.3.1 RTP Header Extension

An extension mechanism is provided to allow individual implementations to experiment with new payload-format-independent functions that require additional information to be carried in the RTP data packet header. This mechanism is designed so that the header extension may be ignored by other interoperating implementations that have not been extended.

Note that this header extension is intended only for limited use. Most potential uses of this mechanism would be better done another way, using the methods described in the previous section. For example, a profile-specific extension to the fixed header is less expensive to process because it is not conditional nor in a variable location. Additional information required for a particular payload format should not use this header extension, but should be carried in the payload section of the packet.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      defined by profile       |           length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        header extension                       |
|                             ....                              |

If the X bit in the RTP header is one, a variable-length header extension is appended to the RTP header, following the CSRC list if present. The header extension contains a 16-bit length field that counts the number of 32-bit words in the extension, excluding the four-octet extension header (therefore zero is a valid length). Only a single extension may be appended to the RTP data header. To allow multiple interoperating implementations to each experiment independently with different header extensions, or to allow a particular implementation to experiment with more than one type of header extension, the first 16 bits of the header extension are left open for distinguishing identifiers or parameters. The format of these 16 bits is to be defined by the profile specification under which the implementations are operating. This RTP specification does not define any header extensions itself.

6 RTP Control Protocol -- RTCP

The RTP control protocol (RTCP) is
based on the periodic transmission of control packets to all participants in the session, using the same distribution mechanism as the data packets. The underlying protocol must provide multiplexing of the data and control packets, for example using separate port numbers with UDP. RTCP performs four functions:

1. The primary function is to provide feedback on the quality of the data distribution. This is an integral part of RTP's role as a transport protocol and is related to the flow and congestion control functions of other transport protocols. The feedback may be directly useful for control of adaptive encodings [9,10], but experiments with IP multicasting have shown that it is also critical to get feedback from the receivers to diagnose faults in the distribution. Sending reception feedback reports to all participants allows one who is observing problems to evaluate whether those problems are local or global. With a distribution mechanism like IP multicast, it is also possible for an entity such as a network service provider who is not otherwise involved in the session to receive the feedback information and act as a third-party monitor to diagnose network problems. This feedback function is performed by the RTCP sender and receiver reports, described below in Section 6.4.

2. RTCP carries a persistent transport-level identifier for an RTP source called the canonical name or CNAME, Section 6.5.1. Since the SSRC identifier may change if a conflict is discovered or a program is restarted, receivers require the CNAME to keep track of each participant. Receivers may also require the CNAME to associate multiple data streams from a given participant in a set of related RTP sessions, for example to synchronize audio and video. Inter-media synchronization also requires the NTP and RTP timestamps included in RTCP packets by data senders.

3. The first two functions require that all participants send RTCP packets, therefore the rate must be controlled in order for RTP to scale up to a large number of participants. By having each participant send its control packets to all the others, each can independently observe the number of participants. This number is used to calculate the rate at which the packets are sent, as explained in Section 6.2.

4. A fourth, optional function is to convey minimal session control information, for example participant identification to be displayed in the user interface. This is most likely to be useful in "loosely controlled" sessions where participants enter and leave without membership control or parameter negotiation. RTCP serves as a convenient channel to reach all the participants, but it is not necessarily expected to support all the control communication requirements of an application. A higher-level session control protocol, which is beyond the scope of this document, may be needed.

Functions 1-3 SHOULD be used in all environments, but particularly in the IP multicast environment. RTP application designers SHOULD avoid mechanisms that can only work in unicast mode and will not scale to larger numbers. Transmission of RTCP MAY be controlled separately for senders and receivers for cases such as unidirectional links where feedback from receivers is not possible.
6.1 RTCP Packet Format

This specification defines several RTCP packet types to carry a variety of control information:

SR: Sender report, for transmission and reception statistics from participants that are active senders

RR: Receiver report, for reception statistics from participants that are not active senders

SDES: Source description items, including CNAME

BYE: Indicates end of participation

APP: Application specific functions

Each RTCP packet begins with a fixed part similar to that of RTP data packets, followed by structured elements that may be of variable length according to the packet type but always end on a 32-bit boundary. The alignment requirement and a length field in the fixed part of each packet are included to make RTCP packets "stackable". Multiple RTCP packets may be concatenated without any intervening separators to form a compound RTCP packet that is sent in a single packet of the lower layer protocol, for example UDP. There is no explicit count of individual RTCP packets in the compound packet since the lower layer protocols are expected to provide an overall length to determine the end of the compound packet.

Each individual RTCP packet in the compound packet may be processed independently with no requirements upon the order or combination of packets. However, in order to perform the functions of the protocol, the following constraints are imposed:

o Reception statistics (in SR or RR) should be sent as often as bandwidth constraints will allow to maximize the resolution of the statistics, therefore each periodically transmitted compound RTCP packet should include a report packet.

o New receivers need to receive the CNAME for a source as soon as possible to identify the source and to begin associating media for purposes such as lip-sync, so each compound RTCP packet should also include the SDES CNAME.

o The number of packet types that may appear first in the compound packet should be limited to increase the number of constant bits in the first word and the probability of successfully validating RTCP packets against misaddressed RTP data packets or other unrelated packets.

Thus, all RTCP packets MUST be sent in a compound packet of at least two individual packets, with the following format recommended:

Encryption prefix: If and only if the compound packet is to be encrypted according to the method in Section 9.1, it MUST be prefixed by a random 32-bit quantity redrawn for every compound packet transmitted. If padding is required for the encryption, it MUST be added to the last packet of the compound packet.

SR or RR: The first RTCP packet in the compound packet MUST always be a report packet to facilitate header validation as described in Appendix A.2. This is true even if no data has been sent nor received, in which case an empty RR is sent, and even if the only other RTCP packet in the compound packet is a BYE.

Additional RRs: If the number of sources for which reception statistics are being reported exceeds 31, the number that will fit into one SR or RR packet, then additional RR packets should follow the initial report packet.

SDES: An SDES packet containing a CNAME item must be included in each compound RTCP packet. Other source description items may optionally be included if required by a particular application, subject to bandwidth constraints (see Section 6.3.9).
BYE or APP: Other RTCP packet types, including those yet to be defined, may follow in any order, except that BYE should be the last packet sent with a given SSRC/CSRC. Packet types may appear more than once.

It is advisable for translators and mixers to combine individual RTCP packets from the multiple sources they are forwarding into one compound packet whenever feasible in order to amortize the packet overhead (see Section 7). An example RTCP compound packet as might be produced by a mixer is shown in Fig. 1. If the overall length of a compound packet would exceed the maximum transmission unit (MTU) of the network path, it may be segmented into multiple shorter compound packets to be transmitted in separate packets of the underlying protocol. Note that each of the compound packets must begin with an SR or RR packet.

An implementation may ignore incoming RTCP packets with types unknown to it. Additional RTCP packet types may be registered with the Internet Assigned Numbers Authority (IANA).

if encrypted: random 32-bit integer
 |
 |[------- packet -------][----------- packet -----------][-packet-]
 |
 |              receiver            chunk        chunk
 V              reports           item  item    item  item
--------------------------------------------------------------------
|R[SR|# sender #site#site][SDES|# CNAME PHONE |#CNAME LOC][BYE##why]
|R[  |# report #  1 #  2 ][    |#             |#         ][   ##   ]
|R[  |#        #    #    ][    |#             |#         ][   ##   ]
|R[  |#        #    #    ][    |#             |#         ][   ##   ]
--------------------------------------------------------------------
|<------------------  UDP packet (compound packet) --------------->|

#: SSRC/CSRC

Figure 1: Example of an RTCP compound packet

6.2 RTCP Transmission Interval

RTP is designed to allow an application to scale automatically over session sizes ranging from a few participants to thousands.
For example, in an audio conference the data traffic is inherently self-limiting because only one or two people will speak at a time, so with multicast distribution the data rate on any given link remains relatively constant independent of the number of participants. However, the control traffic is not self-limiting. If the reception reports from each participant were sent at a constant rate, the control traffic would grow linearly with the number of participants. Therefore, the rate must be scaled down by dynamically calculating the interval between RTCP packet transmissions.

For each session, it is assumed that the data traffic is subject to an aggregate limit called the "session bandwidth" to be divided among the participants. This bandwidth might be reserved and the limit enforced by the network. If there is no reservation, there may be other constraints, depending on the environment, that establish the "reasonable" maximum for the session to use, and that would be the session bandwidth. The session bandwidth may be chosen based on some cost or a priori knowledge of the available network bandwidth for the session. It is somewhat independent of the media encoding, but the encoding choice may be limited by the session bandwidth.
Often, the session bandwidth is the sum of the nominal bandwidths of the senders expected to be concurrently active. For teleconference audio, this number would typically be one sender's bandwidth. For layered encodings, each layer is a separate RTP session with its own session bandwidth parameter.

The session bandwidth parameter is expected to be supplied by a session management application when it invokes a media application, but media applications may also set a default based on the single-sender data bandwidth for the encoding selected for the session. The application may also enforce bandwidth limits based on multicast scope rules or other criteria.

Bandwidth calculations for control and data traffic include lower-layer transport and network protocols (e.g., UDP and IP) since that is what the resource reservation system would need to know. The application can also be expected to know which of these protocols are in use. Link level headers are not included in the calculation since the packet will be encapsulated with different link level headers as it travels.

The control traffic should be limited to a small and known fraction of the session bandwidth: small so that the primary function of the transport protocol to carry data is not impaired; known so that the control traffic can be included in the bandwidth specification given to a resource reservation protocol, and so that each participant can independently calculate its share.
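As an illustration of the accounting described above, the following minimal sketch computes one sender's contribution to the session bandwidth with the UDP and IP headers counted and link level headers left out. It is not part of the specification. The function name, the choice of IPv4 without options, and the 64 kb/s audio stream sent as 20 ms packets are assumptions made only for this example.

```python
# Header sizes in octets. IPv4 without options is an assumption of this
# sketch; other network layers would use a different constant.
IPV4_HEADER_OCTETS = 20
UDP_HEADER_OCTETS = 8
RTP_FIXED_HEADER_OCTETS = 12  # fixed RTP header, no CSRC list, no extension


def sender_bandwidth_bps(payload_octets, packets_per_second, csrc_count=0):
    """Bandwidth of one sender in bits per second.

    Counts the RTP, UDP and IP headers, but no link level headers, since
    those change as the packet travels.
    """
    packet_octets = (IPV4_HEADER_OCTETS + UDP_HEADER_OCTETS
                     + RTP_FIXED_HEADER_OCTETS + 4 * csrc_count
                     + payload_octets)
    return packet_octets * 8 * packets_per_second


if __name__ == "__main__":
    # Hypothetical stream: 160 payload octets every 20 ms (50 packets/s),
    # i.e. 64 kb/s of audio payload before any headers are added.
    print(sender_bandwidth_bps(payload_octets=160, packets_per_second=50))
```

Under these assumptions each packet is 200 octets, so the 64 kb/s payload stream counts as 80 kb/s toward the session bandwidth.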
It is RECOMMENDED that the fraction of the session bandwidth allocated to RTCP be fixed at 5%. It is also RECOMMENDED that 1/4 of the RTCP bandwidth be dedicated to participants that are sending data so that in sessions with a large number of receivers but a small number of senders, newly joining participants will more quickly receive the CNAME for the sending sites. When the proportion of senders is greater than 1/4 of the participants, the senders get their proportion of the full RTCP bandwidth. While the values of these and other constants in the interval calculation are not critical, all participants in the session MUST use the same values so the same interval will be calculated. Therefore, these constants should be fixed for a particular profile.

A profile MAY specify that the control traffic bandwidth may be a separate parameter of the session rather than a strict percentage of the session bandwidth. Using a separate parameter allows rate-adaptive applications to set an RTCP bandwidth consistent with a "typical" data bandwidth that is lower than the maximum bandwidth specified by the session bandwidth parameter.

The profile MAY further specify that the control traffic bandwidth may be divided into two separate session parameters for those participants which are active data senders and those which are not. Following the recommendation that 1/4 of the RTCP bandwidth be dedicated to data senders, the RECOMMENDED default values for these two parameters would be 1.25% and 3.75%, respectively. When the proportion of senders is greater than 1/4 of the participants, the senders get their proportion of the sum of these parameters. Using two parameters allows RTCP reception reports to be turned off entirely for a particular session by setting the RTCP bandwidth for non-data-senders to zero while keeping the RTCP bandwidth for data senders non-zero so that sender reports can still be sent for inter-media synchronization. This may be appropriate for systems operating on unidirectional links or for sessions that don't require feedback on the quality of reception.

The calculated interval between transmissions of compound RTCP packets SHOULD also have a lower bound to avoid having bursts of packets exceed the allowed bandwidth when the number of participants is small and the traffic isn't smoothed according to the law of large numbers. It also keeps the report interval from becoming too small during transient outages like a network partition such that adaptation is delayed when the partition heals. At application startup, a delay SHOULD be imposed before the first compound RTCP packet is sent to allow time for RTCP packets to be received from other participants so the report interval will converge to the correct value more quickly. This delay MAY be set to half the minimum interval to allow quicker notification that the new participant is present. The RECOMMENDED value for a fixed minimum interval is 5 seconds.

An implementation MAY scale the minimum RTCP interval to a smaller value inversely proportional to the session bandwidth parameter with the following limitations:

o For multicast sessions, only active data senders MAY use the reduced minimum value to calculate the interval for transmission of compound RTCP packets.

o For unicast sessions, the reduced value MAY be used by participants that are not active data senders as well, and the delay before sending the initial compound RTCP packet may be zero.

o For all sessions, the fixed minimum SHOULD be used when calculating the participant timeout interval (see Section 6.3.5) so that implementations which do not use the reduced value for transmitting RTCP packets are not timed out by other participants prematurely.

o The RECOMMENDED value for the reduced minimum in seconds is 360 divided by the session bandwidth in kilobits/second. This minimum is smaller than 5 seconds for bandwidths greater than 72 kb/s.

The algorithm described in Section 6.3 and Appendix A.7 was designed to meet the goals outlined above. It calculates the interval between sending compound RTCP packets to divide the allowed control traffic bandwidth among the participants. This allows an application to provide fast response for small sessions where, for example, identification of all participants is important, yet automatically adapt to large sessions. The algorithm incorporates the following characteristics:

o The calculated interval between RTCP packets scales linearly with the number of members in the group. It is this linear factor which allows for a constant amount of control traffic when summed across all members.
o The interval between RTCP packets is varied randomly over the range [0.5,1.5] times the calculated interval to avoid unintended synchronization of all participants [11]. The first RTCP packet sent after joining a session is also delayed by a random variation of half the minimum RTCP interval.

o A dynamic estimate of the average compound RTCP packet size is calculated, including all those received and sent, to automatically adapt to changes in the amount of control information carried.

o Since the calculated interval is dependent on the number of observed group members, there may be undesirable startup effects when a new user joins an existing session, or many users simultaneously join a new session. These new users will initially have incorrect estimates of the group membership, and thus their RTCP transmission interval will be too short. This problem can be significant if many users join the
session simultaneously. To deal with this, an algorithm called "timer reconsideration" is employed. This algorithm implements a simple back-off mechanism which causes users to hold back RTCP packet transmission if the group sizes are increasing.

o When users leave a session, either with a BYE or by timeout, the group membership decreases, and thus the calculated interval should decrease. A "reverse reconsideration" algorithm is used to allow members to more quickly reduce their intervals in response to group membership decreases.

o BYE packets are given different treatment than other RTCP packets. When a user leaves a group, and wishes to send a BYE packet, it may do so before its next scheduled RTCP packet. However, transmission of BYE's follows a back-off algorithm which avoids floods of BYE packets should a large number of members simultaneously leave the session.

This algorithm may be used for sessions in which all participants are allowed to send. In that case, the session bandwidth parameter is the product of the individual sender's bandwidth times the number of participants, and the RTCP bandwidth is 5% of that.

Details of the algorithm's operation are given in the sections that follow. Appendix A.7 gives an example implementation.
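The minimum-interval rules and the all-participants-send bandwidth rule described in this section can be illustrated with a short sketch. It is not taken from Appendix A.7; the function and parameter names are hypothetical, and capping the reduced value at the fixed 5 second minimum is an assumption based on the statement that the reduced value is only smaller than 5 seconds above 72 kb/s.

```python
FIXED_MIN_INTERVAL = 5.0  # seconds, the RECOMMENDED fixed minimum


def rtcp_min_interval(session_bw_kbps, is_sender, multicast=True):
    """Minimum RTCP interval in seconds for one participant.

    Assumption: the reduced minimum never exceeds the fixed minimum.
    """
    if session_bw_kbps <= 0:
        return FIXED_MIN_INTERVAL
    # In multicast sessions only active data senders may use the reduced value.
    if multicast and not is_sender:
        return FIXED_MIN_INTERVAL
    # Reduced minimum: 360 divided by the session bandwidth in kb/s.
    return min(FIXED_MIN_INTERVAL, 360.0 / session_bw_kbps)


def all_senders_rtcp_bw(per_sender_bw, participants, fraction=0.05):
    """RTCP bandwidth when every participant may send: 5% of the product
    of the individual sender's bandwidth and the number of participants."""
    return fraction * per_sender_bw * participants
```

For the participant timeout calculation the fixed minimum would still be used, regardless of what this helper returns for transmission.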
6.2.1 Maintaining the number of session members

Calculation of the RTCP packet interval depends upon an estimate of the number of sites participating in the session. New sites are added to the count when they are heard, and an entry for each SHOULD be created in a table indexed by the SSRC or CSRC identifier (see Section 8.2) to keep track of them. New entries MAY be considered not valid until multiple packets carrying the new SSRC have been received (see Appendix A.1). Entries MAY be deleted from the table when an RTCP BYE packet with the corresponding SSRC identifier is received, except that some straggler data packets might arrive after the BYE and cause the entry to be recreated. Instead, the entry should be marked as having received a BYE and then deleted after an appropriate delay.

A participant may mark another site inactive, or delete it if not yet valid, if no RTP or RTCP packet has been received for a small number of RTCP report intervals (5 is suggested). This provides some robustness against packet loss. All sites must calculate roughly the same value for the RTCP report interval in order for this timeout to work properly.

Once a site has been validated, then if it is
later marked inactive the state for that site should still be retained and the site should continue to be counted in the total number of sites sharing RTCP bandwidth for a period long enough to span typical network partitions. This is to avoid excessive traffic, when the partition heals, due to an RTCP report interval that is too small. A timeout of 30 minutes is suggested. Note that this is still larger than 5 times the largest value to which the RTCP report interval is expected to usefully scale, about 2-5 minutes.

For sessions with a very large number of participants, it may be impractical to maintain a table to store the SSRC identifier and state information for all
of them. An implementation MAY use SSRC sampling, as described here, to reduce the storage requirements. An implementation MAY use any other algorithm with similar performance. A key requirement is that any algorithm considered SHOULD NOT substantially underestimate the group size, although it MAY overestimate.

The sampling algorithm employs a mask with the m least significant bits set to one and uses the participant's own SSRC identifier as a (random) key. If a newly received SSRC matches the key when both are ANDed with the mask, the new SSRC identifier is added to the table, otherwise it is ignored. An exception is that the SSRC identifiers of data senders must be maintained in the table even when their SSRC does not match under the masking operation because the potentially small number of senders must be accurate for the RTCP interval calculation. Initially, the mask starts with m=0, so that every SSRC identifier is accepted and placed into the table.
When the number of table entries reaches some threshold, B, m is increased by 1 bit, and all the SSRC identifiers in the table which no longer match under the mask are discarded. This will reduce the table size by roughly half. As the group size continues to increase, the user MAY further increase the mask size by an additional bit when the table size once again approaches the threshold. An implementation MUST maintain a table that can accommodate at least B=100 users, for reasonable statistical accuracy.

The algorithm also maintains a set of 32 bins, numbered 0 through 31. When a new participant shows up whose SSRC matches the key under the current mask (with m bits), the SSRC identifier is placed in bin m. When m is increased by one bit, the SSRC identifiers in bin m that still match under the m+1 bit mask are moved from bin m to bin m+1, otherwise they are discarded as mentioned previously. The SSRC identifiers of data senders are always kept and are always placed in the 0th bin. When a sender stops sending, its SSRC is moved to the bin corresponding to the current mask length m if its SSRC matches the key under the masking operation and otherwise is discarded.

To obtain the estimate of the number of session members L, the following formula is used:

   L = SUM from i=0 to i=31 of B(i) * (2**i)

where B(i) is the number of SSRC identifiers in bin i. Note that this formula counts senders only once since they are all represented in bin 0, but multiplies the sampled count of non-senders (receivers) by the sampling factor.
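The masking, binning and estimation steps described above can be sketched as follows. This is an illustrative reading of the text rather than the Appendix code: the class and method names are hypothetical, the mask is grown as soon as the table holds B entries, and senders are tracked in a separate set so that they survive mask growth in bin 0.

```python
class SsrcSampler:
    """Sketch of SSRC sampling with 32 bins (names are hypothetical)."""

    def __init__(self, own_ssrc, threshold=100):
        self.key = own_ssrc & 0xFFFFFFFF   # own SSRC acts as the random key
        self.threshold = threshold         # B, at least 100 per the text
        self.m = 0                         # number of one bits in the mask
        self.bins = [set() for _ in range(32)]
        self.senders = set()               # always kept, always in bin 0

    def matches(self, ssrc, bits):
        mask = (1 << bits) - 1
        return (ssrc & mask) == (self.key & mask)

    def table_size(self):
        return sum(len(b) for b in self.bins)

    def add_receiver(self, ssrc):
        if any(ssrc in b for b in self.bins):
            return                         # already stored
        if not self.matches(ssrc, self.m):
            return                         # ignored under the current mask
        self.bins[self.m].add(ssrc)
        if self.table_size() >= self.threshold and self.m < 31:
            self._grow_mask()

    def add_sender(self, ssrc):
        for b in self.bins:
            b.discard(ssrc)
        self.bins[0].add(ssrc)
        self.senders.add(ssrc)

    def _grow_mask(self):
        # Non-sender entries of bin m move to bin m+1 if they still match.
        movable = self.bins[self.m] - self.senders
        self.bins[self.m] -= movable
        self.m += 1
        self.bins[self.m] |= {s for s in movable if self.matches(s, self.m)}

    def estimate(self):
        # L = SUM over i of B(i) * 2**i
        return sum(len(b) << i for i, b in enumerate(self.bins))
```

With key 0 and receivers numbered 0 through 399, the mask grows to m=3 and the 50 surviving entries are weighted by 8. Real SSRC identifiers are random, so an actual estimate would only be approximate.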
As participants leave the session by sending a BYE or being timed out, their entries are removed from the table and the number of entries in the table may become too small to provide a reasonable statistical estimate. When this occurs, it is necessary to decrease the number of bits in the mask so that additional SSRC identifiers will be kept. It is recommended that the mask be decreased by one when:

   L/(2**m) < B/4

When the mask size is reduced from m to m-1, all the SSRC identifiers remain in their current bins. Thus the estimate of the number of session members is not immediately affected by the change in mask size. When a packet arrives from an SSRC that is currently in some bin x where x>m, that SSRC identifier is moved from bin x to bin m, reducing the estimate. When a BYE packet is received from a participant or the participant is timed out, and the participant's SSRC exists in the membership table, that SSRC identifier is removed from its bin. Thus the
contributions from higher bins fade away as bin m acquires a more complete list of the SSRC identifiers that will now be kept because of the reduced mask.

6.3 RTCP Packet Send and Receive Rules

The rules for how to send, and what to do when receiving an RTCP packet are outlined here. To execute these rules, a session participant must maintain several pieces of state:

tp: the last time an RTCP packet was transmitted;

tc: the current time;

tn: the next scheduled transmission time of an RTCP packet;

pmembers: the estimated number of session members at time tp;

members: the most current estimate for the number of session members;

senders: the most current estimate for the
number of senders in the session;

rtcp_bw: The target RTCP bandwidth, i.e., the total bandwidth that will be used for RTCP packets by all members of this session, in octets per second. This should be 5% of the "session bandwidth" parameter supplied to the application at startup.

we_sent: Flag that is true if the application has sent data since the 2nd previous RTCP report was transmitted.

avg_rtcp_size: The average compound RTCP packet size, in octets, over all RTCP packets sent and received by this participant.

initial: Flag that is true if the application has not yet sent an RTCP packet.

Many of these rules make use of the "calculated interval" between packet transmissions. This interval is described in the following section.

6.3.1 Computing the RTCP transmission interval

To maintain scalability, the average interval between packets from a session participant should scale with the group size. This interval is called the calculated interval. It is obtained by combining a number of the pieces of state described above. The calculated interval T is then determined as follows:

1.
If there are any senders (senders > 0) in the session, but the number of senders is less than 25% of the membership (members), the interval depends on whether the participant is a sender or not (based on the value of we_sent). If the participant is a sender (we_sent true), the constant C is set to the average RTCP packet size (avg_rtcp_size) divided by 25% of the RTCP bandwidth (rtcp_bw), and the constant n is set to the number of senders. If we_sent is not true, the constant C is set to the average RTCP packet size divided by 75% of the RTCP bandwidth. The constant n is set to the number of receivers (members - senders). If the number of senders is greater than 25%, senders and receivers are treated together. The constant C is set to the average RTCP packet size divided by the total RTCP bandwidth and n is set to the total number of members.

2. If the participant has not yet sent an RTCP packet (the variable initial is true), the constant Tmin is set to 2.5 seconds, else it is set to 5 seconds.

3. The deterministic calculated interval Td is set to max(Tmin, n*C).

4. The calculated interval T is set to a number uniformly distributed between 0.5 and 1.5 times the deterministic calculated interval.

This procedure results in an interval which is
random, but which, on average, gives 25% of the RTCP bandwidth to senders, and 75% to receivers.

6.3.2 Initialization

Upon joining the session, the participant initializes tp to 0, tc to 0, senders to 0, pmembers to 1, members to 1, we_sent to false, rtcp_bw to 5% of the session bandwidth, initial to true, and avg_rtcp_size to the size of the very first packet constructed by the application. The calculated interval T is then computed, and the first packet is scheduled for time tn = T. This means that a transmission timer is set which expires at time T. Note that an application MAY use any desired approach for implementing this timer.

The participant adds their own SSRC to the member table.

6.3.3 Receiving an RTP or non-BYE RTCP packet

When an RTP or RTCP packet is received from a participant whose SSRC is not in the member table, the SSRC is added to the table, and the value for members is updated. When an RTP packet is received from a participant whose SSRC is not in the sender table, the SSRC is added to the table, and the value for senders is updated.
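The four steps of Section 6.3.1 can be sketched in a few lines. The function names are hypothetical. Two points are assumptions rather than statements of the text: when senders and receivers are treated together, C is taken to be the average packet size divided by the total RTCP bandwidth so that n*C is a time in seconds, and a sender share of exactly 25% is handled by the combined case.

```python
import random


def deterministic_interval(members, senders, rtcp_bw, we_sent,
                           avg_rtcp_size, initial):
    """Td in seconds; rtcp_bw in octets/second, avg_rtcp_size in octets."""
    if 0 < senders < 0.25 * members:
        if we_sent:
            # Senders share 25% of the RTCP bandwidth among themselves.
            c = avg_rtcp_size / (0.25 * rtcp_bw)
            n = senders
        else:
            # Receivers share the remaining 75%.
            c = avg_rtcp_size / (0.75 * rtcp_bw)
            n = members - senders
    else:
        # Assumption: combined case uses the whole RTCP bandwidth.
        c = avg_rtcp_size / rtcp_bw
        n = members
    tmin = 2.5 if initial else 5.0
    return max(tmin, n * c)


def calculated_interval(td, rng=random):
    """T: uniformly distributed between 0.5 and 1.5 times Td."""
    return td * rng.uniform(0.5, 1.5)
```

For example, with 100 members, 10 senders, rtcp_bw of 1000 octets/second and 100-octet packets, a receiver gets Td of about 12 seconds while a sender is held at the 5 second minimum.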
For each compound RTCP packet received, the value of avg_rtcp_size is updated: avg_rtcp_size = (1/16)*packet_size + (15/16)*avg_rtcp_size, where packet_size is the size of the RTCP packet just received.

6.3.4 Receiving an RTCP BYE packet

Except as described in Section 6.3.7 for the case when an RTCP BYE is to be transmitted, if the received packet is an RTCP BYE packet, the SSRC is checked against the member table. If present, the entry is removed from the table, and the value for members is updated. The SSRC is then checked against the sender table. If present, the entry is removed from the table, and the value for senders is updated.

Furthermore, to make the transmission rate of RTCP packets more adaptive to changes in group membership, the following "reverse reconsideration" algorithm SHOULD be executed when a BYE packet is received:

o The value for tn is updated according to the following formula: tn = tc + (members/pmembers)(tn - tc).

o The value for tp is updated according to the following formula: tp = tc - (members/pmembers)(tc - tp).
o The next RTCP packet is rescheduled for transmission at time tn, which is now earlier.

o The value of pmembers is set equal to members.

This algorithm does not prevent the group size estimate from incorrectly dropping to zero for a short time when most participants of a large session leave at once but some remain. The algorithm does make the estimate return to the correct value more rapidly. This situation is unusual enough and the consequences are sufficiently harmless that this problem is deemed only a secondary concern.

6.3.5 Timing Out an SSRC

At occasional intervals, the participant MUST check to see if any of the other participants time out. To do this, the participant computes the deterministic calculated interval (without the randomization factor) Td. Any other session member who has not sent a packet since time tc - MTd (M is the timeout multiplier, and defaults to 5) is timed out. This means that their SSRC is removed from the member list, and members is updated. A similar check is performed on the
sender list. Any member on the sender list who has not sent an RTP packet since time tc - 2T (within the last two RTCP report intervals) is removed from the sender list, and senders is updated.

If any members time out, the reverse reconsideration algorithm described in Section 6.3.4 SHOULD be performed.

The participant MUST perform this check at least once per RTCP transmission interval.

6.3.6 Expiration of transmission timer

When the packet transmission timer expires, the participant performs one of the following operations:

Option A ("conditional reconsideration"):

o If members is less than or equal to pmembers, an RTCP packet is transmitted. The transmission interval T, including the randomization factor, is computed. pmembers is set to members, tp is set to tc, and tn is set to tc + T. The transmission timer is set to expire again at time tn.
o If members is greater than pmembers, the transmission interval T, including the randomization factor, is computed. If tp + T is less than or equal to tc, an RTCP packet is transmitted. pmembers is set to members, tp is set to tc, and tn is set to tc + T. The transmission timer is set to expire again at time tn. If tp + T is greater than tc, pmembers is set to members, and tn is set to tp + T. No RTCP packet is transmitted. The transmission timer is set to expire at time tn.

Option B ("unconditional reconsideration"):

o The transmission interval T is computed, including the randomization factor and a factor e-3/2 = 1.21828 times the rtcp_bw to compensate for the fact that the unconditional reconsideration algorithm converges to a value below the intended average.

o If tp + T is less than or equal to tc, an RTCP packet is transmitted.
tp is set to tc, and tn is set to tc + T. The transmission timer is set to expire again at time tn. If tp + T is greater than tc, pmembers is set to members, and tn is set to tp + T. No RTCP packet is transmitted. The transmission timer is set to expire at time tn.

Option C ("hybrid reconsideration"):

o Option B is executed for the first RTCP packet.

o Option A is executed for all subsequent packets.

Implementations SHOULD use Option B. Implementations MAY use options C and A. Option B provides the best protection against RTCP packet floods in the event of simultaneous joins or when network partitions heal.

If an RTCP packet is transmitted (using any of the above options), the value of initial is set to FALSE. Furthermore, the value of avg_rtcp_size is updated: avg_rtcp_size = (1/16)*packet_size + (15/16)*avg_rtcp_size, where packet_size is the size of the RTCP packet just transmitted.
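The timer rules of Section 6.3.6, together with the reverse reconsideration step of Section 6.3.4, can be sketched as below. The class and function names are hypothetical. The caller is assumed to pass in a freshly computed interval T (including any Option B compensation), and pmembers is also refreshed in Option B's transmit branch, which the text does not list explicitly.

```python
from dataclasses import dataclass


@dataclass
class RtcpState:
    tp: float = 0.0            # last RTCP transmission time
    tn: float = 0.0            # next scheduled transmission time
    members: int = 1
    pmembers: int = 1
    initial: bool = True
    avg_rtcp_size: float = 0.0


def update_avg_rtcp_size(avg, packet_size):
    # avg_rtcp_size = (1/16)*packet_size + (15/16)*avg_rtcp_size
    return packet_size / 16 + 15 * avg / 16


def on_timer_expired(st, tc, interval, packet_size, option="B"):
    """Apply Option A, B or C; return True if a packet is sent now."""
    grew = st.members > st.pmembers
    if option == "B" or (option == "C" and st.initial):
        reconsider = True      # unconditional reconsideration
    else:
        reconsider = grew      # conditional reconsideration (Option A)
    if reconsider and st.tp + interval > tc:
        st.pmembers = st.members
        st.tn = st.tp + interval   # hold back, no packet transmitted
        return False
    st.pmembers = st.members       # assumption for Option B, see lead-in
    st.tp = tc
    st.tn = tc + interval
    st.initial = False
    st.avg_rtcp_size = update_avg_rtcp_size(st.avg_rtcp_size, packet_size)
    return True


def reverse_reconsider(st, tc):
    """Pull tn and tp toward tc after the membership estimate shrinks."""
    ratio = st.members / st.pmembers
    st.tn = tc + ratio * (st.tn - tc)
    st.tp = tc - ratio * (tc - st.tp)
    st.pmembers = st.members
```

A timer library would call on_timer_expired when tn is reached and then re-arm itself for the updated st.tn.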
6.3.7 Transmitting a BYE packet

When a participant wishes to leave a session, a BYE packet is transmitted to inform the other participants of the event. In order to avoid a flood of BYE packets when many participants leave the system, a participant MUST execute the following algorithm if the number of members is more than 50 when the participant chooses to leave. This algorithm usurps the normal role of the members variable to count BYE packets instead:

o When the participant decides to leave the system, tp is reset to tc, the current time, members and
pmembers are initialized to 1, initial is set to 1, we_sent is set to 0, senders is set to 0, and avg_rtcp_size is set to the size of the BYE packet. The calculated interval T is computed. The BYE packet is then scheduled for time tn = tc + T.

o Every time a BYE packet from another participant is received, members is incremented by 1 regardless of whether that participant exists in the member table or not, and when SSRC sampling is in use, regardless of whether the BYE SSRC matches the key or not.
   members is NOT incremented when other RTCP packets or RTP packets are received, but only for BYE packets.

o  Transmission of the BYE packet then follows the rules for transmitting a regular RTCP packet, as above. Option B SHOULD be used.

This allows BYE packets to be sent right away, yet controls their total bandwidth usage. In the worst case, this could cause RTCP control packets to use twice the bandwidth as normal (10%) -- 5% for non BYE RTCP packets and 5% for BYE.

A participant that does not want to wait for the
above mechanism to allow transmission of a BYE packet MAY leave the group without sending a BYE at all. That participant will eventually be timed out by the other group members.

If the group size estimate members is less than 50 when the participant decides to leave, the participant MAY send a BYE packet immediately. Alternatively, the participant MAY choose to execute the above BYE backoff algorithm.

In either case, a participant which never sent an RTP or RTCP packet MUST NOT send a BYE packet when they leave the group.

6.3.8 Updating we_sent

The variable we_sent contains true if the participant has sent an RTP packet recently, false otherwise. This determination is made by using the same mechanisms for managing the senders table and sending SR packets. If the participant sends an RTP packet when we_sent is false, it adds itself to the sender table and sets we_sent to true. Every time another RTP packet is
sent, the time of transmission of that packet is maintained in the table. The normal sender timeout algorithm is then applied to the participant -- if an RTP packet has not been transmitted since time tc - 2T, the participant removes itself from the sender table, decrements the sender count, and sets we_sent to false.

6.3.9 Allocation of source description bandwidth

This specification defines several source description (SDES) items in addition to the mandatory CNAME item, such as NAME (personal name) and EMAIL (email address). It also provides a means to define new application-specific RTCP packet types.
Applications should exercise caution in allocating control bandwidth to this additional information because it will slow down the rate at which reception reports and CNAME are sent, thus impairing the performance of the protocol. It is recommended that no more than 20% of the RTCP bandwidth allocated to a single participant be used to carry the additional information. Furthermore, it is not intended that all SDES items should be included in every application. Those that are included should be assigned a fraction of the bandwidth according to their utility. Rather than estimate these fractions dynamically, it is recommended that the percentages be translated statically into report interval counts based on the typical length of an item.

For example, an application may be designed to send only CNAME, NAME and EMAIL and not any others. NAME might be given much higher priority than EMAIL because the NAME would be displayed continuously in the application's user interface, whereas EMAIL would be displayed only when requested. At every RTCP interval, an RR packet and an SDES packet with the CNAME item would be sent. For a small session operating at the minimum interval, that would be every 5 seconds on the average. Every third interval (15 seconds), one extra item would be included in the SDES packet. Seven out of eight times this would be the NAME item, and every eighth time (2 minutes) it would be the EMAIL item.

When multiple applications operate in concert using cross-application binding through a common CNAME for each participant, for example in a multimedia conference composed of an RTP session for each medium, the additional SDES information might be sent in only one RTP session. The other sessions would carry only the CNAME item. In particular, this approach should be applied to the multiple sessions of a layered encoding scheme (see Section 2.4).
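
The example schedule above (CNAME in every interval, one extra item every third interval, NAME seven times out of eight and EMAIL the eighth time) could be expressed as static interval counts roughly as follows. This is only an illustrative sketch: the function name and the choice to count intervals from 1 are assumptions made here, not part of this specification.

```python
def sdes_items_for_interval(n):
    """Return the SDES items to send in the n-th RTCP interval.

    Hypothetical helper: n counts RTCP intervals starting at 1.
    """
    items = ["CNAME"]              # CNAME is sent in every interval
    if n % 3 == 0:                 # every third interval carries one extra item
        extra_slot = n // 3        # 1, 2, 3, ... numbers the extra-item slots
        # every eighth extra slot carries EMAIL; the other seven carry NAME
        items.append("EMAIL" if extra_slot % 8 == 0 else "NAME")
    return items


if __name__ == "__main__":
    for n in range(1, 25):
        print(n, sdes_items_for_interval(n))
```

At the 5 second minimum interval, interval 24 falls at 120 seconds, which matches the 2 minute EMAIL period given in the example.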

6.4 Sender and Receiver Reports

RTP receivers provide reception quality feedback using RTCP report packets which may take one of two forms depending upon whether or not the receiver is also a sender. The only difference between the sender report (SR) and receiver report (RR) forms, besides the packet type code, is that the sender report includes a 20-byte sender information section for use by active senders. The SR is issued if a site has sent any data packets during the interval since issuing the last report or the previous one, otherwise the RR is issued.

Both the SR and RR forms include zero or more reception report blocks, one for each of the synchronization sources from which this receiver has received RTP data packets since the last report. Reports are not issued for contributing sources listed in the CSRC list. Each reception report block provides statistics about the data received from the particular source indicated in that block.
Since a maximum of 31 reception report blocks will fit in an SR or RR packet, additional RR packets may be stacked after the initial SR or RR packet as needed to contain the reception reports for all sources heard during the interval since the last report.

The next sections define the formats of the two reports, how they may be extended in a profile-specific manner if an application requires additional feedback information, and how the reports may be used. Details of reception reporting by translators and mixers is given in Section 7.
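
The stacking rule described above (at most 31 reception report blocks per SR or RR packet, with further RR packets appended as needed) might be sketched as below. The function name, the tuple representation of a packet, and the use of plain list items to stand in for report blocks are assumptions made for illustration only.

```python
MAX_BLOCKS = 31  # limit of reception report blocks per SR or RR packet


def split_report_blocks(blocks, is_sender):
    """Group reception report blocks into (packet_type, blocks) tuples.

    The first packet is an SR if this participant is a sender, otherwise
    an RR. Any overflow goes into additional RR packets stacked after it.
    """
    first_type = "SR" if is_sender else "RR"
    packets = [(first_type, blocks[:MAX_BLOCKS])]
    for start in range(MAX_BLOCKS, len(blocks), MAX_BLOCKS):
        packets.append(("RR", blocks[start:start + MAX_BLOCKS]))
    return packets


if __name__ == "__main__":
    stacked = split_report_blocks(list(range(70)), is_sender=True)
    print([(ptype, len(group)) for ptype, group in stacked])
```

Note that a report packet is still produced when there are no blocks to report, since a reception report count of zero is valid.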

6.4.1 SR: Sender report RTCP packet

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|    RC   |   PT=SR=200   |             length            | header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         SSRC of sender                        |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|              NTP timestamp, most significant word             | sender
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ info
|             NTP timestamp, least significant word             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         RTP timestamp                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     sender's packet count                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      sender's octet count                     |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                 SSRC_1 (SSRC of first source)                 | report
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
| fraction lost |       cumulative number of packets lost       |   1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           extended highest sequence number received           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      interarrival jitter                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         last SR (LSR)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   delay since last SR (DLSR)                  |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                 SSRC_2 (SSRC of second source)                | report
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
:                              ...                              :   2
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                  profile-specific extensions                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The sender report packet consists of three sections, possibly followed by a fourth profile-specific extension section if defined. The first section, the header, is 8 octets long. The fields have the following meaning:

version (V): 2 bits
   Identifies the version of RTP, which is the same in RTCP packets as in RTP data packets. The version defined by this specification is two (2).

padding (P): 1 bit
   If the padding bit is set, this individual RTCP packet contains some additional padding octets at the
   end which are not part of the control information but are included in the length field. The last octet of the padding is a count of how many padding octets should be ignored, including itself (it will be a multiple of four). Padding may be needed by some encryption algorithms with fixed block sizes. In a compound RTCP packet, padding is only required on one individual packet because the compound packet is encrypted as a whole for the method in Section 9.1. Thus, padding MUST only be added to the last individual packet, and if padding is added to that packet, the padding bit MUST be set only on that packet. This convention aids the header validity checks described in Appendix A.2 and allows detection of packets from some
   early implementations that incorrectly set the padding bit on the first individual packet and add padding to the last individual packet.

reception report
count (RC): 5 bits
   The number of reception report blocks contained in this packet. A value of zero is valid.

packet type (PT): 8 bits
   Contains the constant 200 to identify this as an RTCP SR packet.

length: 16 bits
   The length of this RTCP packet in 32-bit words minus one, including the header and any padding. (The offset of one makes zero a valid length and avoids a possible infinite loop in scanning a compound RTCP packet, while counting 32-bit words avoids a validity check for a multiple of 4.)

SSRC: 32 bits
   The synchronization source identifier for the originator of this SR packet.

The second section, the sender information, is 20 octets long and is present in every sender report packet. It summarizes the data transmissions from this sender. The fields have the following meaning:

NTP timestamp: 64 bits
   Indicates the wallclock time (see Section 4) when this report was sent so that
   it may be used in combination with timestamps returned in reception reports from other receivers to measure round-trip propagation to those receivers. Receivers should expect that the measurement accuracy of the timestamp may be limited to far less than the resolution of the NTP timestamp. The measurement uncertainty of the timestamp is not indicated as it may not be known. On a system that has no notion of wallclock time but does have some system-specific clock such as "system uptime", a sender MAY use that clock as a reference to calculate relative NTP timestamps.
   It is important to choose a commonly used clock so that if separate implementations are used to produce the individual streams of a multimedia session, all implementations will use the same clock. These relative NTP timestamps are assumed to have a reference time less than 68 years in the past, so the high bit will be zero to serve as an indication of relative timestamps. A sender that has no notion of wallclock or elapsed time may set the NTP timestamp to zero.

RTP timestamp: 32 bits
   Corresponds to the same time as the NTP timestamp (above), but in the same units and with the same random offset as the RTP timestamps in data packets. This correspondence may be used for intra- and inter-media synchronization for sources whose NTP timestamps are synchronized, and may be used by media-independent receivers to estimate the nominal RTP clock frequency. Note that in most cases this timestamp will not be equal to the RTP timestamp in any adjacent data packet. Rather, it is calculated from the corresponding NTP timestamp using the relationship between the RTP timestamp counter and real time as maintained by periodically checking the wallclock time at a sampling instant.
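
As an illustration of the relationship just described, the sketch below derives the two 32-bit NTP timestamp words from a wallclock time and computes the matching RTP timestamp from a previously sampled reference pair. It assumes the NTP timestamp format referred to in Section 4 (seconds since 1 January 1900 in the upper word, fraction of a second in the lower word). The function names, the use of Unix time as input, and the reference-pair approach are choices made for this example only.

```python
NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def to_ntp(unix_seconds):
    """Split a Unix time (float seconds) into NTP (seconds, fraction) words."""
    ntp = unix_seconds + NTP_UNIX_OFFSET
    sec = int(ntp)
    frac = int((ntp - sec) * (1 << 32))   # fraction in units of 2**-32 s
    return sec & 0xFFFFFFFF, frac & 0xFFFFFFFF


def sr_rtp_timestamp(ref_rtp_ts, ref_wallclock, sr_wallclock, clock_rate):
    """RTP timestamp corresponding to sr_wallclock.

    ref_rtp_ts is an RTP timestamp (random offset included) sampled at
    wallclock time ref_wallclock; clock_rate is the media clock in Hz.
    """
    elapsed_ticks = round((sr_wallclock - ref_wallclock) * clock_rate)
    return (ref_rtp_ts + elapsed_ticks) & 0xFFFFFFFF   # wrap at 32 bits


if __name__ == "__main__":
    print([hex(word) for word in to_ntp(0.5)])
    print(sr_rtp_timestamp(1000, 10.0, 10.5, 8000))
```

Because the result is derived from the wallclock time of the report rather than copied from a data packet, it will generally differ from the timestamp of any adjacent RTP packet, as noted above.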

sender's packet count: 32 bits
   The total number of RTP data packets transmitted by the sender since starting transmission up until the time this SR packet was generated. The count is reset if the sender changes its SSRC identifier.

sender's octet count: 32 bits
   The total number of payload octets (i.e., not including header or padding) transmitted in RTP data packets by the sender since starting transmission up until the time this SR packet was generated. The count is reset if the sender changes its SSRC identifier. This field can be used to estimate the average payload data rate.

The third section contains zero or more reception report blocks depending on the number of other sources heard by this sender since the last report. Each reception report block conveys statistics on the reception of RTP packets from a single synchronization source. Receivers do not carry over statistics when a source changes its SSRC identifier due to a collision. These statistics are:

SSRC_n (source identifier): 32 bits
   The SSRC identifier of the source to which the information in this reception report block pertains.
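
The header and sender information sections described so far occupy the first 28 octets of an SR packet (8 octets of header followed by 20 octets of sender information). A minimal parsing sketch follows, assuming network byte order for all fields. The function name, the dictionary keys, and the decision to reject anything other than version 2 and packet type 200 are choices made for this example. Reception report blocks and profile-specific extensions are not parsed here.

```python
import struct


def parse_sr(packet: bytes):
    """Parse the header and sender info of an RTCP SR packet (sketch)."""
    if len(packet) < 28:
        raise ValueError("SR packet shorter than header plus sender info")
    (b0, pt, length, ssrc, ntp_msw, ntp_lsw,
     rtp_ts, pkt_count, octet_count) = struct.unpack("!BBHIIIIII", packet[:28])
    version = b0 >> 6             # V: top 2 bits
    padding = (b0 >> 5) & 1       # P: 1 bit
    rc = b0 & 0x1F                # RC: low 5 bits
    if version != 2 or pt != 200:
        raise ValueError("not an RTP version 2 SR packet")
    return {
        "padding": bool(padding),
        "report_count": rc,
        "total_octets": (length + 1) * 4,   # length is in 32-bit words minus one
        "ssrc": ssrc,
        "ntp_msw": ntp_msw,
        "ntp_lsw": ntp_lsw,
        "rtp_timestamp": rtp_ts,
        "packet_count": pkt_count,
        "octet_count": octet_count,
    }


if __name__ == "__main__":
    sample = struct.pack("!BBHIIIIII", 0x80, 200, 6, 0x12345678, 1, 2, 3, 4, 5)
    print(parse_sr(sample))
```

With no reception report blocks, the length field is 6, which corresponds to the 28 octets parsed here.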

fraction lost: 8 bits
   The fraction of RTP data packets from source SSRC_n lost since the previous SR or RR packet was sent, expressed as a fixed point number with the binary point at the left edge of the field. (That is equivalent to taking the integer part after multiplying the loss fraction by 256.) This fraction is defined to be the number of packets lost divided by the number of packets expected, as defined in the next paragraph. An implementation is shown in Appendix A.3. If the loss is negative due to duplicates, the fraction lost is set to zero. Note that a receiver cannot tell whether any packets were lost after the last one received, and that there will be no reception report block issued for a source if all packets from that source sent during the last reporting interval have been lost.

cumulative number of packets lost: 24 bits
   The total number of RTP data packets from source SSRC_n that have been lost since the beginning of reception. This number is defined to
   be the number of packets expected less the number of packets actually received, where the number of packets received includes any which are late or duplicates. Thus packets that arrive late are not counted as lost, and the loss may be negative if there are duplicates. The number of packets expected is defined to be the extended last sequence number received, as defined next, less the initial sequence number received. This may be calculated as shown in Appendix A.3.

extended highest sequence number received: 32 bits
   The
   low 16 bits contain the highest sequence number received in an RTP data packet from source SSRC_n, and the most significant 16 bits extend that sequence number with the corresponding count of sequence number cycles, which may be maintained according to the algorithm in Appendix A.1. Note that different receivers within the same session will generate different extensions to the sequence number if their start times differ significantly.

interarrival jitter: 32 bits
   An estimate of the statistical variance of the RTP data packet interarrival time, measured in timestamp units and expressed as an unsigned integer. The interarrival jitter J is defined to be the mean deviation (smoothed absolute value) of the difference D in packet spacing at the receiver compared to the sender for a pair of packets. As shown in the equation below, this is equivalent to the difference in the "relative transit time" for the two packets; the relative transit time is the difference between a
     packet's RTP timestamp and the receiver's clock at the time of
     arrival, measured in the same units.

     If S_i is the RTP timestamp from packet i, and R_i is the time of
     arrival in RTP timestamp units for packet i, then for two packets
     i and j, D may be expressed as

          D(i,j) = (R_j - R_i) - (S_j - S_i)
                 = (R_j - S_j) - (R_i - S_i)

     The interarrival jitter is calculated continuously as each data
     packet i is received from source SSRC_n, using this difference D
     for that packet and the previous packet i-1 in order of arrival
     (not necessarily in sequence), according to the formula

          J_i = J_(i-1) + (|D(i-1,i)| - J_(i-1))/16

     Whenever a reception report is issued, the current value of J is
     sampled.

     The jitter calculation is prescribed here to allow
     profile-independent monitors to make valid interpretations of
     reports coming from different implementations. This algorithm is
     the
     optimal first-order estimator and the gain parameter 1/16 gives a
     good noise reduction ratio while maintaining a reasonable rate of
     convergence [12]. A sample implementation is shown in Appendix
     A.8.

last SR timestamp (LSR): 32 bits
     The middle 32 bits out of 64 in the NTP timestamp (as explained
     in Section 4) received as part of the most recent RTCP sender
     report (SR) packet from source SSRC_n. If no SR has been received
     yet, the field is set to zero.

delay since last SR (DLSR): 32 bits
     The delay, expressed in units of 1/65536 seconds, between
     receiving the last SR packet from source SSRC_n and sending this
     reception report block. If no SR packet has been received yet
     from SSRC_n, the DLSR field is set to zero.

     Let SSRC_r denote the receiver issuing this receiver report.
     Source SSRC_n can compute the round-trip propagation delay to
     SSRC_r by recording the time A when
     this reception report block is received. It calculates the total
     round-trip time A - LSR using the last SR timestamp (LSR) field,
     and then subtracts the DLSR field to leave the round-trip
     propagation delay as (A - LSR - DLSR). This is illustrated in
     Fig. 2.

     This may be used as
     an approximate measure of distance to cluster receivers, although
     some links have very asymmetric delays.

6.4.2 RR: Receiver report RTCP packet

   [10 Nov 1995 11:33:25.125]           [10 Nov 1995 11:33:36.5]
   n                 SR(n)              A=b710:8000 (46864.500 s)
   ---------------------------------------------------------------->
                      v                 ^
   ntp_sec =0xb44db705 v               ^ dlsr=0x0005:4000 (    5.250s)
   ntp_frac=0x20000000  v             ^  lsr =0xb705:2000 (46853.125s)
     (3024992005.125 s)  v           ^
   r                      v         ^ RR(n)
   ---------------------------------------------------------------->
                          |<-DLSR->|
                           (5.250 s)

   A     0xb710:8000 (46864.500 s)
   DLSR -0x0005:4000 (    5.250 s)
   LSR  -0xb705:2000 (46853.125 s)
   -------------------------------
   delay 0x0006:2000 (    6.125 s)

          Figure 2: Example for round-trip time computation

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|    RC   |   PT=RR=201   |             length            | header
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     SSRC of packet sender                     |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                 SSRC_1 (SSRC of first source)                 | report
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
   | fraction lost |       cumulative number of packets lost       |   1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           extended highest sequence number received           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      interarrival jitter                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         last SR (LSR)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   delay since last SR (DLSR)                  |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                 SSRC_2 (SSRC of second source)                | report
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
   :                               ...                             :   2
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                  profile-specific extensions                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The format of the receiver report (RR) packet is the same as that of
the SR packet except that the packet type field contains the constant
201 and the five words of sender information are omitted (these are
the NTP and RTP timestamps and sender's packet and octet counts). The
remaining fields have the same meaning as for the SR packet.

An empty RR packet (RC = 0) is put at the head of a compound RTCP
packet when there is no data transmission or reception to report.

6.4.3 Extending the sender and receiver reports

A profile SHOULD define profile-specific extensions to the sender
report and receiver report if there is additional information that
needs to be reported regularly about the sender or receivers. This
method SHOULD be used in preference to defining another RTCP packet
type because it requires less overhead:

     o fewer octets in the
       packet (no RTCP header or SSRC field);

     o simpler and faster parsing because applications running under
       that profile would be programmed to always expect the extension
       fields in the directly accessible location after the reception
       reports.

The extension is a fourth section in the sender- or receiver-report
packet which comes at the end after the reception report blocks, if
any. If additional sender information is required, then for sender
reports it should be included first in the extension section, but for
receiver reports it would not be present. If information about
receivers is to be included, that data may be structured as an array
of blocks parallel to the existing array of reception report blocks;
that is, the number of blocks would be indicated by the RC field.

6.4.4 Analyzing sender and receiver reports

It is expected that reception quality feedback will be useful not only
for the sender but also for other receivers and third-party monitors.
The sender may modify its transmissions based on the feedback;
receivers can determine whether problems are local, regional or
global; network managers may use profile-independent monitors that
receive only the RTCP packets and not the corresponding RTP data
packets to evaluate the performance of their networks for multicast
distribution.

Cumulative counts are used in both the sender information and receiver
report blocks so that differences may be calculated between any two
reports to make measurements over both short and long time periods,
and to provide resilience against the loss of a report. The difference
between the last two reports received can be used to estimate the
recent quality of the distribution. The NTP timestamp is included so
that rates may be calculated from these differences over the interval
between two reports. Since that timestamp is independent of the clock
rate for the data encoding, it is possible to implement encoding- and
profile-independent quality monitors.

An example calculation is the packet loss rate over the interval
between two reception reports. The difference in the cumulative number
of packets lost gives the number lost during that interval.
The difference in the extended last sequence numbers received gives
the number of packets expected during the interval. The ratio of these
two is the packet loss fraction over the interval. This ratio should
equal the fraction lost field if the two reports are consecutive, but
otherwise it may not. The loss rate per second can be obtained by
dividing the loss fraction by the difference in NTP timestamps,
expressed in seconds. The number of packets received is the number of
packets expected minus the number lost. The number of packets expected
may also be used to judge the statistical validity of any loss
estimates. For example, 1 out of 5 packets lost has a lower
significance than 200 out of 1000.

From the sender information, a third-party monitor can calculate the
average payload data rate and the average packet rate over an interval
without receiving the data. Taking the ratio of the two gives the
average payload size. If it can be assumed that packet loss is
independent of packet size, then the number of packets received by a
particular receiver times the average payload size (or the
corresponding packet size) gives the apparent throughput available to
that receiver.
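
The interval calculation described above can be sketched as follows.
This is only an illustration, not part of the specification: the
names ReportSample, IntervalStats and interval_stats are hypothetical,
the report time is assumed to be available in seconds (for example
derived from NTP timestamps), and wrap-around of the 32-bit extended
sequence number is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReportSample:
    """One reception report block as seen by a monitor (hypothetical container)."""
    time_s: float          # report time in seconds (assumed derived from NTP timestamps)
    cumulative_lost: int   # cumulative number of packets lost (may be negative)
    ext_highest_seq: int   # extended highest sequence number received


@dataclass
class IntervalStats:
    lost: int                        # packets lost during the interval
    expected: int                    # packets expected during the interval
    received: int                    # expected minus lost
    loss_fraction: float             # lost / expected
    loss_fraction_per_second: float  # loss fraction divided by elapsed seconds


def interval_stats(older: ReportSample, newer: ReportSample) -> Optional[IntervalStats]:
    """Difference two reports from the same receiver about the same source."""
    lost = newer.cumulative_lost - older.cumulative_lost
    expected = newer.ext_highest_seq - older.ext_highest_seq
    elapsed = newer.time_s - older.time_s
    # Without forward progress in sequence numbers and time, no ratio is defined.
    if expected <= 0 or elapsed <= 0:
        return None
    fraction = lost / expected
    return IntervalStats(lost, expected, expected - lost, fraction, fraction / elapsed)
```

Because only differences of cumulative counts are used, the two
reports need not be consecutive; an intervening lost report simply
lengthens the interval being measured.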
In addition to the cumulative counts which allow long-term packet loss
measurements using differences between reports, the fraction lost
field provides a short-term measurement from a single report. This
becomes more important as the size of a session scales up enough that
reception state information might not be kept for all receivers or the
interval between reports becomes long enough that only one report
might have been received from a particular receiver.

The interarrival jitter field provides a second short-term measure of
network congestion. Packet loss tracks persistent congestion while the
jitter measure tracks transient congestion. The jitter measure may
indicate congestion before it leads to packet loss. Since the
interarrival jitter field is only a snapshot of the jitter at the time
of a report, it may be necessary to analyze a number of reports from
one receiver over time or from multiple receivers, e.g., within a
single network.

6.5 SDES: Source description RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|    SC   |  PT=SDES=202  |             length            | header
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          SSRC/CSRC_1                          | chunk
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   1
   |                           SDES items                          |
   |                              ...                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          SSRC/CSRC_2                          | chunk
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   2
   |                           SDES items                          |
   |                              ...                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

The SDES packet is a three-level structure composed of a header and
zero or more chunks, each of which is composed of items describing the
source identified in that chunk. The items are described individually
in subsequent sections.

version (V), padding (P), length:
     As described for the SR packet (see Section 6.4.1).

packet type (PT): 8 bits
     Contains the constant 202 to identify this as an RTCP SDES
     packet.

source count (SC): 5 bits
     The number of SSRC/CSRC chunks contained in this SDES packet. A
     value of zero is valid but useless.

Each chunk consists of an SSRC/CSRC identifier followed by a list of
zero or more items, which carry information about the SSRC/CSRC. Each
chunk starts on a 32-bit boundary. Each item consists of an 8-bit type
field, an 8-bit octet count describing the length of the text (thus,
not including this two-octet header), and the text itself. Note that
the text can be no longer than 255 octets, but this is consistent with
the need to limit RTCP bandwidth consumption.

The text is encoded according to the UTF-8 encoding specified in RFC
2279 [13]. US-ASCII is a subset of this encoding and requires no
additional encoding.
The presence of multi-octet encodings is indicated by setting the most
significant bit of a character to a value of one.

Items are contiguous, i.e., items are not individually padded to a
32-bit boundary. Text is not null terminated because some multi-octet
encodings include null octets. The list of items in each chunk is
terminated by one or more null octets, the first of which is
interpreted as an item type of zero to denote the end of the list. No
length octet follows the null item type octet, but additional null
octets are included if needed to pad until the next 32-bit boundary.
Note that this padding is separate from that indicated by the P bit in
the RTCP header. A chunk with zero items (four null octets) is valid
but useless.

End systems send one SDES packet containing their own source
identifier (the same as the SSRC in the fixed RTP header). A mixer
sends one SDES packet containing a chunk for each contributing source
from which it is receiving SDES information, or multiple complete SDES
packets in the format above if there are more than 31 such sources
(see Section 7).

The SDES items currently defined are described in the next sections.
Only the CNAME item is mandatory. Some items shown here may be useful
only for particular profiles, but the item types are all assigned from
one common space to promote shared use and to simplify
profile-independent applications.
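
The chunk layout described above (identifier, contiguous items, null
terminator, padding to a 32-bit boundary) can be sketched as follows.
This is a minimal illustration under stated assumptions, not a
reference encoder: the function name encode_sdes_chunk and its
argument shape are hypothetical, and the RTCP header that precedes
the chunks is not produced.

```python
import struct
from typing import Iterable, Tuple


def encode_sdes_chunk(ssrc: int, items: Iterable[Tuple[int, str]]) -> bytes:
    """Serialize one SDES chunk: SSRC/CSRC, items, null terminator, padding.

    `items` is a sequence of (item type, text) pairs; this shape is an
    assumption made for the example.
    """
    out = bytearray(struct.pack("!I", ssrc))   # 32-bit identifier, network byte order
    for item_type, text in items:
        data = text.encode("utf-8")            # text is UTF-8; US-ASCII needs no extra step
        if not 1 <= item_type <= 255:
            raise ValueError("item type must fit in 8 bits and be non-zero")
        if len(data) > 255:
            raise ValueError("item text is limited to 255 octets")
        # 8-bit type, 8-bit octet count (excluding this two-octet header), then the text.
        out += bytes([item_type, len(data)]) + data
    out.append(0)                              # item type zero ends the list
    while len(out) % 4:                        # pad with nulls to the next 32-bit boundary
        out.append(0)
    return bytes(out)
```

Items are written back to back without individual padding; only the
end of the chunk is aligned. With an empty item list the function
returns the identifier followed by four null octets, which matches
the "valid but useless" chunk mentioned in the text.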
Additional items may be defined in a profile by registering the type
numbers with IANA.

6.5.1 CNAME: Canonical end-point identifier SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    CNAME=1    |     length    | user and domain name         ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The CNAME identifier has the following properties:

     o Because the randomly allocated SSRC identifier may change if a
       conflict is discovered or if a program is restarted, the CNAME
       item is required to provide the binding from the SSRC
       identifier to an identifier for the source that remains
       constant.

     o Like the SSRC identifier, the CNAME identifier should also be
       unique among all participants within one RTP session.

     o To provide a binding across multiple media tools used by one
       participant in a set of related RTP sessions, the CNAME should
       be fixed for that participant.

     o To facilitate third-party monitoring, the CNAME should be
       suitable for either a program or a person to locate the source.

Therefore, the CNAME should be derived algorithmically and not entered
manually, when possible. To meet these requirements, the following
format should be used unless a profile specifies an alternate syntax
or semantics. The CNAME item should have the format "user@host", or
"host" if a user name is not available as on single-user systems.
For both formats, "host" is either the fully qualified domain name of
the host from which the real-time data originates, formatted according
to the rules specified in RFC 1034 [14], RFC 1035 [15] and Section 2.1
of RFC 1123 [16]; or the standard ASCII representation of the host's
numeric address on the interface used for the RTP communication. For
example, the standard ASCII representation of an IP Version 4 address
is "dotted decimal", also known as dotted quad. Other address types
are expected to have ASCII representations that are mutually unique.
The fully qualified domain name is more convenient for a human
observer and may avoid the need to send a NAME item in addition, but
it may be difficult or impossible to obtain reliably in some operating
environments. Applications that may be run in such environments should
use the ASCII representation of the address instead.

Examples are "doe@sleepy.megacorp.com" or "doe@192.0.2.89" for a
multi-user system. On a system with no user name, examples would be
"sleepy.megacorp.com" or "192.0.2.89".

The user name should be in a form that a program such as "finger" or
"talk" could use, i.e., it typically is the login name rather than the
personal name. The host name is not necessarily identical to the one
in the participant's electronic mail address.
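
The "user@host" and "host" forms above can be sketched as a small
helper. This is an illustration only: make_cname is a hypothetical
name, and how the caller obtains the login name and the host name or
address is outside the sketch. The 255-octet check reflects the SDES
item length limit.

```python
from typing import Optional


def make_cname(host: str, user: Optional[str] = None) -> str:
    """Build a CNAME string from a host name or address and an optional login name."""
    if not host:
        raise ValueError("a host name or numeric address is required")
    # "user@host" when a user name exists, plain "host" on single-user systems.
    cname = f"{user}@{host}" if user else host
    if len(cname.encode("utf-8")) > 255:
        raise ValueError("CNAME does not fit in one SDES item (255 octets)")
    return cname
```

Deriving the value from system information in this way, rather than
asking the user to type it, follows the recommendation that the CNAME
be derived algorithmically when possible.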
This syntax will not provide unique identifiers for each source if an
application permits a user to generate multiple sources from one host.
Such an application would have to rely on the SSRC to further identify
the source, or the profile for that application would have to specify
additional syntax for the CNAME identifier.

If each application creates its CNAME independently, the resulting
CNAMEs may not be identical as would be required to provide a binding
across multiple media tools belonging to one participant in a set of
related RTP sessions. If cross-media binding is required, it may be
necessary for the CNAME of each tool to be externally configured with
the same value by a coordination tool.

Application writers should be aware that private network address
assignments such as the Net-10 assignment proposed in RFC 1597 [17]
may create network addresses that are not globally unique. This would
lead to non-unique CNAMEs if hosts with private addresses and no
direct IP connectivity to the public Internet have their RTP packets
forwarded to the public Internet through an RTP-level translator. (See
also RFC 1627 [18].) To handle this case, applications may provide a
means to configure a unique CNAME, but the burden is on the
translator to translate CNAMEs from private addresses to public
addresses if necessary to keep private addresses from being exposed.

6.5.2 NAME: User name SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     NAME=2    |     length    | common name of source        ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

This is the real name used to describe the source, e.g., "John Doe,
Bit Recycler, Megacorp". It may be in any form desired by the user.
For applications such as conferencing, this form of name may be the
most desirable for display in participant lists, and therefore might
be sent most frequently of those items other than CNAME. Profiles may
establish such priorities. The NAME value is expected to remain
constant at least for the duration of a session.
It should not be relied upon to be unique among all participants in
the session.

6.5.3 EMAIL: Electronic mail address SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    EMAIL=3    |     length    | email address of source      ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The email address is formatted according to RFC 822 [19], for example,
"John.Doe@megacorp.com". The EMAIL value is expected to remain
constant for the duration of a session.

6.5.4 PHONE: Phone number SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    PHONE=4    |     length    | phone number of source       ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The phone number should be formatted with the plus sign replacing the
international access code.
For example, "+1 908 555 1212" for a number in the United States.

6.5.5 LOC: Geographic user location SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     LOC=5     |     length    | geographic location of site ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Depending on the application, different degrees of detail are appropriate for this item. For conference applications, a string like "Murray Hill, New Jersey" may be sufficient, while, for an active badge system, strings like "Room 2A244, AT&T BL MH" might be appropriate. The degree of detail is left to the implementation and/or user, but format and content may be prescribed by a profile. The LOC value is expected to remain constant for the duration of a session, except for mobile hosts.

6.5.6 TOOL: Application or tool name SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     TOOL=6    |     length    |name/version of source appl. ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A string giving the name and possibly version of the application generating the stream, e.g., "videotool 1.2". This information may be useful for debugging purposes and is similar to the Mailer or Mail-System-Version SMTP headers. The TOOL value is expected to remain constant for the duration of the session.
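The text items in Sections 6.5.2 to 6.5.6 share one layout: an 8-bit item type, an 8-bit length that counts only the text octets, then the text. The following Python fragment is a minimal sketch of that encoding, not a reference implementation. The helper name `encode_sdes_item` and the `SDES_TYPES` table are illustrative and are not defined by this memo, and UTF-8 is assumed for the text.

```python
# Item type codes taken from the diagrams in Sections 6.5.2 to 6.5.6.
SDES_TYPES = {"NAME": 2, "EMAIL": 3, "PHONE": 4, "LOC": 5, "TOOL": 6}


def encode_sdes_item(kind: str, text: str) -> bytes:
    """Return type octet + length octet + text for one SDES item."""
    payload = text.encode("utf-8")  # assumption: UTF-8 text encoding
    if len(payload) > 255:
        # The length field is a single octet.
        raise ValueError("SDES item text is limited to 255 octets")
    return bytes([SDES_TYPES[kind], len(payload)]) + payload


# Hypothetical usage with the phone number example from Section 6.5.4.
item = encode_sdes_item("PHONE", "+1 908 555 1212")
print(item[0], item[1], item[2:])
```

The sketch encodes a single item only; assembling items into an SDES chunk and padding the chunk are left out.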
6.5.7 NOTE: Notice/status SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     NOTE=7    |     length    | note about the source       ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The following semantics are suggested for this item, but these or other semantics may be explicitly defined by a profile. The NOTE item is intended for transient messages describing the current state of the source, e.g., "on the phone, can't talk". Or, during a seminar, this item might be used to convey the title of the talk. It should be used only to carry exceptional information and should not be included routinely by all participants because this would slow down the rate at which reception reports and CNAME are sent, thus impairing the performance of the protocol. In particular, it should not be included as an item in a user's configuration file nor automatically generated as in a quote-of-the-day.

Since the NOTE item may be important to display while it is active, the rate at which other non-CNAME items such as NAME are transmitted might be reduced so that the NOTE item can take that part of the RTCP bandwidth. When the transient message becomes inactive, the NOTE item should continue to be transmitted a few times at the same repetition rate but with a string of length zero to signal the receivers. However, receivers should also consider the NOTE item inactive if it is not received for a small multiple of the repetition rate, or perhaps 20-30 RTCP intervals.
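One way a receiver could apply the two inactivity rules for the NOTE item is sketched below in Python. The class name `NoteState`, its method names, and the default of 25 intervals (a value inside the suggested 20-30 range) are assumptions made for illustration; the memo does not prescribe any of them.

```python
class NoteState:
    """Tracks whether one source's NOTE item should still be displayed."""

    def __init__(self, timeout_intervals: int = 25):  # assumed value
        self.timeout_intervals = timeout_intervals
        self.text = ""
        self.intervals_since_seen = 0

    def on_note(self, text: str) -> None:
        # A zero-length string is the sender's explicit "inactive" signal.
        self.text = text
        self.intervals_since_seen = 0

    def on_rtcp_interval(self) -> None:
        # Call once for each RTCP interval in which no NOTE item arrived.
        self.intervals_since_seen += 1
        if self.intervals_since_seen >= self.timeout_intervals:
            self.text = ""

    @property
    def active(self) -> bool:
        return self.text != ""


# Hypothetical usage: a note arrives, then the sender goes silent.
state = NoteState()
state.on_note("on the phone, can't talk")
```

Counting intervals rather than wall-clock time keeps the sketch independent of how the RTCP interval itself is computed.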
6.5.8 PRIV: Private extensions SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     PRIV=8    |     length    | prefix length |prefix string...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ...             |                  value string               ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

This item is used to define experimental or application-specific SDES extensions. The item contains a prefix consisting of a length-string pair, followed by the value string filling the remainder of the item and carrying the desired information. The prefix length field is 8 bits long. The prefix string is a name chosen by the person defining the PRIV item to be unique with respect to other PRIV items this application might receive.
The application creator might choose to use the application name plus an additional subtype identification if needed. Alternatively, it is recommended that others choose a name based on the entity they represent, then coordinate the use of the name within that entity.

Note that the prefix consumes some space within the item's total length of 255 octets, so the prefix should be kept as short as possible. This facility and the constrained RTCP bandwidth should not be overloaded; it is not intended to satisfy all the control communication requirements of all applications.

SDES PRIV prefixes will not be registered by IANA. If some form of the PRIV item proves to be of general utility, it should instead be assigned a regular SDES item type registered with IANA so that no prefix is required. This simplifies use and increases transmission efficiency.

6.6 BYE: Goodbye RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|    SC   |   PT=BYE=203  |             length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |     length    |               reason for leaving            ... (opt)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The BYE packet indicates that one or more sources are no longer active.

version (V), padding (P), length:
   As described for the SR packet (see Section 6.4.1).

packet type (PT): 8 bits
   Contains the constant 203 to identify this as an RTCP BYE packet.

source count (SC): 5 bits
   The number of SSRC/CSRC identifiers included in this BYE packet. A count value of zero is valid, but useless.

The rules for when a BYE packet should be sent are specified in Section 6.3.7.

If a BYE packet is received by a mixer, the mixer forwards the BYE packet with the SSRC/CSRC identifier(s) unchanged. If a mixer shuts down, it should send a BYE packet listing all contributing sources it handles, as well as its own SSRC identifier. Optionally, the BYE packet may include an 8-bit octet count followed by that many octets of text indicating the reason for leaving, e.g., "camera malfunction" or "RTP loop detected". The string has the same encoding as that described for SDES. If the string fills the packet to the next 32-bit boundary, the string is not null terminated.
If not, the BYE packet is padded with null octets to the next 32-bit boundary. This padding is separate from that indicated by the P bit in the RTCP header.

6.7 APP: Application-defined RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P| subtype |   PT=APP=204  |             length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          name (ASCII)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   application-dependent data                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The APP packet is intended for experimental use as new applications and new features are developed, without requiring packet type value registration. APP packets with unrecognized names should be ignored. After testing and if wider use is justified, it is recommended that each APP packet be redefined without the subtype and name fields and registered with the Internet Assigned Numbers Authority using an RTCP packet type.

version (V), padding (P), length:
   As described for the SR packet (see Section 6.4.1).
subtype: 5 bits
   May be used as a subtype to allow a set of APP packets to be defined under one unique name, or for any application-dependent data.

packet type (PT): 8 bits
   Contains the constant 204 to identify this as an RTCP APP packet.

name: 4 octets
   A name chosen by the person defining the set of APP packets to be unique with respect to other APP packets this application might receive. The application creator might choose to use the application name, and then coordinate the allocation of subtype values to others who want to define new packet types for the application. Alternatively, it is recommended that others choose a name based on the entity they represent, then coordinate the use of the name within that entity.
   The name is interpreted as a sequence of four ASCII characters, with uppercase and lowercase characters treated as distinct.

application-dependent data: variable length
   Application-dependent data may or may not appear in an APP packet. It is interpreted by the application and not RTP itself. It must be a multiple of 32 bits long.

7 RTP Translators and Mixers

In addition to end systems, RTP supports the notion of "translators" and "mixers", which could be considered as "intermediate systems" at the RTP level. Although this support adds some complexity to the protocol, the need for these functions has been clearly established by experiments with multicast audio and video applications in the Internet. Example uses of translators and mixers given in Section 2.3 stem from the presence of firewalls and low bandwidth connections, both of which are likely to remain.

7.1 General Description

An RTP translator/mixer connects two or more transport-level "clouds". Typically, each cloud is defined by a common network and transport protocol (e.g., IP/UDP) plus a multicast address and transport level destination port or a pair of unicast addresses and ports. (Network-level protocol translators, such as IP version 4 to IP version 6, may be present within a cloud invisibly to RTP.) One system may serve as a translator or mixer for a number of RTP sessions, but each is considered a logically separate entity.

In order to avoid creating a loop when a translator or mixer is installed, the following rules must be observed:

   o  Each of the clouds connected by translators and mixers participating in one RTP session either must be distinct from all the others in at least one of these parameters (protocol, address, port), or must be isolated at the network level from the others.

   o  A derivative of the first rule is that there must not be multiple translators or mixers connected in parallel unless by some arrangement they partition the set of sources to be forwarded.

Similarly, all RTP end systems that can communicate through one or more RTP translators or mixers share the same SSRC space, that is, the SSRC identifiers must be unique among all these end systems. Section 8.2 describes the collision resolution algorithm by which SSRC identifiers are kept unique and loops are detected.

There may be many varieties of translators and mixers designed for different purposes and applications. Some examples are to add or remove encryption, change the encoding of the data or the underlying protocols, or replicate between a multicast address and one or more unicast addresses. The distinction between translators and mixers is that a translator passes through the data streams from different sources separately, whereas a mixer combines them to form one new stream:

Translator: Forwards RTP packets with their SSRC identifier intact; this makes it possible for receivers to identify individual sources even though packets from all the sources pass through the same translator and carry the translator's network source address. Some kinds of translators will pass through the data untouched, but others may change the encoding of the data and thus the RTP data payload type and timestamp. If multiple data packets are re-encoded into one, or vice versa, a translator must assign new sequence numbers to the outgoing packets. Losses in the incoming packet stream may induce corresponding gaps in the outgoing sequence numbers. Receivers cannot detect the presence of a translator unless they know by some other means what payload type or transport address was used by the original source.

Mixer: Receives streams of RTP data packets from one or more sources, possibly changes the data format, combines the streams in some manner and then forwards the combined stream.
Since the timing among multiple input sources will not generally be synchronized, the mixer will make timing adjustments among the streams and generate its own timing for the combined stream, so it is the synchronization source. Thus, all data packets forwarded by a mixer will be marked with the mixer's own SSRC identifier. In order to preserve the identity of the original sources contributing to the mixed packet, the mixer should insert their SSRC identifiers into the CSRC identifier list following the fixed RTP header of the packet. A mixer that is also itself a contributing source for some packet should explicitly include its own SSRC identifier in the CSRC list for that packet.

For some applications, it may be acceptable for a mixer not to identify sources in the CSRC list. However, this introduces the danger that loops involving those sources could not be detected.
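As a sketch of the mixer behaviour just described, the Python fragment below packs the fixed RTP header and CSRC list for one mixed packet, using the identifiers shown for mixer M3 in Figure 3 (SSRC 89, contributing sources 64 and 45). The function name `build_mixed_header` and the sequence number, timestamp and payload type values are illustrative assumptions; the field layout follows the fixed RTP header defined earlier in this memo.

```python
import struct


def build_mixed_header(mixer_ssrc, contributors, seq, timestamp, payload_type):
    """Pack a 12-octet fixed RTP header followed by the CSRC list."""
    if len(contributors) > 15:
        raise ValueError("CC is a 4-bit field: at most 15 CSRC identifiers")
    first = (2 << 6) | len(contributors)   # V=2, P=0, X=0, CC=count
    second = payload_type & 0x7F           # M=0, 7-bit payload type
    header = struct.pack("!BBHII", first, second, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, mixer_ssrc)
    # Each contributing source's SSRC is copied into the CSRC list.
    return header + b"".join(struct.pack("!I", c) for c in contributors)


# Hypothetical values for everything except the SSRC/CSRC identifiers.
hdr = build_mixed_header(mixer_ssrc=89, contributors=[64, 45],
                         seq=1, timestamp=160, payload_type=0)
print(len(hdr), hdr.hex())
```

The sketch rejects more than 15 contributors outright; how a real mixer chooses which identifiers to keep in that case is outside its scope.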
The advantage of a mixer over a translator for applications like audio is that the output bandwidth is limited to that of one source even when multiple sources are active on the input side. This may be important for low-bandwidth links. The disadvantage is that receivers on the output side don't have any control over which sources are passed through or muted, unless some mechanism is implemented for remote control of the mixer. The regeneration of synchronization information by mixers also means that receivers can't do inter-media synchronization of the original streams. A multi-media mixer could do it.

          [E1]                                    [E6]
           |                                       |
     E1:17 |                                 E6:15 |
           |                                       |   E6:15
           V  M1:48 (1,17)         M1:48 (1,17)    V   M1:48 (1,17)
          (M1)-------------><T1>-----------------><T2>-------------->[E7]
           ^                 ^     E4:47           ^   E4:47
      E2:1 |           E4:47 |                     |   M3:89 (64,45)
           |                 |                     |
          [E2]              [E4]     M3:89 (64,45) |
                                                   |        legend:
    [E3] --------->(M2)----------->(M3)------------|        [End system]
          E3:64        M2:12 (64)   ^                       (Mixer)
                                    | E5:45                 <Translator>
                                    |
                                   [E5]          source: SSRC (CSRCs)
                                                 ------------------->

   Figure 3: Sample RTP network with end systems, mixers and translators

A collection of mixers and translators is shown in Figure 3 to illustrate their effect on SSRC and CSRC identifiers. In the figure, end systems are shown as rectangles (named E), translators as triangles (named T) and mixers as ovals (named M).
The notation "M1: 48(1,17)" designates a packet originating at mixer M1, identified with M1's (random) SSRC value of 48 and two CSRC identifiers, 1 and 17, copied from the SSRC identifiers of packets from E1 and E2.

7.2 RTCP Processing in Translators

In addition to forwarding data packets, perhaps modified, translators and mixers must also process RTCP packets. In many cases, they will take apart the compound RTCP packets received from end systems to aggregate SDES information and to modify the SR or RR packets. Retransmission of this information may be triggered by the packet arrival or by the RTCP interval timer of the translator or mixer itself.

A translator that does not modify the data packets, for example one that just replicates between a multicast address and a unicast address, may simply forward RTCP packets unmodified as well. A translator that transforms the payload in some way must make corresponding transformations in the SR and RR information so that it still reflects the characteristics of the data and the reception quality. These translators must not simply forward RTCP packets.
In general, a translator should not aggregate SR and RR packets from different sources into one packet since that would reduce the accuracy of the propagation delay measurements based on the LSR and DLSR fields.

SR sender information: A translator does not generate its own sender information, but forwards the SR packets received from one cloud to the others. The SSRC is left intact but the sender information must be modified if required by the translation. If a translator changes the data encoding, it must change the "sender's byte count" field. If it also combines several data packets into one output packet, it must change the "sender's packet count" field. If it changes the timestamp frequency, it must change the "RTP timestamp" field in the SR packet.

SR/RR reception report blocks: A translator forwards reception reports received from one cloud to the others. Note that these flow in the direction opposite to the data. The SSRC is left intact. If a translator combines several data packets into one output packet, and therefore changes the sequence numbers, it must make the inverse manipulation for the packet loss fields and the "extended last sequence number" field. This may be complex. In the extreme case, there may be no meaningful way to translate the reception reports, so the translator may pass on no reception report at all or a synthetic report based on its own reception. The general rule is to do what makes sense for a particular translation.

A translator does not require an SSRC identifier of its own, but may choose to allocate one for the purpose of sending reports about what it has received. These would be sent to all the connected clouds, each corresponding to the translation of the data stream as sent to that cloud, since reception reports are normally multicast to all participants.

SDES: Translators typically forward without change the SDES information they receive from one cloud to the others, but may, for example, decide to filter non-CNAME SDES information if bandwidth is limited. The CNAMEs must be forwarded to allow SSRC identifier collision detection to work.
A translator that generates its own RR packets must send SDES CNAME information about itself to the same clouds that it sends those RR packets.

BYE: Translators forward BYE packets unchanged. A translator that is about to cease forwarding packets should send a BYE packet to each connected cloud containing all the SSRC identifiers that were previously being forwarded to that cloud, including the translator's own SSRC identifier if it sent reports of its own.

APP: Translators forward APP packets unchanged.

7.3 RTCP Processing in Mixers

Since a mixer generates a new data stream of its own, it does not pass through SR or RR packets at all and instead generates new information for both sides.

SR sender information: A mixer does not pass through sender information from the sources it mixes because the characteristics of the source streams are lost in the mix. As a synchronization source, the mixer generates its own SR packets with sender information about the mixed data stream and sends them in the same direction as the mixed stream.
SR/RR reception report blocks: A mixer generates its own reception reports for sources in each cloud and sends them out only to the same cloud. It does not send these reception reports to the other clouds and does not forward reception reports from one cloud to the others because the sources would not be SSRCs there (only CSRCs).

SDES: Mixers typically forward without change the SDES information they receive from one cloud to the others, but may, for example, decide to filter non-CNAME SDES information if bandwidth is limited. The CNAMEs must be forwarded to allow SSRC identifier collision detection to work. (An identifier in a CSRC list generated by a mixer might collide with an SSRC identifier generated by an end system.) A mixer must send SDES CNAME information about itself to the same clouds that it sends SR or
RR packets.

Since mixers do not forward SR or RR packets, they will typically be extracting SDES packets from a compound RTCP packet. To minimize overhead, chunks from the SDES packets may be aggregated into a single SDES packet which is then stacked on an SR or RR packet originating from the mixer. The RTCP packet rate may be different on each side of the mixer.

A mixer that does not insert CSRC identifiers may also refrain from forwarding SDES CNAMEs. In this case, the SSRC identifier spaces in the two clouds are independent. As mentioned earlier, this mode of operation creates a danger that loops can't be detected.

BYE: Mixers need to forward BYE packets. A mixer that is about to cease forwarding packets should send a BYE packet to each connected cloud containing all the SSRC identifiers that were previously being forwarded to that cloud, including the mixer's own SSRC identifier if it sent reports of its own.

APP: The treatment of APP packets by
mixers is application-specific.

7.4 Cascaded Mixers

An RTP session may involve a collection of mixers and translators as shown in Figure 3. If two mixers are cascaded, such as M2 and M3 in the figure, packets received by a mixer may already have been mixed and may include a CSRC list with multiple identifiers. The second mixer should build the CSRC list for the outgoing packet using the CSRC identifiers from already-mixed input packets and the SSRC identifiers from unmixed input packets. This is shown in the output arc from mixer M3 labeled M3:89(64,45) in the figure. As in the case of mixers that are not cascaded, if the resulting CSRC list has more than 15 identifiers, the remainder cannot be included.

8 SSRC Identifier Allocation and Use

The SSRC identifier carried in the RTP header and in various fields of RTCP packets is a random 32-bit number that is required to be globally unique within an RTP session. It is crucial that the number be chosen with care in order that participants on the same network or starting at the same time are not likely to choose the same number.
It is not sufficient to use the local network address (such as an IPv4 address) for the identifier because the address may not be unique. Since RTP translators and mixers enable interoperation among multiple networks with different address spaces, the allocation patterns for addresses within two spaces might result in a much higher rate of collision than would occur with random allocation.

Multiple sources running on one host would also conflict.

It is also not sufficient to obtain an SSRC identifier simply by calling random() without carefully initializing the state. An example of how to generate a random identifier is presented in Appendix A.6.

8.1 Probability of Collision

Since the identifiers are chosen randomly, it is possible that two or more sources will choose the same number. Collision occurs with the highest probability when all sources are started simultaneously, for example when triggered automatically by some session management event. If N is the number of sources and L the length of the identifier (here, 32 bits), the probability that two sources independently pick the same value can be approximated for large N [20] as 1 - exp(-N**2 / 2**(L+1)). For N=1000, the probability is roughly 10**-4.
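As a rough check on the figure above, the approximation can be evaluated directly. This is only an illustrative sketch: the function names are invented here and are not part of the specification. The second function evaluates the N / 2**L fraction that the text gives next for a single source joining a session whose identifiers are already unique.

```python
import math


def p_collision_simultaneous(n: int, bits: int = 32) -> float:
    """Approximate 1 - exp(-N**2 / 2**(L+1)) for N sources starting together."""
    # expm1 keeps precision when the exponent is very close to zero.
    return -math.expm1(-(n ** 2) / 2 ** (bits + 1))


def p_collision_joining(n: int, bits: int = 32) -> float:
    """Fraction of the identifier space already used: N / 2**L."""
    return n / 2 ** bits


if __name__ == "__main__":
    print(f"{p_collision_simultaneous(1000):.2e}")  # 1.16e-04
    print(f"{p_collision_joining(1000):.2e}")       # 2.33e-07
```

Both results agree with the rounded values quoted in the text (roughly 10**-4 and 2*10**-7).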
The typical collision probability is much lower than the worst-case above. When one new source joins an RTP session in which all the other sources already have unique identifiers, the probability of collision is just the fraction of numbers used out of the space. Again, if N is the number of sources and L the length of the identifier, the probability of collision is N / 2**L. For N=1000, the probability is roughly 2*10**-7.

The probability of collision is further reduced by the opportunity for a new source to receive packets from other participants before sending its first packet (either data or control). If the new source keeps track of the other participants (by SSRC identifier), then before transmitting its first packet the new source can verify that its identifier does not conflict with any that have been received, or else choose again.

8.2 Collision Resolution and Loop Detection

Although the probability of SSRC identifier collision is low, all RTP implementations must be prepared to detect collisions and take the appropriate actions to resolve them. If a source discovers at any time that another source is using the same SSRC identifier as its own, it must send an RTCP BYE packet for the old identifier and choose another random one. (As explained below, this step is taken only once in case of a loop.) If a
receiver discovers that two other sources are colliding, it may keep the packets from one and discard the packets from the other when this can be detected by different source transport addresses or CNAMEs. The two sources are expected to resolve the collision so that the situation doesn't last.

Because the random SSRC identifiers are kept globally unique for each RTP session, they can also be used to detect loops that may be introduced by mixers or translators. A loop causes duplication of data and control information, either unmodified or possibly mixed, as in the following examples:

o A translator may incorrectly forward a packet to the same multicast group from which it has received the packet, either directly or through a chain of translators. In that case, the same packet appears several times, originating from different network sources.

o Two translators incorrectly set up in parallel, i.e., with the same multicast groups on both sides, would both forward packets from one multicast group to the other. Unidirectional translators would produce two copies; bidirectional translators would form a loop.
o A mixer can close a loop by sending to the same transport destination upon which it receives packets, either directly or through another mixer or translator. In this case a source might show up both as an SSRC on a data packet and a CSRC in a mixed data packet.

A source may discover that its own packets are being looped, or that packets from another source are being looped (a third-party loop).

Both loops and collisions in the random selection of a source identifier result in packets arriving with the same SSRC identifier but a different source transport address, which may be that of the end system originating the packet or an intermediate system. Therefore, if a source changes its source transport address, it must also choose a new SSRC identifier to avoid being interpreted as a looped source. Note that if a
translator restarts and consequently changes the source transport address (e.g., changes the UDP source port number) on which it forwards packets, then all those packets will appear to receivers to be looped because the SSRC identifiers are applied by the original source and will not change. This problem may be avoided by keeping the source transport address fixed across restarts, but in any case will be resolved after a timeout at the receivers.

Loops or collisions occurring on the far side of a translator or mixer cannot be detected using the source transport address if all copies of the packets go through the translator or mixer; however, collisions may still be detected when chunks from two RTCP
SDES packets contain the same SSRC identifier but different CNAMEs.

To detect and resolve these conflicts, an RTP implementation must include an algorithm similar to the one described below. It ignores packets from a new source or loop that collide with an established source. It resolves collisions with the participant's own SSRC identifier by sending an RTCP BYE for the old identifier and choosing a new one. However, when the collision was induced by a loop of the participant's own packets, the algorithm will choose a new identifier only once and thereafter ignore packets from the looping source transport address. This is required to avoid a flood of BYE packets.

This algorithm requires keeping a table indexed by the source identifier and containing the source transport addresses from the first RTP packet and first RTCP packet received with that identifier, along with other state for that source. Two source transport addresses are required since, for
example, the UDP source port numbers may be different on RTP and RTCP packets. However, it may be assumed that the network address is the same in both source transport addresses.

Each SSRC or CSRC identifier received in an RTP or RTCP packet is looked up in the source identifier table in order to process that data or control information. The source transport address from the packet is compared to the corresponding source transport address in the table to detect a loop or collision if they don't match. For control packets, each element with its own SSRC id, for example an SDES chunk, requires a separate lookup. (The SSRC id in a reception report block is an exception because it identifies a source heard by the reporter, and that SSRC id is unrelated to the source transport address of the RTCP packet sent by the reporter.) If the SSRC or CSRC is not found, a new entry is created.
These table entries are removed when an RTCP BYE packet is received with the corresponding SSRC id and validated by a matching source transport address, or after no packets have arrived for a relatively long time (see Section 6.2.1).

Note that if two sources on the same host are transmitting with the same source identifier at the time a receiver begins operation, it would be possible that the first RTP packet received came from one of the sources while the first RTCP packet received came from the other. This would cause the wrong RTCP information to be associated with the RTP data, but this situation should be sufficiently rare and harmless that it may be disregarded.

In order to track loops of the participant's own data packets, it is also necessary to keep a separate list of source transport addresses (not identifiers) that have been found to be conflicting.
As in the source identifier table, two source transport addresses must be kept to separately track conflicting RTP and RTCP packets. Note that the conflicting address list should be short, usually empty. Each element in this list stores the source addresses plus the time when the most recent conflicting packet was received. An element may be removed from the list when no conflicting packet has arrived from that source for a time on the order of 10 RTCP report intervals (see Section 6.2).

For the algorithm as shown, it is assumed that the participant's own source identifier and state are included in the source identifier table. The algorithm could be restructured to first make a separate comparison against the participant's own source identifier.

    IF the SSRC or CSRC identifier is not found in the source
       identifier table:
    THEN create a new entry storing the data or control source
         transport address, the SSRC or CSRC id and other state.
         CONTINUE with normal processing.

    (identifier is found in the table)

    IF the table entry was created on receipt of a control packet
       and this is the first data packet or vice versa:
    THEN store the source transport address from this packet.
         CONTINUE with normal processing.
    IF the source transport address from the packet matches
       the one saved in the table entry for this identifier:
    THEN CONTINUE with normal processing.

    (an identifier collision or a loop is indicated)

    IF the source identifier is not the participant's own:
    THEN IF the source identifier is from an RTCP SDES chunk
            containing a CNAME item that differs from the CNAME
            in the table entry:
         THEN (optionally) count a third-party collision.
         ELSE (optionally) count a third-party loop.
         ABORT processing of data packet or control element.

    (a collision or loop of the participant's own packets)

    IF the source transport address is found in the list of
       conflicting data or control source transport addresses:
    THEN IF the source identifier is not from an RTCP SDES chunk
            containing a CNAME item OR if that CNAME is the
            participant's own:
         THEN (optionally) count occurrence of own traffic looped.
         mark current time in conflicting address list entry.
         ABORT processing of data packet or control element.

    log occurrence of a collision.
    create a new entry in the conflicting data or control source
       transport address list and mark current time.
    send an RTCP BYE packet with the old SSRC identifier.
    choose a new identifier.
    create a new entry in the source identifier table with the
       old SSRC plus the source transport address from the data
       or control packet being processed.
    CONTINUE with normal processing.

In this algorithm, packets from a newly conflicting source address will be ignored and packets from the original source will be kept. (If the original source was through a mixer and later the same source is received directly, the receiver may be well advised to switch unless other sources in the mix would be lost.) If no packets arrive from the original source for an extended period, the table entry will be timed out and the new source will be able to take over. This might occur if the original source detects the collision and moves to a new source identifier, but in the usual case an RTCP BYE packet will be received from
the original source to delete the state without having to wait for a timeout.

When a new SSRC identifier is chosen due to a collision, the candidate identifier should first be looked up in the source identifier table to see if it was already in use by some other source. If so, another candidate should be generated and the process repeated.

A loop of data packets to a multicast destination can cause severe network flooding. All mixers and translators are required to implement a loop detection algorithm like the one here so that they can break loops. This should limit the excess traffic to no more than one duplicate copy of the original traffic, which may allow the session to continue so that the cause of the loop can be found and fixed. However, in extreme cases where a mixer or translator does not properly break the loop and high traffic levels result, it may be necessary for end systems to cease transmitting data or control packets entirely. This decision may depend upon the application. An error condition should be indicated as appropriate. Transmission might be attempted again periodically after a long, random time (on the order of minutes).
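The collision and loop detection procedure of this section can be sketched in code as follows. This is a minimal illustration under stated assumptions, not a reference implementation. The class and method names (Entry, Participant, check) are hypothetical, and sending a BYE is stood in for by appending to a list. Table and conflict-list timeouts are omitted, and secrets.randbits is used only as a convenient random source in place of the identifier generation procedure the specification refers to.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    rtp_addr: Optional[tuple] = None   # address of first RTP packet seen
    rtcp_addr: Optional[tuple] = None  # address of first RTCP packet seen
    cname: Optional[str] = None


@dataclass
class Participant:
    ssrc: int
    cname: str
    rtp_addr: tuple
    rtcp_addr: tuple
    table: dict = field(default_factory=dict)      # identifier -> Entry
    conflicts: dict = field(default_factory=dict)  # (addr, is_rtcp) -> time
    byes_sent: list = field(default_factory=list)  # stand-in for RTCP BYE
    log: list = field(default_factory=list)

    def __post_init__(self):
        # The text assumes the participant's own identifier is in the table.
        self.table[self.ssrc] = Entry(self.rtp_addr, self.rtcp_addr, self.cname)

    def _new_ssrc(self) -> int:
        # Regenerate until the candidate is not already in the table.
        while True:
            candidate = secrets.randbits(32)
            if candidate not in self.table:
                return candidate

    def check(self, ident, addr, is_rtcp=False, sdes_cname=None, now=None):
        """Return True to continue normal processing, False to abort."""
        now = time.monotonic() if now is None else now
        slot = "rtcp_addr" if is_rtcp else "rtp_addr"
        entry = self.table.get(ident)
        if entry is None:  # identifier not found: create a new entry
            self.table[ident] = Entry(**{slot: addr}, cname=sdes_cname)
            return True
        saved = getattr(entry, slot)
        if saved is None:  # first data packet after control, or vice versa
            setattr(entry, slot, addr)
            return True
        if saved == addr:
            if sdes_cname is not None and entry.cname is None:
                entry.cname = sdes_cname
            return True
        # An identifier collision or a loop is indicated.
        if ident != self.ssrc:
            if sdes_cname is not None and sdes_cname != entry.cname:
                self.log.append("third-party collision")
            else:
                self.log.append("third-party loop")
            return False
        # A collision or loop of the participant's own packets.
        key = (addr, is_rtcp)
        if key in self.conflicts:
            if sdes_cname is None or sdes_cname == self.cname:
                self.log.append("own traffic looped")
            self.conflicts[key] = now
            return False
        self.log.append("collision")
        self.conflicts[key] = now
        old = self.ssrc
        self.byes_sent.append(old)  # send an RTCP BYE with the old SSRC
        self.ssrc = self._new_ssrc()
        self.table[old] = Entry(**{slot: addr}, cname=sdes_cname)
        self.table[self.ssrc] = Entry(self.rtp_addr, self.rtcp_addr, self.cname)
        return True
```

In this sketch a first conflict on the participant's own identifier triggers exactly one BYE and one identifier change. Later packets from the same conflicting address that carry the new identifier are dropped without a further BYE, which is the behaviour the text requires in order to avoid a flood of BYE packets when the participant's own traffic is looped.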
8.3 Use with Layered Encodings

For layered encodings transmitted on separate RTP sessions (see Section 2.4), a single SSRC identifier space should be used across the sessions of all layers and the core (base) layer should be used for SSRC identifier allocation and collision resolution. When a source discovers that it has collided, it transmits an RTCP BYE message on only the base layer but changes the SSRC identifier to the new value in all layers.

9 Security

Lower layer protocols may eventually provide all the security services that may be desired for applications of RTP, including authentication, integrity, and confidentiality. These services have been specified for IP in [21]. Since the initial audio and video applications using RTP needed a confidentiality service before such services were available for the IP layer, the confidentiality service described in the next section was defined for use with RTP and RTCP. That description is included here to codify existing practice. New applications of RTP MAY implement this RTP-specific confidentiality service for backward compatibility, and/or they MAY implement IP layer security services. The overhead on the RTP protocol for this confidentiality service is low, so the penalty will be minimal if this service is obsoleted by lower layer services in the future.
Alternatively, other services, other implementations of services and other algorithms may be defined for RTP in the future if warranted. The selection presented here is meant to simplify implementation of interoperable, secure applications and provide guidance to implementors. No claim is made that the methods presented here are appropriate for a particular security need. A profile may specify which services and algorithms should be offered by applications, and may provide guidance as to their appropriate use.

Key distribution and certificates are outside the scope of this document.

9.1 Confidentiality

Confidentiality means that only the intended receiver(s) can decode the received packets; for others, the packet contains no useful information. Confidentiality of the content is achieved by encryption.

When encryption of RTP or RTCP is desired, all the octets that will be encapsulated for transmission in a single lower-layer packet are encrypted as a unit. For RTCP, a 32-bit random number is prepended to the unit before encryption to deter known plaintext attacks. For RTP, no prefix is required because the sequence number and timestamp fields are initialized with random offsets.
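As a rough illustration of the "encrypted as a unit" rule above, the sketch below assembles the octets handed to the cipher for RTCP and for RTP. It makes several assumptions: the names protect_rtcp and protect_rtp are hypothetical, and the cipher itself is passed in as a caller-supplied function (encrypt) rather than implemented here, since the choice of algorithm and key handling are outside what this sketch tries to show.

```python
import secrets
from typing import Callable

Cipher = Callable[[bytes], bytes]  # hypothetical: plaintext unit -> ciphertext


def protect_rtcp(compound_packet: bytes, encrypt: Cipher) -> bytes:
    """Encrypt one compound RTCP packet as a single unit.

    A fresh 32-bit random number is prepended before encryption to
    deter known plaintext attacks.
    """
    prefix = secrets.token_bytes(4)  # 32-bit random prefix
    return encrypt(prefix + compound_packet)


def protect_rtp(packet: bytes, encrypt: Cipher) -> bytes:
    """Encrypt one RTP packet as a single unit.

    No prefix is added: the sequence number and timestamp fields are
    assumed to have been initialized with random offsets already.
    """
    return encrypt(packet)


if __name__ == "__main__":
    # Stand-in "cipher" for demonstration only; it does not encrypt anything.
    passthrough = lambda unit: unit
    rtcp = bytes(8)  # placeholder octets, not a valid RTCP packet
    print(len(protect_rtcp(rtcp, passthrough)))  # 4 octets longer than input
```

A receiver following the same sketch would decrypt the unit and discard the first four octets before parsing the compound RTCP packet.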
For RTCP, it is allowed to split a compound RTCP packet into two lower-layer packets, one to be encrypted and one to be sent in the clear. For example, SDES information might be encrypted while reception reports were sent in the clear to accommodate third-party monitors that are not privy to the encryption key. In this example, depicted in Fig. 4, the SDES information must be appended to an RR packet with no reports (and the random number) to satisfy the requirement that all compound RTCP packets begin with an SR or RR packet.

                UDP packet                        UDP packet
   -------------------------------------  -------------------------
   [32-bit ][       ][     #           ]  [    # sender # receiver]
   [random ][  RR   ][SDES # CNAME, ...]  [ SR # report # report  ]
   [integer][(empty)][     #           ]  [    #        #         ]
   -------------------------------------  -------------------------
                 encrypted                      not encrypted

   #: SSRC

          Figure 4: Encrypted and non-encrypted RTCP packets

The presence of encryption and the use of the correct key are confirmed by
the receiver through header or payload validity checks. Examples of such validity checks for RTP and RTCP headers are given in Appendices A.1 and A.2.

The default encryption algorithm is the Data Encryption Standard (DES) algorithm in cipher block chaining (CBC) mode, as described in Section 1.1 of RFC 1423 [22], except that padding to a multiple of 8 octets is indicated as described for the P bit in Section 5.1. The initialization vector is zero because random values are supplied in the RTP header or by the random prefix for compound RTCP packets. For details on the use of CBC initialization vectors, see [23]. Implementations that support encryption should always support the DES algorithm in CBC mode as the default to maximize interoperability. This method is chosen because it has been demonstrated to be easy and practical to use in experimental audio and video tools in operation on the
Internet. Other encryption algorithms may be specified dynamically for a session by non-RTP means.

As an alternative to encryption at the IP level or at the RTP level as described above, profiles may define additional payload types for encrypted encodings. Those encodings must specify how padding and other aspects of the encryption should be handled. This method allows encrypting only the data while leaving the headers in the clear for applications where that is desired. It may be particularly useful for hardware devices that will handle both decryption and decoding.

9.2 Authentication and Message Integrity

Authentication and message integrity services are not defined at the RTP level since these services would not be directly feasible without a key management infrastructure. It is expected that authentication and integrity services will be provided by lower layer protocols.

10 RTP over Network and Transport Protocols

This section describes issues specific to
carrying RTP packets within particular network and transport protocols. The following rules apply unless superseded by protocol-specific definitions outside this specification.

RTP relies on the underlying protocol(s) to provide demultiplexing of RTP data and RTCP control streams. For UDP and similar protocols, RTP uses an even port number and the corresponding RTCP stream uses the next higher (odd) port number. If an application is supplied with an odd number for use as the RTP port, it should replace this number with the next lower (even) number. In a unicast session, applications should be prepared to receive RTP data and control on one port pair and send to another. It is