MMUSIC Working Group                                      H. Schulzrinne
Internet-Draft                                       Columbia University
Intended status: Standards Track                                  A. Rao
Expires: November 6, 2008                                          Cisco
                                                             R. Lanphier

                                                           M. Westerlund
                                                             Ericsson AB
                                                    M. Stiemerling (Ed.)
                                                                     NEC
                                                             May 5, 2008


                Real Time Streaming Protocol 2.0 (RTSP)
                  draft-ietf-mmusic-rfc2326bis-18.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 6, 2008.

Schulzrinne, et al.     Expires November 6, 2008                [Page 1]

Internet-Draft  Real Time Streaming Protocol 2.0 (RTSP)         May 2008

Abstract

   This memorandum defines RTSP version 2.0 which is a revision of the Proposed Standard RTSP version 1.0 which is defined in RFC 2326.

   The Real Time Streaming Protocol, or RTSP, is an application-level protocol for control over the delivery of data with real-time properties. RTSP provides an extensible framework to enable controlled, on-demand delivery of real-time data, such as audio and video. Sources of data can include both live data feeds and stored clips.
   This protocol is intended to control multiple data delivery sessions, provide a means for choosing delivery channels such as UDP, multicast UDP and TCP, and provide a means for choosing delivery mechanisms based upon RTP (RFC 3550).

Table of Contents

   1.  Introduction
     1.1.  Scope and Background
     1.2.  RTSP Specification Update
     1.3.  Notational Conventions
     1.4.  Terminology
     1.5.  Media Properties
       1.5.1.  Random Access
       1.5.2.  Retention
       1.5.3.  Content Modifications
       1.5.4.  Mapping to the Attributes
   2.  RTSP Introduction
     2.1.  Protocol Properties
     2.2.  RTSP's Relationship to HTTP
     2.3.  Extending RTSP
     2.4.  Overall Operation
     2.5.  RTSP States
     2.6.  Relationship with Other Protocols
   3.  RTSP Use Cases
     3.1.  On-demand Playback of Stored Content
     3.2.  Unicast distribution of Live Content
     3.3.  On-demand Playback using Multicast
     3.4.  Inviting an RTSP server into a conference
     3.5.  Live Content using Multicast
   4.  Protocol Parameters
     4.1.  RTSP Version
     4.2.  RTSP IRI and URI
     4.3.  Session Identifiers
     4.4.  SMPTE Relative Timestamps
     4.5.  Normal Play Time
     4.6.  Absolute Time
     4.7.  Feature-tags
     4.8.  Entity Tags
   5.  RTSP Message
     5.1.  Message Types
     5.2.  Message Headers
     5.3.  Message Body
     5.4.  Message Length
   6.  General Header Fields
   7.  Request
     7.1.  Request Line
     7.2.  Request Header Fields
   8.  Response
     8.1.  Status-Line
       8.1.1.  Status Code and Reason Phrase
     8.2.  Response Header Fields
   9.  Entity
     9.1.  Entity Header Fields
     9.2.  Entity Body
   10. Connections
     10.1.  Reliability and Acknowledgements
     10.2.  Using Connections
     10.3.  Closing Connections
     10.4.  Timing Out Connections and RTSP Messages
     10.5.  Showing Liveness
     10.6.  Use of IPv6
   11. Capability Handling
   12. Pipelining Support
   13. Method Definitions
     13.1.  OPTIONS
     13.2.  DESCRIBE
     13.3.  SETUP
       13.3.1.  Changing Transport Parameters
     13.4.  PLAY
       13.4.1.  General Usage
       13.4.2.  Aggregated Sessions
       13.4.3.  Updating current PLAY Requests
       13.4.4.  Playing On-Demand Media
       13.4.5.  Playing Dynamic On-Demand Media
       13.4.6.  Playing Live Media
       13.4.7.  Playing Live with Recording
       13.4.8.  Playing Live with Time-Shift
     13.5.  PLAY_NOTIFY
       13.5.1.  End-of-Stream
       13.5.2.  Media-Properties-Update
       13.5.3.  Scale-Change
     13.6.  PAUSE
     13.7.  TEARDOWN
     13.8.  GET_PARAMETER
     13.9.  SET_PARAMETER
     13.10. REDIRECT
   14. Embedded (Interleaved) Binary Data
   15. Status Code Definitions
     15.1.  Success 1xx
       15.1.1.  100 Continue
     15.2.  Success 2xx
       15.2.1.  200 OK
     15.3.  Redirection 3xx
       15.3.1.  300 Multiple Choices
       15.3.2.  301 Moved Permanently
       15.3.3.  302 Found
       15.3.4.  303 See Other
       15.3.5.  304 Not Modified
       15.3.6.  305 Use Proxy
     15.4.  Client Error 4xx
       15.4.1.  400 Bad Request
       15.4.2.  405 Method Not Allowed
       15.4.3.  451 Parameter Not Understood
       15.4.4.  452 reserved
       15.4.5.  453 Not Enough Bandwidth
       15.4.6.  454 Session Not Found
       15.4.7.  455 Method Not Valid in This State
       15.4.8.  456 Header Field Not Valid for Resource
       15.4.9.  457 Invalid Range
       15.4.10. 458 Parameter Is Read-Only
       15.4.11. 459 Aggregate Operation Not Allowed
       15.4.12. 460 Only Aggregate Operation Allowed
       15.4.13. 461 Unsupported Transport
       15.4.14. 462 Destination Unreachable
       15.4.15. 463 Destination Prohibited
       15.4.16. 464 Data Transport Not Ready Yet
       15.4.17. 465 Notification Reason Unknown
       15.4.18. 470 Connection Authorization Required
       15.4.19. 471 Connection Credentials not accepted
       15.4.20. 472 Failure to establish secure connection
     15.5.  Server Error 5xx
       15.5.1.  551 Option not supported
   16. Header Field Definitions
     16.1.  Accept
     16.2.  Accept-Credentials
     16.3.  Accept-Encoding
     16.4.  Accept-Language
     16.5.  Accept-Ranges
     16.6.  Allow
     16.7.  Authorization
     16.8.  Bandwidth
     16.9.  Blocksize
     16.10. Cache-Control
     16.11. Connection
     16.12. Connection-Credentials
     16.13. Content-Base
     16.14. Content-Encoding
     16.15. Content-Language
     16.16. Content-Length
     16.17. Content-Location
     16.18. Content-Type
     16.19. CSeq
     16.20. Date
     16.21. ETag
     16.22. Expires
     16.23. From
     16.24. If-Match
     16.25. If-Modified-Since
     16.26. If-None-Match
     16.27. Last-Modified
     16.28. Location
     16.29. Media-Properties
     16.30. Media-Range
     16.31. Notify-Reason
     16.32. Pipelined-Requests
     16.33. Proxy-Authenticate
     16.34. Proxy-Authorization
     16.35. Proxy-Require
     16.36. Proxy-Supported
     16.37. Public
     16.38. Range
     16.39. Referer
     16.40. Retry-After
     16.41. Request-Status
     16.42. Require
     16.43. RTP-Info
     16.44. Scale
     16.45. Seek-Style
     16.46. Speed
     16.47. Server
     16.48. Session
     16.49. Supported
     16.50. Timestamp
     16.51. Transport
     16.52. Unsupported
     16.53. User-Agent
     16.54. Vary
     16.55. Via
     16.56. WWW-Authenticate
   17. Proxies
   18. Caching
   19. Security Framework
     19.1.  RTSP and HTTP Authentication
     19.2.  RTSP over TLS
     19.3.  Security and Proxies
       19.3.1.  Accept-Credentials
       19.3.2.  User approved TLS procedure
   20. Syntax
     20.1.  Base Syntax
     20.2.  RTSP Protocol Definition
       20.2.1.  Generic Protocol elements
       20.2.2.  Message Syntax
       20.2.3.  Header Syntax
     20.3.  SDP extension Syntax
   21. Security Considerations
     21.1.  Remote denial of Service Attack
   22. IANA Considerations
     22.1.  Feature-tags
       22.1.1.  Description
       22.1.2.  Registering New Feature-tags with IANA
       22.1.3.  Registered entries
     22.2.  RTSP Methods
       22.2.1.  Description
       22.2.2.  Registering New Methods with IANA
       22.2.3.  Registered Entries
     22.3.  RTSP Status Codes
       22.3.1.  Description
       22.3.2.  Registering New Status Codes with IANA
       22.3.3.  Registered Entries
     22.4.  RTSP Headers
       22.4.1.  Description
       22.4.2.  Registering New Headers with IANA
       22.4.3.  Registered entries
     22.5.  Transport Header Registries
       22.5.1.  Transport Protocol Specification
       22.5.2.  Transport modes
       22.5.3.  Transport Parameters
     22.6.  Cache Directive Extensions
     22.7.  Accept-Credentials
       22.7.1.  Accept-Credentials policies
       22.7.2.  Accept-Credentials hash algorithms
     22.8.  Range header formats
     22.9.  Media Property Values
       22.9.1.  Description
       22.9.2.  Registration Rules
       22.9.3.  Registered Values
     22.10. Notify-Reason header
       22.10.1.  Description
       22.10.2.  Registration Rules
       22.10.3.  Registered Values
     22.11. Seek-Style
       22.11.1.  Description
       22.11.2.  Registration Rules
       22.11.3.  Registered Values
     22.12. URI Schemes
       22.12.1.  The rtsp URI Scheme
       22.12.2.  The rtsps URI Scheme
       22.12.3.  The rtspu URI Scheme
     22.13. SDP attributes
     22.14. Media Type Registration for text/parameters
   23. References
     23.1.  Normative References
     23.2.  Informative References
   Appendix A.  Examples
     A.1.  Media on Demand (Unicast)
     A.2.  Media on Demand using Pipelining
     A.3.  Media on Demand (Unicast)
     A.4.  Single Stream Container Files
     A.5.  Live Media Presentation Using Multicast
     A.6.  Capability Negotiation
   Appendix B.  RTSP Protocol State Machine
     B.1.  States
     B.2.  State variables
     B.3.  Abbreviations
     B.4.  State Tables
   Appendix C.  Media Transport Alternatives
     C.1.  RTP
       C.1.1.  AVP
       C.1.2.  AVP/UDP
       C.1.3.  AVPF/UDP
       C.1.4.  SAVP/UDP
       C.1.5.  SAVPF/UDP
       C.1.6.  RTCP usage with RTSP
     C.2.  RTP over TCP
       C.2.1.  Interleaved RTP over TCP
       C.2.2.  RTP over independent TCP
       C.2.3.  Handling NPT Jumps in the RTP Media Layer
       C.2.4.  Handling RTP Timestamps after PAUSE
       C.2.5.  RTSP / RTP Integration
       C.2.6.  Scaling with RTP
       C.2.7.  Maintaining NPT synchronization with RTP timestamps
       C.2.8.  Continuous Audio
       C.2.9.  Multiple Sources in an RTP Session
       C.2.10. Usage of SSRCs and the RTCP BYE Message During an RTSP Session
     C.3.  Future Additions
   Appendix D.  Use of SDP for RTSP Session Descriptions
     D.1.  Definitions
       D.1.1.  Control URI
       D.1.2.  Media Streams
       D.1.3.  Payload Type(s)
       D.1.4.  Format-Specific Parameters
       D.1.5.  Directionality of media stream
       D.1.6.  Range of Presentation
       D.1.7.  Time of Availability
       D.1.8.  Connection Information
       D.1.9.  Entity Tag
     D.2.  Aggregate Control Not Available
     D.3.  Aggregate Control Available
     D.4.  RTSP external SDP delivery
   Appendix E.  Text format for Parameters
   Appendix F.  Requirements for Unreliable Transport of RTSP
   Appendix G.  Backwards Compatibility Considerations
     G.1.  Play Request in Play mode
     G.2.  Using Persistent Connections
   Appendix H.  Open Issues
   Appendix I.  Changes
   Appendix J.  Acknowledgements
     J.1.  Contributors
   Appendix K.  RFC Editor Consideration
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

1.1.  Scope and Background

   This memo defines version 2.0 of the Real Time Streaming Protocol (RTSP 2.0), which is an application-level protocol for control over the delivery of data with real-time properties, typically streaming media. Examples of streaming media are video on demand and live audio streaming. Put simply, RTSP acts as a "network remote control" for multimedia servers, similar to the remote control of a TV set.

   The protocol operates between RTSP 2.0 clients and servers, but also supports the usage of RTSP 2.0 proxies between clients and servers. Clients can obtain information about streaming media from servers by asking for a description of the media, or they can use a media description provided externally. Based on the media description, clients can request to play out the media, pause it, or stop it completely, as known from a regular TV remote control. The requested media can consist of multiple audio and video streams that are delivered as time-synchronized streams from servers to clients.

   This memorandum describes the use of RTSP over a reliable, connection-based transport-level protocol, such as TCP. For security, TLS over a connection-oriented transport is supported.

   There is no dependency on a special RTSP connection in the protocol. Instead, an RTSP server maintains a session labeled by an identifier to associate groups of media streams and their states.
   An RTSP session is not tied to a transport-level connection such as a TCP connection. During a session, a client may open and close multiple reliable transport connections to the server to issue RTSP requests for that session.

   The set of streams to be controlled in an RTSP session is defined by a presentation description. This memorandum does not define a format for the presentation description. However, Appendix D describes how SDP [RFC4566] is used for this purpose. The streams controlled by RTSP may use RTP [RFC3550] for their data transport, but the operation of RTSP does not depend on the transport mechanism used to carry continuous media. RTSP is intentionally similar in syntax and operation to HTTP/1.1 [RFC2616] so that extension mechanisms to HTTP may also be applied to RTSP.

   The RTSP 2.0 protocol supports the following operations:

   Retrieval of media from media server: The client can request a presentation description via RTSP DESCRIBE, HTTP, or some other method. If the presentation is being multicast, the presentation description contains the multicast addresses and ports to be used for the continuous media. If the presentation is to be sent only to the client via unicast, the client provides the destination.

   Invitation of a media server to a conference: A media server can be "invited" to join an existing conference to play back media into the presentation. This mode is useful, for example, in distributed teaching applications. Several parties in the conference may take turns "pushing the remote control buttons".

      Note: This functionality requires application-level functionality external to RTSP.

   RTSP requests may be handled by proxies, tunnels and caches as in HTTP/1.1 [RFC2616].

1.2.  RTSP Specification Update

   This memorandum specifies RTSP 2.0, which is an update of RTSP 1.0, a proposed standard defined in [RFC2326].
   The goal of this version is to correct the many flaws that have been identified in RTSP 1.0 since its publication. The corrections are such that backwards compatibility could not be maintained. Thus a new version was deemed the most appropriate solution to get a more functional protocol. There are no plans to revise RTSP 1.0. Appendix I catalogs the changes of this version in relation to RTSP 1.0.

   RTSP 2.0 as specified in this memo has reduced functionality compared to RTSP 1.0 and aims at specifying the RTSP core, functionality and rules for extensions, and basic interaction with the media delivery protocol RTP [RFC3550]. Any other functionality would need to be published as extension documents. This specification provides rules for such extensions and defines registries to avoid naming collisions.

1.3.  Notational Conventions

   Since some of the definitions and syntax are identical to HTTP/1.1, this specification only points to the section where they are defined rather than copying it. For brevity, [HX.Y] is to be taken to refer to Section X.Y of the current HTTP/1.1 specification ([RFC2616]).

   All the mechanisms specified in this document are described in both prose and the Augmented Backus-Naur form (ABNF) described in detail in [RFC5234].

   Indented and smaller-type paragraphs are used to provide informative background and motivation. This is intended to give readers who were not involved with the formulation of the specification an understanding of why things are the way they are in RTSP.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

   The word "unspecified" is used to indicate functionality or features that are not defined in this specification.
   Such functionality cannot be used in a standardized manner without further definition in an extension specification to RTSP.

1.4.  Terminology

   Some of the terminology has been adopted from HTTP/1.1 [RFC2616]. Terms not listed here are defined as in HTTP/1.1.

   Aggregate control: The concept of controlling multiple streams using a single timeline, generally maintained by the server. A client, for example, uses aggregate control when it issues a single play or pause message to simultaneously control both the audio and video in a movie. A session which is under aggregate control is referred to as an aggregated session.

   Aggregate control URI: The URI used in an RTSP request to refer to and control an aggregated session. It normally, but not always, corresponds to the presentation URI specified in the session description. See Section 13.3 for more information.

   Conference: A multiparty, multimedia presentation, where "multi" implies greater than or equal to one.

   Client: The client requests media service from the media server.

   Connection: A transport layer virtual circuit established between two programs for the purpose of communication.

   Container file: A file which may contain multiple media streams which often constitute a presentation when played together. The concept of a container file is not embedded in the protocol. However, RTSP servers may offer aggregate control on the media streams within these files.

   Continuous media: Data where there is a timing relationship between source and sink; that is, the sink needs to reproduce the timing relationship that existed at the source. The most common examples of continuous media are audio and motion video. Continuous media can be real-time (interactive or conversational), where there is a "tight" timing relationship between source and sink, or streaming (playback), where the relationship is less strict.
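
      As a non-normative illustration of the "Aggregate control" definition above, the following minimal sketch models a single play or pause request acting on every stream of an aggregated session through one shared timeline. The class names, fields, and methods are hypothetical and are not part of RTSP; the sketch shows only the concept and no wire format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MediaStream:
    """One media stream, e.g. audio or video (illustrative name only)."""
    name: str
    playing: bool = False


@dataclass
class AggregatedSession:
    """Hypothetical model of a session under aggregate control.

    All streams share a single timeline position, which in RTSP is
    generally maintained by the server.
    """
    streams: List[MediaStream] = field(default_factory=list)
    position: float = 0.0  # shared timeline position, in seconds

    def play(self, start: Optional[float] = None) -> None:
        # One play request starts every stream from the same position.
        if start is not None:
            self.position = start
        for stream in self.streams:
            stream.playing = True

    def pause(self) -> None:
        # One pause request halts every stream together.
        for stream in self.streams:
            stream.playing = False


# A movie with one audio and one video stream under aggregate control.
movie = AggregatedSession(streams=[MediaStream("audio"), MediaStream("video")])
movie.play(start=10.0)
states_after_play = [s.playing for s in movie.streams]
movie.pause()
states_after_pause = [s.playing for s in movie.streams]
```

      Keeping the timeline position on the session object rather than on each stream mirrors the definition's point that the streams are controlled through a single timeline.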
   Entity: The information transferred as the payload of a request or response. An entity consists of meta-information in the form of entity-header fields and content in the form of an entity-body, as described in Section 9.

   Feature-tag: A tag representing a certain set of functionality, i.e., a feature.

   IRI: An Internationalized Resource Identifier is the same as a URI, except that it allows characters from the whole Universal Character Set (Unicode/ISO 10646) rather than US-ASCII only. See [RFC3987] for more information.

   Live: Normally used to describe a presentation or session with media coming from an ongoing event. This generally results in the session having an unbounded or only loosely defined duration, and sometimes no seek operations are possible.

   Media initialization: Datatype/codec specific initialization. This includes such things as clock rates, color tables, etc. Any transport-independent information which is required by a client for playback of a media stream occurs in the media initialization phase of stream setup.

   Media parameter: Parameter specific to a media type that may be changed before or during stream playback.

   Media server: The server providing playback services for one or more media streams. Different media streams within a presentation may originate from different media servers. A media server may reside on the same host as, or on a different host from, the one from which the presentation is invoked.

   Media server indirection: Redirection of a media client to a different media server.

   (Media) stream: A single media instance, e.g., an audio stream or a video stream as well as a single whiteboard or shared application group. When using RTP, a stream consists of all RTP and RTCP packets created by a source within an RTP session.

   Message: The basic unit of RTSP communication, consisting of a structured sequence of octets matching the syntax defined in Section 20 and transmitted over a connection or a connectionless transport.

   Non-Aggregated Control: Control of a single media stream. This is only possible in RTSP sessions with a single media stream.

   Participant: Member of a conference. A participant may be a machine, e.g., a playback server.

   Presentation: A set of one or more streams presented to the client as a complete media feed and described by a presentation description as defined below. Presentations with more than one media stream are often handled in RTSP under aggregate control.

   Presentation description: A presentation description contains information about one or more media streams within a presentation, such as the set of encodings, network addresses and information about the content. Other IETF protocols such as SDP ([RFC4566]) use the term "session" for a presentation. The presentation description may take several different formats, including but not limited to the session description protocol format, SDP.

   Response: An RTSP response. If an HTTP response is meant, that is indicated explicitly.

   Request: An RTSP request. If an HTTP request is meant, that is indicated explicitly.

   Request-URI: The URI used in a request to indicate the resource on which the request is to be performed.

   RTSP agent: Refers to either an RTSP client, an RTSP server, or an RTSP proxy. In this specification, there are many capabilities that are common to these three entities, such as the capability to send requests or receive responses. This term will be used when describing functionality that is applicable to all three of these entities.

   RTSP session: A stateful abstraction upon which the main control methods of RTSP operate. An RTSP session is a server entity; it is created, maintained and destroyed by the server.
It is established by an RTSP server upon the completion of a successful SETUP request (when a 200 OK response is sent) and is labelled with a session identifier at that time. The session exists until timed out by the server or explicitly removed by a TEARDOWN request. An RTSP session is a stateful entity; an RTSP server maintains an explicit session state machine (see Appendix A) where most state transitions are triggered by client requests. The existence of a session implies the existence of state about the session's media streams and their respective transport mechanisms. A given session can have one or more media streams associated with it. An RTSP server uses the session to aggregate control over multiple media streams.

Transport initialization: The negotiation of transport information (e.g., port numbers, transport protocols) between the client and the server.

URI: Uniform Resource Identifier, see [RFC3986]. The URIs used in RTSP are generally URLs as they give a location for the resource. As URLs are a subset of URIs, they will be referred to as URIs to cover also the cases when an RTSP URI would not be a URL.

URL: Uniform Resource Locator, a URI which identifies the resource through its primary access mechanism, rather than identifying the resource by name or by some other attribute(s) of that resource.

1.5. Media Properties

When RTSP handles media, it is important to consider the different properties a media instance for playback can have. This specification considers the media properties listed below in its protocol operations. They are derived from the differences between a number of supported usages.

On-demand: Media that has a fixed (given) duration that does not change during the lifetime of the RTSP session and is known at the time of the creation of the session.
It is expected that the content of the media will not change, even if the representation, i.e., encoding, quality, etc., may change. Generally one can seek within the media, i.e., randomly access any range of the media stream for playback.

Dynamic On-demand: This is a variation of the on-demand case where external methods are used to manipulate the actual content of the media setup for the RTSP session. The main example is where a playlist determines the content of the session.

Live: Live media represents a progressing content stream (such as broadcast TV) where the duration may or may not be known. It is not seekable; only the content presently being delivered can be accessed.

Live with Recording: A live stream that is combined with a server-side capability to store and retain the content of the live session for random access playback within the already recorded part of the content. The actual behavior of the media stream depends very much on the retention policy for the media stream. Either the server will be able to capture the complete media stream, or it will have a limitation in how much will be retained. The media range will dynamically change as the session progresses. For servers with a limited amount of storage available for recording, there will be a sliding window that moves forward as data is made available, and content that is older than the limitation will be discarded.

Considering the above usages, one gets the following media properties and their different instance values.

1.5.1. Random Access

Random Access, i.e., whether one can request that the playback point be moved from one point in the media duration to another. The following different values are considered:

Random Access: The media is seekable to any of a large number of points within the media.
Due to media encoding limitations a particular point may not be reachable, but seeking to a point close by is enabled. A floating point number of seconds may be provided to express the worst case distance between random access points.

Return To Start: Seeking is only possible to the beginning of the content.

No seeking: Seeking is not possible at all.

1.5.2. Retention

Media may have different retention policies in place that affect the operation on the media. The following different media retention policies are envisioned and taken into consideration where applicable.

Unlimited: The media will not be removed as long as the RTSP session is in existence.

Time Limited: The media will not be removed before the given wall clock time. After that time it may or may not be available any more.

Duration limited: Each individual unit of the media will be retained for the specified duration.

1.5.3. Content Modifications

There is also the question of how the content may change over time for a given media resource:

Unmutable: The content of the media will not change, even if the representation, i.e., encoding, quality, etc., may change.

Dynamic: Between explicit updates the media content will not change, but the content may change due to external methods or triggers, such as playlists.

Time Progressing: As time progresses, new content will become available. If the content is also retained, it will become longer and longer, as everything between the start point and the point currently being made available can be accessed.

1.5.4. Mapping to the Attributes

This section exemplifies how one would map the above listed usages to the properties and their values.

On-demand: Random Access: Random Access=5s, Content Modifications: Unmutable, Retention: unlimited or time limited.
Dynamic On-demand: Random Access: Random Access=3s, Content Modifications: Dynamic, Retention: unlimited or time limited.

Live: Random Access: No seeking, Content Modifications: Time Progressing, Retention: Duration limited=0.0s

Live with Recording: Random Access: Random Access=3s, Content Modifications: Time Progressing, Retention: Duration limited=2H

2. RTSP Introduction

2.1. Protocol Properties

RTSP has the following properties:

Extendable: New methods and parameters can be easily added to RTSP.

Secure: RTSP re-uses web security mechanisms, either at the transport level (TLS, [RFC4346]) or within the protocol itself. All HTTP authentication mechanisms such as basic ([RFC2616]) and digest authentication ([RFC2617]) are directly applicable.

Transport-independent: RTSP does not preclude the use of an unreliable datagram protocol (UDP) ([RFC0768]), as it would be possible to implement application-level reliability. The use of a connectionless datagram protocol such as UDP requires additional definition that may be provided as extensions to the core RTSP specification. The reliable stream protocol TCP ([RFC0793]) and the secured reliable stream protocol TLS over TCP [RFC4346] are the currently defined transport protocols for RTSP messages.

Media-delivery protocol independent: The operation of RTSP does not depend on the transport mechanism used to carry continuous media. While most real-time media will use RTP as a transport protocol, RTSP does not preclude the use of other protocols such as MPEG-2 [ISO.13818-1.2000]. The use of other protocols requires additional definition that may be provided as extensions to the core RTSP specification.

Multi-server capable: Each media stream within a presentation can reside on a different server. The client automatically establishes several concurrent control sessions with the different media servers.
Media synchronization in those cases is performed at the transport level.

Separation of stream control and conference initiation: Stream control is divorced from inviting a media server to a conference. In particular, SIP [RFC3261] or H.323 [ITU.H323.1996] may be used to invite a server to a conference; however, the exact procedures are unspecified.

Suitable for professional applications: RTSP supports frame-level accuracy through SMPTE time stamps to allow remote digital editing.

Presentation description neutral: The protocol does not impose a particular presentation description or metafile format and can convey the type of format to be used. However, the presentation description is required to contain at least one RTSP URI.

Proxy and firewall friendly: The protocol should be readily handled by both application and transport-layer (SOCKS [RFC1961]) firewalls. A firewall may need to understand the SETUP method to open a "hole" for the media stream.

HTTP-friendly: Where sensible, RTSP reuses HTTP concepts, so that the existing infrastructure can be reused. This infrastructure includes PICS (Platform for Internet Content Selection [W3C.REC-PICS-services] [W3C.REC-PICS-labels]) for associating labels with content. However, RTSP does not just add methods to HTTP since controlling continuous media requires server state in most cases.

Appropriate server control: If a client can start a stream, it needs to be able to stop a stream. Servers should not start streaming to clients in such a way that clients cannot stop the stream.

Transport negotiation: The client can negotiate the transport method prior to actually needing to process a continuous media stream.

2.2. RTSP's Relationship to HTTP

RTSP is intentionally similar in syntax and operation to HTTP/1.1 [RFC2616] so that extension mechanisms to HTTP can in some cases also be applied to RTSP.
However, RTSP differs in a number of important aspects from HTTP:

* RTSP introduces a number of new methods and has a different protocol identifier.

* RTSP has the notion of a session built into the protocol.

* An RTSP server needs to maintain state in almost all cases, as opposed to the stateless nature of HTTP.

* Both an RTSP server and client can issue requests.

* Data is usually carried out-of-band by a different protocol. Session descriptions returned in a DESCRIBE response (see Section 13.2) and interleaving of RTP with RTSP over TCP are exceptions to this rule (see Section 14).

* RTSP is defined to use ISO 10646 (UTF-8) rather than ISO 8859-1, consistent with HTML internationalization efforts [RFC2070].

* The Request-URI always contains the absolute URI. Because of backward compatibility with a historical blunder, HTTP/1.1 [RFC2616] carries only the absolute path in the request and puts the host name in a separate header field. This makes "virtual hosting" easier, where a single host with one IP address hosts several document trees.

2.3. Extending RTSP

Since not all media servers have the same functionality, media servers by necessity will support different sets of requests. For example:

o A server may not be capable of seeking (absolute positioning) if it is to support live events only.

o Some servers may not support setting stream parameters and thus not support GET_PARAMETER and SET_PARAMETER.

o Some servers may support an RTSP extension.

It is up to the creators of presentation descriptions not to ask the impossible of a server. This situation is similar in HTTP/1.1 [RFC2616], where the methods described in [H19.5] are not likely to be supported across all servers.

RTSP can be extended in three ways, listed here in order of the magnitude of changes supported:

o Existing methods can be extended with new parameters, e.g.
headers, as long as these parameters can be safely ignored by the recipient. If the client needs negative acknowledgement when a method extension is not supported, a tag corresponding to the extension may be added in the field of the Require or Proxy-Require headers (see Section 16.35).

o New methods can be added. If the recipient of the message does not understand the request, it MUST respond with error code 501 (Not Implemented) so that the sender can avoid using this method again. A client may also use the OPTIONS method to inquire about methods supported by the server. The server MUST list the methods it supports using the Public response header.

o A new version of the protocol can be defined, allowing almost all aspects (except the position of the protocol version number) to change. A new version of the protocol MUST be registered through an IETF standards track document.

The basic capability discovery mechanism can be used both to discover support for a certain feature and to ensure that a feature is available when performing a request. For a detailed explanation of this, see Section 11.

2.4. Overall Operation

Each presentation and media stream is identified by an RTSP URI. The overall presentation and the properties of the media the presentation is composed of are defined by a presentation description file, the format of which is outside the scope of this specification. The presentation description file may be obtained by the client using HTTP or other means such as email and may not necessarily be stored on the media server.

For the purposes of this specification, a presentation description is assumed to describe one or more presentations, each of which maintains a common time axis. For simplicity of exposition and without loss of generality, it is assumed that the presentation description contains exactly one such presentation.
A presentation may contain several media streams.

The presentation description file contains a description of the media streams making up the presentation, including their encodings, language, and other parameters that enable the client to choose the most appropriate combination of media. In this presentation description, each media stream that is individually controllable by RTSP is identified by an RTSP URI, which points to the media server handling that particular media stream and names the stream stored on that server. Several media streams can be located on different servers; for example, audio and video streams can be split across servers for load sharing. The description also enumerates which transport methods the server is capable of.

Besides the media parameters, the network destination address and port need to be determined. Several modes of operation can be distinguished:

Unicast: The media is transmitted to the source of the RTSP request or the requested destination, with the port number chosen by the client. Alternatively, the media is transmitted on the same reliable stream as RTSP.

Multicast, server chooses address: The media server picks the multicast address and port. This is the typical case for a live or near-media-on-demand transmission.

Multicast, client chooses address: If the server is to participate in an existing multicast conference, the multicast address, port and encryption key are given by the conference description, established by means outside the scope of this specification, for example by a SIP-created conference.

2.5. RTSP States

RTSP controls a stream which may be sent via a separate protocol, independent of the control channel. For example, RTSP control may be transported on a TCP connection while the media data is conveyed via UDP.
Thus, data delivery continues even if no RTSP requests are received by the media server. Also, during its lifetime a single media stream may be controlled by RTSP requests issued sequentially on different TCP connections. Therefore, the server needs to maintain "session state" to be able to correlate RTSP requests with a stream. The state transitions are described in Appendix A.

Many methods in RTSP do not contribute to state. However, the following play a central role in defining the allocation and usage of stream resources on the server: SETUP, PLAY, PAUSE, REDIRECT, and TEARDOWN.

SETUP: Causes the server to allocate resources for a stream and create an RTSP session.

PLAY: Starts data transmission on a stream allocated via SETUP.

PAUSE: Temporarily halts a stream without freeing server resources.

REDIRECT: Indicates that the session should be moved to a new server or location.

TEARDOWN: Frees resources associated with the stream. The RTSP session ceases to exist on the server.

RTSP methods that contribute to state use the Session header field (Section 16.49) to identify the RTSP session whose state is being manipulated. The server generates session identifiers in response to SETUP requests (Section 13.3).

2.6. Relationship with Other Protocols

RTSP has some overlap in functionality with HTTP. It also may interact with HTTP in that the initial contact with streaming content will often be made through a web page. The current protocol specification aims to allow different hand-off points between a web server and the media server implementing RTSP. For example, the presentation description can be retrieved using HTTP or RTSP, which reduces round trips in web-browser-based scenarios, yet also allows for stand-alone RTSP servers and clients which do not rely on HTTP at all.
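As a rough illustration of the session state that Section 2.5 asks the server to keep, the following Python sketch models the state-affecting methods as a transition table. The state names (Init, Ready, Play) and the exact set of allowed transitions are assumptions made for this sketch; the normative state machine is the one in Appendix A. REDIRECT is left out because the text above only says that the session should be moved elsewhere. The names State, TRANSITIONS and next_state are hypothetical.

```python
from enum import Enum


class State(Enum):
    # Assumed state names for this sketch; Appendix A is normative.
    INIT = "Init"
    READY = "Ready"
    PLAY = "Play"


# Methods that, per Section 2.5, affect server-side stream resources
# (REDIRECT omitted in this simplified sketch).
STATE_METHODS = {"SETUP", "PLAY", "PAUSE", "TEARDOWN"}

# (current state, method) -> next state; an assumed, simplified table.
TRANSITIONS = {
    (State.INIT, "SETUP"): State.READY,      # allocate resources, create session
    (State.READY, "SETUP"): State.READY,     # set up a further stream
    (State.READY, "PLAY"): State.PLAY,       # start data transmission
    (State.PLAY, "PAUSE"): State.READY,      # halt without freeing resources
    (State.READY, "TEARDOWN"): State.INIT,   # free resources, session ceases
    (State.PLAY, "TEARDOWN"): State.INIT,
}


def next_state(state: State, method: str) -> State:
    """Return the session state after handling `method` in `state`."""
    if method not in STATE_METHODS:
        # Methods such as OPTIONS or DESCRIBE do not contribute to state.
        return state
    try:
        return TRANSITIONS[(state, method)]
    except KeyError:
        raise ValueError(f"{method} not valid in state {state.value}") from None
```

A server built along these lines would look up the session named in the Session header, apply next_state, and reject the request if a ValueError is raised.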
However, RTSP differs fundamentally from HTTP in that most data delivery takes place out-of-band in a different protocol. HTTP is an asymmetric protocol where the client issues requests and the server responds. In RTSP, both the media client and media server can issue requests. RTSP requests are also stateful; they may set parameters and continue to control a media stream long after the request has been acknowledged.

Re-using HTTP functionality has advantages in at least two areas, namely security and proxies. The requirements are very similar, so having the ability to adopt HTTP work on caches, proxies and authentication is valuable.

RTSP assumes the existence of a presentation description format that can express both static and temporal properties of a presentation containing several media streams. Session Description Protocol (SDP) [RFC4566] is generally the format of choice; however, RTSP is not bound to it. For data delivery, most real-time media will use RTP as a transport protocol. While RTSP works well with RTP, it is not tied to RTP.

3. RTSP Use Cases

This section describes the most important use cases considered for RTSP. They are listed in descending order of importance with regard to ensuring that all necessary functionality is present. This specification fully supports only the first two. Even in these first two cases, there are special cases or exceptions that are not supported without extensions, e.g., the redirection of media to an address other than that of the controlling entity.

3.1. On-demand Playback of Stored Content

An RTSP capable server stores content suitable for being streamed to a client. A client desiring playback of any of the stored content uses RTSP to set up the media transport required to deliver the desired content.
RTSP is then used to initiate, halt and manipulate the actual transmission (playout) of the content. RTSP is also required to provide necessary description and synchronization information for the content.

The above high level description can be broken down into a number of functions that RTSP needs to be capable of.

Presentation Description: Provide initialization information about the presentation (content); for example, which media codecs are needed for the content. Other information that is important includes the number of media streams the presentation contains, the transport protocols used for the media streams, and identifiers for these media streams. This information is required before setup of the content is possible and to determine if the client is even capable of using the content. This information need not be sent using RTSP; other external protocols can be used to transmit the transport presentation descriptions. Two good examples are the use of HTTP [RFC2616] or email to fetch or receive presentation descriptions like SDP [RFC4566].

Setup: Set up some or all of the media streams in a presentation. The setup itself consists of selecting the protocol for media transport and the necessary parameters for the protocol, like addresses and ports.

Control of Transmission: After the necessary media streams have been established, the client can request the server to start transmitting the content. The client must be allowed to start or stop the transmission of the content at arbitrary times. The client must also be able to start the transmission at any point in the timeline of the presentation.

Synchronization: For media transport protocols like RTP [RFC3550] it might be beneficial to carry synchronization information within RTSP.
This may be due to either the lack of inter-media synchronization within the protocol itself, or the potential delay before the synchronization is established (which is the case for RTP when using RTCP).

Termination: Terminate the established contexts.

For this use case there are a number of assumptions about how it works. These are:

On-Demand content: The content is stored at the server and can be accessed at any time during a time period when it is intended to be available.

Independent sessions: A server is capable of serving a number of clients simultaneously, including from the same piece of content at different points in that presentation's timeline.

Unicast Transport: Content for each individual client is transmitted to them using unicast traffic.

It is also possible to redirect the media traffic to a different destination than that of the entity controlling the traffic. However, allowing this without appropriate mechanisms for checking that the destination approves of this allows for distributed denial of service attacks (DDoS).

3.2. Unicast distribution of Live Content

This use case is similar to the above on-demand content case (see Section 3.1); the difference is the nature of the content itself. Live content is continuously distributed as it becomes available from a source; i.e., the main difference from on-demand is that one starts distributing content before the end of it has become available to the server.

In many cases the consumer of live content is only interested in consuming what is actually happening "now"; i.e., very similar to broadcast TV. However, in this case it is assumed that there exists no broadcast or multicast channel to the users, and instead the server functions as a distribution node, sending the same content to multiple receivers, using unicast traffic between server and client. This unicast traffic and the transport parameters are individually negotiated for each receiving client.

Another aspect of live content is that it often has a very limited time of availability, as it is only available for the duration of the event the content covers. An example of such live content could be a music concert which lasts 2 hours and starts at a predetermined time. Thus there is a need to announce when and for how long the live content is available.

In some cases, the server providing live content may be saving some or all of the content to allow clients to pause the stream and resume it from the paused point, or to "rewind" and play continuously from a point earlier than the live point. Hence, this use case does not necessarily exclude playing from other than the live point of the stream, playing with scales other than 1.0, etc.

3.3. On-demand Playback using Multicast

It is possible to use RTSP to request that media be delivered to a multicast group. The entity setting up the session (the controller) will then control when and what media is delivered to the group. This use case has some potential for denial of service attacks by flooding a multicast group. Therefore, a mechanism is needed to indicate that the group actually accepts the traffic from the RTSP server.

An open issue in this use case is how one ensures that all receivers listening to the multicast or broadcast receive the session presentation configuring the receivers. This memo has to rely on an external solution to solve this issue.

3.4. Inviting an RTSP server into a conference

If one has an established conference or group session, it is possible to have an RTSP server distribute media to the whole group. Transmission to the group is simplest when controlled by a single participant or leader of the conference. Shared control might be possible, but would require further investigation and possibly extensions.
This use case assumes that there exists either multicast or a conference focus that redistributes media to all participants.

This use case is intended to be able to handle the following scenario: A conference leader or participant (hereafter called the controller) has some pre-stored content on an RTSP server that he or she wants to share with the group. The controller sets up an RTSP session at the streaming server for this content and retrieves the session description for the content. The destination for the media content is set to the shared multicast group or conference focus. When desired, the controller can start and stop the transmission of the media to the conference group.

There are several issues with this use case that are not solved by this core specification for RTSP:

Denial of service: To prevent an RTSP server from being an unknowing participant in a denial of service attack, the server needs to be able to verify the destination's acceptance of the media. Such a mechanism to verify the approval of received media does not yet exist; instead, only policies can be used, which can be made to work in controlled environments.

Distributing the presentation description to all participants in the group: To enable a media receiver to correctly decode the content, the media configuration information needs to be distributed reliably to all participants. This will most likely require support from an external protocol.

Passing control of the session: If it is desired to pass control of the RTSP session between the participants, some support will be required by an external protocol to exchange state information and possibly floor control of who is controlling the RTSP session.

If there is interest in this use case, further work is required on the necessary extensions.

3.5. Live Content using Multicast

This use case in its simplest form does not require any use of RTSP at all; this is what multicast conferences being announced with SAP [RFC2974] and SDP are intended to handle. However, in use cases where more advanced features like access control to the multicast session are desired, RTSP could be used for session establishment.

A client desiring to join a live multicast media session with cryptographic (encryption) access control could use RTSP in the following way. The source of the session announces the session and gives all interested parties an RTSP URI. The client connects to the server and requests the presentation description, allowing configuration for reception of the media. In this step it is possible for the client to use secured transport and any desired level of authentication; for example, for billing or access control. An RTSP link also allows for load balancing between multiple servers.

If these were the only goals, they could be achieved by simply using HTTP. However, for cases where the sender likes to keep track of each individual receiver of a session, and possibly use the session as a side channel for distributing key-updates or other information on a per-receiver basis, and the full set of receivers is not known prior to the session start, the state establishment that RTSP provides can be beneficial. In this case a client would establish an RTSP session for this multicast group with the RTSP server. The RTSP server will not transmit any media, but instead will point to the multicast group. The client and server will be able to keep the session alive for as long as the receiver participates in the session, thus enabling, for example, the server to push updates to the client.

This use case will most likely not be able to be implemented without some extensions to the server-to-client push mechanism.
Here the PLAY_NOTIFY method (see Section 13.5) with a suitable extension could provide clear benefits.

4. Protocol Parameters

4.1. RTSP Version

HTTP specification section [H3.1] applies, with "HTTP" replaced by "RTSP". This specification defines version 2.0 of RTSP.

4.2. RTSP IRI and URI

RTSP 2.0 defines and registers three URI schemes: "rtsp", "rtsps" and "rtspu". The usage of the last, "rtspu", is unspecified in RTSP 2.0, and is defined here to register and reserve the URI scheme that is defined in RTSP 1.0. The "rtspu" scheme indicates undefined transport of the RTSP messages over unreliable transport (UDP). The syntax of "rtsp" and "rtsps" URIs has been changed from RTSP 1.0.

This specification also defines the format of the RTSP IRI [RFC3987] that can be used as RTSP resource identifiers and locators, in web pages, user interfaces, on paper, etc. However, the RTSP request message format only allows usage of the absolute URI format. The RTSP IRI format SHALL use the rules and transformation for IRIs defined in [RFC3987]. This way RTSP 2.0 URIs for requests can be produced from an RTSP IRI.

The RTSP IRI and URI are both syntax restricted compared to the generic syntax defined in [RFC3986] and [RFC3987]:

o An absolute URI requires the authority part; i.e., a host identity must be provided.

o Parameters in the path element are prefixed with the reserved separator ";".

The RTSP URI and IRI are case sensitive, with the exception of those parts that [RFC3986] and [RFC3987] define as case-insensitive; for example, the scheme and host part.

The fragment identifier is used as defined in sections 3.5 and 4.3 of [RFC3986], i.e., the fragment is to be stripped from the URI by the requestor and not included in the request.
The user agent also needs to interpret the value of the fragment based on the media type the request relates to; i.e., the media type indicated in the Content-Type header in the response to DESCRIBE.

The syntax of any URI query string is unspecified and responder (usually the server) specific. The query is, from the requestor's perspective, an opaque string and needs to be handled as such.

The URI scheme "rtsp" requires that commands are issued via a reliable protocol (within the Internet, TCP), while the scheme "rtsps" identifies a reliable transport using secure transport (TLS [RFC4346], see Section 19).

For the scheme "rtsp", if no port number is provided in the authority part of the URI, port number 554 SHALL be used. For the scheme "rtsps", the TCP port 322 is registered and SHALL be assumed.

A presentation or a stream is identified by a textual media identifier, using the character set and escape conventions of URIs [RFC3986]. URIs may refer to a stream or an aggregate of streams; i.e., a presentation. Accordingly, requests described in Section 13 can apply to either the whole presentation or an individual stream within the presentation. Note that some request methods can only be applied to streams, not presentations, and vice versa.

For example, the RTSP URI:

     rtsp://media.example.com:554/twister/audiotrack

may identify the audio stream within the presentation "twister", which can be controlled via RTSP requests issued over a TCP connection to port 554 of host media.example.com.

Also, the RTSP URI:

     rtsp://media.example.com:554/twister

identifies the presentation "twister", which may be composed of audio and video streams, but could also be something else like a random media redirector.

This does not imply a standard way to reference streams in URIs.
The presentation description defines the hierarchical relationships in the presentation and the URIs for the individual streams. A presentation description may name a stream "a.mov" and the whole presentation "b.mov".

The path components of the RTSP URI are opaque to the client and do not imply any particular file system structure for the server.

This decoupling also allows presentation descriptions to be used with non-RTSP media control protocols simply by replacing the scheme in the URI.

4.3. Session Identifiers

Session identifiers are strings of arbitrary length, with a minimum length of 8 characters. A session identifier MUST be chosen in a cryptographically random manner (see [RFC4086]) and MUST be at least 8 characters long (can contain a maximum of 48 bits of entropy) to make guessing it more difficult. It is RECOMMENDED that it contains 128 bits of entropy, i.e. approximately 22 characters from a high quality generator (see Section 21). However, it needs to be noted that the session identifier does not provide any security against session hijacking unless it is kept confidential between client, server and trusted proxies.

4.4. SMPTE Relative Timestamps

A SMPTE relative timestamp expresses time relative to the start of the clip. Relative timestamps are expressed as SMPTE time codes for frame-level access accuracy. The time code has the format

   hours:minutes:seconds:frames.subframes,

with the origin at the start of the clip. The default smpte format is the "SMPTE 30 drop" format, with a frame rate of 29.97 frames per second. Other SMPTE codes MAY be supported (such as "SMPTE 25") through the use of an alternative "smpte-type". For SMPTE 30, the "frames" field in the time value can assume the values 0 through 29.
The difference between 30 and 29.97 frames per second is handled by dropping the first two frame indices (values 00 and 01) of every minute, except every tenth minute. If the frame and the subframe values are zero, they may be omitted. Subframes are measured in one-hundredths of a frame.

Examples:

   smpte=10:12:33:20-
   smpte=10:07:33-
   smpte=10:07:00-10:07:33:05.01
   smpte-25=10:07:00-10:07:33:05.01

4.5. Normal Play Time

Normal play time (NPT) indicates the stream-absolute position relative to the beginning of the presentation, not to be confused with the Network Time Protocol (NTP) [RFC1305]. The timestamp consists of a decimal fraction. The part left of the decimal may be expressed in either seconds or hours, minutes, and seconds. The part right of the decimal point measures fractions of a second.

The beginning of a presentation corresponds to 0.0 seconds. Negative values are not defined.

The special constant "now" is defined as the current instant of a live event. It MAY only be used for live events, and SHALL NOT be used for on-demand (i.e., non-live) content.

NPT is defined as in DSM-CC [ISO.13818-6.1995]: "Intuitively, NPT is the clock the viewer associates with a program. It is often digitally displayed on a VCR. NPT advances normally when in normal play mode (scale = 1), advances at a faster rate when in fast scan forward (high positive scale ratio), decrements when in scan reverse (high negative scale ratio) and is fixed in pause mode. NPT is (logically) equivalent to SMPTE time codes."

Examples:

   npt=123.45-125
   npt=12:05:35.3-
   npt=now-

The syntax conforms to ISO 8601 [ISO.8601.2000]. The npt-sec notation is optimized for automatic generation, the npt-hhmmss notation for consumption by human readers. The "now" constant allows clients to request to receive the live feed rather than the stored or time-delayed version.
This is needed since neither absolute time nor zero time is appropriate for this case.

4.6. Absolute Time

Absolute time is expressed as ISO 8601 [ISO.8601.2000] timestamps, using UTC (GMT). Fractions of a second may be indicated.

Example for November 8, 1996 at 14h37 and 20 and a quarter seconds UTC:

   19961108T143720.25Z

4.7. Feature-tags

Feature-tags are unique identifiers used to designate features in RTSP. These tags are used in Require (Section 16.42), Proxy-Require (Section 16.35), Proxy-Supported (Section 16.36), and Unsupported (Section 16.52) header fields.

A feature-tag definition MUST indicate which combination of clients, servers or proxies it applies to.

The creator of a new RTSP feature-tag should either prefix the feature-tag with a reverse domain name (e.g., "com.example.mynewfeature" is an apt name for a feature whose inventor can be reached at "example.com"), or register the new feature-tag with the Internet Assigned Numbers Authority (IANA) (see Section 22).

The usage of feature-tags is further described in Section 11, which deals with capability handling.

4.8. Entity Tags

Entity tags are opaque strings that are used to compare two entities from the same resource, for example in caches or to optimize setup after a redirect. Further explanation is present in [H3.11]. For an explanation of how to compare entity tags see [H13.3]. Entity tags can be carried in the ETag header (see Section 16.21) or in SDP (see Appendix D.1.9).

Entity tags are used in RTSP to make some methods conditional. The methods are made conditional through the inclusion of headers, see Section 16.24 and Section 16.26. Note that RTSP entity tags apply to the complete presentation; i.e., both the session description and the individual media streams.
Thus entity tags can be used to verify at setup time after a redirect that the same session description applies to the media at the new location using the If-Match header.

5. RTSP Message

RTSP is a text-based protocol and uses the ISO 10646 character set in UTF-8 encoding (RFC 3629 [RFC3629]). Lines SHALL be terminated by CRLF.

Text-based protocols make it easier to add optional parameters in a self-describing manner. Since the number of parameters and the frequency of commands are low, processing efficiency is not a concern. Text-based protocols, if done carefully, also allow easy implementation of research prototypes in scripting languages such as Tcl, Visual Basic and Perl.

The ISO 10646 character set avoids tricky character set switching, but is invisible to the application as long as US-ASCII is being used. This is also the encoding used for RTCP [RFC3550]. ISO 8859-1 translates directly into Unicode with a high-order octet of zero. ISO 8859-1 characters with the most-significant bit set are represented as 1100001x 10xxxxxx. (See RFC 3629 [RFC3629].)

Requests contain methods, the object the method is operating upon and parameters to further describe the method. Methods are idempotent unless otherwise noted. Methods are also designed to require little or no state maintenance at the media server.

5.1. Message Types

RTSP messages consist of requests from client to server or server to client and responses in the reverse direction. Request (Section 7) and Response (Section 8) messages use the generic message format of RFC 822 [9] for transferring entities (the payload of the message). Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.
   generic-message = start-line
                     *(message-header CRLF)
                     CRLF
                     [ message-body ]
   start-line      = Request-Line | Status-Line

In the interest of robustness, servers SHOULD ignore any empty line(s) received where a Request-Line is expected. In other words, if the server is reading the protocol stream at the beginning of a message and receives a CRLF first, it should ignore the CRLF.

5.2. Message Headers

See [H4.2].

5.3. Message Body

See [H4.3]. Unlike HTTP, the presence of a message-body in either a request or a response MUST be signaled by the inclusion of a Content-Length header field (see Section 16.16).

5.4. Message Length

When a message body is included with a message, the length of that body is determined by one of the following (in order of precedence):

1.  Any response message which MUST NOT include a message body (such as the 1xx, 204, and 304 responses) is always terminated by the first empty line after the header fields, regardless of the entity-header fields present in the message. (Note: An empty line is a line with nothing preceding the CRLF.)

2.  If a Content-Length header field (Section 16.16) is present, its value in bytes represents the length of the message-body. If this header field is not present, a value of zero is assumed.

Unlike an HTTP message, an RTSP message MUST contain a Content-Length header field whenever it contains a message body. Note that RTSP does not support the HTTP/1.1 "chunked" transfer coding (see [H3.6.1]).

Given the moderate length of presentation descriptions returned, the server should always be able to determine its length, even if it is generated dynamically, making the chunked transfer encoding unnecessary.

6. General Header Fields

See [H4.5], except that the Pragma, Trailer, Transfer-Encoding, Upgrade, and Warning headers are not defined. RTSP further defines the CSeq, Pipelined-Requests, Proxy-Supported and Timestamp headers. The general headers are listed in Table 1:

   +--------------------+--------------------+
   | Header Name        | Defined in Section |
   +--------------------+--------------------+
   | Cache-Control      | Section 16.10      |
   | Connection         | Section 16.11      |
   | CSeq               | Section 16.19      |
   | Date               | Section 16.20      |
   | Media-Properties   | Section 16.29      |
   | Media-Range        | Section 16.30      |
   | Pipelined-Requests | Section 16.32      |
   | Proxy-Supported    | Section 16.36      |
   | Seek-Style         | Section 16.45      |
   | Supported          | Section 16.49      |
   | Timestamp          | Section 16.50      |
   | Via                | Section 16.55      |
   +--------------------+--------------------+

   Table 1: The general headers used in RTSP

7. Request

A request message uses the format outlined below regardless of the direction of a request, client to server or server to client:

o  Request line, containing the method to be applied to the resource, the identifier of the resource, and the protocol version in use;

o  Zero or more Header lines, that can be of the following types: general (Section 6), request (Section 7.2), or entity (Section 9.1);

o  One empty line (CRLF) to indicate the end of the header section;

o  Optionally a message body (entity), consisting of one or more lines. The length of the message body in bytes is indicated by the Content-Length entity header.

7.1. Request Line

The request line provides the key information about the request: what method, on what resources and using which RTSP version. The methods that are defined by this specification are listed in Table 2.
   +---------------+--------------------+
   | Method        | Defined in Section |
   +---------------+--------------------+
   | DESCRIBE      | Section 13.2       |
   | GET_PARAMETER | Section 13.8       |
   | OPTIONS       | Section 13.1       |
   | PAUSE         | Section 13.6       |
   | PLAY          | Section 13.4       |
   | PLAY_NOTIFY   | Section 13.5       |
   | REDIRECT      | Section 13.10      |
   | SETUP         | Section 13.3       |
   | SET_PARAMETER | Section 13.9       |
   | TEARDOWN      | Section 13.7       |
   +---------------+--------------------+

   Table 2: The RTSP Methods

The syntax of the RTSP request line is the following:

   Method SP Request-URI SP RTSP-Version CRLF

Note: This syntax cannot be freely changed in future versions of RTSP. This line needs to remain parsable by older RTSP implementations since it indicates the RTSP version of the message.

In contrast to HTTP/1.1 [RFC2616], RTSP requests identify the resource through an absolute RTSP URI (scheme, host, and port) (see Section 4.2) rather than just the absolute path.

HTTP/1.1 requires servers to understand the absolute URI, but clients are supposed to use the Host request header. This is purely needed for backward-compatibility with HTTP/1.0 servers, a consideration that does not apply to RTSP.

An asterisk "*" can be used instead of an absolute URI in the Request-URI part to indicate that the request does not apply to a particular resource, but to the server or proxy itself, and is only allowed when the request method does not necessarily apply to a resource.

For example:

   OPTIONS * RTSP/2.0

An OPTIONS in this form will determine the capabilities of the server or the proxy that first receives the request.
If the capability of the specific server needs to be determined, without regard to the capability of an intervening proxy, the server should be addressed explicitly with an absolute URI that contains the server's address.

For example:

   OPTIONS rtsp://example.com RTSP/2.0

7.2. Request Header Fields

The RTSP headers in Table 3 can be included in a request, as request headers, to modify the specifics of the request. Some of these headers may also be used in the response to a request, as response headers, to modify the specifics of a response (Section 8.2).

   +--------------------+--------------------+
   | Header             | Defined in Section |
   +--------------------+--------------------+
   | Accept             | Section 16.1       |
   | Accept-Credentials | Section 16.2       |
   | Accept-Encoding    | Section 16.3       |
   | Accept-Language    | Section 16.4       |
   | Authorization      | Section 16.7       |
   | Bandwidth          | Section 16.8       |
   | Blocksize          | Section 16.9       |
   | From               | Section 16.23      |
   | If-Match           | Section 16.24      |
   | If-Modified-Since  | Section 16.25      |
   | If-None-Match      | Section 16.26      |
   | Notify-Reason      | Section 16.31      |
   | Proxy-Require      | Section 16.35      |
   | Range              | Section 16.38      |
   | Referer            | Section 16.39      |
   | Request-Status     | Section 16.41      |
   | Require            | Section 16.42      |
   | Scale              | Section 16.44      |
   | Session            | Section 16.48      |
   | Speed              | Section 16.46      |
   | Supported          | Section 16.49      |
   | Transport          | Section 16.51      |
   | User-Agent         | Section 16.53      |
   +--------------------+--------------------+

   Table 3: The RTSP request headers

Detailed header definitions are provided in Section 16.

New request headers may be defined.
If the receiver of the request is required to understand the request header, the request MUST include a corresponding feature tag in a Require or Proxy-Require header to ensure that the processing of the header actually happens.

8. Response

[H6] applies except that HTTP-Version is replaced by RTSP-Version. Also, RTSP defines additional status codes and does not define some of the HTTP codes. The valid response codes and the methods they can be used with are listed in Table 4.

After receiving and interpreting a request message, the recipient responds with an RTSP response message.

8.1. Status-Line

The first line of a Response message is the Status-Line, consisting of the protocol version followed by a numeric status code and the textual phrase associated with the status code, with each element separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.

   RTSP-Version SP Status-Code SP Reason-Phrase CRLF

8.1.1. Status Code and Reason Phrase

The Status-Code element is a 3-digit integer result code of the attempt to understand and satisfy the request. These codes are fully defined in Section 15. The Reason-Phrase is intended to give a short textual description of the Status-Code. The Status-Code is intended for use by automata and the Reason-Phrase is intended for the human user. The client is not required to examine or display the Reason-Phrase.

The first digit of the Status-Code defines the class of response. The last two digits do not have any categorization role. There are 5 values for the first digit:

   1xx: Informational - Request received, continuing process

   2xx: Success - The action was successfully received, understood, and accepted

   3xx: Redirection - Further action needs to be taken in order to complete the request

   4xx: Client Error - The request contains bad syntax or cannot be fulfilled
   5xx: Server Error - The server failed to fulfill an apparently valid request

The individual values of the numeric status codes defined for RTSP/2.0, and an example set of corresponding Reason-Phrases, are presented in Table 4. The reason phrases listed here are only recommended; they may be replaced by local equivalents without affecting the protocol. Note that RTSP adopts most HTTP/1.1 [RFC2616] status codes and adds RTSP-specific status codes starting at x50 to avoid conflicts with newly defined HTTP status codes.

RTSP status codes are extensible. RTSP applications are not required to understand the meaning of all registered status codes, though such understanding is obviously desirable. However, applications MUST understand the class of any status code, as indicated by the first digit, and treat any unrecognized response as being equivalent to the x00 status code of that class, with the exception that an unrecognized response MUST NOT be cached. For example, if an unrecognized status code of 431 is received by the client, it can safely assume that there was something wrong with its request and treat the response as if it had received a 400 status code. In such cases, user agents SHOULD present to the user the entity returned with the response, since that entity is likely to include human-readable information which will explain the unusual status.

   +------+----------------------------------------+-----------------+
   | Code | Reason                                 | Method          |
   +------+----------------------------------------+-----------------+
   | 100  | Continue                               | all             |
   | 200  | OK                                     | all             |
   | 300  | Multiple Choices                       | all             |
   | 301  | Moved Permanently                      | all             |
   | 302  | Found                                  | all             |
   | 303  | See Other                              | all             |
   | 305  | Use Proxy                              | all             |
   | 400  | Bad Request                            | all             |
   | 401  | Unauthorized                           | all             |
   | 402  | Payment Required                       | all             |
   | 403  | Forbidden                              | all             |
   | 404  | Not Found                              | all             |
   | 405  | Method Not Allowed                     | all             |
   | 406  | Not Acceptable                         | all             |
   | 407  | Proxy Authentication Required          | all             |
   | 408  | Request Timeout                        | all             |
   | 410  | Gone                                   | all             |
   | 411  | Length Required                        | all             |
   | 412  | Precondition Failed                    | DESCRIBE, SETUP |
   | 413  | Request Entity Too Large               | all             |
   | 414  | Request-URI Too Long                   | all             |
   | 415  | Unsupported Media Type                 | all             |
   | 451  | Parameter Not Understood               | SET_PARAMETER   |
   | 452  | reserved                               | n/a             |
   | 453  | Not Enough Bandwidth                   | SETUP           |
   | 454  | Session Not Found                      | all             |
   | 455  | Method Not Valid In This State         | all             |
   | 456  | Header Field Not Valid                 | all             |
   | 457  | Invalid Range                          | PLAY, PAUSE     |
   | 458  | Parameter Is Read-Only                 | SET_PARAMETER   |
   | 459  | Aggregate Operation Not Allowed        | all             |
   | 460  | Only Aggregate Operation Allowed       | all             |
   | 461  | Unsupported Transport                  | all             |
   | 462  | Destination Unreachable                | all             |
   | 463  | Destination Prohibited                 | SETUP           |
   | 464  | Data Transport Not Ready Yet           | PLAY            |
   | 465  | Notification Reason Unknown            | PLAY_NOTIFY     |
   | 470  | Connection Authorization Required      | all             |
   | 471  | Connection Credentials not accepted    | all             |
   | 472  | Failure to establish secure connection | all             |
   | 500  | Internal Server Error                  | all             |
   | 501  | Not Implemented                        | all             |
   | 502  | Bad Gateway                            | all             |
   | 503  | Service Unavailable                    | all             |
   | 504  | Gateway Timeout                        | all             |
   | 505  | RTSP Version Not Supported             | all             |
   | 551  | Option not supported                   | all             |
   +------+----------------------------------------+-----------------+

   Table 4: Status codes and their usage with RTSP methods

8.2. Response Header Fields

The response-header fields allow the request recipient to pass additional information about the response which cannot be placed in the Status-Line. These header fields give information about the server and about further access to the resource identified by the Request-URI. All headers currently classified as response headers are listed in Table 5.
   +------------------------+--------------------+
   | Header                 | Defined in Section |
   +------------------------+--------------------+
   | Accept-Credentials     | Section 16.2       |
   | Accept-Ranges          | Section 16.5       |
   | Connection-Credentials | Section 16.12      |
   | ETag                   | Section 16.21      |
   | Location               | Section 16.28      |
   | Proxy-Authenticate     | Section 16.33      |
   | Public                 | Section 16.37      |
   | Range                  | Section 16.38      |
   | Retry-After            | Section 16.40      |
   | RTP-Info               | Section 16.43      |
   | Scale                  | Section 16.44      |
   | Session                | Section 16.48      |
   | Server                 | Section 16.47      |
   | Speed                  | Section 16.46      |
   | Transport              | Section 16.51      |
   | Unsupported            | Section 16.52      |
   | Vary                   | Section 16.54      |
   | WWW-Authenticate       | Section 16.56      |
   +------------------------+--------------------+

   Table 5: The RTSP response headers

Response-header field names can be extended reliably only in combination with a change in the protocol version. However, the usage of feature-tags in the request allows the responding party to learn the capability of the receiver of the response. New or experimental header fields MAY be given the semantics of response-header fields if all parties in the communication recognize them to be response-header fields. Unrecognized header fields in responses are treated as entity-header fields.

9. Entity

Request and Response messages MAY transfer an entity if not otherwise restricted by the request method or response status code. An entity consists of entity-header fields and an entity-body, although some responses will only include the entity-headers.
The SET_PARAMETER and GET_PARAMETER request and response, and DESCRIBE response MAY have an entity. All 4xx and 5xx responses MAY also have an entity.

In this section, both sender and recipient refer to either the client or the server, depending on who sends and who receives the entity.

9.1. Entity Header Fields

Entity-header fields define meta-information about the entity-body or, if no body is present, about the resource identified by the request. The entity header fields are listed in Table 6.

   +------------------+--------------------+
   | Header           | Defined in Section |
   +------------------+--------------------+
   | Allow            | Section 16.6       |
   | Content-Base     | Section 16.13      |
   | Content-Encoding | Section 16.14      |
   | Content-Language | Section 16.15      |
   | Content-Length   | Section 16.16      |
   | Content-Location | Section 16.17      |
   | Content-Type     | Section 16.18      |
   | Expires          | Section 16.22      |
   | Last-Modified    | Section 16.27      |
   +------------------+--------------------+

   Table 6: The RTSP entity headers

The extension-header mechanism allows additional entity-header fields to be defined without changing the protocol, but these fields cannot be assumed to be recognizable by the recipient. Unrecognized header fields SHOULD be ignored by the recipient and forwarded by proxies.

9.2. Entity Body

See [H7.2] with the addition that an RTSP message with an entity body MUST include the Content-Type and Content-Length headers.

10. Connections

RTSP requests can be transmitted using the two different connection scenarios listed below:

o  persistent - a transport connection is used for several request/response transactions;

o  transient - a transport connection is used for a single request/response transaction.
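
The framing rules of Sections 5.3, 5.4 and 9.2 apply identically on persistent and transient connections, and can be illustrated with a minimal Python sketch. It is not a normative implementation: the function names build_message and split_message are hypothetical, the "text/parameters" media type and the parameter body are example values only, and header folding, duplicate headers and bodiless responses such as 1xx are deliberately ignored.

```python
CRLF = "\r\n"


def build_message(start_line, headers, body=b"", content_type=None):
    """Serialize a start-line, header fields and an optional entity body."""
    fields = dict(headers)
    if body:
        # A message carrying a body must state its type and its length.
        if content_type is None:
            raise ValueError("a message with a body needs a Content-Type")
        fields["Content-Type"] = content_type
        fields["Content-Length"] = str(len(body))
    head = start_line + CRLF
    head += "".join(f"{name}: {value}{CRLF}" for name, value in fields.items())
    # The empty line marks the end of the header section.
    return (head + CRLF).encode("utf-8") + body


def split_message(data):
    """Return (start_line, headers, body, rest) for the first message."""
    # Empty lines received before a start-line are skipped for robustness.
    data = data.lstrip(b"\r\n")
    head, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete header section")
    lines = head.decode("utf-8").split(CRLF)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # An absent Content-Length is treated as a length of zero.
    length = int(headers.get("content-length", "0"))
    if len(rest) < length:
        raise ValueError("incomplete body")
    return lines[0], headers, rest[:length], rest[length:]
```

Because the body length comes only from Content-Length, several messages can follow each other on one persistent connection and still be separated unambiguously; the value returned as rest is simply fed back into split_message.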
RFC 2326 attempted to specify an optional mechanism for transmitting RTSP messages in connectionless mode over a transport protocol such as UDP. However, it was not specified in sufficient detail to allow for interoperable implementations. In an attempt to reduce complexity and scope, and due to lack of interest, RTSP 2.0 does not attempt to define a mechanism for supporting RTSP over UDP or other connectionless transport protocols. A side-effect of this is that RTSP requests SHALL NOT be sent to multicast groups since no connection can be established with a specific receiver in multicast environments.

Certain RTSP headers, such as the CSeq header (Section 16.19), which may appear to be relevant only to connectionless transport scenarios are still retained and must be implemented according to the specification. In the case of CSeq, it is quite useful for matching responses to requests if the requests are pipelined (see Section 12). It is also useful in proxies for keeping track of the different requests when aggregating several client requests on a single TCP connection.

10.1. Reliability and Acknowledgements

When RTSP messages are transmitted using reliable transport protocols, they MUST NOT be retransmitted at the RTSP protocol level. Instead, the implementation must rely on the underlying transport to provide reliability. The RTSP implementation may use any indication of reception acknowledgement of the message from the underlying transport protocols to optimize the RTSP behavior.

If both the underlying reliable transport such as TCP and the RTSP application retransmit requests, each packet loss or message loss may result in two retransmissions. The receiver typically cannot take advantage of the application-layer retransmission since the transport stack will not deliver the application-layer retransmission before the first attempt has reached the receiver.
If the packet loss is caused by congestion, multiple retransmissions at different layers will exacerbate the congestion.

Lack of acknowledgement of an RTSP request should be handled within the constraints of the connection timeout considerations described below (Section 10.4).

10.2. Using Connections

A TCP transport can be used for both persistent connections (for several message exchanges) and transient connections (for a single message exchange). Implementations of this specification MUST support RTSP over TCP. The scheme of the RTSP URI (Section 4.2) indicates the default port that the server will listen on.

A server MUST handle both persistent and transient connections.

Transient connections facilitate mechanisms for fault tolerance. They also allow for application layer mobility. A server and client pair that support transient connections can survive the loss of a TCP connection; e.g., due to a NAT timeout. When the client has discovered that the TCP connection has been lost, it can set up a new one when there is need to communicate again.

A persistent connection MAY be used for all transactions between the server and client, including messages for multiple RTSP sessions. However, a persistent connection MAY also be closed after a few message exchanges. For example, a client may use a persistent connection for the initial SETUP and PLAY message exchanges in a session and then close the connection. Later, when the client wishes to send a new request, such as a PAUSE for the session, a new connection would be opened. This connection may either be transient or persistent.

An RTSP agent SHOULD NOT have more than one connection to the server at any given point. If a client or proxy handles multiple RTSP sessions on the same server, it SHOULD use only one connection for managing those sessions.

This saves connection resources on the server.
It also reduces complexity by enabling the server to maintain less state about its sessions and connections.

Unlike HTTP, RTSP allows a server to send requests to a client. However, this can be supported only if a client establishes a persistent connection with the server. In cases where a persistent connection does not exist between a server and its client, due to the lack of a signalling channel the server may be forced to drop an RTSP session without notifying the client. An example of such a case is when the server desires to send a REDIRECT request for an RTSP session to the client but is not able to do so because it cannot reach the client.

Without a persistent connection between the client and the server, the media server has no reliable way of reaching the client. Also, this is the only way that requests from a server to its client are likely to traverse firewalls.

In light of the above, it is RECOMMENDED that clients use persistent connections whenever possible. A client that supports persistent connections MAY "pipeline" its requests (see Section 12).

10.3. Closing Connections

The client MAY close a connection at any point when no outstanding request/response transactions exist for any RTSP session being managed through the connection. The server, however, SHOULD NOT close a connection until all RTSP sessions being managed through the connection have been timed out (Section 16.48). A server SHOULD NOT close a connection immediately after responding to a session-level TEARDOWN request for the last RTSP session being controlled through the connection. Instead, it should wait for a reasonable amount of time for the client to receive the TEARDOWN response, take appropriate action, and initiate the connection closing. The server SHOULD wait at least 10 seconds after sending the TEARDOWN response before closing the connection.
This is to ensure that the client has time to issue a SETUP for a new session on the existing connection after having torn the last one down. 10 seconds should give the client ample opportunity to get its message to the server.

A server SHOULD NOT close the connection directly as a result of responding to a request with an error code.

Certain error responses such as "460 Only Aggregate Operation Allowed" (Section 15.4.12) are used for negotiating capabilities of a server with respect to content or other factors. In such cases, it is inefficient for the server to close a connection on an error response. Also, such behavior would prevent implementation of advanced/special types of requests or result in extra overhead for the client when testing for new features. On the flip side, keeping connections open after sending an error response poses a Denial of Service security risk (Section 21).

If a server closes a connection while the client is attempting to send a new request, the client will have to close its current connection, establish a new connection and send its request over the new connection.

An RTSP message should not be terminated by closing the connection. Such a message MAY be considered to be incomplete by the receiver and discarded. An RTSP message is properly terminated as defined in Section 5.

10.4. Timing Out Connections and RTSP Messages

Receivers of a request (responder) SHOULD respond to requests in a timely manner even when a reliable transport such as TCP is used. Similarly, the sender of a request (requestor) SHOULD wait for a sufficient time for a response before concluding that the responder will not be acting upon its request.

A responder SHOULD respond to all requests within 5 seconds.
If the responder recognizes that processing of a request will take longer than 5 seconds, it SHOULD send a 100 (Continue) response as soon as possible. It SHOULD continue sending a 100 response every 5 seconds thereafter until it is ready to send the final response to the requestor. After sending a 100 response, the responder MUST send a final response indicating the success or failure of the request.

A requestor SHOULD wait at least 10 seconds for a response before concluding that the responder will not be responding to its request. After receiving a 100 response, the requestor SHOULD continue waiting for further responses. If more than 10 seconds elapse without receiving any response, the requestor MAY assume that the responder is unresponsive and abort the connection.

A requestor SHOULD wait longer than 10 seconds for a response if it is experiencing significant transport delays on its connection to the responder. The requestor is capable of determining the RTT of the request/response cycle using the Timestamp header (Section 16.50) in any RTSP request.

10.5.  Showing Liveness

The mechanisms for showing liveness of the client are: any RTSP request with a Session header; an RTCP message, if RTP and RTCP are used; or any other media protocol in use that is capable of indicating liveness of the RTSP client. It is RECOMMENDED that a client not wait until the last second of the timeout before trying to send a liveness message. The RTSP message may be lost or, when using reliable protocols such as TCP, the message may take some time to arrive safely at the receiver. To show liveness between RTSP requests issued to accomplish other things, the following mechanisms can be used, in descending order of preference:

   RTCP:  If RTP is used for media transport, RTCP SHOULD be used. If RTCP is used to report transport statistics, it SHALL also work as keep-alive.
      The server can identify the client by the network address and port used, together with the fact that the client is reporting on the server's SSRC(s). A downside of using RTCP is that it only gives statistical guarantees of reaching the server. However, the probability of failure is so low that it can be ignored in most cases. For example, consider a session with a 60-second timeout and enough bitrate assigned to RTCP messages to send a message from client to server on average every 5 seconds. On a network with 5 % packet loss, that client has a probability of 2.4*E-16 of failing to show liveness in that session within the timeout interval. Sessions with shorter timeouts, much higher packet loss, or small RTCP bandwidths SHOULD also use any of the mechanisms below.

   SET_PARAMETER:  When using SET_PARAMETER for keep-alive, a body SHOULD NOT be included. This method is the RECOMMENDED RTSP method to use in a request only intended to perform keep-alive.

   OPTIONS:  This method also works. However, it causes the server to perform more unnecessary processing and results in bigger responses than necessary for the task. The reason for this is that the server needs to determine what capabilities are associated with the media resource to correctly populate the Public and Allow headers.

The timeout parameter MAY be included in a SETUP response, and SHALL NOT be included in requests. The server uses it to indicate to the client how long the server is prepared to wait between RTSP commands or other signs of life before closing the session due to lack of activity (see below and Appendix B). The timeout is measured in seconds, with a default of 60 seconds. The length of the session timeout SHALL NOT be changed in an established session.

10.6.  Use of IPv6

Explicit IPv6 support was not present in RTSP 1.0 (RFC 2326). RTSP 2.0 has been updated for explicit IPv6 support. Implementations of RTSP 2.0 MUST understand literal IPv6 addresses in URIs and headers.
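
The RTCP liveness estimate given in Section 10.5 can be reproduced with a short calculation. The sketch below assumes that each RTCP report is lost independently and that a whole number of evenly spaced reports falls within the timeout window; both are simplifying assumptions, and the function name is illustrative rather than part of RTSP.

```python
def liveness_failure_probability(timeout_s: float, interval_s: float, loss_rate: float) -> float:
    """Probability that every RTCP report sent within one timeout window is lost.

    Assumes independent packet loss and a fixed reporting interval
    (simplifications; real RTCP reporting intervals are randomized).
    """
    reports_in_window = int(timeout_s // interval_s)
    return loss_rate ** reports_in_window


# Example from Section 10.5: 60 s timeout, one report every 5 s, 5 % loss.
p = liveness_failure_probability(60, 5, 0.05)
print(f"{p:.1e}")  # prints 2.4e-16
```

With a shorter timeout or a higher loss rate the exponent shrinks or the base grows, which is why such sessions are advised to use an additional RTSP-level keep-alive.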
11.  Capability Handling

This section describes the available capability handling mechanism which allows RTSP to be extended. Extensions to this version of the protocol are basically done in two ways. First, new headers can be added. Secondly, new methods can be added. The capability handling mechanism is designed to handle both cases.

When a method is added, the involved parties can use the OPTIONS method to discover whether it is supported. This is done by issuing an OPTIONS request to the other party. Depending on the URI, it will apply to a certain media resource, the whole server in general, or simply the next hop. The OPTIONS response MUST contain a Public header which declares all methods supported for the indicated resource.

It is not necessary to use OPTIONS to discover support of a method; the client could simply try the method. If the receiver of the request does not support the method, it will respond with an error code indicating that the method is either not implemented (501) or does not apply for the resource (405). The choice between the two discovery methods depends on the requirements of the service.

Feature-tags are defined to handle functionality additions that are not new methods. Each feature-tag represents a certain block of functionality. The amount of functionality that a feature-tag represents can vary significantly. A feature-tag can, for example, represent the functionality a single RTSP header provides. Another feature-tag can represent much more functionality, such as the "play.basic" feature-tag which represents the minimal playback implementation.

Feature-tags are used to determine whether the client, server or proxy supports the functionality that is necessary to achieve the desired service.
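
The two ways of discovering method support described above (asking with OPTIONS, or simply trying the method) can be sketched as follows. The helper names and the sample header value are hypothetical; only the Public header and the 501 and 405 status codes come from the text, and the header value is assumed to be a comma-separated list of method names.

```python
def parse_public_header(value: str) -> set[str]:
    """Split a Public header value into the set of advertised method names."""
    return {token.strip() for token in value.split(",") if token.strip()}


def method_supported_from_status(status_code: int) -> bool:
    """Interpret the response to a method that was simply tried.

    501 means the method is not implemented; 405 means it does not
    apply to the resource. Any other status is treated here as
    "the method is understood" (a simplification for illustration).
    """
    return status_code not in (501, 405)


# Discovery via OPTIONS: inspect the Public header of the response.
# The header value below is an illustrative sample, not taken from the draft.
public = parse_public_header("OPTIONS, DESCRIBE, SETUP, PLAY, TEARDOWN")
print("PAUSE" in public)

# Discovery by trying the method and checking the status code.
print(method_supported_from_status(501))
```

Asking first costs an extra round trip but avoids side effects; trying the method directly is cheaper when support is likely.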
To determine support of a feature-tag, several different headers can be used, each explained below:

   Supported:  The Supported header is used to determine the complete set of functionality that both client and server have. The intended usage is to determine, before a functionality needs to be used, that it is supported. It can be used in any method; however, OPTIONS is the most suitable one, as it at the same time determines all methods that are implemented. When sending a request, the requestor declares all its capabilities by including all supported feature-tags. As a result, the receiver learns the requestor's feature support. The receiver then includes its set of features in the response.

   Proxy-Supported:  The Proxy-Supported header is used similarly to the Supported header, but instead of giving the supported functionality of the client or server, it provides both the requestor and the responder a view of what functionality the proxy chain between the two supports. Proxies are required to add this header whenever the Supported header is present, but proxies may also add it independently of the requestor.

   Require:  The Require header can be included in any request where the end-point, i.e. the client or server, is required to understand the feature to correctly perform the request. This can, for example, be a SETUP request where the server is required to understand a certain parameter to be able to set up the media delivery correctly. Ignoring this parameter would not have the desired effect and is not acceptable. Therefore, the end-point receiving a request containing a Require header MUST negatively acknowledge any feature that it does not understand and not perform the request. The response in cases where features are not supported is 551 (Option Not Supported). Also, the features that are not supported are given in the Unsupported header in the response.
   Proxy-Require:  This header has the same purpose and workings as Require, except that it only applies to proxies and not the end-point. Features that need to be supported by both proxies and end-points need to be included in both the Require and Proxy-Require headers.

   Unsupported:  This header is used in a 551 error response to indicate which feature(s) were not supported. Such a response is only the result of the usage of the Require and/or Proxy-Require header where one or more features were not supported. This information allows the requestor to make the best of situations, as it knows which features are not supported.

12.  Pipelining Support

Pipelining is a general method to improve performance of r