INTERNET-DRAFT                                                      UCLA
draft-ietf-dccp-spec-06.txt                                 Mark Handley
Expires: August 2004                                                 UCL
                                                             Sally Floyd
                                                                    ICIR
                                                        16 February 2004


              Datagram Congestion Control Protocol (DCCP)


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of [RFC 2026]. Internet-Drafts are
   working documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups. Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright Notice

   Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

   This document specifies the Datagram Congestion Control Protocol
   (DCCP), which implements a congestion-controlled, unreliable flow of
   unicast datagrams suitable for use by applications such as streaming
   media, Internet telephony, and on-line games.

Kohler/Handley/Floyd                                            [Page 1]

TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION:

   Changes since draft-ietf-dccp-spec-05.txt:

   * Organization overhaul.

   * Add pseudocode for event processing.

   * Remove # NDP; replace with Ack Count.

   * Remove Identification, Challenge, ID Regime, and Connection Nonce.

   * Data Checksum (formerly Payload Checksum) uses a 32-bit CRC.

   * Switch location of non-negotiable features to clarify
     presentation; now the feature location controls its value.

   * Rename "value type" to "reconciliation rule".

   * Rename "Reset Reason" to "Reset Code".

   * Mobility ID becomes 128 bits long.

   * Add probabilities to Mobility ID discussion.

   * Add SyncAck.

Table of Contents

   1. Introduction
   2. Design Rationale
   3. Conventions and Terminology
      3.1. Numbers and Fields
      3.2. Parts of a Connection
      3.3. Features
      3.4. Round-Trip Times
      3.5. Robustness Principle
   4. Overview
      4.1. Packet Types
      4.2. Sequence Numbers
      4.3. States
      4.4. Congestion Control
      4.5. Features
      4.6. Other Differences from TCP
      4.7. Example Connection
   5. Header Formats
      5.1. Generic Header
      5.2. DCCP-Request Header
      5.3. DCCP-Response Header
      5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Headers
      5.5. DCCP-CloseReq and DCCP-Close Headers
      5.6. DCCP-Reset Header
      5.7. DCCP-Move Header
      5.8. DCCP-Sync and DCCP-SyncAck Headers
      5.9. Options
         5.9.1. Padding Option
         5.9.2. Mandatory Option
   6. Feature Negotiation
      6.1. Change Options
      6.2. Confirm Options
      6.3. Reconciliation Rules
         6.3.1. Server-Priority
         6.3.2. Non-Negotiable
      6.4. Feature Numbers
      6.5. Examples
      6.6. Option Exchange
         6.6.1. Normal Exchange
         6.6.2. Loss and Retransmission
         6.6.3. Reordering
         6.6.4. Preference Changes
         6.6.5. Simultaneous Negotiation
         6.6.6. Unknown Features
         6.6.7. Invalid Options
         6.6.8. Mandatory Feature Negotiation
         6.6.9. Out-of-Band Agreement
         6.6.10. State Diagram
   7. Sequence Numbers
      7.1. Variables
      7.2. Initial Sequence Numbers
      7.3. Quiet Time
      7.4. Acknowledgement Numbers
      7.5. Validity and Synchronization
         7.5.1. Sequence-Validity Rules
         7.5.2. Handling Sequence-Invalid Packets
         7.5.3. Sequence and Acknowledgement Number Windows
         7.5.4. Sequence Window Feature
         7.5.5. Sequence Number Attacks
         7.5.6. Examples
      7.6. Extended Sequence Numbers
         7.6.1. When to Use Extended Sequence Numbers
         7.6.2. Header Processing
         7.6.3. Transitioning to Extended Sequence Numbers
         7.6.4. Sequence Transition Capable Feature
      7.7. NDP Count and Detecting Application Loss
         7.7.1. Usage Notes
         7.7.2. Send NDP Count Feature
   8. Event Processing
      8.1. Connection Establishment
         8.1.1. Client Request
         8.1.2. Service Codes
         8.1.3. Server Response
         8.1.4. Init Cookie Option
         8.1.5. Handshake Completion
      8.2. Data Transfer
      8.3. Termination
         8.3.1. Abnormal Termination
      8.4. DCCP State Diagram
      8.5. Pseudocode
   9. Checksums
      9.1. Header Checksum Field
      9.2. Header Checksum Coverage Field
      9.3. Data Checksum Option
         9.3.1. Check Data Checksum Feature
         9.3.2. Usage Notes
   10. Congestion Control IDs
      10.1. Unspecified Sender-Based Congestion Control
      10.2. TCP-like Congestion Control
      10.3. TFRC Congestion Control
      10.4. CCID-Specific Options, Features, and Reset Codes
   11. Acknowledgements
      11.1. Acks of Acks and Unidirectional Connections
      11.2. Ack Piggybacking
      11.3. Ack Ratio Feature
      11.4. Ack Vector Options
         11.4.1. Ack Vector Consistency
         11.4.2. Ack Vector Coverage
      11.5. Send Ack Vector Feature
      11.6. Slow Receiver Option
      11.7. Data Dropped Option
         11.7.1. Data Dropped and Normal Congestion Response
         11.7.2. Particular Drop Codes
   12. Explicit Congestion Notification
      12.1. ECN Capable Feature
      12.2. ECN Nonces
      12.3. Other Aggression Penalties
   13. Timing Options
      13.1. Timestamp Option
      13.2. Elapsed Time Option
      13.3. Timestamp Echo Option
   14. Multihoming and Mobility
      14.1. Mobility Capable Feature
      14.2. Mobility ID Feature
      14.3. Mobile Host Processing
      14.4. Stationary Host Processing
      14.5. Congestion Control State
      14.6. Security
   15. Maximum Packet Size
   16. Forward Compatibility
   17. Middlebox Considerations
   18. Relations to Other Specifications
      18.1. DCCP and RTP
      18.2. Multiplexing Issues
   19. Security Considerations
      19.1. Security Considerations for Mobility
      19.2. Security Considerations for Partial Checksums
   20. IANA Considerations
   21. Thanks
   A. Appendix: Ack Vector Implementation Notes
      A.1. Packet Arrival
         A.1.1. New Packets
         A.1.2. Old Packets
      A.2. Sending Acknowledgements
      A.3. Clearing State
      A.4. Processing Acknowledgements
   B. Appendix: Design Motivation
      B.1. CsCov and Partial Checksumming
   Normative References
   Informative References
   Authors' Addresses
   Intellectual Property Notice

1. Introduction

   This document describes the Datagram Congestion Control Protocol
   (DCCP), a transport protocol that implements a congestion-
   controlled, bidirectional stream of unreliable datagrams.
   Specifically, DCCP provides:

   o An unreliable flow of datagrams, with acknowledgements.

   o Reliable handshakes for connection setup and teardown.

   o Reliable negotiation of options, including negotiation of a
     suitable congestion control mechanism.

   o Mechanisms allowing a server to avoid holding any state for
     unacknowledged connection attempts or already-finished
     connections.

   o Congestion control incorporating Explicit Congestion Notification
     (ECN) and the ECN Nonce, as per [RFC 3168] and [RFC 3540].

   o Acknowledgement mechanisms communicating packet loss and ECN mark
     information. Acks are transmitted as reliably as the relevant
     congestion control mechanism requires, possibly completely
     reliably.

   o Optional mechanisms that tell the sending application, with high
     reliability, which data packets reached the receiver, and whether
     those packets were ECN marked, corrupted, or dropped in the
     receive buffer.

   o Path Maximum Transmission Unit (PMTU) discovery, as per
     [RFC 1191].

   DCCP is intended for applications, such as streaming media and
   Internet telephony, where reliable in-order delivery, combined with
   congestion control, can result in some information arriving at the
   receiver after it is no longer of use.
   So far, most such applications have either used TCP, with the
   attendant quality problems caused by late data delivery, or used UDP
   and implemented their own congestion control (or no congestion
   control at all). DCCP provides standard congestion control and
   congestion control negotiation mechanisms for such applications. It
   enables the use of ECN, along with conformant end-to-end congestion
   control, for applications that would otherwise be using UDP. In
   addition, DCCP implements reliable connection setup, teardown, and
   feature negotiation.

   DCCP's target applications require the flow-based semantics of TCP,
   but do not want TCP's in-order delivery and reliability, or would
   like different congestion control dynamics than TCP.

2. Design Rationale

   DCCP was intended to be used by applications that currently use UDP
   without end-to-end congestion control. Most streaming UDP
   applications should have little reason not to switch to DCCP, once
   it is deployed.
   Thus, DCCP was designed to have as little overhead as possible, both
   in terms of the packet header size and in terms of the state and CPU
   overhead required at end hosts. Only the minimal necessary
   functionality was included in DCCP, leaving other functionality,
   such as forward error correction (FEC), semi-reliability, and
   multiple streams, to be layered on top of DCCP as desired. This
   desire for minimal overhead is also one of the reasons to avoid
   proposing an unreliable variant of the Stream Control Transmission
   Protocol (SCTP, [RFC 2960]).

   Different forms of conformant congestion control are appropriate for
   different applications. For example, applications such as on-line
   games might want to make quick use of any available bandwidth. Other
   applications, such as streaming media, might trade off this
   responsiveness for a steadier, less bursty rate, since sudden rate
   changes cause unacceptable UI glitches (such as audible pauses or
   clicks in the playout stream). Thus, DCCP allows applications to
   choose between several forms of congestion control. One choice,
   TCP-like Congestion Control, halves the congestion window in
   response to a packet drop or mark, as in TCP. Applications using
   this congestion control mechanism will respond quickly to changes in
   available bandwidth, but must be able to tolerate the abrupt changes
   in congestion window typical of TCP. A second alternative,
   TCP-Friendly Rate Control (TFRC, [RFC 3448]), a form of
   equation-based congestion control, minimizes abrupt changes in the
   sending rate while maintaining longer-term fairness with TCP.

   DCCP also lets unreliable traffic safely use ECN. A UDP kernel API
   might not allow applications to set UDP packets as ECN-capable,
   since the API could not guarantee the application would properly
   detect or respond to congestion. DCCP kernel APIs will have no such
   issues, since DCCP itself implements congestion control.

   We chose not to require the use of the Congestion Manager
   [RFC 3124], which allows multiple concurrent streams between the
   same sender and receiver to share congestion control. The current
   Congestion Manager can only be used by applications that have their
   own end-to-end feedback about packet losses, but this is not the
   case for many of the applications currently using UDP.
   In addition, the current Congestion Manager does not easily support
   multiple congestion control mechanisms, or lend itself to the use of
   forms of TFRC where the state about past packet drops or marks is
   maintained at the receiver rather than at the sender. DCCP should be
   able to make use of CM where desired by the application, but we do
   not see any benefit in making the deployment of DCCP contingent on
   the deployment of CM itself.

3. Conventions and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC 2119].

3.1. Numbers and Fields

   All multi-byte numerical quantities in DCCP, such as port numbers,
   Sequence Numbers, and arguments to options, are transmitted in
   network byte order (most significant byte first).

   We occasionally refer to the "left" and "right" sides of a bit
   field. "Left" means towards the most significant bit, and "right"
   means towards the least significant bit.

   Reserved bitfields in DCCP packet headers MUST be ignored by
   receivers, and MUST be set to zero by senders, unless otherwise
   specified.
   Random numbers in DCCP are used for their security properties, and
   MUST be chosen according to the guidelines in [RFC 1750].

3.2. Parts of a Connection

   Each DCCP connection runs between two endpoints, which we often name
   DCCP A and DCCP B. DCCP connections are actively initiated by one
   endpoint. The active endpoint is called the client, and the passive
   endpoint is called the server.

   DCCP connections are bidirectional; data may pass from either
   endpoint to the other. This means that data and acknowledgements may
   be flowing in both directions simultaneously. Logically, however, a
   DCCP connection consists of two separate unidirectional connections,
   called half-connections. Each half-connection consists of the data
   packets sent by one endpoint and the corresponding acknowledgements
   sent by the other endpoint. We can illustrate this as follows:

    +--------+  A-to-B half-connection:         +--------+
    |        |    --> data packets -->          |        |
    |        |    <-- acknowledgements <--      |        |
    | DCCP A |                                  | DCCP B |
    |        |  B-to-A half-connection:         |        |
    |        |    <-- data packets <--          |        |
    +--------+    --> acknowledgements -->      +--------+

Although they are logically distinct, in practice the
half-connections overlap; a DCCP-DataAck packet, for example,
contains application data relevant to one half-connection and
acknowledgement information relevant to the other.

In the context of a single half-connection, the HC-Sender is the
endpoint sending data, while the HC-Receiver is the endpoint sending
acknowledgements. For example, in the A-to-B half-connection, DCCP A
is the HC-Sender and DCCP B is the HC-Receiver.

3.3. Features

A feature is a DCCP connection attribute, identified by a feature
number and an endpoint, on whose value the two endpoints agree. Many
properties of a DCCP connection are controlled by features, including
the congestion control mechanisms in use on the two half-connections,
whether mobility is allowed, and whether ECN is supported. The
endpoints can achieve agreement by out-of-band communication, or
through the exchange of feature negotiation options in DCCP headers.

The notation F/A represents the feature with feature number F located
at DCCP endpoint A; the feature F/B has the same feature number, but
is located at the other endpoint. Both DCCP A and DCCP B know, and
agree on, the values of both F/A and F/B, but F/A and F/B may have
different values. DCCP A is called the feature location for all
features F/A, and the feature remote for all features F/B.

3.4. Round-Trip Times

We sometimes refer to a round-trip time, for example when setting
timers. If no useful round-trip time estimate is available, a DCCP
implementation SHOULD use 0.2 seconds instead.

3.5. Robustness Principle

DCCP implementations should follow TCP's "general principle of
robustness": be conservative in what you do, be liberal in what you
accept from others.

4. Overview

DCCP's high-level connection dynamics should seem familiar to anyone
who knows TCP. DCCP connections, like TCP connections, progress
through three phases: initiation (including a three-way handshake),
data transfer, and termination. Data can flow both ways over the
connection. An acknowledgement framework lets senders discover how
much data has been lost; congestion control uses this information to
avoid unfairly congesting the network.

Of course, DCCP provides unreliable datagram semantics, not TCP's
reliable bytestream semantics. The application must package its data
into explicit frames, and must retransmit its own data as necessary.
It may be useful to think of DCCP either as TCP minus bytestream
semantics and reliability, or as UDP plus congestion control,
handshakes, and acknowledgements.

4.1. Packet Types

DCCP uses eleven packet types to implement various protocol
functions. For example, every new connection attempt begins with a
DCCP-Request packet sent by the client. A DCCP-Request packet thus
resembles a TCP SYN; but DCCP-Request is a packet type, not a flag,
so there's no way to send an unexpected combination such as TCP's
SYN+FIN+ACK+RST.

Eight packet types occur during a typical connection. Two occur only
during the initiation phase, three during the data transfer phase,
and three only during the termination phase:

    Client                                      Server
    ------                                      ------
                    (1) Initiation
    DCCP-Request -->
                                     <-- DCCP-Response
    DCCP-Ack -->

                    (2) Data transfer
    DCCP-Data, DCCP-Ack, DCCP-DataAck -->
                 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck

                    (3) Termination
                                     <-- DCCP-CloseReq
    DCCP-Close -->
                                        <-- DCCP-Reset

Note the three-way handshakes during initiation and termination.

The three remaining packet types are used for special purposes: when
an endpoint moves, or to resynchronize after bursts of loss.

Every DCCP packet starts with a common generic header of 12 bytes (16
bytes when extended sequence numbers are in use), but different
packet types may include different amounts of additional data. For
example, the DCCP-Ack packet type includes an Acknowledgement Number.
Every packet type may also contain options, up to around 1000 bytes'
worth.

All of the packet types are described below.

DCCP-Request
    Sent by the client to initiate a connection (the first part of
    the three-way handshake).

DCCP-Response
    Sent by the server in response to a DCCP-Request (the second part
    of the three-way handshake).

DCCP-Data
    Used to transmit data.

DCCP-Ack
    Used for pure acknowledgements.

DCCP-DataAck
    Used for piggybacked data-plus-acknowledgements.

DCCP-CloseReq
    Sent by the server to request that the client close the
    connection.

DCCP-Close
    Used to close the connection; elicits a DCCP-Reset in response.

DCCP-Reset
    Used to terminate the connection, either normally or abnormally.

DCCP-Move
    Supports multihoming and mobility.

DCCP-Sync, DCCP-SyncAck
    Used to resynchronize sequence numbers after large bursts of
    loss.

4.2. Sequence Numbers

Each DCCP packet carries a sequence number, so that losses can be
detected and reported. But unlike TCP's byte-based sequence numbers,
DCCP sequence numbers are attached to packets. Each packet sent
increments the sequence number by one. For example:

    DCCP A                                      DCCP B
    ------                                      ------
    DCCP-Data(seqno 1) -->
    DCCP-Data(seqno 2) -->
                         <-- DCCP-Ack(seqno 10, ackno 2)
    DCCP-DataAck(seqno 3, ackno 10) -->
                                 <-- DCCP-Data(seqno 11)

Note that even DCCP-Ack pure acknowledgements increment the sequence
number; after the DCCP-Ack with sequence number 10, the following
DCCP-Data packet uses the next sequence number, 11. This lets the
endpoints tell when acknowledgements are lost in the network. It also
means that endpoints can get out of sync after a long burst of loss.
The DCCP-Sync and DCCP-SyncAck packet types let DCCP recover from
large loss bursts; see Section 7.5.

Also note that, since DCCP is an unreliable protocol, there are no
retransmissions, and it doesn't make sense to have a cumulative
acknowledgement field. Acknowledgement Number (ackno) fields equal
the largest sequence number received, rather than the TCP-style
smallest sequence number not received. Separate options indicate any
intermediate sequence numbers that weren't received.

4.3. States

DCCP endpoints progress through different states during the course of
a connection, corresponding roughly to the three phases of
initiation, data transfer, and termination. The figure below shows
the typical progress through these states for a client and server.

    Client                                             Server
    ------                                             ------
                      (0) No connection
    CLOSED                                             LISTEN

                      (1) Initiation
    REQUEST      DCCP-Request -->
                                 <-- DCCP-Response     RESPOND
    PARTOPEN     DCCP-Ack or DCCP-DataAck -->

                      (2) Data transfer
    OPEN          <-- DCCP-Data, Ack, DataAck -->      OPEN

                      (3) Termination
                                 <-- DCCP-CloseReq     CLOSEREQ
    CLOSING      DCCP-Close -->
                                 <-- DCCP-Reset        CLOSED
    TIMEWAIT
    CLOSED

    The client and server's typical progress through states.

The states are as follows; Section 8 describes them in more detail.

CLOSED
    Represents a nonexistent connection.

LISTEN
    Represents a server socket in the passive listening state. LISTEN
    and CLOSED are not associated with any particular DCCP
    connection.

REQUEST
    The client socket enters this state, from CLOSED, after sending a
    DCCP-Request packet to try to initiate a connection.

RESPOND
    A server socket enters this state, from LISTEN, after receiving a
    DCCP-Request from a client.

PARTOPEN
    The client socket enters this state, from REQUEST, after
    receiving a DCCP-Response from the server. This state represents
    the third phase of the three-way handshake. The client may send
    data in this state, but it MUST include an Acknowledgement Number
    on all of its packets.

OPEN
    The central, data transfer portion of a DCCP connection. Client
    and server enter into this state from PARTOPEN and RESPOND,
    respectively.
    Sometimes we speak of SERVER-OPEN and CLIENT-OPEN states,
    corresponding to the server's OPEN state and the client's OPEN
    state.

CLOSEREQ
    A server socket enters this state, from SERVER-OPEN, to signal
    that the connection is over, but the client must hold TIMEWAIT
    state.

CLOSING
    Either server or client can enter this state to close the
    connection.

TIMEWAIT
    A socket remains in this state for 2MSL after the connection has
    been torn down, to prevent mistakes due to the delivery of old
    packets. One MSL, or Maximum Segment Lifetime, is the maximum
    length of time a packet could survive in the network.

4.4. Congestion Control

DCCP connections are congestion controlled. Unlike TCP, however, DCCP
supports multiple congestion control mechanisms for applications to
choose from. In fact, the two half-connections can be governed by
different mechanisms.
Each mechanism corresponds to a one-byte congestion control
identifier, or CCID. A CCID describes how the HC-Sender limits data
packet rates; how it maintains necessary parameters, such as
congestion windows; how the HC-Receiver sends congestion feedback via
acknowledgements; and how it manages the acknowledgement rate. The
endpoints negotiate their CCIDs during connection initiation. So far,
CCIDs 2 and 3 have been defined for use with DCCP; CCID 0 is
reserved, and CCID 1 is used for special purposes (see Section 10.1).

CCID 2 corresponds to TCP-like Congestion Control, whose behavior is
similar to TCP's congestion control. The sender maintains a
congestion window and sends packets until that window is full.
Packets are acknowledged by the receiver. Dropped packets and ECN
[RFC 3168] indicate congestion; the response to congestion is to
halve the congestion window. Acknowledgements in CCID 2 contain the
sequence numbers of all received packets within some window, similar
to a super selective-acknowledgement (SACK, [RFC 3517]).

CCID 3 provides TFRC Congestion Control, an equation-based form of
congestion control which is intended to provide a smoother response
to congestion than CCID 2. The sender maintains a "transmit rate".
The receiver sends acknowledgement packets containing information
about the receiver's estimate of packet loss. The sender uses this
information to update its transmit rate.
Although CCID 3 behaves somewhat differently from TCP in its
short-term congestion response, it is designed to operate fairly with
TCP over the long term.

The behaviors of CCIDs 2 and 3 are fully defined in separate profile
documents [CCID 2 PROFILE] [CCID 3 PROFILE].

4.5. Features

Agreement on DCCP feature values is achieved by explicit negotiation,
using options in DCCP packet headers. This generally happens at
connection startup, but negotiation can begin at any time.

The relevant options are Change L, Confirm L, Change R, and Confirm
R, with the "L" options sent by the feature location and the "R"
options sent by the feature remote. A Change R message says to the
peer, "change this feature value on your side". The peer responds
with a Confirm L, meaning "I've changed it". The suggested option
setting in Change R can sometimes contain multiple values, which are
sorted in preference order. For example:

    Client                                        Server
    ------                                        ------
    Change R(CCID, 2) -->
                                 <-- Confirm L(CCID, 2)
               * agreement that CCID/Server = 2 *

    Change R(CCID, 3 4) -->
                            <-- Confirm L(CCID, 4, 4 2)
               * agreement that CCID/Server = 4 *

In the second exchange, the client requests that the server use
either CCID 3 or CCID 4, with 3 preferred.
The server chooses 4, giving its preference list of "4 2".

A party that wants to change a feature located at itself issues a
"Change L" option, which elicits a "Confirm R" in reply.

    Client                                        Server
    ------                                        ------
                                <-- Change L(CCID, 3 2)
    Confirm R(CCID, 3, 3 2) -->
               * agreement that CCID/Server = 3 *

In this example, the server requests CCID value 3 or 2 for the
server's CCID, with 3 preferred, and the client agrees.

Retransmissions make feature negotiation reliable. Section 6
describes these options further.

4.6. Other Differences from TCP

Interesting differences between DCCP and TCP, apart from those
discussed so far, include:

o   Copious space for options (up to about 1000 bytes; the whole
    header is limited to 1020 bytes).

o   Different acknowledgement formats. The CCID for a connection
    determines how much ack information needs to be transmitted. In
    CCID 2 (TCP-like), this is about one ack per 2 packets, and each
    ack must declare exactly which packets were received; in CCID 3
    (TFRC), it's about one ack per RTT, and acks must declare at
    minimum just the lengths of recent loss intervals.

o   Denial-of-service (DoS) protection. Several DCCP mechanisms
    attempt to let servers limit the amount of state that possibly
    misbehaving clients can force them to maintain. An Init Cookie
    option, analogous to TCP's SYN Cookies [SYNCOOKIES], avoids
    SYN-flood-like attacks. Only one connection endpoint need hold
    TIMEWAIT state; the DCCP-CloseReq packet, which may only be sent
    by the server, passes that state to the client. Various rate
    limits let servers avoid attacks that might force extensive
    computation or packet generation.

o   Distinguishing different kinds of loss. A Data Dropped option
    (Section 11.7) lets an endpoint declare that a packet was dropped
    because of corruption, because of receive buffer overflow, and so
    on.
    This facilitates research into more appropriate rate-control
    responses for these non-network-congestion losses (although
    currently all losses will cause a congestion response).

o   Acknowledgement readiness. In TCP, a packet is acknowledged only
    when the data is queued for delivery to the application. This
    does not make sense in DCCP, where an application might request a
    drop-from-front receive buffer, for example. We acknowledge a
    packet when its options have been processed. The Data Dropped
    option may later say that the packet's payload was discarded.

o   Integrated support for mobility and multihoming via the DCCP-Move
    packet type.

o   No receive window. DCCP is a congestion control protocol, not a
    flow control protocol.

o   No simultaneous open. Every connection has one client and one
    server.

o   No half-closed states. DCCP has no states corresponding to TCP's
    FINWAIT and CLOSEWAIT, where one half-connection is explicitly
    closed while the other is still active.

4.7. Example Connection

The progress of a typical DCCP connection is as follows. (This
description is informative, not normative.)

        Client                                  Server
        ------                                  ------
    0.  [CLOSED]                              [LISTEN]
    1.  DCCP-Request -->
    2.                               <-- DCCP-Response
    3.  DCCP-Ack -->
                                          <-- DCCP-Ack
    4.  DCCP-Data, DCCP-Ack, DCCP-DataAck -->
                 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck
    5.                               <-- DCCP-CloseReq
    6.  DCCP-Close -->
    7.                                  <-- DCCP-Reset
    8.  [TIMEWAIT]

1.  The client sends the server a DCCP-Request packet specifying the
    client and server ports, the service being requested, and any
    features being negotiated, including the CCID that the client
    would like the server to use. The client may optionally piggyback
    some data on the DCCP-Request packet (an application-level
    request, say), which the server may ignore.

2.  The server sends the client a DCCP-Response packet indicating
    that it is willing to communicate with the client. The response
    indicates any features and options that the server agrees to,
    begins or continues other feature negotiations if desired, and
    optionally includes an Init Cookie that wraps up all this
    information and which must be returned by the client for the
    connection to complete.

3.  The client sends the server a DCCP-Ack packet that acknowledges
    the DCCP-Response packet. This acknowledges the server's initial
    sequence number and returns the Init Cookie if there was one in
    the DCCP-Response. It may also continue feature negotiation.
    There might follow zero or more DCCP-Ack exchanges as required to
    finalize feature negotiation. The client may piggyback an
    application-level request on its final ack, producing a
    DCCP-DataAck packet.

4.  The server and client then exchange DCCP-Data packets, DCCP-Ack
    packets acknowledging that data, and, optionally, DCCP-DataAck
    packets containing piggybacked data and acknowledgements.
    If the client has no data to send, then the server will send
    DCCP-Data and DCCP-DataAck packets, while the client will send
    DCCP-Acks exclusively.

5.  The server sends a DCCP-CloseReq packet requesting a close.

6.  The client sends a DCCP-Close packet acknowledging the close.

7.  The server sends a DCCP-Reset packet with Reset Code 1, "Closed",
    and clears its connection state. In DCCP, unlike TCP, Resets are
    part of normal connection termination; see Section 5.6.

8.  The client receives the DCCP-Reset packet and holds state for a
    reasonable interval of time to allow any remaining packets to
    clear the network.

An alternative connection closedown sequence is initiated by the
client:

5b. The client sends a DCCP-Close packet closing the connection.

6b. The server sends a DCCP-Reset packet with Reset Code 1, "Closed",
    and clears its connection state.

7b. The client receives the DCCP-Reset packet and holds state for a
    reasonable interval of time to allow any remaining packets to
    clear the network.

5. Header Formats

The variable-length DCCP header appears first in every DCCP packet. A
header can be from 12 to 1020 bytes long. The initial 12 bytes of the
header are the same regardless of packet type. Following this come
any additional fixed-length fields, depending on the packet type, and
then a variable-length list of options. Finally, some packet types
include application data.

    +---------------------------------------+  -.
    |            Generic Header             |   |
    +---------------------------------------+   |
    | Additional Fields (depending on type) |   +- DCCP Header
    +---------------------------------------+   |
    |          Options (optional)           |   |
    +=======================================+  -'
    |      Application Data (optional)      |
    +=======================================+

5.1. Generic Header

The DCCP generic header generally takes 12 bytes.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Source Port          |           Dest Port           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Data Offset  | CCVal | CsCov |           Checksum            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Type  |X| Res |                Sequence Number                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Actually, there are two types of generic header, depending on the
value of X, the Extended Sequence Numbers bit. If X is zero, the
Sequence Number field takes 24 bits, as above. If X is one, the
Sequence Number field extends for an additional 24 bits, for a total
of 48:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Source Port          |           Dest Port           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Data Offset  | CCVal | CsCov |           Checksum            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Type  |1| Res |          Sequence Number (high bits)          .
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    .          Sequence Number (low bits)           |  Reserved   |T|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Source and Destination Ports: 16 bits each
    These fields identify the connection, similar to the
    corresponding fields in TCP and UDP. The Source Port represents
    the relevant port on the endpoint that sent this packet, the
    Destination Port the relevant port on the other endpoint. Source
    Ports SHOULD be chosen randomly, to reduce the likelihood of
    attack.

Data Offset: 8 bits
    The offset from the start of the DCCP header to the beginning of
    the packet's application data, in 32-bit words.

CCVal: 4 bits
    Used by the HC-Sender CCID. For example, the A-to-B CCID's
    sender, which is active at DCCP A, MAY send 4 bits of information
    per packet to its receiver by encoding that information in CCVal.
    CCVal MUST be set to zero unless the HC-Sender CCID specifies a
    different value.

Checksum Coverage (CsCov): 4 bits
    Checksum Coverage specifies what parts of the packet are covered
    by the Checksum field. This always includes the DCCP header and
    options, but if applications request it, some or all of the
    application data may be excluded. This can improve performance on
    noisy links, assuming the application can tolerate corruption.
    See Section 9.

Checksum: 16 bits
    The Internet checksum of the packet's DCCP header (including
    options), a network-layer pseudoheader, and, depending on
    Checksum Coverage, some or all of the application data. See
    Section 9.

Type: 4 bits
    The Type field specifies the type of the packet. The following
    values are defined:

        Type    Meaning
        ----    -------
        0       DCCP-Request
        1       DCCP-Response
        2       DCCP-Data
        3       DCCP-Ack
        4       DCCP-DataAck
        5       DCCP-CloseReq
        6       DCCP-Close
        7       DCCP-Reset
        8       DCCP-Move
        9       DCCP-Sync
        10      DCCP-SyncAck
        11-15   Reserved
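
As an informative illustration only, the generic header layout and
Type values given above could be decoded along the following lines.
This is a minimal sketch, not part of the specification: the names
PacketType and parse_generic_header and the dictionary keys are
hypothetical, and the sketch assumes the 12-byte (X=0) and 16-byte
(X=1) layouts exactly as drawn in the diagrams of this section. It
ignores the reserved bits and does not verify the checksum.

```python
import struct
from enum import IntEnum


class PacketType(IntEnum):
    """Type field values from the table above (hypothetical names)."""
    REQUEST = 0
    RESPONSE = 1
    DATA = 2
    ACK = 3
    DATAACK = 4
    CLOSEREQ = 5
    CLOSE = 6
    RESET = 7
    MOVE = 8
    SYNC = 9
    SYNCACK = 10


def parse_generic_header(packet: bytes) -> dict:
    """Decode the 12- or 16-byte generic header at the start of packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte generic header")
    # Network byte order: two 16-bit ports, 8-bit Data Offset, one byte
    # holding CCVal (left nibble) and CsCov (right nibble), 16-bit
    # Checksum, then a 32-bit word with Type, X, Res and 24 sequence bits.
    sport, dport, data_offset, cc, checksum, word3 = struct.unpack(
        "!HHBBHI", packet[:12])
    raw_type = word3 >> 28
    x = (word3 >> 27) & 0x1
    seqno = word3 & 0xFFFFFF
    header_len = 12
    t = None  # the T bit exists only when X=1
    if x:
        if len(packet) < 16:
            raise ValueError("X=1 but packet shorter than 16 bytes")
        (word4,) = struct.unpack("!I", packet[12:16])
        # The first 24 sequence bits are the high bits; the top 24 bits
        # of the next word are the low bits.
        seqno = (seqno << 24) | (word4 >> 8)
        t = word4 & 0x1
        header_len = 16
    return {
        "source_port": sport,
        "dest_port": dport,
        # Data Offset is counted in 32-bit words.
        "data_offset_bytes": data_offset * 4,
        "ccval": cc >> 4,
        "cscov": cc & 0xF,
        "checksum": checksum,
        # Values 11-15 are reserved, so they are left as plain integers.
        "type": PacketType(raw_type) if raw_type <= 10 else raw_type,
        "x": x,
        "seqno": seqno,
        "t": t,
        "generic_header_len": header_len,
    }
```

Under these assumptions, the application data of a packet would begin
at byte index data_offset_bytes, and any type-specific fields and
options would occupy the bytes between generic_header_len and that
index.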
[Page 21] INTERNET-DRAFT Expires: August 2004 February 2004 Extended Sequence Numbers (X): 1 bit This bit isusedset todetermine when a new DCCP-Data packet can be transmitted when the server has been limited byone to indicate thecongestion windowuse of an extended generic header with 48-bit Sequence andno feedback has been received from the client. (10)Acknowledgement Numbers. Very-high-rate connections SHOULD set X to one, and use 48-bit sequence numbers, to gain increased protection against wrapped sequence numbers and attacks. See Section 7.6. Reserved (Res): 3 bits TheDCCP-CloseReq, DCCP-Close,version of DCCP specified here MUST ignore this field on received packets, andDCCP-Reset packetsMUST set it tocloseall zeroes on generated packets. Sequence Number: 24 or 48 bits Identifies theconnection are aspacket uniquely in theexample above. 4.5.2. DCCP with TFRC Congestion Control This example issequence ofa connection where both half-connections use TFRC Congestion Control, specified by CCID 3 [CCID 3 PROFILE]. (1) The DCCP-Request and DCCP-Responseall packetsspecifying the use of CCID 3 andtheinitial DCCP-DataAcksource sent on this connection. Sequence Number increases by one with every packetare similarsent, including packets such as DCCP- Ack that carry no application data. See Section 7. Sequence Number Transition (T): 1 bit [X=1 only] Set tothoseone to indicate an ongoing transition from 24-bit to 48-bit sequence numbers. See Section 7.6. Many packet types also carry an Acknowledgement Number in theCCID 2 example above. (2)four or eight bytes immediately following the generic header. When X=0, its format is: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Reserved | Acknowledgement Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ And when X=1: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Reserved | Acknowledgement Number (high bits) . 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ . Acknowledgement Number (low bits) | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Acknowledgement Number: 24 or 48 bits Theserver sends DCCP-Data packets, whereAcknowledgement Number field generally acknowledges the greatest valid sequence number received so far on this connection. ("Greatest" is, of course, measured in circular sequence space.) Acknowledgement numbers make no attempt to provide precise information about which packetssent is governed by an allowed transmit rate,have arrived; options such asin TFRC.the Ack Vector do this. Kohler/Handley/Floyd Section 5.1. [Page 22] INTERNET-DRAFT Expires: August 2004 February 2004 Reserved: 8 bits Thedetailsversion ofthe allowed transmit rate are defined in the profile for CCID 3, which isDCCP specified here MUST ignore these fields on received packets, and MUST set them to all zeroes on generated packets. 5.2. DCCP-Request Header A client initiates aseparate document [CCID 3 PROFILE]. Each DCCP-Data packet hasDCCP connection by sending asequence numberDCCP-Request packet. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / Generic DCCP Header (12 or 16 bytes) / / with Type=0 (DCCP-Request) / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Service Code | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options / Padding | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ | Application Data | | ... | Service Code: 32 bits Describes the service to which the client application wants to connect. Examples might include RTSP and DOOM. Service Codes are intended to make application protocols independent of well- known ports, and help middleboxes identify the protocol used on awindow counter value. Kohler/Handley/Floyd/Padhyegiven connection. See Section 8.1.2. 5.3. 
DCCP-Response Header The server responds to valid DCCP-Request packets with DCCP-Response packets. This is the second phase of the three-way handshake. Kohler/Handley/Floyd Section4.5.2.5.3. [Page21]23] INTERNET-DRAFT Expires:AprilAugust 2004October 2003 Some of these data packets are DCCP-DataAck packets acknowledging packets from the client, but for simplicity weFebruary 2004 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / Generic DCCP Header (12 or 16 bytes) / / with Type=1 (DCCP-Response) / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Reserved | Acknowledgement Number | (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when (. Acknowledgement Number (low bits) | Reserved |)X=1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Service Code | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options / Padding | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ | Application Data | | ... | Acknowledgement Number: 24 or 48 bits The Acknowledgement Number field willnot discussgenerally equal thehalf-connection of dataSequence Number from theclient toDCCP-Request. Service Code: 32 bits Echoes theserver in this example. The use of ECN follows TCP-like Congestion Control, above,Service Code on the DCCP-Request. 5.4. DCCP-Data, DCCP-Ack, andis described further in [CCID 3 PROFILE]. (3)DCCP-DataAck Headers Thereceiver sends DCCP-Ack packets at least once per round-trip time acknowledging thecentral datapackets, unless the server is sending at a ratetransfer portion ofless than one packet per RTT, as specified by [CCID 3 PROFILE]. These acknowledgements may be piggybacked on data packets, producing DCCP-DataAck packets. Each DCCP-Ack packetevery DCCP connection usesa sequence numberDCCP-Data, DCCP-Ack, andidentifies the most recent packet received from the server. 
Each DCCP-Ack packet includes feedback about the loss event rate calculated by the client, as specified by [CCID 3 PROFILE]. (4) The server continues sendingDCCP-DataAck packets. DCCP-Data packetsas controlled by the allowed transmit rate. Upon receiving DCCP-Ack packets, the server updates its allowed transmit rate as specified by [CCIDcarry application data. 0 1 2 3PROFILE]. (5) The server estimates round-trip times and calculates a TimeOut (TO) value much as the RTO (Retransmit Timeout) is calculated in TCP. Again, the specification for this is in [CCID0 1 2 3PROFILE]. (6) The DCCP-CloseReq, DCCP-Close, and DCCP-Reset4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / Generic DCCP Header (12 or 16 bytes) / / with Type=2 (DCCP-Data) / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options / Padding | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ | Application Data | | ... | DCCP-Ack packetsto closedispense with theconnectiondata, but contain an Acknowledgement Number. They areas in the examples above. 5. Packet Formats 5.1.used for pure acknowledgements. Kohler/Handley/Floyd Section 5.4. [Page 24] INTERNET-DRAFT Expires: August 2004 February 2004 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / GenericPacket Header AllDCCPpackets beginHeader (12 or 16 bytes) / / witha generic DCCP packet header:Type=3 (DCCP-Ack) / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Reserved | Acknowledgement Number | (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when (. 
Acknowledgement Number (low bits) | Reserved |)X=1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options / Padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ DCCP-DataAck packets carry both application data and an Acknowledgement Number: acknowledgement information is piggybacked on a data packet. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / Generic DCCP Header (12 or 16 bytes) / / with Type=4 (DCCP-DataAck) / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Source PortReserved |Dest PortAcknowledgement Number |+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+(+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when (. Acknowledgement Number (low bits) |Data OffsetReserved |)X=1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |CCValOptions / Padding |CsCov+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |ChecksumApplication Data |+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+|Type |X|# NDP| Sequence Number... |+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Kohler/Handley/Floyd/Padhye Section 5.1. [Page 22] INTERNET-DRAFT Expires: April 2004 October 2003 Source and Destination Ports: 16 bits each These fields identify the connection, similar to the corresponding fields in TCP and UDP. The Source Port represents the relevant port on the endpoint that sent this packet, the Destination Port the relevant port on the other endpoint. Data Offset: 8 bits The offset from the start of the DCCP header to the beginning of the packet's payload, measured in 32-bit words. CCVal: 4 bits This field is reserved for use by the sending CCID. In particular, the A-to-B CCID's sender, which is active at DCCP A, MAY send information to the receiver at DCCP B by encoding that information in CCVal. 
   If the relevant CCID does not specify its value, it MUST be set to zero.

Checksum Coverage (CsCov): 4 bits
   The Checksum Coverage field specifies what parts of the packet are covered by the Checksum field, as follows:

   CsCov = 0      Checksum covers the DCCP header, DCCP options, network-layer pseudoheader (described below), and the entire DCCP payload, possibly padded on the right with zeros to an even number of bytes.

   CsCov = 1-15   Checksum covers the DCCP header, DCCP options, network-layer pseudoheader, and the initial (CsCov-1)*4 bytes of the DCCP payload.

   Thus, if CsCov is 1, none of the DCCP payload is protected by the header checksum. The value (CsCov-1)*4 MUST be less than or equal to the length of the DCCP payload. Packets with invalid CsCov values MUST be ignored; in particular, their options MUST NOT be processed. The meanings of values other than 0 and 1 should be considered experimental.

   Values other than 0 specify that corruption is acceptable in some or all of the DCCP packet's payload. In fact, DCCP cannot even detect corruption in areas not covered by the header checksum, unless the Payload Checksum option is used (Section 8.8). Applications should not make any assumptions about the correctness of received data not covered by the checksum, and should if necessary introduce their own appropriate validity checks.

   A DCCP application interface should let sending applications suggest a value for CsCov for sent packets, defaulting to 0 (full coverage). It should also let receiving applications refuse delivery of packets with checksum coverage less than a value provided by the application; by default, only packets with fully-covered payloads should be accepted.

   Lower layers that support partial error detection MAY use the Checksum Coverage field as a hint of where errors do not need to be detected.
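The CsCov coverage rule quoted above (0 covers the whole payload, 1-15 cover the first (CsCov-1)*4 bytes) can be sketched in a few lines of Python. The function name and the use of `ValueError` are illustrative assumptions, not part of the specification; the text only requires that packets with invalid CsCov values be ignored.

```python
def covered_payload_bytes(cscov: int, payload_len: int) -> int:
    """Return how many payload bytes the Checksum field protects.

    Hypothetical helper for the CsCov rule in the draft-05 text:
    CsCov = 0 covers the entire payload, CsCov = 1-15 covers the
    initial (CsCov-1)*4 payload bytes.
    """
    if not 0 <= cscov <= 15:
        raise ValueError("CsCov is a 4-bit field")
    if payload_len < 0:
        raise ValueError("payload length cannot be negative")
    if cscov == 0:
        return payload_len
    covered = (cscov - 1) * 4
    if covered > payload_len:
        # The draft says such packets MUST be ignored; raising here is
        # only an illustrative stand-in for dropping the packet.
        raise ValueError("CsCov covers more bytes than the payload holds")
    return covered
```

With a 100-byte payload, a CsCov of 1 protects no payload bytes and a CsCov of 3 protects the first 8.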
   Lower layers MUST use a strong error detection mechanism to detect at least errors that occur in the sensitive part of the packet, and discard damaged packets. The sensitive part consists of the bytes between the first byte of the IP header and the last byte identified by Checksum Coverage.

   For more details on application and lower-layer interface issues relating to partial checksumming, see [UDP-LITE], from which this text was summarized. See Appendix B.1 for further motivation of partial checksums and discussion of partial checksumming issues. Partial checksums introduce some security considerations, which are described in Section 16.2. DCCP partial checksumming was inspired by UDP-Lite [UDP-LITE].

Checksum: 16 bits
   DCCP uses the TCP/IP checksum algorithm. The Checksum field equals the 16 bit one's complement of the one's complement sum of all 16 bit words in the DCCP header, DCCP options, a pseudoheader taken from the network-layer header, and, depending on the value of the Checksum Coverage field, some or all of the payload. When calculating the checksum, the Checksum field itself is treated as 0. If a packet contains an odd number of header and text bytes to be checksummed, 8 zero bits are added on the right to form a 16 bit word for checksum purposes. The pad byte is not transmitted as part of the packet.

   The pseudoheader is calculated as for TCP. For IPv4, it is 96 bits long, and consists of the IPv4 source and destination addresses, the IP protocol number for DCCP (padded on the left with 8 zero bits), and the DCCP length as a 16-bit quantity (the length of the DCCP header with options, plus the length of any data); see Section 3.1 of [RFC 793]. For IPv6, it is 320 bits long, and consists of the IPv6 source and destination addresses, the DCCP length as a 32-bit quantity, and the IP protocol number for DCCP (padded on the left with 24 zero bits); see Section 8.1 of [RFC 2460].

   Packets with invalid header checksums MUST be ignored. In particular, their options MUST NOT be processed.

Type: 4 bits
   The type field specifies the type of the DCCP message. The following values are defined:

draft-ietf-dccp-spec-06.txt:

DCCP-Data and DCCP-DataAck packets may contain zero application data bytes if the application sends a zero-length datagram. Also, a DCCP-Ack packet need not have a zero-length application data area. The receiver MUST ignore any "application data" in a DCCP-Ack packet. The sender will not generally send such data, but it may occasionally do so, for example to perform PMTU discovery without risking loss of user data.

DCCP-Ack and DCCP-DataAck packets often include additional acknowledgement options, such as Ack Vector, as required by the congestion control mechanism in use.

5.5. DCCP-CloseReq and DCCP-Close Headers

DCCP-CloseReq and DCCP-Close packets begin the handshake that normally terminates a connection. Either client or server may send a DCCP-Close packet, which will elicit a DCCP-Reset packet (see the next section). Only the server can send a DCCP-CloseReq packet, which indicates that the server wants to close the connection, but does not want to hold its TIMEWAIT state.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 /             Generic DCCP Header (12 or 16 bytes)              /
 /         with Type=5 (DCCP-CloseReq) or 6 (DCCP-Close)         /
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   Reserved    |            Acknowledgement Number             |
(+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)  when
(.       Acknowledgement Number (low bits)       |   Reserved    |)  X=1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                       Options / Padding                       |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The receiver MUST ignore any "application data" in a DCCP-CloseReq or DCCP-Close packet.

5.6. DCCP-Reset Header

DCCP-Reset packets unconditionally shut down a connection. Connections normally terminate with a DCCP-Reset, but resets may be sent for other reasons, including bad port numbers, bad option behavior, incorrect ECN Nonce Echoes, and so forth.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 /             Generic DCCP Header (12 or 16 bytes)              /
 /                   with Type=7 (DCCP-Reset)                    /
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   Reserved    |            Acknowledgement Number             |
(+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)  when
(.       Acknowledgement Number (low bits)       |   Reserved    |)  X=1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Reset Code   |    Data 1     |    Data 2     |    Data 3     |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                       Options / Padding                       |
 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
 |                          Error Text                           |
 |                              ...                              |

Reset Code: 8 bits
   Represents the reason that the sender reset the DCCP connection.

Data 1, Data 2, and Data 3: 8 bits each
   The Data fields provide additional information about why the sender reset the DCCP connection. The meanings of these fields depend on the value of Reset Code.

Error Text (application data area)
   If present, Error Text is a human-readable text string, preferably in English and encoded in Unicode UTF-8, that describes the error in more detail. For example, a DCCP-Reset with Reset Code 12, "Aggression Penalty", might contain Error Text such as "Aggression Penalty: Received 3 bad ECN Nonce Echoes, assuming misbehavior".

The following Reset Codes are currently defined. The "Data" columns describe what the Data fields contain for a given Code. N/A means the Data field MUST be set to 0 by the sender of the DCCP-Reset and ignored by its receiver.
Reset                                                         Section
Code     Name                 Data 1    Data 2    Data 3      Reference
-----    ----                 ------    ------    ------      ---------
  0      Unspecified          N/A       N/A       N/A
  1      Closed               N/A       N/A       N/A         8.3
  2      Aborted              N/A       N/A       N/A         8.1.1
  3      No Connection        N/A       N/A       N/A         8.3.1
  4      Packet Error         packet    N/A       N/A         8.3.1
                              type
  5      Option Error         option    option data
                              number    (if any)
  6      Mandatory Error      option    option data           5.9.2
                              number    (if any)
  7      Extended Seqnos      N/A       N/A       N/A         7.6
  8      Connection Refused   N/A       N/A       N/A         8.1.3
  9      Bad Service Code     N/A       N/A       N/A         8.1.3
 10      Too Busy             N/A       N/A       N/A         8.1.3
 11      Bad Init Cookie      N/A       N/A       N/A         8.1.4
 12      Aggression Penalty   N/A       N/A       N/A         12.2
 13      Move Refused         N/A       N/A       N/A         14.4
 14-127  Reserved
128-255  CCID-specific codes  ...   variable   ...            10.4

5.7. DCCP-Move Header

The DCCP-Move packet type is part of DCCP's support for multihoming and mobility, which is described further in Section 14. DCCP A sends a DCCP-Move packet to DCCP B after changing its address and/or port number. The DCCP-Move packet requests that DCCP B start sending packets to a new address and port number, which are read off the packet's network header and generic DCCP header. The old address and port are defined through a Mobility ID, which provides some protection against hijacked connections.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 /             Generic DCCP Header (12 or 16 bytes)              /
 /                    with Type=8 (DCCP-Move)                    /
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   Reserved    |            Acknowledgement Number             |
(+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)  when
(.       Acknowledgement Number (low bits)       |   Reserved    |)  X=1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                    Mobility ID (high bits)                    .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 .                   Mobility ID (bits 64-95)                    .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 .                   Mobility ID (bits 32-63)                    .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 .                    Mobility ID (low bits)                     |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                       Options / Padding                       |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Mobility ID: 128 bits
   The value of the receiver's Mobility ID feature. This value uniquely identifies the current connection among the set of connections terminating at the receiver (meaning, the stationary endpoint); it MUST have been set in an earlier exchange. See Section 14.2.

The receiver MUST ignore any "application data" in a DCCP-Move packet.

5.8. DCCP-Sync and DCCP-SyncAck Headers

DCCP-Sync packets help DCCP endpoints recover synchronization after bursts of loss, or recover from half-open connections. Each valid DCCP-Sync received immediately elicits a DCCP-SyncAck.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 /             Generic DCCP Header (12 or 16 bytes)              /
 /         with Type=9 (DCCP-Sync) or 10 (DCCP-SyncAck)          /
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   Reserved    |            Acknowledgement Number             |
(+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)  when
(.       Acknowledgement Number (low bits)       |   Reserved    |)  X=1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                       Options / Padding                       |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Acknowledgement Number on DCCP-Sync and DCCP-SyncAck packets need not equal the generating endpoint's greatest valid sequence number received (GSR). This differs from Acknowledgement Numbers on all other packet types. If a DCCP-Sync was generated in response to a packet with invalid sequence numbers, then the DCCP-Sync's Acknowledgement Number will equal the invalid packet's sequence number.

draft-ietf-dccp-spec-05.txt:

   0      DCCP-Request packet.
   1      DCCP-Response packet.
   2      DCCP-Data packet.
   3      DCCP-Ack packet.
   4      DCCP-DataAck packet.
   5      DCCP-CloseReq packet.
   6      DCCP-Close packet.
   7      DCCP-Reset packet.
   8      DCCP-Move packet.
   9      DCCP-Sync packet.
   10-15  Reserved.

Extended Sequence Numbers (X): 1 bit
   This bit is set to one to indicate the use of an extended generic header with 48-bit Sequence and Acknowledgement Numbers. The format in the section has X set to zero. Section 5.3 describes the extended generic header.

Number of Non-Data Packets (# NDP): 3 bits
   DCCP sets this field to the number of non-data packets it has sent so far on its sequence, modulo 8. A non-data packet is simply any packet not containing user data; DCCP-Ack, DCCP-Close, DCCP-CloseReq, and DCCP-Reset are always non-data packets, while DCCP-Request, DCCP-Response, and DCCP-Move might or might not be. When sending a non-data packet, DCCP increments the # NDP counter before storing its value in the packet header.

   This field can help the receiving DCCP decide whether a lost packet contained any user data. (An application may want to know when it has lost data. DCCP could report every packet loss as a potential data loss, but that would cause false loss reports when non-data packets were lost.) For example, say that packet 10 had # NDP set to 5; packet 11 was lost; and packet 12 had # NDP set to 5. Then the receiving DCCP could deduce that packet 11 contained data, since # NDP did not change. Likewise, if # NDP had gone up to 6 (and packet 12 contained user data), then packet 11 must not have contained any data.

   # NDP can overflow, causing ambiguities. For example, if 8 packets are dropped in a row but # NDP does not change, the receiver will not be able to tell whether or not any of the lost packets contained data. Thus, applications SHOULD NOT depend on the availability of unambiguous # NDP information. DCCP itself uses # NDP only as a hint of when a connection has left unidirectional mode; potential ambiguities are not harmful there.

Sequence Number: 24 bits
   The sequence number field is initialized by a DCCP-Request or DCCP-Response packet, and increases by one (modulo 16777216) with every packet sent. The receiver uses this information to determine whether packet losses have occurred. Even packets containing no data update the sequence number. Sequence numbers also provide some protection against old and malicious packets and half-open connections; see Section 5.2 on sequence number validity.

   The two subflows' initial sequence numbers are set by the first DCCP-Request and DCCP-Response packets sent, and SHOULD be chosen as for TCP. In particular, initial sequence number choice MUST include a random or pseudorandom component to make it harder for attackers to complete sequence number attacks [RFC 1948]. The initial sequence number chosen for a given connection identifier (source address and port plus destination address and port) SHOULD increase over time, as TCP suggests [RFC 793], to prevent inappropriate delivery of old packets.

   If the header's X bit equals one, the Sequence Number field extends for another 24 bits for a total of 48. Very-high-rate connections SHOULD use these extended 48-bit sequence numbers to protect against wrapped sequence numbers; see Section 5.3.

Many packet types also carry an Acknowledgement Number in the four bytes following the generic header. Its format is as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Reserved    |            Acknowledgement Number             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Acknowledgement Number: 24 bits
   The Acknowledgement Number field acknowledges the greatest valid sequence number received so far on this connection. ("Greatest" is, of course, measured in circular sequence space.) Acknowledgement numbers make no attempt to provide precise information about which packets have arrived; options such as the Ack Vector do this.
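Because "greatest" is measured in circular sequence space, two 24-bit sequence numbers cannot be compared with plain integer ordering once the counter wraps. The Python sketch below shows one conventional way to handle this, treating a number as "after" another when it lies less than half the sequence space ahead. The half-space rule and the helper names are assumptions made for illustration; the text here does not prescribe a particular comparison.

```python
SEQ_BITS = 24              # 24-bit sequence numbers (X=0); use 48 when X=1
SEQ_MOD = 1 << SEQ_BITS    # 16777216, the modulus given in the text
HALF = SEQ_MOD // 2


def seq_after(a: int, b: int) -> bool:
    """Return True if a comes after b in circular sequence space.

    Assumed rule: a is after b when it is between 1 and HALF-1
    steps ahead of b, modulo SEQ_MOD.
    """
    return 0 < (a - b) % SEQ_MOD < HALF


def update_greatest(current: int, received: int) -> int:
    """Track a 'greatest seen so far' value such as GSR."""
    return received if seq_after(received, current) else current
```

Under this rule a received sequence number of 2 counts as greater than 16777214, since the counter has wrapped, so a GSR of 16777214 would advance to 2.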
The Acknowledgement Number on any DCCP-SyncAck packet MUST correspond to a"received" packet, where a packet is classified as "received" if and only if its options were processed byreceived, valid DCCP-Sync's Sequence Number; in thereceiving DCCP. (This means, for example, that received packets must be both header- checksum-valid and sequence-valid.) Even "received" packets may have their payloads dropped, due to receive buffer overflow or payload corruption, for instance. The HC-Receiver will send Data Dropped options whenpresence of reordering, thishappens (see Section 8.7); the HC-Sender will reduce its sending rate or congestion window as appropriate. This issue is discussed furthermight not equal GSR. The receiver MUST ignore any "application data" inSections 8.5 and 8.7. Ifa DCCP-Sync or DCCP-SyncAck packet. 5.9. Options All DCCP packets may contain options, which occupy space at theheader's X bit equals one,end of theAcknowledgement Number field extends for another 24 bits forDCCP header. Each option is atotalmultiple of48. Again, see Section 5.3. Reserved:8 bits in length. Theversioncombination ofDCCP specified here MUST ignore this field on received packets, and MUST set it toallzeroes on generated packets. 5.2. Sequence Number Synchronization DCCP implementations must reactoptions MUST add up topackets thata multiple of 32 bits. Individual options are notintended for the current connection. This can happen if the network delivers an old packet, if an attacker attemptspadded tohijack a connection, during the cleanupmultiples ofa half-open connection, or for other reasons. DCCP, like TCP, uses sequence number checks and Reset packets to defend against these packets. Every DCCP packet sent uses a new sequence number,32 bits, however;thus, given large enough burstsany option may begin on any byte boundary. All options are always included in the checksum. The first byte ofloss,an option is the option type. 
Options with types 0 through 31 are single-byte options. Other options are followed by aconnection's endpoints might get outbyte indicating the option's length. This length value includes the two bytes ofsync relative tooption-type and option-length as well as anywindow, requiring a mechanismoption-data bytes, and must therefore be greater than or equal torestore synchronization. This section describes the algorithms that determine when DCCP packetstwo. Options areintended for the current connection, andprocessed sequentially, starting at theactions taken on unintended packets. 5.2.1. Variables DCCP sequence number synchronization depends onfirst option in the packet header. The followingvariables, whichoptions aremaintained by each endpoint. Kohler/Handley/Floyd/Padhyecurrently defined: Kohler/Handley/Floyd Section5.2.1.5.9. [Page27]29] INTERNET-DRAFT Expires:AprilAugust 2004October 2003 GSS The Greatest Sequence Number Sent by this endpoint so far. ("Greatest" is of course measured in circular sequence space.) GSR The Greatest Sequence Number Received from the other endpoint so far. GAR (Optional) The Greatest Acknowledgement Number Received from the other endpoint so far. Some other variables are derived from these primitives. SWL and SWR (Sequence Number Window Left and Right) TheFebruary 2004 Option Section Type Length Meaning Reference ---- ------ ------- --------- 0 1 Padding 5.9.1 1 1 Mandatory 5.9.2 2 1 Slow Receiver 11.6 3-31 1 Reserved 32 variable Change L 6.1 33 variable Confirm L 6.2 34 variable Change R 6.1 35 variable Confirm R 6.2 36 variable Init Cookie 8.1.4 37 4-5 NDP Count 7.7 38 variable Ack Vector [Nonce 0] 11.4 39 variable Ack Vector [Nonce 1] 11.4 40 variable Data Dropped 11.7 41 6 Timestamp 13.1 42 6-10 Timestamp Echo 13.3 43 4-6 Elapsed Time 13.2 44 4 Data Checksum 9.3 45-127 variable Reserved 128-255 variable CCID-specific options 10.4 This section describes twoendpoints of the window within which Sequence Numbers are appropriate. 
AWL and AWR (Acknowledgement Number Window Leftgeneric options, Padding andRight) The two endpoints of the window within which Acknowledgement NumbersMandatory. Other options areappropriate. 5.2.2. Appropriate Sequence Numbers A sequence number S is appropriate iff SWL <= S <= SWR in circular sequence space. This resembles TCP's receive window. However, in DCCP, sequence numbers changedescribed later. 5.9.1. Padding Option The Padding option, witheach packet sent, even pure acknowledgements. Thus,type 0, is aloss event that dropped many consecutive packets could cause two DCCPssingle byte option used toget outpad between or after options. It either ensures the application data begins on a 32-bit boundary (as required), or ensures alignment ofsync relative to any window, andfollowing options (not mandatory). +--------+ |00000000| +--------+ Type=0 5.9.2. Mandatory Option The Mandatory option, with type 1, is apacket beyondsingle byte option that indicates that thewindowimmediately following option is mandatory. If the receiving DCCP does notnecessarily a hard error. DCCP-Sync packets help in this situation.understand that following option, it MUST reset the connection, generally using Reset Code 6, "Mandatory Failure". For instance, say DCCP Asets SWL and SWR toreceives aloss window of W consecutive sequence numbers containing GSR. ("Consecutive", like "greatest", is measured in circular sequence space.) One-third of the loss window, rounded down, is placed at and before GSR,packet withtwo-thirds after GSR. Sequence numbers outside this loss window are inappropriate. inapprop. | appropriate Sequence Numbers | inapprop. <---------*|*===========*======================*|*---------> GSR -|GSR + 1 - GSR GSR +|GSR + 1 + floor(W/3)|floor(W/3) ceil(2W/3)|ceil(2W/3) = SWL = SWR During connection startup,two options: a Mandatory option, and immediately following, another option O. 
Then DCCP AMUST adjust SWL so thatwould reset the connection if itisdid notless than DCCP B's initial sequence number. DCCP B informsKohler/Handley/Floyd Section 5.9.2. [Page 30] INTERNET-DRAFT Expires: August 2004 February 2004 understand O's type; if it understood O's type, but not O's data; if O's data was invalid for O's type; if O was a feature negotiation option, and DCCP Aof W,did not understand theloss window widthenclosed feature number; if DCCP Ashould use, via the Loss Window feature (Section 6.10). W defaults to 1000,understood O, butKohler/Handley/Floyd/Padhyechose not to perform the action O implies; and so forth. Section5.2.2. [Page 28] INTERNET-DRAFT Expires: April 2004 October 2003 a proper value should reflect how many packets6.6.8 describes thesender expects to bebehavior of Mandatory feature negotiation options inflight. Onlymore detail. +--------+ |00000001| +--------+ Type=1 6. Feature Negotiation Four DCCP options, Change L, Confirm L, Change R, and Confirm R, implement in-band feature negotiation. Change options initiate a negotiation; Confirm options complete that negotiation. The "L" options are sent by thesender can anticipate this number. Too- small values increasefeature location, and therisk of"R" options are sent by theendpoints getting out sync after bursts of loss; too-large values increasefeature remote. Change options are retransmitted to ensure reliability. All these options have therisksame format. The first byte ofconnection hijacking. One good guidelineoption data isto set it to about 3 or 4 timesthemaximum number of packetsfeature number, and thesender expects to sendsecond and subsequent data bytes hold one or more feature values. The feature values are generally arranged in around-trip time. This value may not be available at connection initiation, whenlinear preference list, where theround-trip timefirst value isunknown, butmost preferred. 
+--------+--------+--------+--------+--------
|  Type  | Length |Feature#| Value(s) ...
+--------+--------+--------+--------+--------

Together, the feature number and the option type ("L" or "R") uniquely identify the feature to which an option applies. The exact format of the Value(s) area depends on the feature number.

6.1. Change Options

Change L and Change R options initiate feature negotiation. Either endpoint can start a negotiation for any feature; if DCCP A wants to start a negotiation for feature F/A, it will send a Change L option, while to start a negotiation for F/B, it will send a Change R option. Change options are retransmitted until some response is received. Normal Change options contain at least one Value, and thus have length at least 4.

           +--------+--------+--------+--------+--------
Change L:  |00100000| Length |Feature#| Value(s) ...
           +--------+--------+--------+--------+--------
            Type=32

           +--------+--------+--------+--------+--------
Change R:  |00100010| Length |Feature#| Value(s) ...
           +--------+--------+--------+--------+--------
            Type=34

The endpoint may check a feature's current value without attempting to change it by sending an empty Change option, containing just the feature number. Such options have length 3. The endpoints must agree on feature values anyway, so these options are useful in practice only in special situations, such as when a middlebox introduced in the middle of a connection wants to check a feature value.

6.2. Confirm Options

Confirm L and Confirm R options complete feature negotiation, and are sent in response to Change R and Change L options, respectively. Confirm options MUST NOT be generated except in response to Change options. Confirm options need not be retransmitted, since Change options are retransmitted as necessary. Normal Confirm options contain the selected Value, possibly followed by the sender's preference list.

           +--------+--------+--------+--------+--------
Confirm L: |00100001| Length |Feature#| Value(s) ...
           +--------+--------+--------+--------+--------
            Type=33

           +--------+--------+--------+--------+--------
Confirm R: |00100011| Length |Feature#| Value(s) ...
           +--------+--------+--------+--------+--------
            Type=35

If an endpoint receives an invalid Change option (one with an unknown feature number or an invalid value), it will respond with an empty Confirm option containing no value. Such options have length 3.

6.3. Reconciliation Rules

Reconciliation rules determine how the two sets of preferences for a given feature are resolved into a unique result. The reconciliation rule depends only
on the feature number. Each reconciliation rule must have the property that the result is uniquely determined given the contents of Change options sent by the two endpoints.

All current DCCP features use one of two reconciliation rules, server-priority ("SP") and non-negotiable ("NN").

6.3.1. Server-Priority

The feature value is a fixed-length byte string (length determined by the feature number). Each Change option contains a preference list of values, with the most preferred value coming first. Each Confirm option contains the confirmed value, followed by the confirmer's preference list. Thus, the feature's current value will generally appear twice in Confirm options' data, once as the current value and once in the confirmer's preference list. Even responses to empty Change options contain the whole preference list.
To reconcile the preference lists, select the first entry in the server's list that also occurs in the client's list. If there is no shared entry, the feature's value MUST NOT change, and the Confirm option will confirm the feature's previous value (unless the Change option was Mandatory; see Section 6.6.8).

DCCP endpoints need not calculate their value preference lists before feature negotiation begins. Thus, a server might adjust its preference list based on the client's preference list, assuming the client opened the negotiation. Once a negotiation for a feature has begun, however, the preference lists MUST remain stable until the negotiation has closed.

6.3.2. Non-Negotiable

The feature value is a byte string. Each option contains exactly one feature value. The feature location signals a value change by sending Change L options. The feature remote MUST accept any valid value, responding with a Confirm R option containing the new value, and it MUST send empty Confirm R options in response to invalid values.
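The two reconciliation rules of Sections 6.3.1 and 6.3.2 can be sketched as follows. This is an illustrative sketch only: the function names and the `is_valid` callback are hypothetical, server-priority values are modelled as small integers rather than fixed-length byte strings, and the Mandatory case (Section 6.6.8) is not handled.

```python
from typing import Callable, Optional, Sequence


def reconcile_server_priority(server_prefs: Sequence[int],
                              client_prefs: Sequence[int],
                              current: int) -> int:
    """Return the first entry of the server's list that the client also lists.

    If the lists share no entry, the feature keeps its current value
    (the non-Mandatory case of Section 6.3.1).
    """
    client_set = set(client_prefs)
    for value in server_prefs:
        if value in client_set:
            return value
    return current


def confirm_non_negotiable(proposed: bytes,
                           is_valid: Callable[[bytes], bool]) -> Optional[bytes]:
    """Feature remote's answer to a Change L for a non-negotiable feature.

    Returns the value to echo in the Confirm R option, or None to signal
    that an empty Confirm R should be sent because the value was invalid.
    """
    return proposed if is_valid(proposed) else None
```

With the preference lists from the first example in Section 6.5 (server "3 2 1", client "2 3 1"), `reconcile_server_priority([3, 2, 1], [2, 3, 1], current=2)` selects 3.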
Non-negotiable features aren't really negotiated; they use feature negotiation as a mechanism for achieving reliability. Change R and Confirm L options MUST NOT be sent for non-negotiable features.

6.4. Feature Numbers

This document defines the following feature numbers.

                                         Rec'n  Initial         Section
Number   Meaning                         Rule   Value    Req'd  Reference
------   -------                         -----  -----    -----  ---------
0        Reserved
1        Congestion Control ID (CCID)    SP     2        Y      10
2        ECN Capable                     SP     1        Y      12.1
3        Sequence Window                 NN     100      Y      7.5.4
4        Sequence Transition Capable     SP     0        N      7.6.4
5        Mobility Capable                SP     0        N      14.1
6        Mobility ID                     NN     0        N      14.2
7        Ack Ratio                       NN     2        N      11.3
8        Send Ack Vector                 SP     0        N      11.5
9        Send NDP Count                  SP     0        N      7.7.2
10       Check Data Checksum             SP     0        N      9.3.1
11-127   Reserved
128-255  CCID-specific features          ?      ?        ?      10.4

Rec'n Rule
    The reconciliation rule used for the feature. SP is server-priority and NN is non-negotiable.

Initial Value
    The initial value for the feature. Every feature has a known initial value.

Req'd
    This column is "Y" iff every DCCP implementation MUST understand the
    feature. If it is "N", then the feature behaves like an extension (see Section 16), and it is safe to respond to Change options for the feature with empty Confirm options. Of course, a CCID might require the feature; a DCCP that implements CCID 2 MUST support Ack Ratio and Send Ack Vector, for example.

6.5. Examples

Here are three example feature negotiations for features located at the server, the first two for the Congestion Control ID feature, the last for the Ack Ratio:

           Client                        Server
1. Change R(CCID, 2 3 1)       -->
   ("2 3 1" is client's value preference list)
2.                             <--  Confirm L(CCID, 3, 3 2 1)
   (3 is the negotiated value; "3 2 1" is server's pref list)
          * agreement that CCID/Server = 3 *

1.                       XXX   <--  Change L(CCID, 3 2 1)
2. Retransmission:             <--  Change L(CCID, 3 2 1)
3. Confirm R(CCID, 3, 2 3 1)   -->
          * agreement that CCID/Server = 3 *

1.                             <--  Change L(Ack Ratio, 3)
2. Confirm R(Ack Ratio, 3)     -->
          * agreement that Ack Ratio/Server = 3 *

This example shows a simultaneous negotiation.

           Client                        Server
1a. Change R(CCID, 2 3 1)      -->
 b.                            <--  Change L(CCID, 3 2 1)
          (both endpoints in CHANGING)
2a.                            <--  Confirm L(CCID, 3, 3 2 1)
 b. Confirm R(CCID, 3, 2 3 1)  -->
          (both endpoints in STABLE)
          * agreement that CCID/Server = 3 *

Example Change and Confirm options follow, with their byte encodings. Each option is sent by DCCP A.

Change L(CCID, 2 3) = 32,5,1,2,3
    I want to change CCID/A's value (feature number 1, a server-priority feature); my preferred values are 2 and 3, in that preference order.
Change L(Sequence Window, 1024) = 32,6,3,0,4,0
    Change Sequence Window/A's value (feature number 3, a non-negotiable feature) to the 3-byte string 0,4,0 (the value 1024).

Empty Change L(CCID) = 32,3,1
    Tell me CCID/A's value using a Confirm R option.

Confirm L(CCID, 2, 2 3) = 33,6,1,2,2,3
    I've changed CCID/A's value to 2; my preferred values are 2 and 3, in that preference order.

Empty Confirm L(126) = 33,3,126
    I don't implement feature number 126, or your proposed value for feature 126/A was invalid.

Change R(CCID, 3 2) = 34,5,1,3,2
    Please change CCID/B's value; my preferred values are 3 and 2, in that preference order.

Empty Change R(CCID) = 34,3,1
    Tell me CCID/B's value using a Confirm L option.

Confirm R(CCID, 2, 3 2) = 35,6,1,2,3,2
    I've changed CCID/B's value to 2; my preferred values were 3 and 2, in that preference order.

Confirm R(Sequence Window, 1024) = 35,6,3,0,4,0
    I've changed Sequence Window/B's value to the 3-byte string 0,4,0 (the value 1024).

Empty Confirm R(126) = 35,3,126
    I don't implement feature number 126, or your proposed value for feature 126/B was invalid.

6.6. Option Exchange

A few basic rules govern feature negotiation option exchange.

1. Every non-reordered Change option gets a Confirm option in response.

2. Change options are retransmitted until some response is received.

3. Preference lists don't change during a negotiation.

4. Feature negotiation options are processed in strictly increasing order by Sequence Number.

The rest of
this section describes the consequences of these rules in more detail.

6.6.1. Normal Exchange

Change options are generated when a DCCP endpoint wants to change the value of some feature. Generally, this will happen at the beginning of a connection, although it
may happen at any time. We say the endpoint "generates" or "sends" a Change L or Change R option; but, of course, the option must be attached to a packet. The endpoint may attach the option to a packet it would have generated anyway (such as a DCCP-Request), or it may create a new packet just to carry the options (often a DCCP-Sync). If it does create a new packet, it MUST NOT create more than one such packet per round-trip time (or 0.2 seconds, if no RTT is available).

On receiving a Change L or Change R option, a DCCP endpoint examines the included preference list, reconciles that with its own preference list, calculates the new value, and sends back a Confirm R or Confirm L option, respectively, informing its partner of the new value. The rule for reconciling the two preference lists is feature-specific; see Section 6.3.

Every non-reordered Change option MUST result in a corresponding Confirm option. Any packet including a Confirm option MUST carry an
Acknowledgement Number; thus, Confirm options are not allowed on DCCP-Request and DCCP-Data packets. Again, generated Confirm options may be attached to packets that would have been sent anyway (such as DCCP-Response or DCCP-SyncAck), or to new packets (usually DCCP-Ack).

The Change-sending endpoint MUST wait to receive a corresponding Confirm option before changing its stored feature value. The Confirm-sending endpoint changes its stored feature value as soon as it sends the Confirm.

DCCP endpoints effectively exist in one of two states, STABLE and CHANGING, relative to each feature. STABLE is the normal state, where the endpoint knows the feature's value and thinks the other endpoint agrees. An endpoint enters the CHANGING state when it first sends a Change for the feature, and returns to STABLE once it receives a corresponding Confirm.

6.6.2. Loss and Retransmission

Packets containing Change and Confirm options might be lost or delayed by the network. Therefore, Change options are retransmitted to achieve reliability. A CHANGING endpoint retransmits a Change option once it realizes that it has not heard back from the other endpoint. Each retransmitted Change option MUST contain exactly the same payload as the original.

The endpoint may piggyback its Change options on packets it would have sent anyway. If it generates new packets for feature negotiation, it MUST use an exponential-backoff timer. The timer's initial value is set to approximately one or two round-trip times (or 0.2-0.4 seconds, if no RTT is available), and it is pinned at roughly 32 RTTs.
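The retransmission timer described above might be sketched as a generator of successive wait intervals. Several details here are assumptions rather than requirements of the text: the function name is hypothetical, the initial interval is taken as two RTTs (the upper end of "one or two"), the backoff factor is 2, and 0.2 seconds stands in for an unknown RTT.

```python
from typing import Iterator, Optional


def change_retransmit_intervals(rtt: Optional[float] = None) -> Iterator[float]:
    """Yield successive waits (in seconds) between retransmitted Change options."""
    base = rtt if rtt is not None else 0.2  # assumed stand-in when no RTT is known
    interval = 2 * base                     # "one or two round-trip times"; 2 chosen here
    cap = 32 * base                         # backoff is pinned at roughly 32 RTTs
    while True:
        yield interval
        interval = min(2 * interval, cap)   # assumed doubling backoff
```

With an RTT of 0.5 seconds, this sketch yields waits of 1, 2, 4, 8, and 16 seconds, and then stays at 16 seconds.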
A CHANGING endpoint MUST continue retransmitting Change options until it gets some response. If no response arrives, its only recourse is to reset the connection, which it SHOULD NOT do until at least 12 transmissions have failed.

Change options SHOULD NOT be transmitted more frequently than once per RTT, or the reordering protection below would prevent any Confirm option from being accepted (since no Confirm would acknowledge the most recently transmitted Change).

Confirm options are never retransmitted, but the Confirm-sending endpoint MUST generate a new Confirm option for every non-reordered Change it receives.

6.6.3. Reordering

Reordering might cause packets containing Change and Confirm options to arrive in an unexpected order. Endpoints MUST be robust to reordering, by ignoring feature negotiation options that do not arrive in strictly-increasing order by Sequence Number. The most straightforward way to implement this requirement is for an endpoint to associate two sequence number variables with every feature F/X, as follows.

F/X.GSR
    The Greatest Sequence Number Received from the other endpoint on a packet containing a Change or Confirm option for feature F/X.

F/X.GSS
    The Greatest Sequence Number
    Sent by this endpoint on a packet containing a Change option for feature F/X.

Then DCCP A will check options relating to feature F/A as follows:

1. Ignore any received Change R(F) option whose packet's Sequence Number is not greater than F/A.GSR.

2. Ignore any received Confirm R(F) option whose packet's Sequence Number is not greater than F/A.GSR, or whose packet could not have acknowledged F/A.GSS. Specifically, if the Acknowledgement Number is less than F/A.GSS, the endpoint MUST ignore the Confirm; and if the packet has an Ack Vector indicating that F/A.GSS was not received, the endpoint MAY ignore the Confirm.

A similar procedure applies to options relating to feature F/B, namely Change L(F) and Confirm L(F), except that F/B.GSR and F/B.GSS are checked.

A less state-intensive way to implement this requirement would be to share the F.GSR and F.GSS variables among all features, rather than keeping one pair per feature. Then the feature negotiation options on any received packet would be treated as a unit (either all accepted or all rejected).
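The per-feature reordering checks can be sketched with a small state object. This is a hedged illustration, not a normative algorithm: the class and method names are hypothetical, 24-bit circular sequence arithmetic is assumed, a Confirm that arrives when no Change has been sent is ignored by assumption, and the optional Ack Vector check is left out.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_BITS = 24              # assumed: short (24-bit) sequence numbers
SEQ_MOD = 1 << SEQ_BITS


def seq_gt(a: int, b: int) -> bool:
    """True if a is later than b in circular sequence space."""
    return a != b and ((a - b) % SEQ_MOD) < SEQ_MOD // 2


@dataclass
class FeatureSeqState:
    gsr: Optional[int] = None  # F/X.GSR; None until an option has been accepted
    gss: Optional[int] = None  # F/X.GSS; None until a Change has been sent

    def note_change_sent(self, seqno: int) -> None:
        """Record the Sequence Number of a packet carrying our Change option."""
        self.gss = seqno

    def accept_change(self, seqno: int) -> bool:
        """Rule 1: ignore a Change whose Sequence Number is not greater than GSR."""
        if self.gsr is not None and not seq_gt(seqno, self.gsr):
            return False
        self.gsr = seqno
        return True

    def accept_confirm(self, seqno: int, ackno: int) -> bool:
        """Rule 2: also require that the packet could have acknowledged GSS."""
        if self.gsr is not None and not seq_gt(seqno, self.gsr):
            return False
        if self.gss is None or seq_gt(self.gss, ackno):
            return False  # Acknowledgement Number is less than GSS (or no Change sent)
        self.gsr = seqno
        return True
```

An endpoint following the less state-intensive variant would keep a single `FeatureSeqState` for the whole connection instead of one per feature.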
Checking Confirm options is easier if the endpoint only sends Change options on packet types that will be acknowledged immediately, namely DCCP-Request, DCCP-Response, and DCCP-Sync. Then there is never any need to check Ack Vectors, although checking Ack Vectors is not mandatory anyway.

6.6.4. Preference Changes

Endpoints MUST NOT change their preference lists in the middle of a negotiation. This is because, if a preference list changed in the middle of a negotiation and the right packets were lost, the negotiation could terminate with the endpoints thinking the feature had different values.

In particular, an endpoint MUST NOT change its preference list while in the CHANGING state; this ensures that every Change option sent during that negotiation will contain the same data.

6.6.5. Simultaneous Negotiation

The two endpoints might simultaneously open negotiation for the same feature, after which an endpoint in the CHANGING state will receive a Change option for the same feature. Such received Change options can act as responses to the original Change options. The CHANGING endpoint MUST examine the received Change's preference list, reconcile that with
its own preference list (as expressed in its generated Change options), and generate the corresponding Confirm option. It can then transition to the STABLE state.

6.6.6. Unknown Features

An endpoint may receive a Change option referring to some feature number it does not understand. This is particularly likely to happen when an extended DCCP converses with a non-extended DCCP. The receiving endpoint MUST respond to such Change options with corresponding empty Confirm options (that is, Confirm options containing only the feature number and no value), which inform the CHANGING endpoint that the feature was not understood. However, if the Change option was preceded by a Mandatory option, the connection MUST be reset; see Section 6.6.8.

On receiving an empty Confirm option for some feature, the CHANGING endpoint MUST transition back to the STABLE state, leaving the feature's value unchanged. Section 16 suggests that the default value for any extension feature should correspond to "extension not available".

An endpoint will also send an
empty Confirm option when it understood the Change's feature number, but considered the Change's value invalid or inappropriate for the feature. The next section describes this further.

Some features are required to be understood by all DCCPs (see Section 6.4); the CHANGING endpoint SHOULD reset the connection (with Reset Code 5, "Option Error") if it receives an empty Confirm option for such a feature.

Since Confirm options are generated only in response to Change options, an endpoint should never receive a Confirm option referring to a feature number it does not understand.
Endpoints MUST either reset the connection on receiving such options, or just ignore the options.

6.6.7. Invalid Options

A DCCP endpoint might receive a Change or Confirm option that lists one or more values that it does not understand. Some, but not all, such options are invalid, depending on the relevant reconciliation rule (Section 6.3). For instance:

o  All features have length limitations, and options with invalid lengths are invalid. For example, the Mobility ID feature takes 128-bit values, so valid "Confirm R(Mobility ID)" options have option length 19.

o  Some non-negotiable features have value limitations. The Ack Ratio feature takes two-byte, non-zero integer values, so a "Change L(Ack Ratio, 0)" option is never valid. Note that server-priority features do not have value limitations, since unknown values are handled as a matter of course.

o  Any Confirm option that selects the wrong value, based on the two preference lists and the
[Page 36] INTERNET-DRAFT Expires: April 2004 October 2003 package all state for a requested connection intorelevant reconciliation rule, is invalid. An endpoint receiving an invalid Change option MUST respond with the corresponding empty Confirm option. An endpoint receiving an invalid Confirm option MUST reset the connection, with Reset Code 5, "Option Error". 6.6.8. Mandatory Feature Negotiation Change options may be preceded by Mandatory options (Section 5.9.2). Mandatory Change options are processed like normal Change options, except thatthe clientvarious failure cases willecho. A server with Init Cookie need not implementcause theRESPOND state. Instead, it may replyreceiver toeach DCCP-Request packetreset the connection with Reset Code 6, "Mandatory Failure", rather than send aDCCP-Response containing an Init Cookie. When a DCCP-Data, Ack, or DataAck packet carrying a valid Init Cookie arrives fromConfirm option. Specifically, theclient,connection MUST be reset if: o The Change option's feature number was not understood; Kohler/Handley/Floyd Section 6.6.8. [Page 40] INTERNET-DRAFT Expires: August 2004 February 2004 o The Change option's value was invalid, and theserver will move directly from LISTEN to OPEN. Like TCP SYN cookies [SYNCOOKIES], Init Cookies let servers avoid keeping any state for clients whose addressesreceiver would normally havenot been verified. A DCCP endpointsent an empty Confirm option inthe CLOSEDresponse; orLISTEN state may not have a proper sequence number available to send a Reset. In these cases, it MUST seto For server-priority features, there was no shared entry in theReset's Sequence Numbertwo endpoints' preference lists. There's no reason tozero. Resetsmark Confirm options as Mandatory in this version of DCCP, since Confirm options are sent only inthe CLOSED, LISTEN,response to Change options andTIME-WAIT states SHOULD use Reset Reason "No Connection"; other Resets SHOULD use Reason "Invalid Packet". 
A DCCP MAY send Resets not listed intherefore can't mention potentially-invalid values or unexpected feature numbers. 6.6.9. Out-of-Band Agreement An endpoint MUST NOT unilaterally change thediagram if it detects an inconsistency---for example, if it receives twovalue of any DCCPpackets with the same sequence number, but different packet types. The Open state does not signify that afeature. However, endpoints MAY cooperatively change DCCPconnection is ready for data transfer. In particular, incompletefeaturenegotiations might prevent data transfer. Featurevalues without using in-band feature negotiationtakes place in parallel with the state transitions on this diagram. Only the server may take the transition from the OPENoptions---by using a separate signalling channel, for example. 6.6.10. State Diagram This diagram illustrates feature-related stateto the CLOSEREQ state. (The server istransitions, ignoring sequence number and option validity issues, for theDCCPendpoint thatbegan in the LISTEN state.) Similarly, only the client must transition to CLOSE after receiving a CloseReq packet. 5.5. DCCP-Request Packet Format A DCCP connectionisinitiated by sending a DCCP-Request packet. The format ofthe feature location. For aDCCP request packet is: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / Generic DCCP Header (12 or 16 bytes) / / with Type=0 (DCCP-Request) / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+feature remote state transition diagram, switch the "L"s and "R"s. 
    rcv Confirm R      app/protocol evt : snd Change L
    : ignore    +-------------------------------------------+
      +----+    |                                           |
      |    v    |                  rcv Change R             v
   +------------+  rcv Confirm R   : calc new value, +------------+
   |            |  : accept value    snd Confirm L   |            |
   |   STABLE   |<-----------------------------------|  CHANGING  |
   |            |        rcv empty Confirm R         |            |
   +------------+        : revert to old value       +------------+
       |    ^                                            |    ^
       +----+                                            +----+
    rcv Change R                                timeout/rcv non-ack
    : calc new value, snd Confirm L             : snd Change L

This state diagram corresponds to the following procedure for reacting to received packets with feature negotiation options. The procedure refers to "P.seqno", "P.ackno", "P.optiontype", and "P.optionlen", which are properties of the packet; "F.GSR" and "F.GSS", which are the variables mentioned in Section 6.6.3; "F.state", which is the feature's state (STABLE or CHANGING); and "F.value", which is the feature's value.

   If F.state == STABLE:
      If P.optiontype == Change R && P.seqno > F.GSR:
         Calculate new value
         Send Confirm L on next packet
         F.GSR := P.seqno
      Otherwise:
         Ignore option
   If F.state == CHANGING:
      If P.optiontype == Confirm R && P.ackno >= F.GSS
            && P potentially acknowledges F.GSS:
         If P.optionlen == 3:   /* empty Confirm R option */
            Retain old value
         Otherwise:
            Check new value
            F.value := new value
         F.state := STABLE
      Otherwise, if P.optiontype == Change R && P.seqno > F.GSR:
         Calculate new value
         Send Confirm L on next packet
         F.GSR := P.seqno
      Otherwise:
         Ignore option

7. Sequence Numbers

DCCP uses 24- or 48-bit sequence numbers to arrange packets into sequence, detect losses and network duplicates, and protect against attackers, half-open connections, and the delivery of very old packets. Every packet carries a Sequence Number; most packet types carry an Acknowledgement Number as well.

DCCP sequence numbers are per-packet. Thus, each endpoint increments the DCCP Sequence Number field by one (modulo 2^24 or 2^48) with every packet sent. Even DCCP-Ack and DCCP-Sync packets, and other packets that don't carry user data, increment the Sequence Number. Since DCCP is an unreliable protocol, there are no true retransmissions; but effective retransmissions, such as retransmissions of DCCP-Request packets, also increment the Sequence Number. This lets DCCP implementations detect network duplication, retransmissions, and acknowledgement loss, and is a significant departure from TCP practice.

7.1. Variables

DCCP endpoints maintain a set of sequence number variables for each connection.
ISS   The Initial Sequence Number Sent by this endpoint. This equals the Sequence Number of the first DCCP-Request or DCCP-Response sent.

ISR   The Initial Sequence Number Received from the other endpoint. This equals the Sequence Number of the first DCCP-Request or DCCP-Response received.

GSS   The Greatest Sequence Number Sent by this endpoint. ("Greatest" is of course measured in circular sequence space.)

GSR   The Greatest Sequence Number Received from the other endpoint on an acknowledgeable packet. (Section 7.4 defines "acknowledgeable" packets.)

GAR   The Greatest Acknowledgement Number Received from the other endpoint on an acknowledgeable packet.

Some other variables are derived from these primitives.

SWL and SWH (Sequence Number Window Low and High)
      The extremes of the validity window for received packets' Sequence Numbers.

AWL and AWH (Acknowledgement Number Window Low and High)
      The extremes of the validity window for received packets' Acknowledgement Numbers.

7.2. Initial Sequence Numbers

The endpoints' initial sequence numbers are set by the first DCCP-Request and DCCP-Response packets sent. Initial sequence numbers MUST be chosen to avoid two problems:

o  Delivery of old packets, where packets lingering in the network from an old connection are delivered to a new connection with the same addresses and port numbers.

o  Sequence number attacks, where an attacker can guess the sequence numbers that a future connection would use [M85].

DCCP implementations may use TCP's strategies for avoiding these problems [RFC 793] [RFC 1948]. To address the first problem, an implementation MUST ensure that the initial sequence number for a given <source address, source port, destination address, destination port> 4-tuple doesn't overlap with recent sequence numbers on connections with the same 4-tuple ("recent" meaning sent within 2 maximum segment lifetimes). If the implementation has state for a recent connection with the same 4-tuple, it can simply pick a good initial sequence number; otherwise, it could tie initial sequence number selection to some clock, such as the 4-microsecond clock used by TCP [RFC 793].

To address the second problem, an implementation MUST provide each 4-tuple with an independent initial sequence number space; then an attacker can't learn anything about anyone else's initial sequence numbers. RFC 1948 achieves this by adding a cryptographic hash, of the 4-tuple and a secret, to any initial sequence number. For the secret, RFC 1948 recommends a combination of some truly-random data [RFC 1750], an administratively-installed passphrase, the endpoint's IP address, and the endpoint's boot time, but truly-random data is sufficient. Care should be taken when changing the secret; such a change alters all initial sequence number spaces, which might make an initial sequence number for some 4-tuple equal a recently sent sequence number for the same 4-tuple. To avoid this problem around such a change, the endpoint might remember dead connection state for each 4-tuple or stay quiet for 2 maximum segment lifetimes.

7.3. Quiet Time

DCCP endpoints, like TCP endpoints, must take care before initiating connections when they boot. In particular, they MUST NOT send packets whose sequence numbers are close to the sequence numbers of packets lingering in the network from before the boot. The simplest way to enforce this rule is for DCCP endpoints to avoid sending any packets until one maximum segment lifetime (2 minutes) after boot. Other enforcement mechanisms include remembering recent sequence numbers across boots, or reserving the upper 8 or so bits of initial sequence numbers for a persistent boot counter that decrements by two each boot (this would require the use of extended sequence numbers).

7.4. Acknowledgement Numbers

DCCP has no cumulative acknowledgement field; cumulative acknowledgements would be meaningless in an unreliable protocol. Therefore, the Acknowledgement Number field has a different meaning in DCCP than in TCP.

A packet is classified as "acknowledgeable" if and only if its options were processed by the receiving DCCP. This means, for example, that all acknowledgeable packets have valid header checksums and sequence numbers. The Acknowledgement Number for most packet types MUST equal GSR, the Greatest Sequence Number Received on an acknowledgeable packet.

Note that "acknowledgeable" refers to option processing, not data processing. Even acknowledgeable packets may have their application data dropped, due to receive buffer overflow or corruption, for instance. Data Dropped options report these data losses when necessary, letting congestion control mechanisms distinguish between network losses and endpoint losses. This issue is discussed further in Sections 11.4 and 11.7.

DCCP-Sync and DCCP-SyncAck packets are a special case to this rule. The Acknowledgement Number on a DCCP-Sync packet corresponds to a received packet, but not necessarily an acknowledgeable packet; in particular, it might correspond to an out-of-sync packet whose options were not processed. The Acknowledgement Number on a DCCP-SyncAck packet always corresponds to an acknowledgeable DCCP-Sync packet; if there was reordering, that Acknowledgement Number might be less than GSR.

7.5. Validity and Synchronization

Any DCCP endpoint might receive packets that are not actually part of the current connection. For instance, the network might deliver an old packet, an attacker might attempt to hijack a connection, or the other endpoint might crash, causing a half-open connection.
DCCP, like TCP, uses sequence number checks to detect these cases. Packets whose Sequence and/or Acknowledgement Numbers are out of range are called sequence-invalid, and are not processed normally.

Unlike TCP, DCCP requires a synchronization mechanism to recover from large bursts of loss. One endpoint might send so many packets during a burst of loss that when one of its packets finally got through, the other endpoint would label its Sequence Number as invalid. A handshake involving DCCP-Sync and DCCP-SyncAck packets recovers from this case.

7.5.1. Sequence-Validity Rules

Sequence-validity depends on the received packet's type. This table shows the sequence and acknowledgement number checks applied to each packet; a packet is sequence-valid if it passes both tests, and sequence-invalid if it does not. Many of the checks refer to the sequence and acknowledgement number windows, [SWL, SWH] and [AWL, AWH], defined below in Section 7.5.3.

                                               Acknowledgement Number
   Packet Type     Sequence Number Check       Check
   -----------     ---------------------       ----------------------
   DCCP-Request    SWL <= seqno <= SWH (*)     N/A
   DCCP-Response   SWL <= seqno <= SWH (*)     AWL <= ackno <= AWH
   DCCP-Data       SWL <= seqno <= SWH         N/A
   DCCP-Ack        SWL <= seqno <= SWH         AWL <= ackno <= AWH
   DCCP-DataAck    SWL <= seqno <= SWH         AWL <= ackno <= AWH
   DCCP-CloseReq   SWL <= seqno <= SWH         AWL <= ackno <= AWH
   DCCP-Close      SWL <= seqno <= SWH         AWL <= ackno <= AWH
   DCCP-Reset      seqno == 0 or
                   seqno > GSR                 GAR <= ackno <= AWH
   DCCP-Move       seqno >= SWL                ISS <= ackno <= AWH
   DCCP-Sync       seqno >= SWL                AWL <= ackno <= AWH
   DCCP-SyncAck    seqno >= SWL                AWL <= ackno <= AWH

   (*) Check not applied if connection is in LISTEN or REQUEST state.

In general, packets are sequence-valid if their Sequence and Acknowledgement Numbers lie within the corresponding valid windows, [SWL, SWH] and [AWL, AWH]. The exceptions to this rule are as follows:

o  DCCP-Reset Sequence Numbers may be zero. This is because during the cleanup of a half-open connection, an endpoint might generate a DCCP-Reset in response to a DCCP-Request or DCCP-Data packet with no Acknowledgement Number; the resetting endpoint would then use zero for the Reset's Sequence Number, since it has no valid Sequence Number available.

   DCCP-Reset Acknowledgement Numbers, and non-zero Sequence Numbers, are checked more stringently than those on other packet types, however. This is because DCCP-Reset always ends a connection: no endpoint will send a non-Reset packet on a connection after it has sent a Reset. Thus, a Reset packet whose Sequence Number is less than GSR, or whose Acknowledgement Number is less than GAR, must be sequence-invalid.

o  DCCP-Move Sequence and Acknowledgement Numbers are not strongly checked because moves are likely to happen after long loss periods, and the mandatory Mobility ID provides good protection against unexpected packets.

o  DCCP-Sync and DCCP-SyncAck Sequence Numbers are not strongly checked. These packet types exist specifically to get the endpoints back into sync after bursts of loss; checking their Sequence Numbers would eliminate their usefulness.

These lenient checks all allow continued operation after unusual events, such as endpoint crashes and large bursts of loss. There's no need for leniency when the endpoints are actively sending packets to one another. Therefore, a DCCP endpoint SHOULD implement the following, tighter constraints for active connections. An endpoint considers a connection active if it has received valid packets from the other endpoint within the last several round-trip times, or 1 second, if the RTT is not known.

                                               Acknowledgement Number
   Packet Type     Sequence Number Check       Check
   -----------     ---------------------       ----------------------
   DCCP-Reset      GSR < seqno <= SWH          GAR <= ackno <= AWH
   DCCP-Move       SWL <= seqno <= SWH         AWL <= ackno <= AWH
   DCCP-Sync       SWL <= seqno <= SWH         AWL <= ackno <= AWH
   DCCP-SyncAck    SWL <= seqno <= SWH         AWL <= ackno <= AWH

Note that sequence-validity is only one of the validity checks applied to received packets.

7.5.2. Handling Sequence-Invalid Packets

Sequence-invalid DCCP-Move, DCCP-Reset, DCCP-Sync, and DCCP-SyncAck packets MUST be ignored.

When DCCP A receives any other sequence-invalid packet, it MUST reply with a DCCP-Sync packet. This packet MUST acknowledge the packet's Sequence Number (not GSR!). The DCCP-Sync MUST use a new Sequence Number, and thus will increase GSS; GSR will not change, however, since the received packet was sequence-invalid. DCCP A MUST NOT otherwise process sequence-invalid packets. For instance, it MUST NOT process their options.

When the DCCP B endpoint receives the (sequence-valid) DCCP-Sync, it MUST update its GSR variable and reply with a DCCP-SyncAck packet acknowledging the DCCP-Sync (not necessarily GSR!). Upon receiving this DCCP-SyncAck, which will be sequence-valid since it acknowledges the DCCP-Sync, DCCP A will update its GSR variable, and the endpoints will be back in sync. Alternatively, if the connection was half-open (DCCP B is in CLOSED or REQUEST state), DCCP B will send a Reset.

A DCCP endpoint MAY temporarily preserve sequence-invalid packets in case they become valid later. This can reduce the impact of bursts of loss by delivering more packets to the application. In particular, an endpoint MAY preserve a sequence-invalid packet for up to 2 round-trip times (or 1 second, if the RTT is unknown); if, within that time, the relevant sequence windows change so that the packet becomes sequence-valid, the endpoint MAY process the packet again.

To protect itself against denial-of-service attacks (where an attacker sends many sequence-invalid packets, trying to force the receiver to send many DCCP-Syncs), a DCCP implementation MAY rate-limit the DCCP-Syncs sent in response to sequence-invalid packets.

7.5.3. Sequence and Acknowledgement Number Windows

Each DCCP endpoint defines sequence validity windows that are subsets of the Sequence and Acknowledgement Number spaces. These windows correspond to packets the endpoint expects to receive in the next few round-trip times. The Sequence and Acknowledgement Number windows always contain GSR and GSS, respectively; the window widths are controlled by Sequence Window features.

The Sequence Number validity window for packets from DCCP B is [SWL, SWH]. This window always contains GSR, the Greatest Sequence Number Received on a sequence-valid packet from DCCP B. It is W packets wide, where W is the value of the Sequence Window/B feature. One-fourth of the sequence window, rounded down, is placed at and before GSR, with three-fourths after GSR. (This asymmetric placement assumes that bursts of loss are more common in the network than significant reordering.)

     invalid  |       valid Sequence Numbers        |  invalid
   <---------*|*===========*=======================*|*--------->
          GSR -|GSR + 1 -   GSR                 GSR +|GSR + 1 +
     floor(W/4)|floor(W/4)                 ceil(3W/4)|ceil(3W/4)
                = SWL                          = SWH

The Acknowledgement Number validity window for packets from DCCP B is [AWL, AWH]. The high end of the window, AWH, always equals GSS, the Greatest Sequence Number Sent by DCCP A; the window is W' packets wide, where W' is the value of the Sequence Window/A feature.

     invalid  |    valid Acknowledgement Numbers    |  invalid
   <---------*|*===================================*|*--------->
       GSS - W'|GSS + 1 - W'                      GSS|GSS + 1
                = AWL                           = AWH

SWL and AWL are initially adjusted so that they don't go below the initial Sequence Numbers received and sent, respectively:

   SWL := max(GSR + 1 - floor(W/4), ISR),
   AWL := max(GSS - W' + 1, ISS).

Of course, these adjustments MUST NOT be applied after the relevant sequence numbers wrap.

7.5.4. Sequence Window Feature

The Sequence Window/A feature determines the width of the Sequence Number validity window used by DCCP B, and the width of the Acknowledgement Number validity window used by DCCP A. DCCP A sends a "Change L(Sequence Window, W)" option to notify DCCP B that the Sequence Window/A value is W.

Sequence Window has feature number 3, and is non-negotiable. It takes 3- or 6-byte integer values, like DCCP sequence numbers.
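The window definitions of Section 7.5.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (SeqState, seq_window, ack_window, sequence_valid) are hypothetical, sequence numbers are treated as plain integers that have not yet wrapped (so the initial ISR/ISS adjustments still apply), and only the general [SWL, SWH] and [AWL, AWH] checks are modelled, not the special cases for DCCP-Reset, DCCP-Move, DCCP-Sync, and DCCP-SyncAck.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SeqState:
    """Hypothetical per-connection variables held by DCCP A."""
    gsr: int      # GSR: Greatest Sequence Number Received
    gss: int      # GSS: Greatest Sequence Number Sent
    isr: int      # ISR: Initial Sequence Number Received
    iss: int      # ISS: Initial Sequence Number Sent
    w: int        # W:  value of the Sequence Window/B feature
    w_prime: int  # W': value of the Sequence Window/A feature


def seq_window(s: SeqState) -> Tuple[int, int]:
    """Return (SWL, SWH); assumes sequence numbers have not wrapped."""
    swl = max(s.gsr + 1 - s.w // 4, s.isr)   # floor(W/4) at and before GSR
    swh = s.gsr + (3 * s.w + 3) // 4         # integer ceil(3W/4) after GSR
    return swl, swh


def ack_window(s: SeqState) -> Tuple[int, int]:
    """Return (AWL, AWH); AWH always equals GSS."""
    awl = max(s.gss - s.w_prime + 1, s.iss)
    return awl, s.gss


def sequence_valid(s: SeqState, seqno: int,
                   ackno: Optional[int] = None) -> bool:
    """General window check; pass ackno=None for packets without one."""
    swl, swh = seq_window(s)
    if not swl <= seqno <= swh:
        return False
    if ackno is None:
        return True
    awl, awh = ack_window(s)
    return awl <= ackno <= awh
```

With GSR = 1000 and W = 100, this places 25 sequence numbers at and before GSR and 75 after it, so the window is exactly W packets wide. A real implementation would instead compare values in circular sequence space.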
Change and Confirm options for Sequence Window are therefore either 6 or 9 bytes long. New connections start with Sequence Window 100 for both endpoints.

A proper Sequence Window/A value should reflect how many packets DCCP A expects to be in flight. Only DCCP A can anticipate this number. Too-small values increase the risk of the endpoints getting out of sync after bursts of loss; too-large values increase the risk of connection hijacking. (The next section quantifies this risk.) One good guideline is for each endpoint to set Sequence Window to a small multiple of the maximum number of packets it expects to send in a round-trip time. This value may not be available at connection initiation, when the round-trip time is unknown, but the endpoint can always send updates as the connection progresses.

7.5.5. Sequence Number Attacks

Sequence and Acknowledgement Numbers form DCCP's main line of defense against attackers. An attacker that cannot guess sequence numbers cannot easily manipulate or hijack a DCCP connection, and requirements like careful initial sequence number choice eliminate the most serious attacks.

An attacker might still send many packets with randomly chosen Sequence and Acknowledgement Numbers, however. If one of those probes ends up sequence-valid, it may shut down the connection or otherwise cause problems. The easiest such attacks to execute are:

o  Send DCCP-Sync packets with random Sequence and Acknowledgement Numbers. If one of these packets hits the valid acknowledgement number window, the receiver will shift its sequence number window accordingly, getting out of sync with the correct endpoint, perhaps permanently.

o  Send DCCP-Reset packets with Sequence Number zero and random Acknowledgement Numbers. If one of these packets hits the valid acknowledgement number window, the connection will be shut down.

o  Send DCCP-Data packets with random Sequence Numbers. If one of these packets hits the valid sequence number window, the attack packet's application data may be inserted into the data stream.

The attacker has to guess both Source and Destination Ports for any of these attacks to succeed. Additionally, the connection would have to be inactive for the DCCP-Sync and DCCP-Reset packets to succeed, assuming the victim implemented the more stringent checks for active connections recommended in Section 7.5.1.

To quantify the probability of success, let N be the number of attack packets the attacker is willing to send, W be the relevant sequence window width, and L be the length of sequence numbers (24 or 48). The attacker's best strategy is to space the attack packets evenly over sequence space.
draft-ietf-dccp-spec-05.txt:

with DCCP-Reset or DCCP-Ack packets, since any such response would leak information about the connection, such as the current sequence number, to a possibly malicious host.

After receiving an invalid DCCP-Move, DCCP B MAY ignore subsequent DCCP-Move packets, valid or not, for a short period of time, such as one second or one round-trip time. This protects DCCP B against denial-of-service attacks from floods of invalid DCCP-Moves.

DCCP-Move packets do not follow the usual sequence-validity rules. This is to support endpoints that react to long bursts of loss by moving. Such moves will often happen after the endpoints get out of sync, causing DCCP-Move packets to frequently have inappropriate Sequence Numbers. But the usual DCCP-Sync mechanism is inappropriate in response

draft-ietf-dccp-spec-06.txt:

Then one of these attacks will succeed with probability P = WN/2^L. For N = 1000, W = 100, and L = 24, this probability is about 0.006. (For reference, the easiest TCP attack---sending a SYN with a random sequence number, which will cause a connection reset if it falls within the window---will succeed with probability 0.002 for N = 1000, W = 8760 [a common default], and L = 32.) Connections with sequence windows much larger than 100 SHOULD use extended sequence numbers to reduce the probability of attack success.

7.5.6. Examples

In the following example, DCCP A and DCCP B recover from a large burst of loss that runs DCCP A's sequence numbers out of DCCP B's appropriate sequence number window.

   Recovery from Burst of Loss

   DCCP A                                           DCCP B
   (GSS=1,GSR=10)                                   (GSS=10,GSR=1)
           -->   DCCP-Data(seq 2)              XXX
                       ...
           -->   DCCP-Data(seq 100)            XXX
           -->   DCCP-Data(seq 101)            -->   ???
                                                     seqno out of range;
                                                     send Sync
     OK    <--   DCCP-Sync(seq 11, ack 101)    <--
                                                    (GSS=11,GSR=1)
           -->   DCCP-SyncAck(seq 102, ack 11) -->   OK
   (GSS=102,GSR=11)                                 (GSS=11,GSR=102)

In the next example, a DCCP connection recovers from a simple attack. The attacker cannot guess sequence numbers. (DCCP is not
[Page 50] INTERNET-DRAFT Expires: August 2004 February 2004 robust toMoves, since it could leakattackers who can guess sequencenumbers to possibly malicious hosts.numbers.) Recovery from Attack DCCP A DCCP BMUST set its GSR variable to the Sequence Number on(GSS=1,GSR=10) (GSS=10,GSR=1) *ATTACKER* --> DCCP-Data(seq 10^6) --> ??? seqno out of range; send Sync ??? <-- DCCP-Sync(seq 11, ack 10^6) <-- ackno out of range; ignore (GSS=1,GSR=10) (GSS=11,GSR=1) The final example demonstrates recovery from avalid DCCP-Move.half-open connection. Recovery from a Half-Open Connection DCCPB SHOULD acknowledge valid DCCP-Move packets with DCCP-Ack or DCCP-DataAck packets. IfA DCCP Baccepts the move, it MUST send this acknowledgement to the packet's network source address and DCCP Source Port; if it rejects the move, which it MAY do for any reason, it MUST send this acknowledgement to(GSS=1,GSR=10) (GSS=10,GSR=1) (Crash) CLOSED OPEN REQUEST --> DCCP-Request(seq 400) --> ??? !! <-- DCCP-Sync(seq 11, ack 400) <-- OPEN REQUEST --> DCCP-Reset(seq 401, ack 11) --> (Abort) REQUEST CLOSED REQUEST --> DCCP-Request(seq 402) --> ... 7.6. Extended Sequence Numbers Extended 48-bit sequence numbers increase theold address and old port. The moving endpoint,rate DCCPA,connections candetermine whether or not its move was accepted by checking the acknowledgement's destination addressachieve without wrapping sequence numbers, andPort. If the acknowledgement is lost, DCCP A might resendprovide additional protection against theDCCP-Move packet (using a newsequencenumber).number attacks described above. Very-high-rate DCCPB will detect this case because the network source addressconnections, andSource Port correspond to a valid connection, for whichconnections with large sequence windows, SHOULD therefore use extended sequence numbers rather than the default 24-bit sequence numbers. 7.6.1. 
When to Use Extended SequenceNumber and Acknowledgement Number fields are appropriate;Numbers The sequence-validity mechanism protects against theIdentification option is valid fornetwork delivering old data, but it assumes thatconnection; andtheMobility ID refers tonetwork does not deliver extremely old data. In particular, it assumes thatconnection. It SHOULD respond by sending another acknowledgement, as allowed bythecongestion control mechanism in use. Once DCCP B receives a non-Movenetwork must have dropped any packetfrom DCCP A, it MUST choose a new Mobility ID forby the time the connection wraps around andsend a new Change R(Mobility ID) option to DCCP A. This reduces the risk of replay. We note that DCCP mobility, as provided by DCCP-Move, may not be useful in the context of IPv6, withuses itsmandatory support for Mobile IP. 5.11. DCCP-Sync Packet Format DCCP-Sync packets are sent when thesequencenumbers ofnumber again. We can easily calculate theendpoints of amaximum connectionappear to have gotten out of sync. On receiving a valid DCCP-Sync packet, DCCP will update its GSR variable, thus restoring synchronization, and possibly send another DCCP-Sync packet to acknowledge the synchronization. DCCP-Sync packets look like this: Kohler/Handley/Floyd/Padhye Section 5.11. [Page 46] INTERNET-DRAFT Expires: April 2004 October 2003 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / Generic DCCP Header (12 or 16 bytes) / / with Type=9 (DCCP-Sync) / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Reserved | Acknowledgement Number | (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)Iff (| Acknowledgement Number (low bits) | Reserved |)X=1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options / [padding] | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 6. 
draft-ietf-dccp-spec-05.txt:

Options and Features

All DCCP packets may contain options, which occupy space at the end of the DCCP header. Each option is a multiple of 8 bits in length. The combination of all options MUST add up to a multiple of 32 bits. Individual options are not padded to multiples of 32 bits, however; any option may begin on any byte boundary. All options are always included in the checksum.

The first byte of an option is the option type. Options with types 0 through 31 are single-byte options. Other options are followed by a byte indicating the option's length. This length value includes the two bytes of option-type and option-length as well as any option-data bytes, and MUST therefore be greater than or equal to two.

Options are processed sequentially, starting at the earliest option in the packet header.

The following options are currently defined:

             Option                            Section
    Type     Length    Meaning                 Reference
    ----     ------    -------                 ---------
      0        1       Padding                 6.1
      1        1       Mandatory               6.3
      2        1       Slow Receiver           8.6
     32     variable   Ignored                 6.2
     33     variable   Change L                6.4
     34     variable   Confirm L               6.4
     35     variable   Change R                6.4
     36     variable   Confirm R               6.4
     37     variable   Init Cookie             6.6
     38     variable   Ack Vector [Nonce 0]    8.5
     39     variable   Ack Vector [Nonce 1]    8.5
     40     variable   Data Dropped            8.7
     41        6       Timestamp               6.7
     42      6-10      Timestamp Echo          6.9
     43     variable   Identification          6.5.3
     44     variable   Challenge               6.5.4
     45        4       Payload Checksum        8.8
     46       4-6      Elapsed Time            6.8
    128-255 variable   CCID-specific options   7.4

6.1. Padding Option

draft-ietf-dccp-spec-06.txt:

rate that can be safely achieved given this constraint. Let MSL equal the maximum segment lifetime, P equal the average DCCP packet size in bits, and L equal the length of sequence numbers (24 or 48 bits). Then the maximum safe rate, in bits per second, is R = P*(2^L)/2MSL.

For the default MSL of 2 minutes, 1500-byte DCCP packets, and 24-bit sequence numbers, the safe rate is therefore approximately 800 Mb/s.
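The safe-rate formula can be checked numerically. The sketch below is illustrative only: the function name max_safe_rate_bps is hypothetical, and the denominator is read as 2 * MSL (twice the maximum segment lifetime), which is the reading that reproduces the figure quoted in the text.

```python
def max_safe_rate_bps(packet_bytes, seq_bits, msl_seconds):
    """R = P * 2^L / (2 * MSL), with P the packet size in bits."""
    packet_bits = packet_bytes * 8
    return packet_bits * (2 ** seq_bits) / (2 * msl_seconds)


MSL_SECONDS = 120.0  # the default MSL of 2 minutes

rate_24 = max_safe_rate_bps(1500, 24, MSL_SECONDS)
rate_48 = max_safe_rate_bps(1500, 48, MSL_SECONDS)

print(f"24-bit sequence numbers: {rate_24 / 1e6:.0f} Mb/s")   # 839 Mb/s
print(f"48-bit sequence numbers: {rate_48 / 1e15:.1f} Pb/s")  # 14.1 Pb/s
```

The 24-bit result, about 839 Mb/s, is the value the text rounds to "approximately 800 Mb/s".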
draft-ietf-dccp-spec-05.txt:

The Padding option, with type 0, is a single byte option used to pad between or after options. It either ensures the payload begins on a 32-bit boundary (as required), or ensures alignment of following options (not mandatory).

   +--------+
   |00000000|
   +--------+
    Type=0

6.2. Ignored Option

The Ignored option, with type 32, signals that a DCCP did not understand some option. This can happen, for example, when one DCCP converses with another, extended DCCP. Each Ignored option has one or more bytes of data. The first byte contains the offending option type;

draft-ietf-dccp-spec-06.txt:

Of course, 2 minutes is a very large MSL for any networks that could sustain that rate with such small packets. Nevertheless, 48-bit sequence numbers allow much higher rates, up to 14 petabits a second for 1500-byte packets and the default MSL.

The probability of sequence number attack success P = WN/2^L, discussed in Section 7.5.5, may also be relevant when deciding whether to use extended sequence numbers. A fast connection will generally have a relatively high W (sequence window size), increasing the attack success probability for fixed N (number of attack packets); if the probability gets uncomfortably high with L = 24, the connection should use 48-bit sequence numbers instead.

7.6.2. Header Processing

Extended sequence numbers are activated when the header's X bit is set to one (see Section 5.1). This extends the Sequence Number and Acknowledgement Number fields by an additional 24 bits, for a total of 48 bits. The 48-bit numbers are stored in network order, with most significant bit first. All packet types except for DCCP-Data and DCCP-Request will follow this generic header with an extended 48-bit Acknowledgement Number.

Once an endpoint has transitioned to 48-bit sequence numbers (X=1), it MUST send all succeeding packets with 48-bit sequence numbers. Furthermore, once an endpoint has received a sequence-valid packet with 48-bit sequence numbers, it MUST either send all succeeding packets with 48-bit sequence numbers, or reset the connection with Reset Code 7, "Extended Sequence Numbers".
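The attack success probability P = WN/2^L from Section 7.5.5 is easy to evaluate when deciding between 24-bit and 48-bit sequence numbers. The sketch below is only an illustration: the function names are hypothetical, and the threshold used by prefers_extended is an arbitrary assumption, since the draft does not say what "uncomfortably high" means.

```python
def attack_success_probability(n_packets, window, seq_bits):
    """P = W * N / 2^L, capped at 1, for evenly spaced attack packets."""
    return min(1.0, window * n_packets / 2 ** seq_bits)


def prefers_extended(window, n_packets, threshold=0.01):
    """True if P with 24-bit numbers exceeds an (assumed) comfort threshold."""
    return attack_success_probability(n_packets, window, 24) > threshold


p_dccp_24 = attack_success_probability(1000, 100, 24)
p_dccp_48 = attack_success_probability(1000, 100, 48)
p_tcp_32 = attack_success_probability(1000, 8760, 32)

print(f"DCCP, L=24: {p_dccp_24:.4f}")
print(f"DCCP, L=48: {p_dccp_48:.2e}")
print(f"TCP,  L=32: {p_tcp_32:.4f}")
print(prefers_extended(window=100, n_packets=1000),
      prefers_extended(window=5000, n_packets=1000))
```

With N = 1000 and W = 100 this reproduces the draft's figures of roughly 0.006 for 24-bit DCCP sequence numbers and 0.002 for the TCP comparison case.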
(But note that an endpoint may send extended DCCP-Sync packets before transitioning to extended sequence numbers.) Clients SHOULD decide whether to use extended sequence numbers before sending their DCCP-Requests. However, thesecondTransition bit (T) andsubsequent, if present, contain the first bytes of the offending option's data. If the offending option had data, the Ignored option MUST include at least one byte ofSequence Transition Capable feature support transitioning to extended sequence numbers during an active connection, in case this proves necessary; see below. A client thatdata, butsends an extended DCCP- Request might receive a DCCP-Reset in response with Reset Code 7, "Extended Sequence Numbers"; the client SHOULD respond by sending another Request using 24-bit sequence numbers. Extended sequence numbers are treated simply as longer sequence numbers. For instance, theIgnored option MUST NOT carry more Opt Data thansequence-validity mechanisms work theoffending option had data. Kohler/Handley/Floyd/Padhyesame way whether or not sequence numbers are extended. Care is required when comparing a 24-bit sequence number with an 48-bit sequence number, however; see the next section. Kohler/Handley/Floyd Section6.2.7.6.2. [Page48]52] INTERNET-DRAFT Expires:AprilAugust 2004October 2003 Ignored options should preferably concern options sent on the packet acknowledged by the Acknowledgement Number. Packets without AcknowledgementFebruary 2004 7.6.3. Transitioning to Extended Sequence Numbers(that is, DCCP-Request and DCCP-Data) SHOULD NOT carry Ignored options. +--------+--------+--------+ |00100000|00000011|Opt Type| +--------+--------+--------+ Type=32 Length=3 +--------+--------+--------+--------+-------- |00100000| Length |Opt Type| Opt Data ... +--------+--------+--------+--------+-------- Type=32 6.3. Mandatory OptionTheMandatory option, with type 1, is a single byte option that indicates that the immediatelyTransition bit (T) followingoption is mandatory. 
Ifthereceiving DCCP does not understand that following option,extended Sequence Number field makes itMUST resetpossible to transition to 48-bit sequence numbers in theconnection with Reset Reasonmiddle of a connection. T is set to"Mandatory Failure". For instance, say DCCP A receives a packet with two options:one only during such aMandatory option, and immediately following, another option O. Thentransition. When DCCP Awould reset the connection (rather than, for example, sending an Ignored(O) option) if it did not understand O's type; ifswitches to 48-bit sequence numbers, itunderstood O's type, but not O's data; if O's data was invalid for O's type; if O was a feature negotiation option, and DCCP A did not understandMUST set theenclosed feature number; if DCCP A understood O, but chose notT bit toperform the action O implies; and so forth. +--------+ |00000001| +--------+ Type=1 6.4. Feature Negotiation DCCP contains a mechanismone on all of its packets forreliably negotiating features, notably the congestion control mechanism in usesome period. This period SHOULD last oneach half-connection. The motivation is to implement reliable feature negotiation once, so that different options need not reinvent that wheel. Features are identified by feature number and owning endpoint. The notation (F,E) representsthefeature with feature number F that is owned by DCCP E. A connection generally has two features for each Kohler/Handley/Floyd/Padhye Section 6.4. [Page 49] INTERNET-DRAFT Expires: April 2004 October 2003 feature number, one per endpoint (or, equivalently, one per half- connection). Givenorder of afeature owned by DCCP A, we callfew round trip times, or until DCCP Athe feature location andreceives an acknowledgement from DCCP Bthe feature remote. Both endpoints keep track of the values of all features, since the pointproving that one offeature negotiation isits 48-bit-sequence-number packets has been received, whichever comes later. 
Each DCCP MUST choose its first 48-bit sequence number toensure agreement. Four options, Change L, Confirm L, Change R, and Confirm R, implement feature negotiation. The "L" options are sent by the feature location, the "R" options are sent by the feature remote. Change options initiate a negotiation, Confirm options completehave its lower 24 bits equal thenegotiation. Change options are retransmitted24-bit sequence number it expected toensure reliability. Feature values MUST NOT change apart from feature negotiation.send (GSS+1). The upper 24 bits may be chosen arbitrarily. Thisproperty, retransmissions, andapplies to Acknowledgement Numbers as well as Sequence Numbers; if DCCP A sends an extended packet containing an Acknowledgement Number before DCCP B sends it a 48-bit Sequence Number, DCCP A can choose any valuepriority rules ensure that both endpoints eventually agree on every feature's value. Negotiationsformultiple features may take place simultaneously. For instance,the upper 24 bits of the Acknowledgement Number, but the lower 24 bits MUST equal the expected 24-bit Acknowledgement Number (GSR). Furthermore, DCCP A MUST leave GSR as a 24-bit number until receiving an extended packetmay contain multiple Change options that referfrom DCCP B. Switching todifferent features. The endpoints may also simultaneously open negotiations for48-bit sequence numbers in thesame feature; they will still agree onmiddle of asingle value. Feature negotiation generally takes place using packet typesconnection complicates sequence number comparison. Endpoints must compare 48-bit sequence numbers with 24-bit sequence numbers, and compare 48-bit sequence numbers thatcarry no user data, such as DCCP-Ack, particularly whenmight have different, arbitrary values in therelevant feature may affectupper 24 bits, while remaining robust to reordering and to old or malicious packets. The following procedure describes howdata willsequence numbers should betreated. 
Here are three example feature negotiations for features located atcompared during and immediately after a transition. Let P be the packet sequence number received from DCCP B, and E be thefirst twosequence number DCCP A expects. During sequence-validity computations, for example, P might be theCongestion Control ID feature,packet's Acknowledgement Number and E might be AWL, thelast forleft edge of theAck Ratio: Kohler/Handley/Floyd/Padhye Section 6.4. [Page 50] INTERNET-DRAFT Expires: April 2004 October 2003appropriate acknowledgement number window. Then DCCP ADCCP B 1. Change R(CCID, 2 3 1) ---> ("2 3 1" is DCCP A's value preference list) 2. <--- Confirm L(CCID, 3, 3 2 1) (3 isshould perform thenegotiated value; "3 2 1" is B's pref list) * agreement that (CCID,B) = 3 * 1. XXX <--- Change L(CCID, 3 2 1) 2. Retransmission: <--- Change L(CCID, 3 2 1) 3. Confirm R(CCID, 3, 2 3 1) ---> * agreement that (CCID,B) = 3 * 1. Change R(Ack Ratio, 3) ---> 2. <--- Confirm L(Ack Ratio, 3) * agreement that (Ack Ratio,B) = 3 * 6.4.1. Value Types The feature negotiation optionscomparison as follows. o If P and E arethe same for every feature number, but the format for feature values,both 24 bits, compare them modulo 2^24. o If P andthe value priority rulesE are both 48 bits, you generally compare them modulo 2^48, except thatdetermine the result ofduring anegotiation, differ from feature to feature. All current DCCP features fit one oftransition, the twovalue types, non-negotiable ("NN") or server-priority ("SP"), although other value types are possible. o Non-negotiable features: The feature valuevalues might have arbitrary values in the upper 24 bits. - If the packet's Transition bit isa byte string. Each option contains exactly one feature value. The feature remote changesset, and thevaluelast packet sent bysending Change R options. The feature location has no preferred value for the feature,DCCP A had its Transition bit set, then compare P andMUST acceptE modulo 2^24. 
draft-ietf-dccp-spec-06.txt:

      -  Otherwise, compare them modulo 2^48.

   o  If P is 48 bits but E is 24, the remote DCCP may want to transition to extended sequence numbers.

      -  If the packet's Transition bit is set, compare P with E modulo 2^24. If the packet proves sequence-valid, then it is OK; transition to extended sequence numbers, and set E according to the full 48 bits of P.

      -  Otherwise, the packet is sequence-invalid.

      Either way, if the packet proves to be sequence-invalid, send an extended DCCP-Sync if required (with T set to one), but do not yet transition to extended sequence numbers.

   o  If P is 24 bits but E is 48, there may have been benign packet reordering. The correct action depends on whether the last sequence-valid packet received from DCCP B had the Transition bit set.

      -  If Transition was set, extend P to a 48-bit value P'. First, let EH equal the upper 24 bits of E, and EL equal the lower 24 bits of E. Then:

            If EL > P, set P' = (EH << 24) | P.
            Otherwise, set P' = (((EH - 1) mod 2^24) << 24) | P.

         The "EL > P" test uses arithmetic comparison, NOT circular comparison. Compare P' with E modulo 2^48.

      -  Otherwise, the packet is sequence-invalid.

      Either way, if the packet proves to be sequence-invalid, send an extended DCCP-Sync if required, with T set to

draft-ietf-dccp-spec-05.txt:

the proposed value (as long as it is valid), responding with a Confirm L option containing the new value. Change L and Confirm R options MUST NOT be sent for non-negotiable features.

   o  Server-priority features: The feature value is a fixed-length byte string (length determined by the feature number). Each Change option contains a prioritized list of values, with the most preferred value coming first. Each Confirm option contains the confirmed value, followed by the confirmer's value preference list. The value priority rule is server priority: Given both preference lists, select the first entry in the server's list that also occurs in the client's list. If there is no shared entry, the connection MUST be reset with Reason set to Fruitless Negotiation.
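The step in the draft-06 comparison procedure that extends a 24-bit P to a 48-bit P' (the case where P is 24 bits but E is 48) can be written out directly. This is a minimal sketch: the name extend_to_48 is hypothetical, and the function covers only the extension step, not the surrounding Transition-bit checks or the final modulo-2^48 comparison.

```python
MASK24 = (1 << 24) - 1


def extend_to_48(p24, e48):
    """Extend a 24-bit sequence number P to 48 bits using the expected value E.

    EH and EL are the upper and lower 24 bits of E. The EL > P test is an
    ordinary arithmetic comparison, not a circular one.
    """
    eh = (e48 >> 24) & MASK24
    el = e48 & MASK24
    if el > p24:
        # P falls just below EL, so it shares E's upper 24 bits.
        return (eh << 24) | p24
    # Otherwise take the previous value of the upper 24 bits,
    # wrapping modulo 2^24 when EH is zero.
    return (((eh - 1) & MASK24) << 24) | p24


expected = (5 << 24) | 100
print(hex(extend_to_48(90, expected)))   # same upper bits as E
print(hex(extend_to_48(200, expected)))  # upper bits EH - 1
```

Because a reordered 24-bit packet is assumed to be older than E, a P at or above EL is placed in the previous 2^24 block rather than the current one.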
All four option types are meaningful for server- priority features. Kohler/Handley/Floyd/Padhye Section 6.4.1. [Page 51] INTERNET-DRAFT Expires: April 2004 October 2003one. DCCPendpoints need not calculate their value preference lists before feature negotiation begins. Thus, a server might adjust its preference list based on the client's preference list, assumingimplementations can, of course, avoid most of this complexity by disallowing transitions to extended sequence numbers (and by resetting theclient openedconnection when thenegotiation. Once a negotiation forother endpoint attempts such afeature has begun, however,transition). Connections thatfeature's preference lists MUST remain stable untiluse 48-bit sequence numbers throughout, starting with thenegotiation has closed. 6.4.2.DCCP-Request, MUST have T set to zero on all their packets. 7.6.4. Sequence Transition Capable FeatureNumbersThefirst data byte of every Change or Confirm option is a feature number, defining the type ofSequence Transition Capable featurebeing negotiated. The remainderexpresses whether DCCP endpoints are capable ofthe data gives one or more values for the feature, and is interpreted accordingtransitioning to extended sequence numbers in thefeature. The current setcourse offeature numbers is as follows: Value Initial Section Number Meaning Type Value Reference ------ ------- ----- ----- --------- 1 Congestion Control ID (CCID) SP 2 7 2 ECN Capable SP 1 9.1 3 Ack Ratio NN 2 8.3 4 Use Ack Vector SP 0 8.4 5 Mobility Capable SP 0 10.1 6 Loss Window NN 1000 6.10 7 Connection Nonce NN random 6.5.2 8 Identification Regime SP 1 6.5.1 9 Mobility ID NN 0 10.2 128-255 CCID-specific features ? ? 7.4 6.4.3. Change L Optionan active connection. DCCP A sends aChange LKohler/Handley/Floyd Section 7.6.4. 
[Page 54] INTERNET-DRAFT Expires: August 2004 February 2004 "Change R(Sequence Transition Capable, 1)" option to DCCP B toinitiate a negotiation for adiscover whether B can transition to extended sequence numbers. Sequence Transition Capable has featurelocated at DCCP A.number 4, and is server- priority. It takes one-byte Boolean values. DCCP BSHOULD respondMUST allow transitions toa Change option for a known feature with a Confirm R option. In special circumstances, such as a Change option whose valueextended sequence numbers when Sequence Transition Capable/B isinappropriate forone. It MUST NOT reset thelisted feature number,connection with Reset Code 7, "Extended Sequence Numbers", under those circumstances. However, DCCP B MAYrespond instead by ignoring the Change (with or without sending an Ignored option),allow such transitions even when Sequence Transition Capable/B is zero. Values of two orby resetting the connectionmore are reserved. New connections start withReason set to "Fruitless Negotiation" or "Feature Error".Sequence Transition Capable 0 (that is, not capable) for both endpoints. 7.7. NDP Count and Detecting Application Loss DCCP's sequence numbers increment by one on every packet, including non-data packets (packets that don't carry application data). This makes DCCPA SHOULD retransmitsequence numbers suitable for detecting any network loss, but not for detecting theChange L option until it receives oneloss ofthose responses. It could send at least oneapplication data. The NDP Count optionper round-trip time,reports the length of each burst of non-data packets. This lets the receiving DCCP determine, forinstance,every burst of loss, whether orit could add the Change Lnot application data was lost. +--------+--------+-------- ... --------+ |00100101| Length | NDP Count | +--------+--------+-------- ... 
--------+ Type=37 Len=3-5 If a DCCP endpoint's Send NDP Count feature is one (see below), then that endpoint MUST send an NDP Count optiontoon everyKthpacket whose immediate predecessor was a non-data packet. Non-data packets consist of DCCPA MAY resetpacket types DCCP-Ack, DCCP-Close, DCCP-CloseReq, DCCP-Reset, DCCP-Move, DCCP-Sync, and DCCP-SyncAck. All other packet types are considered data packets, although not all DCCP- Request and DCCP-Response packets will actually carry application data. The value stored in NDP Count equals theconnectionnumber of consecutive non- data packets in the run immediately previous to the current packet. Packets withReason setno NDP Count option are considered to"Fruitless Negotiation" or "Feature Error" if retransmission fails (no meaningful response is received after 10 attempts or more).have NDP Count zero. TheformatNDP Count option can carry one to three bytes of data. The smallest option format that can hold theoption's data ("Value or Values") depends on the feature's value type. Change L options are invalid for non-negotiable features. Kohler/Handley/Floyd/PadhyeNDP Count SHOULD be used. Kohler/Handley/Floyd Section6.4.3.7.7. [Page52]55] INTERNET-DRAFT Expires:AprilAugust 2004October 2003 +--------+--------+--------+--------+--------+-------- |00100001| Length |Feature#| Value or Values ... +--------+--------+--------+--------+--------+-------- Type=33 An example Change L option follows. 33,5,1,2,3 I want to change my CC feature (feature number 1, a server- priority feature); my preferred valuesFebruary 2004 7.7.1. Usage Notes Say that K consecutive sequence numbers are2 and 3,missing in some burst of loss, and the Send NDP Count feature is on. Then some application data was lost within those sequence numbers unless the packet following the hole contains an NDP Count option whose value is greater than or equal to K. For example, say thatpreference order. 6.4.4. 
Confirm L Optionthe following sequence of non-data packets (Nx) and data packets (Dx) were sent. N0 N1 D2 N3 D4 D5 N6 D7 D8 D9 D10 N11 N12 D13 Those packets would have NDP Counts as follows. N0 N1 D2 N3 D4 D5 N6 D7 D8 D9 D10 N11 N12 D13 - 1 2 - 1 - - 1 - - - - 1 2 NDP Count is not useful for applications that include their own sequence numbers with their packet headers. 7.7.2. Send NDP Count Feature The Send NDP Count feature lets DCCPs negotiate whether they should send NDP Count options on their packets. DCCP A sends aConfirm L"Change R(Send NDP Count, 1)" option to ask DCCP Bin responseto send NDP Count options. Send NDP Count has feature number 9, and is server-priority. It takes one-byte Boolean values. DCCP B MUST send NDP Count options on its non-data packets (and some of its data packets) when Send NDP Count/B is one, although it MAY send NDP Count options even when Send NDP Count/B is zero. Values of two or more are reserved. New connections start with Send NDP Count 0 for both endpoints. 8. Event Processing This section describes how DCCP connections move between states, and which packets are sent when. Note that feature negotiation takes place in parallel with the connection-wide state transitions described here. 8.1. Connection Establishment DCCP connections' initiation phase consists of avalid Change R optionthree-way handshake: an initial DCCP-Request packet sent byDCCP B. The Confirm L option will completethenegotiation forclient, afeature located at DCCP A. Confirm L need not be retransmitted, since Change R will be retransmitted as necessary. Again,DCCP-Response sent by theformat of "Value or Values" depends onserver in reply, and finally an acknowledgement from thefeature's value type. +--------+--------+--------+--------+--------+-------- |00100010| Length |Feature#| Value or Values ... +--------+--------+--------+--------+--------+-------- Type=34 Example Confirm L options follow. 
draft-ietf-dccp-spec-05.txt:

   34,6,1,2,2,3
      I have changed my CC feature (feature number 1, a server-priority feature) to value 2; my preferred values are 2 and 3, in that preference order.

   34,9,7,239,48,2,188
      I have changed my Connection Nonce feature (feature number 7, a non-negotiable feature) to the 4-byte string 239,48,2,188.

6.4.5. Change R Option

DCCP A sends a Change R option to DCCP B to initiate a negotiation for a feature located at DCCP B. The possible responses to Change R are analogous to those for Change L (Confirm L, Ignored, or Reset). As with Change L, DCCP

draft-ietf-dccp-spec-06.txt:

client, usually via a DCCP-Ack or DCCP-DataAck packet. The client moves from the REQUEST state to PARTOPEN, and finally to OPEN; the server moves from LISTEN to RESPOND, and finally to OPEN.

      Client State                             Server State
         CLOSED                                   LISTEN
   1.    REQUEST   -->       Request        -->
   2.              <--       Response       <--   RESPOND
   3.    PARTOPEN  -->     Ack, DataAck     -->
   4.              <--  Data, Ack, DataAck  <--   OPEN
   5.    OPEN      <->  Data, Ack, DataAck  <->   OPEN

8.1.1. Client Request

When a client decides to initiate a connection, it enters the REQUEST state, chooses an initial sequence number (Section 7.2), and sends a DCCP-Request packet using that sequence number to the intended server.

DCCP-Request packets will commonly carry feature negotiation options that open negotiations for various connection parameters, such as preferred congestion control IDs for each half-connection. They may also carry application data, but the client should be aware that the server may not accept such data.

A client in the REQUEST state SHOULD send new DCCP-Request packets after some timeout if no response is received. The retransmission strategy SHOULD be similar to that for retransmitting TCP SYNs; for instance, a first timeout on the order of a second, with an exponential backoff timer. Each new DCCP-Request MUST increment the Sequence Number by one, and MUST contain the same Service Code and application data as the original DCCP-Request.

A client MAY give up after some number of DCCP-Requests.
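One possible DCCP-Request retransmission schedule is sketched below. It is not a definitive implementation: the draft only asks for a first timeout on the order of a second with exponential backoff, so the doubling factor, the default of five attempts, and the names RequestAttempt and request_schedule are all assumptions made for illustration. No packets are sent; the generator only lists what each attempt would carry.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class RequestAttempt:
    seq: int            # Sequence Number for this DCCP-Request
    service_code: int   # identical on every retransmission
    payload: bytes      # identical on every retransmission
    timeout: float      # seconds to wait before the next attempt


def request_schedule(iss, service_code, payload=b"", first_timeout=1.0,
                     max_attempts=5, seq_bits=24) -> Iterator[RequestAttempt]:
    """Yield successive DCCP-Request attempts with doubling timeouts."""
    modulus = 1 << seq_bits
    for attempt in range(max_attempts):
        yield RequestAttempt(
            seq=(iss + attempt) % modulus,          # increments by one each time
            service_code=service_code,
            payload=payload,
            timeout=first_timeout * 2 ** attempt,   # assumed backoff factor of 2
        )


# 1717858426 is just an arbitrary 32-bit Service Code value for the demo.
for a in request_schedule(iss=400, service_code=1717858426, max_attempts=3):
    print(a.seq, a.timeout)
```

A caller that exhausts the schedule has reached the point where, per the text, the client MAY give up.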
If so, it SHOULD send a DCCP-Reset packet to the server with Reset Code 2, "Aborted", to clean up state in case one or more of the Requests actually arrived.

The client leaves the REQUEST state for PARTOPEN when it receives a DCCP-Response from the server.

8.1.2. Service Codes

Each DCCP-Request contains a 32-bit Service Code, which identifies the service to which the client application is trying to connect. Service Codes should correspond to application services and protocols. For example, there might be a Service Code for HTTP connections, one for FTP control connections, and one for FTP data connections. Middleboxes, such as firewalls, can use the Service Code to identify the application running on a nonstandard port (assuming the DCCP header has not been encrypted).

Endpoints MUST associate a Service Code with every DCCP socket, both actively and passively opened. The application will generally supply this Service Code.
Each active socket MUST have exactly one Service Code, while passive sockets MAY have more than one; this might let multiple applications listen on the same port, differentiated by Service Code. If the DCCP-Request's Service Code doesn't match any of the server's Service Codes for the given port, the server MUST reject the request by sending a DCCP-Reset packet with Reset Code 9, "Bad Service Code". A middlebox MAY also send such a DCCP-Reset in response to packets whose Service Code is considered unsuitable.

Service Codes should be allocated by IANA. We intend for Service Codes to be allocated to anyone who asks, first-come first-served, subject to the following guidelines.

o  Service Codes should be allocated one at a time, or in small blocks. A short English description of the intended service is required to obtain a Service Code assignment, but no specification, standards-track or otherwise, is necessary. IANA should maintain an association of Service Codes to the corresponding phrases.

o  Users may request specific Service Code values, which should be assigned first-come first-served. We suggest that users request Service Codes that can be interpreted as meaningful four-byte ASCII strings. Thus, the "Frobodyne Plotz Protocol" might correspond to "fdpz", or the number 1717858426. The canonical interpretation of a Service Code field is numeric.

o  Service Codes whose bytes each have values in
   the set {32, 45-57, 65-90} should be reserved for international standard or standards-track specifications, IETF or otherwise. (This set consists of the ASCII digits, uppercase letters, and characters space, '-', '.', and '/'.)

o  Service Codes whose high-order byte equals 63 (ASCII '?') should never be allocated. These Service Codes are reserved for private use.

o  Service Code 0 should never be allocated. It represents the absence of a meaningful Service Code.

This design for Service Code allocation is based on the allocation of 4-byte identifiers for Macintosh resources, PNG chunks, and TrueType and OpenType tables.

8.1.3. Server Response

In the second phase of the three-way handshake, the server moves from the LISTEN state to RESPOND, and sends a DCCP-Response message to the client. In this phase, a server will often specify the
features it would like to use, either from among those the client requested, or in addition to those. Among these options is the congestion control mechanism the server expects to use.

The receiver MAY respond to a DCCP-Request packet with a DCCP-Reset packet to refuse the connection. Relevant Reset Codes for refusing a connection include 8, "Connection Refused", when the
DCCP-Request's Destination Port did not correspond to a DCCP port open for listening; 9, "Bad Service Code", when the DCCP-Request's Service Code did not correspond to the service code registered with the Destination Port; and 10, "Too Busy", when the server is currently too busy to respond to requests.
The server SHOULD limit the rate at which it generates these resets.

The receiver SHOULD NOT retransmit DCCP-Response packets; the sender will retransmit the DCCP-Request if necessary. (Note that the "retransmitted" DCCP-Request will have, at least, a different sequence number from the "original" DCCP-Request; the receiver can thus distinguish true retransmissions from network duplicates.) The responder will detect that the retransmitted DCCP-Request applies to an existing connection because of its Source and Destination Ports. Every valid DCCP-Request received while the server is in the RESPOND state MUST elicit a new DCCP-Response. Each new DCCP-Response MUST increment the responder's Sequence Number by one, and MUST include the same application data, if any, as the original DCCP-Response.

The responder MUST accept at most one piece of DCCP-Request data per connection. In particular, the DCCP-Response sent in reply to a retransmitted DCCP-Request with data SHOULD contain a Data Dropped option, in which the retransmitted DCCP-Request is reported as "data dropped due to protocol constraints" (Drop Code 0). The original DCCP-Request SHOULD also be reported in the Data Dropped option, either in a Normal Block (if the responder accepted the data, or there was no data), or in a Drop Code 0 Drop Block (if the responder refused the data the first time as well).

The Data Dropped and Init Cookie options are particularly useful for DCCP-Response packets (Sections 11.7 and 8.1.4).
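
The server-side admission decision described in Sections 8.1.2 and 8.1.3 can be sketched in a few lines of Python. This is an illustration only, not part of the specification: the Listener class, its fields, and the order in which the three refusal conditions are tested are assumptions made for the example. Only the Reset Code numbers and the "fdpz" Service Code value come from the text.

```python
from dataclasses import dataclass, field

# Reset Codes quoted in Sections 8.1.2 and 8.1.3.
RESET_CONNECTION_REFUSED = 8
RESET_BAD_SERVICE_CODE = 9
RESET_TOO_BUSY = 10


def service_code_from_ascii(name: str) -> int:
    """Interpret a four-character ASCII string as a numeric Service Code.

    The bytes are read most-significant first, which reproduces the
    "fdpz" -> 1717858426 example given in Section 8.1.2.
    """
    raw = name.encode("ascii")
    if len(raw) != 4:
        raise ValueError("Service Code strings must be exactly four bytes")
    return int.from_bytes(raw, "big")


@dataclass
class Listener:
    """Hypothetical listening table: port -> set of accepted Service Codes."""
    ports: dict = field(default_factory=dict)
    busy: bool = False

    def admit(self, dest_port: int, service_code: int):
        """Return None to accept a DCCP-Request, else the Reset Code to send.

        The ordering of the checks is an assumption of this sketch.
        """
        if dest_port not in self.ports:
            return RESET_CONNECTION_REFUSED
        if service_code not in self.ports[dest_port]:
            return RESET_BAD_SERVICE_CODE
        if self.busy:
            return RESET_TOO_BUSY
        return None
```

A passive socket that accepts several Service Codes on one port is modelled here simply by putting more than one value in the set for that port. Rate limiting of the generated resets is left out of the sketch.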
The server leaves the RESPOND state for OPEN when it receives a valid DCCP-Ack from the client, completing the three-way handshake.

8.1.4. Init Cookie Option

+--------+--------+--------+--------+--------+--------
|00100100| Length |         Init Cookie Value   ...
+--------+--------+--------+--------+--------+--------
 Type=36

The Init Cookie option lets a DCCP server avoid having to hold any state until the three-way connection setup handshake has completed. The server wraps up the service code, server port, and any options it cares about from both the DCCP-Request and DCCP-Response in an opaque cookie. Typically the cookie will be encrypted using a secret known only to the server and include a cryptographic checksum or magic value so that correct decryption can be verified. When the server receives the cookie back in the response, it can decrypt the cookie and instantiate all the state it avoided keeping. In the meantime, it need not move from the LISTEN state.

This option is permitted in DCCP-Response, DCCP-Data, DCCP-Ack, DCCP-DataAck, DCCP-Sync, and DCCP-SyncAck packets. The server MAY include an Init Cookie option in its DCCP-Response. If so, then the client MUST echo the
same Init Cookie option in each succeeding DCCP packet until one of those packets is acknowledged, meaning the three-way handshake has completed, or the connection is reset.

The server SHOULD design its Init Cookie format so that Init Cookies can be checked for tampering; it SHOULD respond to a tampered Init Cookie option by resetting the connection with Reset Code 11, "Bad Init Cookie".

The precise implementation of the Init Cookie does not need to be specified here; since Init Cookies are opaque to the client, there are no interoperability concerns. Init Cookies are limited to at most 253 bytes in length.

8.1.5. Handshake Completion

When the client receives a DCCP-Response from the server, it moves from the REQUEST state to PARTOPEN, and completes the three-way handshake by sending a DCCP-Ack packet to the server. The PARTOPEN state represents that the client isn't sure whether the server has received any of its DCCP-Acks. The client MUST NOT send DCCP-Data packets while it remains in PARTOPEN. This
is because DCCP-Data packets lack Acknowledgement Numbers, so the server can't tell from a DCCP-Data packet whether the client saw its DCCP-Response. Furthermore, if the DCCP-Response included an Init Cookie, that Init Cookie MUST be included on every packet sent in PARTOPEN.

The single DCCP-Ack sent when entering the PARTOPEN state might, of course, be dropped by the network. The client SHOULD ensure that some packet gets through eventually. The preferred mechanism would be a delayed-ack-like 200-millisecond timer, set every time a packet is transmitted in PARTOPEN. If this timer goes off and the client is still in PARTOPEN, the client generates another DCCP-Ack and backs off the timer. If the client remains in PARTOPEN for more than 4MSL, it SHOULD reset the connection with Reset Code 2, "Aborted".

The client leaves the PARTOPEN state for OPEN when it receives a packet other than DCCP-Response or DCCP-Reset from the server.

8.2. Data Transfer

In the central, data transfer phase of the connection, both server and client are in the OPEN state.
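
Before reaching OPEN, a client may spend some time in PARTOPEN (Section 8.1.5). The following Python sketch shows one way the PARTOPEN timer could be scheduled when the client sends nothing but the retransmitted DCCP-Acks. Several details are assumptions of the example rather than requirements of the text: the backoff is taken to be a doubling, MSL is taken to be 120 seconds (Section 8.3 gives 2MSL as 4 minutes), and the function name and parameters are hypothetical.

```python
MSL_SECONDS = 120.0  # assumption: 2MSL is given as 4 minutes in Section 8.3


def partopen_ack_times(initial=0.2, backoff=2.0, limit=4 * MSL_SECONDS):
    """Return the times, in seconds after entering PARTOPEN, at which the
    timer fires and another DCCP-Ack is generated.

    The timer starts at `initial` seconds and is multiplied by `backoff`
    after every expiry (the doubling is an assumption). Scheduling stops
    once the next expiry would fall at or beyond `limit`, at which point
    the client would reset the connection with Reset Code 2, "Aborted".
    """
    times = []
    now = 0.0
    interval = initial
    while now + interval < limit:
        now += interval
        times.append(now)
        interval *= backoff
    return times
```

Under these assumed defaults the sketch produces eleven further DCCP-Acks before the 4MSL limit is reached. A real client would also re-arm the timer whenever it transmits any other packet in PARTOPEN, which this sketch does not model.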
DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to application events on host A. These packets are congestion-controlled by the CCID for the A-to-B half-connection. In contrast, DCCP-Ack packets sent by DCCP A are controlled by the CCID for the B-to-A half-connection. Generally, DCCP A will piggyback acknowledgement information on DCCP-Data packets when acceptable, creating DCCP-DataAck packets. DCCP-Ack packets are used when there is no data to send from DCCP A to DCCP B, or when the congestion state of the A-to-B CCID will not allow data to be sent.

The DCCP-Move, DCCP-Sync, and DCCP-SyncAck packets will also occur in the data transfer phase. DCCP-Move handling is discussed in Section 14, and some cases causing DCCP-Sync generation are discussed in Section 7.5. One important distinction between DCCP-Sync packets and other packet types is that DCCP-Sync elicits an immediate acknowledgement.
On receiving a valid DCCP-Sync packet, a DCCP endpoint MUST immediately generate and send a DCCP-SyncAck in response; and the Acknowledgement Number on that DCCP-SyncAck MUST equal the Sequence Number of the DCCP-Sync.

A particular DCCP implementation might decide to initiate feature negotiation only once the OPEN state was reached, in which case it might not allow data transfer until some time later. Data received during that time SHOULD be rejected and reported using a Data Dropped Drop Block with Drop Code 0.

8.3. Termination

DCCP connection termination uses a handshake consisting of an optional DCCP-CloseReq packet, a DCCP-Close packet, and a DCCP-Reset packet. The server moves from the OPEN state, possibly through the CLOSEREQ state, to CLOSED; the client moves from OPEN through CLOSING to TIMEWAIT, and after 2MSL wait time, to CLOSED.

The sequence DCCP-CloseReq, DCCP-Close, DCCP-Reset is used when the server decides to close the connection, but doesn't want to hold TIMEWAIT state:

   Client State                             Server State
      OPEN                                     OPEN
1.                 <--       CloseReq       <--    CLOSEREQ
2.    CLOSING      -->        Close         -->
3.                 <--        Reset         <--    CLOSED
4.    TIMEWAIT
5.    CLOSED

A shorter sequence occurs when the client decides to close the connection.

   Client State                             Server State
      OPEN                                     OPEN
1.    CLOSING      -->        Close         -->
2.                 <--        Reset         <--    CLOSED
3.    TIMEWAIT
4.    CLOSED

Finally, the server can decide to hold TIMEWAIT state:

   Client State                             Server State
      OPEN                                     OPEN
1.                 <--        Close         <--    CLOSING
2.    CLOSED       -->        Reset         -->
3.                                                 TIMEWAIT
4.                                                 CLOSED

In all cases, the receiver of the
DCCP-Reset packet holds TIMEWAIT state for the connection. As in TCP, TIMEWAIT state, where an endpoint quietly preserves a socket for 2MSL (4 minutes) after its connection has closed, ensures that no connection duplicating the current connection's source and destination addresses and ports can start up while old packets might remain in the network.

The termination handshake proceeds as follows. The receiver of a valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet; that receiving endpoint will expect to hold TIMEWAIT state after later receiving a DCCP-Reset. The receiver of a valid DCCP-Close packet MUST respond with a DCCP-Reset packet, with Reset Code 1, "Closed"; the endpoint that originally sent the DCCP-Close will hold TIMEWAIT state. The endpoint that receives a valid DCCP-Reset packet will hold TIMEWAIT state for the connection.

A DCCP-Reset packet completes every DCCP connection, whether the termination is clean (due to application close; Reset Code 1, "Closed") or
unclean. Unlike TCP, which has two distinct termination mechanisms (FIN and RST), DCCP ends all connections in a uniform manner. This is justified because some responses to connection termination are the same no matter whether termination was clean. For instance, the endpoint that receives a valid DCCP-Reset should hold TIMEWAIT state for the connection. Processors that must distinguish between clean and unclean termination can examine the Reset Code.

DCCP-Reset packets MUST NOT be generated in response to received DCCP-Reset packets. DCCP implementations generally transition to the CLOSED state after sending a DCCP-Reset packet.

Endpoints in the CLOSEREQ and CLOSING states MUST retransmit DCCP-CloseReq and DCCP-Close packets, respectively, until leaving those states. The retransmission timer should initially be set to go off in two RTTs, or 0.4 seconds if the RTT is not known, and should back off to not less than once every 64 RTTs if no relevant response is received.
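
The retransmission rule for DCCP-CloseReq and DCCP-Close can be illustrated with a short Python sketch. The text fixes only the initial timeout (two RTTs, or 0.4 seconds when the RTT is unknown) and the slowest permitted rate (once every 64 RTTs). The doubling between retransmissions, the 0.2-second stand-in RTT used to derive a ceiling when no RTT is known, and the function name are all assumptions of this example.

```python
def close_retransmit_intervals(rtt=None, count=10, fallback_rtt=0.2):
    """Return `count` successive waits, in seconds, between retransmissions
    of a DCCP-CloseReq or DCCP-Close.

    rtt          -- measured round-trip time in seconds, or None if unknown
    fallback_rtt -- assumed stand-in RTT; 0.2 s makes the first wait 0.4 s,
                    matching the value the text gives for an unknown RTT
    """
    effective_rtt = rtt if rtt is not None else fallback_rtt
    interval = 2 * effective_rtt   # initial timer: two RTTs
    ceiling = 64 * effective_rtt   # never slower than once every 64 RTTs
    waits = []
    for _ in range(count):
        waits.append(interval)
        # Assumed doubling backoff, clamped at the 64-RTT ceiling.
        interval = min(interval * 2, ceiling)
    return waits
```

For example, with a measured RTT of 0.1 seconds the waits grow from 0.2 seconds and then hold at 6.4 seconds. An endpoint would stop consulting this schedule as soon as it leaves the CLOSEREQ or CLOSING state.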
Only the server can send a DCCP-CloseReq packet or enter the CLOSEREQ state.

8.3.1. Abnormal Termination

DCCP endpoints generate DCCP-Reset packets to terminate connections abnormally; a DCCP-Reset packet may be generated from any state. However, a DCCP endpoint in the CLOSED or LISTEN state may not have a proper sequence number available to send a Reset. In these cases, it MUST set the Reset's Sequence Number to zero. Resets sent in the CLOSED, LISTEN, and TIMEWAIT states often use Reset Code 3, "No Connection". Resets sent in the REQUEST or RESPOND states often use Reset Code 4, "Packet Error".

8.4. DCCP State Diagram

The most common state transitions discussed above can be summarized in the following state diagram. The diagram is illustrative; the
text in Section 8.5 and elsewhere should be considered definitive. For example, there are arcs (not shown) from every state except CLOSED to TIMEWAIT, contingent on the receipt of a valid DCCP-Reset.

+---------------------------+    +---------------------------+
|                           v    v                           |
|                        +----------+                        |
|          +-------------+  CLOSED  +------------+           |
|          |             +----------+   active   |           |
|          | passive                     open    |           |
|          |  open                   snd Request |           |
|          v                                     v           |
|     +----------+                          +----------+     |
|     |  LISTEN  |                          | REQUEST  |     |
|     +----+-----+                          +----+-----+     |
|          | rcv Request            rcv Response |           |
|          | snd Response             snd Ack    |           |
|          v                                     v           |
|     +----------+                          +----------+     |
|     | RESPOND  |                          | PARTOPEN |     |
|     +----+-----+                          +----+-----+     |
|          | rcv Ack/DataAck         rcv packet  |           |
|          |                                     |           |
|          |             +----------+            |           |
|          +------------>|   OPEN   |<-----------+           |
|                        +--+-+--+--+                        |
|       server active close | |  |   active close            |
|           snd CloseReq    | |  | or rcv CloseReq           |
|                           | |  |    snd Close              |
|                           | |  |                           |
|     +----------+          | |  |          +----------+     |
|     | CLOSEREQ |<---------+ |  +--------->| CLOSING  |     |
|     +----+-----+            |             +----+-----+     |
|          | rcv Close        |                  |           |
|          | snd Reset        |        rcv Reset |           |
|<---------+                  |                  v           |
|                  rcv Close  |             +----+-----+     |
|                  snd Reset  |             | TIMEWAIT |     |
|                             |             +----+-----+     |
+-----------------------------+                  |           |
                                                 +-----------+
                                               2MSL timer expires

8.5. Pseudocode

This section presents an algorithm describing the processing steps a DCCP
endpoint must go through when it receives a packet. A DCCP implementation need not implement the algorithm as it is described here, but any implementation MUST generate observable effects (meaning packets) exactly as indicated by this pseudocode, except where allowed otherwise by another part of this document.

The received packet is written as P, the socket as S. Socket variables:

   S.SWL - sequence number window low
   S.SWH - sequence number window high
   S.AWL - acknowledgement number window low
   S.AWH - acknowledgement number window high
   S.ISS - initial sequence number sent
   S.ISR - initial sequence number received
   S.OSR - first OPEN sequence number received
   S.GSS - greatest sequence number sent
   S.GSR - greatest valid sequence number received
   S.GAR - greatest acknowledgement number received; initialized to S.ISS

"Send packet" actions always use, and increment, S.GSS.

First, check the header basics;
   If the header checksum is incorrect, drop packet and return.
   If the packet type is not understood, drop packet and return.
   If Data Offset is too small for packet type, or too large for packet, drop packet and return.

Second, process DCCP-Move;
   If P.type == Move,
      Look up the Mobility ID in table; get socket.
      If socket exists && P.seqno >= S.SWL && P.ackno <= S.AWH
            && P.ackno >= S.ISS && S.state >= PARTOPEN
            && S.state < TIMEWAIT,
         Process options
         Set socket to point at new address/ports
         Add reference to new address/ports
         Set timer to remove old address/ports after 2MSL
         Choose new Mobility ID, add to table
         Send DCCP-Sync[Change L[Mobility ID, new ID]]
         Update S.GSR, S.SWL, S.SWH
         Drop packet and return
      Otherwise,
         Drop packet and return

Third, check ports and process TIMEWAIT state;
   Look up flow ID; get socket.
   If no socket, or S.state == TIMEWAIT,
      Generate Reset(No Connection) unless P.type == Reset
      Drop packet and return

Fourth, process LISTEN state;
   If S.state == LISTEN,
      If P.type == Request,
         /* Init Cookie processing would go here */
         Set S := new socket for this port pair
         S.state = RESPOND
         Choose S.ISS (initial seqno)
         Set S.ISR, S.GSR, S.SWL, S.SWH from packet
         Continue (with S.state == RESPOND)
      Otherwise,
         Generate Reset(No Connection) unless P.type == Reset
         Drop packet and return

Fifth, process Reset;
   If P.type == Reset,
      If S.GAR <= P.ackno <= S.AWH
            && (P.seqno == 0 || P.seqno > S.GSR || S.state == REQUEST),
         Tear down connection
         S.state := TIMEWAIT
         Set TIMEWAIT timer
         Drop packet and return
      Otherwise (sequence numbers out of whack),
         Drop packet and return

Sixth, process REQUEST state;
   If S.state == REQUEST,
      If P.type == Response && S.AWL <= P.ackno <= S.AWH,
         Set S.GSR, S.ISR, S.SWL, S.SWH
      Otherwise,
         Generate Reset(Packet Error)
         Drop packet and return

Seventh, process Sync sequence numbers;
   If P.type == Sync || P.type == SyncAck,
      If S.AWL <= P.ackno <= S.AWH and P.seqno >= S.SWL,
         Update S.GSR, S.SWL, S.SWH
      Otherwise,
         Drop packet
unsigned char *packet_data; int packet_length; int id_option_offset; /* offset of option in packet_data */ const unsigned char *my_nonce, *other_nonce; int my_nonce_length, other_nonce_length; MD5_CTX md5_context; MD5_Init(&md5_context); MD5_Update(&md5_context, packet_data + 8, 8); /* assuming 24-bitand return Eighth, check sequencenumbers */ MD5_Update(&md5_context, my_nonce, my_nonce_length); MD5_Update(&md5_context, other_nonce, other_nonce_length); packet_data[id_option_offset] = 42; /* option value */ packet_data[id_option_offset+1] = 18;numbers; If S.SWL <= P.seqno <= S.SWH && (P.ackno does not exist || S.AWL <= P.ackno <= S.AWH), Update S.GSR, S.GAR, S.SWL, S.SWH Otherwise, Send Sync packet acknowledging P.seqno Drop packet and return Ninth, check packet type; If (S.is_server && P.type == CloseReq) || (S.is_server && P.type == Response) Kohler/Handley/Floyd Section 8.5. [Page 66] INTERNET-DRAFT Expires: August 2004 February 2004 || (S.is_client && P.type == Request) || (S.state >= OPEN && P.type == Request && P.seqno >= S.OSR) || (S.state >= OPEN && P.type == Response && P.seqno >= S.OSR) || (S.state == RESPOND && P.type == Data), Send Sync packet acknowledging P.seqno Drop packet and return Tenth, process options; /*option lengthmay involve resetting connection, etc. */MD5_Final(packet_data + id_option_offset + 2, &md5_context); 6.5.4. Challenge Option This option informs the receiving DCCP that one of its packets was ignored, andMark packet as "received" for acknowledgement purposes On processing Confirm R(Mobility ID), Check thatsucceeding packets will be ignored untiltheendpoint sends aconfirmed Mobility ID is correctIdentification option. 
The receiving DCCP SHOULDIf a DCCP-Move was recently processed, Remove any old Mobility ID from table Eleventh, process RESPOND state; If S.state == RESPOND, If P.type == Request, Send Response Otherwise, S.OSR := P.seqno S.state := OPEN Twelfth, process REQUEST state; If S.state == REQUEST, S.state := PARTOPEN /* Do not send Data packets in PARTOPEN; furthermore, includean Identification optionInit Cookie onthe nextevery packetit sends. The option takes the following form: Kohler/Handley/Floyd/Padhye*/ Set PARTOPEN timer Thirteenth, process PARTOPEN state; If S.state == PARTOPEN, If P.type == Response, Send Ack Otherwise, S.OSR := P.seqno S.state := OPEN Fourteenth, process CloseReq; If P.type == CloseReq && S.state < CLOSEREQ, Generate Close S.state := CLOSING Set CLOSING timer Fifteenth, process Close; If P.type == Close, Generate Reset(Closed) Tear down connection Kohler/Handley/Floyd Section6.5.4.8.5. [Page61]67] INTERNET-DRAFT Expires:AprilAugust 2004October 2003 +--------+--------+--------+--------+--------+-------- |00101100| Length | Identification Data ... +--------+--------+--------+--------+--------+-------- Type=44 The Identification Data sent with a Challenge option depends on the active Identification Regime. For the default MD5 Regime (Regime 1), the Identification Data on aFebruary 2004 Drop packetsent byand return Sixteenth, process Sync; If P.type == Sync, Generate SyncAck Seventeenth, process data. Do not deliver data from more than one Request or Response 9. Checksums DCCPB is the sameuses a header checksum to protect its header against corruption. Generally, this checksum covers any application data asthat for an Identification option sent bywell. However, DCCPB. The receiver SHOULD ignore a Challenge option, and the packetapplications can request that theChallenge option contains, ifheader checksum cover only part of theIdentification Data is incorrect. The purposeapplication data, or perhaps no application data at all. 
Link layers may then reduce their protection on unprotected parts of DCCP packets. For some noisy links, and applications that can tolerate corruption, this can greatly improve delivery rates and perceived performance.

If checksum coverage is complete, packets with corrupt application data must be treated as network losses, thus incurring a loss response from the sender's congestion control mechanism. Such a heavy-duty response may unfairly penalize connections on links with high background corruption. It is to the application's benefit to report corruption losses differently from network losses. Therefore, even applications that demand correct data can make use of reduced checksum coverage, by including a Data Checksum option. Data Checksum holds a strong checksum of the application data. The combination of reduced checksum coverage and Data Checksum can detect application data corruption, but report it as corruption, not congestion, via Data Dropped options (see Section 11.7).
Reduced checksum coverage introduces some security considerations; see Section 19.2. See Appendix B.1 for further motivation and discussion. DCCP's implementation of reduced checksum coverage was inspired by UDP-Lite [UDP-LITE].

9.1. Header Checksum Field

DCCP uses the TCP/IP checksum algorithm. The Checksum field in the DCCP generic header (see Section 5.1) equals the 16 bit one's complement of the one's complement sum of all 16 bit words in the DCCP header, DCCP options, a pseudoheader taken from the network-layer header, and, depending on the value of the Checksum Coverage field, some or all of the application data. When calculating the checksum, the Checksum field itself is treated as 0. If a packet contains an odd number of header and text bytes to be checksummed, 8 zero bits are added on the right to form a 16 bit word for checksum purposes. The pad byte is not transmitted as part of the packet.

The pseudoheader is calculated as for TCP.
For IPv4, it is 96 bits long, and consists of the IPv4 source and destination addresses, the IP protocol number for DCCP (padded on the left with 8 zero bits), and the DCCP length as a 16-bit quantity (the length of the DCCP header with options, plus the length of any data); see Section 3.1 of [RFC 793]. For IPv6, it is 320 bits long, and consists of the IPv6 source and destination addresses, the DCCP length as a 32-bit quantity, and the IP protocol number for DCCP (padded on the left with 24 zero bits); see Section 8.1 of [RFC 2460].

Packets with invalid header checksums MUST be ignored. In particular, their options MUST NOT be processed.

9.2. Header Checksum Coverage Field

The Checksum Coverage field in the DCCP generic header (see Section 5.1) specifies what parts of the packet are covered by the Checksum field, as follows:

CsCov = 0
   The Checksum field covers the DCCP header, DCCP options, network-layer pseudoheader, and all application data in the packet, possibly padded on the right with zeros to an even number of bytes.

CsCov = 1-15
   The Checksum field covers the DCCP header, DCCP options, network-layer pseudoheader, and the initial (CsCov-1)*4 bytes of the packet's application data. Thus, if CsCov is 1, none of the application data is protected by the header checksum. The value (CsCov-1)*4 MUST be less than or equal to the length of the application data. Packets with invalid CsCov values MUST be ignored; in particular, their options MUST NOT be processed.

The meanings of values other than 0 and 1 should be considered experimental.

Values other than 0 specify that corruption is acceptable in some or all of the DCCP packet's application data. In fact, DCCP cannot even detect corruption in areas not covered by the header checksum, unless the Data Checksum option is used.
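The coverage rule and the checksum arithmetic of Sections 9.1 and 9.2 can be sketched in C as follows. This is a minimal illustration, not part of the specification: the function names cscov_covered_bytes and ones_complement_checksum are hypothetical, and the caller is assumed to have already assembled the pseudoheader, the DCCP header (with its Checksum field set to 0), the options, and the covered application data into one contiguous buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of application-data bytes covered by the
 * header checksum for a given CsCov value, or -1 if CsCov is invalid
 * for a packet carrying app_len bytes of application data. */
long cscov_covered_bytes(unsigned cscov, size_t app_len)
{
    size_t covered;

    if (cscov > 15)
        return -1;                  /* outside the 0-15 range listed above */
    if (cscov == 0)
        return (long)app_len;       /* full coverage */
    covered = (size_t)(cscov - 1) * 4;
    if (covered > app_len)
        return -1;                  /* (CsCov-1)*4 must not exceed data length */
    return (long)covered;
}

/* Hypothetical helper: 16-bit one's complement of the one's complement
 * sum of the 16-bit words in buf. An odd trailing byte is padded on the
 * right with 8 zero bits, as described in Section 9.1. */
uint16_t ones_complement_checksum(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8;   /* pad byte is not transmitted */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);  /* fold carries back in */
    return (uint16_t)~sum;
}
```

A receiver would first call cscov_covered_bytes and ignore the packet on a negative result, then checksum only the covered prefix of the application data.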
Applications should not make any assumptions about the correctness of received data not covered by the checksum, and should if necessary introduce their own validity checks.

A DCCP application interface should let sending applications suggest a value for CsCov for sent packets, defaulting to 0 (full coverage). It should also let receiving applications refuse delivery of packets with checksum coverage less than a value provided by the application; by default, only packets with fully-covered application data should be accepted. (Note that, for short packets, application data might be fully covered by a nonzero Checksum Coverage value.)

Lower layers that support partial error detection MAY use the Checksum Coverage field as a hint of where errors do not need to be detected. Lower layers MUST use a strong error detection mechanism to detect at least errors that occur in the sensitive part of the packet, and discard damaged packets. The sensitive part consists of the bytes between the first byte of the IP header and the last byte identified by Checksum Coverage.

For more details on application and lower-layer interface issues relating to partial checksumming, see [UDP-LITE].

9.3. Data Checksum Option

The Data Checksum option holds a 32-bit CRC-32c cyclic redundancy-check code of a DCCP packet's application data.

+--------+--------+--------+--------+--------+--------+
|00101100|00000110|              CRC-32c              |
+--------+--------+--------+--------+--------+--------+
 Type=44  Length=6

Data Checksum is intended for packets containing application data, such as DCCP-Request, DCCP-Response, DCCP-Data, and DCCP-DataAck, but it may be included on any packet. The sending DCCP computes the CRC of the bytes comprising the application data and stores it in the option data. The CRC-32c algorithm used for Data Checksum is the same as that used for SCTP [RFC 3309]; note that the CRC-32c of zero bytes of data equals zero. The DCCP header checksum will cover the Data Checksum option, so the data checksum must be computed before the header checksum.
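As an illustration of the Data Checksum computation, the following C sketch implements CRC-32c bit by bit in its common reflected form (polynomial 0x82F63B78, initial value and final XOR of all ones). It assumes that this formulation matches the algorithm of [RFC 3309]; the function name crc32c is hypothetical, and the sketch does not address the byte order in which the 32-bit result is placed in the option data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bitwise CRC-32c (Castagnoli) over len bytes.
 * A table-driven version would be faster; this form favors clarity. */
uint32_t crc32c(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;          /* initial value: all ones */
    size_t i;
    int bit;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the reflected polynomial when the
             * bit shifted out was 1. */
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;            /* final XOR: all ones */
}
```

With this formulation the CRC of zero bytes is zero, consistent with the note above, and the nine ASCII bytes "123456789" yield the widely published CRC-32c check value 0xE3069283.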
The receiving DCCP SHOULD compute the received application data's CRC-32c using the same algorithm as the sender, and compare the result and the Data Checksum value. If the values differ, the packet's application data MUST be dropped, and reported using a Data Dropped option as dropped due to corruption (Drop Code 3). However, DCCP MAY provide an API through which the receiving application could request delivery of known-corrupt data. When that API is active, the packet's data SHOULD be delivered, but reported as delivered corrupt (Drop Code 7) using a Data Dropped option. In either case, the packet will be reported as Received or Received ECN Marked by Ack Vector or similar options.

9.3.1. Check Data Checksum Feature

The Check Data Checksum feature lets a sending DCCP determine whether or not its partner can check Data Checksum options. DCCP A sends a Mandatory "Change R(Check Data Checksum, 1)" option to DCCP B to require B to check Data Checksum options (the connection will be reset if DCCP B cannot).

Check Data Checksum has feature number 10, and is server-priority. It takes one-byte Boolean values. DCCP B MUST check any received Data Checksum options when Check Data Checksum/B is one, although it MAY check them even when Check Data Checksum/B is zero. Values of two or more are reserved. New connections start with Check Data Checksum 0 for both endpoints.

9.3.2. Usage Notes

Internet links must normally apply strong integrity checks to the packets they transmit [UDP-LITE] [LINK BCP]. Data Checksum is redundant for DCCP packets whose integrity is checked by every link they traverse. This is the default case when the DCCP header's Checksum Coverage value equals zero (full coverage).

However, the DCCP Checksum Coverage value might not be zero. By setting partial Checksum Coverage, the application indicates that it can tolerate corruption in the unprotected part of the application data. Recognizing this, link layers may reduce the strength of their error detection and/or correction when transmitting this unprotected part, which can significantly increase the probability of the endpoint receiving corrupt data. Data Checksum lets the receiver detect any ensuing corruption.

10. Congestion Control IDs

Each congestion control mechanism supported by DCCP is assigned a congestion control identifier, or CCID: a number from 0 to 255. During connection setup, and optionally thereafter, the endpoints negotiate their congestion control mechanisms by negotiating the values for their Congestion Control ID features. Congestion Control ID has feature number 1. The CCID/A value equals the CCID in use for the A-to-B half-connection.
DCCP B sends a "Change R(CCID, K)" option to ask DCCP A to use CCID K for its data packets.

CCID is a server-priority feature, so CCID negotiation options can list multiple acceptable CCIDs, sorted in descending order of priority. For example, the option "Change R(CCID, 1 2 3)" asks the receiver to use CCID 1 for its packets, although CCIDs 2 and 3 are also acceptable. (This corresponds to the bytes "35, 6, 1, 1, 2, 3": Change R option (35), option length (6), feature ID (1), CCIDs (1, 2, 3).) Similarly, "Confirm L(CCID, 1, 1 2 3)" tells the receiver that the sender is using CCID 1 for its packets, but that CCIDs 2 or 3 might also be acceptable.

The CCIDs defined by this document are:

CCID   Meaning
----   -------
 0     Reserved
 1     Unspecified Sender-Based Congestion Control
 2     TCP-like Congestion Control
 3     TFRC Congestion Control

New connections start with CCID 2 for both endpoints. If this is unacceptable for a DCCP endpoint, that endpoint MUST send Mandatory Change(CCID) options on its first packets.

All CCIDs standardized for use with DCCP will correspond to congestion control mechanisms previously standardized by the IETF. We expect that for quite some time, all such mechanisms will be TCP-friendly, but TCP-friendliness is not an explicit DCCP requirement.

A DCCP implementation intended for general use, such as an implementation in a general-purpose operating system kernel, SHOULD implement at least CCIDs 1 and 2. The intent is to make these CCIDs broadly available for interoperability, although particular applications might disallow their use.
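The byte layout quoted above for "Change R(CCID, 1 2 3)" can be sketched in C as follows. This is only an illustration under stated assumptions: the function name encode_change_r_ccid and the two constants are hypothetical, and the values 35 (Change R option type) and 1 (CCID feature number) are taken from the example in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_CHANGE_R 35   /* Change R option type, per the example above */
#define FEATURE_CCID 1    /* Congestion Control ID feature number */

/* Hypothetical helper: writes a Change R(CCID, ...) option carrying the
 * preference list ccids[0..n-1] (most preferred first) into out.
 * Returns the option length in bytes, or 0 if the list is empty or the
 * option does not fit. */
size_t encode_change_r_ccid(uint8_t *out, size_t out_len,
                            const uint8_t *ccids, size_t n)
{
    size_t opt_len = 3 + n;   /* type + length + feature ID + CCID list */
    size_t i;

    assert(out != NULL && ccids != NULL);
    if (n == 0 || opt_len > 255 || out_len < opt_len)
        return 0;
    out[0] = OPT_CHANGE_R;
    out[1] = (uint8_t)opt_len;
    out[2] = FEATURE_CCID;
    for (i = 0; i < n; i++)
        out[3 + i] = ccids[i];
    return opt_len;
}
```

Called with the list {1, 2, 3}, the sketch produces the six bytes 35, 6, 1, 1, 2, 3 given in the text.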
10.1. Unspecified Sender-Based Congestion Control

CCID 1 denotes an unspecified sender-based congestion control mechanism. This provides a limited, controlled form of interoperability for new IETF-approved CCIDs: with CCID 1, an HC-Sender can use a new sender-based congestion control mechanism whose details the HC-Receiver does not understand.

Some congestion control mechanisms require only generic behavior from the receiver. For example, CCID 2, TCP-like Congestion Control, requires that the receiver (1) send Ack Vectors and (2) respond to Ack Ratio. Both of these requirements use generic mechanisms described in this document. Thus, a CCID 2 HC-Receiver doesn't really need to understand the details of CCID 2.

CCID 1 uses this insight to support forward compatibility for sender-based congestion control mechanisms. An HC-Sender proposes CCID 1 as a proxy for a sender-based mechanism whose details the HC-Receiver doesn't need to understand. The HC-Receiver can then agree to CCID 1, and provide generic acknowledgement feedback as requested by other features (such as Send Ack Vector). Individual CCID profile documents say whether or not they can masquerade as CCID 1.

For example, say that CCID 98, a new sender-based congestion control mechanism using Ack Vector for acknowledgements, has entered the IETF standards process, and the IETF has approved the use of CCID 1 as a proxy for CCID 98.
Now, say DCCP A would like to use CCID 98 for its data packets. It should therefore send a "Change L(CCID, 98 1)" option to open a CCID negotiation. 98 comes first, since that is the preferred CCID; 1 comes next, as a potential proxy for 98. If DCCP B understands CCID 98, it will respond with "Confirm R(CCID, 98, ...)" and all is well. But if it does not understand CCID 98, it may respond with "Confirm R(CCID, 1, ...)", still allowing DCCP A to use CCID 98. DCCP A will separately negotiate Send Ack Vector, and thus DCCP B will provide the feedback DCCP A requires, namely Ack Vector, without needing to understand the operation of CCID 98.

Implementors MUST NOT use CCID 1 in production environments as a proxy for congestion control mechanisms that have not entered the IETF standards process. We intend that any production use of CCID 1 would have to be explicitly approved first by the IETF. Middleboxes MAY choose to treat the use of CCID 1 as experimental or unacceptable.

Since CCID 1 should be used only as a proxy for other, defined CCIDs, an HC-Sender MUST NOT report a preference list consisting only of CCID 1, and the option "Change L(CCID, 1)" is illegal. Receiving such an option SHOULD result in connection reset with Reset Code 5, "Option Error".
An HC-Receiver MAY suggest CCID 1 exclusively: the option "Change R(CCID, 1)" is not illegal.

If CCID 1 is the result of a CCID feature negotiation, the HC-Sender determines which CCID to actually use by picking the earliest CCID in its preference list that can masquerade as CCID 1. The HC-Sender MUST pick a CCID that appeared explicitly in its preference list.

Many DCCP APIs will allow applications to suggest preferred CCIDs for sending and receiving data. Such APIs might let applications allow or prevent the use of CCID 1 for receiving, but they should not let applications suggest the use of CCID 1 for sending. The code implementing a particular CCID should add CCID 1 to the HC-Sender's CCID preference list when appropriate, unless the application disagrees. The default for both sender and receiver should be to allow CCID 1 when possible.

CCID 1 places no restrictions on how often the HC-Receiver may send DCCP-Ack packets. A careful implementation SHOULD implement a liberal rate limit on DCCP-Acks to prevent ack storms.

10.2. TCP-like Congestion Control

CCID 2, TCP-like Congestion Control, denotes Additive Increase, Multiplicative Decrease (AIMD) congestion control with behavior modelled directly on TCP, including congestion window, slow start, timeouts, and so forth. CCID 2 achieves maximum bandwidth over the long term, consistent with the use of end-to-end congestion control, but halves its congestion window in response to each congestion event. This leads to the abrupt rate changes typical of TCP. Applications should use CCID 2 if they prefer maximum bandwidth utilization to steadiness of rate.
This is often the case for applications that are not playing their data directly to the user. For example, a hypothetical application that transferred files over DCCP, using application-level retransmissions for lost packets, would prefer CCID 2 to CCID 3. On-line games may also prefer CCID 2.

CCID 2 is further described in [CCID 2 PROFILE].

10.3. TFRC Congestion Control

CCID 3 denotes TCP-Friendly Rate Control (TFRC), an equation-based rate-controlled congestion control mechanism. TFRC is designed to be reasonably fair when competing for bandwidth with TCP-like flows, where a flow is "reasonably fair" if its sending rate is generally within a factor of two of the sending rate of a TCP flow under the same conditions. However, TFRC has a much lower variation of throughput over time compared with TCP, which makes CCID 3 more suitable than CCID 2 for applications such as telephony or streaming media where a relatively smooth sending rate is of importance.

CCID 3 is further described in [CCID 3 PROFILE]. The TFRC congestion control algorithms were initially described in [RFC 3448].

10.4. CCID-Specific Options, Features, and Reset Codes

Half of the option types, feature numbers, and Reset Codes are reserved for CCID-specific use. CCIDs may often need new options, for communicating acknowledgement or rate information, for example; reserved option spaces let CCIDs create options at will without polluting the global option space. Option 128 might have different meanings on a half-connection using CCID 4 and a half-connection using CCID 8. CCID-specific options and features will never conflict with global options and features introduced by later versions of this specification.
Any packet may contain information meant for either half-connection, so CCID-specific option types, feature numbers, and Reset Codes explicitly signal the half-connection to which they apply.

o  Option numbers 128 through 191 are for options sent from the HC-Sender to the HC-Receiver; option numbers 192 through 255 are for options sent from the HC-Receiver to the HC-Sender.

o  Reset Codes 128 through 191 indicate that the HC-Sender reset the connection (most likely because of some problem with acknowledgements sent by the HC-Receiver); Reset Codes 192 through 255 indicate that the HC-Receiver reset the connection (most likely because of some problem with data packets sent by the HC-Sender).

o  Finally, feature numbers 128 through 191 are used for features located at the HC-Sender; feature numbers 192 through 255 are for features located at the HC-Receiver. Since Change L and Confirm L options for a feature are sent by the feature location, we know that any Change L(128) option was sent by the HC-Sender, while any Change L(192) option was sent by the HC-Receiver. Similarly, Change R(128) options are sent by the HC-Receiver, while Change R(192) options are sent by the HC-Sender.

For example, consider a DCCP connection where the A-to-B half-connection uses CCID 4 and the B-to-A half-connection uses CCID 5. Here is how a sampling of CCID-specific options and features are assigned to half-connections:

                                 Relevant    Relevant
Packet  Option                   Half-conn.  CCID
------  ------                   ----------  ----
A > B   128                      A-to-B      4
A > B   192                      B-to-A      5
A > B   Change L(128, ...)       A-to-B      4
A > B   Change R(192, ...)       A-to-B      4
A > B   Confirm L(128, ...)      A-to-B      4
A > B   Confirm R(192, ...)      A-to-B      4
A > B   Change R(128, ...)       B-to-A      5
A > B   Change L(192, ...)       B-to-A      5
A > B   Confirm R(128, ...)      B-to-A      5
A > B   Confirm L(192, ...)      B-to-A      5
B > A   128                      B-to-A      5
B > A   192                      A-to-B      4
B > A   Change L(128, ...)       B-to-A      5
B > A   Change R(192, ...)       B-to-A      5
B > A   Confirm L(128, ...)      B-to-A      5
B > A   Confirm R(192, ...)      B-to-A      5
B > A   Change R(128, ...)       A-to-B      4
B > A   Change L(192, ...)       A-to-B      4
B > A   Confirm R(128, ...)      A-to-B      4
B > A   Confirm L(192, ...)      A-to-B      4

CCID-specific options and features have no clear meaning when a nontrivial negotiation for the relevant CCID is in progress. This can happen when a CCID-specific option follows a Change(CCID) option. Say the Change option lists CCID X first. Then the negotiation is nontrivial if and only if its result is not X. CCID-specific options and features MUST be ignored during a nontrivial CCID negotiation, except that Mandatory CCID-specific options and features MUST induce a DCCP-Reset with Reset Code 6, "Mandatory Error".

11. Acknowledgements

Congestion control requires receivers to transmit information about packet losses and ECN marks to senders. DCCP receivers MUST report all congestion they see, as defined by the relevant CCID profile. Each CCID says when acknowledgements should be sent, what options they must use, how they should be congestion controlled, and so on.

Most acknowledgements use DCCP options. For example, on a half-connection with CCID 2 (TCP-like), the receiver reports acknowledgement information using the Ack Vector option. This section describes common acknowledgement options and shows how acks using those options will commonly work. Full descriptions of the ack mechanisms used for each CCID are laid out in the CCID profile specifications.

Acknowledgement options, such as Ack Vector, generally depend on the DCCP Acknowledgement Number, and are thus only allowed on packet types that carry that number (all packets except DCCP-Request and DCCP-Data). Detailed acknowledgement options are not necessarily required on every packet that carries an Acknowledgement Number, however.

11.1. Acks of Acks and Unidirectional Connections

DCCP was designed to work well for both bidirectional and unidirectional flows of data, and for connections that transition between these states. However, acknowledgements required for a unidirectional connection are very different from those required for a bidirectional connection. In particular, unidirectional connections need to worry about acks of acks.

The ack-of-acks problem arises because some acknowledgement mechanisms are reliable. For example, an HC-Receiver using CCID 2, TCP-like Congestion Control, sends Ack Vectors containing completely reliable acknowledgement information. The HC-Sender should occasionally inform the HC-Receiver that it has received an ack. If it did not, the HC-Receiver might resend complete Ack Vector information, going back to the start of the connection, with every DCCP-Ack packet! However, note that acks-of-acks need not be reliable themselves: when an ack-of-acks is lost, the HC-Receiver will simply maintain, and periodically retransmit, old acknowledgement-related state for a little longer. Therefore, there is no need for acks-of-acks-of-acks.

When communication is bidirectional, any required acks-of-acks are automatically contained in normal acknowledgements for data packets. On a unidirectional connection, however, the receiver DCCP sends no data, so the sender would not normally send acknowledgements.
Therefore, the CCID in force on that half-connection must explicitly say whether, when, and how the HC-Sender should generate acks-of-acks.

For example, consider a bidirectional connection where both half-connections use the same CCID (either 2 or 3), and where DCCP B goes "quiescent". This means that the connection becomes unidirectional: DCCP B stops sending data, and sends only DCCP-Ack packets to DCCP A. For CCID 2, TCP-like Congestion Control, DCCP B uses Ack Vector to reliably communicate which packets it has received. As described above, DCCP A must occasionally acknowledge a pure acknowledgement from DCCP B, so that B can free old Ack Vector state. For instance, A might send a DCCP-DataAck packet every now and then, instead of DCCP-Data. In contrast, for CCID 3, TFRC Congestion Control, DCCP B's acknowledgements generally need not be reliable, since they contain cumulative loss rates; TFRC works even if every DCCP-Ack is lost. Therefore, DCCP A need never acknowledge an acknowledgement.

When communication is unidirectional, a single CCID (in the example, the A-to-B CCID) controls both DCCPs' acknowledgements, in terms of their content, their frequency, and so forth. For bidirectional connections, the A-to-B CCID governs DCCP B's acknowledgements (including its acks of DCCP A's acks), while the B-to-A CCID governs DCCP A's acknowledgements.

DCCP A switches its ack pattern from bidirectional to unidirectional when it notices that DCCP B has gone quiescent. It switches from unidirectional to bidirectional when it must acknowledge even a single DCCP-Data or DCCP-DataAck packet from DCCP B.

Each CCID defines how to detect quiescence on that CCID, and how that CCID handles acks-of-acks on unidirectional connections. The B-to-A CCID defines when DCCP B has gone quiescent.
Usually, this happens when a period has passed without B sending any data packets; for CCID 2, this period is the maximum of 0.2 seconds and two round-trip times. The A-to-B CCID defines how DCCP A handles acks-of-acks once DCCP B has gone quiescent.

11.2. Ack Piggybacking

Acknowledgements of A-to-B data MAY be piggybacked on data sent by DCCP B, as long as that does not delay the acknowledgement longer than the A-to-B CCID would find acceptable. However, data acknowledgements often require more than 4 bytes to express. A large set of acknowledgements prepended to a large data packet might exceed the allowed maximum packet size. In this case, DCCP B SHOULD send separate DCCP-Data and DCCP-Ack packets, or wait, but not too long, for a smaller datagram.

Piggybacking is particularly common at DCCP A when the B-to-A half-connection is quiescent, that is, when DCCP A is just acknowledging DCCP B's acknowledgements, as described above. There are three reasons to acknowledge DCCP B's acknowledgements: to allow DCCP B to free up information about previously acknowledged data packets from A; to shrink the size of future acknowledgements; and to manipulate the rate at which future acknowledgements are sent. Since these are secondary concerns, DCCP A can generally afford to wait indefinitely for a data packet to piggyback its acknowledgement onto.

Any restrictions on ack piggybacking are described in the relevant CCID's profile.

11.3. Ack Ratio Feature

Ack Ratio provides a common mechanism by which CCIDs that clock acknowledgements off data packets can perform rudimentary congestion control on the acknowledgement stream. CCID 2, TCP-like Congestion Control, uses Ack Ratio to limit the rate of its acknowledgement stream, for example. Some CCIDs ignore Ack Ratio, performing congestion control on acknowledgements in some other way.
Ack Ratio has feature number 7, and is non-negotiable. It takes two-byte integer values. The Ack Ratio/A feature is the rough ratio of data packets sent by DCCP A to acknowledgement packets sent back by DCCP B. For example, if Ack Ratio/A is four, then DCCP B will send at least one acknowledgement packet for every four data packets sent by DCCP A. DCCP A sends a "Change L(Ack Ratio)" option to notify DCCP B of its ack ratio. New connections start with Ack Ratio 2 for both endpoints.

Implementations should treat Ack Ratio as a loose guideline. For instance, a DCCP endpoint might implement a delayed acknowledgement timer like TCP's, whereby each packet is acknowledged within at most T seconds of its receipt. (In TCP, T is commonly set to 200 milliseconds.) This is explicitly allowed even though it might lead to sending more acknowledgement packets than Ack Ratio would suggest. Particular algorithms for setting and using Ack Ratio are discussed in the relevant CCID drafts.

11.4. Ack Vector Options

The Ack Vector gives a run-length encoded history of data packets received at the client. Each byte of the vector gives the state of that data packet in the loss history, and the number of preceding packets with the same state. The option's data looks like this:

+--------+--------+--------+--------+--------+--------
|0010011?| Length |SSLLLLLL|SSLLLLLL|SSLLLLLL|  ...
+--------+--------+--------+--------+--------+--------
Type=38/39         \___________ Vector ___________...

The two Ack Vector options (option types 38 and 39) differ only in the values they imply for ECN Nonce Echo. Section 12.2 describes this further.
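As a minimal sketch of how an endpoint might unpack the Vector part of this option, the Python below reads each vector byte as a two-bit state (the SS bits) followed by a six-bit run length (the LLLLLL bits). It expands the bytes into per-packet states, starting at the Acknowledgement Number and moving toward older packets. The function name `decode_ack_vector` and the `STATE_NAMES` labels are illustrative assumptions, not part of the specification. The sketch ignores sequence-number wraparound and handles a single option only.

```python
# Illustrative sketch only: all names here are hypothetical, not from the draft.
STATE_NAMES = {
    0: "received",
    1: "received ECN marked",
    2: "reserved",
    3: "not yet received",
}


def decode_ack_vector(vector, ack_number):
    """Expand Ack Vector bytes into a {sequence number: state} mapping.

    Each byte carries a 2-bit state in its most significant bits and a
    6-bit run length in its least significant bits. A run length of N
    covers N + 1 consecutive packets. The first byte starts at the
    Acknowledgement Number; later bytes refer to older packets.
    Sequence-number wraparound is ignored in this sketch.
    """
    states = {}
    seqno = ack_number
    for byte in vector:
        state = byte >> 6          # top two bits
        run_length = byte & 0x3F   # bottom six bits
        for _ in range(run_length + 1):
            states[seqno] = state
            seqno -= 1
    return states


# Vector bytes 0, 192, 3, 64, 5 with Acknowledgement Number 100.
decoded = decode_ack_vector(bytes([0, 192, 3, 64, 5]), ack_number=100)
for seqno in sorted(decoded, reverse=True):
    print(seqno, STATE_NAMES[decoded[seqno]])
```

Returning a plain mapping keeps the sketch short; a real implementation would more likely keep the run-length form and walk it lazily.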
The vector itself consists of a series of bytes, each of whose encoding is:

 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|Sta| Run Length|
+-+-+-+-+-+-+-+-+

Sta[te] occupies the most significant two bits of each byte, and can have one of four values:

0  Packet received (and not ECN marked).
1  Packet received ECN marked.
2  Reserved.
3  Packet not yet received.

Run Length, the least significant six bits of each byte, specifies how many consecutive packets have the given State. Run Length zero says the corresponding State applies to one packet only; Run Length 63 says it applies to 64 consecutive packets. Run lengths of 65 or more must be encoded in multiple bytes.

The first byte in the first Ack Vector option refers to the packet indicated in the Acknowledgement Number; subsequent bytes refer to older packets. (Ack Vector MUST NOT be sent on DCCP-Data and DCCP-Request packets, which lack an Acknowledgement Number.) If an Ack Vector contains the decimal values 0,192,3,64,5 and the Acknowledgement Number is decimal 100, then:

Packet 100 was received (Acknowledgement Number 100, State 0, Run Length 0).
Packet 99 was lost (State 3, Run Length 0).
Packets 98, 97, 96 and 95 were received (State 0, Run Length 3).
Packet 94 was ECN marked (State 1, Run Length 0).
Packets 93, 92, 91, 90, 89, and 88 were received (State 0, Run Length 5).

A single Ack Vector option can acknowledge up to 16192 data packets. Should more packets need to be acknowledged than can fit in 253 bytes of Ack Vector, then multiple Ack Vector options can be sent; the second Ack Vector begins where the first left off, and so forth.

Ack Vector states are subject to two general constraints. (These principles SHOULD also be followed for other acknowledgement mechanisms; referring to Ack Vector states simplifies their explanation.)

1. Packets reported as State 0 or State 1 MUST have been processed by the receiving DCCP stack. In particular, their options must have been processed. Any data on the packet need not have been delivered to the receiving application; in fact, the data may have been dropped.

2. Packets reported as State 3 MUST NOT have been received by DCCP. Feature negotiations and options on such packets MUST NOT have been processed, and the Acknowledgement Number MUST NOT correspond to such a packet.

Packets dropped in the application's receive buffer SHOULD be reported as Received or Received ECN Marked (States 0 and 1), depending on their ECN state; such packets' ECN Nonces MUST be included in the Nonce Echo. The Data Dropped option informs the sender that some packets reported as received actually had their application data dropped.

One or more Ack Vector options that, together, report the status of more packets than have actually been sent SHOULD be considered invalid. The receiving DCCP SHOULD either ignore the options or reset the connection with Reset Code 5, "Option Error". Packets that haven't been included in any Ack Vector option SHOULD be treated as "not yet received" (State 3) by the sender.

Appendix A provides a non-normative description of the details of DCCP acknowledgement handling, in the context of an abstract Ack Vector implementation.

11.4.1. Ack Vector Consistency

A DCCP sender will commonly receive multiple acknowledgements for some of its data packets. For instance, an HC-Sender might receive two DCCP-Acks with Ack Vectors, both of which contained information about sequence number 24. (Information about a sequence number is generally repeated in every ack until the HC-Sender acknowledges an ack. In this case, perhaps the HC-Receiver is sending acks faster than the HC-Sender is acknowledging them.) In a perfect world, the two Ack Vectors would always be consistent. However, there are many reasons why they might not be:

o  The HC-Receiver received packet 24 between sending its acks, so the first ack said 24 was not received (State 3) and the second said it was received or ECN marked (State 0 or 1).

o  The HC-Receiver received packet 24 between sending its acks, and the network reordered the acks. In this case, the packet will appear to transition from State 0 or 1 to State 3.

o  The network duplicated packet 24, and one of the duplicates was ECN marked. This might show up as a transition between States 0 and 1.

To cope with these situations, HC-Sender DCCP implementations SHOULD combine multiple received Ack Vector states according to this table:

               Received State
                 0   1   3
               +---+---+---+
             0 | 0 |0/1| 0 |
       Old     +---+---+---+
             1 | 1 | 1 | 1 |
      State    +---+---+---+
             3 | 0 | 1 | 3 |
               +---+---+---+

To read the table, choose the row corresponding to the packet's old state and the column corresponding to the packet's state in the newly received Ack Vector, then read the packet's new state off the table.
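The sender's table can be sketched as a small lookup in Python. The names `SENDER_TABLE`, `combine_state`, and `fold_reports` are illustrative assumptions. The cell marked 0/1 permits either value, and this sketch arbitrarily picks 1. Starting each packet in State 3 follows the rule that packets not yet covered by any Ack Vector are treated as "not yet received".

```python
from functools import reduce

# Illustrative sketch only: names are hypothetical, not from the draft.
# SENDER_TABLE[old_state][received_state] gives the new state.
# The (old 0, received 1) cell may be 0 or 1; this sketch picks 1.
SENDER_TABLE = {
    0: {0: 0, 1: 1, 3: 0},
    1: {0: 1, 1: 1, 3: 1},
    3: {0: 0, 1: 1, 3: 3},
}


def combine_state(old_state, received_state):
    """Merge one newly reported state into the sender's stored state."""
    return SENDER_TABLE[old_state][received_state]


def fold_reports(reports, initial=3):
    """Apply a sequence of reported states, starting from 'not yet received'."""
    return reduce(combine_state, reports, initial)


# The same two reports for one packet, seen in either order.
in_order = fold_reports([3, 0])
reordered = fold_reports([0, 3])
print(in_order, reordered)
```

With 1 chosen for the ambiguous cell, the lookup gives the same result whichever order two reports arrive in, so in this sketch reordered acks do not change the outcome.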
For an old state of 0 (received non-marked) and received state of 1 (received ECN marked), the packet's new state may be set to either 0 or 1. The HC-Sender implementation will be indifferent to ack reordering if it chooses new state 1 for that cell.

The HC-Receiver should collect information about received packets, which it will eventually report to the HC-Sender on one or more acknowledgements, according to the following table:

               Received Packet
                 0   1   3
               +---+---+---+
             0 | 0 |0/1| 0 |
     Stored    +---+---+---+
             1 |0/1| 1 | 1 |
      State    +---+---+---+
             3 | 0 | 1 | 3 |
               +---+---+---+

This table equals the sender's table, except that when the stored state is 1 and the received state is 0, the receiver is allowed to switch its stored state to 0.

An HC-Sender MAY choose to throw away old information gleaned from the HC-Receiver's Ack Vectors, in which case it MUST ignore newly received acknowledgements from the HC-Receiver for those old packets. It is often kinder to save recent Ack Vector information for a while, so that the HC-Sender can undo its reaction to presumed congestion when a "lost" packet unexpectedly shows up (the transition from State 3 to State 0).

11.4.2. Ack Vector Coverage

We can divide the packets that have been sent from an HC-Sender to an HC-Receiver into four roughly contiguous groups. From oldest to youngest, these are:

1. Packets already acknowledged by the HC-Receiver, where the HC-Receiver knows that the HC-Sender has definitely received the acknowledgements.

2. Packets already acknowledged by the HC-Receiver, where the HC-Receiver cannot be sure that the HC-Sender has received the acknowledgements.

3. Packets not yet acknowledged by the HC-Receiver.

4. Packets not yet received by the HC-Receiver.

The union of groups 2 and 3 is called the Acknowledgement Window. Generally, every Ack Vector generated by the HC-Receiver will cover the whole Acknowledgement Window: Ack Vector acknowledgements are cumulative.
(This simplifies Ack Vector maintenance at the HC-Receiver; see Appendix A, below.) As packets are received, this window both grows on the right and shrinks on the left. It grows because there are more packets, and shrinks because the data packets' Acknowledgement Numbers will acknowledge previous acknowledgements, moving packets from group 2 into group 1.

11.5. Send Ack Vector Feature

The Send Ack Vector feature lets DCCPs negotiate whether they should use Ack Vector options to report congestion. Ack Vector provides detailed loss information, and lets senders report back to their applications whether particular packets were dropped. Send Ack Vector is mandatory for some CCIDs, and optional for others.

Send Ack Vector has feature number 8, and is server-priority. It takes one-byte Boolean values. DCCP A MUST send Ack Vector options on its acknowledgements when Send Ack Vector/A has value one, although it MAY send Ack Vector options even when Send Ack Vector/A is zero. Values of two or more are reserved. New connections start with Send Ack Vector 0 for both endpoints. DCCP B sends a "Change R(Send Ack Vector, 1)" option to DCCP A to ask A to send Ack Vector options as part of its acknowledgement traffic.

11.6. Slow Receiver Option

An HC-Receiver sends the Slow Receiver option to its sender to indicate that it is having trouble keeping up with the sender's data. The HC-Sender SHOULD NOT increase its sending rate for approximately one round-trip time after seeing a packet with a Slow Receiver option. However, the Slow Receiver option does not indicate congestion, and the HC-Sender need not reduce its sending rate. (If necessary, the receiver can force the sender to slow down by dropping packets, with or without Data Dropped, or reporting false ECN marks.) APIs should let receiver applications set Slow Receiver, and sending applications determine whether or not their receivers are Slow.

The Slow Receiver option takes just one byte:

+--------+
|00000010|
+--------+
 Type=2

Slow Receiver does not specify why the receiver is having trouble keeping up with the sender. Possible reasons include lack of buffer space, CPU overload, and application quotas. A sending application might react to Slow Receiver by reducing its sending rate or by switching to a lossier compression algorithm.

The sending application should not react to Slow Receiver by sending more data, however. The optimal response to a CPU-bound receiver might be to increase the sending rate, by switching to a less-compressed sending format, since a highly-compressed data format might overwhelm a slow CPU more seriously than the higher memory requirements of a less-compressed data format. The Slow Receiver option is not appropriate for this case; a CPU-bound receiver should not ask for Slow Receiver options to be sent.

Slow Receiver implements a portion of TCP's receive window functionality.

11.7. Data Dropped Option

The Data Dropped option indicates that some packets reported as received actually had their data dropped before it reached the application. The sender's congestion control mechanism may respond to data-dropped packets less severely than to lost or marked packets. For instance, a windowed mechanism might subtract a constant value from its congestion window, rather than cut it in half.

Data Dropped lets a sender differentiate between different kinds of loss (network and endpoint), but it does not allow total freedom in how to react. The congestion control response to a Data Dropped packet must be approved by the IETF. Each congestion control mechanism MUST react to a Data Dropped packet as if the packet were ECN marked, unless explicitly specified otherwise.

If a received packet's application data is dropped for one of the reasons listed below, this SHOULD be reported using a Data Dropped option. Alternatively, the receiver MAY choose to report as "received" only those packets whose data were not dropped, subject to the constraint that packets not reported as received MUST NOT have had their options processed.

The option's data looks like this:

+--------+--------+--------+--------+--------+--------
|00101000| Length | Block  | Block  | Block  |  ...
+--------+--------+--------+--------+--------+--------
 Type=40          \___________ Vector ___________ ...

The vector itself consists of a series of bytes, called Blocks, each of whose encoding corresponds to one of these choices:

 0 1 2 3 4 5 6 7                 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+               +-+-+-+-+-+-+-+-+
|0| Run Length  |       or      |1|DrpCd|Run Len|
+-+-+-+-+-+-+-+-+               +-+-+-+-+-+-+-+-+
  Normal Block                     Drop Block

The first byte in the first Data Dropped option refers to the packet indicated in the Acknowledgement Number; subsequent bytes refer to older packets. (Data Dropped MUST NOT be sent on DCCP-Data or DCCP-Request packets, which lack an Acknowledgement Number.) Normal Blocks, which have high bit 0, indicate that any received packets in the Run Length had their data delivered to the application. Drop Blocks, which have high bit 1, indicate that received packets in the Run Len[gth] were not delivered as usual.
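The two Block layouts can be told apart by the high bit alone. The Python below is a minimal sketch that splits one Block byte into its fields, following the bit widths in the diagram: one flag bit plus a seven-bit Run Length for a Normal Block, or one flag bit, a three-bit Drop Code, and a four-bit Run Len for a Drop Block. The function name `decode_block` and the tuple layout are illustrative assumptions. The sketch returns the raw field values and does not interpret how many packets a run covers.

```python
# Illustrative sketch only: names and return shape are hypothetical.
def decode_block(byte):
    """Split one Data Dropped Block byte into its raw fields.

    Returns ("normal", run_length) when the high bit is 0, or
    ("drop", drop_code, run_len) when the high bit is 1.
    """
    if byte & 0x80 == 0:
        # Normal Block: 1 flag bit, 7-bit Run Length.
        return ("normal", byte & 0x7F)
    # Drop Block: 1 flag bit, 3-bit Drop Code, 4-bit Run Len.
    drop_code = (byte >> 4) & 0x07
    run_len = byte & 0x0F
    return ("drop", drop_code, run_len)


# A Normal Block with run length 5, then a Drop Block with
# Drop Code 2 and run length 3.
blocks = [decode_block(b) for b in bytes([0b00000101, 0b10100011])]
print(blocks)
```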
The 3-bit Drop Code [DrpCd] field says what happened; generally, no data from that packet reached the application. Packets reported as "not yet received" MUST be included in Normal Blocks; packets not covered by any Data Dropped option are treated as if they were in a Normal Block. Defined Drop Codes for Drop Blocks are:

0    Packet data dropped due to protocol constraints. For example, the data was included on a DCCP-Request packet, and the receiving application does not allow that piggybacking; or the data was sent during an important feature negotiation.

1    Packet data dropped because the application is no longer listening.

2    Packet data dropped in the receive buffer.

3    Packet data dropped due to corruption.

4-6  Reserved.

7    Packet data corrupted, but delivered to the application anyway.

For example, if a Data Dropped option contains the