draft-duker-as2-reliability-00.txt  -->   draft-duker-as2-reliability-01.txt

view Side-By-Side changes

Internet Draft                                           Dale Moberg
Target Category: Informational                       August 10,                          June 9, 2005
Expires: February December 11, 2006


Operational Reliability for EDIINT AS2 
draft-duker-as2-reliability-00.txt 
draft-duker-as2-reliability-01.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any 
applicable patent or other IPR claims of which he or she is aware 
have been or will be disclosed, and any of which he or she becomes 
aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering 
Task Force (IETF), its areas, and its working groups.  Note that 
other groups may also distribute working documents as Internet-
Drafts.

Internet-Drafts are draft documents valid for a maximum of six months 
and may be updated, replaced, or obsoleted by other documents at any 
time.  It is inappropriate to use Internet-Drafts as reference 
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at 
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at 
http://www.ietf.org/shadow.html.

Any questions, comments, and reports of defects or ambiguities in 
this specification may be sent to the mailing list for the EDIINT 
working group of the IETF, using the address <ietf-ediint@imc.org>. 
Requests to subscribe to the mailing list should be addressed to 
<ietf-ediint-request@imc.org>.

Abstract
The goal of this document is to define approaches to achieve a "once 
and only once" delivery of messages. The EDIINT AS2 protocol is 
implemented by a number of software tools on a variety of platforms 
with varying capabilities and with varying network service quality. 
Although the AS2 protocol defines a unique "Message-ID", current 
implementations of AS2 do not provide a standard method to prevent 
the same message (re-transmitted by the initial sender) from reaching 
back-end business applications at the initial receiver. TCP 
underpinnings of HTTP over which AS2 operates generally provide a 
good quality of network connectivity, but experience indicates a need 
to be able to compensate for both transient server and socket 
exceptions, including "Connection refused" as well as "Server busy." 
In addition, difficulties with server availability, stability, and 
loads can result in reduced operational reliability. This document 
describes some ways to compensate for exceptions and enhance the 


Duker & Moberg         Expires - February December  11, 2006           [Page 1]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

reliability of AS2 protocol operation. Implementation of these 
reliability features is indicated by presence of the "AS2-
Reliability" value in the EDIINT-Features header.

Feedback Instructions:
NOTE TO RFC EDITOR:  This section should be removed by the RFC editor 
prior to publication.

If you want to provide feedback on this draft, follow these 
guidelines:

-Send feedback via e-mail to the ietf-ediint list for discussion, 
with "AS2 Reliability" in the Subject field. To enter or follow the 
discussion, you need to subscribe to ietf-ediint@imc.org.

-Be specific as to what section you are referring to, preferably 
quoting the portion that needs modification, after which you state 
your comments.

-If you are recommending some text to be replaced with your suggested 
text, again, quote the section to be replaced, and be clear on the 
section in question.

Table of Contents

1. Introduction
1.1 Key Word Conventions
1.2 Terminology and Scope Limitations
2. AS2 Modes of Operation
3. AS2 Reliability Concepts
4. Basic Initial Sender Operation
5. Initial Sender Operation for Retry Situations
6. Initial Sender Operation for Resend Situations
7. Initial Receiver Operation
8. Additional Reliability Considerations with Synchronous MDNs
9. Security Considerations
10. IANA Considerations
11. Acknowledgements
Normative References
Informative References
Appendix


1. Introduction

AS2 Reliability has the goal of ensuring that the AS2 protocol 
succeeds in exchanging business data payloads exactly once, provided 
that the network routing and transport (IP and TCP) layers are fully 
functional. That is, the goals for reliability are, first, that 
errors associated with HTTP server operation and server initiated sub 
processes do not prevent delivering messages or their receipt 


Duker & Moberg         Expires - February December  11, 2006           [Page 2]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

responses (MDNs) at least once and, second, that retry or resending 
operations made to compensate for these errors do not result in the 
same message payloads being submitted for further processing more 
than once. 

1.1     Key Word Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY"  "MAY", and "OPTIONAL" in this 
document are to be interpreted as described in [RFC2119].

1.2     Terminology and Scope Limitations

Initial Sender: The AS2 application ("sending implementation") which 
transmits the Message containing the business payload to the "initial 
receiver"

Initial Receiver: The AS2 application ("receiving implementation") 
which receives the Message containing the business payload. The 
initial receiver sends a MDN back to the initial sender.

Message: The business payload as embedded in a "wire" format and 
ready for transmission by a transfer protocol (including all MIME 
wrappings, headers, encodings, and security transformations).

Message Disposition Notification (MDN) - The Internet messaging 
format used to convey a receipt. This term is used interchangeably 
with receipt. See RFC 3798.

Message Identifier (Message-ID) - A globally unique identifier for a 
message. The sending implementation MUST guarantee that the Message-
ID is unique. See RFC 2822.

Payload: The business data exchanged between business applications. 

Retry:   When attempting to send a message using the POST method, the 
initial sender can encounter transient exceptions that result in a 
failure to obtain a HTTP status code or a transient HTTP error such 
as 503. "Retry" is the term used in this document to refer to an 
additional POST of the same message, with the same content (including 
even the boundary delimiters and timestamps) and with the same 
Message-ID value. A retry can occur after a few second delay or on a 
schedule. Retrying ceases when a message is sent (which is indicated 
by receiving a HTTP 200 range status code), or when a retry limit is 
exceeded.

Resend:  The AS2 protocol normally requests a (signed or unsigned) 
MDN response in the HTTP response message body. When a MDN is not 
received in a timely manner, the initial sender may choose to resend 
the original message. Because the message has already been sent, but 
has presumably not been processed according to expectation, the same 


Duker & Moberg         Expires - February December  11, 2006           [Page 3]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

message, with the same content and the same Message-ID value is sent 
again. This operation is referred to as a resend of the message. This 
document will suggest guidelines to prevent AS2 software 
implementations that receive duplicate messages from distributing 
that message to back-end business applications, as well as guidelines 
on resend intervals and resend counts for various modes of AS2 
operation. Resending ends when the MDN is received or the resend 
count is reached.

Resubmit: Accidents happen, and possibly the remote system will need 
to get a new copy (a "resubmit") of a message that was previously 
exchanged. In addition, neither Resending nor Retrying continue 
forever, but the data may still need to be exchanged at a later time, 
so a message may need to be resubmitted. When data that failed to be 
exchanged or was exchanged but lost is resubmitted in a new message 
(with a new Message-ID value, and possibly with new timestamps, and 
boundary delimiters), it is called resubmission. Resubmission is 
normally a manual compensation and is not discussed in this document 
further.

Duplicate: Duplicate copies of the same message are messages between 
the same AS2-To and AS2-From organizations which have the same 
Message-ID. This document will recommend ways to respond to 
duplication in messages as indicated by messages being received with 
the same Message-ID values. These duplicates arise in some cases when 
retries and/or resends are allowed, and success indicators (such as 
HTTP "200 OK" or MDNs) are not received by the initial sender.


2. AS2 Modes of Operation

There are many user selectable options within the AS2 protocol, but 
there are two main modes of operation commonly used that differ in 
how the MDN is returned to the initial sender.

Transferring data via AS2 involves two organizations that are 
identified using AS2-To and AS2-From header values. The initial 
sender places its identifying value in the AS2-From header and POSTS 
an AS2 formatted message to the organization whose value is found in 
the AS2-To header. The initial sender can indicate that it wants to 
receive its MDN on a different connection by including a header, 
Receipt-Delivery-Option, with an http, https, or (rarely) mailto URL 
as its value. Use of the mailto URL is not further considered in this 
document.

When the MDN is requested to be returned on a distinct TCP connection 
using the included URL, the AS2 operation mode is called 
"asynchronous." An asynchronous MDN is returned when the initial 
receiver has the information and resources available to do so, and 
the formatted MDN is POSTED to the delivery URL with the initial 
sender identifier as the AS2-To value and the original recipient as 


Duker & Moberg         Expires - February December  11, 2006           [Page 4]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

the AS2-From value.

The AS2 protocol does not specify how asynchronous MDN delivery is 
scheduled; it is left to the receiving implementation to determine 
how MDNs will be returned. The protocol in effect uses the "200" 
level status code to determine that the initial message has been 
sent. The initial sender then enters into a state of "waiting" or 
"expecting" a MDN to be received.

While it is expected that a MDN be returned in a timely fashion, 
there has not been an agreed upon deadline, and receiving 
implementations have had flexibility in scheduling return. It is 
therefore possible that an indefinite waiting period occur when a MDN 
is "lost" (for whatever reason). Most implementations do eventually 
time out this wait for a lost MDN. Implementations also try to 
recover from this protocol failure by resending the original message. 
Under certain circumstances, therefore, duplicate messages can arrive 
at the recipient. 

When no Receipt-Delivery-Option is included in the original message, 
and a MDN is requested (whether signed or unsigned), the AS2 
operation mode is said to be "synchronous." This means that the 
protocol requires that the MDN arrive back on the same TCP 
connection, and in the MIME body of the HTTP reply message. When 
using the synchronous mode, there can be a timeout by either side 
waiting for the HTTP reply, and that timeout usually aborts the 
protocol by closing the connection. In such a case, the message has 
not been successfully sent, so the payload from the message should 
not be distributed to a back end business application, and the 
message can only be retried (or perhaps resubmitted, an option not 
discussed here). Therefore resend compensation will not be discussed 
for the synchronous mode, but instead retry compensation will be the 
main topic.


3. AS2 Reliability Concepts

Introducing timeouts for various wait states does not in itself 
promote the goal of delivering the message. Instead it simply cleans 
up the protocol state machine so that it can be restarted. Delivery 
thus requires that the message be sent again either in a retry or a 
resend operation.

It is important to have precise understanding of what "sending the 
same message again" means. When exchanging business data, there are 
both payload transaction identifiers and message identifiers. In AS2, 
the message identifier is the value of the Message-ID header, and the 
procedures described below will assume that each message has a unique 
Message-ID value in the message headers. Implementations MUST not 
change any content of the message when retrying or resending. This 
requirement allows implementations to use Message-ID values to detect 


Duker & Moberg         Expires - February December  11, 2006           [Page 5]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

duplicate messages, and avoid sending their payloads to the internal 
business applications that process the business data. [Note: 
duplicate payloads could still be sent, but they would have to be 
sent in different messages. Implementations MAY provide duplicate 
detection for payloads as well. Implementations will need to be 
informed about the specific business data (such as the interchange 
control numbers of the ISA header of ANSI X12 payloads) in order to 
offer a service for duplicate payload detection.]


4. Basic Initial Sender Operation

Sending implementations MUST have a configurable timer if they close 
an unresponsive HTTP connection prior to HTTP status code reply. This 
timer can then be adjusted upwards if closing connections leads to 
failures.

Sending implementations MUST retain an exact copy of every message 
(including the Message-ID value) which is attempted to be sent or is 
sent. Repackaging a payload will not necessarily produce the same 
message, because MIME boundary delimiter values, timestamps, and 
other dynamic data used in assembling messages may not be the same. 
It is implementation dependent how this copy is retained.

Several parameters relating to the number and schedule for retries 
and resends need to be described. Implementations MUST allow the 
configurability of these parameters but are allowed to use an 
implementation dependent "back off" algorithm for lengthening 
intervals. The result of allowing implementations to use their own 
algorithm means that, in effect, some of the parameters described 
below will not be independent of the others for a given 
implementation.

For example, when an implementation allows configurability of the 
number of retries or resends and the minimum interval for 
retry/resend, then the total retry/resend duration will be an 
implementation dependent quantity. Implementations are encouraged to 
check user selected configurations so that their "back off" algorithm 
works correctly. In general, two of the three variables described are 
normally independently configurable. 


5. Initial Sender Operation for Retry Situations

Relevant conditions:

Connection refused:
Closed connection prior to HTTP reply received.
Transient exception (such as 503) in HTTP reply codes

Behavior and Capabilities:


Duker & Moberg         Expires - February December  11, 2006           [Page 6]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006



o Number of retries.

Sending implementations MUST be capable of configuring a total number 
of retries, and MUST stop retrying either when a successful send 
occurs or when the total retry number is reached. The count of 
retries SHALL begin with the first retry counting as the first one. 
So, if five retries are allowed, a total of 6 attempts can be made to 
send the message using the retry operation, provided retry does not 
attain success or otherwise stop. Retries MUST also cease if the 
total elapsed time for the retry duration is reached.

o Interval of retries.

Sending implementations MUST be capable of configuring a minimum 
retry interval. The minimum interval pertains to the first retry, and 
(depending on an implementation dependent algorithm) remaining ones. 
The function governing lengthening intervals between retries MUST 
increase monotonically (stay the same or increase). In addition, the 
sum of the intervals MUST not exceed the total retry duration value. 
Note that there are as many intervals as there are retries. So, the 
first interval pertains to the time between the original failed send 
(start of the total retry duration timer mentioned below) and the 
time of the first retry.

o Total retry duration

Sending implementations MUST be capable of configuring a period 
within which all retries of the same message are attempted. After 
this period expires, a payload would have to be resubmitted to be 
exchanged. The timer for this interval should commence after the 
initial attempt to deliver is not known to have succeeded (success is 
marked by the reception of a 200 level HTTP status code.)

o Diagnostic logging

Sending implementations SHOULD keep a record of the condition that 
caused a failure in sending a message as this log may help identify a 
cause of and a solution to a sending failure. For example, if the 
time involved in all retries of sending a message has approximately 
the same value and the error is reported as an unexpected close in a 
connection, then a review of the timer values governing closing 
connections on both sides, followed by their adjustment can be 
useful. Of course, other factors may be involved-- ranging from 
network congestion to unpredictably large payloads to be exchanged-- 
that may also need further tweaking. 


6. Initial Sender Operation for Resend Situations



Duker & Moberg         Expires - February December  11, 2006           [Page 7]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

Because successful delivery of the message in the synchronous MDN 
mode implies that the initial sender must receive a response which 
contains both a HTTP 200 level status code and a MDN in the body of 
the response, the resend operation is not defined for the synchronous 
MDN mode of operation.

Relevant conditions:

MDN not received after a resend interval has expired.

Behavior and Capabilities:

Sending implementations MUST start the resend timer after successful 
sending (when using asynchronous MDNs). 

o Number of Resends

Sending implementations MUST be capable of configuring a total number 
of resends, and MUST stop resending when a MDN is received, when 
reaching the total number of resends, or when exceeding the total 
allowed time for resends.

o Interval of Resends

Sending implementations MUST be capable of configuring an interval of 
time separating resends. Implementations MUST ensure for a given 
message that retry and resend operations are not interwoven. For 
example, during a resend attempt, retries could occur. In this 
situation, the sending implementation MUST ensure that another resend 
does not start while retries are still occurring.

o Total resend duration.

Sending implementations MUST be capable of configuring a total 
duration for a resend operation and MUST stop resend operations when 
that duration is exceeded. The timer for this interval is started 
after a successful send operation. The value for this timer SHOULD 
exceed the value (Resend Number times Resend Interval).


7. Initial Receiver Operation

Behavior and Capabilities:

Receiving implementations MUST have a configurable timer if they 
respond to exceptions by closing the HTTP connection before they can 
return a HTTP reply status code.

Receiving implementations MUST return an appropriate MDN (when a MDN 
is requested) even when a message is detected as a duplicate.



Duker & Moberg         Expires - February December  11, 2006           [Page 8]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

Duplicate elimination is based on Message-ID values. Receiving 
implementations MUST retain Message-ID values for the pairs of 
organizations exchanging data, beginning with the successful receipt 
of the message. Successful receipt may possibly occur from the 
receiving implementation's point of view even if the initial sender 
does not see the HTTP reply status code, thereby causing the initial 
sender to initiate a retry.

Receiving implementations SHOULD retain Message-IDs until the initial 
sender has exhausted all retry and resend durations. Since the 
receiving implementation may not know these durations, the receiving 
implementation MUST retain each Message-ID for a minimum of five days 
unless users explicitly configure the time period for a shorter time. 

Receiving implementations MUST be configurable so that backend 
business applications are not sent the contents (payload) from the 
same message more than once. It is recommended that implementers make 
this the default. (Different messages, as determined by their 
Message-ID values, may still send the same payload contents, 
however.)

Receiving implementations MAY be configurable so that backend 
business applications do not receive the same payload more than once 
(for mutually agreed upon business data types). This functionality 
is, however, not specified in this document.


8. Additional Reliability Considerations with Synchronous MDNs

There can be combinations of server and client behavior that, even 
when the network is fully functional, still interfere with reliable 
AS2 data exchange. 

When clients, their operating systems, or intermediate HTTP relay 
agents choose to close TCP connections before the server has had time 
to complete the processing needed to create the reply, repetition of 
the client HTTP request need not lead to a successful outcome, no 
matter how often the retry operation is repeated. 

Timeouts while waiting for a HTTP response may themselves create 
errors. The intent of these timeouts is to avoid waste of resources 
tied up in possibly indefinite delays ("hangs") in HTTP response. 
However, with short timeout periods and for very large files, the 
security processing required to be able to form the MDN may, 
especially under very heavy loads, lead to a particularly bad 
outcome. The initial sender may attempt to repeatedly retry its HTTP 
POST creating additional load with no better outcome (timeout before 
MDN reply is received).

In such a situation, there are many different corrective measures 
possible. The timeout period may be raised considerably. The data in 


Duker & Moberg         Expires - February December  11, 2006           [Page 9]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

a message may possibly be distributed into smaller business data 
payloads. The processing power of the receiving server can be 
increased either by clustering or higher rated equipment. Retry logic 
may be disabled. The protocol mode of operation may be adjusted to 
make use of MDNs returned over separate connections (asynchronously). 
Other underlying transfer protocols (ftp, for example) may be 
adopted. 


9. Security Considerations

[None]

10. IANA Considerations

[None]

11. Acknowledgements

The authors wish to extend gratitude to John Koehring for his 
comments on early drafts.

Normative References

[AS2] RFC 4130 "MIME-Based Secure Peer-to-Peer Business Data 
Interchange Using HTTP, Applicability Statement 2 (AS2)", D. Moberg, 
R. Drummond, July 2005.

[RFC2119] RFC2119 "Key Words for Use in RFC's to Indicate Requirement 
Levels", S.Bradner, March 1997.

[RFC 2822] RFC 2822"Internet Message Format" P. Resnick, April 2001

[RFC3798] RFC3798 "Message Disposition Notification" T. Hansen, G. 
Vaudreuil, May 2004

Informative References

[AS1] RFC3335 "MIME-based Secure Peer-to-Peer Business Data 
Interchange over the Internet using SMTP", T. Harding, R. 
Drummond, C. Shih, September 2002.

[AS3] draft-ietf-ediint-as3-03.txt "MIME-based Secure Peer-to-Peer 
Business Data Interchange over the Internet using FTP", T. 
Harding, R. Scott, April 2005.


Authors' Addresses

John Duker
duker.jp@pg.com


Duker & Moberg         Expires - February December  11, 2006           [Page 10]

Internet-Draft   Operational Reliability for EDIINT AS2    August 2005      June 2006

Procter & Gamble
2 Procter & Gamble Plaza
Cincinnati, OH 45202 USA

Dale Moberg
dmoberg@cyclonecommerce.com
Cyclone Commerce
8388 E. Hartford Drive, Suite 100
Scottsdale, AZ  85255 USA


Appendix


Intellectual Property

The IETF takes no position regarding the validity or scope of any 
Intellectual Property Rights or other rights that might be claimed to 
pertain to the implementation or use of the technology described in 
this document or the extent to which any license under such rights 
might or might not be available; nor does it represent that it has 
made any independent effort to identify any such rights.  Information 
on the procedures with respect to rights in RFC documents can be 
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any 
assurances of licenses to be made available, or the result of an 
attempt made to obtain a general license or permission for the use of 
such proprietary rights by implementers or users of this 
specification can be obtained from the IETF on-line IPR repository at 
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any 
copyrights, patents or patent applications, or other proprietary 
rights that may cover technology that may be required to implement 
this standard.  Please address the information to the IETF at ietf-
ipr@ietf.org.

Copyright (C) The Internet Society (2005). (2006). This document is subject 
to the rights, licenses and restrictions contained in BCP 78, and 
except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an 
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.




Duker & Moberg         Expires - February December  11, 2006           [Page 11]
----