INTERNET-DRAFT
<draft-ietf-ipngwg-addrconf-privacy-01.txt>
IBM
R. Draves, Microsoft Research
October 1999

Privacy Extensions for Stateless Address Autoconfiguration in IPv6

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

Nodes use IPv6 stateless address autoconfiguration to generate addresses without the necessity of a DHCP server. Addresses are formed by combining network prefixes with an interface identifier. On interfaces that contain embedded IEEE Identifiers, the interface identifier is typically derived from it. On other interface types, the interface identifier is generated through other means, for example, via random number generation. This document describes an optional extension to IPv6 stateless address autoconfiguration for interfaces whose interface identifier is derived from an IEEE identifier. Use of the extension causes nodes to generate global-scope addresses from interface identifiers that change over time, even in cases where the interface contains an embedded IEEE identifier.
Changing the interface identifier (and the global-scope addresses generated from it) over time makes it more difficult for eavesdroppers and other information collectors to identify when different addresses used in different transactions actually correspond to the same node.

Contents

Status of this Memo
1. Introduction
2. Background
3. Protocol Description
4. Implications of Changing Interface Identifiers
5. Open Issues and Future Work
6. Security Considerations
7. References
8. Authors' Addresses
9. Appendix

1. Introduction

Stateless address autoconfiguration [ADDRCONF] defines how an IPv6 node generates addresses without the need for a DHCP server. Some types of network interfaces come with an embedded IEEE Identifier (i.e., a link-layer MAC address), and in those cases stateless address autoconfiguration uses the IEEE identifier to generate a 64-bit interface identifier [ADDRARCH]. By design, the interface identifier is globally unique when generated in this fashion. The interface identifier is in turn appended to a prefix to form a 128-bit IPv6 address. All nodes combine interface identifiers (whether derived from an IEEE identifier or generated through some other technique) with the reserved link-local prefix to generate link-local addresses for their attached interfaces.
Additional addresses, including site-local and global-scope addresses, are then created by combining prefixes advertised in Router Advertisements via Neighbor Discovery [DISCOVERY] with the interface identifier.

Not all nodes and interfaces contain IEEE identifiers. In such cases, an interface identifier is generated through some other means (e.g., at random), and the resultant interface identifier is not globally unique and may also change over time. The focus of this document is on addresses derived from IEEE identifiers, as the concern being addressed exists only in those cases where the interface identifier is globally unique and non-changing. The rest of this document assumes that IEEE identifiers are being used, but the techniques described may also apply to interfaces with other types of globally unique and persistent identifiers.
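
The derivation summarized above (IEEE identifier to 64-bit interface identifier, then prefix plus identifier to a 128-bit address) can be sketched as follows. This is a minimal illustration, not text from [ADDRARCH]: it assumes a 48-bit MAC address as input, and the function names and the sample MAC address are hypothetical.

```python
import ipaddress


def eui64_interface_identifier(mac: str) -> bytes:
    """Derive a 64-bit interface identifier from a 48-bit MAC address.

    Assumption: the usual modified EUI-64 mapping, i.e. insert FF-FE in the
    middle and invert the universal/local bit (0x02 of the first octet).
    """
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    return bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]


def form_address(prefix: str, iid: bytes) -> ipaddress.IPv6Address:
    """Append a 64-bit interface identifier to a 64-bit prefix."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64 or len(iid) != 8:
        raise ValueError("expected a /64 prefix and a 64-bit identifier")
    return ipaddress.IPv6Address(network.network_address.packed[:8] + iid)


if __name__ == "__main__":
    iid = eui64_interface_identifier("34-56-78-9A-BC-DE")  # hypothetical MAC
    print(iid.hex())
    print(form_address("fe80::/64", iid))  # link-local address
```

Because the identifier depends only on the hardware address, the low 64 bits come out the same under every prefix the node uses. That property is what the rest of this document is concerned with.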
This document discusses concerns associated with the embedding of interface identifiers within IPv6 addresses and describes optional extensions to stateless address autoconfiguration that can help mitigate those concerns in environments where such concerns are significant. Section 2 provides background information on the issue. Section 3 describes a procedure for generating alternate interface identifiers and global-scope addresses. Section 4 discusses implications of changing interface identifiers.

2. Background

This section discusses the problem in more detail, provides context for evaluating the significance of the concerns in specific environments and makes comparisons with existing practices.

2.1. Extended Use of the Same Identifier

The use of a non-changing interface identifier to form addresses is a specific instance of the more general case where a constant identifier is reused over an extended period of time and in multiple independent activities. Anytime the same identifier is used in multiple contexts, it becomes possible for that identifier to be used to correlate seemingly unrelated activity. For example, a network sniffer placed strategically on a link across which all traffic to/from a particular host crosses could keep track of which destinations a node communicated with and at what times. Such information can in some cases be used to infer things, such as what hours an employee was active, when someone is at home, etc.

One of the requirements for correlating seemingly unrelated activities is the use (and reuse) of an identifier that is recognizable over time within different contexts. IP addresses provide one obvious example, but there are more. Many nodes also have DNS names associated with their addresses, in which case the DNS name serves as a similar identifier. Although the DNS name associated with an address is more work to obtain (it may require a DNS query) the information is often readily available.
In such cases, changing the address on a machine over time would do little to address the concern raised in this document, as the DNS name would become the correlating identifier.

The use of a constant identifier within an address is of special concern because addresses are a fundamental requirement of communication and cannot easily be hidden from eavesdroppers and other parties. Even when higher layers encrypt their payloads, addresses in packet headers appear in the clear. Consequently, if a mobile host (e.g., laptop) accessed the network from several different locations, an eavesdropper might be able to track the movement of that mobile host from place to place, even if the upper layer payloads were encrypted [SERIALNUM].

2.2. Not a New Issue

Although the topic of this document may at first appear to be an issue new to IPv6, similar issues exist in today's Internet already. That is, addresses used in today's Internet are often non-changing in practice for extended periods of time. In many sites, addresses are assigned statically; such addresses typically change infrequently. However, many sites are moving away from static allocation to dynamic allocation via DHCP [DHCP]. In theory, the address a client gets via DHCP can change over time, but in practice servers return the same address to the same client (unless addresses are in such short supply that they are reused immediately by a different node when they become free). Thus, although many sites use DHCP, clients end up using the same address for months at a time.

Nodes that need a (non-changing) DNS name generally have static addresses assigned to them to simplify the configuration of DNS servers. Although Dynamic DNS [DDNS] can be used to update the DNS dynamically, it is not widely deployed today.
In addition, changing an address but keeping the same DNS name does not really address the underlying concern, since the DNS name becomes a non-changing identifier. Servers generally require a DNS name (so clients can connect to them), and clients often do as well (e.g., some servers refuse to speak to a client whose address cannot be mapped into a DNS name that also maps back into the same address).

Many network services require that the client authenticate itself to the server before gaining access to a resource. The authentication step binds the activity (e.g., TCP connection) to a specific entity (e.g., an end user). In such cases, a server already has the ability to track usage by an individual, independent of the address they happen to use. Indeed, such tracking is an important part of accounting.

Web browsers and servers typically exchange "cookies" with each other [COOKIES]. Cookies allow web servers to correlate a current activity with a previous activity. One common usage is to send back targeted advertising to a user by using the cookie supplied by the browser to identify what earlier queries had been made (e.g., for what type of information). Based on the earlier queries, advertisements can be targeted to match the (assumed) interests of the end-user.

The use of non-changing interface identifiers in IPv6 has implications in two quite different contexts: stationary devices (i.e., those that generally do not move physically such as desktop PCs), and mobile devices (i.e., those that move frequently, including laptops, cell phones, etc.).

In today's internet, many home users do not have permanent connections and indeed are assigned temporary addresses each time they connect to their ISP. Consequently, the addresses they use change frequently over time and are shared among a number of different users. If addresses are generated from an interface identifier, however, a home user's address could contain an interface identifier that remains the same from one dialup session to the next. The way PPP is used today, however, PPP servers typically unilaterally inform the client what address they are to use (i.e., the client doesn't generate one on its own). This practice, if continued in IPv6, would avoid the concerns that are the focus of this document.
A more interesting case concerns always-on connections (e.g., cable modems, ISDN, DSL, etc.) that result in a home site using the same address for extended periods of time. This is a scenario that is just starting to become common in IPv4 and promises to become more of a concern as always-on internet connectivity becomes widely available. The technique described later in the document attempts to address this concern by changing the interface identifier portion of an address. However, it should be noted that in the case of always-on connections, the network prefix portion of an address is in effect a constant identifier. All nodes at (say) a home would have the same network prefix. This has implications for privacy, though not at the same granularity (i.e., all nodes within a home would be lumped together for the purposes of collecting information). This issue is also non-trivial to address, because the routing prefix part of an address contains topology information and cannot contain arbitrary values.

Another case concerns mobile devices (e.g., laptops, PDAs, etc.) that move topologically within the Internet. Whenever they move (in the absence of technology such as mobile IP [MOBILEIP]), they form new addresses for their current topological point of attachment. This is typified today by the "road warrior" who has Internet connectivity both at home and at the office. While the node's address changes as it moves, however, the interface identifier contained within the address remains the same (when derived from an IEEE Identifier). In such cases, the interface identifier could (in theory) be used to track the movement and usage of a particular machine [SERIALNUM]. For example, a server that logs usage information together with a source address is also recording the interface identifier, since it is embedded within the address. Consequently, any data-mining technique that correlates activity based on addresses could trivially do the same using the interface identifier.
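
As a rough illustration of the correlation described above, the following sketch groups logged source addresses by their low-order 64 bits. It is a hypothetical example, not a description of any real logging system: the log format, the function names, and the sample addresses (taken from a documentation prefix) are all assumptions.

```python
import ipaddress
from collections import defaultdict


def interface_identifier(address: str) -> int:
    """Return the low-order 64 bits of an IPv6 address."""
    return int(ipaddress.IPv6Address(address)) & ((1 << 64) - 1)


def correlate(log):
    """Group (address, activity) log entries by interface identifier.

    Only identifiers seen under more than one distinct address are kept,
    i.e. cases where a node apparently moved between prefixes.
    """
    groups = defaultdict(list)
    for address, activity in log:
        groups[interface_identifier(address)].append((address, activity))
    return {
        iid: entries
        for iid, entries in groups.items()
        if len({address for address, _ in entries}) > 1
    }


if __name__ == "__main__":
    # Hypothetical log: the same laptop seen at home and at the office.
    log = [
        ("2001:db8:1:0:3656:78ff:fe9a:bcde", "login from home"),
        ("2001:db8:2:0:3656:78ff:fe9a:bcde", "login from office"),
        ("2001:db8:2:0:0211:22ff:fe33:4455", "unrelated host"),
    ]
    for iid, entries in correlate(log).items():
        print(f"{iid:016x}", entries)
```

The grouping needs nothing beyond the addresses themselves, which is why a constant interface identifier undoes the privacy a changing prefix might otherwise provide.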
This is of particular concern with the expected proliferation of next-generation network-connected devices (e.g., PDAs, cell phones, etc.) in which large numbers of devices are in practice associated with individual users (i.e., not shared). Thus, the interface identifier embedded within an address could be used to track activities of an individual, even as they move topologically within the internet.

2.3. Possible Approaches

One way to avoid some of the problems discussed above is to use DHCP for obtaining addresses. With DHCP, the DHCP server could arrange to hand out addresses that change over time.

Another approach, compatible with the stateless address autoconfiguration architecture, would be to change the interface id portion of an address over time for some address scopes. Changing the interface identifier can make it more difficult to look at the IP addresses in independent transactions and identify which ones actually correspond to the same node, both in the case where the routing prefix portion of an address changes and when it does not.

Many machines function as both clients and servers. In such cases, the machine would need a DNS name for its use as a server. Whether the address stays fixed or changes has no privacy implications since the DNS name remains constant and serves as a constant identifier. When acting as a client (e.g., initiating communication), however, such a machine may want to vary the addresses it uses. In such environments, one may need multiple addresses: a "public" (i.e., non-secret) server address, registered in the DNS, that is used to accept incoming connection requests from other machines, and (possibly) an "anonymous" address used to shield the identity of the client when it initiates communication.
These two cases are roughly analogous to telephone numbers and caller ID, where a user may list their telephone number in the public phone book, but disable the display of that number via caller ID when initiating calls.

To make it difficult to make educated guesses as to whether two different interface identifiers belong to the same node, the algorithm for generating alternate identifiers must include input that has an unpredictable component from the perspective of the outside entities that are collecting information. Picking identifiers from a pseudo-random sequence suffices, so long as the specific sequence cannot be determined by an outsider examining just the identifiers that appear in addresses or are otherwise readily available. This document proposes the generation of a pseudo-random sequence of interface identifiers via an MD5 hash. Periodically, the next interface identifier in the sequence is generated, a new set of anonymous addresses is created, and the previous anonymous addresses are deprecated to discourage their further use. The precise pseudo-random sequence depends on both a random component and the globally unique interface identifier (when available), to increase the likelihood that all nodes generate a different sequence.

3. Protocol Description

The goal of this section is to define procedures that:

1) Result in the creation of addresses from the same (constant) interface identifier just as is the case with stateless address autoconfiguration [ADDRCONF]. Link-local and site-local addresses would be used just as in [ADDRCONF], but global-scope addresses would be used only for the acceptance of incoming connections (i.e., they are server addresses), and not used when initiating outgoing communication.

2) Create additional global-scope addresses based on a random interface identifier. Such addresses would be used to initiate outgoing sessions. These "random" addresses would be used for a short period of time (hours to days) and then be deprecated (where they could continue to be used for already established connections, but not for new connections). New addresses are generated periodically, with the exact time between address generation a matter of local policy.

3) Produce a sequence of global-scope addresses from a sequence of interface identifiers that appear to be random in the sense that it is difficult for an outside observer to predict a future address (or identifier) based on a current one and it is difficult to determine previous addresses (or identifiers) knowing only the present one.

We describe two approaches. The first assumes the presence of stable storage that can be used to record state history for use as input into the next iteration of the algorithm. A second approach addresses the case where stable storage is unavailable and the interface identifier must be generated at random.

3.1. When Stable Storage is Present

The following algorithm assumes the presence of a 64-bit "history value" that is used as input in generating an interface identifier. The very first time the system boots (i.e., out-of-the-box), a random value should be generated using techniques that help ensure the initial value is hard to guess [RANDOM]. Whenever a new interface identifier is generated, a value generated by the computation is saved in the history value for the next iteration of the algorithm.
[ADDRCONF] describes the steps for generating a link-local address when an interface becomes enabled, and for generating addresses for other scopes. This document extends [ADDRCONF] in the following way:

1) When processing a Router Advertisement with a Prefix Information option carrying a global-scope prefix for the purposes of address autoconfiguration (i.e., the A bit is set), effectively ignore the Preferred Lifetime value. A value of 0 should be used instead. This deprecates the address, allowing it to be used for accepting incoming connections, but not (in general) for outgoing connections. In addition, for such Prefix Information options, perform the following steps.

2) Take the history value from the previous iteration of this algorithm (or a random value if there is no previous value) and append to it the interface identifier generated as described in [ADDRARCH].

3) Compute the MD5 message digest [MD5] over the quantity created in the previous step.

4) Take the left-most 64-bits of the MD5 digest and set bit 6 (the left-most bit is numbered 0) to zero. This creates an interface identifier with the universal/local bit indicating local significance only. Use the resultant identifier to generate an address as outlined in [ADDRCONF]. That is, use the interface identifier to generate a global-scope address.

5) Perform duplicate address detection (DAD) on the generated address. If DAD indicates the address is already in use, repeat steps 2-5 as appropriate up to 5 times. If after 5 consecutive attempts no unique address was generated, log a system error and give up attempting to generate a random address for that prefix.

6) Take the rightmost 64-bits of the MD5 digest computed in step 3) and save them in stable storage as the history value to be used in the next iteration of the algorithm.
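
Steps 2) through 4) and step 6) above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the function names and the sample prefix are hypothetical, the history value is held in memory rather than in stable storage, and neither the lifetime handling of step 1) nor duplicate address detection (step 5) is modeled.

```python
import hashlib
import ipaddress
import os
from typing import Tuple


def next_anonymous_identifier(history: bytes, ieee_iid: bytes) -> Tuple[bytes, bytes]:
    """Return (interface identifier, new history value), 8 bytes each."""
    if len(history) != 8 or len(ieee_iid) != 8:
        raise ValueError("history value and interface identifier must be 64 bits")
    # Steps 2 and 3: MD5 over the history value followed by the identifier.
    digest = hashlib.md5(history + ieee_iid).digest()
    # Step 4: left-most 64 bits, with bit 6 (left-most bit is bit 0) set to
    # zero. In the first octet that bit has the value 0x02.
    iid = bytes([digest[0] & 0xFD]) + digest[1:8]
    # Step 6: the right-most 64 bits become the next history value.
    return iid, digest[8:]


def anonymous_address(prefix: str, iid: bytes) -> ipaddress.IPv6Address:
    """Combine an advertised /64 prefix with the generated identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("expected a /64 prefix")
    return ipaddress.IPv6Address(network.network_address.packed[:8] + iid)


if __name__ == "__main__":
    history = os.urandom(8)  # initial random history value, see [RANDOM]
    ieee_iid = bytes.fromhex("365678fffe9abcde")  # hypothetical identifier
    for _ in range(3):
        iid, history = next_anonymous_identifier(history, ieee_iid)
        print(anonymous_address("2001:db8:1::/64", iid))
```

Feeding the right-most half of each digest back in as the next history value means the published identifiers (the left-most half) never reveal the state needed to compute their successors.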
MD5 was chosen for convenience, not because of strict requirements. IPv6 nodes are already required to implement MD5 as part of IPsec [IPSEC], thus the code will already be present on IPv6 machines.

In theory, generating successive interface identifiers using a history scheme as above has no advantages over generating them at random. In practice, however, generating truly random numbers can be tricky. Use of a history value is intended to avoid the particular scenario where two nodes generate the same interface identifier, both detect the situation via DAD, but then proceed to generate identical interface identifiers via the same (flawed) random number generation algorithm. The above algorithm avoids this problem by having the interface identifier (which will often be globally unique) used in the calculation that generates subsequent interface identifiers. Thus, if two nodes happen to generate the same interface identifier, they should generate different ones on the followup attempt.

3.2. In The Absence of Stable Storage

In the absence of stable storage, no history information will be available to generate a pseudo-random sequence of interface identifiers. Consequently, identifiers will need to be generated at random. A number of techniques might be appropriate. Consult [RANDOM] for suggestions on good sources for obtaining random numbers. Note that even though machines may not have stable storage for storing the previously used interface identifier, they will in many cases have configuration information that differs from one machine to another (e.g., user identity, security keys, serial numbers, etc.).
One approach to generating random interface identifiers in such cases is to use the configuration information to generate some data bits (which may remain constant for the life of the machine, but will vary from one machine to another), append some random data and compute the MD5 digest as before. The remaining details for generating addresses would be analogous to those of the previous section.

3.3. Regenerating Interface Identifiers

How often to change addresses depends on how a device is being used (e.g., how frequently it initiates new communication) and the concerns of the end user. The most egregious privacy concerns appear to involve addresses used for long periods of time (weeks to months to years). The more frequently an address changes, the less feasible collecting or coordinating information keyed on interface identifiers becomes. Moreover, the cost of collecting information and attempting to correlate it based on interface identifiers will only be justified if enough addresses contain such identifiers to make it worthwhile. Thus, having large numbers of clients change their address on a daily or weekly basis is likely to be sufficient to alleviate most privacy concerns.

There are also client costs associated with having a large number of addresses associated with a node (e.g., in doing address lookups). Thus, changing addresses frequently (e.g., every few minutes) may have performance implications.

This document recommends that implementations generate new addresses on a periodic basis of once per day. At that time, previously generated random addresses should be placed in a deprecated state. The valid lifetime for an anonymous address should be the minimum of a) the valid lifetime of the corresponding public address, and b) (a default value of) two weeks.
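
The lifetime rule above, together with the one-day default after which an anonymous address becomes deprecated, can be written as a short calculation. This is an illustrative sketch only: it reads the rule as "the smaller of" the two values, and the constant and function names are hypothetical rather than taken from this document.

```python
ONE_DAY = 24 * 60 * 60   # default regeneration interval, in seconds
TWO_WEEKS = 14 * ONE_DAY  # default cap on the valid lifetime, in seconds


def anonymous_lifetimes(public_valid_lifetime: int,
                        max_valid: int = TWO_WEEKS,
                        max_preferred: int = ONE_DAY):
    """Return (valid, preferred) lifetimes in seconds for an anonymous address.

    Assumption: both caps are user-configurable, with the defaults above.
    """
    # The valid lifetime never exceeds that of the public address.
    valid = min(public_valid_lifetime, max_valid)
    # The address is deprecated after one day, or sooner if it expires first.
    preferred = min(valid, max_preferred)
    return valid, preferred


if __name__ == "__main__":
    print(anonymous_lifetimes(30 * ONE_DAY))  # long-lived public address
    print(anonymous_lifetimes(3600))          # prefix valid for one hour only
```

Keeping the two caps as parameters reflects the expectation that the right values vary from one environment to another.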
The preferred lifetime for an anonymous address then is in effect the minimum of a) its valid lifetime, and b) (a default of) one day.

As an optional optimization, an implementation can remove a deprecated anonymous address that is not in use by applications or upper layers. For TCP connections, such information is available in control blocks. For UDP-based applications, it may be the case that only the applications have knowledge about what addresses are actually in use. Consequently, an implementation may need to use heuristics in deciding when an address is no longer in use (e.g., the two week default suggested above).

Because the precise frequency at which it is appropriate to generate new addresses varies from one environment to another, implementations should provide end users with the ability to change the frequency at which addresses are regenerated. The default value should be one day. In addition, the exact time at which to invalidate an anonymous address depends on how applications are used by end users. Thus the default value of two weeks may not be appropriate in all environments. Implementations should provide end users with the ability to override the default value.

4. Implications of Changing Interface Identifiers

The IPv6 addressing architecture goes to great lengths to ensure that interface identifiers are globally unique. During the IPng discussions of the GSE proposal [GSE-ANALYSIS], it was felt that keeping interface identifiers globally unique in practice might prove useful to future transport protocols. Usage of the algorithms in this document would eliminate that future flexibility.

The desire to protect individual privacy and the desire to effectively maintain and debug a network can conflict with each other. Having clients use addresses that change over time will make it more difficult to track down and isolate operational problems.
For example, when looking at packet traces, it could become more difficult to determine whether one is seeing behavior caused by a single errant machine, or by a number of them.

Some servers refuse to grant access to clients for which no DNS name exists. That is, they perform a DNS PTR query to determine the DNS name, and may then also perform an A query on the returned name to verify that the returned DNS name maps back into the address being used. Consequently, clients not properly registered in the DNS may be unable to access some services. As noted earlier, however, a node's DNS name (if non-changing) serves as a constant identifier. If the extension described in this document becomes widely deployed, servers will likely need to change their behavior to not require every address be in the DNS. One alternative is that DNS servers (for client machines) may need to fabricate "dummy" answers so that all addresses, whether used or not, appear to have DNS names associated with them. Another alternative is to register anonymous addresses in DNS using random names (for example a string version of the address itself).

5. Open Issues and Future Work

An implementation probably needs to keep track of which addresses are being used by upper layers so as to be able to remove an address from internal data structures once no upper layer protocols are using it (but not before). This is in contrast to current approaches where addresses are removed from an interface when they become invalid [ADDRCONF], independent of whether or not upper layer protocols are still using them. For TCP connections, such information is available in control blocks. For UDP-based applications, it may be the case that only the applications have knowledge about what addresses are actually in use.
Consequently, an implementation may need to use heuristics in deciding when an address is no longer in use (e.g., as is suggested in Section 3.3).

A node's permanent global addresses (i.e., those derived from a constant interface identifier) are placed in a deprecated state. This effectively prevents these addresses from being used to initiate future communication. In some cases, however, it may be desirable and even preferable to allow the permanent address to be used for new communication on an application-by-application basis. This may require API extensions.

Use of the extensions defined in this document is likely to make debugging and other operational troubleshooting activities more difficult. Consequently, it may be site policy that anonymous addresses should not be used. Should a system administrator (and not just the end user) have control of whether these algorithms are to be used? If so, it might make sense (for example) to define a bit in Router Advertisements or in the Prefix Information Option to indicate whether anonymity should be enabled or disabled.

6. Security Considerations

The motivation for this document stems from privacy concerns for individuals. This document does not appear to add any security issues beyond those already associated with stateless address autoconfiguration [ADDRCONF].

7. References

[ADDRARCH] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 2373, July 1998.

[ADDRCONF] Thomson, S. and T. Narten, "IPv6 Address Autoconfiguration", RFC 2462, December 1998.

[COOKIES] Kristol, D. and L. Montulli, "HTTP State Management Mechanism", draft-ietf-http-state-man-mec-12.txt.

[DHCP] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131, March 1997.

[DDNS] Vixie et al., "Dynamic Updates in the Domain Name System (DNS UPDATE)", RFC 2136, April 1997.

[DISCOVERY] Narten, T., Nordmark, E. and W.
Simpson, "Neighbor Discovery for IP Version 6 (IPv6)", RFC 2461, December 1998.

[GSE-ANALYSIS] Crawford et al., "Separating Identifiers and Locators in Addresses: An Analysis of the GSE Proposal for IPv6", draft-ietf-ipngwg-esd-analysis-04.txt.

[IPSEC] Kent, S. and R. Atkinson, "Security Architecture for the Internet Protocol", RFC 2401, November 1998.

[MD5] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April 1992.

[MOBILEIP] Perkins, C., "IP Mobility Support", RFC 2002, October 1996.

[RANDOM] Eastlake 3rd, D., Crocker, S. and J. Schiller, "Randomness Recommendations for Security", RFC 1750, December 1994.

[SERIALNUM] Moore, K., "Privacy Considerations for the Use of Hardware Serial Numbers in End-to-End Network Protocols", draft-iesg-serno-privacy-00.txt.

8. Authors' Addresses

Thomas Narten
IBM Corporation
P.O. Box 12195
Research Triangle Park, NC 27709-2195
USA

Phone: +1 919 254 7798
EMail: narten@raleigh.ibm.com

Richard Draves
Microsoft Research
One Microsoft Way
Redmond, WA 98052

EMail: richdr@microsoft.com

9. Appendix

This section describes a simple alternate algorithm for changing interface identifiers. Its main weakness is that it uses the same interface ID for all addresses, and does not distinguish between addresses used for initiating communication and those used by servers for accepting incoming connections.

The goal of this section is to define procedures that:

1) Result in a different interface identifier being generated at each system restart or attachment to a network.

2) Produce a sequence of interface identifiers that appear to be random in the sense that it is difficult for an outside observer to predict a future identifier based on a current one and it is difficult to determine previous identifiers knowing only the present one.

We describe two approaches.
The first assumes the presence of stable storage that can be used to record state history for use as input into the next iteration of the algorithm, i.e., after a system restart. A second approach addresses the case where stable storage is unavailable and the interface identifier must be generated at random.

9.1. When Stable Storage is Present

The following algorithm assumes the presence of a 64-bit "history value" that is used as input in generating an interface identifier. The very first time the system boots (i.e., out-of-the-box), any value can be used including all zeros. Whenever a new interface identifier is generated, its value is saved in the history value for the next iteration of the process.

Section 5.3 of [ADDRCONF] describes the steps for generating a link-local address when an interface becomes enabled. This document modifies that step in the following way. Rather than use interface identifiers generated as described in [ADDRARCH], the identifier is generated as follows:

1) Take the history value from the previous iteration (or 0 if there is no previous value) and append to it the interface identifier generated as described in [ADDRARCH].

2) Compute the MD5 message digest [MD5] over the quantity created in step 1).

3) Take the left-most 64 bits of the MD5 digest and set bit 6 (the left-most bit is numbered 0) to zero. This creates an interface identifier with the universal/local bit indicating local significance only.
Use the resultant identifier for generating addresses as outlined in [ADDRCONF]. That is, use the interface identifier to generate a link-local and other appropriate addresses.

4) Perform duplicate address detection (DAD) on the generated address. If DAD indicates the address is already in use, repeat steps 1-4 as appropriate up to 5 times. If after 5 consecutive attempts no unique address was generated, log a system error and give up attempting to generate an address from the current prefix.

5) Take the rightmost 64 bits of the MD5 digest computed in step 2) and save them in stable storage as the history value to be used in the next iteration of the algorithm.

MD5 was chosen for convenience, not because of strict requirements. IPv6 nodes are already required to implement MD5 as part of IPsec [IPSEC], thus the code will already be present on IPv6 machines.

9.2. In The Absence of Stable Storage

In the absence of stable storage, no history information will be available to generate a pseudo-random sequence of interface identifiers. Consequently, identifiers will need to be generated at random. A number of techniques might be appropriate. Consult [RANDOM] for suggestions on good sources for obtaining random numbers. Note that even though a machine may not have stable storage for storing the previously used interface identifier, it will in many cases have configuration information that differs from one machine to another (e.g., user identity, security keys, etc.).
One approach to generating random interface identifiers in such cases is to use the configuration information to generate some data bits (which may remain constant for the life of the machine, but will vary from one machine to another), append some random data and compute the MD5 digest as before. The remaining details for generating addresses would be analogous to those of the previous section.
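
As a rough illustration, the identifier computations of Section 9.1 (steps 1-3 and 5) and Section 9.2 might be sketched in Python as follows. This is a minimal sketch and not a reference implementation: the function names, the 8-byte encoding of the history value and of the [ADDRARCH] identifier, and the amount of random data (8 bytes from os.urandom) are assumptions made for illustration only. Duplicate address detection and the retry logic of step 4 are not shown, nor is saving the history value to stable storage.

```python
import hashlib
import os
from typing import Tuple

# Bit 6 of the first octet, counting the left-most bit as bit 0,
# is the universal/local bit of the interface identifier.
U_L_MASK = 0x02


def _digest_to_identifier(digest: bytes) -> bytes:
    """Return the left-most 64 bits of the digest with bit 6 set to zero."""
    first_octet = digest[0] & ~U_L_MASK & 0xFF
    return bytes([first_octet]) + digest[1:8]


def next_identifier(history: bytes, addrarch_id: bytes) -> Tuple[bytes, bytes]:
    """Section 9.1, steps 1-3 and 5 (hypothetical helper).

    history     -- 64-bit history value (all zeros on first use)
    addrarch_id -- 64-bit interface identifier formed as in [ADDRARCH]
    Returns (interface_identifier, new_history_value).
    """
    if len(history) != 8 or len(addrarch_id) != 8:
        raise ValueError("both inputs must be exactly 8 bytes")
    # Steps 1 and 2: MD5 over the history value followed by the identifier.
    digest = hashlib.md5(history + addrarch_id).digest()
    # Step 3: left-most 64 bits become the identifier.
    # Step 5: right-most 64 bits become the next history value.
    return _digest_to_identifier(digest), digest[8:]


def identifier_without_storage(config_bits: bytes, random_len: int = 8) -> bytes:
    """Section 9.2: hash per-machine configuration data plus random data."""
    digest = hashlib.md5(config_bits + os.urandom(random_len)).digest()
    return _digest_to_identifier(digest)
```

A caller with stable storage would pass the saved history value into next_identifier() and write the returned history value back once an address has been configured; a caller without stable storage would call identifier_without_storage() with whatever per-machine configuration bytes are available.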