SIP                                                         J. Rosenberg
Internet-Draft                                               dynamicsoft
Expires: December 28, 2004                                 June 29, 2004

   A Framework for Conferencing with the Session Initiation Protocol
             draft-ietf-sipping-conferencing-framework-02

   Compared against draft-ietf-sipping-conferencing-framework-03
   (SIPPING, Cisco Systems, October 18, 2004; expires April 18, 2005).

Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 28, 2004.

Copyright Notice

   Copyright (C) The Internet Society (2004).  All Rights Reserved.

Abstract

   The Session Initiation Protocol (SIP) supports the initiation,
   modification, and termination of media sessions between user agents.
   These sessions are managed by SIP dialogs, which represent a SIP
   relationship between a pair of user agents.  Because dialogs are
   between pairs of user agents, SIP's usage for two-party
   communications (such as a phone call) is obvious.  Communications
   sessions with multiple participants, generally known as conferencing,
   are more complicated.  This document defines a framework for how such
   conferencing can occur.
   This framework describes the overall architecture, terminology, and
   protocol components needed for multi-party conferencing.

Rosenberg               Expires December 28, 2004               [Page 1]
Internet-Draft           Conferencing Framework                June 2004

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Overview of Conferencing Architecture
       3.1  Usage of URIs
   4.  Functions of the Elements
       4.1  Focus
       4.2  Conference Policy Server
       4.3  Mixers
       4.4  Conference Notification Service
       4.5  Participants
       4.6  Conference Policy
   5.  Common Operations
       5.1  Creating Conferences
            5.1.1  SIP Mechanisms
            5.1.2  CPCP Mechanisms
            5.1.3  Non-Automated Mechanisms
       5.2  Adding Participants
            5.2.1  SIP Mechanisms
            5.2.2  CPCP Mechanisms
            5.2.3  Non-Automated Mechanisms
       5.3  Conditional Joins
       5.4  Removing Participants
            5.4.1  SIP Mechanisms
            5.4.2  CPCP Mechanisms
            5.4.3  Non-Automated Mechanisms
       5.5  Approving Policy Changes
       5.6  Creating Sidebars
       5.7  Destroying Conferences
            5.7.1  SIP Mechanisms
            5.7.2  CPCP Mechanisms
            5.7.3  Non-Automated Mechanisms
       5.8  Obtaining Membership Information
            5.8.1  SIP Mechanisms
            5.8.2  CPCP Mechanisms
            5.8.3  Non-Automated Mechanisms
       5.9  Adding and Removing Media
            5.9.1  SIP Mechanisms
            5.9.2  CPCP Mechanisms
            5.9.3  Non-Automated Mechanisms
       5.10 Conference Announcements and Recordings
       5.11 Floor Control
       5.12 Camera and Video Controls
   6.  Physical Realization
       6.1  Centralized Server
       6.2  Endpoint Server
       6.3  Media Server Component
       6.4  Distributed Mixing
       6.5  Cascaded Mixers
   7.  Security Considerations
   8.  Contributors
   9.  Changes from draft-ietf-sipping-conferencing-framework-00
   10. Changes since draft-rosenberg-sipping-conferencing-framework-01
   11. Changes since draft-rosenberg-sipping-conferencing-framework-00
   12. Informative References
       Author's Address
       Intellectual Property and Copyright Statements

1.  Introduction

   The Session Initiation Protocol (SIP) [1] supports the initiation,
   modification, and termination of media sessions between user agents.
   These sessions are managed by SIP dialogs, which represent a SIP
   relationship between a pair of user agents.  Because dialogs are
   between pairs of user agents, SIP's usage for two-party
   communications (such as a phone call) is obvious.  Communications
   sessions with multiple participants, however, are more complicated.

   SIP can support many models of multi-party communications.  One,
   referred to as loosely coupled conferences, makes use of multicast
   media groups.  In the loosely coupled model, there is no signaling
   relationship between participants in the conference.  There is no
   central point of control or conference server.  Participation is
   gradually learned through control information that is passed as part
   of the conference (using the RTP Control Protocol (RTCP) [2], for
   example).
   Loosely coupled conferences are easily supported in SIP by using
   multicast addresses within its session descriptions.

   In another model, referred to as fully distributed multiparty
   conferencing, each participant maintains a signaling relationship
   with each other participant, using SIP.  There is no central point of
   control; it is completely distributed amongst the participants.  This
   model is outside the scope of this document.

   In another model, sometimes referred to as the tightly coupled
   conference, there is a central point of control.  Each participant
   connects to this central point.  It provides a variety of conference
   functions, and may possibly perform media mixing functions as well.
   Tightly coupled conferences are not directly addressed by RFC 3261,
   although basic participation is possible without any additional
   protocol support.

   This document is one of a series of specifications that discuss
   tightly coupled conferences.  Here, we present the overall framework
   for tightly coupled conferencing, referred to simply as
   "conferencing" from this point forward.  This framework presents a
   general architectural model for these conferences, presents
   terminology used to discuss such conferences, and describes the sets
   of protocols involved in a conference.  The aim of the framework is
   to meet the general requirements for conferencing that are outlined
   in [3].

2.  Terminology

   Conference:  Conference is an overused term which has different
      meanings in different contexts.  In SIP, a conference is an
      instance of a multi-party conversation.  Within the context of
      this specification, a conference is always a tightly coupled
      conference.

   Loosely Coupled Conference:  A loosely coupled conference is a
      conference without coordinated signaling relationships amongst
      participants.  Loosely coupled conferences frequently use
      multicast for distribution of conference memberships.

   Tightly Coupled Conference:  A tightly coupled conference is a
      conference in which a single user agent, referred to as a focus,
      maintains a dialog with each participant.  The focus plays the
      role of the centralized manager of the conference, and is
      addressed by a conference URI.

   Focus:  The focus is a SIP user agent that is addressed by a
      conference URI and identifies a conference (recall that a
      conference is a unique instance of a multi-party conversation).
      The focus maintains a SIP signaling relationship with each
      participant in the conference.  The focus is responsible for
      ensuring, in some way, that each participant receives the media
      that make up the conference.  The focus also implements conference
      policies.  The focus is a logical role.

   Conference URI:  A URI, usually a SIP URI, which identifies the focus
      of a conference.

   Participant:  The software element that connects a user or automaton
      to a conference.  It implements, at a minimum, a SIP user agent,
      but may also include a conference policy control protocol client,
      for example.

   Conference Notification Service:  A conference notification service
      is a logical function provided by the focus.  The focus can act as
      a notifier [4], accepting subscriptions to the conference state,
      and notifying subscribers about changes to that state.  The state
      includes the state maintained by the focus itself, the conference
      policy, and the media policy.

   Conference Policy Server:  A conference policy server is a logical
      function which can store and manipulate the conference policy.
      The conference policy is the overall set of rules governing
      operation of the conference.  It is broken into membership policy
      and media policy.  Unlike the focus, there is not an instance of
      the conference policy server for each conference.  Rather, there
      is an instance of the membership and media policies for each
      conference.

   Conference Policy:  The complete set of rules for a particular
      conference manipulated by the conference policy server.
      It includes the membership policy and the media policy.  There is
      an instance of conference policy for each conference.

   Membership Policy:  A set of rules manipulated by the conference
      policy server regarding participation in a specific conference.
      These rules include directives on the lifespan of the conference,
      who can and cannot join the conference, definitions of roles
      available in the conference and the responsibilities associated
      with those roles, and policies on who is allowed to request which
      roles.

   Media Policy:  A set of rules manipulated by the conference policy
      server regarding the media composition of the conference.  The
      media policy is used by the focus to determine the mixing
      characteristics for the conference.  The media policy includes
      rules about which participants receive media from which other
      participants, and the ways in which that media is combined for
      each participant.  In the case of audio, these rules can include
      the relative volumes at which each participant is mixed.  In the
      case of video, these rules can indicate whether the video is
      tiled, whether the video indicates the loudest speaker, and so on.

   Conference Policy Control Protocol (CPCP):  The protocol used by
      clients to manipulate the conference policy.

   Mixer:  A mixer receives a set of media streams of the same type, and
      combines their media in a type-specific manner, redistributing the
      result to each participant.  This includes media transported using
      RTP (RFC 1889).  As a result, the term defined here is a superset
      of the mixer concept defined in RFC 1889, since it allows for
      non-RTP-based media such as instant messaging sessions [5].

   Conference-Unaware Participant:  A conference-unaware participant is
      a participant in a conference that is not aware that it is
      actually in a conference.  As far as the UA is concerned, it is a
      point-to-point call.

   Cascaded Conferencing:  A mechanism for group communications in which
      a set of conferences are linked by having their focuses interact
      in some fashion.

   Simplex Cascaded Conferences:  A group of conferences which are
      linked such that the user agent which represents the focus of one
      conference is a conference-unaware participant in another
      conference.

   Conference-Aware Participant:  A conference-aware participant is a
      participant in a conference that has learned, through automated
      means, that it is in a conference, and that can use a conference
      policy control protocol, media policy control protocol, or
      conference subscription, to implement advanced functionality.

   Conference Server:  A conference server is a physical server which
      contains, at a minimum, the focus.  It may also include a
      conference policy server and mixers.

   Mass Invitation:  A conference policy control protocol request to
      invite a large number of users into the conference.

   Mass Ejection:  A conference policy control protocol request to
      remove a large number of users from the conference.

   Sidebar:  A sidebar appears to the users within the sidebar as a
      "conference within the conference".  It is a conversation amongst
      a subset of the participants to which the remaining participants
      are not privy.

   Anonymous Participant:  An anonymous participant is one that is known
      to other participants through the conference notification service,
      but whose identity is being withheld.

   Hidden Participant:  A hidden participant is one that is not known to
      other participants in the conference.  They may be known to the
      moderator, depending on conference policy.

3.  Overview of Conferencing Architecture

                            +-----------+
                            |           |
                            |           |
                            |Participant|
                            |     4     |
                            |           |
                            +-----------+
                                  |
                                  |SIP
                                  |Dialog
                                  |4
                                  |
   +-----------+            +-----------+            +-----------+
   |           |            |           |            |           |
   |           |            |           |            |           |
   |Participant|------------|   Focus   |------------|Participant|
   |     1     |    SIP     |           |    SIP     |     3     |
   |           |   Dialog   |           |   Dialog   |           |
   +-----------+     1      +-----------+     3      +-----------+
                                  |
                                  |
                                  |SIP
                                  |Dialog
                                  |2
                                  |
                            +-----------+
                            |           |
                            |           |
                            |Participant|
                            |     2     |
                            |           |
                            +-----------+

                               Figure 1

   The central component (literally) in a SIP conference is the focus.
   The focus maintains a SIP signaling relationship with each
   participant in the conference.  The result is a star topology, shown
   in Figure 1.

   The focus is responsible for making sure that the media streams which
   constitute the conference are available to the participants in the
   conference.  It does that through the use of one or more mixers, each
   of which combines a number of input media streams to produce one or
   more output media streams.  The focus uses the media policy to
   determine the proper configuration of the mixers.

   The focus has access to the conference policy (composed of the
   membership and media policies), an instance of which exists for each
   conference.  Effectively, the conference policy can be thought of as
   a database which describes the way that the conference should
   operate.  It is the responsibility of the focus to enforce those
   policies.  Not only does the focus need read access to the database,
   but it needs to know when it has changed.  Such changes might result
   in SIP signaling (for example, the ejection of a user from the
   conference using BYE), and most changes will require a notification
   to be sent to subscribers using the conference notification service.

   The conference is represented by a URI, which identifies the focus.
   Each conference has a unique focus and a unique URI identifying that
   focus.
   Requests to the conference URI are routed to the focus for that
   specific conference.

   Users usually join the conference by sending an INVITE to the
   conference URI.  As long as the conference policy allows, the INVITE
   is accepted by the focus and the user is brought into the conference.
   Users can leave the conference by sending a BYE, as they would in a
   normal call.  Similarly, the focus can terminate a dialog with a
   participant, should the conference policy change to indicate that the
   participant is no longer allowed in the conference.  A focus can also
   initiate an INVITE, should the conference policy indicate that the
   focus needs to bring a participant into the conference.

   The notion of a conference-unaware participant is important in this
   framework.  A conference-unaware participant does not even know that
   the UA it is communicating with happens to be a focus.  As far as it
   is concerned, it is a UA just like any other.  The focus, of course,
   knows that it is a focus, and it performs the tasks needed for the
   conference to operate.

   Conference-unaware participants have access to a good deal of
   functionality.  They can join and leave conferences using SIP, and
   obtain more advanced features through stimulus signaling, as
   discussed in [6].
   However, if the participant wishes to explicitly control aspects of
   the conference using functional signaling protocols, the participant
   must be conference-aware.

                    .....................................
                    .                                   .
                    .                                   .
                    .                     Conference    .
                    .                       Policy      .
     Conference     .                                   .
     Policy         .  +-----------+       //-----\\    .
     Control        .  |           |      ||       ||   .
     Protocol       .  | Conference|       \\-----//    .
     +---------------->| Policy    |       |       |    .
     |              .  | Server    |---->  |Membership  .
     |              .  |           |       |       |    .
     |              .  +-----------+       |   &   |    .
     |              .                      |       |    .
     |              .                      | Media |    .
+-----------+       .  +-----------+       | Policy|    .
|           |       .  |           |        \    //     .
|           |       .  |           |        \-----/     .
|Participant|<-------->|   Focus   |           |        .
|           | SIP   .  |           |           |        .
|           | Dialog.  |           |<----------+        .
+-----------+       .  |...........|                    .
     ^              .  | Conference|                    .
     |              .  |Notification                    .
     +---------------->| Service   |                    .
     Subscription   .  +-----------+                    .
                    .                                   .
                    .....................................

                           Conference Functions

                                Figure 2

   A conference-aware participant is one that has access to advanced
   functionality through additional protocol interfaces.  The client
   uses these protocols to interact with the conference policy server
   and the focus.  A model for this interaction is shown in Figure 2.
   The participant can interact with the focus using extensions, such as
   REFER, in order to access enhanced call control functions [7].  The
   participant can SUBSCRIBE to the conference URI, and be connected to
   the conference notification service provided by the focus.  Through
   this mechanism, it can learn about changes in participants
   (effectively, the state of the dialogs), the media policy, and the
   membership policy.  The participant can communicate with the
   conference policy server using a conference policy control protocol.
   Through this protocol, it can affect the conference policy.  The
   conference policy server need not be available in any particular
   conference, although there is always a conference policy.

   The interfaces between the focus and the conference policy, and the
   conference policy server and the conference policy, are not subject
   to standardization at the time of this writing.  They are intended
   primarily to show the logical roles involved in a conference, as
   opposed to suggesting a physical decomposition.  The separation of
   these functions is documented here to encourage clarity in the
   requirements and to allow individual implementations the flexibility
   to compose a conferencing system in a scalable and robust manner.

3.1  Usage of URIs

   It is fundamental to this framework that a conference is uniquely
   identified by a URI, and that this URI identifies the focus which is
   responsible for the conference.  The conference URI is unique, such
   that no two conferences have the same conference URI.  A conference
   URI is always a SIP or SIPS URI.

   The conference URI is opaque to any participants which might use it.
   There is no way to look at the URI, and know for certain whether it
   identifies a focus, as opposed to a user or an interface on a PSTN
   gateway.  This is in line with the general philosophy of URI usage
   [8].  However, contextual information surrounding the URI (for
   example, SIP header parameters) may indicate that the URI represents
   a conference.

   When a SIP request is sent to the conference URI, that request is
   routed to the focus, and only to the focus.  The element or system
   that creates the conference URI is responsible for guaranteeing this
   property.

   The conference URI can represent a long-lived conference or interest
   group, such as "sip:discussion-on-dogs@example.com".  The focus
   identified by this URI would always exist, and always be managing the
   conference for whatever participants are currently joined.  Other
   conference URIs can represent short-lived conferences, such as an
   ad-hoc conference.

   Ideally, a conference URI is never constructed or guessed by a user.
   Rather, conference URIs are learned through many mechanisms.  A
   conference URI can be emailed or sent in an instant message.  A
   conference URI can be linked on a web page.  A conference URI can be
   obtained from a conference policy control protocol, which can be used
   to create conferences and the policies associated with them.

   To determine that a SIP URI does represent a focus, standard
   techniques for URI capability discovery can be used.  Specifically,
   the callee capabilities specification [9] provides the "isfocus"
   feature tag to indicate that the URI is a focus.  Caller preferences
   parameters are also used to indicate that a focus supports the
   conference notification service.  This is done by declaring support
   for the SUBSCRIBE method and the relevant package(s) in the caller
   preferences feature parameters associated with the conference URI.

   The other functions in a conference are also represented by URIs.  If
   the conference policy server is implemented through web pages, this
   server is identified by HTTP URIs.  If it is accessed using an
   explicit protocol, it is a URI defined for that protocol.

   Starting with the conference URI, the URIs for the other logical
   entities in the conference can be learned using the conference
   notification service.

4.  Functions of the Elements

   This section gives a more detailed description of the functions
   typically implemented in each of the elements.

4.1  Focus

   As its name implies, the focus is the center of the conference.  All
   participants in the conference are connected to it by a SIP dialog.
   The focus is responsible for maintaining the dialogs connected to it.
   It ensures that the dialogs are connected to a set of participants
   who are allowed to participate in the conference, as defined by the
   membership policy.  The focus also uses SIP to manipulate the media
   sessions, in order to make sure each participant obtains all the
   media for the conference.  To do that, the focus makes use of mixers.
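
   As a rough illustration of the "isfocus" feature tag described in
   Section 3.1, the sketch below checks whether a single Contact header
   value carries that tag as a header parameter.  The function names
   and the simplified parsing are assumptions made for this example
   (quoted display names containing ">" or ";" are not handled); it is
   not a complete SIP header parser.

```python
from typing import Dict, Optional


def contact_params(contact: str) -> Dict[str, Optional[str]]:
    """Return the header parameters of one Contact header value.

    Hypothetical helper for illustration only; real SIP stacks ship
    their own, far more careful, header parsers.
    """
    if ">" in contact:
        # name-addr form: header parameters follow the closing bracket,
        # parameters inside <...> belong to the URI itself.
        _, _, tail = contact.rpartition(">")
    else:
        # bare addr-spec form: everything after the first ';' is
        # treated as a header parameter.
        _, _, tail = contact.partition(";")

    params: Dict[str, Optional[str]] = {}
    for piece in tail.split(";"):
        piece = piece.strip()
        if not piece:
            continue
        name, sep, value = piece.partition("=")
        # Parameter names are compared case-insensitively.
        params[name.strip().lower()] = value.strip() if sep else None
    return params


def is_focus(contact: str) -> bool:
    """True if the Contact value advertises the "isfocus" feature tag."""
    return "isfocus" in contact_params(contact)
```

   A participant that sees the tag on the remote Contact can treat the
   dialog as a conference and, for instance, decide to SUBSCRIBE to the
   conference URI; a participant that never looks for it remains
   conference-unaware.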
Whenone conference is afocus receives an INVITE, it checks the membership policy. The membership policy might indicate that thisconference-unaware participant in another conference. Conference-Aware Participant: A conference-aware participant isnot allowed to join,a participant inwhich case the call can be rejected. It might indicatea conference thatanother participant, acting ashas learned, through automated means, that it is in amoderator, needs to approve this new participant. Inconference, and thatcase, the INVITE might be parked oncan use a conference policy control protocol, media policy control protocol, or conference subscription, to implement advanced functionality. Conference Server: A conference server is a physical server which contains, at amusic-on-hold server, orminimum, the focus. It may also include a183 response might be sent to indicate progress.conference policy server and mixers. Rosenberg Expires April 18, 2005 [Page 5] Internet-Draft Conferencing Framework October 2004 Mass Invitation: Anotification, using theconferencenotification service, would be sent to the moderator. The moderator then has the abilitypolicy control protocol request tomanipulate the policies usinginvite a large number of users into the conference. Mass Ejection: A conference policy controlprotocol. Ifprotocol request to remove a large number of users from thepolicies are changedconference. Sidebar: A sidebar appears toallow this new participant,thefocus can acceptusers within theINVITE (or unpark it fromsidebar as a "conference within themusic-on-hold server). The interpretationconference". It is a conversation amongst a subset of themembership policy byparticipants to which thefocus is, itself, a matter of local policy, andremaining participants are notsubject to standardization. If aprivy. 
Anonymous Participant: An anonymous participant is one that is known to other participants through the conference notification service, but whose identity is being withheld.

3. Overview of Conferencing Architecture

                        +-----------+
                        |           |
                        |           |
                        |Participant|
                        |     4     |
                        |           |
                        +-----------+
                              |
                              |SIP
                              |Dialog
                              |4
                              |
+-----------+           +-----------+            +-----------+
|           |           |           |            |           |
|           |           |           |            |           |
|Participant|-----------|   Focus   |------------|Participant|
|     1     |    SIP    |           |    SIP     |     3     |
|           |  Dialog   |           |   Dialog   |           |
+-----------+     1     +-----------+     3      +-----------+
                              |
                              |
                              |SIP
                              |Dialog
                              |2
                              |
                        +-----------+
                        |           |
                        |           |
                        |Participant|
                        |     2     |
                        |           |
                        +-----------+

                          Figure 1

The central component (literally) in a SIP conference is the focus. The focus maintains a SIP signaling relationship with each participant in the conference. The result is a star topology, shown in Figure 1.

The focus is responsible for making sure that the media streams which constitute the conference are available to the participants in the conference. It does that through the use of
one or more mixers, each of which combines a number of input media streams to produce one or more output media streams. The focus uses the media policy to determine the proper configuration of the mixers.

The focus has access to the conference and media policies; one instance of each exists for every conference. Effectively, the conference policy can be thought of as a database which describes the way that the conference should operate. It is the responsibility of the focus to enforce those policies. Not only does the focus need read access to the database, but it also needs to know when it has changed. Such changes might result in SIP signaling (for example, the ejection of a user from the conference using BYE), and most changes will require a notification to be sent to subscribers using the conference notification service. Further details on conference and media policy are provided in the XCON framework document [16].

The conference is represented by a
URI, which identifies the focus. Each conference has a unique focus and a unique URI identifying that focus. Requests to the conference URI are routed to the focus for that specific conference.

Users usually join the conference by sending an INVITE to the conference URI. As long as the conference policy allows, the INVITE is accepted by the focus and the user is brought into the conference. Users can leave the conference by sending a BYE, as they would in a normal call. Similarly, the focus can terminate a dialog with a participant, should the conference policy change to indicate that the participant is no longer allowed in the conference. A focus can also initiate an INVITE, should the conference policy indicate that the focus needs to bring a participant into the conference.

The notion of a conference-unaware participant is important in this framework. A conference-unaware participant does not even know that the UA it is communicating with happens to be a focus. As far as it's concerned, it's a UA just like any other. The focus, of course, knows that it's a focus, and it performs the tasks needed for the conference to operate.
Conference-unaware participants have access to a good deal of functionality. They can join and leave conferences using SIP, and obtain more advanced features through stimulus signaling, as discussed in [6]. However, if the participant wishes to explicitly control aspects of the conference using functional signaling protocols, the participant must be conference-aware.

                    .....................................
                    .                                   .
                    .                                   .
                    .                                   .
                    .                                   .
                    .                       Conference  .
                    .                         Policy    .
      Conference    .                                   .
        Policy      .   +-----------+        //-----\\  .
       Control      .   |           |       ||       || .
       Protocol     .   | Conference|        \\-----//  .
      +---------------->|  Policy   |        |       |  .
      |             .   |  Server   |---->   |Membership.
      |             .   |           |        |       |  .
      |             .   +-----------+        |   &   |  .
      |             .                        |       |  .
      |             .                        | Media |  .
+-----------+       .   +-----------+        | Policy|  .
|           |       .   |           |        \      //  .
|           |       .   |           |         \-----/   .
|Participant|<--------->|   Focus   |            |      .
|           |  SIP  .   |           |            |      .
|           | Dialog.   |           |<-----------+      .
+-----------+       .   |...........|                   .
          ^         .   | Conference|                   .
          |         .   |Notification                   .
          +------------>|  Service  |                   .
        Subscription.   +-----------+                   .
                    .                                   .
                    .                                   .
                    .                                   .
                    .                                   .
                    .....................................

                                 Conference
                                  Functions

                                  Figure 2

A conference-aware participant is one that has access to advanced functionality through additional protocol interfaces. The client uses these protocols to interact with the conference policy server and the focus.
A model for this interaction is shown in Figure 2. The participant can interact with the focus using extensions, such as REFER, in order to access enhanced call control functions [7]. The participant can SUBSCRIBE to the conference URI, and be connected to the conference notification service provided by the focus. Through this mechanism, it can learn about changes in participants (effectively, the state of the dialogs), the media policy, and the membership policy.

The participant can communicate with the conference policy server using a conference policy control protocol. Through this protocol, it can affect the conference policy. The conference policy server need not be available in any particular conference, although there is always a conference policy.

The interfaces between the focus and the conference policy, and between the conference policy server and the conference policy, are detailed in the XCON framework document [16]. For the purposes of SIP-based conferencing, they serve as logical roles involved in a
conference, as opposed to representing a physical decomposition. The separation of these functions is documented here to encourage clarity in the requirements and to ensure compatibility between SIP based conferencing and the extensions to the framework described in [16]. More importantly, this approach provides individual SIP implementations the flexibility to compose a conferencing system in a scalable and robust manner without requiring the complete development of these interfaces.

3.1 Usage of URIs

It is fundamental to this framework that a conference is uniquely identified by a URI, and that this URI identifies the focus which is responsible for the conference. The conference URI is unique, such that no two conferences have the same conference URI. A conference URI is always a SIP or SIPS URI.

The conference URI is opaque to any participants which might use it. There is no way to look at the URI, and know for certain whether it identifies a focus, as opposed to a user or an interface on a PSTN gateway. This is in line with the general philosophy of URI usage [8]. However, contextual information surrounding the
URI (for example, SIP header parameters) may indicate that the URI represents a conference.

When a SIP request is sent to the conference URI, that request is routed to the focus, and only to the focus. The element or system that creates the conference URI is responsible for guaranteeing this property.

The conference URI can represent a long-lived conference or interest group, such as "sip:discussion-on-dogs@example.com". The focus identified by this URI would always exist, and always be managing the conference for whatever participants are currently joined. Other conference URIs can represent short-lived conferences, such as an ad-hoc conference.

Ideally, a conference URI is never constructed or guessed by a user. Rather, conference URIs are learned through many mechanisms. A conference URI can be emailed or sent in an instant message. A conference URI can be linked on a web page. A conference URI can be obtained from a conference policy control protocol, which can be used to create conferences and the
policies associated with them.

To determine that a SIP URI does represent a focus, standard techniques for URI capability discovery can be used. Specifically, the callee capabilities specification [9] provides the "isfocus" feature tag to indicate that the URI is a focus. Caller preferences parameters are also used to indicate that a focus supports the conference notification service. This is done by declaring support for the SUBSCRIBE method and the relevant package(s) in the caller preferences feature parameters associated with the conference URI.

The other functions in a conference are also represented by URIs. If the conference policy server is implemented through web pages, this server is identified by HTTP URIs. If it is accessed using an explicit protocol, it is identified by a URI defined for that protocol.

Starting with the conference URI, the URIs for the
other logical entities in the conference can be learned using the conference notification service.

4. Functions of the Elements

This section gives a more detailed description of the functions typically implemented in each of the elements.

4.1 Focus

As its name implies, the focus is the center of the conference. All participants in the conference are connected to it by a SIP dialog. The focus is responsible for maintaining the dialogs connected to it. It ensures that the dialogs are connected to a set of participants who are allowed to participate in the conference, as defined by the membership policy. The focus also uses SIP to manipulate the
media sessions, in order to make sure each participant obtains all the media for the conference. To do that, the focus makes use of mixers.

When a focus receives an INVITE, it checks the membership policy. The membership policy might indicate that this participant is not allowed to join, in which case the call can be rejected. It might indicate that another participant, acting as a moderator, needs to approve this new participant. In that case, the INVITE might be parked on a music-on-hold server, or a 183 response might be sent to indicate progress. A notification, using the conference notification service, would be sent to the moderator. The moderator then has the ability to manipulate the policies using the conference policy control protocol. If the policies are changed to allow this new participant, the focus can accept the INVITE (or unpark it from the music-on-hold server). The interpretation of the membership policy by the focus is, itself, a matter of local policy, and not subject to standardization.
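The INVITE handling described above can be sketched as a small decision function. This is only an illustration under stated assumptions, not part of the framework: the Decision enum, the MembershipPolicy class with its allowed, moderated and moderator fields, and the notify callback are all hypothetical names, and a real focus is free to interpret its membership policy differently.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"   # answer the INVITE and bring the user in
    REJECT = "reject"   # refuse the call
    PARK = "park"       # hold the INVITE until a moderator decides


@dataclass
class MembershipPolicy:
    # Hypothetical policy model: an allow list, a set of URIs whose
    # joins need approval, and the URI of the moderator.
    allowed: set = field(default_factory=set)
    moderated: set = field(default_factory=set)
    moderator: str = ""


def handle_invite(policy, caller_uri, notify):
    """Decide what a focus does with an incoming INVITE."""
    if caller_uri in policy.allowed:
        return Decision.ACCEPT
    if caller_uri in policy.moderated and policy.moderator:
        # Stand-in for the conference notification service: tell the
        # moderator, then park the call (music-on-hold or a 183).
        notify(policy.moderator, caller_uri)
        return Decision.PARK
    return Decision.REJECT
```

Once the moderator changes the policy so that the caller appears in the allow list, calling handle_invite again yields ACCEPT, which corresponds to unparking the INVITE.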
If a participant manipulated the membership policy to indicate that a certain other participant was no longer allowed in the conference, the focus would send a BYE to that other participant to remove them. This is often referred to as "ejecting" a user from the conference. The process of ejecting consists of two steps: the establishment of the policy through the conference policy protocol, and the implementation of that policy (using a BYE) by the focus.

Similarly, if a user manipulated the membership policy to indicate that a number of users need to be added to the conference, the focus would send an INVITE to those participants. This is often referred to as the "mass invitation" function. As with ejection, it is fundamentally composed of the policy functions that specify the participants which should be present, and the implementation of those functions. A policy request to add a set of users might not require an INVITE to execute it; those users might already be participants in the conference.
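Ejection and mass invitation can both be viewed as the focus reconciling its current dialogs against the membership policy. The following is a minimal sketch of that comparison, assuming the policy can be reduced to two sets of URIs; the function name plan_signaling and the parameters connected, allowed and wanted are hypothetical and chosen only for illustration.

```python
def plan_signaling(connected, allowed, wanted):
    """Work out which SIP requests a focus would need to send.

    connected: URIs that currently have a dialog with the focus
    allowed:   URIs the membership policy permits in the conference
    wanted:    URIs the policy says should be brought in
    Returns (to_invite, to_bye) as sorted lists.
    """
    # Ejection: anyone connected but no longer allowed gets a BYE.
    to_bye = sorted(connected - allowed)
    # Mass invitation: invite wanted users, skipping those who are
    # already participants and those the policy does not allow.
    to_invite = sorted((wanted & allowed) - connected)
    return to_invite, to_bye
```

For example, with participants a and b connected, a policy allowing a, c and d, and a request to add a and c, the sketch yields one INVITE (to c) and one BYE (to b); no INVITE is generated for a because it is already in the conference.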
A similar model exists for media policy. If the media policy indicates that a participant should not receive any video, the focus might implement that policy by sending a re-INVITE, removing the media stream to that participant. Alternatively, if the video is being centrally mixed, it could inform the mixer to send a black screen to that participant. The means by which the policy is implemented are not subject to specification.

4.2 Conference Policy Server

The conference policy server allows clients to manipulate and interact with the conference policy. The conference policy is used by the focus to make authorization decisions and guide its overall behavior. Logically speaking, there is a one-to-one mapping between a conference policy and a focus.
Further details on the functionality of the policy server, and on access to it, are provided in the XCON framework document [16].

4.3 Mixers

A mixer is responsible for combining the media streams that make up the conference, and generating one or more output streams that are distributed to recipients (which could be participants or other mixers). The process of combining media is specific to the media type, and is directed by the focus, under the guidance of the rules described in the media policy.

A mixer is not aware of a "conference" as an entity, per se. A mixer receives media streams as inputs, and based on directions provided by the focus, generates media streams as outputs. There is no grouping of media streams beyond the policies that describe the ways in which the streams are mixed.

A mixer is always under the control of a focus, either directly or indirectly. The focus is responsible for interpreting the media policy, and then installing the appropriate rules in the mixer.
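As an illustration of a mixer that knows nothing about conferences and only applies rules installed by a focus, the following sketch mixes audio samples using per-recipient gains. It is an assumption-laden toy, not a description of any real mixer: the function name mix, the representation of streams as lists of float samples, and the shape of the rules table are all hypothetical.

```python
def mix(inputs, rules):
    """Combine input streams into one output stream per recipient.

    inputs: {source: [samples]} as received by the mixer
    rules:  {recipient: {source: gain}} as installed by the focus
    """
    outputs = {}
    for recipient, gains in rules.items():
        # The output is as long as the longest contributing input.
        length = max(
            (len(inputs[s]) for s in gains if s in inputs), default=0
        )
        out = [0.0] * length
        for source, gain in gains.items():
            for i, sample in enumerate(inputs.get(source, [])):
                out[i] += gain * sample  # weighted sum of sources
        outputs[recipient] = out
    return outputs
```

In this sketch the relative volumes from the media policy appear as gains, and a participant is kept from hearing itself simply because the focus does not list it as a source in its own rule.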
If the focus is directly controlling a mixer, the mixer can either be co-resident with the focus, or can be controlled through some kind of protocol. If the focus is indirectly controlling a mixer, it delegates the mixing to the participants, each of which has their own mixer. This is described in Section 6.4.

4.4 Conference Notification Service

The focus can provide a conference notification service. In this role, it acts as a notifier, as defined in RFC 3265 [4]. It accepts subscriptions from clients for the conference URI, and generates notifications to them as the state of the conference changes. This state is composed of two separate pieces. The first is the state of the focus and the second is the conference policy. A subscriber to the
conference notification service can use capabilities defined in the SIP events framework [4] to request that it receive focus state changes only, conference policy changes only, or both.

The state of the focus includes the participants connected to the focus, and information about the dialogs associated with them. As new participants join, this state changes, and is reported through the notification service. Similarly, when someone leaves, this state also changes, allowing subscribers to learn about this fact.

Conference notification associated with changes to the conference policies is discussed in [16].

4.5 Participants

A participant in a conference is any SIP user agent that has a dialog with the focus. This SIP user agent can be a PC application, a SIP hardphone, or a PSTN gateway. It can also be another focus. A conference which has a participant that is the focus of another conference
is called a simplex cascaded conference. Simplex cascaded conferences can also be used to provide scalable conferences where there are regional sub-conferences, each of which is connected to the main conference.

4.6 Conference Policy

The conference policy contains the rules that guide the operation of the focus. The rules can be simple, such as an access list that defines the set of allowed participants in a conference. The rules can also be incredibly complex, specifying time-of-day based rules on participation conditional on the presence of other participants. It is important to understand that there is no restriction on the type of rules that can be encapsulated in a conference policy.

The conference policy can be manipulated using web applications or voice applications. It can also be manipulated with proprietary protocols. The conference policy control protocol is proposed as a standardized means of
manipulating the conference policy. Further details on the conference policy and the conference policy control protocol are provided in [16].

5. Common Operations

There are a large number of ways in which users can interact with a conference. They can join, leave, set policies, approve members, and so on. This section is meant as an overview of the major conferencing operations, summarizing how they operate. More detailed examples of the SIP mechanisms can be found in [7].

As well as providing an overview of the common conferencing operations, each of the subsections in this section of the document provides a description of the SIP mechanism for supporting the operation. Non-SIP mechanisms are discussed in the XCON framework document [16].

5.1 Creating Conferences

There are many ways in which a conference can be created.
The creation of a conference actually constructs several elements all at the same time. It results in the creation of a focus and a conference policy. It also results in the construction of a conference URI, which uniquely identifies the focus. Since the conference URI needs to be unique, the element which creates conferences is responsible for guaranteeing that uniqueness. This can be accomplished deterministically, by keeping records of conference URIs or by generating URIs algorithmically, or probabilistically, by creating random URIs with a sufficiently low probability of collision.

When a media and conference policy are created, they are established with default rules that are implementation dependent. If the creator of the conference wishes to change those rules, they would do so using a non-SIP mechanism.

SIP can be used to create conferences hosted in a central server by sending an INVITE to a conferencing application that would automatically create a new conference and then place a user into it.

Creation of conferences where the focus resides in an endpoint operates differently. There, the endpoint itself creates the conference URI, and hands it out to other endpoints which are to be the participants. What differs from case to case is how the endpoint decides to create a conference.

One important case is the ad-hoc conference described in Section 6.2. There, an endpoint unilaterally decides to create the conference based on local policy. The dialogs that were connected to the UA are migrated to the endpoint-hosted focus, using a re-INVITE to pass the conference URI to the newly joined participants.

Alternatively, one UA can ask another UA to create an endpoint-hosted conference. This is accomplished with the SIP Join header [10]. The UA which receives the Join header in an invitation may need to create a new conference URI (a new one is not needed if the dialog that is being joined is already part of a conference). The conference URI is then handed to the recently joined participants through a re-INVITE.

5.2 Adding Participants

There are many mechanisms for adding participants to a conference. In all cases, participant additions can be first party (a user adds themself) or third party (a user adds another user).

First party additions using SIP are trivially accomplished with a standard INVITE. A participant can send an INVITE request to the conference URI, and if the conference policy allows them to join, they are added to the conference.

If a UA does not know the conference URI, but has learned about a dialog which is connected to a conference (by using the dialog event package [11], for example), the UA can join the conference by using the Join header to join the dialog.

Third party additions with SIP are done using REFER [12]. The client can send a REFER request to the participant, asking them to send an INVITE request to the conference URI. Additionally, the client can send a REFER request to the focus, asking it to send an INVITE to the participant. The latter technique has the benefit of allowing a client to add a conference-unaware participant that does not support the REFER method.

5.3 Removing Participants

As with additions, there are several mechanisms for departures. Removals can also be first party or third party.

First party departures are trivially accomplished by sending a BYE request to the focus. This terminates the dialog with the focus and removes the participant from the conference.

Third party departures can also be done using SIP, through the REFER method.

5.4 Creating Sidebars

A sidebar is a "conference within a conference", allowing a subset of the participants to converse amongst themselves. Frequently, participants in a sidebar will still receive media from the main conference, but "in the background".
For audio, this may mean that the volume of the media is reduced, for example.

A sidebar is represented by a separate conference URI. This URI is a type of "alias" for the main conference URI.

5.5 Destroying Conferences

Conferences can be destroyed in several ways. Generally, whether those means are applicable for any particular conference is a component of the conference policy.

When a conference is destroyed, the conference and media policies associated with it are destroyed. Any attempt to read or write those policies results in a protocol error. Furthermore, the conference URI becomes invalid.
Any attempt to send an INVITE to it, or SUBSCRIBE to it, would result in a SIP error response.

Typically, if a conference is destroyed while there are still participants, the focus would send a BYE to those participants before actually destroying the conference. Similarly, if there were any users subscribed to the conference notification service, those subscriptions would be terminated by the server before the actual destruction.

There is no explicit means in SIP to destroy a conference. However, a conference may be destroyed as a by-product of a user leaving the conference, which can be done with BYE. In particular, if the conference policy states that the conference is destroyed once the last user leaves, when that user does leave (using a SIP BYE request), the conference is destroyed.

5.6 Obtaining Membership Information

A participant in a conference will frequently wish to know the set of other users in the conference. This information can be obtained in many ways.

The conference notification service allows a conference-aware participant to subscribe to it, and receive notifications that contain the list of participants. When a participant joins or leaves, subscribers are notified. The conference notification service also allows a user to do a "fetch" [4] to obtain the current listing.
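As an illustration of the fetch operation on the conference notification service, the following Python sketch builds a SIP SUBSCRIBE request with an Expires value of zero, which RFC 3265 [4] treats as a one-time fetch of the current state. The event package name "conference", the example URIs, and the function name build_fetch_subscribe are assumptions made for illustration only; this framework does not define them.

```python
import uuid


def build_fetch_subscribe(conference_uri, subscriber_uri, contact_uri,
                          via_host, event_package="conference"):
    """Return the text of a SUBSCRIBE that fetches state once (Expires: 0).

    All argument values are supplied by the caller; the default event
    package name is an assumption, not something this framework mandates.
    """
    branch = "z9hG4bK" + uuid.uuid4().hex      # RFC 3261 branch magic cookie
    from_tag = uuid.uuid4().hex[:8]
    call_id = uuid.uuid4().hex + "@" + via_host
    lines = [
        f"SUBSCRIBE {conference_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <{conference_uri}>",
        f"From: <{subscriber_uri}>;tag={from_tag}",
        f"Call-ID: {call_id}",
        "CSeq: 1 SUBSCRIBE",
        f"Contact: <{contact_uri}>",
        f"Event: {event_package}",
        "Expires: 0",                           # zero duration means "fetch"
        "Content-Length: 0",
    ]
    # SIP header lines end with CRLF; an empty line terminates the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_fetch_subscribe("sip:conf1@example.com",
                                "sip:alice@example.com",
                                "sip:alice@192.0.2.10",
                                "192.0.2.10"))
```

The sketch only formats the request; sending it, handling authentication, and parsing the resulting NOTIFY are left out.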
5.7 Adding and Removing Media

Each conference is composed of a particular set of media that the focus is managing. For example, a conference might contain a video stream and an audio stream. The set of media streams that constitute the conference can be changed by participants. When the set of media in the conference changes, the focus will need to generate a re-INVITE to each participant in order to add or remove the media stream for that participant. When a media stream is being added, a participant can reject the offered media stream, in which case it will not receive or contribute to that stream. Rejection of a stream by a participant does not imply that the stream is no longer part of the conference; it only means that the participant is not involved in it.

A SIP re-INVITE can be used by a participant to add or remove a media stream. This is accomplished using the standard offer/answer techniques for adding media streams to a session [14]. This will trigger the focus to generate its own re-INVITEs.

5.8 Conference Announcements and Recordings

Conference announcements and recordings play a key role in many real conferencing systems. Examples of such features include:

o Asking a user to state their name before joining the conference, in order to support a roll call

o Allowing a user to request a roll call, so they can hear who else is in the conference

o Allowing a user to press some keys on their keypad in order to record the conference

o Allowing a user to press some keys on their keypad in order to be connected with a human operator

o Allowing a user to press some keys on their keypad to mute or unmute their line

[Figure 3: Participants 1, 2, and 3 (Users 1, 2, and 3) and an application acting as Participant 4 each hold a SIP dialog (dialogs 1 through 4) with the focus. The conference policy server is also shown, with a non-SIP interface.]

In this framework, these capabilities are modeled as an application which acts as a participant in the conference. This is shown pictorially in Figure 3. The conference has four participants. Three of these participants are end users, and the fourth is the announcement application.
If the announcement application wishes to play an announcement to all the conference members (for example, to announce a join), it merely sends media to the mixer as would any other participant. The announcement is mixed in with the conversation and played to the participants.

Similarly, the announcement application can play an announcement to a specific user by configuring its media policy so that the media it generates is only heard by the target user. The application then generates the desired announcement, and it will be heard only by the selected recipient.

The announcement application can also receive input from a specific user through the conference. To do this, it can use the application interaction framework [6]. This allows it to collect user input, possibly through keypad stimulus, and take actions.

5.9 Floor Control

Floor control is similar to a conference announcement application.
Within the context of this framework, floor control would be managed by an application, possibly one that is not a participant, that would use a non-SIP protocol to enforce the resulting floor control decisions. Further detail on floor control is provided in the XCON framework document [16].

6. Physical Realization

In this section, we present several physical instantiations of these components, to show how these basic functions can be combined to solve a variety of problems.

6.1 Centralized Server

In the simplest realization of this framework, there is a single physical server in the network which implements the focus, the conference policy server, and the mixers. This is the classic "one box" solution, shown in Figure 4.

[Figure 4: A single conference server containing the conference notification server, the conference policy server, the focus, and the mixers. Each participant has a SIP dialog with the focus and exchanges RTP with a mixer.]

6.2 Endpoint Server

Another important model is that of a locally-mixed ad-hoc conference. In this scenario, two users (A and B) are in a regular point-to-point call. One of the participants (A) decides to conference in a third participant, C. To do this, A begins acting as a focus. Its existing dialog with B becomes the first dialog attached to the focus. A would re-INVITE B on that dialog, changing its Contact URI to a new value which identifies the focus. In essence, A "mutates" from a single-user UA to a focus plus a single-user UA, and in the process of such a mutation, its URI changes. Then, the focus makes an outbound INVITE to C. When C accepts, the focus mixes the media from B and C together, redistributing the results. The mixed media is also played locally. Figure 5 shows a diagram of this transition.

[Figure 5: Before the transition, UA A and UA B are connected by SIP and RTP. After the transition, A hosts a focus, conference policy (C.Pol.), and mixers alongside its own participant, joined by an internal interface, and is connected by SIP and RTP to both UA B and UA C.]

It is important to note that the external interfaces in this model, between A and B, and between A and C, are exactly the same as those that would be used in a centralized server model.
A could also include a conference policy server and conference notification service, allowing the participants to have access to them if they so desired. Just because the focus is co-resident with a participant does not mean any aspect of the behaviors and external interfaces will change.

6.3 Media Server Component

[Figure 6: An application server (focus, C.Pol, notification) is connected to a conferencing component (focus, C.Pol, mixers) by SIP, a conference protocol, and a media protocol. Each participant has a SIP dialog with the application server and an RTP flow to the conferencing component.]

In this model, shown in Figure 6, each conference involves two centralized servers. One of these servers, referred to as the "application server", owns and manages the membership and media policies, and maintains a dialog with each participant. As a result, it represents the focus seen by all participants in a conference. However, this server doesn't provide any media support. To perform the actual media mixing function, it makes use of a second server, called the "mixing server". This server includes a focus and a conference policy server, but has no conference notification service. It has a default membership policy, which accepts all invitations from the top-level focus. Its conference policy server accepts any controls made by the application server. The focus in the application server uses third party call control to connect the media streams of each user to the mixing server, as needed. If the focus in the application server receives a conference policy control command from a client, it delegates that to the mixing server by making the same media policy control command to it.
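The delegation between the two servers can be sketched in a few lines of Python. This is only an illustrative model, not a protocol implementation: the class names, the method names, and the dictionary used to hold media policy are all hypothetical, and no real conference policy control protocol is shown.

```python
class MixingServer:
    """Hypothetical mixing server: accepts any control from its top-level server."""

    def __init__(self):
        self.media_policy = {}

    def apply_policy_command(self, name, value):
        # Default policy in this model: do whatever the application server asks.
        self.media_policy[name] = value
        return True


class ApplicationServer:
    """Hypothetical application server: owns policy and the participant dialogs."""

    def __init__(self, mixing_server, authorized_clients):
        self.mixing_server = mixing_server
        self.authorized_clients = set(authorized_clients)

    def handle_policy_command(self, client, name, value):
        # Authorization decisions stay in the application server.
        if client not in self.authorized_clients:
            return False
        # Delegate by issuing the same command to the mixing server.
        return self.mixing_server.apply_policy_command(name, value)


if __name__ == "__main__":
    mixer = MixingServer()
    app = ApplicationServer(mixer, ["sip:alice@example.com"])
    app.handle_policy_command("sip:alice@example.com", "mode", "continuous presence")
    print(mixer.media_policy)
```

Keeping the authorization check in the application server mirrors the text: the mixing server holds no policy of its own and simply carries out what the top-level server requests.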
This model allows for the mixing server to be used as a resource for a variety of different conferencing applications. This is because it is unaware of any conference or media policies; it is merely a "slave" to the top-level server, doing whatever it asks.

6.4 Distributed Mixing

In a distributed mixed conference, there is still a centralized server which implements the focus, conference policy server, and media policy server. However, there are no centralized mixers. Rather, there are mixers in each endpoint, along with a conference policy server. The focus distributes the media by using third party call control [15] to move a media stream between each participant and each other participant. As a result, if there are N participants in the conference, there will be a single dialog between each participant and the focus, but the session description associated with that dialog will be constructed to allow media to be distributed amongst the participants. This is shown in Figure 7.

[Figure 7: A conference server (focus, C.Pol.Srv) holds one dialog with each of three participants. Each participant contains mixers and a C.Pol.Srv, and media flows directly between every pair of participants.]

There are several ways in which the media can be distributed to each participant for mixing. In a multi-unicast model, each participant sends a copy of its media to each other participant. In this case, the session description manages N-1 media streams.
In a multicast model, each participant joins a common multicast group, and each participant sends a single copy of its media stream to that group. The underlying multicast infrastructure then distributes the media, so that each participant gets a copy. In a single-source multicast model (SSM), each participant sends its media stream to a central point, using unicast. The central point then redistributes the media to all participants using multicast. The focus is responsible for selecting the modality of media distribution, and for handling any hybrids that would be necessitated by clients with mixed capabilities.

When a new participant joins or is added, the focus will perform the necessary third party call control to distribute the media from the new participant to all the other participants, and vice versa.

The central conference server also includes a conference policy server. Of course, the central conference server cannot implement any of the media policies directly. Rather, it would delegate the implementation to the conference policy servers co-resident with each participant. As an example, if a participant decides to switch the overall conference mode from "voice activated" to "continuous presence", they would communicate with the central conference policy server. The conference policy server, in turn, would communicate with the conference policy servers co-resident with each participant, using the same conference policy control protocol, and instruct them to use "continuous presence".

This model requires additional functionality in user agents, which may or may not be present. The participants, therefore, must be able to advertise this capability to the focus.

6.5 Cascaded Mixers

In very large conferences, it may not be possible to have a single mixer that can handle all of the media. A solution to this is to use cascaded mixers.
In this architecture, there is a centralized focus, but the mixing function is implemented by a multiplicity of mixers, scattered throughout the network. Each participant is connected to one, and only one, of the mixers. The focus uses some kind of control protocol to connect the mixers together, so that all of the participants can hear each other.

[Figure 8: A central focus holds a SIP dialog with each participant and uses a control protocol toward the mixers. Each participant sends media to one mixer, and the mixers exchange media with one another. Legend: "-------" SIP Dialog, "......." Media Flow, "+++++++" Control Protocol.]

This architecture is shown in Figure 8.

7. Security Considerations

Conferences frequently require security features in order to properly operate.
The conference policy may dictate that only certain participants can join, or that certain participants can create new policies. Generally speaking, conference applications are very concerned about authorization decisions. Mechanisms for establishing and enforcing such authorization rules are a central concept throughout this document.

Of course, authorization rules require authentication. Normal SIP authentication mechanisms should suffice for the conference authorization mechanisms described here.

Privacy is an important aspect of conferencing. Users may wish to join a conference without anyone knowing that they have joined, in order to silently listen in. In other applications, a participant may wish just to hide their identity from other participants, but otherwise let them know of their presence. These functions need to be provided by the conferencing system.

8. Contributors

This document is the result of discussions amongst the conferencing design team. The members of this team include:

Alan Johnston
Brian Rosen
Rohan Mahy
Henning Schulzrinne
Orit Levin
Roni Even
Tom Taylor
Petri Koskelainen
Nermeen Ismail
Andy Zmolek
Joerg Ott
Dan Petrie

9. Acknowledgements

The authors would like to thank Mary Barnes and Chris Boulton for their comments. Thanks to Allison Mankin for her comments and support of this work.

10. Changes from draft-ietf-sipping-conferencing-framework-02

Removed detailed discussions on policy servers, CPCP operations, sidebars, and approval of policy changes. These now reside in the XCON framework draft, which is referenced from here.
Rosenberg Expires April 18, 2005 [Page 31] Internet-Draft Conferencing Framework October 2004 11. Changes from draft-ietf-sipping-conferencing-framework-00 Updated references and formatting cleanup. Rosenberg ExpiresDecember 28, 2004April 18, 2005 [Page38]32] Internet-Draft Conferencing FrameworkJuneOctober 200410.12. Changes since draft-rosenberg-sipping-conferencing-framework-01 o Clarified that the conference notification service uses a single package with some kind of filtering to select whether you get the focus or policy state. Rosenberg ExpiresDecember 28, 2004April 18, 2005 [Page39]33] Internet-Draft Conferencing FrameworkJuneOctober 200411.13. Changes since draft-rosenberg-sipping-conferencing-framework-00 o Rework of terminology. o More details on moderating policy changes. o Rework of the overview, and in particular, a shift of focus from basic/complex conferences (a term which has been removed) to conference aware/unaware participants. o Removal of explicit reference to megaco for controlling a mixer. o Discussion of a lot more conferencing operations. o New sidebar mechanism.1214 Informative References [1] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002. [2] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 3550, July 2003. [3] Levin,O., "RequirementsO. and R. Even, "High Level Requirements for Tightly Coupled SIP Conferencing",draft-levin-sipping-conferencing-requirements-01draft-ietf-sipping-conferencing-requirements-01 (work in progress),July 2002.October 2004. [4] Roach, A., "Session Initiation Protocol (SIP)-Specific Event Notification", RFC 3265, June 2002. [5] Campbell, B., "The Message Session Relay Protocol",draft-ietf-simple-message-sessions-06draft-ietf-simple-message-sessions-08 (work in progress),MayAugust 2004. 
[6] Rosenberg, J., "A Framework for Application Interaction in the Session Initiation Protocol (SIP)", draft-ietf-sipping-app-interaction-framework-02 (work in progress), July 2004.

[7] Johnston, A. and O. Levin, "Session Initiation Protocol Call Control - Conferencing for User Agents", draft-ietf-sipping-cc-conferencing-04 (work in progress), July 2004.

[8] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

[9] Rosenberg, J., Schulzrinne, H. and P. Kyzivat, "Indicating User Agent Capabilities in the Session Initiation Protocol (SIP)", RFC 3840, August 2004.

[10] Mahy, R. and D. Petrie, "The Session Initiation Protocol (SIP) 'Join' Header", draft-ietf-sip-join-03 (work in progress), February 2004.

[11] Rosenberg, J. and H. Schulzrinne, "An INVITE Initiated Dialog Event Package for the Session Initiation Protocol (SIP)", draft-ietf-sipping-dialog-package-04 (work in progress), February 2004.

[12] Sparks, R., "The Session Initiation Protocol (SIP) Refer Method", RFC 3515, April 2003.

[13] Campbell, B., Rosenberg, J., Schulzrinne, H., Huitema, C. and D. Gurle, "Session Initiation Protocol (SIP) Extension for Instant Messaging", RFC 3428, December 2002.

[14] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

[15] Rosenberg, J., Peterson, J., Schulzrinne, H. and G. Camarillo, "Best Current Practices for Third Party Call Control (3pcc) in the Session Initiation Protocol (SIP)", BCP 85, RFC 3725, April 2004.

[16] Barnes, M. and C. Boulton, "A Framework for Centralized Conferencing", draft-barnes-xcon-framework-00.txt (work in progress), September 2004.

Author's Address

Jonathan Rosenberg
Cisco Systems
600 Lanidex Plaza
Parsippany, NJ 07054
US

Phone: +1 973 952-5000
EMail: jdrosen@dynamicsoft.com
URI: http://www.jdrosen.net

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2004). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.