A Framework of Requirements, Threat Models, and Security Services for Videoconferencing over Internet2

Version 0.1

Draft February 3, 2002

 

Working Groups

Internet2 VidMid Video Middleware Group

Video Development Initiative

 

Contributors

Samir Chatterjee, Claremont Graduate University

Tyler Johnson, UNC

 

Contents

1. Introduction

2. Security Requirements

            2.1 Basic services framework

            2.2 SIP message transactions

            2.3 Threat models

                        2.3.1 Registration hijacking

                        2.3.2 Impersonating a server

                        2.3.3 Tampering with message bodies

                        2.3.4 Tearing down sessions

                        2.3.5 Denial of service and amplification

3. SIP authentication mechanisms

            3.1 User-to-User authentication 

            3.2 Proxy-to-User authentication 

4. Other security issues

            4.1 Encryption with IPSEC, TLS, S/MIME

            4.2 Overview of authorization mechanisms

            4.3 Synergies with Shibboleth and Federated Admin

            4.4 User versus computer authentication

            4.5 Problems with NAT and Firewalls

References

 

 

1. Introduction

The Internet2 VidMid working group is currently preparing a work plan to develop and deploy videoconferencing tools and applications. Of particular interest to this group is the secure signaling and transmission of media over Internet2. The working group wants to develop a federated administration model for security. It also wants to create an interoperable testbed that will likely house three types of VC systems: SIP, H.323 and VRVS.

Security models for each of these standards are still works in progress, so it is premature to state definitively which requirements and services must be supported. SIP is being developed by the IETF, whose most recent draft (bis-06) includes preliminary requirements and some basic guidelines for security. H.323, which is being developed by the ITU-T, has a working group under H.235 Annex D that is considering a similar set of security-related issues. VRVS (a joint development between CERN, a multicasting working group, and private companies) will soon have to cope with the myriad security problems discussed in this draft. Of the three, SIP is slightly ahead in specifying security guidelines, and a new security task force has been created to oversee developments on that front.

This working draft uses SIP as a reference model for discussion, but most of it also applies to H.323 systems and even VRVS. For example, if the reader replaces "proxy server" with "gatekeeper" and "SIP User Agent" with "H.323 end-terminal", the context is preserved.

2. Security Requirements

SIP (like H.323 and VRVS) is not an easy protocol to secure. Its use of intermediaries, its many trust boundaries and modes of operation, its multi-faceted trust relationships, its expected usage between elements with no trust at all, and its user-to-user operation make security mechanisms very challenging. It should be noted from the outset that there is no such thing as "absolute" security; different deployments are willing to tolerate different levels of risk, trading risk against ease of operation and deployment. Adding to the confusion, there are many existing security mechanisms with overlapping functionality. This draft first lists possible and likely threats and then describes the types of security services that a framework should have. Some implementation choices (as mentioned in standards documents) are also discussed. However, this generalized framework is not comprehensive, and further discussion within VidMid VC is encouraged.

2.1  Basic Services Framework

A good security framework must include at least four elements: authentication, confidentiality, integrity, and authorization. For web and Internet related applications, it may also include: privacy, nonrepudiation, administration and audit trails.

Authentication - A means of identifying another entity. There are many ways to authenticate another entity, but the typical computer-based methods involve a user ID and password, or digitally signing a set of bytes using a keyed hash. Authentication usually relies on either direct knowledge of the other entity (say, a shared symmetric key or possession of the other person's public key), or third party schemes such as Kerberos and X.509 Certificate Authorities.

Confidentiality - Cryptographic confidentiality means that only the intended recipients will be able to determine the contents of the confidential area. This is typically done using encryption algorithms such as DES and AES.

Integrity - A message integrity check is a means of ensuring that a message in transit was not altered. In combination with a key, a message integrity check (or checksum, or keyed hash) ensures that only the holders of the proper keying material will be able to modify a message in transit without detection.

Authorization - Once identification of a correspondent is achieved, a decision must be made as to whether that identity should be granted access for the requested services. This is the act of authorization. This is often done using access control lists (ACL).

Privacy - Many customers want their identity to be protected. They want to make sure others do not know what they are doing or transmitting. Some people prefer anonymity. In a higher education environment, faculty and students reserve the right to privacy.

Non-repudiation - In e-commerce transactions, merchants need protection against the customer's unjustifiable denial of placing an order. On the other hand, customers need protection against the merchants' unjustifiable denial of payments made. For video-conferencing, the same applies to customers and service providers for billing dispute purposes.

Administration - Typically deals with the accounting and billing of sessions (calls). The resulting call data records (CDRs) must be secured and protected from tampering. This is more important for service providers than for end-users.

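Audit trails - Records of sessions and security-relevant events should be retained so that they can be reviewed after the fact. Borrowing from Enron: DO NOT shred documents.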
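As a rough illustration of the keyed message integrity check described under Integrity above, the following Python sketch uses the standard hmac module. The shared key, the messages, and the function names are placeholders invented for this example, and HMAC-SHA256 is only one possible choice of keyed hash; this draft does not mandate it.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> str:
    """Compute a keyed hash (HMAC-SHA256) over the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical values, for illustration only.
shared_key = b"example-shared-key"
original = b"INVITE sip:bob@biloxi.com SIP/2.0"
tag = make_tag(shared_key, original)

tampered = b"INVITE sip:mallory@chicago.com SIP/2.0"
print(verify_tag(shared_key, original, tag))   # unmodified message
print(verify_tag(shared_key, tampered, tag))   # altered in transit
```

Only a party holding the shared key can produce a tag that verifies, which is why a modified message is detected by the receiver.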

2.2  SIP message transactions

Figure 1: SIP session setup with SIP trapezoid [borrowed from RFC2543bis-05]

SIP [1] signaling involves transmission of several messages between user agents, proxies, and redirect servers or directly between user agents. Typically, trust domains are assumed between a UA and its local proxy (Alice's PC and the Atlanta.com proxy server).

Note that the security of SIP signaling itself has no bearing on the security of protocols used in concert with SIP, such as RTP, or on the security implications of any specific bodies SIP might carry (although MIME security plays a substantial role in securing SIP). Any media associated with a session can be encrypted end-to-end independently of any associated SIP signaling. Media encryption is outside the scope of this document.

IP currently transmits all data as clear text, which is commonly referred to as transmitting in the clear. This means that the data is not scrambled or rearranged; it is simply transmitted in its raw form. This includes both user data and authentication information. Network analyzers, which often operate as passive devices, can quietly monitor such clear text transmissions and compromise security. To capture a communication session, a network analyzer must be connected somewhere along the session's path, at some point on the network between the system initiating the session and the destination system. It is worth noting that several Internet protocols (FTP, Telnet, SMTP, HTTP, POP3, IMAP) often send authentication information in clear text. So beware: if you do on-line banking, make sure digital certificates are used and are authenticated by a third party such as Verisign.

2.3  Threat Models

The following examples by no means provide an exhaustive list of the threats against the SIP protocol; rather, these are "classic" threats that demonstrate the need for particular security services which can potentially prevent whole categories of threats [1, 3, 6].

2.3.1 Registration Hijacking

The SIP registration mechanism allows a user agent to identify itself to a registrar as a device at which a user (designated by an address of record) is located. A registrar assesses the identity asserted in the From header field of a REGISTER message to determine whether or not this request can modify the contact addresses associated with the address of record in the To header field; while these two fields are frequently the same, there are many valid deployments in which a third-party may register contacts on a user's behalf.

The From header of a SIP request, however, can essentially be modified arbitrarily by the owner of a user agent, and this opens the door to malicious registrations. An attacker that successfully impersonates a party authorized to change contacts associated with an address of record could, for example, de-register all existing contacts for a URI and then register their own device as the appropriate contact address, thereby directing all requests for the affected user to the attacker's device.

This threat belongs to a family of threats that rely on the absence of cryptographic assurance of a request's originator. Any SIP UAS that represents a valuable service (a gateway that interworks SIP requests with traditional telephone calls, for example) might want to control access to its resources by authenticating requests that it receives.  Even end-user UAs, for example SIP video-phones, have an interest in ascertaining the identities of originators of requests.

This threat demonstrates the need for security services that enable SIP entities to authenticate the originators of requests [1, 5].
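The registrar decision described in this section can be sketched as follows. This is a minimal illustration, not a prescribed design: the function name, the access-list structure, and the example addresses are assumptions made for this sketch. The point it shows is that the decision rests on an identity proven by authentication, never on the From header alone.

```python
def may_update_bindings(authenticated_user, to_aor, third_party_acl):
    """Return True if the authenticated identity may change contacts for to_aor.

    authenticated_user is the identity proven by a successful challenge,
    or None if the request carried no valid credentials. The From header
    is deliberately not consulted, since the sender can set it freely.
    """
    if authenticated_user is None:
        return False
    if authenticated_user == to_aor:
        return True
    # Third-party registration: allowed only if explicitly authorized.
    return authenticated_user in third_party_acl.get(to_aor, set())


# Hypothetical policy: a secretary may register contacts on Bob's behalf.
acl = {"sip:bob@biloxi.com": {"sip:secretary@biloxi.com"}}

print(may_update_bindings("sip:bob@biloxi.com", "sip:bob@biloxi.com", acl))
print(may_update_bindings("sip:secretary@biloxi.com", "sip:bob@biloxi.com", acl))
print(may_update_bindings("sip:mallory@chicago.com", "sip:bob@biloxi.com", acl))
print(may_update_bindings(None, "sip:bob@biloxi.com", acl))
```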

2.3.2 Impersonating a Server

The domain to which a request is destined is generally specified in the Request-URI; user agents commonly contact a server in this domain directly in order to deliver a request. However, there is always a possibility that an attacker could impersonate the remote server, and that the user agent's request could be intercepted by some other party.

For example, consider a case in which a redirect server at one domain, chicago.com, impersonates a redirect server at another domain, biloxi.com. A user agent sends a request to biloxi.com, but the redirect server at chicago.com answers with a forged response that has appropriate SIP headers for a response from biloxi.com. The forged contact addresses in the redirection response could direct the originating user agent to inappropriate or insecure resources, or simply prevent requests for biloxi.com from succeeding.

This family of threats has a vast membership, many of which are critical. As a converse to the registration hijacking threat, consider the case in which a registration sent to biloxi.com is intercepted by chicago.com, which replies to the intercepted registration with a forged 301 (Moved Permanently) response. This response might seem to come from biloxi.com yet designate chicago.com as the appropriate registrar. All future REGISTER requests from the originating user agent would then go to chicago.com.

Prevention of this threat requires a means by which user agents can authenticate the servers to whom they send requests.

2.3.3 Tampering with Message Bodies

As a matter of course, SIP user agents route requests through trusted proxy servers. Regardless of how that trust is established (authentication of proxies is discussed in section 3), a user agent may trust a proxy server to route a request, but not to inspect or possibly modify the bodies contained in that request.

Consider a UA that is using SIP message bodies to communicate session encryption keys for a media session. Although it trusts the proxy server of the domain it is contacting to deliver signaling properly, it may not be desirable for the administrators of that domain to be capable of decrypting any subsequent media session. Worse yet, if the proxy server were actively malicious, it could modify the session key, either acting as a man-in-the-middle, or perhaps changing the security characteristics requested by the originating user agent.

This family of threats applies not only to session keys, but to most conceivable forms of content carried end-to-end in SIP. These might include MIME bodies that should be rendered to the user, SDP, or encapsulated telephony signals among others.

For these reasons, the UA might want to secure SIP message bodies, and in some limited cases headers, end-to-end. The security services required for bodies include confidentiality, integrity, and authentication. These end-to-end services should be independent of the means used to secure interactions with intermediaries such as proxy servers.

2.3.4 Tearing Down Sessions

Once a dialog has been established by initial messaging, subsequent requests can be sent that modify the state of the dialog and/or session.  It is critical that principals in a session can be certain that such requests are not forged by attackers.

Consider a case in which a third-party attacker captures some initial messages in a dialog shared by two parties in order to learn the parameters of the session (To, From, and so forth) and then inserts a BYE request into the session. The attacker could opt to forge the request such that it seemed to come from either participant. Once the BYE is received by its target, the session will be torn down prematurely.

Similar mid-session threats include the transmission of forged re-INVITEs that alter the session (possibly to reduce session security or redirect media streams as part of a wiretapping attack).

The most effective countermeasure to this threat is the authentication of the sender of the BYE - in this instance, the recipient needs only know that the BYE came from the same party with whom the corresponding dialog was established (as opposed to ascertaining the absolute identity of the sender). Also, if the attacker is unable to learn the parameters of the session due to confidentiality, it would not be possible to forge the BYE; however, some intermediaries (like proxy servers) will need to inspect those parameters as the session is established.

2.3.5 Denial of Service and Amplification

Denial of service attacks focus on rendering a particular network element unavailable, usually by directing an excessive amount of network traffic at its interfaces. A distributed denial of service attack allows one network user to cause multiple network hosts to flood a target host with a large amount of network traffic.

In many architectures, SIP proxy servers face the public Internet in order to accept requests from worldwide IP endpoints. SIP creates a number of potential opportunities for distributed denial of service attacks that must be recognized and addressed by the implementers and operators of SIP systems.

Attackers can create bogus requests that contain a falsified source IP address and a corresponding Via header field which identify a targeted host as the originator of the request and then send this request to a large number of SIP network elements, thereby using hapless SIP UAs or proxies to generate denial of service traffic aimed at the target.

Similarly, attackers might use falsified Route headers in a request that identify the target host and then send such messages to forking proxies that will amplify messaging sent to the target.  Record-Route could be used to similar effect when the attacker is certain that the SIP dialog initiated by the request will result in numerous transactions originating in the backwards direction.

A number of denial of service attacks open up if REGISTER requests are not properly authenticated and authorized by registrars. Attackers could de-register some or all users in an administrative domain, thereby preventing these users from being invited to new sessions. An attacker could also register a large number of contacts designating the same host for a given address of record in order to use the registrar and any associated proxy servers as amplifiers in a denial of service attack.  Attackers might also attempt to deplete available memory and disk resources of a registrar by registering huge numbers of bindings.

The use of multicast to transmit SIP requests can greatly increase the potential for denial of service attacks. These problems demonstrate a general need to define architectures that minimize the risks of denial of service, and the need to be mindful in recommendations for security mechanisms of this class of attacks.

It is clear from the above discussion of security threat models that SIP requires authentication, confidentiality, integrity, authorization, and some means of protection against DoS attacks. Rather than defining new security mechanisms that are specific to the SIP protocol, it is wise to reuse existing security models wherever possible. Fortunately, the web, email, and Internet world has been dealing with several of the same issues, and hence a host of mechanisms are available for consideration.

3. SIP Authentication Mechanisms

SIP provides a stateless challenge-response mechanism for authentication that is based on the authentication framework used in HTTP [2]. The basic idea (Section 20, [1]) is as follows. Any time that a proxy server or user agent receives a request, it MAY challenge the initiator of the request to provide assurance of its identity. Once the originator has been identified, the recipient of the request SHOULD ascertain whether or not this user is authorized to make the request in question. It is very important to note that no authorization systems are recommended by the SIP working group of the IETF. It should also be noted that H.235[1] (Annex D and Annex E) discusses similar authentication and privacy issues in H.323 systems.

The recent SIP standard rejects the earlier Basic Access Authentication scheme, since it passes user IDs and passwords in the clear. A Digest Authentication scheme, based on cryptographic hashes, is recommended instead. The Digest Access Authentication scheme is not intended to be a complete answer to the need for security in the Web or Voice/Video over IP. This scheme provides no encryption of message content. The intent is simply to create an access authentication method that avoids the most serious flaws of Basic authentication.

The "basic" authentication scheme is based on the model that the client must authenticate itself with a user-ID and a password for each realm [2].  The realm value should be considered an opaque string which can only be compared for equality with other realms on that server. The server will service the request only if it can validate the user-ID and password for the protection space of the Request-URI.

There are no optional authentication parameters.

For Basic, the framework above is utilized as follows:

      challenge   = "Basic" realm

      credentials = "Basic" basic-credentials

Upon receipt of an unauthorized request for a URI within the protection space, the origin server MAY respond with a challenge like the following:

      WWW-Authenticate: Basic realm="WallyWorld"

where "WallyWorld" is the string assigned by the server to identify  the protection space of the Request-URI. A proxy may respond with the same challenge using the Proxy-Authenticate header field.
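To make concrete why Basic is considered to pass the password in the clear, the sketch below builds Basic credentials the way RFC 2617 [2] defines them (the base64 encoding of "user-ID:password") and then reverses them without any key. The account name, the password, and the function names are invented for this illustration.

```python
import base64


def basic_credentials(user_id: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{user_id}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def recover_password(header_value: str) -> str:
    """What a passive observer can do: reverse the encoding, no key needed."""
    _scheme, token = header_value.split(" ", 1)
    _user_id, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return password


header = basic_credentials("bob", "secret")  # hypothetical account
print(header)
print(recover_password(header))
```

Base64 is an encoding, not encryption, so anyone who can capture the header on the path recovers the password directly.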

Like Basic Access Authentication, the Digest scheme is based on a simple challenge-response paradigm. The Digest scheme challenges using a nonce value. A valid response contains a checksum (by default, the MD5 checksum) of the username, the password, the given nonce value, the HTTP method, and the requested URI. In this way, the password is never sent in the clear. Just as with the Basic scheme, the username and password must be prearranged in some fashion not addressed by this document.
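The checksum described above can be sketched in a few lines of Python. This follows the computation in RFC 2617 [2] for qop=auth, in which the realm, a nonce count, and a client nonce also enter the hash; the function names are chosen for this example, and the inputs are the HTTP test values published in RFC 2617 rather than SIP values.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' value for qop=auth (RFC 2617 style)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # secret part
    ha2 = md5_hex(f"{method}:{uri}")                  # request part
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Test vector from the HTTP example in RFC 2617, section 3.5.
response = digest_response(
    username="Mufasa", realm="testrealm@host.com", password="Circle Of Life",
    method="GET", uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001", cnonce="0a4f113b")
print(response)
```

With these inputs the function reproduces the response value printed in RFC 2617, 6629fae49393a05397450978507c4ef1. The server, which knows the same password, repeats the computation and compares results, so the password itself never crosses the network.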

The Digest authentication scheme just described suffers from many known limitations. It is intended as a replacement for Basic authentication and nothing more. It is a password-based system and (on the server side) suffers from all the same problems of any password system. In particular, no provision is made in this protocol for the initial secure arrangement between user and server to establish the user's password. Users and implementors should be aware that this protocol is not as secure as Kerberos [7], and not as secure as any client-side private-key scheme [4]. Nevertheless it is better than nothing, better than what is commonly used with telnet and ftp, and better than Basic authentication.

Since SIP does not have the concept of a canonical root URL, the notion of protection spaces is interpreted differently in SIP. The realm string alone defines the protection domain. This is a change from RFC 2543, in which the Request-URI and the realm together defined the protection domain; this definition gave rise to some amount of confusion, since the Request-URI sent by the UAC and the Request-URI received by the server issuing a challenge might be different, and indeed the final form of the Request-URI might not be known to the UAC. Also, the previous definition depended on the presence of a SIP URI in the Request-URI, and seemed to rule out alternative URI schemes (for example, the tel URL).

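Operators of user agents or proxy servers that will authenticate received requests MUST adhere to the following guidelines for creation of a realm string for their server: realm strings MUST be globally unique, and it is RECOMMENDED that a realm string contain a hostname or domain name; realm strings SHOULD also present a human-readable identifier that can be rendered to a user.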

   For example:

      INVITE sip:bob@biloxi.com SIP/2.0

      WWW-Authenticate:  Digest realm="biloxi.com", <...>

Generally, SIP authentication is meaningful for a specific realm, a protection domain. Thus, for Digest authentication, each such protection domain has its own set of user names and secrets. If a server does not care about authenticating individual users, it may make sense to establish a "global" user name and secret for its realm as a default challenge if a particular Request-URI does not have its own realm or set of user names. Similarly, devices that represent many users, such as gateways, MAY have their own device-specific credentials for particular realms.

3.1 User-to-User Authentication

When a UAS receives a request from a UAC, the UAS MAY authenticate the originator before the request is processed. If no credentials (in the Authorization header field) are provided in the request, the UAS can challenge the originator to provide credentials by rejecting the request with a 401 (Unauthorized) status code.

The WWW-Authenticate response-header field MUST be included in 401 (Unauthorized) response messages. The field value consists of at least one challenge that indicates the authentication scheme(s) and parameters applicable to the Request-URI.

An example of the WWW-Authenticate header field in a 401 challenge is:

            WWW-Authenticate: Digest

                    realm="biloxi.com",

                    qop="auth,auth-int",

                    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",

                    opaque="5ccc069c403ebaf9f0171e9517f40e41"

When the originating UAC receives the 401 (Unauthorized), it SHOULD, if it is able, re-originate the request with the proper credentials. The UAC may require input from the originating user before proceeding.  Once authentication credentials have been supplied (either directly by the user, or discovered in an internal keyring), user agents SHOULD cache the credentials for a given value of the To header and "realm" and attempt to re-use these values on the next request for that destination. UAs MAY cache credentials in any way they would like.
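Before a UAC can re-originate the request, it has to pull the scheme and parameters out of the challenge. The following is a minimal parsing sketch, assuming the challenge has already been unfolded into a single header value; the function name and regular expression are written for this example and do not cover every corner of the header grammar.

```python
import re

# name="quoted value"  or  name=bare-token
_PARAM = re.compile(r'(\w+)=(?:"([^"]*)"|([^\s,]+))')


def parse_challenge(header_value: str):
    """Split 'Digest realm="...", nonce="..."' into (scheme, params)."""
    parts = header_value.strip().split(None, 1)
    scheme = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    params = {}
    for name, quoted, bare in _PARAM.findall(rest):
        params[name.lower()] = quoted if quoted else bare
    return scheme, params


challenge = ('Digest realm="biloxi.com", qop="auth,auth-int", '
             'nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", '
             'opaque="5ccc069c403ebaf9f0171e9517f40e41"')
scheme, params = parse_challenge(challenge)
print(scheme, params["realm"], params["qop"].split(","))
```

The realm recovered here is the value a user agent would use, together with the To header, as the key for its credential cache.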

Once credentials have been located,  any user agent that wishes to authenticate itself with a UAS or registrar -- usually, but not  necessarily, after receiving a 401 (Unauthorized) response -- MAY do so by including an Authorization header field with the request. The Authorization field value consists of credentials containing the authentication information of the user agent for the realm of the resource being requested as well as parameters required in support of authentication and replay protection.

An example of the Authorization header is:

    Authorization: Digest username="bob",

              realm="biloxi.com",

              nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",

              uri="sip:alice@atlanta.com",

              qop=auth,

              nc=00000001,

              cnonce="0a4f113b",

              response="6629fae49393a05397450978507c4ef1",

              opaque="5ccc069c403ebaf9f0171e9517f40e41"

3.2 Proxy to User Authentication

Similarly, when a UAC sends a request to a proxy server, the proxy server MAY authenticate the originator before the request is processed. If no credentials (in the Proxy-Authorization header field) are provided in the request, the proxy can challenge the originator to provide credentials by rejecting the request with a 407 (Proxy Authentication Required) status code. The proxy MUST populate the 407 (Proxy Authentication Required) message with a Proxy-Authenticate header applicable to the proxy for the requested resource.
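The parallel between the two challenge types in sections 3.1 and 3.2 can be summarized in a small lookup. The table and function names below are invented for this sketch; the status codes and header names are the ones discussed above.

```python
# Which header carries the challenge, and which carries the credentials,
# for each of the two status codes discussed in this section.
CHALLENGE_HEADERS = {
    401: ("WWW-Authenticate", "Authorization"),
    407: ("Proxy-Authenticate", "Proxy-Authorization"),
}


def credentials_header_for(status_code: int) -> str:
    """Name of the header a UAC adds when retrying after a challenge."""
    try:
        return CHALLENGE_HEADERS[status_code][1]
    except KeyError:
        raise ValueError(f"{status_code} is not an authentication challenge")


print(credentials_header_for(401))
print(credentials_header_for(407))
```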

4. Other security issues

4.1  Encryption with  IPSEC, TLS, S/MIME

Full encryption of SIP messages provides the best means to preserve the confidentiality of signaling; it can also guarantee that messages are not modified by any malicious intermediaries. However, SIP requests and responses cannot simply be encrypted end-to-end, since there are many header fields that must be visible to proxies for routing SIP messages. Note that proxy servers need to modify some features of messages (such as adding Via headers) in order for SIP to function. Proxy servers must therefore be trusted, to some degree, by SIP user agents. It must also be noted that SIP is an application layer protocol, so it is possible, and recommended, to use encryption mechanisms at lower layers of the protocol stack. For example, implementers may use TLS at the transport layer and IPSEC at the network layer. The bodies of SIP messages can also be protected with S/MIME. All these schemes use cryptographic techniques, and some of them are quite complex since key management is involved.

Cryptography is a set of techniques used to transform information into an alternate format which can later be reversed [4]. This alternate format is referred to as the ciphertext and is typically created using a crypto algorithm and a crypto key. The crypto algorithm is simply a mathematical formula which is applied to the information you wish to encrypt. The crypto key is an additional variable injected into the algorithm to ensure that the ciphertext is not derived using the same computational operation every time the algorithm processes information. Since encryption uses mathematical formulas, there is a symbiotic relationship between the algorithm, the key, the original data, and the ciphertext.

This means that knowing any three of these pieces will allow you to derive the fourth. The exception is knowing the combination of the original data and the ciphertext. If you have multiple examples of both, you may be able to discover the algorithm and the key.
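The relationship between algorithm, key, original data, and ciphertext can be seen in a deliberately weak toy cipher. The XOR scheme, the key, and the sample data below are assumptions chosen only to make the relationship visible; nothing like this should be used to protect real traffic.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy algorithm: XOR each byte with the key, repeating the key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = b"k3y"                      # hypothetical key
plaintext = b"session description"
ciphertext = xor_cipher(plaintext, key)

# Algorithm + key + ciphertext -> original data.
recovered = xor_cipher(ciphertext, key)

# Known original data + ciphertext (+ a guess at the algorithm) -> key stream.
key_stream = bytes(p ^ c for p, c in zip(plaintext, ciphertext))

print(recovered == plaintext)
print(key_stream[:len(key)] == key)
```

Real algorithms such as DES and AES are designed so that the last step, recovering the key from matching original data and ciphertext, is computationally impractical.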

The two methods of producing ciphertext are the stream cipher and the block cipher.

The two methods are similar except for the amount of data each encrypts on each pass. Most modern encryption schemes use some form of a block cipher.

Public/Private Crypto Keys

So far, we have discussed secret key algorithms. A secret key algorithm relies on the same key to encrypt and to decrypt the ciphertext. This means that the crypto key must remain secret in order to ensure the confidentiality of the ciphertext. If an attacker learns your secret key, she would be able to unlock all encrypted messages. This creates an interesting Catch-22, because you now need a secure method of exchanging the secret key in order to use the secret key to create a secure method of exchanging information!

In 1976, Whitfield Diffie and Martin Hellman introduced the concept of public cipher keys in their paper "New Directions in Cryptography." Not only did this paper revolutionize the cryptography industry; the key exchange method it describes is now known as Diffie-Hellman.

In layman's terms, a public key is a crypto key that has been mathematically derived from a private or secret crypto key. Information encrypted with the public key can only be decrypted with the private key; it cannot be decrypted with the public key itself. In other words, the keys are not symmetrical. For confidentiality, the public key is used to encrypt data, while the private key is used to decrypt ciphertext. This eliminates the Catch-22 of the symmetrical secret key, because a secure channel is not required in order to exchange key information. Public keys can be exchanged over insecure channels while still maintaining the secrecy of the messages they encrypted.
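A toy version of the Diffie-Hellman exchange shows how two parties can agree on a secret while exchanging only public values. The tiny prime, the generator, and the private values are assumptions picked so the arithmetic can be checked by hand; real deployments use numbers that are hundreds of digits long.

```python
# Public parameters, known to everyone including eavesdroppers.
p, g = 23, 5

# Private values, never transmitted (hypothetical choices).
alice_private, bob_private = 4, 3

# Public values, exchanged over the insecure channel.
alice_public = pow(g, alice_private, p)
bob_public = pow(g, bob_private, p)

# Each side combines its own private value with the other's public value.
alice_shared = pow(bob_public, alice_private, p)
bob_shared = pow(alice_public, bob_private, p)

print(alice_public, bob_public)
print(alice_shared == bob_shared)
```

An observer sees p, g, and both public values, but would have to recover one of the private exponents to compute the shared value.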

Digital Certificate Servers

In public and private cipher key systems, a private key can be used to create a unique digital signature. This signature can then be verified later with the public key in order to ensure that the signature is authentic. This process provides a very strong method of authenticating a user's identity. A digital certificate server provides a central point of management for multiple public keys. This prevents every user from having to maintain and manage copies of every other user's public cipher key. A Lotus Notes server will act as a digital certificate server, allowing users to sign messages using their private keys. The Notes server will then inform the recipient on delivery whether the Notes server could verify the digital signature.

Digital certificate servers, also known as certificate authorities (CA), provide verification of digital signatures. For example, if Tyler receives a digitally signed message from Jill but does not have a copy of Jill's public cipher key, Tyler can obtain a copy of Jill's public key from the CA in order to verify that the message is authentic. Also, let's assume that Tyler wishes to respond to Jill's e-mail but wants to encrypt the message in order to protect it from prying eyes. Tyler can again obtain a copy of Jill's public key from the CA so that the message can be encrypted using Jill's public key.

Certificate servers can even be used to provide single sign-on and access control. Certificates can be mapped to access control lists for files stored on a server in order to restrict access. When a user attempts to access a file, the server verifies that the user's certificate has been granted access. This allows a CA to manage nearly all document security for an organization.
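The sign-with-the-private-key, verify-with-the-public-key process described in this section can be illustrated with textbook RSA on very small numbers. Everything here is an assumption made for the sketch: the primes, the exponents, the helper names, and the shortcut of reducing a SHA-256 hash modulo the tiny modulus. It is not a secure signature scheme and is not how a certificate authority is implemented.

```python
import hashlib

# Toy RSA key pair built from tiny primes (61 and 53); insecure by design.
n = 61 * 53          # modulus, part of both keys
e = 17               # public exponent
d = 2753             # private exponent: (e * d) % 3120 == 1


def digest_int(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Create a signature; requires the private exponent d."""
    return pow(digest_int(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Check a signature; requires only the public values e and n."""
    return pow(signature, e, n) == digest_int(message)


message = b"Message from Jill to Tyler"
signature = sign(message)
print(verify(message, signature))
print(verify(message, (signature + 1) % n))   # altered signature
```

In the scenario above, Tyler would fetch Jill's public values from the certificate server and run only the verify step.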

IP Security (IPSEC)

IPSEC is a suite of protocols for securing IP traffic, standardized by the IETF and promoted by vendors such as Cisco Systems. It is not so much a new specification as a collection of open standards. IPSEC uses a Diffie-Hellman exchange, as part of its key management, to establish session keys between authenticated peers. IPSEC implementations commonly use DES (with a 56-bit key) to encrypt the data stream. IPSEC operates at the network layer, so it does not require direct application support. Use of IPSEC is transparent to the end user.

One of the benefits of IPSEC is that it is very convenient to use. Since Cisco has integrated IPSEC into its router line of products, IPSEC becomes an obvious virtual private network (VPN) solution. While IPSEC is becoming quite popular for remote network access from the Internet, deployments that rely on single DES with its 56-bit key are best suited for general business use. Organizations that need to transmit sensitive or financial data over insecure channels would be prudent to use a stronger cipher, such as triple DES, or to look for a different encryption technology.

TLS

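Transport layer security (TLS) is used for encapsulation of various higher level protocols (including SIP and H.323). The TLS handshake protocol allows the server and client to authenticate each other and to negotiate an encryption algorithm and cryptographic keys before the application protocol transmits or receives its first byte of data. The TLS handshake protocol provides connection security that has three basic properties: the peer's identity can be authenticated using asymmetric (public key) cryptography; the negotiation of a shared secret is secure, in that it is unavailable to eavesdroppers; and the negotiation is reliable, in that no attacker can modify the negotiation communication without being detected by the parties to the communication.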

The most commonly voiced concern about TLS is that it cannot run over UDP; TLS requires a connection-oriented underlying transport protocol, which for the purposes of this draft means TCP. Even running TCP, regardless of any additional overhead incurred by TLS, is argued to be too intensive for some embedded devices.
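The dependence on a connection-oriented transport can be seen in Python's standard ssl module, which is used here purely as an illustration. The sketch assumes a client-side context with default settings; the host name is hypothetical and no network connection is made.

```python
import socket
import ssl

# Client-side context with certificate and hostname verification enabled.
context = ssl.create_default_context()


def tls_can_wrap(sock_type: int) -> bool:
    """Try to wrap an unconnected socket of the given type in TLS."""
    raw = socket.socket(socket.AF_INET, sock_type)
    try:
        wrapped = context.wrap_socket(
            raw,
            server_hostname="proxy.example.org",  # hypothetical name
            do_handshake_on_connect=False,        # no handshake, no traffic
        )
        wrapped.close()
        return True
    except NotImplementedError:
        # Python's ssl module only accepts stream (TCP) sockets.
        raw.close()
        return False


tcp_ok = tls_can_wrap(socket.SOCK_STREAM)  # TCP socket is accepted
udp_ok = tls_can_wrap(socket.SOCK_DGRAM)   # UDP socket is rejected
```

A user agent that signals only over UDP would therefore need to open a TCP connection before it could use TLS at all.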

It may also be arduous for a local outbound proxy server and/or registrar to maintain many simultaneous long-lived TLS connections with numerous user agents. This introduces some valid scalability concerns, especially for intensive ciphersuites. Maintaining redundancy of long-lived TLS connections, especially when a user agent is solely responsible for their establishment, could also be cumbersome.

TLS only allows SIP entities to authenticate servers to which they are adjacent; TLS offers strictly hop-by-hop security. Neither TLS, nor any other mechanism specified in this document, allows clients to authenticate proxy servers to whom they cannot form a direct TCP connection.

S/MIME

The secure multipurpose Internet mail extensions (S/MIME) is a secure method of sending email that uses the RSA encryption system. It provides a consistent way to send and receive secure MIME data. Based on the popular MIME standard, S/MIME provides the following cryptographic security services for electronic messaging applications: authentication, integrity, nonrepudiation and privacy.

SIP messages carry MIME bodies, and the MIME standard includes mechanisms for securing MIME contents to ensure both integrity and confidentiality (including the 'multipart/signed' type). Implementers should note, however, that there may be rare network intermediaries (not typical proxy servers) that rely on viewing or modifying the bodies of SIP messages (especially SDP), and that secure MIME may prevent these sorts of intermediaries from functioning.

The certificates that are used to identify an end-user for the purposes of S/MIME differ from those used by servers in one important respect: rather than asserting that the identity of the holder corresponds to a particular hostname, these certificates assert that the holder is identified by an end-user address. This address is composed of the concatenation of the "userinfo", "@", and "domainname" portions of a SIP URI (in other words, an email address of the form "bob@biloxi.com"), and most commonly corresponds to a user's address of record. These certificates are used to sign or encrypt bodies of SIP messages. Bodies are signed with the private key of the sender (who may include their public key with the message as appropriate), but bodies are encrypted with the public key of the intended recipient. Obviously, senders must have foreknowledge of the public key of recipients in order to encrypt message bodies. Public keys can be stored within a user agent on a virtual keyring.
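The following sketch shows how a user agent might derive the end-user address from a SIP URI and use it to look up a recipient's public key on a keyring. It handles only simple URIs and is not a full SIP URI parser; the function names, the keyring contents, and the placeholder key value are assumptions made for the example.

```python
def certificate_identity(sip_uri: str) -> str:
    """Return the "userinfo@domainname" part of a simple SIP URI.

    Handles "sip:user@host", optionally followed by a port, URI
    parameters, or headers. IPv6 literals and other forms are not covered.
    """
    scheme, _, rest = sip_uri.partition(":")
    if scheme.lower() not in ("sip", "sips") or "@" not in rest:
        raise ValueError("expected a SIP URI with a userinfo part")
    # Drop headers ("?...") and URI parameters (";...") after the host part.
    address = rest.split("?", 1)[0].split(";", 1)[0]
    userinfo, _, hostport = address.rpartition("@")
    user = userinfo.split(":", 1)[0]    # discard any password
    domain = hostport.split(":", 1)[0]  # discard any port
    return f"{user}@{domain.lower()}"


# Hypothetical virtual keyring: end-user address -> public key material.
keyring = {"bob@biloxi.com": "BOB-PUBLIC-KEY-PLACEHOLDER"}


def recipient_key(sip_uri: str):
    """Look up the public key needed to encrypt a body for this recipient."""
    return keyring.get(certificate_identity(sip_uri))


identity = certificate_identity("sips:bob@Biloxi.com:5061;transport=tcp")
known_key = recipient_key("sip:bob@biloxi.com")
unknown_key = recipient_key("sip:alice@atlanta.com")
```

When the lookup returns nothing, the sender has no foreknowledge of the recipient's key and cannot encrypt the body.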

4.2  Overview of authorization mechanisms

Authorization is definitely a service rather than a server - authorization functionality will be provided coherently through several means of delivery, including authentication, directory servers and certificates.

Examples are legion, which is what makes this area so important. Authorization will be the basis of workflow. It will drive permissions for accessing networked resources, allow us to control and delegate electronic responsibilities, and serve as the basis for future administrative applications. It will allow us to convert our complex legal policies into automated systems in an easily scalable fashion.

At its simplest, authorization is the next generation of ACLs - the read/write/execute controls that are embedded in file systems. Typically, authorization indicates what an identifier, properly authenticated, is permitted to do with a networked object or resource.

There are many challenges associated with authorization, including

Recently, several access control models have been proposed for Internet applications [9]. Access control services protect Internet resources from unauthorized use. An important prerequisite for access control is user authentication, the process that establishes the identity of the user.

Traditional access control models are broadly categorized [9] as discretionary access control (DAC) and mandatory access control (MAC) models. New models such as role-based access control (RBAC) and task-based access control (TBAC) have been proposed to address the security requirements. VidMid needs to explore these models to find a suitable authorization scheme that can work well across video-conferencing realms.
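To make the role-based model more concrete, here is a minimal sketch of an RBAC check. The roles, permissions, and user names are hypothetical examples for a video-conferencing realm and are not drawn from any VidMid specification.

```python
# Hypothetical roles and the permissions each role grants.
ROLE_PERMISSIONS = {
    "participant": {"join_conference"},
    "moderator": {"join_conference", "mute_participant"},
    "administrator": {
        "join_conference",
        "mute_participant",
        "create_conference",
    },
}

# Hypothetical assignment of authenticated users to roles.
USER_ROLES = {
    "tyler": {"moderator"},
    "jill": {"participant"},
}


def is_authorized(user: str, permission: str) -> bool:
    """Return True if any of the user's roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


tyler_can_mute = is_authorized("tyler", "mute_participant")
jill_can_mute = is_authorized("jill", "mute_participant")
```

Because permissions attach to roles rather than to individuals, changing what a moderator may do requires a single edit instead of one per user.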

4.3  Synergies with Shibboleth and the Federated Administration Model

Shibboleth, a joint project of Internet2/MACE and IBM, is developing architectures, frameworks, and practical technologies to support inter-institutional sharing of resources that are subject to access controls [10]. Within the worlds of academia and business, there is growing interest in collaboration and resource sharing among institutions. However, the common methods of controlling access have drawbacks. Access control based on IP address is subject to spoofing. User IDs and passwords add to the network administrator's management burden and often do not scale well within university environments: the resource provider winds up in the role of system administrator for the accessing university's users. This sort of administrative entanglement is not generally considered a positive feature of collaboration, and often inhibits it entirely. Even the "one-identity" or "global sign-on" approach does not work, since it fails to offer the nonrepudiation that is needed. PKI solutions using digital certificates are a possibility. However, setting up and administering a PKI is not a trivial task; registering users, distributing keys, and educating users on the protection of private keys or smart cards all take up a fair amount of administrative and employee time and effort. Moreover, a global PKI is more of a myth than a reality.

Shibboleth's solution is to have users registered only at their origin site, and not at each resource provider site. Shibboleth is then responsible for transferring attributes about a user from the user's origin site to a resource provider site. A critical component that is needed for privacy is the Attribute Authority (AA), which releases information about users. The AA is also responsible for giving users a means to specify exactly which of their allowable attributes are sent to each site they visit. This is done by defining and adding certain components in the entire call flow.
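The attribute release step can be sketched as a simple filter. This is only an illustration of the idea: the attribute names, site names, policy structure, and the release_attributes function are assumptions for the example and do not reflect the actual architecture described in [10].

```python
# Attributes held at the user's origin site (hypothetical values).
USER_ATTRIBUTES = {
    "jill": {
        "affiliation": "faculty",
        "department": "physics",
        "email": "jill@example.edu",
    },
}

# Per-user release policy: which attributes each provider site may receive.
RELEASE_POLICY = {
    "jill": {
        "video.example.org": {"affiliation"},
        "library.example.org": {"affiliation", "department"},
    },
}


def release_attributes(user: str, provider: str) -> dict:
    """Return only the attributes the user allows this provider to see."""
    allowed = RELEASE_POLICY.get(user, {}).get(provider, set())
    attributes = USER_ATTRIBUTES.get(user, {})
    return {name: value for name, value in attributes.items() if name in allowed}


to_video_site = release_attributes("jill", "video.example.org")
to_unknown_site = release_attributes("jill", "other.example.org")
```

A provider with no policy entry receives nothing, so the user's attributes stay at the origin site unless release is explicitly permitted.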

In the context of video-conferencing with SIP and H.323, we need to understand what resources are being provided and what sort of access control is required across multiple institutions. For example, in SIP systems, a proxy within one realm will interact with proxies in other realms, and the use of authentication for this purpose has been discussed in this draft. In terms of services, one could envision directory name lookups and resource discovery. In directory name lookups, if a user at site X wants to view directories of video-conferencing end-users at site Y, then a Shibboleth approach can be taken. Also, if a proxy server at site X wants to discover a new resource at site Y (either a remote proxy or a location server), the seamless attribute transfer techniques may be used. The details of which approach will work need to be further fleshed out at VidMid.

4.4  User versus computer authentication

At the recent Chapel Hill meeting (Nov 2001), there was some discussion about authenticating the computer versus authenticating the end-user who is in front of the computer. The hardware on any campus can be authenticated at lower levels using DHCP or autoconfiguration in IPv6. User authentication is what we need to worry about.

The word endpoint often causes confusion. When we say endpoint, we do not mean a physical device. What we mean is a logical instantiation of an endpoint running on the network. An analogy may help. Think of your cell phone. There are three identities. The first is the physical address of the telephone (similar to a MAC address). The second is the telephone number (which is programmable). The third is the person using the phone. We agree we cannot and should not authenticate at the first level. However, we argue that the second level already exists in SIP/H.323. And of course there is the third level, which is the user's authentication. The architecture we are building allows an implementer to synchronize levels 2 and 3, or keep them separate.

4.5  Problems with NAT and Firewalls

SIP and H.323 have severe problems with network address translators and firewall traversal. However, given that our working realm is Internet2 and higher education, we can ignore the NAT issue for the time being. Moreover, given that there is a slight chance that IPv6 may actually come along, lack of addresses will no longer be an issue and messy NAT problems can become a thing of the past.

However, firewalls are a reality and at the same time a pain to deal with. SIP and H.323 have to pass signaling messages through firewalls. Not only that, media streams, which often use randomly assigned port numbers, also have to traverse firewalls. Some schemes have been proposed in the IETF; these will be discussed in a different draft document.

References

[1] Jonathan Rosenberg, Henning Schulzrinne, Gonzalo Camarillo, Alan Johnston, Jon Peterson, Robert Sparks, Mark Handley, Eve Schooler, "SIP: Session Initiation Protocol", Internet Draft draft-ietf-sip-rfc2543bis-07.txt.

[2] J. Franks, P. Hallam-Baker, et al., "HTTP Authentication: Basic and Digest Access Authentication", IETF RFC 2617.

[3] Michael Thomas, "SIP Security Framework", Internet Draft draft-thomas-sip-sec-framework-00.txt.

[4] Chris Brenton, "Mastering Network Security", Chapter 9, Sybex, Inc.

[5] Jari Arkko, et al., "Security Mechanisms Agreement for SIP connections", Internet Draft draft-arkko-sip-sec-agree-00.txt.

[6] Michael Thomas, "SIP Security Requirements", Internet Draft draft-thomas-sip-sec-req-00.txt.

[7] Mark Walla, "Kerberos Explained", Windows 2000 Advantage magazine, May 2000.

[8] Dave Kosiur, "Building and Managing Virtual Private Networks", John Wiley & Sons, 1998.

[9] "Securing Network Software Applications", Communications of the ACM, Vol. 44, No. 2, February 2001.

[10] Marlene Erdos, Scott Cantor, "Shibboleth Architecture DRAFT v04", draft-internet2-shiboleth-architecture-04.html.


[1] Recommendation H.235 "Security and encryption for H-Series (H.323 and other H.245-based) multimedia terminals"