A Hitchhiker's Guide to the Session Initiation Protocol (SIP)
Cisco
Edison NJ
US
jdrosen@cisco.com
http://www.jdrosen.net
RAI
SIP
SIP
Guide
42
Don't Panic
Overview
The Session Initiation Protocol (SIP) is
the subject of numerous specifications that have been produced by the
IETF. It can be difficult to locate the right document, or even to
determine the set of Request for Comments (RFC) about SIP. This
specification serves as a guide to the SIP RFC series. It
lists the specifications under the SIP umbrella, briefly summarizes
each, and groups them into categories.
The Session Initiation Protocol (SIP) is
the subject of numerous specifications that have been produced by the
IETF. It can be difficult to locate the right document, or even to
determine the set of Request for Comments (RFC) about SIP. Don't
Panic! This specification serves as a guide to the
SIP RFC series. It lists the specifications under the SIP
umbrella. Each specification is accompanied by a description, a
paragraph or so long, that summarizes its purpose. Each
specification also includes a letter that designates its category in
the standards track. These values are:
S: Standards Track (Proposed Standard, Draft Standard, or
Standard)
E: Experimental
B: Best Current Practice
I: Informational
The specifications are grouped together by topic. The topics are:
The essential SIP specifications that are expected
to be utilized for every session or registration.
Specifications related to interworking
with the telephone network.
General purpose
extensions to SIP, SDP and MIME, but ones that are not expected to
always be used.
Specifications to deal with firewall and
NAT traversal.
Specifications that solve a narrow
problem space or provide an optimization.
Specifications for multimedia
conferencing.
Specifications for
manipulating SIP dialogs and calls.
Defines the core specifications for
the SIP event framework, providing for pub/sub capability.
Packages that utilize the SIP event
framework.
Specifications related to
multimedia quality of service (QoS).
Specifications related to
configuration and monitoring of SIP deployments.
Specifications to facilitate usage of
SIP with the Signaling Compression (Sigcomp) framework.
Specifications on how to use SIP URIs
to address multimedia services.
Specifications providing security
functionality for SIP.
SIP
extensions related to IM, presence and multimedia. This covers only
the SIP extensions related to these topics. See
for a full treatment of SIP
for IM and Presence (SIMPLE).
SIP extensions related to emergency
services. See for a more
complete treatment of additional functionality related to emergency
services.
Typically, SIP
extensions fit naturally into topic areas, and implementers
interested in a particular topic often implement many or all of the
specifications in that area. There are some specifications which fall
into multiple topic areas, in which case they are listed more than
once.
Do not print all the specs cited here at once, as they
might share the fate of the rules of Brockian Ultracricket
when bound together: collapse under their own gravity and
form a black hole.
This document itself is not an update to RFC 3261 or an extension to
SIP. It is an informational document, meant to guide newcomers,
implementers, and deployers to the many specifications
associated with SIP.
It is very difficult to enumerate the set of SIP specifications. This
is because there are many protocols that are intimately related to
SIP and used by nearly all SIP implementations, but are not formally
SIP extensions. As such, this document formally defines a "SIP
specification" as:
RFC 3261 and any specification that defines an extension to it,
where an extension is a mechanism that changes or updates in
some way a behavior specified there.
The basic SDP specification, RFC 4566, and
any specification that defines an extension to SDP whose primary
purpose is to support SIP.
Any specification that defines a MIME object whose primary purpose
is to support SIP.
Excluded from this list are requirements, architectures, registry
definitions, non-normative frameworks, and processes. Best Current
Practices are included when they normatively define mechanisms for
accomplishing a task, or provide significant description of the
usage of the normative specifications, such as call flows.
The SIP change process defines two types
of extensions to SIP. These are normal extensions and the so-called
P-headers (where P stands for "preliminary", "private", or
"proprietary", and the "P-" prefix is included in the header field
name), which are meant to be used in areas of limited applicability.
P-headers cannot be defined in the standards track. For the most part,
P-headers are not included in the listing here, with the exception of
those which have seen general usage despite their P-header status.
This document includes specifications which have already
been approved by the IETF and granted an RFC number, in addition
to Internet Drafts which are still under development within IETF and
will eventually finish and get an RFC number. Inclusion of Internet
Drafts here helps encourage early implementation and demonstrations of
interoperability of the protocol, and thus aids in the standards
setting process. Inclusion of these also identifies where the IETF is
targeting a solution at a particular problem space. Note that final
IANA assignment of codepoints (such as option tags and header field
names) does not take place until shortly before
publication as an RFC, and thus codepoint assignments may change.
The core SIP specifications represent the set of specifications whose
functionality is broadly applicable. An extension is broadly
applicable if it fits into one of the following categories:
For specifications that impact SIP session management, the
extension would be used for almost every session initiated by a user
agent
For specifications that impact SIP registrations, the extension
would be used for almost every registration initiated by a user agent
For specifications that impact SIP subscriptions, the extension
would be used for almost every subscription initiated by a user agent
In other words, these are not specifications that are used just for
some requests and not others; they are specifications that would
apply to each and every request that the extension is relevant
for. In the galaxy of SIP, these specifications are like towels.
RFC 3261 is the core SIP protocol
itself, and is an update to RFC 2543. It is the
president of the galaxy as
far as the suite of SIP specifications is concerned.
RFC 3263 provides DNS procedures for taking a SIP URI and
determining a SIP server that is associated with that SIP URI. RFC
3263 is essential for any implementation using SIP with DNS. It
makes use of both DNS SRV records and NAPTR
records.
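As a rough illustration of the SRV step, the sketch below orders a set
of already-fetched SRV records so that lower priority values are tried
first. The SrvRecord type, the order_srv helper, and the host names are
hypothetical. The weighted random selection among equal-priority
records is deliberately left out; ties are broken by descending weight
only so that the result is repeatable.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    """Hypothetical container for one DNS SRV answer."""
    priority: int
    weight: int
    port: int
    target: str


def order_srv(records: List[SrvRecord]) -> List[SrvRecord]:
    # Lower priority values are contacted first. A real client picks
    # randomly in proportion to weight within one priority; this
    # simplified sketch just sorts by descending weight instead.
    return sorted(records, key=lambda r: (r.priority, -r.weight))


# Illustrative records, not taken from any real zone.
records = [
    SrvRecord(20, 0, 5060, "backup.example.com"),
    SrvRecord(10, 40, 5060, "sip2.example.com"),
    SrvRecord(10, 60, 5060, "sip1.example.com"),
]
ordered = order_srv(records)
print([r.target for r in ordered])
```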
defines how the Session Description Protocol (SDP) is used with SIP to negotiate the
parameters of a media session. It is in widespread usage and an
integral part of the behavior of RFC 3261.
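The heart of the negotiation can be sketched for a single media stream
as follows. This is a simplified illustration, not the full procedure:
the answer_media helper and its arguments are hypothetical, and it
assumes the answerer keeps the offered formats it supports in the
offerer's order and rejects the stream with port zero when nothing is
in common.

```python
from typing import Iterable, List, Tuple


def answer_media(offered: List[str], supported: Iterable[str],
                 local_port: int) -> Tuple[int, List[str]]:
    """Return (port, formats) for the answer to one offered stream."""
    supported = set(supported)
    # Keep only the formats both sides understand, in offer order.
    common = [fmt for fmt in offered if fmt in supported]
    if not common:
        # Port zero rejects the stream; the format list is then ignored,
        # so the offered formats are simply echoed back.
        return 0, list(offered)
    return local_port, common


print(answer_media(["0", "8", "97"], {"8", "0"}, 49170))
```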
defines the SUBSCRIBE and NOTIFY methods. These two methods provide a
general event notification framework for SIP. To actually use the
framework, extensions need to be defined for specific event
packages. An event package defines a schema for the event data, and
describes other aspects of event processing specific to that
schema. An RFC 3265 implementation is required when any event package
is used.
Though its P-header status implies that
it has limited applicability, , which
defines the P-Asserted-Identity header field, has been widely deployed. It is
used as the basic mechanism for providing network asserted caller ID
services. Its update, ,
clarifies its usage for connected party identification as well.
defines
the Path header field. This field is inserted by proxies between a
client and their registrar. It allows inbound requests towards that
client to traverse these proxies prior to being delivered to the user
agent. It is essential in any SIP deployment that has edge proxies,
which are proxies between the client and the home proxy or SIP
registrar.
defines the rport
parameter of the Via header. It allows SIP responses to traverse
NAT. It is one of several specifications that are utilized for NAT
traversal, which are listed together later in this document.
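The server side of rport can be sketched as below: when the client has
included a valueless rport parameter, the server records the source
port in it and adds a received parameter holding the source address, so
that the response is sent back to where the request actually came from.
The fill_rport helper is hypothetical, and the addresses, ports, and
branch value are illustrative only.

```python
import re

# Matches a valueless ";rport" flag, followed by another parameter or
# by the end of the header value.
_RPORT_FLAG = re.compile(r";rport(?=;|$)")


def fill_rport(via: str, src_ip: str, src_port: int) -> str:
    """Rewrite a top Via value using the observed source address/port."""
    if not _RPORT_FLAG.search(via):
        # The client did not ask for symmetric response routing.
        return via
    via = _RPORT_FLAG.sub(f";rport={src_port}", via, count=1)
    return f"{via};received={src_ip}"


via = "SIP/2.0/UDP 10.1.1.1:4540;rport;branch=z9hG4bKkjshdyff"
print(fill_rport(via, "192.0.2.1", 9988))
```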
defines a mechanism for
carrying capability information about
a user agent in REGISTER requests and in dialog-forming requests like
INVITE. It has found use with conferencing (the isfocus parameter
declares that a user agent is a conference server) and with
applications like push-to-talk.
formally updates RFC 3261, and modifies some of the behaviors
associated with non-INVITE transactions. This addresses some problems
found in timeout and failure cases.
defines a mechanism for providing a
cryptographically verifiable identity of the calling party in a SIP
request. Known as "SIP Identity", this mechanism provides an
alternative to RFC 3325. It has seen little deployment so far, but its
importance as a key construct for anti-spam techniques
and new security mechanisms makes it a core part of the SIP specifications.
defines a mechanism for directing
requests towards a specific UA instance. GRUU is essential for
features like transfer and provides another piece of the SIP NAT
traversal story.
, also known
as SIP outbound, defines important changes to the SIP registration
mechanism which enable delivery of SIP messages towards a UA when it
is behind a NAT. This specification is the cornerstone of the SIP NAT
traversal strategy.
defines a format for
representing multimedia sessions. SDP objects are carried in the body
of SIP messages, and based on the offer/answer model, are used to
negotiate the media characteristics of a session between users.
defines a set of extensions to SDP that allow for capability
negotiation within SDP. Capability negotiation can be used to select
between different profiles of RTP (secure vs. unsecure) or to
negotiate codecs such that an agent has to select one amongst a set of
supported codecs.
defines a
technique for NAT traversal of media sessions for protocols that make
use of the offer/answer model. This specification is the IETF
recommended mechanism for NAT traversal for SIP media streams, and is
meant to be used even by endpoints which are themselves never behind a
NAT. A SIP option tag and media feature tag (also a core specification)
have been defined for use with
ICE.
defines a way to explicitly signal,
within an SDP message, the IP address and port for RTCP, rather than
using the port+1 rule in the Real Time Transport Protocol (RTP). It is needed for devices behind NAT and used by
ICE.
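A receiver's choice between the explicit signal and the port+1 rule
might look like the sketch below. The rtcp_port helper is hypothetical;
it assumes the media-level attribute lines have already been split out
of the SDP, and it looks only at the port carried in an a=rtcp
attribute, ignoring any address that follows it.

```python
from typing import List


def rtcp_port(rtp_port: int, media_attributes: List[str]) -> int:
    """Pick the RTCP port for one media stream."""
    prefix = "a=rtcp:"
    for attr in media_attributes:
        if attr.startswith(prefix):
            # Form: "a=rtcp:<port> [<nettype> <addrtype> <address>]"
            return int(attr[len(prefix):].split()[0])
    # No explicit attribute: fall back to the RTP port+1 rule.
    return rtp_port + 1


print(rtcp_port(49170, ["a=rtcp:53020 IN IP4 126.16.64.4"]))
print(rtcp_port(49170, ["a=sendrecv"]))
```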
formally updates RFC 3261. It defines an
extension to SIP that allows a calling user to determine the identity of the
final called user (connected party). Due to forwarding and retargeting services, this may not be the
same as the user that the caller was originally trying to reach. The
mechanism works in tandem with the SIP identity specification to provide signatures over the
connected party identity. It can also be used
if a party identity changes mid call due to third party call control
actions or PSTN behavior.
defines the UPDATE method for SIP. This method is
meant as a means for updating session information prior to the
completion of the initial INVITE transaction. It can also be used to
update other information, such as the identity of the participant,
without involving an updated offer/answer exchange. It was developed
initially to support but has found
other uses. In particular, its usage with RFC 4916 means it will
typically be used as part of every session, to convey a secure
connected identity.
formally updated RFC 3261. It revises the processing of
the SIPS URI, originally
defined in RFC 3261, to fix many errors and problems
that have been encountered with that mechanism.
contains
best practice call flow examples for basic SIP
interactions - call establishment, termination, and
registration.
A collection of fixes to
SIP that address important bugs and vulnerabilities. These include a
fix requiring loop detection in any proxy that
forks, a
clarification on how record-routing works,
and a correction to
the IPv6 BNF.
Numerous extensions and usages of SIP related to interoperability and
communications with or through the PSTN.
is one of
the earliest extensions to SIP. It defines procedures for using SIP to
invoke services that actually execute on the PSTN. Its main
application is for third party call control, allowing an IP host to
set up a call between two PSTN endpoints. PINT has a relatively narrow
focus and has not seen widespread deployment.
Continuing the trend
of naming PSTN related extensions with alcohol references, SPIRITS
defines the inverse of PINT. It allows a
switch in the PSTN to ask an IP element about how to proceed with call
waiting. It was developed primarily to support Internet Call Waiting
(ICW). Perhaps the next specification will be called the Pan Galactic
Gargle Blaster.
SIP-T defines a mechanism
for using SIP between pairs of PSTN gateways. Its essential idea is to
tunnel ISUP signaling between the gateways in the body of SIP
messages. SIP-T motivated the development of INFO. SIP-T has seen widespread implementation for the
limited deployment model that it addresses. As ISUP endpoints
disappear from the network, the need for this mechanism will
decrease.
defines how to do protocol mapping from the SS7
ISDN User Part (ISUP) signaling to SIP. It is widely used in SS7 to
SIP gateways and is part of the SIP-T framework.
defines how to do protocol mapping from QSIG, used for
PBX signaling, to SIP.
RFC 3578 defines a mechanism to map overlap
dialing into SIP. This specification is widely regarded as the ugliest
SIP specification, as the introduction to the specification itself
advises that it has many problems. Overlap signaling (the practice of
sending digits into the network as dialed instead of waiting for
complete collection of the called party number) is largely
incompatible with SIP at some fairly fundamental levels. That said,
RFC 3578 is mostly harmless and has seen some usage.
defines some guidelines for handling
early media - the practice of sending media from the called party or
an application server towards the caller - prior to acceptance of the
call. Early media is often generated from the PSTN. Early media is
a complex topic, and this specification does not fully address the
problems associated with it.
defines a
new session disposition type for use with early
media. It indicates that the SDP in the body is for a special early
media session. This has seen little usage.
defines MIME objects for representing
SS7 and QSIG signaling messages. SS7 signaling messages are carried in
the body of SIP messages when SIP-T is used. QSIG signaling messages
can be carried in a similar way.
provides best practice call
flows around interworking with the PSTN.
These extensions are general purpose enhancements to SIP, SDP and MIME that can
serve a wide variety of uses. However, they are not used for every
session or registration, as the core specifications are.
SIP defines two types of responses to a request - final and
provisional. Provisional responses are numbered from 100 to 199. In
SIP, these responses are not sent reliably. This choice was made in
RFC 2543 since the messages were meant to just be truly informational,
and rendered to the user. However, subsequent work on PSTN
interworking demonstrated a need to map provisional responses to PSTN
messages that needed to be sent reliably. RFC 3262 was developed to allow reliability of provisional
responses. The specification defines the PRACK method, used for indicating that
a provisional response was received. Though it provides a generic
capability for SIP, RFC 3262 implementations have been most common in
PSTN interworking devices. However, PRACK brings a great deal of
complication for relatively small benefit. As such, it has seen only
moderate levels of deployment.
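The response classes discussed above can be expressed as a small
sketch. The helper names are hypothetical. The second function reflects
the reading that a 100 (Trying) response is never sent reliably, so
only 101 through 199 are candidates for acknowledgement with PRACK.

```python
def is_provisional(status_code: int) -> bool:
    """Provisional responses are numbered from 100 to 199."""
    return 100 <= status_code <= 199


def may_be_sent_reliably(status_code: int) -> bool:
    """Assumption: every provisional response except 100 qualifies."""
    return is_provisional(status_code) and status_code != 100


for code in (100, 180, 183, 200):
    print(code, is_provisional(code), may_be_sent_reliably(code))
```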
defines the
Privacy header field, used by clients to request anonymity for their
requests. Though it defines several privacy services, the only one
broadly used is the one that supports privacy of the P-Asserted-Identity
header field.
defines a mechanism for achieving
anonymous calls in SIP. It is an alternative to
, and instead places more
intelligence in the endpoint to craft anonymous messages
by directly accessing network services.
was defined as an extension to RFC 2543. It defines
a method, INFO, used to transport mid-dialog information that has no
impact on SIP itself. Its driving application was the transport of
PSTN related information when using SIP between a pair of
gateways. Though originally conceived for broader use, it only found
standardized usage with SIP-T. It has been
used to support numerous proprietary and non-interoperable
extensions due to its poorly defined scope.
defines the Reason header field. It is used
in requests, such as BYE, to indicate the reason that the request is
being sent.
RFC 3388
defines a framework for grouping together media streams in an SDP
message. Such a grouping allows relationships between these streams,
such as which stream is the audio for a particular video feed, to be
expressed.
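As an illustration, the sketch below pulls the group definitions out of
an SDP fragment. Streams are labelled with a=mid attributes and tied
together by a session-level a=group attribute; LS is the lip
synchronization semantics. The parse_groups helper is hypothetical and
the fragment is abbreviated, not a complete session description.

```python
from typing import List, Tuple


def parse_groups(sdp: str) -> List[Tuple[str, List[str]]]:
    """Return (semantics, [mid, ...]) for each a=group line."""
    prefix = "a=group:"
    groups = []
    for line in sdp.splitlines():
        if line.startswith(prefix):
            # First token is the semantics; the rest are stream labels.
            semantics, *mids = line[len(prefix):].split()
            groups.append((semantics, mids))
    return groups


# Abbreviated, illustrative fragment: audio and video to be lip-synced.
sdp = """v=0
a=group:LS 1 2
m=audio 30000 RTP/AVP 0
a=mid:1
m=video 30002 RTP/AVP 31
a=mid:2
"""
print(parse_groups(sdp))
```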
defines a MIME object that
contains a SIP message fragment. Only certain header fields and parts
of the SIP message are present. For example, it is used to
report back on the responses received to a request sent as a
consequence of a REFER.
allows a client to determine, from a REGISTER response, a path of
proxies to use in requests it sends outside of a dialog. It can also
be used by proxies to verify the Route header in client initiated
requests. In many respects, it is the inverse of the Path header
field, but has seen less usage since default outbound proxies have
been sufficient in many deployments.
defines a set of headers that a client can include
in a request to control the way in which the request is routed
downstream. It allows a client to direct a request towards a UA with
specific capabilities, which a UA indicates using
.
defines a keepalive mechanism for SIP signaling. It
is primarily meant to provide a way to clean up old state in proxies
that are holding call state for calls from failed endpoints which were
never terminated normally. Despite its name, the session timer is not
a mechanism for detecting a network failure mid-call. Session timers
introduce a fair bit of complexity for relatively little gain, and
have seen moderate deployment.
defines how to carry SIP messages over the Stream
Control Transmission Protocol (SCTP). SCTP
has seen very limited usage for SIP transport.
defines the History-Info
header field, which
indicates information on how and why a call came to be routed to
a particular destination.
defines an
extension to SDP for setting up TCP-based
sessions between user agents. It defines who sets up the connection
and how its lifecycle is managed. It has seen relatively little usage
due to the small number of media types to date which use TCP.
defines a mechanism for
including both IPv4 and IPv6 addresses for a media session as
alternates. This mechanism has been deprecated in favor of ICE.
defines an extension to the SDP capability negotiation framework for negotiating
codecs, codec parameters, and media streams.
clarifies
handling of bodies in SIP, focusing primarily on
multi-part behavior, which was underspecified in SIP.
These SIP extensions are primarily aimed at addressing NAT traversal
for SIP.
defines a
technique for NAT traversal of media sessions for protocols that make
use of the offer/answer model. This specification is the IETF
recommended mechanism for NAT traversal for SIP media streams, and is
meant to be used even by endpoints which are themselves never behind a
NAT. A SIP option tag and media feature tag have been defined for use with
ICE.
specifies the usage of ICE for TCP streams. This allows
for selection of RTP-based voice on top of TCP only when NAT or
firewalls would prevent UDP-based voice from working.
defines a way to explicitly signal,
within an SDP message, the IP address and port for RTCP, rather than
using the port+1 rule in the Real Time Transport Protocol (RTP). It is needed for devices behind NAT and used by
ICE.
, also known
as SIP outbound, defines important changes to the SIP registration
mechanism which enable delivery of SIP messages towards a UA when it
is behind a NAT.
defines the rport
parameter of the Via header. It allows SIP responses to traverse
NAT.
defines a mechanism for directing
requests towards a specific UA instance. GRUU is essential for
features like transfer and provides another piece of the SIP NAT
traversal story.
Numerous SIP extensions provide a toolkit of dialog and call
management techniques. These techniques have been combined together to
build many SIP-based services.
REFER
defines a mechanism for asking a user agent
to send a SIP request. It's a form of SIP remote control, and is the
primary tool used for call transfer in SIP. Beware that not all
potential uses of REFER (neither for all methods nor for all URI
schemes) are well defined. Implementors should only use the well-defined
ones, and should not second guess or freely assume behavior for the others
to avoid unexpected behavior of remote UAs, interoperability issues,
and other bad surprises.
defines a
number of different call flows that allow one SIP entity, called the
controller, to create SIP sessions amongst other SIP user agents.
defines the Join header field. When sent in an
INVITE, it causes the recipient to join the resulting dialog into a
conference with another dialog in progress.
defines a mechanism that allows a new dialog to
replace an existing dialog. It is useful for certain advanced transfer
services.
defines the Referred-By header field. It is
used in requests triggered by REFER, and provides the identity of the
referring party to the referred-to party.
defines
how to use 3pcc for the purposes of invoking transcoding services for
a call.
RFC 3265 defines the SUBSCRIBE and NOTIFY methods. These two methods provide a
general event notification framework for SIP. To actually use the
framework, extensions need to be defined for specific event
packages. An event package defines a schema for the event data, and
describes other aspects of event processing specific to that
schema. An RFC 3265 implementation is required when any event package
is used.
defines the PUBLISH method. It is
not an event package, but is used by all event packages as a mechanism
for pushing an event into the system.
defines an
extension to RFC 3265 that allows a client to subscribe to a list of
resources using a single subscription. The server, called a Resource
List Server (RLS), will "expand" the subscription and subscribe to each
individual member of the list. It has found applicability primarily in
the area of presence, but can be used with any event package.
defines an
extension to RFC 3265 to optimize the performance of
notifications. When a client subscribes, it can indicate what
version of a document it has, so that the server can skip sending a
notification if the client is up to date. It is applicable to any
event package.
These are event packages defined to utilize the SIP events
framework. Many of these are also listed elsewhere in their respective
areas.
defines an event package for finding out
about changes in registration state.
is an extension to
the registration event package that allows
user agents to learn about their GRUUs. It is particularly useful in
helping to synchronize a client and its registrar with its currently
valid temporary GRUU.
defines a way for a user agent to find out about
voicemails and other messages that are waiting for it. Its primary
purpose is to enable the voicemail waiting lamp on most business
telephones.
defines an event package for indicating user
presence through SIP.
, also known as winfo,
provides a mechanism for a user agent to find out what subscriptions
are in place for a particular event package. Its primary usage is with
presence, but it can be used with any event package.
defines an event package for
learning the state of the dialogs in progress at a user agent, and is
one of several RFCs starting with the important number 42.
defines
a mechanism for learning about changes in conference state, including
conference membership.
RFC 4730 defines a
way for an application in the network to subscribe to the set of
keypresses made on the keypad of a traditional telephone. It and
RFC 2833 are the two mechanisms
defined for handling DTMF. RFC 4730 is a signaling-path solution,
and RFC 2833 is a media-path solution.
defines
a SIP event package that enables the collection
and reporting of metrics that measure the quality for Voice over
Internet Protocol (VoIP) sessions.
defines a
framework for session policies. In this framework, policy servers are
used to tell user agents about the media characteristics required for
a particular session. The session policy framework has not been widely
implemented.
defines a SIP
event package used in conjunction with the session policy framework.
defines a
SIP event package that allows a UA to learn whether consent has been
given for the addition of an address to a SIP "mailing list". It is
used in conjunction with the SIP framework for consent.
Several specifications concern themselves with the interactions of SIP
with network Quality of Service (QoS) mechanisms.
, updated by defines a way to make sure that the
phone of the called party doesn't ring until a QoS reservation has
been installed in the network. It does so by defining a general
preconditions framework, which defines conditions that must be true in
order for a SIP session to proceed.
defines a
way for user agents to negotiate what type of end-to-end QoS
mechanism to use for a session. At this time, there are two
that can be used - RSVP and NSIS. This negotiation is done
through an SDP extension. Due to limited deployment of RSVP
and even more limited deployment of NSIS, this extension has
not been widely used.
defines a P-header that
provides a mechanism for passing an authorization token between SIP
and a network QoS reservation protocol like RSVP. Its purpose is to
make sure network QoS is only granted if a client has made a SIP call
through the same provider's network. This specification is sometimes
referred to as the SIP walled garden specification by the truly
paranoid androids in the SIP community. This is because it requires
coupling of signaling and the underlying IP network.
defines a usage of the SDP grouping framework for indicating that a
set of media streams should be handled by a single resource
reservation.
Several specifications have been defined to support operations and
management of SIP systems. These include mechanisms for configuration
and network diagnostics.
defines a mechanism that allows a SIP user agent to bootstrap its
configuration from the network, and receive updates to its
configuration should it change. This is considered an essential piece
of deploying a usable SIP network.
defines
a SIP event package that enables the collection
and reporting of metrics that measure the quality for Voice over
Internet Protocol (VoIP) sessions.
Sigcomp was defined
to allow compression of
SIP messages over low bandwidth links. Sigcomp is not formally part of
SIP. However, usage of Sigcomp with SIP has required extensions to
SIP.
defines a SIP URI parameter that can be used to
indicate that a SIP server supports Sigcomp.
defines how to
apply
Sigcomp to SIP.
Several extensions define well-known services that can be invoked by
constructing requests with the specific structures for the Request
URI, resulting in specific behaviors at the UAS.
introduced the context of
using Request URIs, encoded appropriately, to invoke services.
defines a resource called a Resource List Server. A client can send a
subscribe to this server. The server
will generate a series of subscriptions, and compile the resulting
information and send it back to the subscriber. The set of resources
that the RLS will subscribe to is a property of the request URI in the
SUBSCRIBE request.
defines the
framework for list services in SIP. In this framework, a UA
can include an XML list object in the body of various
requests and the server provides list-oriented services as a
consequence. For example, a SUBSCRIBE with a list subscribes
to the URI in the list.
uses the
URI list framework
and allows
a client to subscribe to a resource called a Resource
List Server. This server will generate subscriptions to
the URI in the list, and compile the resulting
information and send it back to the subscriber.
uses the URI list
framework
and allows a client to send a MESSAGE to a number of
recipients.
uses
the URI list framework
and allows a
client to ask the server to act as a conference focus and
send an invitation to each recipient in the list.
defines a way for SIP
application servers to invoke announcement and conferencing services
from a media server. This is accomplished through a set of defined URI
parameters which tell the media server what to do, such as what file
to play and what language to render it in.
defines a way to invoke voicemail and IVR
services by using a SIP URI constructed in a particular way.
These SIP extensions don't fit easily into a single specific
use case. They have somewhat general applicability, but they solve a
relatively small problem or provide an optimization.
defines an enhancement
to REFER. REFER normally creates an implicit subscription to the
target of the REFER. This subscription is used to pass back updates on
the progress of the referral. This extension allows that implicit
subscription to be bypassed as an optimization.
provides a mechanism that allows
a UAS to authorize a request because the requestor proves it knows a
dialog that is in progress with the UAS. The specification is useful
in conjunction with the SIP application interaction framework.
defines a mechanism for carrying RFC 3840 feature tags in REFER. It is
useful for informing the target of the REFER about the characteristics
of the intended target of the referred request.
defines an extension for
indicating to the called party whether or not the phone should ring
and/or be answered immediately. This is useful for push-to-talk and
for diagnostic applications.
defines
a mechanism for a
called party to indicate to the calling party that a call was rejected
since the caller was anonymous. This is needed for implementation of
the Anonymous Call Rejection (ACR) feature in SIP.
allows a UA
sending a REFER to ask the recipient of the REFER to generate multiple
SIP requests, not just one. This is useful for conferencing, where a
client would like to ask a conference server to eject multiple users.
defines a mechanism for content indirection. Instead of
carrying an object within a SIP body, a URL reference is carried
instead, and the recipient dereferences the URL to obtain the
object. The specification has potential applicability for sending
large instant messages, but has yet to find much actual use.
specifies an SDP extension that
allows for the description of the bandwidth for a media session that
is independent of the underlying transport mechanism.
defines a mechanism
in SDP to signal floor control streams that use BFCP. It is used for
Push-To-Talk and conference floor control.
defines a
usage of the precondition framework. The
connectivity precondition makes sure that the session doesn't get
established until actual packet connectivity is checked.
defines an SDP attribute for
describing the purpose of a
media stream. Examples include a slide view, the speaker, a sign
language feed, and so on.
defines practices
for interworking between IPv4 and IPv6 user agents. This is done
through multi-homed proxies which interwork IPv4 and IPv6, along
with ICE for media
traversal. The specification includes
some minor extensions and clarifications to SDP in order to cover
some additional cases.
defines an
extension to SIP that allows a TLS connection between
servers to be reused for requests in both
directions. Normally two connections are set up between a
pair of servers, one for requests in each direction.
Several extensions provide additional security features to SIP.
defines a mechanism for providing a
cryptographically verifiable identity of the calling party in a SIP
request. Known as "SIP Identity", this mechanism provides an
alternative to RFC 3325. It has seen little deployment so far, but its
importance as a key construct for anti-spam techniques
and new security mechanisms makes it a core part of the SIP specifications.
formally updates RFC 3261. It defines an
extension to SIP that allows a calling user to determine the identity of the
final called user (connected party). Due to forwarding and retargeting services, this may not be the
same as the user that the caller was originally trying to reach. The
mechanism works in tandem with the SIP identity specification to provide signatures over the
connected party identity. It can also be used
if a party identity changes mid call due to third party call control
actions or PSTN behavior.
formally updates RFC 3261. It revises the processing of
the SIPS URI, originally
defined in RFC 3261, to fix many errors and problems
that have been encountered with that mechanism.
clarifies the
usage of SIP over TLS with regard to certificate
handling, and defines additional procedures needed for
interoperability.
defines the
Privacy header field, used by clients to request anonymity for their
requests. Though it defines several privacy services, the only one
broadly used is the one that supports privacy of the P-Asserted-Identity
header field.
defines
extensions to SDP that allow tunneling of a key
management protocol, namely MIKEY, through offer/answer
exchanges. This mechanism is one of three SRTP keying
techniques specified for SIP, with DTLS-SRTP
having
been selected as the final solution.
defines extensions to SDP that allow for
the negotiation of keying material directly through offer/answer,
without a separate key management protocol. This mechanism,
sometimes called sdescriptions, has the drawback that the media keys
are available to any entity that has visibility to the SDP. It is
one of three SRTP keying techniques specified for SIP, with
DTLS-SRTP having
been selected as the final solution.
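The drawback noted above can be illustrated with a minimal Python sketch. It assumes the usual "a=crypto:<tag> <suite> inline:<base64 key||salt>" attribute layout and the AES_CM_128_HMAC_SHA1_80 suite (16-byte master key, 14-byte master salt). The function names make_crypto_line and read_crypto_line are hypothetical and exist only for this illustration. Nothing in the parsing step requires a secret, so any element that can read the SDP can recover the keying material.

```python
import base64

SUITE = "AES_CM_128_HMAC_SHA1_80"  # assumed suite: 16-byte key, 14-byte salt
KEY_LEN = 16


def make_crypto_line(tag: int, master_key: bytes, master_salt: bytes) -> str:
    """Build an SDP crypto attribute carrying the key and salt inline."""
    key_salt = base64.b64encode(master_key + master_salt).decode("ascii")
    return f"a=crypto:{tag} {SUITE} inline:{key_salt}"


def read_crypto_line(line: str):
    """Recover (tag, suite, key, salt), as any observer of the SDP could."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not a crypto attribute")
    tag, suite, key_params = line[len(prefix):].split(" ")[:3]
    method, _, key_info = key_params.partition(":")
    if method != "inline":
        raise ValueError("unsupported key method")
    # Optional lifetime and MKI fields follow the key, separated by '|'.
    raw = base64.b64decode(key_info.split("|")[0])
    return int(tag), suite, raw[:KEY_LEN], raw[KEY_LEN:]
```

In practice this is why deployments of this technique typically depend on the signaling itself being protected hop by hop.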
defines
the overall framework and SDP and SIP processing required
to perform key management for RTP using Datagram TLS
(DTLS) directly between
endpoints, over the media path. It is
one of three SRTP keying techniques specified for SIP, and is the one
that has been selected as the final solution.
defines the usage of SDP with DTLS-SRTP.
formally updates RFC 3261. It is a brief
specification that updates the
cryptography mechanisms used in SIP S/MIME. However, SIP S/MIME has
seen very little deployment.
defines a certificate service for SIP whose purpose is to
facilitate the deployment of S/MIME. The certificate service allows
clients to store and retrieve their own certificates, in addition to
obtaining the certificates for other users.
defines a
SIP message fragment which can be signed in
order to provide an authenticated identity over a request. It was an
early predecessor to , and
consequently AIB has seen no deployment.
defines the usage of the
Security Assertion Markup Language (SAML) within SIP, and describes
how to use it in conjunction with SIP identity to provide authenticated assertions about a user's
role or attributes.
defines
several extensions to SIP, including the Trigger-Consent and
Permission-Missing header fields. These header fields, in addition to
the other procedures defined in the document, define a way to manage
membership on "SIP mailing lists" used for instant messaging or
conferencing. In particular, it helps avoid the problem of using such
amplification services for the purposes of an attack on the network,
by making sure a user authorizes the addition of their address onto
such a service.
defines
an XML object used by the consent framework. Consent
documents are sent from SIP "mailing list servers" to
users to allow them to manage their membership on lists.
defines a
SIP event package that allows a UA to learn whether consent has been
given for the addition of an address to a SIP "mailing list". It is
used in conjunction with the SIP framework for consent.
defines a mechanism to prevent bid-down
attacks in conjunction with SIP authentication. The mechanism has seen
very limited deployment. It was defined as part of the 3GPP IMS
specification suite, and is needed only
when there is a multiplicity
of security mechanisms deployed at a particular server. In practice,
this has not been the case.
defines
mechanisms for
providing confidentiality and integrity for SIP message bodies sent
from user agents to specific network intermediaries.
specifies a mechanism
for signaling TLS-based media
streams between endpoints. It expands the TCP-based media signaling
parameters defined in to include fingerprint
information for TLS streams, so that TLS can operate between end hosts
using self-signed certificates.
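As a rough sketch of how fingerprint information lets endpoints trust self-signed certificates, the Python below hashes a certificate's bytes and renders the result as an "a=fingerprint" SDP attribute, then checks a presented certificate against a received attribute. The helper names and the small hash-name table are assumptions made for this illustration. A real endpoint would hash the DER-encoded certificate taken from the TLS handshake, whereas the usage shown in the comments uses placeholder bytes.

```python
import hashlib

# Assumed mapping from SDP hash-function tokens to hashlib algorithm names.
_HASHES = {"sha-1": "sha1", "sha-256": "sha256"}
_PREFIX = "a=fingerprint:"


def fingerprint_attribute(cert_der: bytes, hash_func: str = "sha-256") -> str:
    """Render the certificate hash as colon-separated uppercase hex."""
    digest = hashlib.new(_HASHES[hash_func], cert_der).digest()
    hex_pairs = ":".join(f"{b:02X}" for b in digest)
    return f"{_PREFIX}{hash_func} {hex_pairs}"


def matches(attribute_line: str, presented_cert_der: bytes) -> bool:
    """Check the certificate seen in TLS against the fingerprint from SDP."""
    if not attribute_line.startswith(_PREFIX):
        raise ValueError("not a fingerprint attribute")
    hash_func, _, value = attribute_line[len(_PREFIX):].partition(" ")
    expected = fingerprint_attribute(presented_cert_der, hash_func.lower())
    return expected.split(" ", 1)[1] == value.strip().upper()


# Hypothetical usage with placeholder bytes standing in for a certificate:
# offer_line = fingerprint_attribute(b"placeholder certificate bytes")
# matches(offer_line, b"placeholder certificate bytes")
```

The check is only as trustworthy as the channel that carried the SDP, since an attacker able to rewrite the fingerprint could substitute a certificate of their own.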
defines
a precondition for use with the preconditions framework. The
security precondition prevents a session from
being established until a secure media stream is set up.
Numerous SIP and SDP extensions are aimed at conferencing as their
primary application.
defines an SDP attribute for providing an opaque label for
media streams. These labels can be referred to by external documents,
and in particular, by conference policy documents. This allows a UA to
tie together documents it may obtain through conferencing mechanisms
to media streams to which they refer.
defines the Join header field. When sent in an
INVITE, it causes the recipient to join the resulting dialog into a
conference with another dialog in progress.
defines
a mechanism for learning about changes in conference state, including
conference membership.
allows a UA
sending a REFER to ask the recipient of the REFER to generate multiple
SIP requests, not just one. This is useful for conferencing, where a
client would like to ask a conference server to eject multiple users.
is
similar to . However, instead
of subscribing to the resource, an INVITE request is sent to the
resource, and it will act as a conference focus and generate an
invitation to each recipient in the list.
defines best practice procedures
and call flows for conferencing. This includes conference
creation, joining, and dial out, amongst other
capabilities.
defines a mechanism
in SDP to signal floor control streams that use BFCP. It is used for
Push-To-Talk and conference floor control.
SIP provides extensions for instant messaging, presence, and
multimedia.
defines the MESSAGE method, used for sending
an instant message without setting up a session (sometimes called
"page mode").
defines an event package for indicating user
presence through SIP.
, also known as winfo,
provides a mechanism for a user agent to find out what subscriptions
are in place for a particular event package. Its primary usage is with
presence, but it can be used with any event package.
defines a
mechanism for signaling a file transfer session with SIP.
Emergency services include pre-emption features, which allow
authorized individuals to gain access to network resources in time of
emergency, along with traditional emergency calling.
defines an extension to
the Reason header, allowing a UA to know that its dialog was torn down
because a higher priority session came through.
defines a new header field,
Resource-Priority, that allows a session to get priority treatment
from the network.
defines
a mechanism for carrying location objects in SIP
messages. This is used to convey location from a UA to an
emergency call taker.
This specification is an overview of existing specifications, and does
not introduce any security considerations on its own. Of course, the
world would be far more secure if everyone would follow one simple
rule: "Don't Panic!".
The author would like to thank Spencer Dawkins, Brian Stucker, Keith
Drage, John Elwell and Avshalom Houri for their comments on this document.
The Hitchhiker's Guide to the Galaxy