Patterns in application-layer protocol design
There are many different application-layer protocols, and they all seem to end up evolving solutions to many of the same problems. This article seeks to identify the aspects of functionality commonly found in application-layer protocols, and serves as a dictionary of those functions.
Authentication. Most commonly to authenticate one side of the association, such as the client.
- Extensibility, negotiation. A protocol may support multiple authentication methods for extensibility, and may allow the supported methods to be negotiated or discovered automatically. Authentication is frequently handled by multi-method frameworks such as SASL, which allow any given application to support a standardised set of authentication protocols.
- Mutuality. Mutual authentication is rare. (Functionality of lower layers such as TLS is out of scope.) If TLS is used, authentication may be delegated solely to TLS via client certificates or TLS-PSK (see the SASL EXTERNAL method).
- Authn v. authz username; impersonation. Some protocols may enable a distinction to be drawn between an authentication user identity and an authorization user identity, wherein users can authenticate as other users if they have the privileges to do so (impersonation).
- Domain. Some protocols will explicitly qualify usernames as belonging to a domain, making usernames globally unambiguous.
Channel security. A protocol may provide channel security (that is, confidentiality and integrity), most commonly by delegation to a lower-layer protocol such as TLS, DTLS or QUIC.
- Keying material. A channel security technology might enable key material to be derived from its state for special-purpose application-specific uses. Channel security protocols being an entire subject of their own, they are otherwise out of scope of this article.
- Upgrade (STARTTLS). Channel security is not active initially; it is activated only after capability negotiation has established that TLS can be used.
Framing. Where an application-layer protocol desires to exchange frames but runs on a byte-oriented lower-layer protocol (bytepipe), it can be adapted to provide a frame interface (framepipe) via a framing protocol. At its simplest, such a mechanism may consist of a simple length prefix before each frame's payload.
- Out-of-band frames. A framing protocol might provide support for out-of-band frames or frames which are considered special in some way, or may deliberately avoid providing such support.
- Existing protocols. Aside from trivial length-prefix headers, interesting protocols providing framing facilities include SLIP, COBS, and encodings such as 8b10b (which also provides several out-of-band symbols). Some transport protocols provide framepipes and not bytepipes and thus need no adaptation.
- Length limitation. A framing protocol will generally impose some kind of frame length limitation, which may be fixed by a protocol specification, fixed by an implementation, or be dynamic. Particularly sophisticated framing protocols could support the negotiation of frame length limits, or other framing parameters.
- Framing error handling. A framing protocol must define the means of handling framing errors, such as the transmission of malformed or impermissibly long frames. For example, a protocol could define that a lower-layer transport association be torn down, could ignore the frame, or could return some kind of framing-level error indication frame facilitating error detection and recovery.
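The length-prefix approach, together with a length limit and one possible framing error policy, can be sketched as follows. This is a minimal illustration, not a reference implementation: the 4-byte big-endian prefix, the `MAX_FRAME_LEN` value and the names `encode_frame`, `decode_frames` and `FramingError` are all assumptions made for the example. Here a framing error is surfaced as an exception, on the premise that the caller would then tear down the association.

```python
import struct

MAX_FRAME_LEN = 65536  # assumed implementation-fixed limit, chosen for the example


class FramingError(Exception):
    """Raised when the byte stream violates the framing rules."""


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    if len(payload) > MAX_FRAME_LEN:
        raise FramingError("frame too long")
    return struct.pack(">I", len(payload)) + payload


def decode_frames(buffer: bytes) -> tuple[list[bytes], bytes]:
    """Split complete frames off the front of buffer.

    Returns (frames, remainder), where remainder holds the bytes of any
    incomplete trailing frame, to be retried once more data has arrived.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if length > MAX_FRAME_LEN:
            # Policy assumed here: reject before buffering the oversized payload.
            raise FramingError("peer announced an impermissibly long frame")
        if len(buffer) - offset - 4 < length:
            break  # payload not fully received yet
        start = offset + 4
        frames.append(buffer[start:start + length])
        offset = start + length
    return frames, buffer[offset:]
```

A receiver would append newly read bytes to the remainder and call `decode_frames` again, which turns the bytepipe into a framepipe.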
Frame typing. Where an application layer exchanges frames with multiple kinds of meaning, it needs a way to distinguish them. This is often facilitated via some prefixed type information at the start of each frame. Sometimes this functionality is tightly coupled together with framing functionality, for example in the TLV construction, where a type field precedes a length field; by using the LTV ordering, logically constructed as LV(TV), this coupling can be avoided.
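The decoupling offered by the LTV ordering can be illustrated with a short sketch. The field widths (4-byte length, 2-byte type) and the function names are assumptions chosen for the example. The point is that `split_frame` never looks at the type, so the framing layer can be specified and implemented independently of frame typing.

```python
import struct


def encode_ltv(type_id: int, value: bytes) -> bytes:
    """Serialize as length, then type, then value: LV(TV)."""
    body = struct.pack(">H", type_id) + value
    return struct.pack(">I", len(body)) + body


def split_frame(data: bytes) -> tuple[bytes, bytes]:
    """Framing layer: strip one length-prefixed frame; knows nothing of types."""
    (length,) = struct.unpack_from(">I", data, 0)
    if len(data) < 4 + length:
        raise ValueError("incomplete frame")
    return data[4:4 + length], data[4 + length:]


def parse_typed(frame: bytes) -> tuple[int, bytes]:
    """Typing layer: read the type identifier from the start of a frame body."""
    if len(frame) < 2:
        raise ValueError("frame too short to carry a type")
    (type_id,) = struct.unpack_from(">H", frame, 0)
    return type_id, frame[2:]
```

With a TLV ordering, by contrast, the framing code would have to skip over the type field to find the length, which ties the two layers together.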
Extensibility framework. Application-layer protocols frequently find the need to evolve a highly generalised framework for future extensibility. Most commonly this is facilitated via further use of LTVs within a frame, possibly nested in turn. Note that though terms such as TLV usually imply a binary format, we consider things such as HTTP headers to be effectively LTVs in kind for our purposes here. LTVs may have a binary or textual representation.
- Type identifiers. LTVs, both those identifying frames and those identifying fields within frames, are identified by their type identifiers. Type identifiers might be either integers or strings. In either case, they might have substructure or be hierarchically allocated, or might be a flat namespace.
- Type identifiers: extensibility. Choice of type identifier format determines how easily the protocol is extended by multiple independent parties without coordination.
- Namespacing. Ease of extensibility may be enhanced via namespacing schemes, in which type identifiers are allocated in a hierarchy. This might make use of a new registration namespace or existing registration namespaces, such as URIs or the domain name system, IANA enterprise numbers, ASN.1 OIDs, or an existing IANA registry. Barring this, a crude scheme is often chosen whereby a fixed range is reserved for private or experimental use, with other ranges being reserved for allocation by the specifier of the protocol.
- Backwards compatibility. To facilitate backwards compatibility, the handling of unknown LTVs by software must be defined. Usually unknown LTVs are ignored, but sometimes security, integrity or other requirements dictate the opposite, namely that messages with unknown LTVs be rejected.
- Backwards compatibility: criticality. Where LTVs must sometimes be rejected if unrecognised and sometimes must be ignored if unrecognised, the concept of a criticality flag commonly appears in application-layer protocol design. Each LTV carries with it a flag bit, “critical”, making it the logical serialization of the tuple (type, value, critical?) rather than just (type, value). A critical LTV must be rejected if it is unrecognised, and a non-critical one must be ignored.
- Backwards compatibility: transitivity. Where an application-layer protocol involves the relaying of LTVs from one party to another via intermediaries, extensible protocols sometimes also annotate their LTVs with a transitive flag, so that the LTV serializes the tuple (type, value, transitive). An intermediary blindly forwards an LTV which it does not recognise unchanged if and only if it is marked transitive. LTVs which are intended only to be processed by the intermediary if they are to be processed at all are marked non-transitive. An example of a protocol with this pattern is BGP.
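The criticality and transitivity rules above can be sketched as two small functions. Everything named here (`Field`, `KNOWN_TYPES`, `accept_fields`, `fields_to_relay`) is hypothetical and exists only for illustration; a real protocol would also define how the flags are encoded on the wire.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    type_id: int
    value: bytes
    critical: bool
    transitive: bool


class UnknownCriticalField(Exception):
    """An unrecognised field was marked critical, so the message is rejected."""


KNOWN_TYPES = {1, 2}  # hypothetical set of type identifiers this peer understands


def accept_fields(fields: list[Field]) -> list[Field]:
    """Apply the criticality rule and return the fields this peer will process."""
    accepted = []
    for field in fields:
        if field.type_id in KNOWN_TYPES:
            accepted.append(field)
        elif field.critical:
            raise UnknownCriticalField(f"unrecognised critical type {field.type_id}")
        # Unrecognised and non-critical: ignored.
    return accepted


def fields_to_relay(fields: list[Field]) -> list[Field]:
    """Apply the transitivity rule: unrecognised fields an intermediary forwards unchanged."""
    return [f for f in fields if f.type_id not in KNOWN_TYPES and f.transitive]
```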
Version negotiation. As the application protocol may receive new major versions, a mechanism for version negotiation and upgrade is almost always desired. This must facilitate both forwards and backwards compatibility. Most simply this may be in the form of a simple version integer.
- Major/minor version. Some protocols may use a (major, minor) version number tuple, in which increases in the minor version are assumed to be backwards-compatible.
- Version/minimum version. Some protocols may use a (version, minimum version) tuple, in which peers state the most recent protocol version they support and the minimum version they are able to support.
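As an illustration of the (version, minimum version) scheme, the following sketch picks the highest version both peers can speak. The function name and the use of plain integers are assumptions; it also assumes each peer supports every version between its minimum and its most recent.

```python
from typing import Optional


def negotiate_version(local: tuple[int, int], remote: tuple[int, int]) -> Optional[int]:
    """Each argument is a (version, minimum_version) tuple.

    Returns the highest version both peers support, or None if the
    supported ranges do not overlap.
    """
    local_max, local_min = local
    remote_max, remote_min = remote
    candidate = min(local_max, remote_max)
    if candidate >= max(local_min, remote_min):
        return candidate
    return None
```

For example, a peer advertising (5, 3) and a peer advertising (4, 2) would settle on version 4, while (5, 4) and (3, 1) have no version in common.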
Capability negotiation. If the requirements incumbent on a protocol implementation only ever increase to a superset of the previous set, a simple version negotiation scheme suffices. Where the number of implementers of a protocol is large, or the feature set supported by the protocol is substantial so that different implementers may implement different subsets, more sophisticated feature negotiation may be necessary in the form of capability negotiation, in which one or all of the parties to an association may negotiate zero or more optional and zero or more required features known as capabilities. The initiator of an association might declare its capabilities with the responder listing the subset of those capabilities it supports; or it might refuse an association if it sees a capability it does not support; or the responder may declare its capabilities first, with the initiator choosing its supported capabilities from the list given; or the two may interact with one another in an interactive process, enumerating one another's capabilities until the process of capability negotiation is complete. These capabilities have identifiers; see type identifiers for discussion on identifier design.
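One of the variants described above, in which the initiator declares optional and required capabilities and the responder either selects a subset or refuses the association, might look like this. The capability names and the function `respond_to_offer` are made up for the example.

```python
class NegotiationFailed(Exception):
    """The responder lacks a capability the initiator marked as required."""


def respond_to_offer(
    offered_optional: set[str],
    offered_required: set[str],
    supported: set[str],
) -> set[str]:
    """Return the capabilities to enable for this association.

    Required capabilities must all be supported, otherwise the association
    is refused; optional capabilities are enabled only where supported.
    """
    missing = offered_required - supported
    if missing:
        raise NegotiationFailed(f"unsupported required capabilities: {sorted(missing)}")
    return offered_required | (offered_optional & supported)
```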
Implementation identifiers. Sometimes application layer protocols will have implementations offer a freeform string identifying the implementation being used. These do not affect the behaviour of the protocol and do not provide intelligible information to machines. Occasionally, where incompatibilities in protocol implementations have occurred due to bugs, or underspecification of the protocol, known implementation identifiers have been used to vary processing to work around such an incompatibility. This is not considered a good practice and should only be engaged in where absolutely necessary. It is also only possible if the protocol uses an implementation identifier.
Messaging paradigms. The bulk of the meaning of the application is in its handling of messages, though terminology may differ from protocol to protocol. Messages are frequently linked or related to other messages or used with one another in sequence. The following patterns of messaging are commonly found:
- Request/response. A request message is sent and a response is returned. Responses may often be distinguished by whether they represent success or failure.
- Request/multiple-response. A request message is sent; multiple responses are received.
- Request with interim responses. A request message is sent; interim response messages are returned, followed by a final response message.
- Request without response. A request message is sent and handled, but no response is sent.
- Notification. A message is sent other than in connection with a request, possibly asynchronously. Notifications may represent the occurrence of some asynchronously occurring event.
- Notification: acknowledgement, retransmission, reliability. Generally, responses to notifications are neither required nor accepted, but some application protocols may require them to be acknowledged, or retransmit them in future associations if they are not acknowledged, to render such notifications reliable.
Request framework. Various patterns aid in the implementation of the set of messaging paradigms an application-layer protocol chooses to support. Requests will generally have some kind of type identifier.
- Common fields. Some request or response fields may be defined to be common to all requests, including ones yet to be specified.
- Request success/failure (ACK/NAK). Responses will often contain standard fields common to all request types which indicate whether the request was successful or not.
- Standardised error signalling. The signalling of non-successful requests may be standardised to allow for generic understanding without understanding of the specific semantics of the request or error response. The error response might include some kind of standardised error identifier (whether integer or string), error message string, or both.
- Errors: error classes. Errors might be classified into multiple error classes, which provide some kind of significant distinction between different classes of errors. The class of error might be designated by its own field, or be inferable from the way error identifiers are allocated (for example, HTTP status codes).
- Errors: party at fault. Error classes might be used to express which party in an association is believed to be responsible for the request failure (for example, HTTP 4xx vs. HTTP 5xx).
- Errors: transience. Error classes might be used to express whether an error is permanent or transient, and thus whether a request should be subsequently retried (e.g. SMTP).
- Standardised error signalling: localisation. If error messages are used in error responses, they might be localised for different languages.
- Request IDs. Some protocols may allow a unique request identifier to be associated with a request, and quote the identifier in their response. This allows requests and responses to be correlated by a requester.
- Request IDs: scope of assignment. Request IDs might only be unique to an association, in which case they might simply be a small integer, or they might be globally unique.
- Request IDs: scope of use. Usually they will only have meaning within the association for request/response correlation purposes and not be transmitted elsewhere, though sometimes they might have wider use or significance.
- Pipelining. A protocol might allow multiple requests to be issued and be in-flight at once. If the requests are not guaranteed to complete and have their responses transmitted in the order they were issued, use of request IDs is required to allow request/response correlation; if they are always completed in order of issuance, this is not necessary.
- Cancellation. A protocol might allow a request previously made to be cancelled. If multiple requests can be in-flight at once, this requires request IDs to allow the request to be cancelled to be identified.
- Progress indication. A protocol might support the indication of the completion progress of a request currently in progress, via interim response or notification messages.
- Streaming: large or realtime responses. If a response is large, or if it is generated slowly over an extended period of time, for example as time passes, a protocol might support the streaming of a response over a period of time and over multiple messages, rather than sending it all at once.
- Streaming: indeterminate length. A protocol might support the streaming of a response whose length is indeterminate when streaming of it begins.
- Request classes. The different types of request may be separated into request classes with different protocol behaviours. A peer which does not support a given request type may or may not be able to discern its request class; it can do so if, for example, the request type identifier scheme embeds a request class signifier.
- Request classes: messaging paradigm. For example, there may be one class of request types for requests which have responses, and another for request types which do not.
- Request classes: mutation. There may be one class of request types for requests which can mutate state, and another for request types which cannot.
- Idempotent requests. Where idempotency is required, a protocol may use a field such as the request identifier (if globally unique) to guarantee that a request with a given identifier is executed at most once. A peer can then safely resubmit a request under the same identifier, for example after a failure which leaves it unsure whether the request was handled.
- Format negotiation. Occasionally, an implementation may be able to offer a response in multiple different formats, in which case a mechanism for the negotiation of the desired format may be implemented by an application-layer protocol (for example, HTTP content negotiation). Note that this is effectively Layer 6 functionality.
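Request IDs, pipelining and cancellation from the list above can be tied together in a small sketch of the requester's bookkeeping. The class `RequestTracker`, the dictionary message shape and the `ok` field are assumptions for illustration; the IDs here are small integers unique only to the association.

```python
import itertools


class RequestTracker:
    """Correlates responses with in-flight requests by association-scoped ID."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._pending: dict[int, str] = {}

    def send(self, request_type: str) -> dict:
        """Allocate an ID, remember the request, and return the message to transmit."""
        request_id = next(self._ids)
        self._pending[request_id] = request_type
        return {"id": request_id, "type": request_type}

    def cancel(self, request_id: int) -> bool:
        """Forget an in-flight request; a real protocol would also send a cancel message."""
        return self._pending.pop(request_id, None) is not None

    def receive(self, response: dict) -> tuple[str, bool]:
        """Match a response to its request, whatever order responses arrive in."""
        request_type = self._pending.pop(response["id"], None)
        if request_type is None:
            raise LookupError(f"no request in flight with id {response['id']}")
        return request_type, response["ok"]
```

Because responses are looked up by ID rather than by arrival order, the responder is free to complete pipelined requests out of order.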
Keepalives. An application-layer protocol might implement application-layer keepalives. Although this functionality can often be delegated to lower-layer protocols such as TCP, which support optional keepalive functionality, implementing it at the application layer also verifies that the application-layer code itself is functioning correctly, since that code must respond to the keepalive messages. (See for example the IRC PING command.)
Multiplexing. An application layer protocol might support the multiplexing of multiple logical message streams on a single lower-layer connection. What a message or stream refers to is application-specific, but the pattern is common. Such multiplexing functionality might be integrated directly into a framing mechanism discussed above, or might be separate without tight coupling to framing mechanisms.
- Head-of-line blocking. If multiplexing is performed via the interleaving of frames, frame size must be limited to avoid head-of-line blocking.
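The effect of limiting frame size can be seen in a sketch that cuts each stream's message into bounded chunks and interleaves them round-robin. The stream IDs, the `(stream_id, chunk)` frame shape and the function names are assumptions for the example; end-of-message marking is omitted for brevity.

```python
from itertools import zip_longest


def chunk(message: bytes, max_chunk: int) -> list[bytes]:
    """Cut a message into pieces of at most max_chunk bytes."""
    return [message[i:i + max_chunk] for i in range(0, len(message), max_chunk)]


def interleave(streams: dict[int, bytes], max_chunk: int) -> list[tuple[int, bytes]]:
    """Emit (stream_id, chunk) frames round-robin across all streams."""
    per_stream = [
        [(stream_id, piece) for piece in chunk(data, max_chunk)]
        for stream_id, data in streams.items()
    ]
    frames = []
    for round_of_frames in zip_longest(*per_stream):
        frames.extend(f for f in round_of_frames if f is not None)
    return frames


def reassemble(frames: list[tuple[int, bytes]]) -> dict[int, bytes]:
    """Concatenate chunks per stream, relying on in-order delivery within a stream."""
    out: dict[int, bytes] = {}
    for stream_id, piece in frames:
        out[stream_id] = out.get(stream_id, b"") + piece
    return out
```

With a bounded chunk size, a short message on one stream is sent after at most one chunk of a long message on another stream, rather than waiting for the whole of it.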
Transport agnosticism. An application-layer protocol might be transport agnostic. For example, it might be defined to operate over either TCP, TCP with TLS, or SCTP. Frequently this will involve the definition of the core of the application-layer protocol in a transport-agnostic way, and then supplementation of this specification with a transport-layer binding specification. Since some transports, such as SCTP, provide framing, framing facilities might not be needed for some transport bindings; therefore framing facilities might be specified as part of a transport binding and not the core protocol.
Routing: service identification. An application-layer protocol may also require the submission of a service identifier before providing service, to allow multiple services offered over the same transport (e.g. TCP port) to be disambiguated. Examples include the HTTP Host header.
Routing: resource identifiers. Within a given application, the subrouting of requests is often necessary; thus application-layer protocols may often include identifiers or addresses of specific network entities, resources or objects. Object-oriented protocols will require all objects to bear identifiers which can be used to refer to them; such identifiers might be a standard field common to all requests. Examples include URLs in HTTP.
Routing: application-layer networking. An application-layer protocol may allow requests to include internal routing information which allows the addressing of other protocol machines addressable via a given peer, possibly on different machines; this essentially resembles the headers of a Layer 3 networking protocol such as IP. For example, an application-layer protocol for an MMORPG might enable requests to be directed to an internal realm server, internal player information server or internal chat server based on addressing information in a request. Other examples include SMTP or XMPP, the purposes of which are to allow messaging beyond the scope of the parties to the transport-layer association. Some scheme of addressing must be created to address the entities which can be reached.
Routing: load balancing. An application-layer protocol may allow requests received by a role to be delegated to one of several third parties for the purposes of load balancing. The request is tracked to ensure that any responses are returned to the original submitter of the request. Various load balancing algorithms may be used to choose a third party from the set available.
- Routing: load balancing: affinity. Requests may have attributes which require that they be routed to a specific third party for proper handling (sharding), or it may be necessary for multiple requests received over the same association to be sent to the same third party (but it does not necessarily matter which third party it is). An example of this is what is commonly referred to as HTTP session cookie affinity.
Roles. An application-layer protocol will typically define different roles for different peers.
- Roles: client/server. The most common set of roles are client and server, in a point-to-point association between two peers.
- Roles: messaging paradigm. Roles may relate to messaging paradigms by defining whether a given role is permitted to make requests and whether it is permitted to receive requests. Notifications, if supported, may be exempt from these restrictions, may be subject to them, or may always go in the opposite direction.
- Roles: initiator/responder. Usually, a client initiates an underlying transport-layer association to a server and this establishes the assignment of the client and server roles. Occasionally, these are decoupled and the initiator of a transport-layer connection might be the server; in this case, the orthogonal terms initiator and responder can be used to designate which party initiated the transport-layer connection.
- Roles: P2P. Some protocols are peer-to-peer, in which case there is (usually) only one role (peer) which all peers have. These protocols can be viewed as symmetric, and client-server protocols as asymmetric in the peer relationship.
- TCP Simultaneous Open. An obscure feature of TCP is the ability of two hosts to connect to one another using two simultaneous connect() calls with matching source and destination IP and port numbers. A listening socket is not required. An application-layer protocol might be designed to make constructive use of this functionality.
Quit message. An application-layer protocol might support a quit message, to be responded to with a response merely confirming the quit message, or confirming it and terminating the transport-layer association. The advantage of this over simply tearing down the transport-layer association is that it ensures all previous requests have been handled by application-layer code. Since the completion of a request is made obvious by delivery of its response, this is only of significant use for protocols making substantial use of requests which do not receive responses. Transport-layer delivery guarantees such as TCP ACKs cannot be used for confirmation of handling as these only indicate data has been delivered to an OS kernel's TCP buffer and not to an application.
Application-layer flow control. An application-layer protocol may include facilities for flow control beyond those of the lower-layer transport protocol. For example, it may support the rate limiting of specific commands, specific categories of commands, or all commands.
- Single v. dual enforcement. Flow control may be enforced singly by the receiver, in which case it is expected that request submitters will sometimes exceed flow control limits, and be informed that they have done so by error responses. In dual enforcement, submitters are expected to be aware of the flow control limits and not submit requests in excess of the limit. This avoids the occurrence of violations in the first place, rather than simply refusing to process them when they occur. Flow control limits can also be solely implemented by submitters on the honour system, though this is not secure.
- Rate negotiation. In the dual enforcement case, the protocol may also support the advertisement or negotiation of its flow control functionality, whereby the flow control limits a peer is expected to abide by are negotiated and communicated dynamically, enabling the peer to understand the flow control requirements as well as the current state of the flow control mechanism.
- Flow control paradigms. There are numerous flow control paradigms. Examples might include a simple timer system in which requests are limited to a certain number per unit of time; a credit system in which peers receive a certain number of credits every unit of time, where any given command depletes a certain number of credits; or fixed limits on the number of requests which may be simultaneously in-flight.
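The credit system mentioned above can be sketched as a small class. The name `CreditLimiter`, the explicit `tick` method standing in for the passage of one unit of time, and the cap on accumulated credits are all assumptions made for the example. In a dual enforcement design, both the submitter and the receiver could run the same logic.

```python
class CreditLimiter:
    """Credit-based rate limiting: each command costs credits, replenished per tick."""

    def __init__(self, credits_per_tick: int, max_credits: int) -> None:
        self.credits_per_tick = credits_per_tick
        self.max_credits = max_credits
        self.credits = max_credits  # assumed: a peer starts with a full allowance

    def tick(self) -> None:
        """Called once per unit of time; credits accumulate up to the cap."""
        self.credits = min(self.max_credits, self.credits + self.credits_per_tick)

    def try_spend(self, cost: int) -> bool:
        """Deduct the cost of a command if affordable; otherwise refuse it."""
        if cost > self.credits:
            return False
        self.credits -= cost
        return True
```

A receiver enforcing the limit would answer a refused command with an error response, while a well-behaved submitter would simply delay sending until `try_spend` succeeds.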
Who sends first? Some application-layer protocols have the initiator of a lower-level transport association send first, while others have the responder send first. Protocols where the responder sends first, such as SSH, force the responder to reveal which protocol it implements before it has learned anything about the initiator. By contrast, protocols where the initiator sends first (e.g. HTTP) are more flexible, and offer the potential to multiplex multiple application-layer protocols on the same lower-level transport endpoint (e.g. TCP port number).
Speculative push. This is a rare function, known mainly from its adoption in HTTP/2 (server push); a response is issued to a synthetic request which was never actually issued by the peer, in the expectation that the peer may soon need the response. The peer handles the response as though it had issued the synthetic request itself.
Control/data plane split. Application-layer protocols may sometimes split their control and data planes. They may, for example, use multiple lower-level transport layer associations for a single application-layer association, with one such lower-level association being used for control, and the others for data. Examples include FTP and, when combined with SIP or H.323, RTP.
Envelope/payload split. Application-layer protocols may split requests into envelope and payload data, and relay the payload blindly as opaque data (application-layer networking) while using the envelope data for routing and other purposes. An example is SMTP, which treats envelope addresses and email bodies separately, even though those bodies will usually contain headers repeating the envelope addresses.
Referrals. Application-layer protocols may incorporate the ability to respond to requests with referrals to other services better able to process the request; for example, HTTP redirects.
- Referrals: temporary/permanent. An application-layer protocol may make a distinction between temporary and permanent referrals, where permanent referrals from a peer indicate the given request should never be issued to it again.
- Referrals: long-standing connections. An application layer protocol which uses long-standing associations (for example, because it sends asynchronous notifications) might support a different kind of referral, which relates not to a specific request but indicates the association should be torn down and replaced with a new association made with a new peer. For example, a chat server which is shutting down might tell its clients to connect to another chat server first.
Verbs. Some application-layer protocols may employ a reduced or limited set of request types intended to be used generically on a large number of resources of varied type, identified via resource identifiers, where each kind of resource has its own implementation of these standard verbs. A given resource may or may not implement any given verb, but each verb will have some set of universal semantics which must be met by all resources which implement it. This requires use of a single resource identifier namespace shared by all resource types. Examples include HTTP and SNMP.
- Common verb semantic requirements. Some verbs may require non-mutation or idempotency.
- Common verb sets. A common verb set is GET and SET, which maps to the notion of getting and setting configurable variables. Verb sets which reflect filesystem-like operations are also common. The popularity of HTTP has rendered its verb set common.
Enumeration. Some application-layer protocols may provide facilities for enumerating the resources or objects which can be addressed via a peer. For example, FTP provides directory listings and LDAP allows enumeration of directory objects. HTTP is notably deficient here, neglecting to provide directory listings even where its predecessor, Gopher, does so.
- Enumeration of supported request types/verbs. Some application-layer protocols may additionally support the enumeration and discovery of the supported request types which can be made against a resource, which may be described as reflection. (In the case of the RPC paradigm, for example, if this discovery functionality extends further to discovering the format of the information to be provided in a request and its response, this can be used to automatically generate call proxies in dynamic programming languages.)
Publish/subscribe. Some application-layer protocols may provide facilities for peers to subscribe to notifications of some given description, and subsequently transmit any notifications of that description to that peer. A peer can cancel its previous subscription using an unsubscribe request. Note that this is a higher-level pattern which can be built on top of aforementioned messaging paradigms.
Application-layer retransmission. Application-layer protocols may enable some messages, or higher-level constructs built on top of the protocol's messages, to be made reliable by requiring acknowledgement of their delivery and retransmitting messages which were transmitted over a previous association but not acknowledged. This is a higher level construct not to be confused with reliability of an underlying transport-layer protocol. Examples include SMTP, which provides for reliable email transmission.
- Reliable queueing. This is often used to implement a higher-level pattern such as reliable work or messaging queues.
- Inter-association peer identity. Where a peer is expected to initiate an association to collect messages queued to it, and the peer does not already have a natural identity (for example, because it does not authenticate), some protocols may assign peers identifiers to correlate successive associations made from the same peer and allow them to receive messages queued to them while no association for the peer was active, or which were not acknowledged. Examples include ZMTP (the protocol used by ZeroMQ).
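The combination of acknowledgement, retransmission across associations and a peer identity can be sketched with a per-peer outbox. The names `Outbox` and `outbox_for`, the sequence-number scheme and the string peer identities are hypothetical and chosen only for illustration; durable storage, which a real reliable queue would need, is left out.

```python
class Outbox:
    """Holds messages for one peer until the peer acknowledges them."""

    def __init__(self) -> None:
        self._next_seq = 1
        self._unacked: dict[int, bytes] = {}  # insertion-ordered by sequence number

    def enqueue(self, message: bytes) -> int:
        """Queue a message and return the sequence number the peer must acknowledge."""
        seq = self._next_seq
        self._next_seq += 1
        self._unacked[seq] = message
        return seq

    def acknowledge(self, seq: int) -> None:
        """Drop a message once the peer has confirmed handling it."""
        self._unacked.pop(seq, None)

    def pending(self) -> list[tuple[int, bytes]]:
        """Messages to (re)transmit when the peer next establishes an association."""
        return list(self._unacked.items())


# Outboxes are keyed by the peer identifier, so they outlive any one association.
outboxes: dict[str, Outbox] = {}


def outbox_for(peer_id: str) -> Outbox:
    return outboxes.setdefault(peer_id, Outbox())
```

When a peer reconnects and presents its identifier, the sender would transmit everything in `pending()` again, including messages sent over the previous association but never acknowledged.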
Transactions. The notion of atomic transactions, which apply totally or not at all and which can possibly be rolled back, is often built on top of an application-layer protocol. This is a higher-level pattern which can be built on top of the aforementioned messaging paradigms.
Note on layering. The fact that all these patterns are listed together does not imply that they cannot be decomposed into layered subprotocols, and in fact an analysis deducing such layering should be made. For example, an application-layer protocol which provides framing can consider this to be the lowest layer of itself, with other subprotocols of the application-layer protocol layered on top of it. Some functionality discussed herein, such as publish/subscribe, transactions or reliable queueing, is particularly high level and can be viewed as sitting above, for example, a general request/response/notification messaging layer, itself sitting on a framing layer. The full analysis of an application-layer protocol's subprotocols is inevitably specific to the protocol, so it is not discussed further here.