HTTP URLs form the foundation of the Web. HTTP URLs are dereferenceable by any standard HTTP client without additional infrastructure. This specification defines an HTTP-based Controlled Identifier [[CID-1.0]] Profile for Agent Identification on the Web.
This is an unofficial proposal.
HTTP URLs form the foundation of the Web. Importantly, HTTP URLs are dereferenceable by any standard HTTP client without additional infrastructure. This specification defines an HTTP-based Controlled Identifier (CID) [[CID-1.0]] Profile for Agent Identification on the Web.
A CID is a URL. Dereferencing a CID yields the CID document that may detail, e.g., the CID's verification methods or its controller. [[CID-1.0]] does not specify how exactly a CID is to be dereferenced; this depends on the particular type of URL used as the CID.
To achieve interoperability in CID-based agent identification and subsequent authentication on the Web, implementations of Clients and Servers require a common mechanism to dereference an agent's CID. This specification provides one such mechanism based on HTTP [[RFC9110]] and the architecture of the Web itself [[WEBARCH]].
This specification is for:
This specification adopts terminology from [[CID-1.0]], including but not limited to:
This specification further defines the following terminology:
This section describes the conformance model of the Web-CID Profile.
All assertions, diagrams, examples, and notes are non-normative, as are all sections explicitly marked non-normative. Everything else is normative.
The key words “MUST”, “MUST NOT”, “SHOULD”, and “MAY” are to be interpreted as described in BCP 14 [[!RFC2119]] [[!RFC8174]] when, and only when, they appear in all capitals, as shown here.
The key words “strongly encouraged”, “strongly discouraged”, “encouraged”, “discouraged”, “can”, “cannot”, “could”, “could not”, “might”, and “might not” are used for non-normative content.
The Web-CID Profile identifies the following Specification Category to distinguish the types of conformance: notation/syntax, processor behavior, protocol.
The Web-CID Profile identifies the following Classes of Products for conforming implementations. These products are referenced throughout this specification.
An agent and a Web-CID Agent Document are two distinct resources [[WEBARCH]], which cannot be identified by the same URL at the same time. This specification thus distinguishes between an agent identifier and a Web-CID Agent Document's identifier. Importantly, the two identifiers denote different things:
A Client can obtain the agent identifier's corresponding Web-CID Agent Document by dereferencing the agent identifier via HTTP [[RFC9110]]. The Client then validates that the retrieved Web-CID Agent Document is indeed authoritative for the expected agent identifier.
An agent identifier SHOULD be an HTTP URL that includes a fragment [[!URL]].
When a Server receives an HTTP request targeting an agent identifier that does not include a fragment, the Server MUST respond with a redirect and provide the corresponding Web-CID Agent Document identifier in the response's Location header field.
The current [[CID-1.0]] does not strictly forbid fragment identifiers but defines its algorithms in such a way that fragment identifiers effectively cannot be used.
A corresponding issue and a corresponding pull request have already been opened.
A Web-CID Agent Document is a CID document; all property definitions by [[!CID-1.0]] apply. In addition, this specification defines properties, restrictions, and examples as follows.
A Web-CID Agent Document thus describes the verification methods and services of an agent. It describes which verification methods or services might be used to make assertions about the agent. Such assertions can be used as authentication credentials, e.g., ID tokens [[OPENID-CONNECT-CORE]], SAML assertions [[SAML2-CORE]], or custom JSON Web Tokens [[RFC7519]], to authenticate an agent.
Using distinct identifiers for the agent and their Web-CID Agent Document allows expressing additional document-related metadata without ambiguity between the two.
To ensure broad interoperability across implementations of Clients and Servers,
a Web-CID Agent Document MUST specify JSON-LD's @context property to indicate its entire applicable context.
Concretely, a Web-CID Agent Document MUST include at least the JSON-LD context defined in [[CID-1.0]]. When a Web-CID Agent Document includes additional terms recommended by this specification, the Web-CID Agent Document MUST also include at least the corresponding JSON-LD context defined in this specification.
Let the following JSON-LD context be available at: https://www.w3.org/ns/web-cid/v1.
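As a non-normative illustration of the two @context requirements stated in this section, the following Python sketch checks them on a parsed document. It assumes that the CID 1.0 context is published at https://www.w3.org/ns/cid/v1 and that isPrimaryTopicOf is among the terms covered by the Web-CID context; the constant and function names are hypothetical.

```python
CID_CONTEXT = "https://www.w3.org/ns/cid/v1"          # assumption: context URL of CID 1.0
WEB_CID_CONTEXT = "https://www.w3.org/ns/web-cid/v1"  # context URL named in this specification
WEB_CID_TERMS = {"isPrimaryTopicOf"}                  # assumption: terms covered by the Web-CID context


def context_entries(document: dict) -> list:
    """Return the @context value as a list (JSON-LD allows a single value or an array)."""
    context = document.get("@context", [])
    return context if isinstance(context, list) else [context]


def has_required_contexts(document: dict) -> bool:
    """Check that the CID context is present, and the Web-CID context if its terms are used."""
    entries = context_entries(document)
    if CID_CONTEXT not in entries:
        return False
    uses_web_cid_terms = any(term in document for term in WEB_CID_TERMS)
    return WEB_CID_CONTEXT in entries or not uses_web_cid_terms
```

The check only inspects the topmost map and only string-valued context entries; a full implementation would rely on a JSON-LD processor.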
The CID specification [[!CID-1.0]] defines the subject of a CID document to be identified by the CID document's base identifier, which is the value of the id property in the topmost map of the CID document.
This means that the topmost map present in a Web-CID Agent Document describes the agent, i.e., the subject of the CID document.
A Web-CID Agent Document MUST include an id property to indicate the agent identifier of the agent that this Web-CID Agent Document describes.
The value of this id property is thus the URL of the agent who is the subject of the CID document.
A Web-CID Agent Document SHOULD include an isPrimaryTopicOf property to indicate its own identifier [[!FOAF]].
The value of this isPrimaryTopicOf property is thus either the URL of the Web-CID Agent Document itself or a map whose id property value is the URL of the Web-CID Agent Document itself.
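The relationship between the id and isPrimaryTopicOf properties described above can be sketched as follows. The example document and all example.org URLs are placeholders, the CID context URL is an assumption, and the function names are hypothetical.

```python
from typing import Optional

# Hypothetical example document; the example.org URLs are placeholders.
EXAMPLE_DOCUMENT = {
    "@context": [
        "https://www.w3.org/ns/cid/v1",      # assumption: context URL of CID 1.0
        "https://www.w3.org/ns/web-cid/v1",  # context URL named in this specification
    ],
    "id": "https://example.org/alice#me",           # the agent identifier
    "isPrimaryTopicOf": "https://example.org/alice",  # the document's own identifier
}


def document_identifier(document: dict) -> Optional[str]:
    """Return the URL given by isPrimaryTopicOf, whether it is a URL string or a map with an id."""
    value = document.get("isPrimaryTopicOf")
    if isinstance(value, dict):
        return value.get("id")
    return value


def describes_agent(document: dict, agent_identifier: str) -> bool:
    """True if the topmost id names the expected agent and the document's own URL is distinct from it."""
    if document.get("id") != agent_identifier:
        return False
    return document_identifier(document) != agent_identifier
```

For the example document, `describes_agent(EXAMPLE_DOCUMENT, "https://example.org/alice#me")` holds, while any other expected agent identifier is rejected.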
The CID specification [[!CID-1.0]] specifies the controller property such that
it is possible to express that a CID document or a certain verification method is controlled by a particular agent.
An agent might also be controlled by another agent, which is increasingly common with highly automated software agents.
Using distinct identifiers for the agent and their Web-CID Agent Document prevents additional ambiguity.
A Web-CID Agent Document SHOULD specify a controller property to indicate at least one controller of the document itself.
A Web-CID Agent Document might include one or more verification methods of an agent, e.g., for authentication or claim assertion, the choice of which depends on the particular protocols in which the agent takes part. See [[CID-1.0]] for more details on verification relationships.
A Web-CID Agent Document might include one or more services of an agent, e.g., to express ways of communicating with the controller or associated entities. A service can be any type of service the controller wants to advertise for further discovery, authentication, authorization, or interaction.
A Server MUST conform to HTTP Semantics [[!RFC9110]].
A Server MUST use TLS connections through the https URL scheme in order to secure the communication with Clients.
If a Server provides redirects from an agent identifier to a corresponding Web-CID Agent Document, then the Server SHOULD use a 303 status code and provide the URL of the Web-CID Agent Document in the Location header field.
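A minimal, non-normative sketch of such a redirect on the Server side is given below. The lookup table, its example.org URLs, and the function name are hypothetical; serving the document itself is omitted.

```python
from typing import Dict, Tuple

# Hypothetical mapping from fragment-less agent identifiers to document URLs.
DOCUMENT_FOR_AGENT = {
    "https://example.org/alice": "https://example.org/alice/card",
}


def respond(request_url: str) -> Tuple[int, Dict[str, str]]:
    """Return (status code, header fields) for a request targeting request_url."""
    document_url = DOCUMENT_FOR_AGENT.get(request_url)
    if document_url is not None:
        # The agent is not the document: point the Client to the document with a 303.
        return 303, {"Location": document_url}
    return 404, {}
```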
A Server MUST include a Content-Type header field in a message that contains content.
A Server MUST support content-negotiation for the media types of
application/cid [[!CID-1.0]],
application/json [[!JSON]], and
application/ld+json [[!JSON-LD]].
When serving a response based on content negotiation, a Server MUST include a Vary: Accept header field in the response to ensure proper caching behavior [[!RFC9110]].
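The following non-normative sketch shows one way a Server might select among the three required media types and attach the Vary header field. The Accept parsing is deliberately simplified (it ignores parameters other than q), the Server's preference order is an assumption, and all names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Media types required by this specification, in an assumed Server preference order.
SUPPORTED = ("application/cid", "application/ld+json", "application/json")


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media range, q) pairs; parameters other than q are ignored."""
    ranges = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def specificity(media_range: str) -> int:
    """Rank ranges: */* (0) is less specific than type/* (1), which is less specific than type/subtype (2)."""
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2


def matches(media_range: str, media_type: str) -> bool:
    if media_range == "*/*":
        return True
    range_type, _, range_subtype = media_range.partition("/")
    type_, _, subtype = media_type.partition("/")
    return range_type == type_ and range_subtype in ("*", subtype)


def quality(media_type: str, ranges: List[Tuple[str, float]]) -> float:
    """Weight of media_type: the q of the most specific matching range, or 0 if none matches."""
    matching = [(specificity(r), q) for r, q in ranges if matches(r, media_type)]
    return max(matching)[1] if matching else 0.0


def negotiate(accept_header: Optional[str]) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (media type, response header fields), or None if no supported type is acceptable."""
    ranges = parse_accept(accept_header) if accept_header else [("*/*", 1.0)]
    best = max(SUPPORTED, key=lambda media_type: quality(media_type, ranges))
    if quality(best, ranges) <= 0:
        return None
    return best, {"Content-Type": best, "Vary": "Accept"}
```

When `negotiate` returns None, a Server might answer with a 406 status code or fall back to a default representation; that choice is left open here.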
A Server MUST support Cross-Origin Resource Sharing (CORS) [[!FETCH]].
Concretely, whenever a Server receives an HTTP request containing a valid Origin header [[!RFC6454]],
the Server MUST respond with the appropriate Access-Control-* header fields as specified in the CORS protocol [[!FETCH]].
When serving a response with dynamically generated CORS headers based on the request's origin, a Server MUST include a Vary: Origin header field in the response to ensure proper caching behavior [[!RFC9110]].
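As a non-normative sketch, a Server that echoes the request's origin could compute its CORS header fields as shown below. The validity check is a simplification that accepts only http(s) origins of the form scheme://host[:port], and the function names are hypothetical; preflight handling is not shown.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def is_serialized_origin(value: str) -> bool:
    """Simplified check: an http(s) origin consisting of scheme and authority only."""
    try:
        parts = urlsplit(value)
    except ValueError:
        return False
    has_extra = bool(parts.path or parts.query or parts.fragment)
    return parts.scheme in ("http", "https") and bool(parts.netloc) and not has_extra


def cors_headers(origin: Optional[str]) -> Dict[str, str]:
    """Return CORS header fields for a request with the given Origin header value."""
    if not origin or not is_serialized_origin(origin):
        return {}
    # The response depends on the request's origin, so caches must key on it.
    return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
```

If the same response is also subject to content negotiation, the Vary header field would list both field names, e.g., `Vary: Accept, Origin`.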
A Client MUST conform to HTTP Semantics [[!RFC9110]].
A Client MUST use the Accept header field in an HTTP GET request to indicate acceptable media types for the requested Web-CID Agent Document [[!RFC9110]].
If a Client does not expect a given HTTP URL to be an agent identifier, the Client dereferences the URL as usual [[!RFC9110]][[!CID-1.0]].
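For illustration, a Client might construct such a request with Python's standard library as sketched below. The relative weights in the Accept value are an assumption about the Client's preferences, not something this specification prescribes, and the function name is hypothetical.

```python
from urllib.request import Request

# Assumed preference order; the q-values are illustrative only.
ACCEPT = "application/cid, application/ld+json;q=0.9, application/json;q=0.8"


def build_request(document_url: str) -> Request:
    """Build (but do not send) a GET request that states the acceptable media types."""
    return Request(document_url, headers={"Accept": ACCEPT}, method="GET")
```

The request can then be dispatched, e.g., with `urllib.request.urlopen`, subject to the Client's redirect policy.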
If a Client expects a given HTTP URL to be an agent identifier, dereferencing that identifier is expected to yield its corresponding Web-CID Agent Document:
If the agent identifier includes a fragment, a Client strips the fragment from the HTTP URL as usual [[!RFC9110]]. The Client then dispatches an HTTP request to the resulting URL.
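Stripping the fragment can be sketched with Python's standard library as follows; the function name and the example URL are hypothetical.

```python
from urllib.parse import urldefrag


def request_target(agent_identifier: str) -> str:
    """Return the URL to request: the agent identifier without its fragment."""
    url, _fragment = urldefrag(agent_identifier)
    return url
```

For example, the agent identifier `https://example.org/alice#me` results in a request to `https://example.org/alice`.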
If the agent identifier does not include a fragment, the Client checks that the Server does not conflate the agent with its corresponding Web-CID Agent Document:
When dereferencing this URL, if the Server responds with a 200 status code directly (implying the URL identifies the document itself), the Client MUST reject the response and provide a client-defined error.
The Client MUST only accept a redirect response (e.g., 303 status code) that provides a distinct URL to the Web-CID Agent Document in the Location header field.
Whether or not a Client allows following redirects depends on the trust model assumed by the implementation. Examples include but are not limited to the following:
If a Client chooses not to follow redirects, then the Client MUST provide a client-defined error as the result of the process.
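The handling of a fragment-less agent identifier described above can be sketched as a decision on the first response's status code and Location header field. This is a non-normative sketch: the set of accepted redirect status codes, the error class, and the function name are assumptions.

```python
from typing import Optional
from urllib.parse import urldefrag, urljoin

# Assumption: which redirect status codes a Client accepts is a policy choice.
REDIRECT_STATUS_CODES = (301, 302, 303, 307, 308)


class WebCidError(Exception):
    """Client-defined error (hypothetical name)."""


def document_url_for(agent_identifier: str, status: int, location: Optional[str],
                     follow_redirects: bool = True) -> str:
    """Return the document URL for a fragment-less agent identifier, or raise WebCidError."""
    if status == 200:
        # A direct 200 implies the URL identifies the document itself, not the agent.
        raise WebCidError("server conflates the agent with its document")
    if status not in REDIRECT_STATUS_CODES or not location:
        raise WebCidError("expected a redirect with a Location header field")
    if not follow_redirects:
        raise WebCidError("client policy forbids following redirects")
    # Location may be a relative reference; resolve it against the requested URL.
    target = urljoin(agent_identifier, location)
    if urldefrag(target).url == urldefrag(agent_identifier).url:
        raise WebCidError("redirect does not provide a distinct document URL")
    return target
```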
A Client MUST reject a non-conforming Web-CID Agent Document and provide a corresponding error [[!CID-1.0]].
As this profile is an application of the Controlled Identifier (CID) specification, all security considerations defined in [[CID-1.0]] apply directly to this specification.
The following HTTP-specific considerations also apply:
The id property as defined by [[CID-1.0]] is a critical security control for the prevention of substitution of cryptographic material.
This specification only provides one component to be used in an authentication, authorization, or, more generally, interaction protocol on the Web. While composing such protocols is commonplace in software engineering, it poses an inherent risk from a formal security perspective: composing a protocol from components that are individually sound does not entail that the resulting protocol is also sound. To ensure that the desired security properties hold for the composed protocol, the authors of this specification strongly recommend making trust assumptions explicit and verifying the protocol's security properties using formal methods, e.g., following the approach presented in [[BHKM24]].
As this profile is an application of the Controlled Identifier (CID) specification, all privacy considerations defined in [[CID-1.0]] apply directly to this specification.
Thanks to the participants of the W3C Solid Community Group and the participants of the W3C Linked Web Storage Working Group for their support.
Thanks to the editors of the Solid Protocol, especially Sarven Capadisli, Tim Berners-Lee, and Kjetil Kjernsmo, for providing inspiration and guidance on the structure and design of the conformance considerations in this specification. The Solid Protocol specification serves as an example to follow.
Thanks to Pierre-Antoine Champin for encouraging the discussion on agent identification. Thanks to all participants of that discussion who provided valuable input, in no particular order: Melvin Carvalho, Aaron Coburn, Laurens Debackere, elf Pavlik, Jacopo Scazzosi, Ted Thibodeau Jr.