Web-CID: A Controlled Identifier Profile for Agent Identification on the Web

HTTP URLs form the foundation of the Web. HTTP URLs are dereferenceable by any standard HTTP client without additional infrastructure. This specification defines an HTTP-based Controlled Identifier [[CID-1.0]] Profile for Agent Identification on the Web.

Introduction

HTTP URLs form the foundation of the Web. Importantly, HTTP URLs are dereferenceable by any standard HTTP client without additional infrastructure. This specification defines an HTTP-based Controlled Identifier (CID) [[CID-1.0]] Profile for Agent Identification on the Web.

A CID is a URL. Dereferencing a CID yields the CID document, which may detail, e.g., the CID's verification methods or its controller. How exactly a CID is to be dereferenced remains unspecified by [[CID-1.0]]; it depends on the particular type of URL used as the CID.

To achieve interoperability in CID-based agent identification and subsequent authentication on the Web, implementations of Clients and Servers require a common mechanism to dereference an agent's CID. This specification provides one such mechanism based on HTTP [[RFC9110]] and the architecture of the Web itself [[WEBARCH]].

This specification is for:

Identity provider server developers who want to identify agents on the Web using HTTP-based CIDs and to enable their server to support dereferencing an agent's CID for agent authentication - implementing a Server of Web-CID Agent Documents;
Authorization server developers who want to enable their server to support dereferencing an agent's CID for agent authentication - implementing a Client processing Web-CID Agent Documents;
Application developers who want to enable their application to support dereferencing an agent's CID, e.g., to obtain the agent's cryptographic assertion keys - implementing a Client processing Web-CID Agent Documents.

Terminology

This specification adopts terminology from [[CID-1.0]], including but not limited to:

controlled identifier document: as defined by [[CID-1.0]].
subject: as defined by [[CID-1.0]].
base identifier: as defined by [[CID-1.0]].
canonical URL: as defined by [[CID-1.0]].
controller: as defined by [[CID-1.0]].
verification method: as defined by [[CID-1.0]].

This specification further defines the following terminology:

agent: An agent is an entity that is able to initiate or perform actions, e.g., a person, an organisation, or a software application.
agent identifier: An HTTP URL identifying an agent.

Conformance

This section describes the conformance model of the Web-CID Profile.

Normative and Informative Content

All assertions, diagrams, examples, and notes are non-normative, as are all sections explicitly marked non-normative. Everything else is normative.

The key words “MUST”, “MUST NOT”, “SHOULD”, and “MAY” are to be interpreted as described in BCP 14 [[!RFC2119]] [[!RFC8174]] when, and only when, they appear in all capitals, as shown here.

The key words “strongly encouraged”, “strongly discouraged”, “encouraged”, “discouraged”, “can”, “cannot”, “could”, “could not”, “might”, and “might not” are used for non-normative content.

Specification Category

The Web-CID Profile identifies the following Specification Category to distinguish the types of conformance: notation/syntax, processor behavior, protocol.

Classes of Products

The Web-CID Profile identifies the following Classes of Products for conforming implementations. These products are referenced throughout this specification.

Web-CID Agent Document: A CID document that describes an agent identified by an HTTP URL.
Server: A Server that responds to HTTP requests to provide a Web-CID Agent Document.
Client: A Client that issues HTTP requests to obtain and process a Web-CID Agent Document.

Interoperability

Client–Server interoperability: Interoperability of Client and Server implementations is tested by evaluating an implementation’s ability to request, respond to, and process HTTP messages that conform to this specification. Interoperability is achieved when a Client, given an agent identifier, can successfully obtain and validate the corresponding Web-CID Agent Document from any conforming Server, and thereby establish the authoritative binding between the agent identifier and the Web-CID Agent Document and, subsequently, the agent's verification methods.

Agent Identifier

An agent and a Web-CID Agent Document are two distinct resources [[WEBARCH]], which cannot be identified by the same URL at the same time. This specification thus distinguishes between an agent identifier and a Web-CID Agent Document's identifier. Importantly, the two identifiers denote different things:

An agent identifier is an HTTP URL identifying an agent [[!URL]].
A Web-CID Agent Document's identifier is an HTTP URL identifying the corresponding information resource [[!WEBARCH]], i.e., the document describing the agent.

A Client can obtain the agent identifier's corresponding Web-CID Agent Document by dereferencing the agent identifier via HTTP [[RFC9110]]. A Client then validates the corresponding Web-CID Agent Document to be indeed authoritative for the expected agent identifier.

An agent identifier SHOULD be an HTTP URL that includes a fragment [[!URL]].

When a Server receives an HTTP request targeting an agent identifier that does not include a fragment, the Server MUST respond with a redirect and provide the corresponding Web-CID Agent Document identifier in the response's Location header field.
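As a non-normative illustration, the relationship between an agent identifier and the URL that is actually requested over HTTP can be sketched in Python. The helper name `split_agent_identifier` is hypothetical and not defined by this specification; the sketch only relies on `urllib.parse.urldefrag` from the standard library.

```python
from urllib.parse import urldefrag


def split_agent_identifier(agent_id: str) -> tuple:
    """Return (request_url, fragment) for an agent identifier.

    The fragment is never sent to the Server; only request_url is requested.
    An empty fragment means the identifier depends on a Server redirect to
    reach the Web-CID Agent Document.
    """
    request_url, fragment = urldefrag(agent_id)
    return request_url, fragment


if __name__ == "__main__":
    print(split_agent_identifier("https://example.org/doc01#agent"))
    print(split_agent_identifier("https://example.org/agent"))
```

With a fragment, the request URL already is the Web-CID Agent Document's identifier; without one, the two identifiers can only be told apart through the redirect.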

The current [[CID-1.0]] does not strictly forbid fragment identifiers but defines its algorithms in such a way that fragment identifiers effectively cannot be used.

A corresponding issue and a corresponding pull request have already been opened.

Web-CID Agent Document

A Web-CID Agent Document is a CID document; all property definitions by [[!CID-1.0]] apply. In addition, this specification defines properties, restrictions, and examples as follows.

A Web-CID Agent Document thus describes the verification methods and services of an agent. It describes which verification methods or services might be used to make assertions about the agent. Such assertions can be used as authentication credentials, e.g., ID tokens [[OPENID-CONNECT-CORE]], SAML assertions [[SAML2-CORE]], or custom JSON Web Tokens [[RFC7519]], to authenticate an agent.

Let the following Web-CID Agent Document be available at: https://example.org/doc01

{
  "@context": [ 
    "https://www.w3.org/ns/cid/v1", 
    "https://www.w3.org/ns/web-cid/v1" 
  ],
  "id": "https://example.org/doc01#agent",
  "authentication": [{
        "id": "https://example.org/doc01#key0",
        "type": "Multikey",
        "controller": "https://example.org/doc01#agent",
        "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
  }]
}

Using distinct identifiers for the agent and their Web-CID Agent Document allows expressing additional document-related metadata without ambiguity between the two.

Let the following Web-CID Agent Document be available at: https://example.org/doc01

{
  "@context": [ 
    "https://www.w3.org/ns/cid/v1", 
    "https://www.w3.org/ns/web-cid/v1" 
  ],
  "id": "https://example.org/doc01#agent",
  "authentication": [{
        "id": "https://example.org/doc01#key0",
        "type": "Multikey",
        "controller": "https://example.org/doc01#agent",
        "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
  }],
  "isPrimaryTopicOf": { 
        "id" : "https://example.org/doc01",
        "controller": [ "https://example.org/doc01#agent", "https://example.org/doc00#provider" ] ,
        "http://xmlns.com/foaf/0.1/maker": "https://example.org/doc00#provider" ,
        "http://www.w3.org/ns/posix/stat#mtime" : 1773335535
  }
}

Data Model

Context

To ensure broad interoperability across implementations of Clients and Servers, a Web-CID Agent Document MUST specify JSON-LD's @context property to indicate its entire applicable context.

Concretely, a Web-CID Agent Document MUST include at least the JSON-LD context defined in [[CID-1.0]]. When a Web-CID Agent Document includes additional terms recommended by this specification, the Web-CID Agent Document MUST also include at least the corresponding JSON-LD context defined in this specification.

Let the following JSON-LD context be available at: https://www.w3.org/ns/web-cid/v1.

{
  "@context": {
    "isPrimaryTopicOf": {
      "@id": "http://xmlns.com/foaf/0.1/isPrimaryTopicOf",
      "@type": "@id"
    }
  }
}
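The context requirements above might be checked by a Client along the following lines. This is a minimal, non-normative sketch: the function name `context_problems` and the set `WEB_CID_TERMS` are hypothetical, the context URLs are taken from the examples in this document, and the check only inspects top-level keys and contexts referenced by URL (inline context maps are ignored).

```python
CID_CONTEXT = "https://www.w3.org/ns/cid/v1"
WEB_CID_CONTEXT = "https://www.w3.org/ns/web-cid/v1"
# Assumption: the only term currently defined by the Web-CID context.
WEB_CID_TERMS = {"isPrimaryTopicOf"}


def context_problems(document: dict) -> list:
    """Return a list of human-readable context problems (empty if none)."""
    context = document.get("@context")
    if context is None:
        return ["missing @context"]
    # JSON-LD allows a single value or an array of values.
    entries = context if isinstance(context, list) else [context]
    problems = []
    if CID_CONTEXT not in entries:
        problems.append("missing CID context")
    uses_web_cid_terms = any(term in document for term in WEB_CID_TERMS)
    if uses_web_cid_terms and WEB_CID_CONTEXT not in entries:
        problems.append("missing Web-CID context")
    return problems
```

A full implementation would typically rely on a JSON-LD processor rather than on string comparison of context URLs.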

Agent ID

The CID specification [[!CID-1.0]] defines the subject of a CID document to be identified by the CID document's base identifier, which is the value of the id property in the topmost map of the CID document. This means that the topmost map present in a Web-CID Agent Document describes the agent, i.e., the subject of the CID document.

A Web-CID Agent Document MUST include an id property to indicate the agent identifier of the agent that this Web-CID Agent Document describes. The value of this id property is thus the URL of the agent who is the subject of the CID document.

Document ID

A Web-CID Agent Document SHOULD include an isPrimaryTopicOf property to indicate its own identifier [[!FOAF]].

The value of this isPrimaryTopicOf property is thus either the URL of the Web-CID Agent Document itself or a map whose id property value is the URL of the Web-CID Agent Document itself.
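Because the isPrimaryTopicOf value can take two shapes, a Client needs to handle both when reading the document's own identifier. The following non-normative sketch shows one way to do so; the function name `document_identifier` is hypothetical.

```python
from typing import Optional


def document_identifier(document: dict) -> Optional[str]:
    """Return the Web-CID Agent Document's own URL, if the document states one.

    Handles both shapes of isPrimaryTopicOf: a plain URL string, or a map
    whose id property carries the URL.
    """
    value = document.get("isPrimaryTopicOf")
    if isinstance(value, str):
        return value
    if isinstance(value, dict) and isinstance(value.get("id"), str):
        return value["id"]
    # The property is only a SHOULD, so absence is not an error here.
    return None
```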

Controller

The CID specification [[!CID-1.0]] specifies the controller property such that it is possible to express that a CID document or a certain verification method is controlled by a particular agent. An agent might also be controlled by another agent, which is increasingly common with highly automated software agents. Using distinct identifiers for the agent and their Web-CID Agent Document prevents additional ambiguity.

A Web-CID Agent Document SHOULD specify a controller property to indicate at least one controller of the document itself.

A controller of a CID document is capable of modifying the document's contents. Examples of controllers include the following:

the agent that is the subject of the CID document
an agent acting as a guardian of another agent, e.g., a parent of a child
a third party controlling the CID document, e.g., an Identity Provider
the entity exerting URI ownership [[WEBARCH]] of the CID

Verification Method

A Web-CID Agent Document might include one or more verification methods of an agent, e.g., for authentication or claim assertion, the choice of which depends on the particular protocols in which the agent takes part. See [[CID-1.0]] for more details on verification relationships.

Let the following Web-CID Agent Document be available at: https://example.org/doc01

{
  "@context": [ 
    "https://www.w3.org/ns/cid/v1", 
    "https://www.w3.org/ns/web-cid/v1"
  ],
  "id": "https://example.org/doc01#agent",
  "authentication": [{
        "id": "https://example.org/doc01#key0",
        "type": "Multikey",
        "controller": "https://example.org/doc01#agent",
        "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
  }],
  "isPrimaryTopicOf": { 
        "id" : "https://example.org/doc01",
        "controller": "https://example.org/doc01#agent"
  }
}

Service

A Web-CID Agent Document might include one or more services of an agent, e.g., to express ways of communicating with the controller or associated entities. A service can be any type of service the controller wants to advertise for further discovery, authentication, authorization, or interaction.

Let the following Web-CID Agent Document be available at: https://example.org/doc01

{
  "@context": [ 
    "https://www.w3.org/ns/cid/v1", 
    "https://www.w3.org/ns/web-cid/v1"
  ],
  "id": "https://example.org/doc01#agent",
  "service": [{
        "type": "https://example.org/serviceTypes#OpenIdProvider",
        "serviceEndpoint": "https://example.org/"
  }],
  "isPrimaryTopicOf": { 
        "id" : "https://example.org/doc01",
        "controller": "https://example.org/doc00#provider"
  }
}

Server

A Server MUST conform to HTTP Semantics [[!RFC9110]]. A Server MUST use TLS connections through the https URL scheme in order to secure the communication with Clients.

If a Server provides redirects from an agent identifier to a corresponding Web-CID Agent Document, then the Server SHOULD use a 303 status code and provide the URL of the Web-CID Agent Document in the Location header field.
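The redirect behavior can be illustrated with a small, framework-independent sketch. It is non-normative: the function `respond_to_agent_request` and the lookup table passed to it are hypothetical, and the 404 fallback for unknown paths is an assumption of this sketch rather than a requirement of this specification.

```python
def respond_to_agent_request(path: str, agent_to_document: dict) -> tuple:
    """Return (status_code, headers) for a request targeting an agent identifier.

    agent_to_document maps the path of a fragment-less agent identifier to the
    URL of the corresponding Web-CID Agent Document.
    """
    document_url = agent_to_document.get(path)
    if document_url is None:
        # Assumption: unknown agent identifiers are answered with 404.
        return 404, {}
    # 303 signals that the document describes the agent but is not the agent.
    return 303, {"Location": document_url}


if __name__ == "__main__":
    table = {"/agent": "https://example.org/doc01"}
    print(respond_to_agent_request("/agent", table))
```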

A Server MUST include a Content-Type header field in a message that contains content.

Content-Negotiation

A Server MUST support content negotiation for the media types application/cid [[!CID-1.0]], application/json [[!JSON]], and application/ld+json [[!JSON-LD]].

When serving a response based on content negotiation, a Server MUST include a Vary: Accept header field in the response to ensure proper caching behavior [[!RFC9110]].
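One way a Server might select a representation is sketched below. The sketch is non-normative and deliberately simplified: the names `parse_accept`, `quality`, `select_media_type`, and `response_headers` are hypothetical, the Server preference order in `SUPPORTED` is an assumption, and media type parameters other than `q` are ignored.

```python
from typing import Optional

# Assumption: the Server's own preference order among the required types.
SUPPORTED = ("application/cid", "application/ld+json", "application/json")


def parse_accept(header: str) -> list:
    """Parse an Accept header into (media_range, q) pairs (simplified)."""
    ranges = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_range, q))
    return ranges


def quality(candidate: str, ranges: list) -> float:
    """Weight of a candidate type; the most specific matching range wins."""
    weights = dict(ranges)
    main_type = candidate.split("/")[0]
    for pattern in (candidate, main_type + "/*", "*/*"):
        if pattern in weights:
            return weights[pattern]
    return 0.0


def select_media_type(accept_header: str) -> Optional[str]:
    """Return the best supported media type, or None if none is acceptable."""
    ranges = parse_accept(accept_header) or [("*/*", 1.0)]
    best, best_q = None, 0.0
    for candidate in SUPPORTED:
        q = quality(candidate, ranges)
        if q > best_q:
            best, best_q = candidate, q
    return best


def response_headers(media_type: str) -> dict:
    """Headers for a negotiated response, including the required Vary field."""
    return {"Content-Type": media_type, "Vary": "Accept"}
```

When `select_media_type` returns `None`, a Server could, for example, answer with a 406 status code; that choice is left to the implementation.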

Cross-Origin Resource Sharing (CORS)

A Server MUST support Cross-Origin Resource Sharing (CORS) [[!FETCH]].

Concretely, whenever a Server receives an HTTP request containing a valid Origin header [[!RFC6454]], the server MUST respond with the appropriate Access-Control-* header fields as specified in the CORS protocol [[!FETCH]]. When serving a response with dynamically generated CORS headers based on the request's origin, a Server MUST include a Vary: Origin header field in the response to ensure proper caching behavior [[!RFC9110]].
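A Server that computes its CORS response per request might do so roughly as follows. This is a non-normative sketch under stated assumptions: the function `cors_headers` and its `allowed_origins` parameter are hypothetical, only the Access-Control-Allow-Origin field is shown, and preflight handling is omitted.

```python
from typing import Optional


def cors_headers(origin: Optional[str], allowed_origins: Optional[set] = None) -> dict:
    """Return CORS-related response headers for a simple (non-preflight) request.

    origin is the value of the request's Origin header, or None if absent.
    allowed_origins restricts which origins are granted access; None means
    every origin is granted access.
    """
    # The response depends on the request's Origin, so caches must be told.
    headers = {"Vary": "Origin"}
    if origin is None:
        return headers
    if allowed_origins is not None and origin not in allowed_origins:
        return headers
    headers["Access-Control-Allow-Origin"] = origin
    return headers
```

Since Web-CID Agent Documents are meant to be publicly dereferenceable, a Server could alternatively return a static `Access-Control-Allow-Origin: *`, in which case the response no longer varies by origin.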

Client

A Client MUST conform to HTTP Semantics [[!RFC9110]].

A Client MUST use the Accept header field in an HTTP GET request to indicate acceptable media types for the requested Web-CID Agent Document [[!RFC9110]].
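For illustration, a Client request carrying an Accept header field could be prepared as shown below. The sketch is non-normative: the function `build_request` is hypothetical, and the relative preference expressed by the `q` values in `ACCEPT` is an assumption of this example, not something this specification prescribes. The request is only constructed, not sent.

```python
from urllib.parse import urldefrag
from urllib.request import Request

# Assumption: this Client prefers application/cid over the other two types.
ACCEPT = "application/cid, application/ld+json;q=0.9, application/json;q=0.8"


def build_request(agent_id: str) -> Request:
    """Build (but do not send) a GET request for an agent identifier."""
    # The fragment is a client-side concern and is not part of the request.
    request_url = urldefrag(agent_id).url
    return Request(request_url, headers={"Accept": ACCEPT}, method="GET")


if __name__ == "__main__":
    request = build_request("https://example.org/doc01#agent")
    print(request.get_method(), request.full_url, request.get_header("Accept"))
```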

Dereferencing an Agent Identifier

If a Client does not expect a given HTTP URL to be an agent identifier, the Client dereferences the URL as usual [[!RFC9110]][[!CID-1.0]].

If a Client expects a given HTTP URL to be an agent identifier, dereferencing that identifier is expected to yield its corresponding Web-CID Agent Document:

If the agent identifier includes a fragment, a Client strips the fragment from the HTTP URL as usual [[!RFC9110]]. The Client then dispatches an HTTP request to the resulting URL.
If the agent identifier does not include a fragment, the Client checks that the Server does not conflate the agent with its corresponding Web-CID Agent Document: When dereferencing this URL, if the Server responds with a 200 status code directly (implying the URL identifies the document itself), the Client MUST reject the response and provide a client-defined error. The Client MUST only accept a redirect response (e.g., 303 status code) that provides a distinct URL to the Web-CID Agent Document in the Location header field.
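The two cases above can be sketched as a pure function that interprets the Server's first response. The sketch is non-normative: the names `DereferenceError` and `resolve_document_url` are hypothetical, and the set of status codes treated as redirects is an assumption of this example. Whether the returned redirect target is actually followed remains subject to the Client's trust model.

```python
from typing import Optional
from urllib.parse import urldefrag, urljoin


class DereferenceError(Exception):
    """Client-defined error raised when dereferencing fails."""


# Assumption: status codes this Client treats as redirects.
REDIRECT_STATUS_CODES = {301, 302, 303, 307, 308}


def resolve_document_url(agent_id: str, status: int, location: Optional[str] = None) -> str:
    """Return the URL of the Web-CID Agent Document for the first response."""
    request_url, fragment = urldefrag(agent_id)
    if status in REDIRECT_STATUS_CODES and location:
        # Location may be relative; resolve it against the request URL.
        target = urldefrag(urljoin(request_url, location)).url
        if target == request_url:
            raise DereferenceError("redirect does not provide a distinct URL")
        return target
    if status == 200 and fragment:
        # The fragment already separates the agent from the document.
        return request_url
    if status == 200:
        raise DereferenceError("Server conflates the agent with its document")
    raise DereferenceError("unexpected status code %d" % status)
```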

Whether or not a Client allows following redirects depends on the trust model assumed by the implementation. Examples include but are not limited to the following:

Some Clients might trust the controller of a Web-CID Agent Document to configure their Servers correctly and thus follow any redirect.
Some Clients might enforce stricter trust boundaries and thus follow only redirects that point to the same origin as the agent identifier's origin.
Some Clients might choose not to follow any redirect.

If a Client chooses not to follow redirects, then the Client MUST provide a client-defined error as the result of the process.
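The three example trust models could be expressed as a small redirect policy check, sketched here in a non-normative way. The policy names `"any"`, `"same-origin"`, and `"none"` as well as the functions `origin` and `may_follow_redirect` are hypothetical; the origin comparison uses the scheme, host, and port of the URLs.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple of a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, (parts.hostname or "").lower(), port


def may_follow_redirect(agent_id: str, target: str, policy: str) -> bool:
    """Decide whether a redirect from agent_id to target may be followed."""
    if policy == "any":
        return True
    if policy == "same-origin":
        return origin(agent_id) == origin(target)
    if policy == "none":
        return False
    raise ValueError("unknown redirect policy: %r" % policy)
```

A Client for which `may_follow_redirect` returns `False` would stop at this point and report its client-defined error.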

Validating an Agent Identifier against a Web-CID Agent Document

A Client MUST reject a non-conforming Web-CID Agent Document and provide a corresponding error [[!CID-1.0]].
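A heavily simplified version of such a check is sketched below. It is non-normative and does not implement the full processing defined by [[CID-1.0]]: it only verifies that the document is a map with a context and that its id property equals the expected agent identifier. The names `ValidationError` and `validate_agent_document` are hypothetical.

```python
class ValidationError(Exception):
    """Client-defined error for a non-conforming Web-CID Agent Document."""


def validate_agent_document(document: object, expected_agent_id: str) -> dict:
    """Return the document if it passes the simplified checks, else raise."""
    if not isinstance(document, dict):
        raise ValidationError("document is not a JSON object")
    if "@context" not in document:
        raise ValidationError("document lacks @context")
    if document.get("id") != expected_agent_id:
        # The document does not claim to describe the expected agent.
        raise ValidationError("id does not match the expected agent identifier")
    return document
```

For example, a document whose id is `https://example.org/doc01#agent` would be accepted for that agent identifier and rejected for `https://example.org/doc01#other`.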

Security Considerations

As this profile is an application of the Controlled Identifier (CID) specification, all security considerations defined in [[CID-1.0]] apply directly to this specification.

The following HTTP-specific considerations also apply:

HTTP Fetching Risks: Because Clients may handle HTTP redirects, implementations are subject to standard HTTP fetching risks, including redirect loops and resource exhaustion. Clients are strongly encouraged to apply standard mitigation strategies as defined in the Fetch Standard [[FETCH]].
Redirects: A Client that chooses to follow redirects assumes the risk of the redirecting Server being mistakenly or maliciously configured and thus pointing to a malicious Server. In such a case, the Client might obtain a valid Web-CID Agent Document and validate it to be authoritative despite it being malicious. A Client can mitigate this risk by not following redirects, which means accepting only agent identifiers that include a fragment [[URL]].
Reliance on DNS: As for any HTTP-based communication that relies on DNS, there exists the risk of a malicious actor being able to modify the DNS record of a particular domain name to point to a malicious Server. In such a case, the Client might obtain a valid Web-CID Agent Document and validate it to be authoritative despite it being malicious. This risk can be addressed by mechanisms such as the DNS Security Extensions (DNSSEC) presented in [[RFC4033]].
Authoritative Binding Validation: The security of this profile relies entirely on the Client successfully executing the validation steps. Strict verification of the id property by Clients, as defined by [[CID-1.0]], is a critical security control that prevents substitution of cryptographic material.

This specification only provides one component to be used in an authentication, authorization, or, more generally, interaction protocol on the Web. While composing such protocols is commonplace in software engineering, it poses an inherent risk from a formal security perspective: Composing a protocol from components that are individually sound does not entail that the resulting protocol is also sound. To ensure that the desired security properties hold for the composed protocol, the authors of this specification strongly recommend making trust assumptions explicit and verifying the protocol's security properties using formal methods, e.g., following the approach presented in [[BHKM24]].