Web Agent Identification: A Web-based Controlled Identifier Profile (Web-CID)

HTTP URIs form the foundation of the Web. HTTP URIs are dereferenceable by any standard HTTP client without additional infrastructure. This specification defines an HTTP-based Controlled Identifier [[CID-1.0]] Profile for Agent Identification on the Web.

Introduction

HTTP URIs form the foundation of the Web. Importantly, HTTP URIs are dereferenceable by any standard HTTP client without additional infrastructure. This specification defines an HTTP-based Controlled Identifier (CID) [[CID-1.0]] Profile for Agent Identification on the Web.

A CID is a URI. Dereferencing a CID yields the CID document, which may detail, e.g., the CID's verification methods or its controller. How exactly a CID is to be dereferenced remains unspecified by [[CID-1.0]]; it depends on the particular type of URI used as the CID.

To achieve interoperability in CID-based agent identification and subsequent authentication on the Web, implementations of Clients and Servers require a common mechanism to dereference an agent's CID. This specification provides one such mechanism based on HTTP [[RFC9110]] and the architecture of the Web itself [[WEBARCH]].

This specification is for:

Identity provider server developers who want to identify agents on the Web using HTTP-based CIDs and to enable their server to support dereferencing an agent's CID for agent authentication - implementing a Server of Web-CID Agent Documents;
Authorization server developers who want to enable their server to support dereferencing an agent's CID for agent authentication - implementing a Client processing Web-CID Agent Documents;
Application developers who want to enable their application to support dereferencing an agent's CID, e.g., to obtain the agent's cryptographic assertion keys - implementing a Client processing Web-CID Agent Documents.

HTTP-based CIDs allow a user to simply click on a CID they encounter on a Web page to obtain more information directly in their Web browser.

Use Case: LWS UCS #48

Terminology

This specification adopts terminology from [[CID-1.0]], including but not limited to:

controlled identifier document: as defined by [[CID-1.0]].
subject: as defined by [[CID-1.0]].
base identifier: as defined by [[CID-1.0]].
canonical URL: as defined by [[CID-1.0]].
controller: as defined by [[CID-1.0]].
verification method: as defined by [[CID-1.0]].

This specification further defines the following terminology:

agent: An agent is an entity that is able to initiate or perform actions, e.g., a person, an organisation, or a software application.
agent identifier: An HTTP URI identifying an agent.

Conformance

This section describes the conformance model of the Web-CID Profile.

Normative and Informative Content

All assertions, diagrams, examples, and notes are non-normative, as are all sections explicitly marked non-normative. Everything else is normative.

The key words “MUST”, “MUST NOT”, “SHOULD”, and “MAY” are to be interpreted as described in BCP 14 [[!RFC2119]] [[!RFC8174]] when, and only when, they appear in all capitals, as shown here.

The key words “strongly encouraged”, “strongly discouraged”, “encouraged”, “discouraged”, “can”, “cannot”, “could”, “could not”, “might”, and “might not” are used for non-normative content.

Specification Category

The Web-CID Profile identifies the following Specification Category to distinguish the types of conformance: notation/syntax, processor behavior, protocol.

Classes of Products

The Web-CID Profile identifies the following Classes of Products for conforming implementations. These products are referenced throughout this specification.

Web-CID Agent Document: A CID document that describes an agent identified by an HTTP URI.
Server: A Server that responds to HTTP requests to provide a Web-CID Agent Document.
Client: A Client that issues HTTP requests to obtain and process a Web-CID Agent Document.

Interoperability

Client–Server interoperability: Interoperability of Client and Server implementations is tested by evaluating an implementation’s ability to request, respond to, and process HTTP messages that conform to this specification. Interoperability is achieved when a Client, given an agent identifier, can successfully obtain and validate the corresponding Web-CID Agent Document from any conforming Server, and thereby establish the authoritative binding between the agent identifier and the Web-CID Agent Document and, subsequently, the agent's verification methods.

Agent Identifier

An agent and a Web-CID Agent Document are two distinct resources [[WEBARCH]], which cannot be identified by the same URI at the same time. This specification thus distinguishes between an agent identifier and a Web-CID Agent Document's identifier. Importantly, the two identifiers denote different things:

An agent identifier is an HTTP URI identifying an agent [[!URI]].
A Web-CID Agent Document's identifier is an HTTP URI identifying the corresponding information resource [[!WEBARCH]], i.e., the document describing the agent.

A Client can obtain the Web-CID Agent Document corresponding to an agent identifier by dereferencing the agent identifier via HTTP [[RFC9110]]. The Client then validates that the obtained Web-CID Agent Document is indeed authoritative for the expected agent identifier.

An agent identifier SHOULD be an HTTP URI that includes a fragment [[!URI]].

When a Server receives an HTTP request targeting an agent identifier that does not include a fragment, the Server MUST respond with a redirect and provide the corresponding Web-CID Agent Document identifier in the response's Location header field.
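
The following non-normative Python sketch illustrates why a fragment keeps the two identifiers apart: the fragment is not sent to the Server in an HTTP request, so the request targets a URI that differs from the agent identifier. The function name `split_agent_identifier` is illustrative only and is not defined by this specification.

```python
from urllib.parse import urldefrag


def split_agent_identifier(agent_identifier: str) -> tuple[str, str]:
    """Split an agent identifier into (request URI, fragment)."""
    # urldefrag returns the URI without its fragment, plus the fragment
    # (an empty string when the identifier has none).
    request_uri, fragment = urldefrag(agent_identifier)
    return request_uri, fragment


# With a fragment, the agent and its document have distinct URIs.
print(split_agent_identifier("https://example.org/001#agent"))
# Without a fragment, both would share one URI unless the Server redirects.
print(split_agent_identifier("https://example.org/001"))
```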

Web-CID Agent Document

A Web-CID Agent Document is a CID document; all property definitions by [[!CID-1.0]] apply. In addition, this specification defines properties, restrictions, and examples as follows.

A Web-CID Agent Document thus describes the verification methods of an agent. It describes which verification methods might be used to make assertions about its primaryTopic, the agent. Such assertions can be used as authentication credentials, e.g., ID tokens [[OPENID-CONNECT-CORE]], SAML assertions [[SAML2-CORE]], or custom JSON Web Tokens [[RFC7519]], to authenticate an agent.

{
  "@context": [ 
    "https://www.w3.org/ns/cid/v1", 
    "https://www.w3.org/ns/web-cid/v1" 
  ],
  "id": "https://example.org/001",
  "primaryTopic": "https://example.org/001#agent",
  "controller": "https://example.org/001#agent",
  "authentication": [{
        "id": "https://example.org/001#key0",
        "type": "Multikey",
        "controller": "https://example.org/001#agent",
        "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
  }]
}

Context

To ensure broad interoperability across implementations of Clients and Servers, a Web-CID Agent Document MUST specify JSON-LD's @context property to indicate its entire applicable context.

Concretely, a Web-CID Agent Document MUST at least include the JSON-LD context defined in [[CID-1.0]] and the JSON-LD context defined in this specification.

Let the following JSON-LD context be available at: https://www.w3.org/ns/web-cid/v1.

{
  "@context": {
    "primaryTopic": {
      "@id": "http://xmlns.com/foaf/0.1/primaryTopic",
      "@type": "@id"
    }
  }
}
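
As a non-normative illustration, a Client might check the presence of both contexts roughly as follows. This is a minimal sketch that only compares context URLs as strings and performs no JSON-LD processing; the helper name `has_required_contexts` is an assumption of this example.

```python
REQUIRED_CONTEXTS = (
    "https://www.w3.org/ns/cid/v1",
    "https://www.w3.org/ns/web-cid/v1",
)


def has_required_contexts(document: dict) -> bool:
    """Return True if @context lists both required context URLs."""
    context = document.get("@context", [])
    # JSON-LD permits a single value instead of an array; normalise it.
    entries = context if isinstance(context, list) else [context]
    return all(url in entries for url in REQUIRED_CONTEXTS)
```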

ID

A Web-CID Agent Document MUST specify the id property to indicate its own identifier.

The CID specification [[!CID-1.0]] defines the subject of a CID document to be identified by the CID document's base identifier, which MUST also be the same as the canonical URL from which the CID document is retrieved.

This means that the URI identifying the subject is the same as the URL of the CID document. When distinguishing between an agent and a CID document describing the agent, this then means that the subject of a CID document is the CID document itself. The base identifier, the canonical URL of the CID document, cannot identify an agent at the same time.

Primary Topic

A Web-CID Agent Document MUST specify the primaryTopic property to indicate the agent it describes [[!FOAF]]. The value of the primaryTopic property is thus an agent identifier. The value of the primaryTopic property MUST be different from the value of the id property.
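
The two requirements on id and primaryTopic can be sketched as a simple check. This non-normative Python example assumes the document has already been parsed into a dictionary; the function name `check_primary_topic` is hypothetical.

```python
def check_primary_topic(document: dict) -> str:
    """Return the agent identifier, or raise ValueError if non-conforming."""
    doc_id = document.get("id")
    topic = document.get("primaryTopic")
    if not isinstance(doc_id, str) or not isinstance(topic, str):
        raise ValueError("both 'id' and 'primaryTopic' must be present")
    if topic == doc_id:
        # The document and the agent it describes need distinct identifiers.
        raise ValueError("'primaryTopic' must differ from 'id'")
    return topic
```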

Controller

A Web-CID Agent Document MUST specify at least one controller property to indicate any controller of the Web-CID Agent Document.

A controller of a CID document is capable of modifying the document's contents. Examples of controllers include:

the agent that is listed as the document's primaryTopic
an agent acting as a guardian of the agent that is listed as the document's primaryTopic, e.g., a parent of a child
a third party controlling the CID document, e.g., an Identity Provider
the entity exerting URI ownership [[WEBARCH]] of the CID

Verification Method

A Web-CID Agent Document might include one or more verification methods, e.g., for authentication or claim assertion, the choice of which depends on the particular protocols the agent takes part in. See [[CID-1.0]] for more details on verification relationships.

{
  "@context": [ 
    "https://www.w3.org/ns/cid/v1", 
    "https://www.w3.org/ns/web-cid/v1"
  ],
  "id": "https://example.org/001",
  "primaryTopic": "https://example.org/001#agent",
  "controller": "https://example.org/001#agent",
  "authentication": [{
        "id": "https://example.org/001#key0",
        "type": "Multikey",
        "controller": "https://example.org/001#agent",
        "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
  }]
}
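
A Client interested in an agent's authentication keys might collect them as sketched below. This non-normative example assumes, following the general CID data model, that entries of a verification relationship are either embedded verification methods or URL strings referring to entries of the document's verificationMethod property. References to methods in other documents are not resolved here. The helper name `authentication_methods` is illustrative.

```python
def authentication_methods(document: dict) -> list[dict]:
    """Return the verification methods usable for authentication."""
    # Methods listed under verificationMethod can be referenced by URL.
    by_id = {
        method["id"]: method
        for method in document.get("verificationMethod", [])
        if isinstance(method, dict) and "id" in method
    }
    methods = []
    for entry in document.get("authentication", []):
        if isinstance(entry, dict):
            methods.append(entry)  # embedded verification method
        elif isinstance(entry, str) and entry in by_id:
            methods.append(by_id[entry])  # reference to a listed method
    return methods
```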

Service

A Web-CID Agent Document might include one or more services, e.g., to express ways of communicating with the controller, or associated entities. A service can be any type of service the controller wants to advertise for further discovery, authentication, authorization, or interaction.

{
  "@context": [ 
    "https://www.w3.org/ns/cid/v1", 
    "https://www.w3.org/ns/web-cid/v1"
  ],
  "id": "https://example.org/001",
  "primaryTopic": "https://example.org/001#agent",
  "controller": "https://example.org/000#provider",
  "service": [{
        "type": "https://example.org/serviceTypes#OpenIdProvider",
        "serviceEndpoint": "https://example.org/"
  }]
}
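
For illustration, a Client might look up advertised services by type as follows. This non-normative sketch assumes that a service's type is either a single string or a list of strings; the function name `services_of_type` is not defined by this specification.

```python
def services_of_type(document: dict, service_type: str) -> list[dict]:
    """Return all services in the document that declare the given type."""
    matches = []
    for service in document.get("service", []):
        if not isinstance(service, dict):
            continue
        types = service.get("type", [])
        if isinstance(types, str):
            types = [types]  # normalise a single type to a list
        if service_type in types:
            matches.append(service)
    return matches
```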

Server

A Server MUST conform to HTTP Semantics [[!RFC9110]]. A Server MUST use TLS connections through the https URI scheme in order to secure the communication with Clients.

If a Server provides redirects from an agent identifier to a corresponding Web-CID Agent Document, then the Server SHOULD use a 303 status code and provide the URL of the Web-CID Agent Document in the Location header field.
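
The redirect behaviour can be sketched as a pure function from request path to response status and header fields. This is a non-normative illustration, not a complete Server. The mapping `AGENT_TO_DOCUMENT` and its paths are hypothetical, and the chosen Content-Type is only one of the media types a Server would offer.

```python
from http import HTTPStatus

# Hypothetical mapping from fragment-less agent identifiers (as paths) to
# the paths of their Web-CID Agent Documents.
AGENT_TO_DOCUMENT = {"/agents/alice": "/agents/alice/card"}


def respond(path: str) -> tuple[int, dict[str, str]]:
    """Return (status, header fields) for a GET request on the given path."""
    if path in AGENT_TO_DOCUMENT:
        # The path identifies an agent: redirect to its document with 303.
        return HTTPStatus.SEE_OTHER.value, {"Location": AGENT_TO_DOCUMENT[path]}
    if path in AGENT_TO_DOCUMENT.values():
        # The path identifies a document: serve it with a Content-Type.
        return HTTPStatus.OK.value, {"Content-Type": "application/cid"}
    return HTTPStatus.NOT_FOUND.value, {}
```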

A Server MUST include a Content-Type header field in a message that contains content.

Content-Negotiation

A Server MUST support content-negotiation for the media types of application/cid [[!CID-1.0]], application/json [[!JSON]], and application/ld+json [[!JSON-LD]].

When serving a response based on content negotiation, a Server MUST include a Vary: Accept header field in the response to ensure proper caching behavior [[!RFC9110]].
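
A simplified view of this negotiation is sketched below. The example is non-normative and deliberately minimal: it honours quality values and wildcard ranges but ignores other media type parameters. The Server's preference order in `SUPPORTED` and the use of a 406 status code for unacceptable requests are assumptions of this sketch.

```python
from typing import Optional

# Media types the Server offers, in the Server's own order of preference
# (the ordering is an assumption of this sketch).
SUPPORTED = ("application/cid", "application/ld+json", "application/json")


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media range, quality) pairs."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat malformed weights as not acceptable
        ranges.append((media_range, quality))
    return ranges


def specificity(media_range: str, media_type: str) -> int:
    """Return 2 for an exact match, 1 for type/*, 0 for */*, -1 for none."""
    if media_range == media_type:
        return 2
    if media_range == "*/*":
        return 0
    if media_range.endswith("/*") and media_type.startswith(media_range[:-1]):
        return 1
    return -1


def negotiate(accept: Optional[str]) -> Optional[str]:
    """Pick the supported media type the Client prefers, or None."""
    if accept is None or not accept.strip():
        return SUPPORTED[0]  # no preference stated
    ranges = parse_accept(accept)
    best, best_quality = None, 0.0
    for media_type in SUPPORTED:
        matching = [
            (specificity(media_range, media_type), quality)
            for media_range, quality in ranges
            if specificity(media_range, media_type) >= 0
        ]
        if not matching:
            continue
        quality = max(matching)[1]  # the most specific range decides
        if quality > best_quality:
            best, best_quality = media_type, quality
    return best


def response_headers(accept: Optional[str]) -> tuple[int, dict[str, str]]:
    """Return (status, header fields); Vary: Accept is always included."""
    headers = {"Vary": "Accept"}
    chosen = negotiate(accept)
    if chosen is None:
        return 406, headers
    headers["Content-Type"] = chosen
    return 200, headers
```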

Cross-Origin Resource Sharing (CORS)

A Server MUST support Cross-Origin Resource Sharing (CORS) [[!FETCH]].

Concretely, whenever a Server receives an HTTP request containing a valid Origin header field [[!RFC6454]], the Server MUST respond with the appropriate Access-Control-* header fields as specified in the CORS protocol [[!FETCH]]. A Server MUST also support the HTTP OPTIONS method [[!RFC9110]] such that it can respond appropriately to CORS preflight requests.
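
One possible way to derive the CORS response header fields is sketched below. This non-normative example encodes a permissive policy chosen purely for illustration (echoing the requesting origin and allowing read-only methods). Actual policies are deployment decisions, and the function name `cors_headers` is hypothetical.

```python
from typing import Optional


def cors_headers(method: str, origin: Optional[str],
                 requested_method: Optional[str] = None) -> dict[str, str]:
    """Return CORS response header fields for a request.

    `origin` is the value of the Origin header field, if any, and
    `requested_method` the value of Access-Control-Request-Method.
    """
    if origin is None:
        return {}  # no Origin header field; no CORS fields are added
    headers = {
        # Echo the requesting origin (a policy assumption of this sketch).
        "Access-Control-Allow-Origin": origin,
        # The response depends on Origin, so caches need to know.
        "Vary": "Origin",
    }
    if method == "OPTIONS" and requested_method is not None:
        # CORS preflight: advertise the methods and headers Clients may use.
        headers["Access-Control-Allow-Methods"] = "GET, HEAD, OPTIONS"
        headers["Access-Control-Allow-Headers"] = "Accept"
    return headers
```

A Server that also performs content negotiation would merge the two Vary values, e.g., into Vary: Accept, Origin.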

Client

A Client MUST conform to HTTP Semantics [[!RFC9110]].

A Client MUST use the Accept header field in an HTTP GET request to indicate acceptable media types for the requested Web-CID Agent Document [[!RFC9110]].
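
As a non-normative illustration, such a request can be prepared with Python's standard library as follows. The quality values in `ACCEPT` express one possible preference order and are an assumption of this sketch. The request is only constructed, not sent.

```python
from urllib.request import Request

# One possible preference order among the media types a Server supports.
ACCEPT = "application/cid, application/ld+json;q=0.9, application/json;q=0.8"


def build_request(document_uri: str) -> Request:
    """Prepare a GET request that states the acceptable media types."""
    return Request(document_uri, headers={"Accept": ACCEPT}, method="GET")


request = build_request("https://example.org/001")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```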

Dereferencing an Agent Identifier

If a Client does not expect a given HTTP URI to be an agent identifier, the Client dereferences the URI as usual [[!RFC9110]][[!CID-1.0]].

If a Client expects a given HTTP URI to be an agent identifier, dereferencing that identifier is expected to yield its corresponding Web-CID Agent Document:

If the agent identifier includes a fragment, a Client strips the fragment from the HTTP URI as usual [[!RFC9110]]. The Client then dispatches an HTTP request to the resulting URI.
If the agent identifier does not include a fragment, the Client checks that the Server does not conflate the agent with its corresponding Web-CID Agent Document: When dereferencing this URI, if the Server responds with a 200 status code directly (implying the URI identifies the document itself), the Client MUST reject the response and provide a client-defined error. The Client MUST only accept a redirect response (e.g., 303 status code) that provides a distinct URI to the Web-CID Agent Document in the Location header field.
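
The two cases above can be sketched as follows. This non-normative Python example takes the status code and Location value of an already received response as inputs, so that no network access is needed. The names `DereferenceError`, `request_uri_for`, and `document_uri_from_response`, as well as the set of accepted redirect status codes, are assumptions of this sketch.

```python
from typing import Optional
from urllib.parse import urldefrag, urljoin

# Redirect status codes this sketch accepts (an assumption; the
# specification names 303 as an example).
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


class DereferenceError(Exception):
    """Client-defined error raised when dereferencing fails."""


def request_uri_for(agent_identifier: str) -> str:
    """Strip any fragment to obtain the URI the HTTP request targets."""
    return urldefrag(agent_identifier).url


def document_uri_from_response(agent_identifier: str, status: int,
                               location: Optional[str]) -> str:
    """Evaluate the response to a request on a fragment-less identifier."""
    if urldefrag(agent_identifier).fragment:
        raise ValueError("identifier has a fragment; this check does not apply")
    if status == 200:
        # A direct 200 implies the URI identifies the document itself.
        raise DereferenceError("Server conflates agent and document")
    if status not in REDIRECT_STATUSES or not location:
        raise DereferenceError(f"expected a redirect, got status {status}")
    # Location may be a relative reference; resolve it against the request.
    document_uri = urljoin(agent_identifier, location)
    if urldefrag(document_uri).url == agent_identifier:
        raise DereferenceError("redirect does not provide a distinct URI")
    return document_uri
```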

Whether or not a Client allows following redirects depends on the trust model assumed by the implementation:

Some Clients might trust the controller of a Web-CID Agent Document to configure their Servers correctly and thus follow any redirect.
Some Clients might enforce stricter trust boundaries and thus follow only redirects that point to the same origin as the agent identifier's origin.
Some Clients might choose not to follow any redirect.

If a Client chooses to not follow redirects, then the Client MUST provide a client-defined error as the result of the process.
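
The three trust models can be sketched as a small policy check applied before a redirect is followed. This non-normative example assumes the redirect target has already been resolved to an absolute URI; the names `RedirectPolicy`, `RedirectRefused`, and `check_redirect` are illustrative.

```python
from enum import Enum
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


class RedirectPolicy(Enum):
    FOLLOW_ANY = "follow-any"
    SAME_ORIGIN = "same-origin"
    NEVER = "never"


class RedirectRefused(Exception):
    """Client-defined error raised when a redirect is not followed."""


def origin_of(uri: str) -> tuple:
    """Return the (scheme, host, port) triple of an absolute URI."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, parts.hostname, port


def check_redirect(policy: RedirectPolicy, agent_identifier: str,
                   target: str) -> str:
    """Return the target if the policy permits following the redirect."""
    if policy is RedirectPolicy.NEVER:
        raise RedirectRefused("redirects are not followed")
    if (policy is RedirectPolicy.SAME_ORIGIN
            and origin_of(agent_identifier) != origin_of(target)):
        raise RedirectRefused("redirect leaves the agent identifier's origin")
    return target
```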

Validating an Agent Identifier against a Web-CID Agent Document

A Client MUST reject a non-conforming Web-CID Agent Document and provide a client-defined error.

In addition, a Client MUST verify that the value of the primaryTopic property of the Web-CID Agent Document matches the expected agent identifier, i.e., the exact URI including any fragment, to validate that the obtained document is indeed authoritative for the given agent identifier.
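
The binding check can be sketched as an exact string comparison. This non-normative example performs no URI normalisation, which is an assumption of this sketch; the names `BindingError` and `validate_binding` are hypothetical.

```python
class BindingError(Exception):
    """Client-defined error raised when a document is not authoritative."""


def validate_binding(document: dict, expected_agent_identifier: str) -> None:
    """Check that the document is authoritative for the expected agent."""
    topic = document.get("primaryTopic")
    # Compare the full URI, including any fragment, without normalisation.
    if topic != expected_agent_identifier:
        raise BindingError(
            f"document describes {topic!r}, expected {expected_agent_identifier!r}"
        )
```

For example, a document whose primaryTopic is https://example.org/001#agent is rejected when the expected agent identifier is https://example.org/001, because the fragment is part of the comparison.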

Security Considerations

As this profile is an application of the Controlled Identifier (CID) specification, all security considerations defined in [[CID-1.0]] apply directly to this specification.

The following HTTP-specific considerations also apply:

HTTP Fetching Risks: Because Clients may handle HTTP redirects, implementations are subject to standard HTTP fetching risks, including redirect loops and resource exhaustion. Clients are strongly encouraged to apply standard mitigation strategies as defined in the Fetch Standard [[FETCH]].
Redirects: A Client that chooses to follow redirects assumes the risk of the redirecting Server being mistakenly or maliciously configured and thus pointing to a malicious Server. In such a case, the Client might obtain a valid Web-CID Agent Document and validate it to be authoritative despite it being malicious. A Client can mitigate this risk by not following redirects, which means accepting only agent identifiers that include a fragment [[URI]].
Reliance on DNS: As for any HTTP-based communication that relies on DNS, there exists the risk of a malicious actor being able to modify the DNS record of a particular domain name entry to point to a malicious Server. In such a case, the Client might obtain a valid Web-CID Agent Document and validate it to be authoritative despite it being malicious. This risk can be addressed by mechanisms presented in [[RFC4033]].
Authoritative Binding Validation: The security of this profile relies entirely on the Client successfully executing the validation steps. Strictly verifying the id and primaryTopic properties as defined is a security-critical control to prevent substitution of cryptographic material.

This specification only provides one component to be used in an authentication, authorization, or, more generally, interaction protocol on the Web. While composing such protocols is commonplace in software engineering, it poses an inherent risk from a formal security perspective: Composing a protocol from components that are individually sound does not entail that the resulting protocol is also sound. To ensure desired security properties hold for the composed protocol, the authors of this specification strongly recommend making trust assumptions explicit and verifying the protocol's security properties using formal methods, e.g., following the approach presented in [[BHKM24]].