2. A2A over MQTT

This section defines agent-to-agent (A2A) messaging over MQTT. Agents are named long-running processes that publish presence, accept tasks, and may delegate tasks to peers. The wire protocol is small: a retained card per agent, a single inbox topic per agent, two kinds of response topic (a per-agent broadcast topic and a per-task topic), and a defined task lifecycle.

2.1 Topic shapes

Agents MUST use the following topic patterns. {ns} is the namespace from Substrate, topic conventions; {agent_id} is the agent's identifier from Substrate, identity claims; and {task_id} is a task-scoped identifier (typically a UUID v4).

Topic	Direction	QoS	Retained	Purpose
`{ns}/agents/{agent_id}/card`	publish (agent)	1	yes	Agent card with capabilities and endpoints.
`{ns}/agents/{agent_id}/status`	publish (agent)	1	yes	Lightweight online/offline status.
`{ns}/agents/+/card`	subscribe (peer)	1	n/a	Wildcard agent-card discovery.
`{ns}/tasks/{agent_id}/inbox`	publish (sender), subscribe (recipient)	1	no	Inbound task notification, shape `{ "task_id": ... }`.
`{ns}/tasks/{agent_id}/results`	publish (sender of result), subscribe (delegator)	1	no	Async result callback for tasks the subscriber delegated.
`{ns}/tasks/{task_id}/result`	publish (executor), subscribe (requester)	1	no	Per-task result for synchronous request/response (see Request/response correlation below).

Agent identifiers MUST follow Substrate, topic conventions. Implementations MUST NOT publish to a peer's card or status topic; an agent MUST publish only its own retained presence. Brokers MAY enforce this; the protocol assumes it.
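The topic patterns in the table can be sketched as a handful of builder functions. This is a minimal illustration, not part of the specification: the function names are hypothetical, and the check that an identifier contains no `/`, `+`, or `#` is an assumption standing in for the rules in Substrate, topic conventions, which are not reproduced here.

```python
def _level(value: str, what: str) -> str:
    """Reject values that cannot form a single MQTT topic level (assumed rule)."""
    if not value or any(ch in value for ch in "/+#"):
        raise ValueError(f"{what} is not a valid topic level: {value!r}")
    return value


def card_topic(ns: str, agent_id: str) -> str:
    return f"{ns}/agents/{_level(agent_id, 'agent_id')}/card"


def status_topic(ns: str, agent_id: str) -> str:
    return f"{ns}/agents/{_level(agent_id, 'agent_id')}/status"


def inbox_topic(ns: str, agent_id: str) -> str:
    return f"{ns}/tasks/{_level(agent_id, 'agent_id')}/inbox"


def results_topic(ns: str, agent_id: str) -> str:
    return f"{ns}/tasks/{_level(agent_id, 'agent_id')}/results"


def task_result_topic(ns: str, task_id: str) -> str:
    return f"{ns}/tasks/{_level(task_id, 'task_id')}/result"


def card_wildcard(ns: str) -> str:
    """Subscription filter for wildcard agent-card discovery."""
    return f"{ns}/agents/+/card"
```

Centralizing topic construction in one place makes it harder for an agent to publish to a peer's card or status topic by accident.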

2.2 Presence

An agent MUST establish presence in this order on start:

1. CONNECT to the broker with persistent session (Substrate, session and presence) and a Will configured to publish an offline payload (see below) to the agent's status topic with qos=1 and retain=true.
2. PUBLISH the agent card (see Agent cards below) to {ns}/agents/{agent_id}/card with qos=1, retain=true and status="online".
3. PUBLISH a lightweight status payload to {ns}/agents/{agent_id}/status with qos=1, retain=true and status="online".
4. SUBSCRIBE to its own inbox topic {ns}/tasks/{agent_id}/inbox if it accepts work, and to {ns}/tasks/{agent_id}/results if it delegates work.

On graceful shutdown (stop), the agent MUST first publish updated retained card and status documents with status="offline", then DISCONNECT normally. Implementations MAY inject 0 to 3 seconds of random jitter at startup to avoid thundering-herd presence storms when many agents come up together.
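The startup and shutdown sequences above can be sketched as plain data, independent of any particular MQTT client library. Everything named here (`Publish`, `startup_plan`, `shutdown_plan`, `startup_jitter`) is hypothetical; a real agent would hand the Will to its client at CONNECT and then issue the publishes and subscriptions in the listed order.

```python
import json
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Publish:
    topic: str
    payload: str
    qos: int = 1
    retain: bool = True


def startup_plan(ns, agent_id, card, accepts_work=True, delegates_work=False):
    """Return the Will, ordered publishes, and subscriptions for startup."""
    status = f"{ns}/agents/{agent_id}/status"
    will = Publish(status, json.dumps({"status": "offline", "agent": agent_id}))
    publishes = [
        Publish(f"{ns}/agents/{agent_id}/card",
                json.dumps({**card, "status": "online"})),
        Publish(status, json.dumps({"status": "online", "agent": agent_id})),
    ]
    subscriptions = []
    if accepts_work:
        subscriptions.append(f"{ns}/tasks/{agent_id}/inbox")
    if delegates_work:
        subscriptions.append(f"{ns}/tasks/{agent_id}/results")
    return {"will": will, "publishes": publishes, "subscriptions": subscriptions}


def shutdown_plan(ns, agent_id, card):
    """Retained offline card and status to publish before DISCONNECT."""
    return [
        Publish(f"{ns}/agents/{agent_id}/card",
                json.dumps({**card, "status": "offline"})),
        Publish(f"{ns}/agents/{agent_id}/status",
                json.dumps({"status": "offline", "agent": agent_id})),
    ]


def startup_jitter(rng: Optional[random.Random] = None) -> float:
    """Seconds to sleep before connecting, drawn uniformly from 0 to 3."""
    return (rng or random).uniform(0.0, 3.0)
```

Keeping the plan as data makes the required ordering easy to inspect in a unit test before any broker is involved.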

The lightweight status payload SHOULD have the following shape:

{
  "status": "online",
  "agent": "agent-a",
  "version": "0.1.0",
  "timestamp": "2026-05-07T10:00:00.000Z"
}

Required fields: status (one of "online" or "offline"), agent (the publisher's agent_id). Optional fields: version (publisher SDK or app version), timestamp (ISO-8601 UTC). Implementations MAY include additional fields; consumers MUST tolerate unknown fields.
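A consumer-side check of the status payload might look like the following sketch. The function name `parse_status` is hypothetical; the rules it applies are only the ones stated above (two required fields, unknown fields tolerated).

```python
import json


def parse_status(raw):
    """Validate a status payload and return it as a dict.

    Unknown fields are kept rather than rejected, because consumers
    must tolerate them.
    """
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("status payload must be a JSON object")
    if doc.get("status") not in ("online", "offline"):
        raise ValueError("status must be 'online' or 'offline'")
    agent = doc.get("agent")
    if not isinstance(agent, str) or not agent:
        raise ValueError("agent must be a non-empty string")
    return doc
```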

Example (non-normative): an operator watching one peer's status with mosquitto_sub:

mosquitto_sub -h broker.example -t 'myapp/agents/agent-a/status' -q 1

2.3 Request/response correlation

The protocol supports two delegation modes from a sender (call it S) to a recipient (call it R):

Mode A, fire-and-forget (delegate). S records the task in its persistence layer, then publishes a notification envelope to {ns}/tasks/{R}/inbox:

{ "task_id": "abc-123" }

The envelope carries only the task identifier; R reads the full task, prompt, and conversation thread from the shared persistence layer using task_id. When R finishes, it MUST publish a result to {ns}/tasks/{S}/results (the broadcast results topic) and SHOULD also publish to {ns}/tasks/{task_id}/result for any synchronous waiter (Mode B).

Mode B, synchronous (request). S SUBSCRIBES to {ns}/tasks/{task_id}/result before publishing the inbox notification, then awaits the first message on that topic up to a configurable timeout. R MUST publish the same result envelope to that topic after completion. Senders MUST treat the absence of a message within the timeout as a failure and, once a result arrives or the timeout expires, SHOULD unsubscribe from the per-task topic to avoid accumulating subscriptions.
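The ordering in Mode B (subscribe first, then publish, then wait, then unsubscribe) can be sketched against an in-memory stand-in for an MQTT client. `LoopbackBroker` and `request` are hypothetical names; the stand-in matches topics exactly and delivers synchronously, and the step of recording the task in the persistence layer is omitted. Raising `TimeoutError` is one assumed way of treating a missing reply as a failure.

```python
import json
import queue
import uuid


class LoopbackBroker:
    """In-memory stand-in for an MQTT client; exact-match topics only."""

    def __init__(self):
        self._subs = {}

    def subscribe(self, topic, callback):
        self._subs.setdefault(topic, []).append(callback)

    def unsubscribe(self, topic, callback):
        self._subs[topic].remove(callback)

    def subscriber_count(self, topic):
        return len(self._subs.get(topic, []))

    def publish(self, topic, payload):
        for callback in list(self._subs.get(topic, [])):
            callback(topic, payload)


def request(broker, ns, recipient, timeout=5.0, task_id=None):
    """Send a task notification and wait for the first per-task result."""
    task_id = task_id or str(uuid.uuid4())
    result_topic = f"{ns}/tasks/{task_id}/result"
    replies = queue.Queue(maxsize=1)

    def on_result(_topic, payload):
        try:
            replies.put_nowait(json.loads(payload))
        except queue.Full:
            pass  # only the first message counts

    # Subscribe before publishing so a fast reply cannot be missed.
    broker.subscribe(result_topic, on_result)
    try:
        broker.publish(f"{ns}/tasks/{recipient}/inbox",
                       json.dumps({"task_id": task_id}))
        try:
            return replies.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(f"no result for task {task_id}") from None
    finally:
        # Always drop the per-task subscription, on success or timeout.
        broker.unsubscribe(result_topic, on_result)
```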

In both modes, the result envelope has the following shape:

{
  "task_id": "abc-123",
  "status": "completed",
  "result": "agent's response text"
}

status MUST be one of "completed" or "failed". On failure, result SHOULD contain a human-readable reason. Implementations MAY include additional fields for application metadata, but consumers MUST tolerate unknown fields.
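On the recipient side, finishing a task means building one result envelope and publishing it to the sender's broadcast results topic and, for any synchronous waiter, to the per-task topic. The sketch below returns the (topic, payload) pairs rather than publishing them; `result_publishes` and its `extra` parameter are hypothetical names.

```python
import json

TERMINAL_STATUSES = ("completed", "failed")


def result_publishes(ns, sender, task_id, status, result, **extra):
    """Return (topic, payload) pairs for a finished task.

    The same envelope goes to the sender's broadcast results topic and
    to the per-task result topic. Extra keyword arguments become
    additional application metadata fields.
    """
    if status not in TERMINAL_STATUSES:
        raise ValueError(f"status must be one of {TERMINAL_STATUSES}")
    envelope = {**extra, "task_id": task_id, "status": status, "result": result}
    payload = json.dumps(envelope)
    return [
        (f"{ns}/tasks/{sender}/results", payload),
        (f"{ns}/tasks/{task_id}/result", payload),
    ]
```

Placing the three defined fields after `extra` in the dict means application metadata cannot overwrite `task_id`, `status`, or `result`.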

When publishing requests over MQTT 5, senders SHOULD set the Response Topic property to {ns}/tasks/{task_id}/result and the Correlation Data property to a binary representation of task_id. Recipients SHOULD echo the received Correlation Data on their response PUBLISH so MQTT-5-aware clients can correlate without parsing the JSON payload. Implementations whose broker or client library does not surface MQTT 5 properties MAY fall back to JSON-only correlation via the task_id field.
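The MQTT 5 correlation rule and its JSON fallback can be sketched as two small helpers. The text does not fix the binary representation of task_id, so UTF-8 encoding of the identifier string is an assumption here; the dict keys and function names are hypothetical and do not correspond to any particular client library's API.

```python
import json
from typing import Optional


def request_properties(ns: str, task_id: str) -> dict:
    """Values a sender would place in the MQTT 5 PUBLISH properties."""
    return {
        "response_topic": f"{ns}/tasks/{task_id}/result",
        # Assumption: the binary form is the UTF-8 bytes of task_id.
        "correlation_data": task_id.encode("utf-8"),
    }


def resolve_task_id(payload: bytes,
                    correlation_data: Optional[bytes] = None) -> str:
    """Prefer Correlation Data; fall back to the JSON task_id field."""
    if correlation_data:
        return correlation_data.decode("utf-8")
    return json.loads(payload)["task_id"]
```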

The A2A task lifecycle is:

waiting_approval -> pending -> executing -> completed
                                         \
                                          -> failed

waiting_approval is a sender-side state used when the task requires human or system approval before the inbox notification is published. pending indicates the inbox notification has been published. executing is recorded by the recipient when it begins work. completed and failed are terminal. Implementations MAY add states (for example, cancelled) but MUST treat completed and failed as the canonical terminal states for the result envelope's status field.
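The lifecycle can be sketched as a transition table. This follows the diagram literally, so `failed` is reachable only from `executing`; an implementation that adds states such as `cancelled` would extend the table. The names `TRANSITIONS`, `advance`, and `is_terminal` are hypothetical.

```python
TRANSITIONS = {
    "waiting_approval": {"pending"},
    "pending": {"executing"},
    "executing": {"completed", "failed"},
    "completed": set(),  # terminal
    "failed": set(),     # terminal
}


def advance(current: str, new: str) -> str:
    """Return the new state, or raise if the transition is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new


def is_terminal(state: str) -> bool:
    return state in TRANSITIONS and not TRANSITIONS[state]
```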

2.4 Agent cards (discovery)

An agent's card is a retained JSON document advertised on {ns}/agents/{agent_id}/card. It describes who the agent is and where to talk to it. The card MUST use content type application/json; the suggested structured content type is application/vnd.mqtt-agent.card+json. Senders MAY set the MQTT 5 Content Type property to the structured type; bare application/json is fully conformant.

Card schema (v0.1):

{
  "mqtt_agent_version": "0.1",
  "version": "1",
  "name": "agent-a",
  "namespace": "myapp",
  "capabilities": ["task-type-1", "task-type-2"],
  "version_info": { "sdk": "0.1.0", "app": "1.0.0" },
  "endpoints": {
    "inbox":   "myapp/tasks/agent-a/inbox",
    "results": "myapp/tasks/agent-a/results",
    "status":  "myapp/agents/agent-a/status"
  },
  "status": "online",
  "last_seen": "2026-05-07T10:00:00.000Z"
}

Required fields: mqtt_agent_version (see Substrate, common card fields), version, name, namespace, capabilities, endpoints, status, last_seen. Optional fields: version_info. Implementations MAY include additional fields; consumers MUST tolerate unknown fields. version MUST be the literal string "1" for cards conforming to v0.1 of this specification.
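A sketch of building and validating a v0.1 card from the field rules above. `build_card` and `validate_card` are hypothetical helpers; the millisecond UTC timestamp format mirrors the example card and is otherwise an assumption.

```python
import json
from datetime import datetime, timezone

REQUIRED_CARD_FIELDS = (
    "mqtt_agent_version", "version", "name", "namespace",
    "capabilities", "endpoints", "status", "last_seen",
)


def build_card(ns, name, capabilities, status="online", now=None):
    """Assemble a v0.1 card with endpoints derived from ns and name."""
    now = now or datetime.now(timezone.utc)
    stamp = now.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return {
        "mqtt_agent_version": "0.1",
        "version": "1",
        "name": name,
        "namespace": ns,
        "capabilities": list(capabilities),
        "endpoints": {
            "inbox": f"{ns}/tasks/{name}/inbox",
            "results": f"{ns}/tasks/{name}/results",
            "status": f"{ns}/agents/{name}/status",
        },
        "status": status,
        "last_seen": stamp,
    }


def validate_card(raw):
    """Check required fields and the version literal; keep unknown fields."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("card must be a JSON object")
    missing = [f for f in REQUIRED_CARD_FIELDS if f not in doc]
    if missing:
        raise ValueError(f"card is missing required fields: {missing}")
    if doc["version"] != "1":
        raise ValueError("version must be the literal string '1'")
    return doc
```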

Discovery has two modes:

By name (exact, ACL-tolerant). A peer SUBSCRIBES to {ns}/agents/{name}/card with QoS 0 or 1, receives the retained card, then UNSUBSCRIBES. This MUST work on any conforming broker, including those that block wildcard subscriptions.
By wildcard (enumeration). A peer SUBSCRIBES to {ns}/agents/+/card and collects retained cards within a short window (1 to 3 seconds RECOMMENDED). This MUST be supported on brokers that do not enforce wildcard restrictions; on brokers that do (Substrate, wildcards and discovery), implementations SHOULD warn the caller.
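Wildcard enumeration amounts to folding the retained card messages received during the collection window into a registry keyed by agent. In the sketch below, `messages` stands for whatever (topic, payload) pairs arrived in that window; the function names are hypothetical, and skipping empty or malformed payloads is a defensive assumption rather than a stated requirement.

```python
import json


def agent_id_from_card_topic(ns, topic):
    """Extract {agent_id} from {ns}/agents/{agent_id}/card, else None."""
    prefix, suffix = f"{ns}/agents/", "/card"
    if not (topic.startswith(prefix) and topic.endswith(suffix)):
        return None
    agent_id = topic[len(prefix):-len(suffix)]
    return agent_id if agent_id and "/" not in agent_id else None


def collect_cards(ns, messages, online_only=False):
    """Build {agent_id: card} from (topic, payload) pairs."""
    cards = {}
    for topic, payload in messages:
        agent_id = agent_id_from_card_topic(ns, topic)
        if agent_id is None or not payload:
            continue  # foreign topic, or an empty retained payload
        try:
            card = json.loads(payload)
        except ValueError:
            continue  # ignore malformed cards instead of aborting discovery
        if not isinstance(card, dict):
            continue
        if online_only and card.get("status") != "online":
            continue
        cards[agent_id] = card  # a later message for the same agent wins
    return cards
```

By-name discovery is the same fold applied to a single exact topic, so the helper can serve both modes.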

Implementations MUST clear the online card on graceful shutdown by re-publishing the retained card with status="offline". The Will configured at CONNECT covers unclean shutdowns (Substrate, session and presence).

Feedback

Found an issue with the spec? Have a proposal or an RFC comment? The public spec repository is being prepared. In the meantime, send feedback to efi@cloudsignal.io.
