RFC: The CaSH Model for MCP Server Design


Status: Informational Draft
Version: 0.1.1
Category: Proposal
Published: 2026-04-24
Author: Stefan Loesch
Contact: stefan@aigon.ai · LinkedIn
URL: aigon.ai/rfc-mcp-cash-model

Abstract

This document specifies the CaSH Model, a design pattern for Model Context Protocol (MCP) servers that minimizes context window consumption while preserving full functional expressiveness. The pattern exposes exactly three top-level MCP tools — call, help, and skill — backed by a hierarchical namespace structure with progressive disclosure. Servers conforming to this specification SHOULD expose no more than these three tools at the MCP protocol layer; all domain-specific functions are accessed through the call tool's namespace routing. This document provides normative requirements, conformance criteria, and non-normative examples for implementors.

1. Status of This Document

This document is an informational draft proposing a design pattern for MCP server design. It does not define an Internet Standard. Distribution is unlimited.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119] and [RFC 8174] when, and only when, they appear in all capitals.

2. Introduction and Motivation

2.1 Context Economy

The Model Context Protocol (MCP) enables language models to invoke server-provided tools. Each tool registered at the MCP layer consumes context window tokens: its name, parameter schema, and description are present in the model's working context for the duration of the session, whether or not the tool is ever invoked.

This specification is for deployments that treat context as a finite, costly resource. Context window size is orthogonal to context economy — even large windows suffer from quality degradation when filled with irrelevant material, and every token spent on tool definitions is a token unavailable for reasoning. If context efficiency is not a concern for a given deployment, this specification does not apply.

2.2 The Scaling Problem

A server that registers each domain function as a separate MCP tool creates linear growth in context overhead: N functions produce N tool definitions. Aggregating M servers produces M×N tool definitions. This growth is multiplicative and unbounded.

Some MCP hosts (e.g., Claude Code) implement their own progressive disclosure of tool definitions, loading only tool names initially and expanding descriptions on demand. However, this is host-dependent — not all MCP hosts support it — and even where it exists, every registered tool name still occupies context. A server with 200 functions still registers 200 tool names at the MCP layer.

The call primitive (Section 4.1) solves this at the server level, independent of host capabilities. It collapses an arbitrary number of domain functions behind a single MCP tool registration. The model sees one tool definition regardless of whether the server exposes 5 functions or 500. Progressive disclosure via help (Section 4.2) lets the model discover capabilities on demand, paying context cost only for information it actively needs.

A further advantage is that help and skill responses are authored by the server implementor, not generated by a framework. Standard MCP tool descriptions are typically verbose, structured for human documentation, and carry redundant schema information. CaSH discovery text can be written to be deliberately compact and targeted for LLM consumption — optimized for token efficiency rather than human readability.

2.3 Context-Dependent Visibility

Because help responses are ordinary return values — not protocol-level registrations — a CaSH server can trivially vary which namespaces and functions are visible based on the caller's authentication, role, or subscription tier. An authenticated user sees only the capabilities they have access to. The model never encounters functions it cannot invoke, eliminating wasted discovery and failed call attempts.

MCP does provide a notifications/tools/list_changed mechanism that allows servers to signal changes to the tool list at the protocol layer. However, this requires re-registering tools, regenerating full JSON Schema definitions, and relying on the host to re-fetch and re-process the entire tool list — a heavier operation that not all hosts handle reliably. With CaSH, filtering is a server-side concern resolved in application code: help() simply omits what the caller cannot see.

2.4 Goals

This specification defines a pattern that:

Bounds context overhead to exactly three tool definitions, independent of server count or function count.
Preserves full discoverability of server capabilities through progressive disclosure.
Supports aggregation of multiple logical servers under a single MCP interface without context explosion.
Provides text-only, code-free task instructions through a composable skill mechanism.

2.5 Non-Goals

This specification does not define:

Wire-level MCP protocol details.
Authentication or authorization mechanisms.
Specific programming languages or frameworks.
Strategies for streaming or partial responses.
Executable skill code or code execution environments (see Section 4.3).

3. Terminology

CaSH — The pattern described in this document. Abbreviation of Call, Skill, Help.

Namespace — A logical grouping of related functions or sub-namespaces, identified by a short string label (e.g., orders, inventory, crm). Labels are normalized to lowercase with underscores stripped (see Section 5.1).

Root Namespace — The default namespace, targeted when namespace is omitted or empty. Typical for small servers that expose functions directly without internal namespacing; the aggregator assigns a namespace label externally.

Hierarchical Namespace — A namespace that contains sub-namespaces, forming a tree structure with dot-separated levels (e.g., orders.retail, orders.wholesale). A hierarchical namespace MAY contain both sub-namespaces and functions at the same level.

Function — A named, invocable capability within a namespace (e.g., orders.create, inventory.list).

Tool — An MCP-layer construct registered with the MCP host. CaSH servers register three core tools: call, help, and skill.

Progressive Disclosure — The pattern of revealing information in layers, providing summary information at outer layers and detail at inner layers.

Context Overhead — The number of tokens consumed by tool definitions in the model's context window before any user input is processed.

Aggregator — A CaSH server that routes calls across multiple underlying servers, exposing them all under a single set of three tools.

Static Aggregation — An aggregation mode in which the namespace registry is configured at deployment time and does not change at runtime.

Dynamic Aggregation — An aggregation mode in which the namespace registry is built at runtime by fetching Server Manifests from upstream servers.

Server Manifest — A YAML document served at /.well-known/cash-mcp.yaml that describes a CaSH server's identity, endpoint, authentication requirements, and suggested namespaces.

Suggested Namespace — A namespace label recommended by a server in its manifest. Aggregators are not required to use it.

4. The CaSH Primitives

A CaSH server MUST register the three core MCP tools: call, help, and skill. Servers SHOULD NOT register additional tools at the MCP protocol layer. Doing so is permitted but discouraged: each additional tool increases context overhead and undermines the aggregation benefits of the CaSH pattern, and its functionality is generally better exposed through a namespace within call.

The three primitives correspond to three distinct needs when interacting with an API:

call: execute. Invoke a function. This is the API itself.
help: discover. Learn what functions exist, what they accept, and what they return. This is how you navigate the API.
skill: compose. Learn how to combine multiple call and help invocations to accomplish a task. Users do not think in API calls; they think in goals. Skills bridge this gap by describing multi-step workflows. A workflow may reference help for discovery, span functions within a single namespace or across multiple namespaces, or even involve connecting to additional servers.
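As a non-normative illustration, the three registrations might look like the sketch below. It assumes the usual MCP tool definition shape (name, description, inputSchema) and takes the parameter names from Sections 4.1 to 4.3; the description strings and the helper names (`_tool`, `definition_size`) are hypothetical. Character count is used only as a rough stand-in for token-based context overhead.

```python
import json

# Non-normative sketch: tool definitions a CaSH server might register.
# The description strings are illustrative assumptions, not prescribed text.
def _tool(name, description, properties, required):
    return {
        "name": name,
        "description": description,
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }

STRING, OBJECT, INTEGER = {"type": "string"}, {"type": "object"}, {"type": "integer"}

CASH_TOOLS = [
    _tool("call", "Invoke a function in a namespace.",
          {"namespace": STRING, "function": STRING, "kwargs": OBJECT, "sizelimit": INTEGER},
          ["function"]),
    _tool("help", "Discover namespaces, functions, and parameters.",
          {"namespace": STRING, "function": STRING, "kwargs": OBJECT},
          []),
    _tool("skill", "Fetch text-only instructions for a multi-step task.",
          {"namespace": STRING, "skillname": STRING, "kwargs": OBJECT},
          ["skillname"]),
]

def definition_size(tools):
    """Characters in the serialized definitions: a rough proxy for context overhead."""
    return len(json.dumps(tools, separators=(",", ":")))
```

The size returned by `definition_size(CASH_TOOLS)` stays constant however many domain functions sit behind call, which is the property the pattern relies on.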

4.1 `call`

The call tool is the sole mechanism for invoking domain functions. It collapses an arbitrary number of domain functions behind a single MCP tool registration: the model sees one tool definition regardless of how many functions the server (or aggregator) exposes.

Purpose: Execute a named function within a namespace.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `namespace` | string | No | Target namespace. For hierarchical namespaces, use dot notation (e.g., `orders.retail`). If omitted or empty, targets the root namespace. |
| `function` | string | Yes | Function name within the namespace (e.g., `create`). |
| `kwargs` | object | No | Key-value arguments passed to the function. Defaults to empty object if omitted. |
| `sizelimit` | integer | No | Override the default gate threshold (in characters) for this request. See Section 7. |

Behavior:

The server MUST route the call to the function identified by the combination of namespace and function.
If namespace does not identify a known namespace, the server MUST return an error response (suggested code: UNKNOWN_NAMESPACE).
If function does not identify a known function within the resolved namespace, the server MUST return an error response (suggested code: UNKNOWN_FUNCTION).
How the server handles extra, missing, or invalid keys in kwargs is implementation-defined.
The server MUST apply output size gating to the return value (see Section 7). Ungated responses can consume arbitrary context and incur significant cost.
The server MUST NOT execute arbitrary code supplied in kwargs values.

Response. The response format of call is entirely implementation-defined. This specification does not constrain the structure, encoding, or content of call responses — that is the domain of the individual function contract. CaSH is a routing and discovery layer; it does not interfere with the data plane.
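The routing rules above can be sketched as follows. This is a minimal, non-normative illustration: the registry layout, the `CashError` class, and the sample functions are assumptions made for the example, and output size gating (Section 7) is left out for brevity.

```python
class CashError(Exception):
    """Error carrying one of the suggested codes from Section 4.1."""
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code

def canon(identifier):
    """Canonical form per Section 5.1: strip underscores, lowercase."""
    return identifier.replace("_", "").lower()

def canon_path(namespace):
    """Normalize each level of a dotted namespace path; '' is the root."""
    return ".".join(canon(level) for level in namespace.split(".")) if namespace else ""

# Hypothetical registry: canonical namespace path -> {canonical function name -> callable}.
REGISTRY = {
    "": {"ping": lambda: "pong"},
    "orders": {"cancel": lambda order_id: {"order_id": order_id, "status": "cancelled"}},
}

def call(function, namespace="", kwargs=None):
    functions = REGISTRY.get(canon_path(namespace or ""))
    if functions is None:
        raise CashError("UNKNOWN_NAMESPACE", f"No namespace `{namespace}`.")
    target = functions.get(canon(function))
    if target is None:
        raise CashError("UNKNOWN_FUNCTION", f"No function `{function}` in namespace `{namespace}`.")
    # kwargs values are passed through as plain data; nothing in them is evaluated as code.
    return target(**(kwargs or {}))
```

A lookup table of ordinary callables keeps the "no arbitrary code" rule easy to audit: the only code that runs is code the server author registered.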

4.2 `help`

The help tool implements progressive disclosure of server capabilities.

Purpose: Return documentation about namespaces, functions, or parameters, at a level of detail determined by the arguments supplied.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `namespace` | string | No | Target namespace. For hierarchical namespaces, use dot notation (e.g., `orders`, `orders.retail`). |
| `function` | string | No | Target function within `namespace`. |
| `kwargs` | object | No | Additional keyword arguments (e.g., `format`, `examples`, `params`). See below. |

Behavior by invocation form:

| Invocation | Response |
|------------|----------|
| `help()` | List of top-level namespaces with a one-line description each. |
| `help(namespace)` | List of child namespaces and/or functions within the namespace, with one-line descriptions each. |
| `help(namespace, function)` | Documentation for the specified function. MAY contain a subset of parameters (e.g., the most common ones) and advertise additional keyword arguments to retrieve more detail (e.g., `params="full"`). Any such arguments MUST be documented in the base response. |

The server SHOULD NOT return more information than requested at each level. A response that lists namespaces SHOULD contain only the namespace name and a single descriptive line per entry. A response that lists functions SHOULD contain only the function name and a single descriptive line per entry.

Root namespace. The namespace parameter MAY be omitted or empty, in which case the request targets the root namespace. The root namespace is the default namespace for servers that do not require internal namespacing, typically small servers whose functions are only namespaced externally when aggregated with other servers. The root namespace SHOULD NOT contain sub-namespaces, though this is not prohibited.

Hierarchical namespaces. Namespaces MAY be hierarchical. A namespace orders may contain sub-namespaces orders.retail and orders.wholesale, and MAY also contain functions directly (e.g., orders.summary). Discovery follows the hierarchy: help() lists top-level namespaces; help(namespace="orders") lists both child namespaces and direct functions; help(namespace="orders.retail") lists that sub-namespace's contents. The depth of the hierarchy is not limited by this specification, but implementations SHOULD keep hierarchies shallow (2–3 levels) for usability.

Additional keyword arguments. The help tool MAY accept additional keyword arguments beyond namespace and function (e.g., examples=true, detailed=true, model="haiku"). These allow the help text author to customize responses for different contexts. Any additional arguments supported by a server MUST be documented in the base help() response so they are discoverable. Servers MUST NOT withhold help text or return an error when they receive unrecognised keyword arguments; they SHOULD silently ignore arguments they do not understand, and MAY note in the response that certain arguments were ignored.

Response. Help responses SHOULD be markdown. Markdown is well-understood by LLMs, compact, and renders readably in most contexts. Servers MAY support alternative formats via a format keyword argument (e.g., format="json", format="markdown"), but markdown is the RECOMMENDED default. The response SHOULD clearly communicate the information appropriate to the invocation level, and SHOULD distinguish between sub-namespaces (which can be drilled into) and functions (which can be called) when both are present. Help text MAY be specifically crafted for LLM consumption and MAY vary by caller or context.

help responses SHOULD NOT include references to skills. Skills and namespaces/functions are separate hierarchies: skills sit on top of calls as a composition layer, and mixing them into help responses conflates the two. Skill discovery is the domain of the skill tool.

A namespace MAY contain both sub-namespaces and functions at the same level — this is explicitly allowed and often a useful pattern (e.g., a namespace with convenience functions alongside more specialized sub-namespaces).

The following examples illustrate possible response structures. These are non-normative.

Example: help() response (markdown)

# Available Namespaces

- **orders** — Create and manage customer orders.
- **inventory** — Query and update stock levels.
- **crm** — Manage contacts and accounts.

Supported arguments: `format` (json|markdown), `examples` (true|false).

Example: help(namespace="orders") response (markdown)

# orders

## Sub-namespaces

- **retail** — Retail channel orders.
- **wholesale** — Wholesale and bulk orders.

## Functions

- **summary** — Aggregate order statistics.
- **cancel** — Cancel an existing order.

Example: help(namespace="orders", function="summary") response (markdown)

# orders.summary

Aggregate order statistics for a given time period.

## Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| since | string | no | ISO 8601 date filter. |
| groupby | string | no | Group by: day, week, month. Default: day. |

## Returns

Object with `total_orders`, `total_revenue`, and `breakdown` array.

Example: help() response (JSON, via format="json")

{
  "namespaces": [
    { "name": "orders", "description": "Create and manage customer orders." },
    { "name": "inventory", "description": "Query and update stock levels." }
  ]
}
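A server might produce responses of this kind from a capability tree, emitting one disclosure layer per request. The sketch below is non-normative; the `TREE` structure and the `help_text` function are hypothetical, and the sample entries echo the examples above.

```python
# Hypothetical capability tree. A namespace node has a "doc" line, optional
# "namespaces", and optional "functions" (name -> one-line description).
TREE = {
    "namespaces": {
        "orders": {
            "doc": "Create and manage customer orders.",
            "namespaces": {
                "retail": {"doc": "Retail channel orders.",
                           "functions": {"create": "Create a retail order."}},
                "wholesale": {"doc": "Wholesale and bulk orders."},
            },
            "functions": {"summary": "Aggregate order statistics.",
                          "cancel": "Cancel an existing order."},
        },
        "inventory": {"doc": "Query and update stock levels."},
    }
}

def help_text(namespace="", function=None, **ignored_kwargs):
    """Return markdown for exactly one layer; unrecognised kwargs are ignored."""
    node = TREE
    for level in filter(None, namespace.split(".")):
        node = node.get("namespaces", {})[level]  # KeyError means unknown namespace
    if function is not None:
        title = f"{namespace}.{function}" if namespace else function
        return f"# {title}\n\n{node.get('functions', {})[function]}"
    if not namespace:
        lines = ["# Available Namespaces", ""]
        lines += [f"- **{n}**: {c['doc']}" for n, c in node["namespaces"].items()]
        return "\n".join(lines)
    lines = [f"# {namespace}"]
    if node.get("namespaces"):
        lines += ["", "## Sub-namespaces", ""]
        lines += [f"- **{n}**: {c['doc']}" for n, c in node["namespaces"].items()]
    if node.get("functions"):
        lines += ["", "## Functions", ""]
        lines += [f"- **{n}**: {d}" for n, d in node["functions"].items()]
    return "\n".join(lines)
```

Because each branch reads only the node it was asked about, a top-level request cannot leak function lists and a namespace request cannot leak the contents of its children.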

4.3 `skill`

The skill tool returns text-only task instructions for accomplishing multi-step goals.

Purpose: Deliver natural-language instructions that describe how to accomplish a named task, potentially involving multiple function calls, conditional logic, or coordination across namespaces.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `namespace` | string | No | Target namespace. If omitted or empty, targets the root namespace. |
| `skillname` | string | Yes | Identifier for the skill (e.g., `createorderwithcheck`). |
| `kwargs` | object | No | Named inputs that parameterize the skill instructions. |

Behavior:

The server MUST return a natural-language description of how to accomplish the task, suitable for direct consumption by a language model.
Skill responses MUST be text-only. Servers MUST NOT include executable code, code blocks intended for execution, or instructions that assume a code execution environment. The skill tool is a knowledge-delivery mechanism, not a code-delivery mechanism.
Skill responses MAY reference other skills by name, using the convention skill(skillname="<name>").
Skill responses MAY reference help and call — skills sit on top of both primitives. A skill instruction may direct the model to consult help for further detail or examples before making call invocations.
If skillname does not identify a known skill within the resolved namespace, the server MUST return an error response (suggested code: UNKNOWN_SKILL).
Skill text SHOULD be self-contained: it SHOULD NOT assume context from prior tool calls or session state that may not exist.

Relationship to Agent Skills. The skill tool serves a similar role to the Agent Skills open format, which packages task instructions as SKILL.md files with metadata frontmatter and optional bundled scripts. CaSH skills differ in one critical respect: they are delivered via MCP and are strictly text-only. The Agent Skills format permits bundled executable code (scripts/, references/, assets/); CaSH skills MUST NOT. A CaSH server MAY source its skill text from Agent Skills SKILL.md files, but MUST strip or ignore any executable content. This constraint is intentional: a CaSH skill server is a read-only knowledge source, not a code execution surface.

Response. Skill responses SHOULD conform to the Agent Skills content model — structured natural-language instructions suitable for LLM consumption — with the constraint that executable code MUST NOT be included. The response format is otherwise implementation-defined.
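One way a server could source skill text from markdown files while honouring the text-only rule is sketched below. It is non-normative and rests on an assumption: that dropping fenced blocks is an acceptable first-pass filter. That filter is crude (it also removes harmless fenced examples and does not inspect prose), so a real deployment would likely review skill text by hand. The `SKILLS` store and function names are hypothetical.

```python
import re

# Matches a fenced block: an opening line of three or more backticks or tildes,
# everything up to a closing line made of the same fence, and the closing line.
FENCE = re.compile(r"^(`{3,}|~{3,})[^\n]*\n.*?^\1[ \t]*$\n?", re.DOTALL | re.MULTILINE)

def strip_fenced_blocks(markdown):
    """Drop fenced blocks so that only prose instructions remain."""
    return FENCE.sub("", markdown)

# Hypothetical skill store keyed by (canonical namespace, canonical skill name).
SKILLS = {}

def register_skill(namespace, skillname, source_markdown):
    """Store the skill text with fenced blocks removed at registration time."""
    SKILLS[(namespace, skillname)] = strip_fenced_blocks(source_markdown).strip()

def skill(skillname, namespace=""):
    """Return stored skill text, or signal the suggested UNKNOWN_SKILL condition."""
    try:
        return SKILLS[(namespace, skillname)]
    except KeyError:
        raise LookupError(
            f"UNKNOWN_SKILL: no skill `{skillname}` in namespace `{namespace}`"
        ) from None
```

Filtering at registration time rather than at request time means the stored text is already safe to return, and the skill tool itself stays a plain lookup.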

4.4 Error Handling

Error handling is delegated to the MCP transport layer. When an error condition occurs (unknown namespace, unknown function, unknown skill, gated response, etc.), the server MUST communicate the error through whatever mechanism MCP provides. This specification does not prescribe an error response schema, error codes, or error format — those are concerns of the MCP protocol and the server implementation.

5. Namespace Convention

5.1 Namespace Naming

Namespace labels MUST consist solely of alphanumeric characters (a-z, A-Z, 0-9) and optional underscores (_). Hyphens, dots, slashes, whitespace, and other punctuation are prohibited.

Normalization. Servers MUST normalize namespace labels to a canonical form before matching by: (1) stripping all underscores, and (2) converting all characters to lowercase. The canonical character set is therefore a-z, 0-9 only. The labels Order_Mgmt, order_mgmt, orderMgmt, and ordermgmt all normalize to the canonical form ordermgmt and MUST NOT coexist as distinct entries.

Uppercase and underscores MAY be used for readability, but they carry no semantic meaning. This constraint eliminates ambiguity between visually similar names and reduces the likelihood of LLM hallucination in function routing.

The dot character (.) is reserved as the hierarchical namespace separator (Section 5.2) and MUST NOT appear within a single namespace label.

Examples: orders, inventory, crm, Order_Mgmt (canonical: ordermgmt)
Invalid: order-mgmt, my.namespace, order mgmt
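The validation and normalization rules of this section can be sketched in a few lines. The sketch is non-normative; the function names are hypothetical, and rejecting a label made only of underscores (which would normalize to an empty string) is an assumption this specification does not state.

```python
import re

# ASCII letters, digits, and underscores only; \w is avoided because it also
# accepts non-ASCII word characters.
LABEL = re.compile(r"[A-Za-z0-9_]+")

def normalize_label(label):
    """Validate a namespace label and return its canonical form."""
    if not LABEL.fullmatch(label):
        raise ValueError(f"invalid namespace label: {label!r}")
    canonical = label.replace("_", "").lower()
    if not canonical:
        # Assumption: a label of underscores alone is treated as invalid.
        raise ValueError(f"label has no alphanumeric characters: {label!r}")
    return canonical

def add_label(registry, label):
    """Record a label, refusing any that collides with an existing canonical form."""
    canonical = normalize_label(label)
    if canonical in registry:
        raise ValueError(f"{label!r} collides with {registry[canonical]!r}")
    registry[canonical] = label
    return canonical
```

Checking collisions on the canonical form is what prevents Order_Mgmt and ordermgmt from coexisting as distinct entries.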

5.2 Hierarchical Namespaces

Namespaces MAY be hierarchical, with levels separated by dots: orders.retail, orders.wholesale. Each level follows the same naming rules as top-level namespaces.

The call and help tools use the full dotted namespace path in the namespace parameter. Functions are always identified separately in the function parameter.

Examples: call(namespace="orders.retail", function="create", kwargs={...}), help(namespace="orders") → lists sub-namespaces retail, wholesale.

5.3 Identifier Naming and Normalization

The normalization rules defined for namespace labels (Section 5.1) apply uniformly to all identifiers in the CaSH layer:

Namespace labels (Section 5.1)
Function names
Skill names
Tool parameter names (e.g., namespace, function, skillname, kwargs, sizelimit)
Keys within kwargs

All identifiers MUST consist solely of alphanumeric characters (a-z, A-Z, 0-9) and optional underscores. Servers MUST normalize by stripping underscores and lowercasing before matching. The canonical form is a-z, 0-9 only.

contact_update, contactUpdate, and contactupdate all identify the same function. customer_id, customerID, and customerid all identify the same kwargs key. Servers SHOULD normalize kwargs keys before matching.

Function names SHOULD be expressive and precise, using verb-object form: getmetadata rather than just get.

Examples: create, list, contactupdate, getmetadata
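For kwargs keys, normalization can be applied while binding caller-supplied keys to the parameter names a function declares. The following non-normative sketch shows one way to do this. Rejecting keys that collide after normalization, and silently dropping keys the function does not declare, are assumptions made for the example; Section 4.1 leaves the handling of extra or invalid keys implementation-defined.

```python
def canon(identifier):
    """Canonical form: strip underscores, lowercase."""
    return identifier.replace("_", "").lower()

def normalize_kwargs(kwargs):
    """Re-key kwargs by canonical name; reject keys that collapse to the same name."""
    normalized = {}
    for key, value in kwargs.items():
        canonical = canon(key)
        if canonical in normalized:
            raise ValueError(f"duplicate kwargs key after normalization: {key!r}")
        normalized[canonical] = value
    return normalized

def bind(declared_params, kwargs):
    """Map caller-supplied keys onto the names the function actually declares."""
    by_canonical = {canon(name): name for name in declared_params}
    return {
        by_canonical[key]: value
        for key, value in normalize_kwargs(kwargs).items()
        if key in by_canonical  # assumption: undeclared keys are dropped
    }
```

With this in place, a caller that sends customerID or customer_id reaches the same declared parameter.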

5.4 Aggregation

An Aggregator server routes calls from a single set of CaSH tools to multiple underlying servers. From the MCP host's perspective, the aggregator is indistinguishable from a single server.

Aggregators MUST:

Expose the three CaSH tools (call, help, skill).
Route call(namespace, function, kwargs) to the appropriate underlying server based on namespace.
Merge namespace listings in help() responses across all underlying servers.
Present skills from all underlying servers in skill() responses, routed by namespace, disambiguated if skill names conflict.

Aggregators MAY expose additional tools beyond the three CaSH primitives (e.g., mount, unmount, management or diagnostic functions). Additional tools are the aggregator's own interface and are not part of the CaSH specification.

Namespace assignment is the aggregator's responsibility. A server advertises a suggested namespace (see Section 5.5); the aggregator MAY use that suggestion, modify it, or discard it entirely. Two servers that both suggest orders may be registered as shoporders and warehouseorders. Namespace labels MUST be unique within an aggregator's registry.
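A minimal, non-normative sketch of namespace assignment follows. It assumes each upstream is described by an (endpoint, suggested label) pair and that the operator may supply per-endpoint overrides; the function name and the example endpoints are hypothetical.

```python
def build_registry(upstreams, overrides=None):
    """Assign one unique canonical label per upstream endpoint.

    upstreams: list of (endpoint, suggested_label) pairs.
    overrides: optional mapping endpoint -> operator-chosen label.
    """
    overrides = overrides or {}
    registry = {}
    for endpoint, suggested in upstreams:
        # The suggestion is advisory: an operator override always wins.
        label = overrides.get(endpoint, suggested).replace("_", "").lower()
        if label in registry:
            raise ValueError(
                f"namespace label {label!r} already assigned to {registry[label]}"
            )
        registry[label] = endpoint
    return registry
```

For two upstreams that both suggest orders, calling `build_registry` without overrides raises, while supplying the overrides shoporders and warehouseorders yields a registry with two distinct labels.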

5.4.1 Static Aggregation

In static aggregation, the aggregator's namespace registry is configured at deployment time as an explicit list of (chosen_label, endpoint) pairs. The registry does not change at runtime.

Static aggregation is RECOMMENDED for production deployments where the set of upstream servers is known and stable. It requires no discovery protocol and carries no runtime trust decisions.

5.4.2 Dynamic Aggregation

In dynamic aggregation, the aggregator discovers upstream servers at runtime by fetching their Server Manifests (Section 5.5). The aggregator is configured with a list of manifest URLs; it fetches each manifest, reads the suggested_namespace and other metadata, resolves any label conflicts, and builds its namespace registry dynamically.

Aggregators performing dynamic aggregation MUST:

Fetch manifests over HTTPS.
Validate that the manifest's endpoint field matches the host from which the manifest was fetched, or is explicitly trusted by operator configuration.
Refresh manifests at intervals no shorter than the manifest's cache_ttl field, if present.
Treat a manifest fetch failure as a transient error; the namespace SHOULD remain visible in help() with an available: false indicator until the server recovers or the aggregator is reconfigured.
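The checks listed above might be implemented along the following lines. This is a non-normative sketch that performs no network access: it assumes the manifest has already been fetched and parsed into a dictionary, that the endpoint is a URL (a stdio descriptor would need separate handling), and that timestamps are epoch seconds. All function names are hypothetical.

```python
from urllib.parse import urlsplit

def check_manifest_origin(manifest_url, manifest, trusted_hosts=frozenset()):
    """Raise ValueError unless the fetch was HTTPS and the endpoint host is acceptable."""
    fetched = urlsplit(manifest_url)
    if fetched.scheme != "https":
        raise ValueError("manifests must be fetched over HTTPS")
    endpoint_host = urlsplit(manifest["endpoint"]).hostname
    if endpoint_host != fetched.hostname and endpoint_host not in trusted_hosts:
        raise ValueError(f"endpoint host {endpoint_host!r} does not match manifest origin")

def next_refresh_at(fetched_at, manifest, default_ttl=3600):
    """Earliest time at which the manifest may be fetched again."""
    return fetched_at + int(manifest.get("cache_ttl", default_ttl))

def mark_fetch_failure(registry_entry):
    """Keep the namespace listed but flag it as unavailable (transient error)."""
    return {**registry_entry, "available": False}
```

Keeping the failed entry in the registry, rather than deleting it, lets help() continue to show the namespace with an availability flag until the upstream recovers.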

5.5 Server Manifest

A CaSH server SHOULD publish a machine-readable manifest at the well-known URL path:

/.well-known/cash-mcp.yaml

The manifest is a YAML document. Servers MUST serve it with Content-Type application/yaml. Servers MAY additionally serve it as application/json via content negotiation.

Manifest Schema:

cash_version: "<string>"        # REQUIRED. CaSH spec version this server implements (e.g. "1.0").
last_updated: "<ISO 8601 date>" # REQUIRED. Date of last manifest update.
publisher: "<string>"           # REQUIRED. Name of the publishing organisation or individual.
copyright: "<string>"           # OPTIONAL. Copyright statement (e.g. "2026 Aigon AI").
endpoint: "<URL>"               # REQUIRED. MCP endpoint URL (SSE or stdio descriptor).
cache_ttl: <integer>            # OPTIONAL. Seconds aggregators should cache this manifest. Default: 3600.
authentication:                 # OPTIONAL. Omit if the server requires no authentication.
  scheme: "<string>"            # REQUIRED if block present. One of: none, bearer, apikey, oauth2.
  docs_url: "<URL>"             # OPTIONAL. URL to authentication documentation.
namespaces:                     # REQUIRED. At least one entry.
  - suggested: "<string>"       # REQUIRED. Suggested namespace label (short; alphanumerics and underscores only, see Section 5.1).
    description: "<string>"     # REQUIRED. One-line description.
skills:                         # OPTIONAL.
  - name: "<string>"            # REQUIRED if block present.
    description: "<string>"     # REQUIRED if block present.

Field notes:

suggested under namespaces is advisory only. Aggregators are not required to honour it.
Reverse-domain namespace suggestions (e.g., com.example.orders) are NOT RECOMMENDED. Dots are the hierarchical namespace separator, so com.example.orders would be interpreted as a three-level hierarchy requiring intermediate namespaces com and com.example to exist. Furthermore, after normalization (lowercasing, stripping underscores), reverse-domain labels can produce ambiguous canonical forms. Short, flat labels are preferred.
The skills list is provided so aggregators can advertise skills before connecting.

Example manifest:

cash_version: "1.0"
last_updated: "2026-04-24"
publisher: "Aigon AI"
copyright: "2026 Aigon AI"
endpoint: "https://mcp.aigon.ai/sse"
cache_ttl: 3600
authentication:
  scheme: bearer
  docs_url: "https://aigon.ai/docs/mcp-auth"
namespaces:
  - suggested: orders
    description: "Create and manage customer orders."
  - suggested: inventory
    description: "Query and update stock levels."
skills:
  - name: create_order_with_inventory_check
    description: "Create an order after verifying stock availability."
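An aggregator could check a parsed manifest against the schema before using it. The sketch below is non-normative and works on a dictionary, on the assumption that the YAML (or JSON) document has already been parsed by whatever library the implementation uses; the function and constant names are hypothetical.

```python
REQUIRED_TOP_LEVEL = ("cash_version", "last_updated", "publisher", "endpoint", "namespaces")
AUTH_SCHEMES = {"none", "bearer", "apikey", "oauth2"}

def manifest_problems(manifest):
    """Return a list of human-readable schema violations (empty if none found)."""
    problems = [f"missing required field: {k}" for k in REQUIRED_TOP_LEVEL if k not in manifest]

    namespaces = manifest.get("namespaces") or []
    if "namespaces" in manifest and not namespaces:
        problems.append("namespaces must contain at least one entry")
    for i, entry in enumerate(namespaces):
        for key in ("suggested", "description"):
            if key not in entry:
                problems.append(f"namespaces[{i}] is missing {key}")

    auth = manifest.get("authentication")
    if auth is not None and auth.get("scheme") not in AUTH_SCHEMES:
        problems.append("authentication.scheme must be one of: " + ", ".join(sorted(AUTH_SCHEMES)))

    for i, entry in enumerate(manifest.get("skills") or []):
        for key in ("name", "description"):
            if key not in entry:
                problems.append(f"skills[{i}] is missing {key}")
    return problems
```

Returning a list of problems instead of raising on the first one gives operators a complete report for a misconfigured upstream in a single pass.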

5.6 Server Identity Resource

In addition to the HTTP well-known endpoint, CaSH servers SHOULD expose their identity in-protocol via an MCP resource named:

cash://server

The resource content MUST be the server's manifest serialized as JSON. This allows an aggregator that already holds an MCP connection to re-validate server identity, check version compatibility, or retrieve authentication metadata without a separate HTTP request.

MCP hosts that surface resource listings MUST NOT treat cash://server as a user-facing data resource. It is an operator/aggregator interface.

6. Progressive Disclosure

Progressive disclosure is the foundational principle of the CaSH help interface. Its purpose is context economy: a model can orient itself at the top level with minimal token expenditure, then drill into detail only for capabilities it intends to use.

6.1 Layered Information Architecture

Information is structured in layers that follow the namespace hierarchy:

Layer 0 — Index. help() returns a list of top-level namespaces. Each entry SHOULD contain: name, one-line description. Each entry SHOULD NOT contain function lists, sub-namespace contents, or parameter information.

Layer N — Namespace. help(namespace) returns the contents of the specified namespace: child namespaces and/or functions, each with a one-line description. Each entry MUST NOT contain full signatures or parameter detail.

Leaf+1 — Function. help(namespace, function) returns documentation for one function. This MAY contain a subset of parameters (e.g., the most commonly used) rather than the full list, with additional keyword arguments available to retrieve more detail. Progressive disclosure applies within function documentation as well as across the namespace hierarchy.

For flat (non-hierarchical) namespaces, this reduces to exactly three layers: index, namespace, function.

6.2 Discipline Requirements

Servers MUST NOT conflate layers. A server that returns parameter detail in a namespace-level response, or function lists in a top-level response, violates this specification. The purpose of strict layering is to guarantee that context cost is proportional to the depth of discovery the model actually needs.

7. Output Size Gating

7.1 Rationale

call responses may return arbitrarily large data. Without gating, a single function call can flood the model's context window, causing the same class of problem that CaSH is designed to solve.

7.2 Normative Requirements

Servers MUST enforce a configurable response size limit (the gate threshold), measured in characters. Gating need not be exact: the server SHOULD gate at a size within a few percent of the threshold, but this specification does not require precise measurement. The goal is to prevent massively oversized responses, not to enforce byte-level precision.

When a call response exceeds the gate threshold, the server MUST NOT return the full response. Instead, the server MUST return a gated response that communicates:

That the response was gated (i.e., suppressed due to size).
A count or summary of the result set.
A suggested size limit that the caller can pass to retrieve the data. The server SHOULD provide a size limit value that, if supplied by the caller, would allow the response to go through. This accounts for variability — the actual response may be slightly larger or smaller than the gated one, so the suggested limit SHOULD include a reasonable margin.
A directive to the model to narrow the query (e.g., add filters, reduce page size) or to explicitly request the response with the suggested size limit.

Example gated response (markdown):

**Gated:** Response exceeds size limit (4821 orders, ~48,000 characters).

To retrieve, either:
- Add filters (e.g., status, date_range, customer_id) to narrow results
- Pass `sizelimit=50000` to allow the full response through

When a caller explicitly provides a size limit that accommodates the response, the server MUST return the full response.
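The gating behaviour can be sketched as a wrapper applied to every call result. The sketch is non-normative: the 10,000-character default comes from Section 7.3, while the 5% margin, the rounding of the suggested limit up to the next thousand, the generic "items" wording, and the function name are assumptions made for the example.

```python
import json
import math

DEFAULT_GATE = 10_000  # characters; default threshold from Section 7.3

def gate(result, item_count, sizelimit=None, margin=0.05):
    """Return the serialized result, or a gated notice if it exceeds the limit."""
    text = result if isinstance(result, str) else json.dumps(result)
    limit = DEFAULT_GATE if sizelimit is None else sizelimit
    if len(text) <= limit:
        return text
    # Suggest a limit with headroom, since a repeated query may return
    # slightly more data than this one did.
    suggested = math.ceil(len(text) * (1 + margin) / 1000) * 1000
    return (
        f"**Gated:** Response exceeds size limit "
        f"({item_count} items, ~{len(text):,} characters).\n\n"
        "To retrieve, either:\n"
        "- Add filters to narrow results\n"
        f"- Pass `sizelimit={suggested}` to allow the full response through"
    )
```

A caller that repeats the request with the suggested sizelimit receives the full text, provided the result has not grown beyond the margin in the meantime.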

7.3 Gate Threshold

The default gate threshold is 10,000 characters. The call tool SHOULD accept an optional sizelimit parameter (Section 4.1) that overrides the default gate threshold for a single request. Callers MAY always request a lower size limit than the server's default.

If a server uses a gate threshold higher than 10,000 characters, it MUST advertise the actual threshold in its top-level help() response so that callers can know the size limit in advance without trial and error. Servers using the default threshold or a lower one need not advertise it explicitly.

7.4 Streaming Responses

Output size gating does not apply to streaming responses. Streaming responses are assumed to be useful at every point in the stream — the consumer can disconnect when it has enough data. It is the caller's responsibility to manage context consumption for streaming responses.

8. Examples

8.1 Discovering Capabilities

A model wishing to understand available capabilities calls:

call: help()

Response:

# Available Namespaces

- **orders** — Create and manage customer orders.
- **inventory** — Query and update stock levels.
- **crm** — Manage contacts and accounts.

Supported arguments: `format` (json|markdown), `examples` (true|false).

8.2 Drilling Into a Namespace

call: help(namespace="orders")

Response:

# orders

## Functions

- **create** — Create a new order.
- **get** — Retrieve an order by ID.
- **list** — List orders with optional filters.
- **cancel** — Cancel an existing order.

8.3 Function Detail

call: help(namespace="orders", function="create")

Response:

# orders.create

Create a new order for a customer.

## Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| customer_id | string | yes | Unique customer identifier. |
| items | array | yes | Line items; each must have product_id and quantity. |
| notes | string | no | Free-text notes attached to the order. |

## Returns

Object with assigned order ID, status, line items with pricing, and total.

Additional parameters available with `params="full"`.
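
The three lookups in Sections 8.1 to 8.3 can be served by a single function that returns only the level requested. The following is a non-normative sketch: the `CATALOG` structure and the name `help_tool` are hypothetical, the output formatting is simplified relative to the responses shown above, and error handling for unknown names is omitted.

```python
from typing import Optional

# Hypothetical in-memory catalog; a real server would derive this from its registry.
CATALOG = {
    "orders": {
        "summary": "Create and manage customer orders.",
        "functions": {
            "create": {
                "summary": "Create a new order.",
                "params": [
                    ("customer_id", "string", True),
                    ("items", "array", True),
                    ("notes", "string", False),
                ],
            },
            # Parameters for this entry are left out of the sketch.
            "get": {"summary": "Retrieve an order by ID."},
        },
    },
    "inventory": {"summary": "Query and update stock levels.", "functions": {}},
}


def help_tool(namespace: Optional[str] = None, function: Optional[str] = None) -> str:
    """Return only the level asked for: index, namespace contents, or function detail."""
    if namespace is None:
        # Index: one line per namespace, nothing about functions.
        lines = ["# Available Namespaces", ""]
        lines += [f"- **{name}**: {ns['summary']}" for name, ns in CATALOG.items()]
        return "\n".join(lines)
    functions = CATALOG[namespace]["functions"]
    if function is None:
        # Namespace contents: function names and summaries, no parameters.
        lines = [f"# {namespace}", "", "## Functions", ""]
        lines += [f"- **{name}**: {fn['summary']}" for name, fn in functions.items()]
        return "\n".join(lines)
    # Function detail: parameters for a single function.
    fn = functions[function]
    lines = [f"# {namespace}.{function}", "", fn["summary"], "", "## Parameters", ""]
    lines += [
        f"- {name} ({type_}, {'required' if required else 'optional'})"
        for name, type_, required in fn.get("params", [])
    ]
    return "\n".join(lines)
```

The point of the design is that parameter detail never appears until a specific function is requested, so the cost of discovery grows with what the caller actually explores.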

8.4 Invoking a Function

call: call(namespace="orders", function="create", kwargs={"customer_id": "C-42", "items": [{"product_id": "P-7", "quantity": 3}]})

Response:

{
  "result": {
    "order_id": "ORD-1091",
    "status": "confirmed",
    "customer_id": "C-42",
    "items": [{ "product_id": "P-7", "quantity": 3, "unit_price": 12.50 }],
    "total": 37.50
  }
}

8.5 Output Size Gating

call: call(namespace="orders", function="list", kwargs={})

Response (gated):

**Gated:** Response exceeds size limit (4821 orders, ~48,000 characters).

To retrieve, either:
- Add filters (e.g., status, date_range, customer_id) to narrow results
- Pass `sizelimit=50000` to allow the full response through

8.6 Unknown Function Error

call: call(namespace="orders", function="ship", kwargs={"order_id": "ORD-1091"})

Response:

**Error:** No function `ship` in namespace `orders`. Use `help(namespace="orders")` to see available functions.
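
A non-normative sketch of how a server might route a `call` request and produce the error above. The `REGISTRY` mapping, the stub handler, and the wording of the unknown-namespace message are assumptions for illustration; Section 4.4 remains authoritative for error handling.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical registry: namespace -> function name -> handler.
REGISTRY: Dict[str, Dict[str, Callable[..., Any]]] = {
    "orders": {
        # Stub handler standing in for a real lookup.
        "get": lambda order_id: {"order_id": order_id, "status": "confirmed"},
    },
}


def call(namespace: str, function: str, kwargs: Optional[Dict[str, Any]] = None) -> Any:
    """Route a request to its handler, or return an error that points back to help."""
    functions = REGISTRY.get(namespace)
    if functions is None:
        # Assumed wording; this section only shows the unknown-function case.
        return f"**Error:** No namespace `{namespace}`. Use `help()` to see available namespaces."
    handler = functions.get(function)
    if handler is None:
        return (
            f"**Error:** No function `{function}` in namespace `{namespace}`. "
            f'Use `help(namespace="{namespace}")` to see available functions.'
        )
    # Wrap the handler's return value as in the Section 8.4 response.
    return {"result": handler(**(kwargs or {}))}
```

Returning the matching `help` invocation inside the error lets the caller recover in one step instead of guessing at function names.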

8.7 Skill Lookup

call: skill(namespace="orders", skillname="createorderwithcheck")

Response:

# createorderwithcheck

To create an order with an inventory pre-check:

1. Call `help(namespace="inventory", function="check")` to confirm the check function's parameters.
2. Call `call(namespace="inventory", function="check")` for each product_id in the intended order, passing the required quantity. If any item returns `available=false`, report the shortage to the user before proceeding.
3. If all items are available, call `call(namespace="orders", function="create")` with the customer_id and items array.
4. Confirm the returned order_id to the user.

If `inventory.check` is unavailable, proceed directly to `orders.create`; the server will enforce stock constraints and return an error if items are out of stock.
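
For comparison, steps 2 to 4 of this skill can be written as ordinary client code. This is a non-normative sketch: the `call` argument is any callable with the `call` tool's signature, the `product_id` and `quantity` kwargs for `inventory.check` are assumed (step 1 exists precisely to confirm them), and responses are assumed to be wrapped in `result` as in Section 8.4. The fallback for an unavailable `inventory.check` is not modeled.

```python
from typing import Any, Callable, Dict, List


def create_order_with_check(call: Callable[..., Any],
                            customer_id: str,
                            items: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Check stock for every item, then create the order only if all are available."""
    shortages = []
    for item in items:
        reply = call(
            namespace="inventory",
            function="check",
            # Assumed parameter names; confirm them via help before relying on them.
            kwargs={"product_id": item["product_id"], "quantity": item["quantity"]},
        )
        if not reply["result"]["available"]:
            shortages.append(item["product_id"])
    if shortages:
        # Report the shortage instead of creating the order.
        return {"created": False, "shortages": shortages}
    reply = call(
        namespace="orders",
        function="create",
        kwargs={"customer_id": customer_id, "items": items},
    )
    return {"created": True, "order_id": reply["result"]["order_id"]}
```

A skill delivers the same procedure as text, so the model can follow it with only the three CaSH tools and no client-side code.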

9. Open Questions

Skill versioning. As server APIs evolve, skills referencing old function signatures will break silently. Should skills carry version identifiers, and should skill() support a version parameter? A companion document should address lifecycle management.
Manifest signing. In dynamic aggregation, a compromised manifest could redirect an aggregator to a malicious endpoint. Should manifests carry a cryptographic signature, and should aggregators be required to verify it? A companion document should address manifest integrity.
Authentication surface. A companion document or future revision should specify a minimal token-based auth mechanism and define how call, help, and skill behave when authentication fails.

Appendix A. LLM Quick Reference

This appendix is a progressive skill: start at Layer 0 and drill deeper only as needed.

Layer 0 — Basics

This server uses the CaSH pattern. You have exactly three tools:

Tool	Purpose	Parameters
`call`	Execute a function	`namespace`, `function`, `kwargs`
`help`	Discover what's available	`namespace`, `function`, `kwargs`
`skill`	Get task instructions	`namespace`, `skillname`, `kwargs`

Start here:

1. Call help() to see available namespaces.
2. Call help(namespace="x") to see functions in a namespace.
3. Call help(namespace="x", function="y") to see parameters.
4. Call call(namespace="x", function="y", kwargs={...}) to execute.
5. Call skill() to see available task workflows.

All parameters are keyword arguments. namespace may be omitted for root-level access.

Layer 1 — Discovery and Gating

Progressive discovery. help reveals information in layers — index, then namespace contents, then function detail. Each layer gives you just enough to decide whether to drill deeper. Don't call help for everything upfront; discover what you need as you go.

Output gating. Large responses are gated — you'll get a summary and a suggested sizelimit value instead of the full data. Either narrow your query with filters or pass sizelimit=N to call to allow the full response through.

Help kwargs. help may accept extra keyword arguments (e.g., examples=true, params="full", format="json"). Check the base help() response — supported arguments are listed there.

Skills. Skills describe multi-step workflows in plain text. They may tell you to call help for details, then make several call invocations. Follow them step by step. Skills may reference other skills by name.

Layer 2 — Edge Cases

Hierarchical namespaces. Namespaces can be nested with dots: orders.retail. Call help(namespace="orders") to see sub-namespaces and functions at that level. A namespace may contain both.

Root namespace. Small servers may not use namespaces internally. If help() shows functions directly (no namespace needed), call them with namespace omitted or empty.

Name normalization. Uppercase and underscores are visual sugar — Order_Mgmt and ordermgmt are the same namespace. The server normalizes all identifiers (namespaces, functions, skill names, kwargs keys) before matching.
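
A minimal sketch of this rule, assuming normalization means lowercasing and removing underscores and nothing more; Section 5.3 remains authoritative. The helper names `normalize` and `normalize_kwargs` are illustrative.

```python
from typing import Any, Dict


def normalize(identifier: str) -> str:
    """Lowercase and drop underscores; dots separating namespace levels are kept."""
    return identifier.replace("_", "").lower()


def normalize_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Normalize kwargs keys only; values are passed through untouched."""
    return {normalize(key): value for key, value in kwargs.items()}
```

Under this assumption, `normalize("Order_Mgmt")` and `normalize("ordermgmt")` both give `ordermgmt`, so either spelling reaches the same namespace.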

Context-dependent visibility. What help shows may vary based on your authentication or role. If a function doesn't appear in help, you don't have access to it.

Gated response handling. When gated, prefer narrowing the query over raising the size limit. Filters are cheaper than large context. Only raise sizelimit when you genuinely need the full dataset.

10. References

Normative

[RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC 8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, May 2017.
[MCP] Anthropic, "Model Context Protocol Specification", https://modelcontextprotocol.io/

Informative

Loesch, S., "MCP Best Practices — The CaSH Model", Aigon Blog, 2026-04-23, https://aigon.ai/blog/2026-04-23-mcp-cash-model/
[Agent Skills] "Agent Skills Specification", https://agentskills.io/specification — Open format for packaging agent task instructions. CaSH skills serve a similar role but are restricted to text-only delivery via MCP (see Section 4.3).
