Search

Endpoints for retrieving memories. The full search runs query expansion, co-retrieval, reranking, and an LLM repair loop. The fast path skips the LLM repair loop and cross-encoder reranking to hit a sub-200ms target.

Base URL: http://localhost:3050

All request bodies are JSON (Content-Type: application/json). Field names on the raw HTTP prototype surface use snake_case.

POST /v1/memories/search

Full search with query expansion, co-retrieval, reranking, and LLM repair loop.

Request:

{
  "user_id": "ethan",
  "query": "What is their tech stack?",
  "source_site": "claude",
  "limit": 5,
  "as_of": "2026-04-01T00:00:00Z",
  "retrieval_mode": "flat",
  "token_budget": 4000,
  "namespace_scope": "work"
}

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| user_id | string | yes | User identifier |
| query | string | yes | Search query |
| source_site | string | no | Filter by source platform |
| limit | number | no | Max results (1–100, default: server config) |
| as_of | string | no | ISO timestamp for temporal filtering |
| retrieval_mode | string | no | flat, tiered, or abstract-aware |
| token_budget | number | no | Token budget for results (100–50,000) |
| namespace_scope | string | no | Restrict search to a namespace |
| workspace_id | string | no | Scope search to a workspace (requires agent_id) |
| agent_id | string | no | Scope search to an agent within the workspace |
| agent_scope | string | no | Agent visibility scope: all, self, or others |
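
The documented ranges and enumerations can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_search_payload` is a hypothetical helper, and whether the server itself rejects or clamps out-of-range values is not stated here.

```python
from typing import Any

# Allowed values copied from the field table above.
RETRIEVAL_MODES = {"flat", "tiered", "abstract-aware"}
AGENT_SCOPES = {"all", "self", "others"}


def build_search_payload(user_id: str, query: str, **optional: Any) -> dict[str, Any]:
    """Build a /v1/memories/search body, rejecting values outside the documented ranges."""
    if not user_id or not query:
        raise ValueError("user_id and query are required")

    limit = optional.get("limit")
    if limit is not None and not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")

    budget = optional.get("token_budget")
    if budget is not None and not 100 <= budget <= 50_000:
        raise ValueError("token_budget must be between 100 and 50,000")

    mode = optional.get("retrieval_mode")
    if mode is not None and mode not in RETRIEVAL_MODES:
        raise ValueError(f"retrieval_mode must be one of {sorted(RETRIEVAL_MODES)}")

    agent_scope = optional.get("agent_scope")
    if agent_scope is not None and agent_scope not in AGENT_SCOPES:
        raise ValueError(f"agent_scope must be one of {sorted(AGENT_SCOPES)}")

    # Omit unset fields so the server can apply its own defaults.
    payload: dict[str, Any] = {"user_id": user_id, "query": query}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Unset optional fields are dropped instead of being sent as null, since the table describes `limit` as defaulting to server configuration.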

Scope resolution follows the platform's scope contract: when workspace_id + agent_id are omitted, the scope is { kind: 'user', userId }; when both are present, the scope is { kind: 'workspace', userId, workspaceId, agentId, agentScope } and the response echoes it back. Workspace visibility is enforced against each memory's stored visibility column — it is not a caller-provided filter.
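
The scope rule above can be sketched as a small function. This is an illustration of the documented contract only, not the server's code: `resolve_scope` is a hypothetical name, the behavior when exactly one of `workspace_id` / `agent_id` is supplied is not documented (the sketch raises an error as an assumption), and no default for `agentScope` is assumed.

```python
from typing import Any, Optional


def resolve_scope(
    user_id: str,
    workspace_id: Optional[str] = None,
    agent_id: Optional[str] = None,
    agent_scope: Optional[str] = None,
) -> dict[str, Any]:
    """Mirror the documented scope contract for a search request."""
    if workspace_id is None and agent_id is None:
        # Neither workspace field present: user scope.
        return {"kind": "user", "userId": user_id}
    if workspace_id is None or agent_id is None:
        # Assumption: the source does not say what happens with only one field.
        raise ValueError("workspace_id and agent_id must be provided together")
    # Both present: workspace scope, echoed back in the response.
    return {
        "kind": "workspace",
        "userId": user_id,
        "workspaceId": workspace_id,
        "agentId": agent_id,
        "agentScope": agent_scope,
    }
```

Comparing the result with the `scope` object in a response is one way to confirm that a request was interpreted with the intended scope.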

Response (captured from running prototype):

{
  "count": 3,
  "retrieval_mode": "flat",
  "scope": { "kind": "user", "userId": "docs-demo" },
  "memories": [
    {
      "id": "7a52eec6-ede8-4904-8bfd-e393bf83f279",
      "content": "User is allergic to peanuts and avoids all tree nuts.",
      "similarity": 0.5556080705511763,
      "score": 2.8112157186128353,
      "importance": 0.7,
      "source_site": "chatgpt",
      "created_at": "2026-04-05T03:21:42.748Z"
    },
    {
      "id": "f4e240d1-293d-4b58-a72a-401d26dbd09d",
      "content": "The v2 launch deadline is April 15, 2026 and it is a hard deadline we cannot move.",
      "similarity": 0.3323145702287046,
      "score": 2.2646287539301384,
      "importance": 0.6,
      "source_site": "chatgpt",
      "created_at": "2026-04-05T03:21:30.380Z"
    },
    {
      "id": "3fa330cb-ee9e-4614-825c-1ce27539d24d",
      "content": "Our production stack is TypeScript, React, PostgreSQL with pgvector, and we deploy on Fly.io.",
      "similarity": 0.3731595762622564,
      "score": 2.246318742363717,
      "importance": 0.5,
      "source_site": "claude",
      "created_at": "2026-04-05T03:21:28.152Z"
    }
  ],
  "injection_text": "### Subject: site/chatgpt\n- [2026-04-05] [answer] The v2 launch deadline is April 15, 2026 and it is a hard deadline we cannot move.\n- [2026-04-05] [context] User is allergic to peanuts and avoids all tree nuts.\n\n### Subject: site/claude\n- [2026-04-05] [context] Our production stack is TypeScript, React, PostgreSQL with pgvector, and we deploy on Fly.io.",
  "citations": [
    "7a52eec6-ede8-4904-8bfd-e393bf83f279",
    "f4e240d1-293d-4b58-a72a-401d26dbd09d",
    "3fa330cb-ee9e-4614-825c-1ce27539d24d"
  ],
  "observability": {
    "retrieval": { "stages": [/* ... */] },
    "packaging": { "tier_budget_tokens": 4000 },
    "assembly": { "injection_tokens": 312 }
  }
}

| Response Field | Notes |
| --- | --- |
| scope | Echoes the resolved MemoryScope: { kind: 'user', userId } or { kind: 'workspace', userId, workspaceId, agentId, ... } |
| memories[] | Ranked results with cosine similarity, composite score, and importance |
| injection_text | Pre-formatted markdown grouped by ### Subject: site/<source> with date-stamped [context]/[answer] bullets |
| citations | Memory IDs referenced in injection_text, matching memories[].id order |
| tier_assignments | Present when retrieval_mode is tiered |
| expand_ids | IDs for follow-up /v1/memories/expand calls |
| lesson_check | Safety check against learned lessons |
| consensus | Conflict resolution stats when multiple memories conflict |
| observability | Optional trace payload (retrieval / packaging / assembly sub-objects), present when the runtime produced per-stage summaries. See observability for the schema. |
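
A client that needs structured data instead of the pre-formatted markdown can split `injection_text` back into its subject groups. The sketch below assumes the layout seen in the captured response (a `### Subject:` heading followed by `- [date] [kind] text` bullets); the helper names are hypothetical and the regular expressions are inferred from that single sample, not from a published grammar.

```python
import re

# Patterns inferred from the captured injection_text sample.
HEADING = re.compile(r"^### Subject: (?P<subject>.+)$")
BULLET = re.compile(r"^- \[(?P<date>\d{4}-\d{2}-\d{2})\] \[(?P<kind>\w+)\] (?P<text>.+)$")


def parse_injection_text(text: str) -> dict[str, list[dict[str, str]]]:
    """Group injection_text bullets under their '### Subject:' heading."""
    groups: dict[str, list[dict[str, str]]] = {}
    current = None
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = groups.setdefault(heading["subject"], [])
            continue
        bullet = BULLET.match(line)
        if bullet and current is not None:
            current.append(bullet.groupdict())
    return groups


def citations_match_memories(response: dict) -> bool:
    """Check that citations lists the same IDs, in the same order, as memories[]."""
    memory_ids = [m["id"] for m in response.get("memories", [])]
    return response.get("citations", []) == memory_ids
```

Lines that match neither pattern are skipped, so a change in the server's formatting shows up as missing entries instead of an exception.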

Example:

curl -X POST http://localhost:3050/v1/memories/search \
-H 'Content-Type: application/json' \
-d '{"user_id": "docs-demo", "query": "Do they have any food allergies?", "limit": 3}'
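
The same call can be made from Python using only the standard library. This is a sketch under stated assumptions: `make_search_request` and `search` are hypothetical helper names, the base URL is the local prototype address given above, and the 30-second timeout is an arbitrary client-side choice.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3050"  # local prototype address from this page


def make_search_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/memories/search."""
    return urllib.request.Request(
        url=f"{base_url}/v1/memories/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(make_search_request(payload), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires the prototype server to be running locally.
    result = search({"user_id": "docs-demo", "query": "Do they have any food allergies?", "limit": 3})
    print(result["count"], [m["id"] for m in result["memories"]])
```

Building the request separately from sending it keeps the payload and headers inspectable without a running server.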

POST /v1/memories/search/fast

Latency-optimized search (sub-200ms target). Skips the LLM repair loop and cross-encoder reranking.

Request:

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| user_id | string | yes | User identifier |
| query | string | yes | Search query |
| source_site | string | no | Filter by source platform |
| limit | number | no | Max results (1–100) |
| namespace_scope | string | no | Restrict to namespace |
| workspace_id | string | no | Scope search to a workspace (requires agent_id) |
| agent_id | string | no | Scope search to an agent within the workspace |
| agent_scope | string | no | Agent visibility scope: all, self, or others |
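
The fast endpoint lists fewer fields than the full search: as_of, retrieval_mode, and token_budget do not appear in its table. A client that switches between the two endpoints might reduce a full-search body to the listed fields, as in this sketch. `to_fast_payload` is a hypothetical helper, and whether the server ignores or rejects unlisted fields is not stated here.

```python
# Fields listed in the /v1/memories/search/fast request table.
FAST_SEARCH_FIELDS = frozenset({
    "user_id",
    "query",
    "source_site",
    "limit",
    "namespace_scope",
    "workspace_id",
    "agent_id",
    "agent_scope",
})


def to_fast_payload(payload: dict) -> dict:
    """Keep only the fields the fast endpoint documents."""
    return {key: value for key, value in payload.items() if key in FAST_SEARCH_FIELDS}
```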

Response (captured from running prototype — same schema as /v1/memories/search, including scope and optional observability):

{
  "count": 3,
  "retrieval_mode": "flat",
  "memories": [
    {
      "id": "7a52eec6-ede8-4904-8bfd-e393bf83f279",
      "content": "User is allergic to peanuts and avoids all tree nuts.",
      "similarity": 0.3755667878488398,
      "score": 4.951131619541372,
      "importance": 0.7,
      "source_site": "chatgpt",
      "created_at": "2026-04-05T03:21:42.748Z"
    },
    {
      "id": "f4e240d1-293d-4b58-a72a-401d26dbd09d",
      "content": "The v2 launch deadline is April 15, 2026 and it is a hard deadline we cannot move.",
      "similarity": 0.4432517683214816,
      "score": 4.153163511162466,
      "importance": 0.6,
      "source_site": "chatgpt",
      "created_at": "2026-04-05T03:21:30.380Z"
    },
    {
      "id": "3fa330cb-ee9e-4614-825c-1ce27539d24d",
      "content": "Our production stack is TypeScript, React, PostgreSQL with pgvector, and we deploy on Fly.io.",
      "similarity": 0.6176793662882996,
      "score": 3.568684490548965,
      "importance": 0.5,
      "source_site": "claude",
      "created_at": "2026-04-05T03:21:28.152Z"
    }
  ],
  "injection_text": "### Subject: site/chatgpt\n- [2026-04-05] [answer] The v2 launch deadline is April 15, 2026 and it is a hard deadline we cannot move.\n- [2026-04-05] [context] User is allergic to peanuts and avoids all tree nuts.\n\n### Subject: site/claude\n- [2026-04-05] [context] Our production stack is TypeScript, React, PostgreSQL with pgvector, and we deploy on Fly.io.",
  "citations": [
    "7a52eec6-ede8-4904-8bfd-e393bf83f279",
    "f4e240d1-293d-4b58-a72a-401d26dbd09d",
    "3fa330cb-ee9e-4614-825c-1ce27539d24d"
  ]
}

Note: composite score values from /search/fast are higher than those from /search because the fast path applies different boosting (recency, access frequency) without the LLM repair normalization. Compare scores only within a single endpoint's results.
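
Because the two endpoints score on different scales, a comparison between them is more meaningful on rank position than on raw score. The sketch below shows one way to do that; `ranked_ids` and `rank_shifts` are hypothetical helpers that assume only the `memories[].id` field shown in the responses above.

```python
def ranked_ids(response: dict) -> list[str]:
    """Return memory IDs in the order the endpoint ranked them."""
    return [memory["id"] for memory in response["memories"]]


def rank_shifts(full: dict, fast: dict) -> dict[str, int]:
    """Map each shared memory ID to (fast rank - full rank).

    0 means the same position in both results; a negative value means
    the fast path ranked the memory higher than the full search did.
    """
    full_rank = {memory_id: i for i, memory_id in enumerate(ranked_ids(full))}
    return {
        memory_id: i - full_rank[memory_id]
        for i, memory_id in enumerate(ranked_ids(fast))
        if memory_id in full_rank
    }
```

For the two captured responses on this page, every shift is 0: the order is identical even though the fast scores are roughly twice as large.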

Example:

curl -X POST http://localhost:3050/v1/memories/search/fast \
-H 'Content-Type: application/json' \
-d '{"user_id": "docs-demo", "query": "What is their tech stack?", "limit": 3}'