# Providers
**Disambiguation.** This page is about embedding and LLM providers inside the engine: OpenAI, Ollama, Anthropic, and so on. The SDK has a separate concept called memory providers (`MemoryProvider`, the interface a memory backend implements so the SDK can route through it). Different layer, different concept. See the SDK's memory providers.
Embeddings and LLM calls in AtomicMemory are pluggable providers behind single-method interfaces. You pick OpenAI, Anthropic, Google, Groq, Ollama, a local WASM model, or any OpenAI-compatible endpoint at deploy time via environment variables. Nothing above the provider boundary changes — no code, no imports, no service wiring.
That is the second pillar of the platform layer: the services that call `embedText()` and `chat()` don't know, and can't know, which provider is serving the call. The selection is made once at composition-root time and erased behind an interface.
## The two interfaces
Both provider families are tiny. That's deliberate: the less surface the interface has, the more backends can satisfy it, and the harder it is to leak provider-specific concepts into business logic.
### EmbeddingProvider

From `Atomicmemory-core/src/services/embedding.ts:69-72`:

```typescript
export type EmbeddingTask = 'query' | 'document';

export interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
}
```
Two methods. One for single embeddings, one for batch. Every backend — OpenAI REST, Ollama's native API, local WASM via transformers.js, any OpenAI-compatible endpoint — satisfies exactly this shape.
### LLMProvider

From `Atomicmemory-core/src/services/llm.ts:60-74`:

```typescript
export interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface ChatOptions {
  temperature?: number;
  maxTokens?: number;
  jsonMode?: boolean;
  seed?: number;
}

export interface LLMProvider {
  chat(messages: ChatMessage[], options?: ChatOptions): Promise<string>;
}
```
One method. Chat messages in, completion text out. JSON mode, seed, and temperature are passed as options; any backend that can't honor one of them (for instance, Anthropic ignores `seed`) degrades gracefully inside the adapter, never leaking the difference to callers.
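To make that degradation concrete, here is a minimal sketch of how an adapter might strip options its backend cannot honor. `ChatOptions` is reproduced from the interface above; `dropUnsupported` is a hypothetical helper, not part of the shipped code.

```typescript
// ChatOptions reproduced from llm.ts; dropUnsupported is a
// hypothetical helper, not part of the shipped code.
interface ChatOptions {
  temperature?: number;
  maxTokens?: number;
  jsonMode?: boolean;
  seed?: number;
}

// Copy the options, then remove the keys this backend cannot honor,
// so the caller-facing interface never changes.
function dropUnsupported(
  options: ChatOptions,
  unsupported: ReadonlyArray<keyof ChatOptions>,
): ChatOptions {
  const sanitized: ChatOptions = { ...options };
  for (const key of unsupported) {
    delete sanitized[key];
  }
  return sanitized;
}

// An Anthropic-style adapter would drop `seed` before dispatch:
const forAnthropic = dropUnsupported(
  { temperature: 0, seed: 42, jsonMode: true },
  ['seed'],
);
// forAnthropic is { temperature: 0, jsonMode: true }
```

The caller that passed `seed: 42` never learns it was ignored; the difference dies inside the adapter.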
## The supported providers

### Embeddings

| Provider name (`EMBEDDING_PROVIDER=`) | Backend |
|---|---|
| `openai` | OpenAI embeddings REST API |
| `ollama` | Ollama native `/api/embed` endpoint |
| `openai-compatible` | Any OpenAI-schema endpoint (LM Studio, vLLM, TGI, …) |
| `transformers` | Local WASM via `@huggingface/transformers` + ONNX Runtime |

Declared in `config.ts:14`:

```typescript
export type EmbeddingProviderName =
  'openai' | 'ollama' | 'openai-compatible' | 'transformers';
```
### LLM

| Provider name (`LLM_PROVIDER=`) | Backend |
|---|---|
| `openai` | OpenAI chat completions |
| `anthropic` | Anthropic Messages API |
| `google-genai` | Google Gemini via OpenAI-compatible endpoint |
| `groq` | Groq via OpenAI-compatible endpoint |
| `ollama` | Ollama native `/api/chat` endpoint |
| `openai-compatible` | Any OpenAI-schema endpoint (LM Studio, vLLM, …) |

Declared in `config.ts:15`:

```typescript
export type LLMProviderName =
  EmbeddingProviderName | 'groq' | 'anthropic' | 'google-genai';
```
Note the subtype relationship: every embedding provider name also type-checks as an LLM provider name. In practice `transformers` is the exception: it has no chat backend, so selecting it as the LLM provider falls through to the factory's error case.
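If you need to narrow an untrusted environment value to one of these unions at runtime, a small guard is enough. The name unions below are reproduced from `config.ts`; the guard itself is an illustrative sketch, not part of the shipped code.

```typescript
// Unions reproduced from config.ts; the runtime guard is an
// illustrative sketch, not shipped code.
type EmbeddingProviderName =
  | 'openai' | 'ollama' | 'openai-compatible' | 'transformers';
type LLMProviderName =
  | EmbeddingProviderName | 'groq' | 'anthropic' | 'google-genai';

const LLM_PROVIDER_NAMES = new Set<string>([
  'openai', 'ollama', 'openai-compatible', 'transformers',
  'groq', 'anthropic', 'google-genai',
]);

// Narrow an untrusted value (e.g. process.env.LLM_PROVIDER) to the union.
function isLLMProviderName(value: string): value is LLMProviderName {
  return LLM_PROVIDER_NAMES.has(value);
}
```

Note that `transformers` passes the guard because it is in the type; the factory, not the type system, rejects it as a chat backend.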
## The factory is the only place the provider name is visible
This is the crux of the provider-agnostic boundary. The entire codebase
above embedding.ts never matches on provider name. The switch happens
once, inside the factory, and is erased the moment the provider is returned.
From `embedding.ts:232-254`:

```typescript
function createEmbeddingProvider(): EmbeddingProvider {
  const config = requireConfig();
  switch (config.embeddingProvider) {
    case 'openai':
      return new OpenAICompatibleEmbedding(
        config.openaiApiKey, config.embeddingModel,
        undefined, config.embeddingDimensions,
      );
    case 'ollama':
      return new OllamaEmbedding(
        config.embeddingModel, config.ollamaBaseUrl,
      );
    case 'openai-compatible':
      return new OpenAICompatibleEmbedding(
        config.embeddingApiKey ?? config.openaiApiKey,
        config.embeddingModel,
        config.embeddingApiUrl,
        config.embeddingDimensions,
      );
    case 'transformers':
      return new TransformersEmbedding(config.embeddingModel);
    default:
      throw new Error(
        `Unknown embedding provider: ${config.embeddingProvider}`,
      );
  }
}
```
And the LLM factory, from `llm.ts:259-289`:

```typescript
export function createLLMProvider(): LLMProvider {
  const config = requireConfig();
  switch (config.llmProvider) {
    case 'openai':
      return new OpenAICompatibleLLM(config.openaiApiKey, config.llmModel);
    case 'ollama':
      return new OllamaLLM(config.llmModel, config.ollamaBaseUrl);
    case 'groq':
      return new OpenAICompatibleLLM(
        config.groqApiKey ?? '',
        config.llmModel,
        'https://api.groq.com/openai/v1',
      );
    case 'anthropic':
      return new AnthropicLLM(config.anthropicApiKey ?? '', config.llmModel);
    case 'google-genai':
      return new OpenAICompatibleLLM(
        config.googleApiKey ?? '',
        config.llmModel,
        'https://generativelanguage.googleapis.com/v1beta/openai/',
      );
    case 'openai-compatible':
      return new OpenAICompatibleLLM(
        config.llmApiKey ?? config.openaiApiKey,
        config.llmModel,
        config.llmApiUrl,
      );
    default:
      throw new Error(`Unknown LLM provider: ${config.llmProvider}`);
  }
}
```
Two things to notice:

- Three providers (Groq, Google Gemini, OpenAI-compatible) reuse `OpenAICompatibleLLM`. The OpenAI SDK's wire format is the industry default, so the adapter is written once and pointed at different `baseURL`s. That's what "openai-compatible" costs us: nothing.
- Anthropic gets its own adapter because the Messages API has a different message shape (the system prompt is top-level; assistant/user messages are separate). The adapter normalizes it to `chat(messages, options)` and callers never see the difference.
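The normalization the Anthropic adapter has to perform can be sketched roughly like this. `ChatMessage` is reproduced from `llm.ts`; the request shape and the `toAnthropicShape` helper are simplified assumptions for illustration, not the shipped adapter.

```typescript
// ChatMessage reproduced from llm.ts; AnthropicRequest and
// toAnthropicShape are simplified sketches of what the adapter does.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Simplified shape of an Anthropic Messages API request body.
interface AnthropicRequest {
  system?: string;
  messages: Array<{ role: 'user' | 'assistant'; content: string }>;
}

// Hoist system messages into the top-level `system` field; pass the
// user/assistant turns through unchanged.
function toAnthropicShape(messages: ChatMessage[]): AnthropicRequest {
  const system = messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content)
    .join('\n');
  const turns = messages.filter(
    (m): m is ChatMessage & { role: 'user' | 'assistant' } =>
      m.role !== 'system',
  );
  return { system: system || undefined, messages: turns };
}
```

Callers still see only `chat(messages, options)`; the reshaping stays inside the adapter.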
## Changing provider with zero code change

This is the headline. Here's the ingest pipeline calling `embedText`:

```typescript
import { embedText } from './services/embedding.js';

// Inside the ingest service — no provider name anywhere.
const embedding = await embedText(userMessage, 'document');
await stores.memory.storeMemory({
  userId, content: userMessage, embedding, importance, sourceSite,
});
```
The same call site runs against OpenAI:

```bash
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=sk-…
```

Or against a local Ollama:

```bash
EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=snowflake-arctic-embed2
OLLAMA_BASE_URL=http://localhost:11434
```

Or against fully-local WASM with zero network:

```bash
EMBEDDING_PROVIDER=transformers
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2
```

Or against an OpenAI-compatible server (LM Studio, vLLM, TGI, a corporate proxy):

```bash
EMBEDDING_PROVIDER=openai-compatible
EMBEDDING_MODEL=bge-large-en-v1.5
EMBEDDING_API_URL=http://internal-embed.corp:8080/v1
EMBEDDING_API_KEY=…   # optional
```
The ingest service, the search service, the AUDN decision loop, the repair loop — none of them has a single `if (provider === 'ollama')` branch. That is the provider-agnostic boundary working as designed.
## Provider quirks stay inside the adapter
Being agnostic doesn't mean being naïve. Real embedding models have real quirks, and the adapter layer is where those quirks live — never above. Two examples from the shipped code:
### Instruction prefixes

Some embedding models (mxbai, nomic) need task-specific prefixes on query text but not document text. The provider-agnostic `embedText` function handles that before dispatch. From `embedding.ts:292-308`:

```typescript
function getInstructionPrefix(model: string, task: EmbeddingTask): string {
  if (task === 'document') return '';
  if (model.includes('mxbai-embed-large')) {
    return 'Represent this sentence for searching relevant passages: ';
  }
  if (model.includes('nomic-embed-text')) {
    return 'search_query: ';
  }
  return '';
}
```
Callers pass `'query'` or `'document'` as a semantic tag. The prefix (if any) is model-specific, and the logic lives in one place.
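Running the function in isolation shows what callers get for free. The function body below is copied from the snippet above; the composed `prefixed` string is just a usage illustration.

```typescript
type EmbeddingTask = 'query' | 'document';

// Reproduced from embedding.ts for illustration.
function getInstructionPrefix(model: string, task: EmbeddingTask): string {
  if (task === 'document') return '';
  if (model.includes('mxbai-embed-large')) {
    return 'Represent this sentence for searching relevant passages: ';
  }
  if (model.includes('nomic-embed-text')) {
    return 'search_query: ';
  }
  return '';
}

// Queries get a model-specific prefix; documents never do.
const prefixed =
  getInstructionPrefix('mxbai-embed-large', 'query') + 'what did I say about travel?';
// prefixed === 'Represent this sentence for searching relevant passages: what did I say about travel?'
```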
### ONNX Runtime serialization (WASM provider)

The local WASM provider has a known concurrency issue: ONNX Runtime's mutex corrupts under concurrent async calls. Rather than leak a "don't call concurrently" caveat into every consumer, the adapter serializes internally. From `embedding.ts:168-218`:

```typescript
class TransformersEmbedding implements EmbeddingProvider {
  private model: string;
  private pipelinePromise: Promise<TransformersPipeline> | null = null;
  private inferenceQueue: Promise<void> = Promise.resolve();

  private serialized<T>(fn: (extractor: TransformersPipeline) => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.inferenceQueue = this.inferenceQueue.then(async () => {
        try {
          const extractor = await this.getPipeline();
          resolve(await fn(extractor));
        } catch (err) {
          reject(err);
        }
      });
    });
  }

  async embed(text: string): Promise<number[]> {
    return this.serialized(async (extractor) => {
      const output = await extractor(text, { pooling: 'mean', normalize: true });
      return Array.from(output.data as Float32Array);
    });
  }

  // embedBatch follows the same pattern.
}
```
The rule: every provider-specific workaround lives inside the adapter. The `EmbeddingProvider` interface is the same shape for every backend.
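The promise-queue trick generalizes beyond embeddings. Here is a standalone sketch of the same pattern (`makeSerializer` and the demo are illustrative, not shipped code) showing why concurrently fired calls can never interleave:

```typescript
// Minimal standalone version of the promise-queue trick used by
// TransformersEmbedding: every call chains onto the previous one, so
// the wrapped function never runs concurrently with itself.
function makeSerializer() {
  let queue: Promise<void> = Promise.resolve();
  return function serialized<T>(fn: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      queue = queue.then(async () => {
        try {
          resolve(await fn());
        } catch (err) {
          reject(err);
        }
      });
    });
  };
}

// Demo: three "inference" calls fired at once still run one at a time.
const serialized = makeSerializer();
const order: number[] = [];
const run = (n: number) =>
  serialized(async () => {
    order.push(n); // enter
    await new Promise((r) => setTimeout(r, 10 - n)); // later calls sleep less
    order.push(n); // exit
    return n;
  });

async function demo(): Promise<number[]> {
  await Promise.all([run(1), run(2), run(3)]);
  return order; // [1, 1, 2, 2, 3, 3]; no call interleaves with another
}
```

Without the queue, the shorter sleeps would let call 3 finish first and the enters/exits would interleave; with it, each enter is immediately followed by its own exit.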
## Cost telemetry is cross-cutting

Every provider adapter — OpenAI, Anthropic, Ollama, Google, Groq — calls `writeCostEvent()` with the same shape after each request. That gives you one cost log across heterogeneous backends, keyed by provider, model, and stage. You can swap models and still see apples-to-apples cost data in a single stream.

See the `recordOpenAICost` helper in `llm.ts:147-164` for the OpenAI-compatible path; every other adapter writes the same event shape inline.
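One way to picture the cross-cutting shape is a decorator that wraps any `LLMProvider` and emits one event per call. This is a hypothetical sketch: the shipped adapters write their events inline, and the real event fields and `writeCostEvent` signature may differ.

```typescript
// Hypothetical sketch; the shipped adapters write events inline and
// the real CostEvent fields may differ.
interface ChatMessage { role: 'system' | 'user' | 'assistant'; content: string }
interface LLMProvider {
  chat(messages: ChatMessage[], options?: object): Promise<string>;
}
interface CostEvent { provider: string; model: string; stage: string; ms: number }

// Decorate any provider so every chat() call emits one cost event,
// regardless of which backend serves it.
function withCostTelemetry(
  inner: LLMProvider,
  meta: { provider: string; model: string; stage: string },
  writeCostEvent: (e: CostEvent) => void,
): LLMProvider {
  return {
    async chat(messages, options) {
      const start = Date.now();
      try {
        return await inner.chat(messages, options);
      } finally {
        writeCostEvent({ ...meta, ms: Date.now() - start });
      }
    },
  };
}
```

Because the event is written in a `finally` block, failed calls are costed too, which is what keeps the stream apples-to-apples.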
## Writing your own provider

Because `EmbeddingProvider` and `LLMProvider` are interfaces, not base classes, adding a new backend is mechanical:

- Implement the one-or-two-method interface.
- Add a case to the factory switch.
- Extend `EmbeddingProviderName` or `LLMProviderName` in `config.ts`.
- Wire any required API keys into `RuntimeConfig`.

There are no base classes to extend, no lifecycle hooks to implement, no plugin registry to register with. The provider layer is as small as the `EmbeddingProvider` and `LLMProvider` signatures say it is.
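A minimal sketch of the first step, assuming only the `EmbeddingProvider` shape shown earlier. The `FakeEmbedding` backend here is illustrative (useful in tests), not part of the shipped code:

```typescript
// EmbeddingProvider reproduced from embedding.ts; FakeEmbedding is an
// illustrative backend (deterministic, no network), not shipped code.
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
}

// Step 1: implement the two-method interface. A character-histogram
// "embedding": useless for search, handy for tests.
class FakeEmbedding implements EmbeddingProvider {
  constructor(private readonly dimensions = 8) {}

  async embed(text: string): Promise<number[]> {
    const v = new Array<number>(this.dimensions).fill(0);
    for (let i = 0; i < text.length; i++) {
      v[text.charCodeAt(i) % this.dimensions] += 1;
    }
    return v;
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    return Promise.all(texts.map((t) => this.embed(t)));
  }
}
```

The remaining steps (the factory case, the name union, the config keys) happen in `config.ts` and the factory switch, exactly as listed above.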
## Startup-only selection

Provider and model selection is startup-only by design. The modules hold their config as module-local state, bound once by the composition root. From `embedding.ts:51-56`:

```typescript
export function initEmbedding(config: EmbeddingConfig): void {
  embeddingConfig = config;
  provider = null;
  providerKey = '';
  embeddingCache.clear();
}
```
That's deliberate. Hot-swapping embedding providers mid-process would
invalidate the embedding cache, invalidate pgvector index assumptions, and
potentially mix embedding widths in the same table. We sidestep all of that
by making provider selection a deploy-time decision. Tests can rebind by calling `initEmbedding` again (which explicitly clears the cache) — that's the one sanctioned rebind path.
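The module-local binding pattern is easy to reproduce in isolation. The sketch below mirrors the names in `embedding.ts`, but the `EmbeddingConfig` shape and function bodies are illustrative:

```typescript
// Standalone sketch of the module-local binding pattern; the config
// shape and bodies are illustrative, not the shipped module.
interface EmbeddingConfig { provider: string; model: string }

let embeddingConfig: EmbeddingConfig | null = null;
const embeddingCache = new Map<string, number[]>();

// Called once by the composition root; tests may call it again.
// Rebinding explicitly clears all derived state.
function initEmbedding(config: EmbeddingConfig): void {
  embeddingConfig = config;
  embeddingCache.clear();
}

// Every service-facing entry point goes through this guard, so using
// the module before composition fails loudly instead of silently.
function requireEmbeddingConfig(): EmbeddingConfig {
  if (!embeddingConfig) throw new Error('initEmbedding() not called');
  return embeddingConfig;
}
```

Because the cache lives next to the config and both are rebound in the same function, a test that swaps providers can never observe embeddings produced under the old binding.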
## Related
- Stores — the other half of the platform layer: pluggable storage behind narrow interfaces.
- Composition — how providers, stores, and services are wired together at startup.