nanoclaw's memory architecture: sqlite + 200-message rolling window

15 Mar 2026

what is nanoclaw
the memory question
architecture overview
the 200-message rolling window
cursor-based retrieval mechanism
accessing context beyond the window
comparison to RAG-based systems
trade-offs and design decisions
source citations

what is nanoclaw

nanoclaw is a WhatsApp-integrated AI assistant built on Claude that runs in Docker containers with per-group isolation. it’s designed for multi-group conversations with separate memory contexts, scheduled tasks, and persistent message history.

the interesting question isn’t what it does – it’s how it manages conversation context across multiple groups with potentially thousands of messages.

the memory question

when building LLM-based agents, one of the first architectural decisions is: how do you handle conversation history that exceeds the context window?

popular approaches:

RAG (Retrieval-Augmented Generation): embed messages, store in vector DB, retrieve semantically relevant chunks
Summarization: periodically summarize old messages, keep summaries in context
Sliding window: keep last N messages, drop older ones
Hybrid: combine multiple strategies

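nanoclaw skips RAG, summarization, and hybrids. what it does is closest to a sliding window, except the window is anchored to a cursor instead of simply “the last N messages”. let’s see what it actually does.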

architecture overview

nanoclaw’s memory system has two primary components:

SQLite database (store/messages.db) - stores all messages with metadata
Markdown files (conversations/ folder) - searchable conversation exports

the database schema is straightforward:

CREATE TABLE messages (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  chat_jid TEXT NOT NULL,           -- WhatsApp group ID
  sender TEXT NOT NULL,              -- Phone number
  sender_name TEXT,                  -- Display name
  content TEXT NOT NULL,             -- Message text
  timestamp TEXT NOT NULL,           -- ISO 8601
  is_from_me INTEGER DEFAULT 0,     -- Bot's own messages
  is_bot_message INTEGER DEFAULT 0  -- Messages from bot
);

source: src/db.ts lines 15-25

no embeddings. no vector columns. no fancy indexing beyond the standard B-tree on chat_jid and timestamp.
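to make the schema concrete, here is a rough sketch of how one row might be typed in TypeScript. the names MessageRow and isNewer are assumptions for illustration, not taken from the nanoclaw source.

```typescript
// Hypothetical row type mirroring the messages table above.
// MessageRow is an assumed name, not taken from the nanoclaw source.
interface MessageRow {
  id: number;
  chat_jid: string;
  sender: string;
  sender_name: string | null; // column has no NOT NULL constraint
  content: string;
  timestamp: string; // ISO 8601 text
  is_from_me: 0 | 1; // SQLite has no boolean type; flags are stored as integers
  is_bot_message: 0 | 1;
}

// ISO 8601 timestamps in the same format and timezone sort correctly as plain
// strings, which is why a TEXT column can be ordered and compared directly.
function isNewer(a: MessageRow, b: MessageRow): boolean {
  return a.timestamp > b.timestamp;
}

const older: MessageRow = {
  id: 1, chat_jid: "group-a", sender: "111", sender_name: "Ana",
  content: "hello", timestamp: "2026-03-01T09:00:00.000Z",
  is_from_me: 0, is_bot_message: 0,
};
const newer: MessageRow = {
  id: 2, chat_jid: "group-a", sender: "222", sender_name: null,
  content: "hi", timestamp: "2026-03-01T09:05:00.000Z",
  is_from_me: 0, is_bot_message: 0,
};
```

the string comparison only holds if every writer uses the same timestamp format; mixing timezones or precisions would break the ordering.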

the 200-message rolling window

when nanoclaw processes a new message, it retrieves conversation history using getMessagesSince():

export function getMessagesSince(
  chatJid: string,
  sinceTimestamp: string,
  botPrefix: string,
  limit: number = 200,
): NewMessage[] {
  const sql = `
    SELECT * FROM (
      SELECT id, chat_jid, sender, sender_name, content, timestamp, is_from_me
      FROM messages
      WHERE chat_jid = ? AND timestamp > ?
        AND is_bot_message = 0 AND content NOT LIKE ?
        AND content != '' AND content IS NOT NULL
      ORDER BY timestamp DESC
      LIMIT ?
    ) ORDER BY timestamp
  `;
  return db.prepare(sql).all(chatJid, sinceTimestamp, `${botPrefix}:%`, limit);
}

source: src/db.ts lines 341-364

key observations:

Hard limit of 200 messages - enforced at the SQL level via LIMIT ?
Cursor-based pagination - uses sinceTimestamp instead of a fixed time window
Chronological order - the inner query picks the newest rows (ORDER BY timestamp DESC), then the outer ORDER BY timestamp puts them back in conversation order
Filters the bot’s own output - keeps only rows with is_bot_message = 0 and drops content starting with the bot prefix, to avoid self-references

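the 200 is just the default value of the limit parameter. the call site shown in the next section never overrides it, so in practice every run is capped at 200 messages. the cap is a plain message count; it doesn’t adapt to token count.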
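the nested query is easier to follow as plain array operations. this is an in-memory stand-in for the SQL above, not nanoclaw code: selectWindow, Msg, and the "Bot" prefix are all made up for the example.

```typescript
// In-memory stand-in for the SQL in getMessagesSince (illustration only).
interface Msg {
  id: number;
  chatJid: string;
  content: string;
  timestamp: string; // ISO 8601, compared as a string like the TEXT column
  isBotMessage: boolean;
}

function selectWindow(
  all: Msg[],
  chatJid: string,
  sinceTimestamp: string,
  botPrefix: string,
  limit = 200,
): Msg[] {
  return all
    .filter(
      (m) =>
        m.chatJid === chatJid &&
        m.timestamp > sinceTimestamp &&
        !m.isBotMessage &&
        // SQLite's LIKE is case-insensitive for ASCII by default;
        // startsWith is not, so this is only an approximation.
        !m.content.startsWith(`${botPrefix}:`) &&
        m.content !== "",
    )
    // inner query: newest first, keep only `limit` rows
    .sort((a, b) => (a.timestamp < b.timestamp ? 1 : a.timestamp > b.timestamp ? -1 : 0))
    .slice(0, limit)
    // outer query: back to chronological order
    .reverse();
}

// 500 user messages one second apart, then one bot-prefixed message.
const stamp = (i: number): string =>
  new Date(Date.UTC(2026, 0, 1, 0, 0, i)).toISOString();
const backlog: Msg[] = Array.from({ length: 500 }, (_, i) => ({
  id: i + 1,
  chatJid: "group-a",
  content: `msg ${i + 1}`,
  timestamp: stamp(i + 1),
  isBotMessage: false,
}));
backlog.push({
  id: 501, chatJid: "group-a", content: "Bot: on it",
  timestamp: stamp(501), isBotMessage: false,
});

const recent = selectWindow(backlog, "group-a", "", "Bot");
// recent holds ids 301..500: the newest 200 non-bot messages, oldest first
```

with an empty cursor every timestamp passes the `>` check, so the cap alone decides what comes back.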

cursor-based retrieval mechanism

the “rolling window” isn’t time-based – it’s cursor-based. here’s how it works:

// src/index.ts lines 158-163
const sinceTimestamp = lastAgentTimestamp[chatJid] || '';
const missedMessages = getMessagesSince(chatJid, sinceTimestamp, ASSISTANT_NAME);

// After processing...
// src/index.ts lines 183-184
// (excerpt: assumes missedMessages is non-empty; indexing an empty array here would throw)
lastAgentTimestamp[chatJid] = missedMessages[missedMessages.length - 1].timestamp;

flow:

Initial state: lastAgentTimestamp is the empty string '', so the query retrieves the last 200 messages from the entire history
After first run: the cursor advances to the timestamp of the last processed message
Next run: retrieves messages AFTER that timestamp (up to 200)
If < 200 new messages: gets all of them
If > 200 new messages: gets only the 200 most recent; the older ones stay in SQLite but are skipped, because the cursor then moves past them

this means:

if you send 500 messages while the bot is offline, it only sees the last 200
the cursor never goes backward
there’s no “lookback” or “re-retrieval” of older context
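the flow above can be simulated in a few lines. fetchSince and processChat are hypothetical helpers written for this post, and the guard on an empty batch is my addition, not something the excerpt shows.

```typescript
// Illustration only: an in-memory cursor loop, not nanoclaw code.
interface Msg { timestamp: string; content: string }

const lastAgentTimestamp: Record<string, string> = {};

// Stand-in for getMessagesSince: the newest `limit` messages after the cursor,
// oldest first. Assumes `log` is already in chronological order.
function fetchSince(log: Msg[], since: string, limit = 200): Msg[] {
  const newer = log.filter((m) => m.timestamp > since);
  return newer.slice(Math.max(0, newer.length - limit));
}

function processChat(chatJid: string, log: Msg[]): Msg[] {
  const since = lastAgentTimestamp[chatJid] || "";
  const missed = fetchSince(log, since);
  // Guard (an addition for this sketch): with no new messages there is no
  // last element to read, so the cursor is left where it was.
  if (missed.length > 0) {
    lastAgentTimestamp[chatJid] = missed[missed.length - 1].timestamp;
  }
  return missed;
}

const ts = (i: number): string =>
  new Date(Date.UTC(2026, 0, 1, 0, 0, i)).toISOString();
const chatLog: Msg[] = Array.from({ length: 500 }, (_, i) => ({
  timestamp: ts(i + 1),
  content: `msg ${i + 1}`,
}));

const first = processChat("group-a", chatLog);  // msgs 301..500; 1..300 are skipped
const second = processChat("group-a", chatLog); // nothing new, cursor stays put
const cursorAfterSecond = lastAgentTimestamp["group-a"];

chatLog.push({ timestamp: ts(501), content: "msg 501" });
chatLog.push({ timestamp: ts(502), content: "msg 502" });
chatLog.push({ timestamp: ts(503), content: "msg 503" });
const third = processChat("group-a", chatLog);  // only the 3 new messages
```

messages 1 to 300 are still in the log after the first run, they just never get delivered, because the cursor has already moved past them.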

accessing context beyond the window

so what happens if you reference something from message #201?

nanoclaw provides manual retrieval tools:

1. conversation folder exports

# conversations/
The `conversations/` folder contains searchable history of
past conversations. Use this to recall context from previous sessions.

source: groups/main/CLAUDE.md line 39

agents can use the Read tool to read exported conversation markdown files.

2. grep tool

agents can search message content using the Grep tool:

Grep({
  pattern: "budget discussion",
  path: "/workspace/project/conversations/",
  output_mode: "content"
})

3. direct database queries

agents can query the SQLite database directly via Bash tool:

sqlite3 /workspace/project/store/messages.db "
  SELECT timestamp, sender_name, content
  FROM messages
  WHERE chat_jid = '<group-jid>'   -- placeholder: the group's WhatsApp JID
    AND content LIKE '%budget%'
  ORDER BY timestamp DESC
  LIMIT 10;
"

key point: retrieval is manual and tool-initiated. the agent must explicitly decide to search for old context. it’s not automatic like RAG.
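the sqlite3 one-liner above pastes the search term straight into the SQL string. if the term comes from user text, a parameterized query is safer. below is a small sketch of that idea; buildSearchQuery is a hypothetical helper, not something from the nanoclaw repo, and its output is meant for a prepared-statement call in the style of db.prepare(sql).all(...params) seen earlier.

```typescript
// Hypothetical helper (not in the nanoclaw source): builds a parameterized
// version of the LIKE search above.
interface SearchQuery {
  sql: string;
  params: (string | number)[];
}

function buildSearchQuery(chatJid: string, term: string, limit = 10): SearchQuery {
  // Escape LIKE wildcards so "50%" matches the literal text,
  // not "50" followed by anything.
  const escaped = term.replace(/[\\%_]/g, (ch) => `\\${ch}`);
  const sql = `
    SELECT timestamp, sender_name, content
    FROM messages
    WHERE chat_jid = ?
      AND content LIKE ? ESCAPE '\\'
    ORDER BY timestamp DESC
    LIMIT ?
  `;
  return { sql, params: [chatJid, `%${escaped}%`, limit] };
}

// "group-a" is a made-up chat id for the example.
const q = buildSearchQuery("group-a", "50%_budget");
// q.params[1] is the pattern %50\%\_budget% (wildcards only at the ends)
```

the values travel as bound parameters, so quotes or wildcards in the search term can’t change the shape of the query.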

comparison to RAG-based systems

feature	nanoclaw	typical RAG system
storage	SQLite (relational)	Vector DB (Pinecone, Chroma, Weaviate)
retrieval	manual tool calls	automatic semantic search
context selection	chronological (last 200)	semantic similarity top-k
embeddings	none	required
search	SQL WHERE / Grep	vector similarity (cosine, euclidean)
latency	fast local SQL (no network hop)	depends on vector DB, usually 10-100ms
cost	zero (SQLite is free)	vector DB hosting + embedding API calls
complexity	low (just SQL)	medium-high (embedding pipeline, vector indexing)

why this matters:

RAG systems automatically retrieve relevant context based on semantic similarity:

user asks “what was our budget discussion?”
system embeds the query
retrieves top 5 semantically similar messages
adds them to context

nanoclaw requires the agent to explicitly search:

agent sees “what was our budget discussion?”
agent thinks “I need to search for this”
agent calls Grep or Read tool
agent adds findings to response

this is more transparent (you see the search happening) but less automatic (agent might forget to search).
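a toy example makes the difference visible. the three-number “embeddings” below are hand-written for illustration; a real RAG system would get them from an embedding model. topK and grep are stand-ins I made up, not APIs from either kind of system.

```typescript
// Toy contrast between the two retrieval styles (illustration only).
interface Doc { id: number; text: string; vec: number[] }

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// RAG-style: rank every document by similarity to the query vector, keep the top k.
function topK(docs: Doc[], query: number[], k: number): Doc[] {
  return [...docs]
    .sort((x, y) => cosine(y.vec, query) - cosine(x.vec, query))
    .slice(0, k);
}

// nanoclaw-style: a literal pattern match, run only when the agent asks for it.
function grep(docs: Doc[], pattern: RegExp): Doc[] {
  return docs.filter((d) => pattern.test(d.text));
}

const docs: Doc[] = [
  { id: 1, text: "we agreed on a budget of 5k", vec: [0.9, 0.1, 0.0] },
  { id: 2, text: "lunch at noon?", vec: [0.0, 0.2, 0.9] },
  { id: 3, text: "spending cap is too low", vec: [0.8, 0.3, 0.1] },
];

// Pretend [1, 0, 0] is the embedding of "what was our budget discussion?"
const semantic = topK(docs, [1, 0, 0], 2).map((d) => d.id); // [1, 3]
const literal = grep(docs, /budget/i).map((d) => d.id);     // [1]
```

the similarity search surfaces the “spending cap” message even though it never says “budget”; the literal search misses it unless the agent thinks to try other keywords.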

trade-offs and design decisions

why 200 messages?

likely a balance between:

context window limits: keeping token count manageable
conversation coherence: 200 messages covers most multi-turn conversations
query performance: a LIMIT 200 query stays fast even on large tables, as long as chat_jid and timestamp are indexed

why no embeddings?

embeddings add complexity:

need embedding API (OpenAI, Cohere, etc.) or local model
need vector storage and indexing
need embedding refresh on message updates
adds latency and cost

for a personal assistant handling dozens of groups, simplicity > sophistication.

why cursor-based?

alternatives:

time window (last 7 days): breaks if conversation is inactive for a week
fixed offset (messages 1000-1200): doesn’t adapt to conversation growth
cursor (since last processed): always picks up where you left off

cursor-based ensures continuity even with irregular message patterns.
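here is the time-window failure case in miniature. timeWindow and sinceCursor are simplified stand-ins written for this comparison, and the dates and cursor value are invented.

```typescript
// Illustration only: a 7-day time window versus a cursor after a quiet spell.
interface Msg { timestamp: string; content: string }

const DAY_MS = 24 * 60 * 60 * 1000;

// Time window: everything newer than (now - days).
function timeWindow(log: Msg[], now: Date, days: number): Msg[] {
  const cutoff = new Date(now.getTime() - days * DAY_MS).toISOString();
  return log.filter((m) => m.timestamp > cutoff);
}

// Cursor: everything newer than the last processed timestamp.
function sinceCursor(log: Msg[], cursor: string): Msg[] {
  return log.filter((m) => m.timestamp > cursor);
}

const pending: Msg[] = [
  { timestamp: "2026-03-01T09:00:00.000Z", content: "can you draft the agenda?" },
  { timestamp: "2026-03-01T09:05:00.000Z", content: "and book the room" },
];
const now = new Date("2026-03-15T09:00:00.000Z"); // two weeks later
const cursor = "2026-03-01T08:00:00.000Z";        // assumed: last message handled before the gap

const byTime = timeWindow(pending, now, 7);     // empty: both messages are older than 7 days
const byCursor = sinceCursor(pending, cursor);  // both messages are still picked up
```

the time window silently loses the two unanswered requests, while the cursor doesn’t care how long the gap was.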

why manual retrieval?

automatic RAG retrieval can:

add irrelevant context (semantic similarity isn’t perfect)
increase latency (every message triggers vector search)
use more tokens (retrieved chunks added to every request)

manual retrieval gives the agent control over when to pay the cost of searching.

source citations

all analysis based on nanoclaw repository source code:

database schema: src/db.ts lines 15-25
getMessagesSince function: src/db.ts lines 341-364
cursor advancement: src/index.ts lines 158-163, 183-184
conversation folder docs: groups/main/CLAUDE.md line 39
message filtering: the WHERE clause in getMessagesSince (src/db.ts) excludes is_bot_message rows and bot-prefixed content

bottom line: nanoclaw’s memory is simple by design. no embeddings, no RAG, just SQLite with a 200-message rolling window and manual tool-based search. it trades automatic semantic retrieval for simplicity, transparency, and no extra retrieval infrastructure (no vector DB, no embedding API).

for a multi-group WhatsApp assistant, that’s probably the right call.

Sid Ngeth's Blog A blog about anything (but mostly development)