Architecture Decisions

Why the site is built the way it is — every significant choice with its rationale and tradeoffs.

System at a Glance

Browser: Next.js App Router (Server + Client Components, streaming UI)
Next.js frontend: SSR + edge-friendly rendering
FastAPI backend: Async Python · security middleware chain
Postgres + pgvector: RAG vector store
Redis: Semantic cache · rate limits · sessions
OpenAI: Embeddings · chat · Realtime
Voice lane

Browser ⇄ WebRTC audio with the Realtime model via a short-lived ephemeral token; transcripts stream over a separate WebSocket.
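
The idea of a short-lived token can be sketched in a few lines. This is an illustration only, not the site's code and not the OpenAI API: in practice the ephemeral token would be issued by the Realtime provider. The names `mint_token` and `verify_token`, the HMAC signing scheme, the secret, and the 60-second lifetime are all assumptions. The point is that the browser only ever holds a credential that expires quickly, while the long-lived key stays on the backend.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"server-side-secret"  # hypothetical; never sent to the browser
TTL_SECONDS = 60                # assumed lifetime, not taken from the site


def _sign(payload: str) -> str:
    """HMAC-SHA256 signature of the payload, hex encoded."""
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def mint_token(session_id: str, now: Optional[float] = None) -> str:
    """Return 'session_id.expiry.signature', valid for TTL_SECONDS."""
    issued = time.time() if now is None else now
    payload = f"{session_id}.{int(issued) + TTL_SECONDS}"
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str, now: Optional[float] = None) -> bool:
    """Accept the token only if it is well formed, correctly signed, and unexpired."""
    try:
        session_id, expiry_str, sig = token.rsplit(".", 2)
        expiry = int(expiry_str)
    except ValueError:
        return False  # wrong number of parts or non-numeric expiry
    expected = _sign(f"{session_id}.{expiry_str}")
    current = time.time() if now is None else now
    # Compare as bytes in constant time, then check the expiry.
    return hmac.compare_digest(sig.encode(), expected.encode()) and current < expiry
```

A backend endpoint would call `mint_token` once per voice session and hand the result to the browser, which presents it when opening the audio connection.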

MCP server

Read-only tools any agent can call: search, get_timeline, get_cv.
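
A read-only tool surface like this can be sketched as a small registry. The sketch below is plain Python and does not use an MCP SDK. Only the three tool names come from this page; the in-memory data, the `call_tool` dispatcher, and the function signatures are hypothetical. Each tool returns a copy of its data, so a caller cannot mutate server state.

```python
import copy
from typing import Any, Callable, Dict, List

# Hypothetical stand-ins for the site's real content.
_DOCS = ["FastAPI backend notes", "Next.js frontend notes"]
_TIMELINE = [{"year": 2020, "event": "example entry"}]
_CV = {"name": "Example Person", "skills": ["python"]}


def search(query: str) -> List[str]:
    """Case-insensitive substring match over the example documents."""
    q = query.lower()
    return [doc for doc in _DOCS if q in doc.lower()]


def get_timeline() -> List[Dict[str, Any]]:
    """Return a deep copy so callers cannot modify the stored timeline."""
    return copy.deepcopy(_TIMELINE)


def get_cv() -> Dict[str, Any]:
    """Return a deep copy so callers cannot modify the stored CV."""
    return copy.deepcopy(_CV)


# Only read operations are registered; there is no write tool to call.
TOOLS: Dict[str, Callable[..., Any]] = {
    "search": search,
    "get_timeline": get_timeline,
    "get_cv": get_cv,
}


def call_tool(name: str, **kwargs: Any) -> Any:
    """Dispatch a tool call by name, rejecting anything not registered."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

For example, `call_tool("search", query="fastapi")` returns the matching example document, and any name outside the registry raises `KeyError`.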

Two services — a Next.js frontend and a FastAPI backend — over Postgres / pgvector, Redis, and OpenAI, with a WebRTC voice lane and a read-only MCP server.

Live System Metrics

The live cache and latency numbers come straight from the running backend; the full dashboard is on the Stats page.

Why Two Services?

The frontend (Next.js) and backend (FastAPI) are intentionally separate services, not a monolith. The primary reason is demonstration: a monolith on a serverless platform can't hold persistent connections, run background tasks, or maintain a vector similarity cache, all of which this site needs.

The split also reflects real-world backend engineering. The FastAPI service handles authentication, rate limiting, semantic caching, vector retrieval, and streaming — things that belong in a backend, not a Next.js API route.
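
To make semantic caching concrete, here is a minimal in-memory sketch. It is not the site's implementation, which uses Redis: the `SemanticCache` class, the 0.9 similarity threshold, and the toy two-dimensional vectors are assumptions, and a real system would use embeddings from a model. The idea is that a new question reuses a stored answer when its embedding is close enough to one seen before.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Stores (embedding, answer) pairs and looks up by similarity."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold  # assumed value, tune per workload
        self._entries: List[Tuple[List[float], str]] = []

    def put(self, embedding: Sequence[float], answer: str) -> None:
        self._entries.append((list(embedding), answer))

    def get(self, embedding: Sequence[float]) -> Optional[str]:
        """Return the most similar cached answer at or above the threshold."""
        best_score, best_answer = 0.0, None
        for vec, answer in self._entries:
            score = cosine(embedding, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

On a cache hit the backend can skip retrieval and the model call entirely; on a miss it computes the answer and calls `put`. The linear scan is fine for a sketch but would be replaced by an indexed vector search at any real scale.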

Next.js Frontend

Server Components for static content

Client Components for interactive features

Streaming UI for progressive enhancement

App Router for nested layouts

FastAPI Backend

Async Python with full type annotations

RAG pipeline for LLM grounding

Redis semantic cache

Security middleware chain
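
A middleware chain of the kind listed above can be sketched without any framework. This is a simplified illustration, not the site's FastAPI code: the dict-based request and response shapes, the `demo-key` credential, the `FixedWindowRateLimit` class, and its limits are all assumptions. Each middleware either returns a response early or passes the request to the next handler.

```python
import time
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]
Response = Dict[str, object]
Handler = Callable[[Request], Response]
Middleware = Callable[[Request, Handler], Response]


def require_api_key(request: Request, next_handler: Handler) -> Response:
    """Reject requests without the (hypothetical) expected key."""
    if request.get("api_key") != "demo-key":
        return {"status": 401, "body": "unauthorized"}
    return next_handler(request)


class FixedWindowRateLimit:
    """Allow at most `limit` requests per client in each time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Old windows are never pruned here; a real limiter would expire them.
        self._counts: Dict[Tuple[str, int], int] = {}

    def __call__(self, request: Request, next_handler: Handler) -> Response:
        window = int(self.clock() // self.window_seconds)
        key = (request.get("client", "anonymous"), window)
        self._counts[key] = self._counts.get(key, 0) + 1
        if self._counts[key] > self.limit:
            return {"status": 429, "body": "rate limited"}
        return next_handler(request)


def _wrap(middleware: Middleware, next_handler: Handler) -> Handler:
    def wrapped(request: Request) -> Response:
        return middleware(request, next_handler)
    return wrapped


def build_chain(middlewares: List[Middleware], endpoint: Handler) -> Handler:
    """Compose middlewares so the first in the list runs first."""
    handler = endpoint
    for middleware in reversed(middlewares):
        handler = _wrap(middleware, handler)
    return handler
```

Ordering matters: placing the key check before the rate limiter means unauthenticated requests are rejected without consuming a client's quota.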

Details of the RAG pipeline and retrieval architecture, the vector storage infrastructure, the semantic caching implementation, and the security pipeline are available to unblocked visitors.