Architecture Decisions

Why the site is built the way it is — every significant choice with its rationale and tradeoffs.

System at a Glance

Browser: Next.js App Router with Server and Client Components and streaming UI

Next.js frontend: SSR and edge-friendly rendering

FastAPI backend: async Python with a security middleware chain

Postgres + pgvector: RAG vector store

Redis: semantic cache, rate limits, sessions

OpenAI: embeddings, chat, Realtime

Voice lane: browser ⇄ WebRTC audio with the Realtime model via a short-lived ephemeral token; transcripts stream over a separate WebSocket.

MCP server: read-only tools any agent can call (search, get_timeline, get_cv).

Two services — a Next.js frontend and a FastAPI backend — over Postgres / pgvector, Redis, and OpenAI, with a WebRTC voice lane and a read-only MCP server.
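To make the "read-only tools" idea concrete, here is a minimal plain-Python sketch of a tool registry exposing search, get_timeline, and get_cv. It does not use a real MCP SDK; the sample data, the call_tool helper, and the return shapes are assumptions for illustration only. The point it shows is that every tool returns a copy of the underlying data, so a caller cannot mutate the source.

```python
from types import MappingProxyType
from typing import Any, Callable

# Hypothetical sample data; the site's real content is not shown in this document.
_TIMELINE = (
    {"year": 2023, "event": "Example role"},
    {"year": 2024, "event": "Example project"},
)
_CV = MappingProxyType({"name": "Example Person", "skills": ("Python", "TypeScript")})


def search(query: str) -> list[dict[str, Any]]:
    """Return copies of timeline entries whose text contains the query."""
    needle = query.lower()
    return [dict(entry) for entry in _TIMELINE if needle in entry["event"].lower()]


def get_timeline() -> list[dict[str, Any]]:
    """Return a copy of every timeline entry."""
    return [dict(entry) for entry in _TIMELINE]


def get_cv() -> dict[str, Any]:
    """Return a copy of the CV record."""
    return dict(_CV)


# Only read operations are registered; there is no write or delete tool.
TOOLS: dict[str, Callable[..., Any]] = {
    "search": search,
    "get_timeline": get_timeline,
    "get_cv": get_cv,
}


def call_tool(name: str, **arguments: Any) -> Any:
    """Dispatch a tool call by name (hypothetical helper, not an MCP API)."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


if __name__ == "__main__":
    print(call_tool("search", query="project"))
```

Because the registry contains only read functions and each one returns a fresh copy, an agent calling these tools has no path to change the stored data.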

Live System Metrics

The cache and latency numbers below come straight from the running backend — the full dashboard is on the Stats page.


Why Two Services?

The frontend (Next.js) and backend (FastAPI) are intentionally separate services rather than a monolith. The primary reason is to demonstrate what a monolith on a serverless platform can't provide: persistent connections, background tasks, and a vector similarity cache, all of which this site needs.

The split also reflects real-world backend engineering. The FastAPI service handles authentication, rate limiting, semantic caching, vector retrieval, and streaming — things that belong in a backend, not a Next.js API route.
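As a rough illustration of why these concerns sit comfortably in a backend, the sketch below chains an authentication check and a fixed-window rate limiter in front of an endpoint. It uses plain Python functions rather than FastAPI's own middleware API, and the request shape, the function names (require_api_key, rate_limit, build_chain), and the limits are all assumptions, not the site's actual implementation.

```python
import time
from typing import Callable

Request = dict  # hypothetical shape: {"api_key": str, "client": str}
Response = tuple[int, str]  # (status code, body)
Handler = Callable[[Request], Response]
Middleware = Callable[[Handler], Handler]


def require_api_key(valid_keys: set[str]) -> Middleware:
    """Reject requests whose api_key is not in valid_keys."""
    def wrap(next_handler: Handler) -> Handler:
        def handler(request: Request) -> Response:
            if request.get("api_key") not in valid_keys:
                return 401, "unauthorized"
            return next_handler(request)
        return handler
    return wrap


def rate_limit(max_requests: int, window_seconds: float,
               clock: Callable[[], float] = time.monotonic) -> Middleware:
    """Allow at most max_requests per client in each fixed window."""
    windows: dict[str, tuple[float, int]] = {}  # client -> (window start, count)

    def wrap(next_handler: Handler) -> Handler:
        def handler(request: Request) -> Response:
            now = clock()
            start, count = windows.get(request["client"], (now, 0))
            if now - start >= window_seconds:
                start, count = now, 0  # window expired, start a new one
            if count >= max_requests:
                return 429, "too many requests"
            windows[request["client"]] = (start, count + 1)
            return next_handler(request)
        return handler
    return wrap


def build_chain(endpoint: Handler, *middleware: Middleware) -> Handler:
    """Wrap endpoint so the first middleware listed runs first."""
    handler = endpoint
    for layer in reversed(middleware):
        handler = layer(handler)
    return handler


if __name__ == "__main__":
    app = build_chain(
        lambda request: (200, "ok"),
        require_api_key({"demo-key"}),
        rate_limit(max_requests=2, window_seconds=60.0),
    )
    print(app({"api_key": "demo-key", "client": "203.0.113.7"}))
```

The rate limiter keeps per-client state in memory between requests, which is exactly the kind of long-lived state a stateless serverless function cannot hold on its own; a shared store such as Redis plays that role across processes.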

Next.js Frontend

Server Components for static content

Client Components for interactive features

Streaming UI for progressive enhancement

App Router for nested layouts

FastAPI Backend

Async Python with full type annotations

RAG pipeline for LLM grounding

Redis semantic cache

Security middleware chain
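The semantic cache listed above can be pictured as a lookup keyed by embedding similarity rather than by exact text. The sketch below is an in-memory stand-in, not the site's Redis implementation: the SemanticCache class, the 0.9 threshold, and the toy two-dimensional vectors are assumptions, and a real deployment would obtain embeddings from an embedding model and store entries in Redis.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Return a cached answer when a new query embedding is close enough."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold  # assumed value; tune per embedding model
        self._entries: list[tuple[tuple[float, ...], str]] = []

    def put(self, embedding: Sequence[float], answer: str) -> None:
        """Store an answer under the embedding of the query that produced it."""
        self._entries.append((tuple(embedding), answer))

    def get(self, embedding: Sequence[float]) -> Optional[str]:
        """Return the most similar cached answer, or None on a cache miss."""
        best_score, best_answer = 0.0, None
        for cached, answer in self._entries:
            score = cosine_similarity(embedding, cached)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


if __name__ == "__main__":
    cache = SemanticCache(threshold=0.9)
    cache.put([1.0, 0.0], "cached answer")
    print(cache.get([0.99, 0.1]))  # near-duplicate query: served from cache
    print(cache.get([0.0, 1.0]))   # unrelated query: miss, call the model
```

A linear scan is fine for a sketch; with many entries, a vector index would replace the loop. The threshold trades hit rate against the risk of returning an answer to a question that only looks similar.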
