System Overview

Craft is a monorepo with three packages that each serve a distinct role: a FastAPI backend that owns all data and computation, an Astro frontend that renders the user interface, and a FastMCP server that exposes the same capabilities to LLM agents. This page describes how the pieces fit together and why each technology was chosen.

astrolock/
  packages/
    api/    FastAPI backend (Python, Skyfield, SQLAlchemy, asyncpg)
    web/    Astro frontend (CesiumJS, satellite.js, shadcn-ui)
    mcp/    FastMCP server (proxies to the API)
    docs/   Starlight documentation (this site)

The API is the hub. It owns the database, performs all celestial computations through Skyfield, manages TLE updates, runs background data fetchers, and controls antenna rotors via the Hamlib rotctld protocol. Every other component talks to it.
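Rotor control rides on rotctld's simple line-oriented TCP protocol. As a hedged sketch (the host, port, and function name here are illustrative, not Craft's actual client code), pointing a rotor looks roughly like this:

```python
import socket

def set_rotor_position(host, port, az_deg, el_deg, timeout=5.0):
    """Point a rotor via Hamlib's rotctld text protocol.

    "P <azimuth> <elevation>\n" sets the position; rotctld replies
    "RPRT 0" on success or "RPRT <negative code>" on failure.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(f"P {az_deg:.2f} {el_deg:.2f}\n".encode("ascii"))
        reply = sock.recv(64).decode("ascii").strip()
        return reply == "RPRT 0"
```

Because the protocol is plain text over TCP, the same exchange can be tested by hand with `rotctl` or even `telnet`.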

The Web frontend is an Astro site with React islands. It renders a CesiumJS 3D globe, satellite pass predictions, search results, and rotor control panels. For smooth 60fps satellite animation, it runs its own SGP4 propagation client-side using satellite.js rather than polling the API at video frame rate. The API remains the authority on accurate positions — satellite.js handles the visual interpolation between server updates.

The MCP server is a thin proxy. It exposes the same find, track, search, and predict capabilities as the REST API, but through the Model Context Protocol so that LLM agents (Claude, etc.) can interact with Craft programmatically. It makes HTTP calls to the API and reformats the responses for MCP clients.
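The proxy pattern can be sketched in a few lines. The endpoint path, response fields, and function names below are hypothetical stand-ins (the real MCP server uses FastMCP tool definitions), but the shape is the same: fetch JSON from the API, then flatten it into text an agent can reason about:

```python
import json
import urllib.request

ASTROLOCK_API_URL = "http://localhost:8000"  # overridable via environment

def fetch_json(path):
    # The MCP server treats the API as an opaque HTTP service.
    with urllib.request.urlopen(f"{ASTROLOCK_API_URL}{path}") as resp:
        return json.load(resp)

def format_passes_for_agent(passes):
    # Reformat API JSON into compact text for an MCP client.
    # Field names here are illustrative, not the API's actual schema.
    return "\n".join(
        f"{p['name']}: rise {p['rise']} UTC, max elevation {p['max_el']} deg"
        for p in passes
    )
```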

The monorepo structure keeps deployment synchronized — a single docker compose invocation brings up everything with matching versions. But the packages are deliberately decoupled:

  • The API has no knowledge of the frontend. It serves JSON over HTTP and WebSocket.
  • The frontend has no direct database access. It only knows the API’s URL.
  • The MCP server treats the API as an opaque HTTP service.

This means you can run the API standalone for automation, replace the frontend with a different client, or connect the MCP server to a remote Craft instance by changing a single environment variable (ASTROLOCK_API_URL).

All external traffic enters through Caddy, which handles TLS termination and path-based routing:

graph TD
    HTTPS["HTTPS Request"] --> Caddy["Caddy\n(TLS + routing)"]
    Caddy -->|"/api/*"| API["API\n(FastAPI)"]
    Caddy -->|"/ws/*"| WS["API\n(WebSocket)"]
    Caddy -->|"/docs/*"| Docs["Docs\n(Starlight)"]
    Caddy -->|"/*"| Web["Web\n(Astro)"]

Caddy inspects the request path and forwards accordingly:

  • /api/* and /ws/* go to the FastAPI backend (port 8000)
  • /docs/* goes to the Starlight documentation site (port 3000)
  • Everything else (/*) falls through to the Astro frontend (port 4321)

WebSocket connections for live tracking (/ws/tracking/{type}/{id}) and rotor status (/ws/rotor/{id}) receive special proxy configuration — Caddy disables response buffering and extends timeouts so these long-lived connections survive without being closed for apparent inactivity.
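In Caddyfile terms (Craft actually expresses this through caddy-docker-proxy labels, and the hostname here is a placeholder), the routing and streaming configuration looks roughly like:

```
craft.example.com {
    handle /ws/* {
        reverse_proxy api:8000 {
            flush_interval -1   # disable response buffering for long-lived streams
        }
    }
    handle /api/* {
        reverse_proxy api:8000
    }
    handle /docs/* {
        reverse_proxy docs:3000
    }
    handle {
        reverse_proxy web:4321  # everything else falls through to the frontend
    }
}
```

Caddy upgrades WebSocket connections automatically; only the buffering and timeout behavior needs explicit configuration.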

Data flows into Craft from several external sources, gets stored in PostgreSQL, and is enriched with vector embeddings for semantic search.

graph LR
    CT["CelesTrak TLEs"] --> API
    ST["Space-Track"] --> API
    LL["Launch Library 2"] --> API
    MPC["MPC Comets"] --> API
    NOAA["NOAA Space Weather"] --> API
    SH["SondeHub"] --> API
    OM["Open-Meteo"] --> API
    ADSB["OpenSky ADS-B"] --> API
    API --> DB["PostgreSQL"]
    DB --> VW["pgai vectorizer\nworker"]
    VW --> ES["Embedding Stores"]
    VW --> GPU["GPU Gateway\n(mxbai-embed-large)"]
    GPU --> VW
    ES --> DB

The API runs background fetcher tasks on configurable intervals. On startup, the FastAPI lifespan handler launches each fetcher:

  • Space weather (NOAA SWPC): solar indices, K-index, aurora data — every 15 minutes
  • Comets (Minor Planet Center): orbital elements — daily
  • Launches (Launch Library 2): upcoming and recent launches — every 15 minutes
  • Reentry predictions: decay forecasts — hourly
  • Space events: conjunction warnings, near-Earth objects, astronaut data — hourly
  • Radiosondes (SondeHub): weather balloon positions — every 10 minutes
  • Atmospheric profiles (Open-Meteo): seeing and propagation conditions — every 30 minutes
  • Aircraft (OpenSky ADS-B): nearby aircraft positions — every 10 minutes

TLE updates are triggered on demand through the API (seed script or manual refresh) rather than on a timer, since CelesTrak rate-limits aggressively.
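The fetcher pattern itself is a small asyncio loop per source. The helper names below are illustrative (in Craft these tasks are launched from the FastAPI lifespan handler), but the structure is this:

```python
import asyncio

async def run_periodic(name, interval_s, fetch):
    # One background task per data source: fetch, log failures, sleep, repeat.
    while True:
        try:
            await fetch()
        except Exception as exc:  # a failing source must not kill the loop
            print(f"{name} fetcher failed: {exc}")
        await asyncio.sleep(interval_s)

async def start_fetchers(fetchers):
    # fetchers: list of (name, interval_seconds, async fetch function)
    return [asyncio.create_task(run_periodic(n, i, f)) for n, i, f in fetchers]
```

Cancelling the returned tasks on shutdown (as the lifespan handler's exit path would) stops all fetchers cleanly.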

When satellite or celestial object records are created or updated, the API builds a search_text column that concatenates the object’s name, designator, group memberships, frequencies, and description into a single text blob. This denormalized column is what the pgai vectorizer reads.
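The concatenation step is straightforward. A minimal sketch, assuming a dict-shaped record (the real API works with SQLAlchemy models, and the exact field names may differ):

```python
def build_search_text(obj):
    """Concatenate searchable fields into one denormalized text blob."""
    parts = [
        obj.get("name"),
        obj.get("designator"),
        " ".join(obj.get("groups", [])),
        " ".join(str(f) for f in obj.get("frequencies", [])),
        obj.get("description"),
    ]
    return " ".join(p for p in parts if p)  # skip empty fields
```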

The vectorizer worker polls the database every 5 seconds, finds rows with new or changed search_text, sends them to the GPU embedding gateway (running mxbai-embed-large with 1024 dimensions), and stores the resulting vectors in dedicated embedding tables. This process runs entirely outside the API — the worker is a separate container that talks directly to PostgreSQL and the GPU endpoint.
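The change-detection pattern the worker follows can be mocked with plain Python. This is a sketch of the polling logic only, with in-memory rows standing in for PostgreSQL and callables standing in for the GPU gateway and embedding tables; pgai's actual bookkeeping differs in detail:

```python
def poll_once(rows, embed, store):
    """One polling pass: embed rows whose search_text changed since last run.

    rows: dicts with "id", "search_text", and "embedded_text" (the last
    value that was embedded, absent for new rows).
    """
    pending = [r for r in rows if r["search_text"] != r.get("embedded_text")]
    for row in pending:
        vector = embed(row["search_text"])   # GPU gateway: mxbai-embed-large, 1024-d
        store(row["id"], vector)             # write to the embedding table
        row["embedded_text"] = row["search_text"]
    return len(pending)
```

Running this every 5 seconds converges: once a row's text is embedded, later passes skip it until the text changes again.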

Skyfield for celestial computation. Skyfield implements the full IAU 2000/2006 precession-nutation model and handles Earth satellite propagation, planetary ephemerides, star positions, and comet orbits through a single consistent API. The alternative was to use multiple libraries (sgp4 for satellites, astropy for everything else), but Skyfield’s unified observe() -> apparent() -> altaz() pipeline means every target type — from the ISS to Comet ATLAS to Vega — flows through identical code. This consistency matters for a system that needs to point a physical antenna at any of these objects.

satellite.js for client-side animation. The CesiumJS globe needs to animate satellite positions at 60 frames per second. Polling the API at that rate would be absurd. Instead, the frontend loads TLE data once and runs SGP4 propagation locally in JavaScript. The positions are close enough for smooth visual animation, even though they drift slightly from Skyfield’s more rigorous computation. The API corrects for this drift through its 1 Hz WebSocket updates.


pgvector + pgai for semantic search. Traditional text search (ILIKE with trigram indexes) works well when users know exactly what they are looking for — “ISS” or “NORAD 25544.” But amateur radio operators also search for things like “2m FM repeater satellites” or “weather imagery downlink,” which are conceptual queries that do not match any specific column value. Vector embeddings encode meaning, so a search for “amateur radio” surfaces the ISS (which has ham radio frequencies) even though the word “amateur” does not appear in its name. The hybrid search combines both approaches: semantic results are merged with text matches, and objects found by both methods get a score boost.
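The merge step can be sketched as follows. The scoring scheme (best-of-both plus a flat boost) is illustrative rather than Craft's exact formula:

```python
def hybrid_merge(semantic, text, boost=0.15):
    """Merge semantic and text search results (dicts of id -> score in [0, 1]).

    Objects found by both methods keep their best score plus a boost,
    so double hits rank above single-method matches.
    """
    merged = dict(semantic)
    for oid, score in text.items():
        if oid in merged:
            merged[oid] = max(merged[oid], score) + boost
        else:
            merged[oid] = score
    return sorted(merged, key=merged.get, reverse=True)
```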

PostgreSQL 17 with TimescaleDB HA for the database. The TimescaleDB HA image bundles pgvector and several other extensions in a single container, which avoids the complexity of building a custom PostgreSQL image with multiple extensions compiled in. TimescaleDB’s time-series capabilities are not heavily used yet, but the image’s extension ecosystem and operational maturity made it the pragmatic choice.

Caddy for the reverse proxy. Caddy handles TLS certificate issuance automatically via ACME DNS challenges, which means new subdomains get HTTPS without any manual certificate management. The caddy-docker-proxy integration reads routing configuration from Docker labels on each service container, so there is no separate Caddyfile to maintain.

FastMCP for the MCP server. The Model Context Protocol lets LLM agents call Craft’s capabilities as structured tool invocations. FastMCP wraps the HTTP proxy calls with proper type annotations and descriptions, so an agent can ask “what satellites are above the horizon right now?” and get a structured response it can reason about.