Archive Methodology

Half of the official URLs for the documents in this archive no longer work. DTIC returns 403 Forbidden. NGA redirects to HTML pages with no PDF. CelesTrak times out. The Internet Archive and NASA ADS are doing the heavy lifting of preserving access to foundational government research.

This page documents how we built the archive and how to add to it.

Why Local Archival

These documents are U.S. Government works (17 USC 105) — not subject to copyright. They were produced by employees of the Aerospace Defense Command, Defense Mapping Agency, and similar organizations. They are meant to be publicly available. But “meant to be” and “actually accessible” diverge over time:

Document	Official Source	Status (Feb 2026)	How We Got It
WGS-72 (Seppelin, 1974)	DTIC	403 Forbidden	Internet Archive mirror
WGS-84 (TR8350.2, 2000)	NGA	HTML redirect, no PDF	GIS-Lab mirror
Kelso (2007)	CelesTrak	Connection timeout	Wayback Machine cache
STR#1 (Hujsak, 1979)	DTIC	403 Forbidden	Internet Archive mirror
STR#2 (Lane & Hoots, 1979)	DTIC	403 Forbidden	Internet Archive mirror
Brouwer (1959)	NASA ADS	504 timeout on direct PDF	ADS article query CGI
Tapley (1975)	DTIC	403 Forbidden	Internet Archive

Server migrations, URL restructuring, access policy changes, and simple neglect erode availability. For documents this foundational — the theoretical basis for all TLE-based satellite tracking worldwide — local archival is the most reliable long-term strategy.

The Extraction Pipeline

Each document goes through eight phases, from acquisition to published page.

1. Acquire the PDF

Sources are searched in this order:

DTIC (apps.dtic.mil/sti/citations/{accession}) — often blocked, but has the canonical accession numbers
Internet Archive (archive.org/details/DTIC_{accession}) — mirrors of DTIC holdings
Wayback Machine — cached copies of original URLs that no longer resolve
NASA ADS — for academic journal papers (Brouwer, Lyddane, Hoots)
Institutional mirrors — universities, GIS labs, foreign archives
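The search order above can be sketched as a small helper that builds candidate URLs for one report. This is a minimal sketch, not the archive's actual tooling: the function name, the `https://` scheme, and the placeholder accession number are assumptions, and only the DTIC and Internet Archive path patterns come from the list above. NASA ADS and institutional mirrors are left out because they have no single URL pattern.

```python
from typing import List, Optional, Tuple


def candidate_urls(accession: str, original_url: Optional[str] = None) -> List[Tuple[str, str]]:
    """Return (source, url) pairs in the order the sources are tried.

    `accession` is a DTIC accession number. `original_url` is the dead
    official URL, if one is known, used for the Wayback Machine lookup.
    """
    candidates = [
        ("DTIC", f"https://apps.dtic.mil/sti/citations/{accession}"),
        ("Internet Archive", f"https://archive.org/details/DTIC_{accession}"),
    ]
    if original_url:
        # web.archive.org/web/<url> redirects to the most recent snapshot, if any.
        candidates.append(("Wayback Machine", f"https://web.archive.org/web/{original_url}"))
    return candidates


if __name__ == "__main__":
    # "AD0000000" is a placeholder, not a real accession number.
    for source, url in candidate_urls("AD0000000", "http://example.com/report.pdf"):
        print(f"{source}: {url}")
```

The helper only builds URLs. Fetching is left to the caller, since the failure modes differ by source (403 responses, HTML redirects, timeouts).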

2. Archive the Original

The unmodified PDF is stored in docs/source/{author-year}/ alongside extraction notes. The PDF is committed to version control for permanence. These documents have a way of disappearing.
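One way to make "unmodified" checkable later is to record a checksum next to the PDF. The archive does not describe doing this. The sketch below is an assumption about how it could be done, and the `.sha256` sidecar file name is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(pdf: Path) -> Path:
    """Write `<name>.pdf.sha256` beside the PDF, in `sha256sum` line format."""
    out = pdf.with_name(pdf.name + ".sha256")
    out.write_text(f"{sha256_of(pdf)}  {pdf.name}\n", encoding="utf-8")
    return out
```

Because the sidecar uses the `sha256sum` line format, running `sha256sum -c` in the document's directory can confirm that the committed PDF still matches what was originally downloaded.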

3. Read and Analyze

The document is read in full, and structured extraction notes are written alongside it. This is the analytical layer that the PDF itself doesn’t provide. The notes capture:

Full citation with DTIC accession number and grant information
Relationship to the SGP4 lineage — how does this paper connect to the chain from Brouwer (1959) through to modern implementations?
Theory summary with key equations
Equation-to-code mappings — paper symbols mapped to FORTRAN variable names in the STR#3 source code (where applicable)
Key findings and their practical implications
Cross-references to other documents in the archive
Notable bibliography entries — what does this paper cite that we have or are missing?
Source quality notes — scan legibility, equation ambiguities, known errata

The notes file is the connective tissue that turns a PDF collection into a traceable intellectual lineage. Without it, you have twelve separate documents. With it, you can follow an equation from Brouwer’s 1959 Section 9 through Lyddane’s 1963 singularity fix, through Lane & Hoots’ 1979 drag additions, to a specific line in sgp4.f.
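To keep notes files comparable across documents, the eight items listed above can be turned into a skeleton. This is a sketch only: the section titles paraphrase that list, and the Markdown heading layout is an assumption rather than the archive's actual notes format.

```python
# Section titles paraphrase the eight items the notes are meant to capture.
NOTE_SECTIONS = [
    "Citation",
    "Relationship to the SGP4 lineage",
    "Theory summary",
    "Equation-to-code mappings",
    "Key findings",
    "Cross-references",
    "Notable bibliography entries",
    "Source quality notes",
]


def notes_skeleton(title: str) -> str:
    """Return an empty notes file with one heading per required section."""
    lines = [f"# {title}", ""]
    for section in NOTE_SECTIONS:
        lines += [f"## {section}", "", "TODO", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # The title is a placeholder, not a document in the archive.
    print(notes_skeleton("Example Report (1979)"))
```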

4. Create the Documentation Page

Each document gets a Starlight page in the sgp4-theory/foundations/ directory. Pages follow a consistent structure:

Source citation in a callout
Context first — “how does this relate to SGP4?” before diving into technical content
Key technical content — equations, tables, diagrams
Practical warnings — datum mixing, sign conventions, accuracy limits
Cross-links — CardGrid navigation to related pages in the archive

Equations are rendered with KaTeX. Diagrams use Mermaid. Cross-references use Starlight’s LinkCard component for navigable connections between documents.

5. Update the Inventory

Two inventory files track the complete collection:

The archive overview document inventory table (the live site)
The COLLECTION.md source inventory (the raw archive)

Both get a new row with directory name, citation, year, page count, and significance.

6. Build and Verify

The Starlight site must build with zero errors. KaTeX equations must render. Mermaid diagrams must process. Pagefind must index the new page for search.

7. Cross-Link Audit

Every href in the new page is verified against actual file paths. Common failure modes:

Numeric prefix mismatch — foundation pages use {NN}-{slug}.mdx filenames, so the URL must include the prefix (e.g., /docs/sgp4-theory/foundations/03-str-1/, not just str-1/)
Year discrepancy — some papers are dated differently than their publication year (Hoots’ 1981 paper is sometimes called “Hoots 1980” from the submission date)
Mermaid click targets — click directives in Mermaid diagrams need full paths with the /docs/ prefix
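The audit can be partly automated. The sketch below assumes that a URL under `/docs/` maps onto a file under the content root (`docs/src/content/docs/` in this repository) with the same relative path and an `.mdx` or `.md` extension. The function names and the link-matching regular expression are illustrative, not the project's actual tooling.

```python
import re
from pathlib import Path
from typing import List, Optional, Tuple

# Root-relative targets in href="...", Markdown ](...) links, and Mermaid click "..." directives.
LINK_RE = re.compile(r'["(](/\w[^")#?\s]*)')


def resolve(href: str, content_root: Path) -> Optional[Path]:
    """Map a /docs/... URL to its source file, or None if no file matches."""
    rel = href[len("/docs/"):].strip("/")
    stems = [rel, f"{rel}/index"] if rel else ["index"]
    for stem in stems:
        for ext in (".mdx", ".md"):
            candidate = content_root / f"{stem}{ext}"
            if candidate.is_file():
                return candidate
    return None


def audit(page_text: str, content_root: Path) -> List[Tuple[str, str]]:
    """Return (href, problem) pairs for links that fail either check."""
    problems = []
    for href in sorted(set(LINK_RE.findall(page_text))):
        if not href.startswith("/docs/"):
            problems.append((href, "missing /docs/ prefix"))
        elif resolve(href, content_root) is None:
            problems.append((href, "no matching file"))
    return problems
```

A link to `/docs/sgp4-theory/foundations/str-1/` is reported as "no matching file" when the file on disk is `03-str-1.mdx`, which is the numeric-prefix mismatch described above. Year discrepancies still need a human check, since a wrong year in link text does not break the path.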

8. Commit

One commit per document, with a message that captures what the paper covers and why it matters for the lineage.

The Transcription Problem

A recurring theme across this archive: there are no authoritative digital source files for the foundational SGP4 documents. The STR#3 FORTRAN was distributed only as a printed report. Every digital copy was produced by hand-typing, OCR, or PDF text extraction — all methods vulnerable to errors.

For FORTRAN IV specifically, this is catastrophic. The language uses a fixed-format column layout where a single misplaced space silently changes meaning:

Columns 1-5: statement label
Column 6: continuation marker
Columns 7-72: source code
Columns 73-80: ignored (sequence numbers)

A character that spills from column 72 into column 73 becomes invisible to the compiler. An OCR error that shifts a line by one column can change a DO loop into an assignment statement. These errors produce code that compiles without warnings but gives wrong results.
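The column layout can be checked mechanically. The sketch below splits a line into the four fields and flags two symptoms of a bad transcription: an unexpected character in column 6 and code in columns 73-80. The names are illustrative, comment lines (a `C` in column 1) and tabs are not handled, and the spill check is a heuristic that assumes sequence fields are blank or purely numeric, which is not true of every card deck.

```python
from typing import NamedTuple


class Card(NamedTuple):
    label: str         # columns 1-5
    continuation: str  # column 6
    statement: str     # columns 7-72
    sequence: str      # columns 73-80, ignored by the compiler


def split_card(line: str) -> Card:
    """Split one fixed-form source line into its four column fields."""
    padded = line.rstrip("\r\n").ljust(80)
    return Card(padded[0:5], padded[5], padded[6:72], padded[72:80])


def is_continuation(card: Card) -> bool:
    """Any character other than blank or zero in column 6 marks a continuation."""
    return card.continuation not in (" ", "0")


def spills_past_72(card: Card) -> bool:
    """Heuristic: non-numeric text in columns 73-80 is probably truncated code."""
    seq = card.sequence.strip()
    return bool(seq) and not seq.isdigit()


if __name__ == "__main__":
    good = split_card("      X = A + B")   # X sits in column 7
    shifted = split_card("     X = A + B")  # same line shifted left by one column
    print(is_continuation(good), is_continuation(shifted))
```

In the shifted line the `X` lands in column 6, so a compiler would treat the line as a continuation of the previous statement instead of a new assignment.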

This transcription fragility is a root cause of the 25 years of SGP4 implementation divergence that Vallado’s Rev-1 paper documented. Our own 2026 extraction reproduced the exact same class of error. See the STR#3 extraction notes for the full analysis.

Adding a New Document

If you have access to a paper in the SGP4 lineage that isn’t in this archive, the pipeline is:

Verify it’s a government work or otherwise freely distributable
Create docs/source/{author-year}/ with the PDF
Write the NOTES.md extraction notes following the structure above
Create the Starlight page in docs/src/content/docs/sgp4-theory/foundations/
Update both inventory files
Build, audit links, commit

The documents we haven’t been able to obtain:

Document	Why
Lane & Cranford (1969), AIAA 69-925	Behind AIAA paywall (not a government work)
Crawford (1995), Kepler’s equation fix	Unpublished technical note, likely lost
Vallado, “Fundamentals of Astrodynamics” (2013)	Copyrighted commercial publication

If you can locate a freely distributable copy of any of these, it would fill a gap in the archive.

SGP4 Theory Overview: The full document lineage, inventory, and cross-document insights

STR#3 Extraction Notes: A case study in the transcription problem, covering our 2026 encounter with column-shift errors