
Open-data capability demonstration · Heritage & Culture

From Digitised to Computable: an Open Standard for Aerial Photography Heritage

Most of the UK's heritage is digitised but not computable. It is scanned, catalogued and on a map, yet a researcher still cannot query it at scale. This is a worked, fully open demonstration of how we close that gap for one collection type, aerial photography, tested against real national archives in the UK, Canada and the United States.

Real frames: 292, harvested live from the public catalogue

Substrate present: 100% already carry a footprint and an ISO date

Validation: 0 SHACL violations after lift to Baseline

The Challenge

For twenty years, heritage funding has been measured in things scanned. By that metric the work is a success. By the metric that now matters it is not finished: a photograph on an interactive map is legible to a human with a browser but not to a machine. A researcher cannot ask a collection of thirty million aerial images to return every frame over a given city between 1943 and 1946, run an image-similarity search, or cross-reference a reconnaissance sortie to another archive, without clicking through by hand. This is the gap the Towards a National Collection programme (AHRC / UKRI) and its N-RICH work set out to close, and the gap we address for aerial photography specifically.

We built NAPH, a computation-ready digitisation standard for aerial photography heritage. It is synthesis, not invention: it is assembled entirely from existing standards (GeoSPARQL, PROV-O, Dublin Core, IIIF, Records in Contexts), and it defines three tiers (Baseline, Enhanced, Aspirational) that an institution can adopt incrementally without rebuilding what it already has.

We measured a real collection, not a hypothetical one

We took the National Collection of Aerial Photography (NCAP), part of Historic Environment Scotland and one of the world's largest aerial archives at over thirty million images, and measured a sample of 292 real records from its public catalogue. The harvest was metadata-only, read-only, rate-limited and made in good faith. The result is a compliment to the collection: the hard part is already done.

World choropleth of NCAP's georeferenced frames by country, on a log scale. The United Kingdom, France and Germany are densest; the reach also covers Iran, China, South Africa, Nigeria, the Caribbean, Australia and more, spanning five continents.
The georeferenced, mapped subset of NCAP already reaches dozens of countries across multiple continents (log scale; densest countries are cluster-capped, so shown as lower bounds). Our 292-record sample is a small, deliberately stratified slice of this index.
# | Field | What the real data shows
1 | Machine-readable footprint | A WKT polygon is present for 100% of records, in EPSG:3857. Reaching a geographic CRS is a reprojection, not new data.
2 | ISO-8601 capture date | Every record already carries an ISO-8601 date, with a self-declared precision flag distinguishing day-level from year-level dates.
3 | Stable identifier | A stable unique identifier is present for 100% of records, ready to be minted into a resolvable URI.
4 | Archival reference | An ISAD(G) archival reference is present for 86% of records, anchoring each frame to its finding aid.
5 | Machine-readable rights | Absent from the payload. This is the one genuine baseline gap, and the one that matters most for a collection that licenses its imagery.
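
The percentages in the table above are field-coverage counts, and that kind of measurement is simple to reproduce. The following is a minimal sketch, not the harvester's own code: the key names in `BASELINE_FIELDS` and the records in `SAMPLE` are invented for illustration and do not come from the NCAP payload.

```python
from typing import Dict, Iterable, Mapping, Optional

# Hypothetical key names for the five fields in the table above; a real
# catalogue payload will almost certainly name them differently.
BASELINE_FIELDS = ("footprint_wkt", "capture_date", "identifier",
                   "archival_reference", "rights")


def field_coverage(records: Iterable[Mapping[str, Optional[str]]]) -> Dict[str, float]:
    """Return, per field, the percentage of records with a non-empty value."""
    rows = list(records)
    if not rows:
        return {name: 0.0 for name in BASELINE_FIELDS}
    coverage = {}
    for name in BASELINE_FIELDS:
        present = sum(1 for row in rows if row.get(name) not in (None, ""))
        coverage[name] = 100.0 * present / len(rows)
    return coverage


# Invented records for illustration only; these are not NCAP data.
SAMPLE = [
    {"footprint_wkt": "POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))",
     "capture_date": "1943-06-14", "identifier": "frame-001",
     "archival_reference": "REF/1/1", "rights": None},
    {"footprint_wkt": "POLYGON((5 5, 15 5, 15 15, 5 15, 5 5))",
     "capture_date": "1924", "identifier": "frame-002",
     "archival_reference": "", "rights": None},
]

if __name__ == "__main__":
    for name, pct in field_coverage(SAMPLE).items():
        print(f"{name}: {pct:.0f}%")
```

Counting empty strings as missing is a design choice in this sketch; a catalogue that uses placeholder values such as "unknown" would need those treated as missing too.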

292 real frames, on one map

Once the footprints are reprojected and the records are linked data, the collection becomes something you can see and query. Every frame below is a real record, positioned by its own footprint and coloured by decade, from a 1924 Royal Navy sortie over Hong Kong to post-war surveys across the Caribbean. Click any footprint in the live demo and its full NAPH metadata appears.

Interactive map of 292 real aerial photography frames from the NCAP collection, positioned by reprojected WGS84 footprint and coloured by decade, spanning 1924 to 1956 across the globe.
292 real frames, harvested live and auto-lifted to the Baseline tier, queryable by space and time. Markers coloured by decade; the panel shows one frame's full linked-data record.
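
The reprojection step is closed-form, because EPSG:3857 is a spherical Mercator projection with a known inverse. The sketch below shows that inverse applied to a WKT polygon, assuming single-ring polygons only; the function names are illustrative, and in practice a projection library such as pyproj would usually do this work.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def mercator_to_wgs84(x: float, y: float) -> tuple:
    """Invert spherical Web Mercator: metres in, (lon, lat) in degrees out."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


def reproject_polygon_wkt(wkt: str) -> str:
    """Reproject a single-ring WKT POLYGON from EPSG:3857 to WGS84.

    Holes and multipolygons are deliberately not handled in this sketch.
    """
    text = wkt.strip()
    if (not text.upper().startswith("POLYGON")
            or "((" not in text or not text.endswith("))")):
        raise ValueError("expected a single-ring POLYGON")
    body = text[text.index("((") + 2:-2]
    if "(" in body or ")" in body:
        raise ValueError("interior rings are not handled in this sketch")
    points = []
    for pair in body.split(","):
        x_str, y_str = pair.split()
        lon, lat = mercator_to_wgs84(float(x_str), float(y_str))
        points.append(f"{lon:.6f} {lat:.6f}")
    return "POLYGON((" + ", ".join(points) + "))"
```

Six decimal places of a degree is roughly a tenth of a metre at the equator, which is far finer than the positional accuracy of a historic frame footprint.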

Zoom to a single sortie and the value of computable footprints becomes obvious. Below are the frames of a 1924 Royal Navy reconnaissance run over Hong Kong. Reprojected and overlapped, the individual footprints trace the aircraft's flight line across Victoria Harbour, a hundred-year-old survey you can now select, query and cross-reference frame by frame.

Map of a 1924 Royal Navy aerial reconnaissance sortie over Hong Kong, its individual frame footprints overlapping to trace the flight line across Victoria Harbour and Kowloon, each a queryable NAPH record.
One sortie, frame by frame: a 1924 reconnaissance run over Hong Kong, its footprints reprojected to WGS84 and each linked to its full catalogue record.
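
Selecting frames by space and time can be illustrated without any triple store. The following is a hedged sketch rather than the demo's query engine: it uses bounding-box overlap as a coarse stand-in for a true polygon intersection (the kind GeoSPARQL would provide), and the `Frame` type, its fields and the example coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


@dataclass(frozen=True)
class Frame:
    frame_id: str
    bbox: BBox  # WGS84 bounding box of the footprint
    year: int   # capture year


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(frames: Sequence[Frame], area: BBox,
          first_year: int, last_year: int) -> List[Frame]:
    """Return frames whose box overlaps `area` within the inclusive year range."""
    return [f for f in frames
            if first_year <= f.year <= last_year and bboxes_intersect(f.bbox, area)]


# Invented frames for illustration only; coordinates are approximate.
FRAMES = [
    Frame("hk-001", (114.15, 22.28, 114.18, 22.30), 1924),
    Frame("hk-002", (114.17, 22.28, 114.20, 22.30), 1924),
    Frame("other-001", (-61.60, 10.60, -61.50, 10.70), 1947),
]

if __name__ == "__main__":
    harbour = (114.14, 22.27, 114.21, 22.31)
    print([f.frame_id for f in query(FRAMES, harbour, 1920, 1930)])
```

A bounding-box test can return frames whose actual footprint misses the area, so it is best treated as a prefilter ahead of an exact geometry check.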

One standard, three national collections, three continents

A standard tested against a single archive risks being quietly shaped around it. So we put the same question to two more national collections on two more continents, and changed nothing but the thin harvester that reads each catalogue. The ontology, the SHACL shapes and the RiC-O to STAC crosswalk stayed identical. All three lift to the same NAPH Baseline at zero SHACL violations, and the interesting result is that each collection is missing a different Baseline piece, which is exactly what a shared standard exists to normalise.

World map showing three national aerial photography collections harvested live and lifted to one NAPH Baseline: NCAP (United Kingdom) in teal, NAPL (Canada) in red, and WHAIFinder (United States, Wisconsin) in amber. Same ontology, shapes and crosswalk across all three; only the harvester differs.
Three national collections, one standard: NCAP (UK), NAPL (Canada) and WHAIFinder (USA), each harvested live from public endpoints and validated against the identical NAPH Baseline at zero SHACL violations.
Collection | Real records tested | The one Baseline piece it lacks | Single closing transform
NCAP, United Kingdom (Historic Environment Scotland) | 292 frames (1924–1956), frame-level | Machine-readable rights (0%) | Reproject footprint EPSG:3857 to WGS84
NAPL, Canada (Natural Resources Canada) | 40 dated mosaics across 8 regions (1932–2004) | Frame-level granularity (publishes regional mosaics) | None for geometry: footprints are native WGS84, rights already present (OGL-Canada)
WHAIFinder, United States (UW-Madison / Wisconsin SCO) | 225 frames (1937–1967, public-domain USDA) | Polygon geometry (publishes a centerpoint, not an area) | Reconstruct footprint closed-form from centerpoint plus map scale

The UK collection has frame-level detail but no machine-readable rights; Canada's open subset has rights and native WGS84 but publishes mosaics; the US index has rights and frames but only a centerpoint. Three collections, three different gaps, one unchanged standard that makes each gap precise and automatable instead of leaving every archive to describe itself in its own vocabulary. That is the portability claim demonstrated, not asserted.
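
Reconstructing a footprint from a centerpoint and a map scale comes down to one multiplication: ground side length equals film side length times the scale denominator. The sketch below is one way that closing transform could look, under loud assumptions that are not taken from the source: a square, north-aligned, vertical frame, a 9-inch (0.2286 m) negative, and a spherical metres-per-degree approximation.

```python
import math
from typing import List, Tuple

METRES_PER_DEGREE_LAT = 111_320.0  # spherical approximation, an assumption


def footprint_from_center(lon: float, lat: float, scale_denominator: float,
                          film_side_m: float = 0.2286) -> List[Tuple[float, float]]:
    """Return a closed ring of (lon, lat) corners around a frame centerpoint.

    Assumes a square, north-aligned frame; film_side_m defaults to a 9-inch
    negative purely for illustration.
    """
    if not -90.0 < lat < 90.0:
        raise ValueError("latitude must be strictly between -90 and 90")
    half_side_m = film_side_m * scale_denominator / 2.0
    dlat = half_side_m / METRES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of latitude.
    dlon = half_side_m / (METRES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return [
        (lon - dlon, lat - dlat),  # south-west
        (lon + dlon, lat - dlat),  # south-east
        (lon + dlon, lat + dlat),  # north-east
        (lon - dlon, lat + dlat),  # north-west
        (lon - dlon, lat - dlat),  # close the ring
    ]


def ring_to_wkt(ring: List[Tuple[float, float]]) -> str:
    """Serialise a closed ring as a WKT POLYGON."""
    return "POLYGON((" + ", ".join(f"{x:.6f} {y:.6f}" for x, y in ring) + "))"
```

At a scale of 1:20,000 these assumptions give a ground side of 0.2286 × 20,000 = 4,572 m. Real frames are rarely exactly north-aligned or perfectly vertical, so a footprint built this way is an estimate and is worth flagging as derived in its provenance.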

The expert ground: binding two worlds that never met

The archival community describes these collections with Records in Contexts and PROV-O, capturing custody and provenance but not spatial computability. The geospatial community indexes imagery with STAC and GeoSPARQL, delivering search by space and time but no archival provenance. Nobody had crosswalked the two for historic aerial photography. NAPH publishes that bridge, wrapped in FAIR and CARE. Every term mapped to Records in Contexts was verified against the published RiC-O 1.1 ontology rather than assumed.
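
To make the geospatial side of that bridge concrete, the sketch below builds a minimal STAC Item from one frame's fields. It is an illustration under assumptions, not the published crosswalk: the function name and arguments are hypothetical, the `naph:archival_reference` property is an invented placeholder for wherever the archival reference actually lands, and mapping a day-level date to midnight UTC is a simplification.

```python
import json
from typing import Any, Dict, Sequence


def to_stac_item(identifier: str, ring: Sequence[Sequence[float]],
                 capture_date: str, archival_reference: str) -> Dict[str, Any]:
    """Build a minimal STAC Item dict from one frame's fields.

    ring is a closed list of [lon, lat] pairs in WGS84; capture_date is a
    day-level ISO date (YYYY-MM-DD).
    """
    if len(ring) < 4 or list(ring[0]) != list(ring[-1]):
        raise ValueError("ring must be closed and have at least four positions")
    lons = [p[0] for p in ring]
    lats = [p[1] for p in ring]
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": identifier,
        "geometry": {"type": "Polygon",
                     "coordinates": [[list(p) for p in ring]]},
        "bbox": [min(lons), min(lats), max(lons), max(lats)],
        "properties": {
            # STAC expects an RFC 3339 timestamp; midnight UTC is assumed here.
            "datetime": f"{capture_date}T00:00:00Z",
            # Hypothetical property carrying the archival side of the record.
            "naph:archival_reference": archival_reference,
        },
        "links": [],
        "assets": {},
    }


if __name__ == "__main__":
    # Invented values for illustration only.
    ring = [[114.15, 22.28], [114.19, 22.28], [114.19, 22.31],
            [114.15, 22.31], [114.15, 22.28]]
    print(json.dumps(to_stac_item("frame-001", ring, "1924-05-01", "REF/1/1"),
                     indent=2))
```

Because a STAC Item is also a valid GeoJSON Feature, the same dictionary can be opened directly in GeoJSON-aware tools.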

Outcome

We harvested 292 real frames spanning 1924 to 1956, from Hong Kong to the Caribbean, reprojected their footprints to WGS84, and validated them against the standard at zero SHACL violations. Testing against real holdings also earned its keep: year-only dates in the sample exposed a defect in the standard's own date-precision shape, which we then corrected. The same records export cleanly to a STAC 1.0 catalogue, to GeoJSON and to IIIF, so a collection adopting the standard gains the entire geospatial and viewer ecosystem without bespoke integration.
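
Year-only dates are easy to mishandle, so a precision check is worth showing. The sketch below is not the corrected SHACL shape. It is a plain-Python stand-in that assumes precision is visible in the date's lexical form and that the flag takes the hypothetical values "year", "month" and "day".

```python
import re
from datetime import date
from typing import Optional

_YEAR = re.compile(r"\d{4}")
_MONTH = re.compile(r"\d{4}-(0[1-9]|1[0-2])")
_DAY = re.compile(r"\d{4}-\d{2}-\d{2}")


def infer_precision(value: str) -> Optional[str]:
    """Infer 'year', 'month' or 'day' from an ISO-8601 date string.

    Returns None when the string is not a valid date at any of those levels.
    """
    if _YEAR.fullmatch(value):
        return "year"
    if _MONTH.fullmatch(value):
        return "month"
    if _DAY.fullmatch(value):
        try:
            date.fromisoformat(value)  # rejects impossible days such as 02-30
        except ValueError:
            return None
        return "day"
    return None


def precision_matches(value: str, declared: str) -> bool:
    """True when the declared precision flag agrees with the date's form."""
    return infer_precision(value) == declared


if __name__ == "__main__":
    for value, flag in [("1924", "year"), ("1943-06-14", "day"), ("1924", "day")]:
        print(value, flag, precision_matches(value, flag))
```

A catalogue that pads year-level dates to a full date, for example storing 1924 as 1924-01-01 with a year flag, would need the opposite rule: trust the flag and ignore the padded components.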

The full ontology, the SHACL shapes, the live harvester, the RiC-O crosswalk and an interactive map are published open source as a case study in Open Ontologies, our open data-validation platform.

"Computation-readiness for heritage is not a request to start over. The substrate is usually already in the data. The work is a thin, automatable layer: reproject the geometry, mint a stable URI, attach machine-readable rights. Do that, and twenty years of digitisation becomes twenty years of computable research."

Tesseract Academy

Explore the open-source case study

Standard, ontology, SHACL shapes, harvester, STAC and RiC-O crosswalk on GitHub. MIT + CC BY 4.0.