Beyond the Readable Archive: Historical Codebooks, Large Language Models, and the Precolonial African Bead Trade, 1500–1900

Presented by

  • Lauren Coetzee
    University of Luxembourg

African history before colonialism is not absent from the archive: it is dispersed across thousands of pages of travel writing, missionary records, and merchant accounts. Yet African-authored sources and oral histories remain significantly harder to locate, access, and digitise for this period, a disparity that itself reflects the archival legacies of colonial knowledge production. Across this uneven landscape, the interpretive labour required to transform narrative prose into structured, analysable evidence has remained a barrier to systematic historical inquiry. The problem is not scarcity but scale, compounded by the absence of computational methods capable of handling the cultural and historical specificity that African sources demand. Recovering dynamic, diachronic economic practices from historical written accounts is therefore not only a historiographical problem but a methodological one: the source base needed to challenge received narratives of the continent’s past is too large for traditional close reading, yet too historically and culturally specific for off-the-shelf natural language processing pipelines trained predominantly on contemporary, non-African data.

This paper presents a methodology for meeting that challenge and asks what becomes possible for African history and digital humanities research when it is applied. Drawing on research spanning pre-colonial African trade networks, commodity currencies, and the digital analysis of European travelogues, it sets out an LLM-assisted workflow for extracting structured economic data from the Time Traveller corpus, a large corpus of European travelogues compiled from accounts produced before 1900, which capture observations of African societies, economies, and landscapes at the moment of encounter. The methodological centrepiece is a historical codebook: a set of variable definitions, decision rules, and annotated examples developed from these sources, designed to calibrate LLM annotation to the evidentiary logic of a specific archive rather than to generalised text-processing categories. Applied to the African bead trade (the circulation of glass, shell, and metal beads as commodity currencies, status markers, and exchange media across continental networks), this workflow has produced over 27,000 coded observations capturing bead type, exchange context, geographic location, and trading partners, enabling spatial and temporal analysis of a market that conventional scholarship has left largely under-researched.
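As a rough illustration of what a codebook of this kind might look like in machine-readable form, the Python sketch below encodes variable definitions, decision rules, and an annotated example, renders them into annotation instructions, and checks a returned record against the permitted values. It is not the workflow's actual implementation. Only the four variable names and the bead materials come from the description above; the class names, the category labels for exchange context, the decision-rule wording, and the example passage are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str
    definition: str
    allowed: tuple[str, ...] = ()   # empty tuple means free text
    decision_rule: str = ""


@dataclass(frozen=True)
class Codebook:
    variables: tuple[Variable, ...]
    examples: tuple[tuple[str, dict], ...] = ()

    def render_prompt(self, passage: str) -> str:
        """Turn the codebook into plain-text annotation instructions."""
        lines = ["Code the passage using only these variables:"]
        for v in self.variables:
            values = ", ".join(v.allowed) if v.allowed else "free text"
            lines.append(f"- {v.name}: {v.definition} (values: {values})")
            if v.decision_rule:
                lines.append(f"  rule: {v.decision_rule}")
        for text, coding in self.examples:
            lines.append(f"Example: {text!r} -> {coding}")
        lines.append(f"Passage: {passage}")
        return "\n".join(lines)

    def validate(self, record: dict) -> list[str]:
        """List the ways a coded record departs from the codebook."""
        problems = []
        for v in self.variables:
            if v.name not in record:
                problems.append(f"missing {v.name}")
            elif v.allowed and record[v.name] not in v.allowed:
                problems.append(f"{v.name}: {record[v.name]!r} not in codebook")
        return problems


# Hypothetical codebook: category labels and rule text are assumptions.
BEAD_CODEBOOK = Codebook(
    variables=(
        Variable("bead_type", "Material of the beads described",
                 allowed=("glass", "shell", "metal", "unspecified"),
                 decision_rule="Use 'unspecified' if no material is named."),
        Variable("exchange_context", "Role the beads play in the passage",
                 allowed=("currency", "status_marker", "exchange_medium")),
        Variable("location", "Place of the observation as named in the text"),
        Variable("trading_partners", "Parties to the exchange as named"),
    ),
    examples=(
        # Invented passage, not drawn from any corpus.
        ("[invented] The traveller paid for provisions in glass beads.",
         {"bead_type": "glass", "exchange_context": "currency"}),
    ),
)
```

In a sketch like this, `BEAD_CODEBOOK.render_prompt(passage)` would supply the instructions sent to a model, and `BEAD_CODEBOOK.validate(record)` would flag any returned coding that falls outside the codebook, so that it can be routed to human review.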

Historical codebooks reframe what LLMs are asked to do. Rather than serving as blunt instruments, the models become precision tools calibrated to a specific historical problem, recovering categories of evidence that historians already know to look for but cannot extract at scale. For African history, this matters enormously: it opens up a vast body of primary source material that has shaped how the continent’s past has been narrated, and creates the conditions for scholars to interrogate those narratives from within the source archive itself. The paper reflects on where the method works well, where human validation and domain knowledge remain essential, and what the workflow makes possible for researchers working across African corpora and languages.

Supported by

  • Point Sud
  • STIAS (Stellenbosch Institute for Advanced Study)
  • Deutsche Forschungsgemeinschaft (DFG)
  • Goethe University Frankfurt
  • University of Bayreuth / Africa Multiple
  • King's College London
  • SADiLaR

© 2026 Frédérick Madore, Vincent Hiribarren, Emmanuel Ngue Um, Menno van Zaanen. All rights reserved.