A Practitioner's Guide to Using Large Language Models and Generative AI in Economic History | NBER
A Practitioner's Guide to Using Large Language Models and Generative AI in Economic History
Andreas Ferrara
Working Paper 35374
DOI 10.3386/w35374
Issue Date June 2026
Large language models (LLMs) are lowering the entry barriers to working with exciting data sources that used to require strong data science skills, such as handwritten ledgers, text, images, or sound recordings. This guide provides an introduction for researchers who are new to LLMs. It sets out a step-by-step workflow for turning a research idea into working code and data, and describes the four main ways of interacting with an LLM: the chat window, editor-integrated assistants, agentic coding tools, and the API. It then works through the decisions a practitioner meets in sequence, beginning with whether an LLM is the right tool and whether the data are allowed to be sent to one, then how to select models, write prompts, manage context limits, and control costs, and finally how to validate, reproduce, document, and correct LLM-generated measures in regression settings. A review of recent research shows how these tools already extract, link, harmonize, and classify historical data at scale. Four worked examples with replication files illustrate the use of LLMs. They classify emotions in paintings, link census records without names, measure newspaper salience and sentiment around the 1882 Chinese Exclusion Act, and score the emotional delivery of Franklin D. Roosevelt's wartime speeches. The guide also condenses the workflow, the best-practice recommendations, and the preparation of replication packages into summary tables and checklists to aid applied economists.
Acknowledgements and Disclosures
I am grateful to Humoyun Abdumavlon, Omer Ali, Sascha O. Becker, Stephan Heblich, Sharun Mukand, Max Steinhardt, Patrick A. Testa, Gabby Toborg, and Sebastian Garcia Torres for helpful discussion and comments. Replication files for all worked examples in this guide are available at https://doi.org/10.3886/E249897V2. The views expressed herein are those of the author and do not necessarily reflect the views of the National Bureau of Economic Research.
Citation and Citation Data
Andreas Ferrara, "A Practitioner's Guide to Using Large Language Models and Generative AI in Economic History," NBER Working Paper 35374 (2026), https://doi.org/10.3386/w35374.
Related
Topics: Econometrics, Estimation Methods, Data Collection, History
Programs: Development of the American Economy
© 2026 National Bureau of Economic Research. All Rights Reserved.