Guide to Using Large Language Models and Generative AI in Economic History

A Practitioner's Guide to Using Large Language Models and Generative AI in Economic History

Andreas Ferrara

Working Paper 35374

DOI 10.3386/w35374

Issue Date June 2026

Large language models (LLMs) are lowering the entry barriers to working with exciting data sources that used to require strong data science skills, such as handwritten ledgers, text, images, or sound recordings. This guide provides an introduction for researchers who are new to LLMs. It sets out a step-by-step workflow for turning a research idea into working code and data, and describes the four main ways of interacting with an LLM: the chat window, editor-integrated assistants, agentic coding tools, and the API. It then works through the decisions a practitioner meets in sequence, beginning with whether an LLM is the right tool and whether the data are allowed to be sent to one, then how to select models, write prompts, manage context limits, and control costs, and finally how to validate, reproduce, document, and correct LLM-generated measures in regression settings. A review of recent research shows how these tools already extract, link, harmonize, and classify historical data at scale. Four worked examples with replication files illustrate the use of LLMs. They classify emotions in paintings, link census records without names, measure newspaper salience and sentiment around the 1882 Chinese Exclusion Act, and score the emotional delivery of Franklin D. Roosevelt's wartime speeches. The guide also condenses the workflow, the best-practice recommendations, and the preparation of replication packages into summary tables and checklists to aid applied economists.
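As a rough illustration of the validation step mentioned in the abstract, the sketch below compares LLM-assigned labels with a hand-coded subsample and reports raw agreement and Cohen's kappa. This is not taken from the paper or its replication files: the function name `agreement_and_kappa`, the emotion labels, and the choice of Cohen's kappa as the agreement measure are all assumptions made for the example.

```python
from collections import Counter


def agreement_and_kappa(llm_labels, human_labels):
    """Return (raw agreement, Cohen's kappa) for two equally long label lists.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    if len(llm_labels) != len(human_labels) or not llm_labels:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(llm_labels)

    # Share of items on which the LLM and the human coder agree.
    observed = sum(a == b for a, b in zip(llm_labels, human_labels)) / n

    # Agreement expected by chance, from each rater's marginal label shares.
    llm_counts = Counter(llm_labels)
    human_counts = Counter(human_labels)
    expected = sum(llm_counts[c] * human_counts[c] for c in llm_counts) / n**2

    if expected == 1.0:
        # Both raters used a single identical label; kappa is set to 1 here.
        return observed, 1.0

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up labels for six items (illustrative data only).
llm = ["joy", "sadness", "joy", "anger", "joy", "sadness"]
human = ["joy", "sadness", "anger", "anger", "joy", "joy"]

observed, kappa = agreement_and_kappa(llm, human)
print(f"raw agreement = {observed:.3f}, Cohen's kappa = {kappa:.3f}")
```

Kappa discounts the agreement that two coders would reach by chance, so it is usually more informative than raw agreement when one label dominates the sample.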

Acknowledgements and Disclosures

I am grateful to Humoyun Abdumavlon, Omer Ali, Sascha O. Becker, Stephan Heblich, Sharun Mukand, Max Steinhardt, Patrick A. Testa, Gabby Toborg, and Sebastian Garcia Torres for helpful discussion and comments. Replication files for all worked examples in this guide are available at https://doi.org/10.3886/E249897V2.

The views expressed herein are those of the author and do not necessarily reflect the views of the National Bureau of Economic Research.

Citation and Citation Data

Andreas Ferrara, "A Practitioner's Guide to Using Large Language Models and Generative AI in Economic History," NBER Working Paper 35374 (2026), https://doi.org/10.3386/w35374.

Related

Topics: Econometrics, Estimation Methods, Data Collection, History

Programs: Development of the American Economy

© 2026 National Bureau of Economic Research. All Rights Reserved.
