Set Up vs. Setup — A Grammar Audit of Open Source
One is a verb. One is a noun. Most repos pick the wrong one.
I scanned … public repositories on GitHub and found … of them mixing up set up (the verb) with setup (the noun). This is my attempt at public shaming, hoping it helps people remember to use the correct one.
A minor but widespread annoyance
Set up is a phrasal verb: the action of preparing something. Setup is a noun: the configuration that results. Developers too often write setup the database when they mean set up the database. The mistake hides in READMEs, code comments, and docs across the open-source (and closed-source) world.
set up (verb): to place or fix in position; to assemble, prepare, or arrange (something) for use or for a particular purpose.
e.g. “Run npm install to set up the project.”
— Merriam-Webster
setup (noun): the manner or act of arranging or organizing; the equipment, devices, or apparatus put together for a specific purpose.
e.g. “The CI setup uses GitHub Actions.”
— Merriam-Webster
Quick rule of thumb
If you can replace it with configure or install, it's the verb: write set up. If you can put the or a in front of it, it's the noun: write setup.
How the data was collected
Over the course of six months, a Python crawler slowly (read: 250 at a time) walked the public GitHub search API for repositories with at least 25 stars, sliced into six-month windows of creation date. For each repository it scanned READMEs, documentation files, and code comments for occurrences of setup used in a verb position (setup the …, to setup …, Setup your …, and similar patterns). Lines that matched were captured with file path and line number for inspection.
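For the curious, here is a minimal sketch of what the verb-position check could look like. It is not the code from setup_checker.py: the pattern, the word list, and the function name `find_verb_setup` are all assumptions made for illustration, and a heuristic like this will still produce the occasional false positive (for example, "refer to setup instructions").

```python
import re

# Hypothetical heuristic: "setup" is probably a verb when it follows "to"
# or is immediately followed by a determiner/pronoun ("setup the database").
VERB_SETUP = re.compile(
    r"\b(?:to\s+setup\s+\w|setup\s+(?:the|a|an|your|my|our|this|that|it)\b)",
    re.IGNORECASE,
)


def find_verb_setup(path, text):
    """Return (path, line_number, line) for each line that looks like verb-position 'setup'."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if VERB_SETUP.search(line):
            hits.append((path, line_number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = "Run this to setup the project.\nThe CI setup uses GitHub Actions."
    for hit in find_verb_setup("README.md", sample):
        print(hit)
```

Because the pattern requires whitespace right after setup, identifiers such as setupUser and file names such as setup.py are skipped without any extra filtering.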
The code was largely written by Claude, because of course it was. As such, there might be some inaccuracies in the results, but while collecting the data I spot-checked frequently and didn't find any false positives.
Last update: …. The full crawler lives in setup_checker.py.
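The six-month slicing mentioned above could be sketched roughly as follows. The helper names `half_year_windows` and `build_query` are hypothetical, and aligning windows to calendar half-years is an assumption, since the page does not say where each window starts. The `stars:` and `created:` qualifiers are standard GitHub search syntax. Slicing like this is presumably what keeps each query under the search API's cap of 1,000 results, though that motivation is also an assumption.

```python
from datetime import date


def half_year_windows(first_year, last_year):
    """Yield (first_day, last_day) pairs, two per calendar year."""
    for year in range(first_year, last_year + 1):
        yield date(year, 1, 1), date(year, 6, 30)
        yield date(year, 7, 1), date(year, 12, 31)


def build_query(window_start, window_end, min_stars=25):
    """Build a GitHub repository search query for one creation-date window."""
    return (
        f"stars:>={min_stars} "
        f"created:{window_start.isoformat()}..{window_end.isoformat()}"
    )


if __name__ == "__main__":
    for start, end in half_year_windows(2020, 2021):
        print(build_query(start, end))
```

Each query string would then be passed as the `q` parameter of a repository search request, paging through the results for that window before moving on to the next one.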
About the data
Early entries in the data do not have examples of the improper grammar. I later added functionality to grab an example and store it in the log, so some repos in the data will return a highlighted example in the search results. If a repo does not have an example, that just means it was logged early in the process.
Function names were intentionally left out of the scan, simply because camel case leaves no room for the proper form where snake case would: set_up_user vs. setupUser. I have wondered whether the common use of camel case is part of the reason this mistake is so ubiquitous.
The numbers
repositories analyzed
have at least one error
of analyzed repos
average errors in offending repos
median errors per offending repo
Distribution of errors per repository
Among the … repositories that contain at least one error.
Where the errors live
Where in the repo the mistakes were found.
Most prolific offenders
Repositories with the highest count of setup/set up mistakes.
Look up a repository
Wondering whether your favorite project is on the list? Search for it.