Micro GPT written in Excel formulas

GitHub - pyxll/excel-gpt: Minimal GPT model implemented in Excel · GitHub

Excel-GPT

A minimal GPT model implemented in Excel, inspired by https://karpathy.github.io/2026/02/12/microgpt/.

Minimal Implementation: Focuses on the core GPT architecture with zero dependencies.

Excel Only: Only uses Excel's formula capabilities for computation, no VBA.

The included Excel file generates plausible-sounding names.

The workbook is explained in this video:

It is well worth reading https://karpathy.github.io/2026/02/12/microgpt/ as you follow along with the spreadsheet. That post explains the model thoroughly, so I have not repeated all of it here.

Motivation

Using tools like PyXLL (https://www.pyxll.com) we can integrate Python code into Excel. That way we can wrap the Python GPT model and call it directly from Excel to generate text.

As a learning exercise, I wanted to do the opposite here and implement the micro GPT model entirely in Excel formulas without any Python code. This, of course, results in a more complex spreadsheet than simply calling a single function, but it allows us to peek inside the model in a way that is much harder with a plain Python script.

In real-world scenarios I would never expect to build a spreadsheet with this much complexity baked into it. It would be far better to move the complexity into Python, where it can be properly tested and debugged, and then call that Python code from Excel using the PyXLL add-in.

Architecture

The model is implemented in the Model sheet.

The model is laid out as an unrolled loop, with one block per output token. Each block takes the previous token, a position, the parameters, and the keys and values from the previous positions. Each block outputs the logits (scores) over which token the model predicts next, along with the predicted token itself.
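For orientation, here is a minimal Python sketch of the loop that the sheet unrolls. It is not taken from the workbook: `generate`, `block`, and `boundary_id` are hypothetical names, and picking the highest-scoring token (argmax) is an assumption, since the text does not say how the predicted token is chosen from the logits.

```python
def generate(block, params, max_len, boundary_id=0):
    """Run one block per position, feeding each prediction into the next block.

    `block` is a hypothetical callable standing in for one block of the sheet.
    It returns (logits, key, value) for the current position.
    """
    keys, values = [], []      # keys and values from previous positions
    token = boundary_id        # generation starts from the boundary token
    output = []
    for pos in range(max_len):
        logits, key, value = block(token, pos, params, keys, values)
        keys.append(key)
        values.append(value)
        # Assumption: take the highest-scoring token as the prediction.
        token = max(range(len(logits)), key=lambda i: logits[i])
        if token == boundary_id:   # boundary token also marks the end
            break
        output.append(token)
    return output
```

In the spreadsheet each iteration of this `for` loop corresponds to one block of rows, so the intermediate values at every position stay visible at once.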

We follow Andrej Karpathy's microGPT and use the same simplifications: RMSNorm instead of LayerNorm, no biases, and ReLU instead of GeLU.
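As a rough illustration of two of these simplifications, the sketch below writes RMSNorm and ReLU in plain Python. The `eps` value and the absence of a learned gain are assumptions made for the sketch, not details confirmed by the workbook.

```python
import math

def rmsnorm(x, eps=1e-5):
    """Rescale x so that its root-mean-square is approximately 1.

    Unlike LayerNorm, the mean is not subtracted. `eps` is an assumed
    small constant that avoids division by zero.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]

def relu(x):
    """Clamp negative values to zero, element by element."""
    return [max(0.0, v) for v in x]
```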

Each block starts with the current position id, the previous token, and the token from the current training target. The target isn't used when running the model; it is only used in training, which is not part of this sheet. A special token '?' is used to indicate the start and end of the name.
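The inputs to each block can be sketched in Python as follows. The helper name `block_inputs` is hypothetical and only illustrates how a name maps to one (position, previous token, target token) triple per block, with '?' marking both ends.

```python
BOUNDARY = "?"  # special start/end token named in the text

def block_inputs(name):
    """Return one (position, previous_token, target_token) triple per block."""
    chars = [BOUNDARY] + list(name) + [BOUNDARY]
    return [(pos, chars[pos], chars[pos + 1]) for pos in range(len(chars) - 1)]

for row in block_inputs("ann"):
    print(row)
```

The first block sees '?' as its previous token, and the last block has '?' as its target, which is how the model learns where a name ends.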

Next are the embedding vectors. These follow the original microGPT code and are the learned vectors for the position and token looked up from the weights tables. The position and token embeddings are summed to give a joint embedding.
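A minimal sketch of this lookup and sum, assuming the two weight tables are stored as lists of rows. The table names `wte` and `wpe` are assumptions for illustration and may not match the names used in the workbook.

```python
def embed(token_id, pos_id, wte, wpe):
    """Look up the token and position vectors and add them element-wise.

    wte: assumed token embedding table, one row per vocabulary entry.
    wpe: assumed position embedding table, one row per position.
    """
    tok = wte[token_id]
    pos = wpe[pos_id]
    return [t + p for t, p in zip(tok, pos)]
```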

The attention block is the same as in the original microGPT project, but with the loop unrolled, so each step is repeated for each iteration of the loop. The model uses 4 attention heads, so there are 4 sets of rows for this. We compute the query (Q), key (K) and value (V) for the current token, and make the keys and values from previous positions available to the current position. Each attention head computes the dot product between the query and each key (the current key and the previous keys), turns the scaled scores into weights with softmax, and takes the weighted sum of the values. The head outputs are recombined and projected to the attention output through the trained projection matrix attn_wo.
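The per-head computation can be sketched in Python as standard scaled dot-product attention. This is an illustration under assumptions, not the workbook's formulas: the function names are hypothetical, each head is assumed to work on a contiguous slice of the vectors, and the final projection through attn_wo is left out.

```python
import math

def softmax(scores):
    """Turn scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(q, keys, values):
    """One head: score the query against every key, then mix the values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def multi_head_attend(q, keys, values, n_head=4):
    """Split the vectors into n_head slices, attend per slice, and concatenate."""
    head_dim = len(q) // n_head
    out = []
    for h in range(n_head):
        lo, hi = h * head_dim, (h + 1) * head_dim
        out.extend(attend(q[lo:hi],
                          [k[lo:hi] for k in keys],
                          [v[lo:hi] for v in values]))
    return out
```

Here `keys` and `values` hold the current position's key and value along with those from all previous positions, which is why each block in the sheet needs access to the earlier blocks.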

The MLP (multilayer perceptron) block projects the attention output through the MLP projection matrix mlp_fc1, applies ReLU to clamp negative values to zero, and then projects the result back down to the embedding dimension through mlp_fc2.
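The three MLP steps can be sketched in Python as below. The storage layout is an assumption: each matrix is taken to be a list of rows with one row per output element, and the helper names `linear` and `mlp` are hypothetical. Any normalization applied before the MLP is omitted.

```python
def linear(x, w):
    """Matrix-vector product: each row of w produces one output element."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def mlp(x, mlp_fc1, mlp_fc2):
    """Project up with mlp_fc1, apply ReLU, project back down with mlp_fc2."""
    hidden = linear(x, mlp_fc1)
    hidden = [max(0.0, h) for h in hidden]  # ReLU
    return linear(hidden, mlp_fc2)
```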

Both the MLP and attention blocks output residuals, which are added back to the inputs to produce the output of the block. This lets gradients flow directly through the network and makes deeper models trainable.
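A residual connection is small enough to show in a few lines of Python. The helper `with_residual` is a hypothetical name, and `sublayer` stands in for either the attention or the MLP computation.

```python
def with_residual(x, sublayer):
    """Add the sublayer's output back onto its own input, element-wise."""
    out = sublayer(x)
    return [xi + oi for xi, oi in zip(x, out)]
```

Because the input is carried through unchanged and only added to, the gradient with respect to `x` always has a direct path that bypasses the sublayer.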

Finally, the MLP output is projected back to the vocabulary dimension through the...
