The company that doesn't exist


The Company That Doesn't Exist: Testing a Company Brain

How we invented an entire fintech startup with 112 employees to test our company brain and whether it can keep secrets, get facts straight and avoid confusing Sophie with the other Sophie.

Every company is full of information that not everyone is meant to see. The board knows things the team doesn't. Finance sees numbers the rest of us don't. Your DMs are your own.

At Agentwork we're building an AI that reads all of a company's data (the Slack messages, the docs, the CRM, the code) and answers questions about it. Think of it as a shared brain for the whole company. The hard part isn't answering the question. It's giving each person only the answer they're allowed to have. The CEO and a brand-new engineer should not hear the same thing.

The clearest way to show what we mean is to ask our system one question as two different people. So we did, at Saldra, a company we'll properly introduce in a minute.

The question: "Are we going to raise another round of funding?"

Asked as Mette Krarup, the CEO, it answered in full: no. The board decided back in December to stop raising, stretch the runway, and aim for profitability instead. Internally they call the plan "default-alive."

Asked as Frederik Kjær, a backend engineer, it declined. It could tell that runway and fundraising get mentioned at all-hands, but it found no decision it was allowed to share. That conversation lived in a private board channel Frederik can't see.

Same question. Two answers. Both correct.

Mette and Frederik don't exist. Neither does Saldra, the fintech they work for, nor the board that made the call. We invented all of it: 112 people and a year of data spread across Slack, Notion, a CRM, GitHub, Google Drive and more.

We made up all this data to test our own product. Can our memory system keep a company's secrets? Can it answer complicated questions? Can it navigate the tribal knowledge that lives in a company's internal data?

This post is the long version of the answer. It covers how we generated the data, what we hoped to achieve, how we built the evals and a mocked ingestion pipeline, and an agent that automatically does research and suggests improvements.
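
The Mette and Frederik example comes down to filtering what the system is allowed to read before it answers. Here is a minimal sketch of that idea, assuming each record carries an explicit set of allowed readers. The `Record` type, the channel names, the record texts and the user ids are all made up for illustration; this is not a description of how Agentwork actually stores or checks permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str         # hypothetical source label, e.g. a Slack channel
    text: str
    readers: frozenset  # hypothetical user ids allowed to read this record


def visible_records(records, user):
    """Keep only the records this user is allowed to read."""
    return [r for r in records if user in r.readers]


def answer_context(records, user, keyword):
    """Return the text of visible records that mention the keyword.

    A real system would retrieve and then generate an answer; here the
    filtered context stands in for "what the model is allowed to see".
    """
    needle = keyword.lower()
    return [r.text for r in visible_records(records, user)
            if needle in r.text.lower()]


# Illustrative records only, loosely modelled on the example above.
records = [
    Record("slack:#board-private",
           "Board decision: stop fundraising, stretch runway, aim for profitability.",
           frozenset({"mette"})),
    Record("slack:#all-hands",
           "Runway and fundraising came up briefly at all-hands.",
           frozenset({"mette", "frederik"})),
]

ceo_context = answer_context(records, "mette", "fundraising")
engineer_context = answer_context(records, "frederik", "fundraising")
```

With this toy data the CEO's context includes the board decision and the engineer's does not, so an answer built only from `engineer_context` has no decision to leak. Filtering before the answer is generated, rather than redacting afterwards, is the design choice the sketch is meant to show.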

Why we needed a fake company

When we first tested our system, we connected it to our own internal Slack, Notion, Google Drive and Fireflies (a meeting recording app) and started asking questions to see whether it worked. To improve the system we created a set of memory evals: questions we could ask the agent where we knew what the right answer was supposed to look like. Some examples:

Which agent framework did we migrate to, and from what?
We migrated the agent architecture from LangGraph to Pydantic AI (around April 2026).

Which of our investors is CTO of a startup that raised $1.1B, and what's the startup called?
Lasse Espeholt (ex-DeepMind) - CTO of Ineffable Intelligence, which raised $1.1B (April 2026 chatter).

Who is our Head of Marketing?
We don't have one.
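
A memory eval like the ones above can be written down as a question plus the phrases a correct answer should contain. The sketch below is one simple way to do that, not our actual harness: `EvalCase`, `grade`, `run_evals` and the canned `fake_agent` are hypothetical names, and substring matching is a stand-in for whatever grading a real setup would use.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    must_mention: list  # phrases a correct answer is expected to contain


def grade(case, answer):
    """Pass if every expected phrase appears in the answer (case-insensitive)."""
    lowered = answer.lower()
    return all(phrase.lower() in lowered for phrase in case.must_mention)


def run_evals(cases, ask):
    """Ask each question through `ask` and return the fraction that pass."""
    results = [grade(case, ask(case.question)) for case in cases]
    return sum(results) / len(results) if results else 0.0


cases = [
    EvalCase("Which agent framework did we migrate to, and from what?",
             ["LangGraph", "Pydantic AI"]),
    EvalCase("Who is our Head of Marketing?", ["don't have"]),
]


def fake_agent(question):
    # Stand-in for the real agent: one correct canned answer and one
    # deliberately wrong one (it invents a Head of Marketing).
    canned = {
        cases[0].question: "We moved from LangGraph to Pydantic AI.",
        cases[1].question: "Alex is our Head of Marketing.",
    }
    return canned[question]


score = run_evals(cases, fake_agent)
```

The third example above is the interesting kind: the right answer is that there is nothing to find, so the eval only passes if the agent refuses to make someone up.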

That led to a bunch of quick insights and improvements early on, but we soon ran into challenges.

The first problem was that our own data is fairly limited. With a team of just three people, there is relatively little complexity. Nobody had joined the company and nobody had left. Nobody had the same first name. The next issue was that we were talking about our evals in Slack and in meetings, so the evals became poisoned: the questions and their expected answers were now part of the data the system was reading.
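
One cheap way to notice this kind of poisoning is to check whether eval questions show up verbatim in the ingested data. The sketch below does only that, under the assumption that the corpus is available as a list of plain strings. The function names and the sample texts are hypothetical, and exact matching will miss paraphrased discussion of an eval.

```python
def normalize(text):
    """Lowercase and collapse whitespace so small formatting differences don't matter."""
    return " ".join(text.lower().split())


def poisoned_questions(eval_questions, corpus_texts):
    """Return the eval questions that appear verbatim in any ingested record."""
    corpus = [normalize(t) for t in corpus_texts]
    return [q for q in eval_questions
            if any(normalize(q) in doc for doc in corpus)]


eval_questions = [
    "Who is our Head of Marketing?",
    "Which agent framework did we migrate to, and from what?",
]

# Illustrative messages only: the first one discusses an eval and so leaks it.
corpus_texts = [
    "New eval idea: 'Who is our  Head of Marketing?' Expected answer: we don't have one.",
    "Standup notes: shipped the ingestion fix.",
]

leaked = poisoned_questions(eval_questions, corpus_texts)
```

Here `leaked` contains only the Head of Marketing question, because the first message quotes it along with its expected answer.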

Separately from those issues, we didn't have great seed data to test our app with. Whenever we wanted to test our memory ingestion system, or try the system with realistic data, we ran into problems. Our seed data was just a few hand-created people like "Customer Customersson", which was limiting.

How to generate a company in a few shots

Once we had decided to create a fake company as a synthetic dataset, we booted up Claude and asked it to get started. We kicked off with this prompt:

I want to create a synthetic dataset that we can use for Agentwork. It will be used both for evals and for sales demos. It needs to feel like a real company, with real employees, data etc. The end result is that we create a larger seed data source that we can ingest for evals and load for demos.

I have attached an example of how our data looks today when dumping the raw source records. But before we get to building the actual data, I want to make a few preliminary documents describing the company, their product(s), etc.

Let's make it a SaaS company. It should have raised some venture funding. It has ~100 employees. It is headquartered in Denmark, but has an office in London too. After this, first company description, then I want to build a story of a year in the life of this company and then build personas for each person in the company (not very deep for everyone person). And then based on that story and those people, I want to generate realistic data.

Start with the company overview, and when that is...
