GPT‑NL: a sovereign language model for the Netherlands

Project type: Project

Theme: Artificial intelligence

Language‑based AI is becoming integral to the workplace, education and public services. Yet control over this technology matters. GPT‑NL shows that a different approach is possible: one built on strong governance, transparency and a firm commitment to public values.

Language models, integrated in applications such as ChatGPT, demonstrate the potential of AI for innovation, productivity and societal solutions. At the same time, they raise fundamental questions. Who decides how these models work? Which data do they use? And how do we safeguard public values such as privacy, copyright and transparency?

With GPT‑NL, TNO, together with SURF and the Netherlands Forensic Institute (NFI), is building an independent Dutch language model and ecosystem. This strengthens the digital autonomy of the Netherlands and Europe, and provides a solid foundation for responsible AI applications.

Where does GPT‑NL stand today?

Curious about where GPT‑NL stands today? Product Manager Saskia Lensink and R&D Manager Frank Brinkkemper assess the situation and look ahead to the next exciting phase. Read all about it in our progress report.

Download report (in Dutch) (pdf)

GPT‑NL values

We are building a responsible language model for the Dutch language and context: trustworthy, transparent, reciprocal and sovereign.

Sovereign: control over technology that matters

GPT‑NL is developed within the Netherlands and Europe. This gives us full control over the model, the data and the choices we make. We avoid dependency on non‑European providers and invest in a sustainable AI ecosystem aligned with our laws, values and societal goals.

Open and transparent: insight from source to model

GPT‑NL is built on transparency. We clearly document the choices we make during data collection and training, and how we address risks such as bias and ethical concerns. We publish the source code as open source and share detailed insights into the dataset. Model weights are made available under a controlled licence. This allows us to know who uses the model and to inform users about updates or changes, for example following a data opt‑out. In this way, we operate transparently without compromising security or regulatory compliance.

Trustworthy: protecting users and citizens

We train GPT‑NL entirely from scratch. This prevents unclear data provenance, copyright risks or potential personal data from being inherited from existing models.

To ensure a reliable foundation, our data collection meets strict criteria:

Safeguarding intellectual property
Removing and anonymising personal data before model training
Excluding confidential information
Excluding harmful content
Avoiding duplication within the dataset
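
As an illustration only, two of these criteria (anonymising personal data and avoiding duplication) could be sketched as follows. This is not the GPT‑NL pipeline, which is not described on this page. The function names, the e-mail-only pattern and the exact-match hashing are all assumptions chosen to keep the example small; real anonymisation and deduplication are considerably broader.

```python
import hashlib
import re

# Assumption: only simple e-mail addresses are treated as personal data here.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(text: str) -> str:
    """Replace anything that looks like an e-mail address with a placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def deduplicate(documents: list[str]) -> list[str]:
    """Keep the first occurrence of each document, compared after normalisation."""
    seen: set[str] = set()
    unique: list[str] = []
    for doc in documents:
        digest = hashlib.sha256(normalise(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def prepare(documents: list[str]) -> list[str]:
    """Hypothetical helper: mask e-mail addresses, then drop exact duplicates."""
    return deduplicate([mask_emails(doc) for doc in documents])
```

Masking runs before deduplication in this sketch, so two documents that differ only in how an e-mail address is written collapse into one entry.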

Reciprocal: fair agreements on data and value

GPT‑NL deliberately works with a clean and lawful data supply chain. We collaborate closely with data providers and actively involve them in the development of the model.

Through the Content Board, these data providers and rights holders have a voice in the future of GPT‑NL. Part of the revenues flows back to the creators. This creates a fairer innovation model in which value is shared rather than extracted.

Using resources efficiently

AI development requires significant computing power and energy. That is why we actively focus on energy efficiency and responsible use of resources. Based on scientific research, we optimise both the size of the model and the training process, with explicit attention to energy and water consumption.

Publicly funded, publicly accountable

GPT‑NL is funded by the Netherlands Enterprise Agency (RVO) on behalf of the Ministry of Economic Affairs and Climate Policy. A total of €13.5 million has been allocated to the project. This public investment underlines the importance of an independent, trustworthy and future‑proof Dutch language model.

GPT‑NL shows that powerful AI and public values can go hand in hand. Together, we are building technology that makes the Netherlands stronger, more autonomous and fairer.

Behind the scenes

What happens when you build a large language model with only a fraction of a Silicon Valley budget? In the latest episode of the Media Innovation Podcast, Product Manager Saskia Lensink talks about how GPT‑NL was made.

Listen to the podcast (in Dutch)

Contact us

Saskia Lensink

Role: Consultant & Business Developer

More about Saskia

Saskia Lensink works as a consultant and business developer and specialises in language and speech technologies. She applies her knowledge of NLP and ASR in various projects, and is active in a diverse set of consortia and networks that promote sovereign, high-performing European large language models.

Location: Den Haag - New...
