MOT: A tool to fight openwashing in AI


By Joe Brockmeier May 27, 2026

OSSNA

Many large language models (LLMs) are described as open source, but if one looks a bit deeper it turns out that is not actually so; the model may be free to download, it may be "open weight", but it does not fit the Open Source Initiative (OSI) Open Source Definition (OSD). Assessing the actual openness of models is not easy, as Arnaud Le Hors explained in his talk about the Model Openness Tool (MOT) at Open Source Summit North America 2026. The tool is designed to help users of LLMs understand to what degree a model is (or is not) open, and to combat the openwashing that is prevalent with LLMs.

The problem

Le Hors began by asking the audience a rhetorical question, "do you think that all the models that are on Hugging Face are open source? Are they even open models?" Hugging Face, of course, is a popular site for sharing and downloading LLMs, data sets, and applications for working with them.

Much of what is available on Hugging Face, he said, falls short of the basic requirements of an open-source license. Many vendors or projects are creating their own licenses for models. Le Hors said that this was not unlike the early days of open source; that created "a lot of chaos", which led to the creation of OSI and its definition of open source. "Now, many years later, we're seeing a similar type of challenge with 'open' AI."

The models are often described as open-source, or just open, which causes many problems. He said that, in fact, "there are a lot of restrictions associated with the licenses under which they are made available". For example, some licenses try to limit the number of users or try to place restrictions on the types of use: "They can say, well, you can use my model, but not for military use." That kind of limitation may be well-intended, but a license with use restrictions still falls short of being open source.

People believe that if something is on Hugging Face, they can simply download it and do whatever they want with it. He said that those users may be infringing on the licenses and taking a legal risk. Worse, some users download a model, do their own fine-tuning, and then republish the model under a different license. This would be the equivalent of downloading software under the GPL and then republishing it under the Apache License. "You just can't. Legally, it's not allowed."

Model Openness Framework

Le Hors said that those were the kinds of problems that the Generative AI Commons working group of the Linux Foundation's AI & Data Foundation has been trying to solve with the Model Openness Framework (MOF). One might wonder, what about the OSI's Open Source AI Definition (OSAID)? He did not address the OSAID during the talk, possibly because work on the MOF proceeded separately: a final version of the MOF was introduced in April 2024, while OSI was still working on the OSAID, which was not finalized until October 2024.

The MOF provides a structure for evaluating machine-learning models and for describing how open (or not) a model actually is. The specification sets up a tiered system with three classes that represent "ascending levels of model completeness and openness". Class III ("Open Model") is the least open; Class I ("Open Science Model") is the most open because it not only allows distribution and tuning, but also enables others to study how the model was created as well as the data used to train it. If a model's terms are too restrictive, it does not receive a classification at all.

According to the specification, a Class III model would allow fine-tuning of the model, unrestricted usage, and creation of a product or service based on the model. To meet the Class II definition, a model would also need to include supporting libraries and tools, inference code, and evaluation code, as well as code for training the model. A Class I model would have all the components included with the previous classes, as well as a research paper that explains the model, the components that would be needed to reproduce a similar model, and the training data "used for any form of model training" that users could examine.
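The cumulative nature of the three classes can be illustrated with a small sketch. It is not the MOT's implementation, and the requirement labels are hypothetical names paraphrased from the summary above rather than the component list in the MOF specification; the only idea it tries to show is that each class includes everything required by the class below it, and that a model meeting none of them gets no classification.

```python
from typing import Optional

# Hypothetical labels paraphrased from the article's summary of each class;
# these are NOT the official component names in the MOF specification.
CLASS_III = {
    "fine_tuning_allowed",
    "unrestricted_usage",
    "product_or_service_allowed",
}
# Class II adds code and tooling on top of everything in Class III.
CLASS_II = CLASS_III | {
    "supporting_libraries_and_tools",
    "inference_code",
    "evaluation_code",
    "training_code",
}
# Class I adds the research paper, reproduction components, and training data.
CLASS_I = CLASS_II | {
    "research_paper",
    "reproduction_components",
    "training_data",
}

# Ordered from most open to least open so the first match is the best class.
TIERS = [
    ("Class I (Open Science Model)", CLASS_I),
    ("Class II", CLASS_II),
    ("Class III (Open Model)", CLASS_III),
]


def classify(provided: set[str]) -> Optional[str]:
    """Return the most open class whose requirements are all met, or None."""
    for name, required in TIERS:
        if required <= provided:  # every required item is present
            return name
    return None  # too restrictive or incomplete: no classification


if __name__ == "__main__":
    print(classify(CLASS_II))            # meets Class II, not Class I
    print(classify({"inference_code"}))  # meets no class
```

Under these assumptions, a model that ships its training data but omits the Class II code would still be rated Class III, because a higher class is only reached once everything in the lower classes is present.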

Openness, he said, has to do with the license a model and its artifacts are provided under, while completeness refers to what is included with the...
