MOT: a tool to fight openwashing in AI [LWN.net]
By Joe Brockmeier
May 27, 2026
OSSNA
Many large language models (LLMs) are described as open source, but if one looks a bit deeper it turns out that is not actually so; the model may be free to download, it may be "open weight", but it does not fit the Open Source Initiative (OSI) Open Source Definition (OSD). Assessing the actual openness of models is not easy, as Arnaud Le Hors explained in his talk about the Model Openness Tool (MOT) at Open Source Summit North America 2026. The tool is designed to help users of LLMs understand to what degree a model is (or is not) open, and to combat the openwashing that is prevalent with LLMs.
The problem
Le Hors began by asking the audience a rhetorical question, "do you think that all the models that are on Hugging Face are open source? Are they even open models?" Hugging Face, of course, is a popular site for sharing and downloading LLMs, data sets, and applications for working with them.
Much of what is available on Hugging Face, he said, falls short of the basic requirements of an open-source license. Many vendors or projects are creating their own licenses for models. Le Hors said that this was not unlike the early days of open source; that created "a lot of chaos", which led to the creation of OSI and its definition of open source. "Now, many years later, we're seeing a similar type of challenge with 'open' AI."
The models are often described as open-source, or just open, which causes many problems. He said that, in fact, "there are a lot of restrictions associated with the licenses under which they are made available". For example, some licenses try to limit the number of users or try to place restrictions on the types of use: "They can say, well, you can use my model, but not for military use." That kind of limitation may be well-intended, but a license with use restrictions still falls short of being open source.
People believe that if something is on Hugging Face, they can simply download it and do whatever they want with it. He said that those users may be infringing on the licenses and taking a legal risk. Worse, some users download a model, do their own fine-tuning, and then republish the model under a different license. This would be the equivalent of downloading software under the GPL and then republishing it under the Apache License. "You just can't. Legally, it's not allowed."
Model Openness Framework
Le Hors said that those were the kind of problems that the Generative AI Commons working group of the Linux Foundation's AI & Data Foundation has been trying to solve with the Model Openness Framework (MOF). One might wonder, what about the OSI's Open Source AI Definition (OSAID)? He did not address the OSAID during the talk, but it could be because the work on MOF was underway separately from OSAID and a final version was introduced in April 2024, while OSI was still working on OSAID, which was not finalized until October 2024.
The MOF provides a structure for evaluating machine-learning models and provides a framework for describing how open (or not) a model actually is. The specification sets up a tiered system with three classes that represent "ascending levels of model completeness and openness", with a Class III ("Open Model") being the least open and a Class I ("Open Science Model") being the most open because it not only allows distribution and tuning, but also enables others to study how the model was created as well as the data used to train it. If a model's terms are too restrictive, it does not receive a classification at all.
According to the specification, a Class III model would allow fine-tuning of a model, unrestricted usage, and creation of a product or service based on the model. To meet the Class II definition, a model would also need to include supporting libraries and tools, inference code, evaluation code, as well as code for training the model. A Class I model would have all the components included with the previous classes, as well as a research paper that explains the model, the components that would be needed to reproduce a similar model, and the training data "used for any form of model training" that users could examine.
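The cumulative nature of the tiers may be easier to see in code. What follows is only a rough sketch in Python of the classes as summarized above; it is not the MOT's implementation, and it does not use the specification's actual component list. The label strings and the `classify()` function are hypothetical names invented for illustration.

```python
# Hypothetical labels paraphrasing the article's summary of each tier;
# the real MOF specification defines its own, more detailed component list.
CLASS_III_PERMISSIONS = frozenset({
    "fine_tuning",
    "unrestricted_use",
    "build_products",
})
CLASS_II_COMPONENTS = frozenset({
    "supporting_libraries_and_tools",
    "inference_code",
    "evaluation_code",
    "training_code",
})
CLASS_I_COMPONENTS = frozenset({
    "research_paper",
    "reproduction_components",
    "training_data",
})


def classify(permissions, components):
    """Return "Class I", "Class II", "Class III", or None.

    Tiers are cumulative: each class requires everything the class
    below it requires. None means the terms are too restrictive for
    any classification.
    """
    permissions = set(permissions)
    components = set(components)
    if not CLASS_III_PERMISSIONS <= permissions:
        return None
    if not CLASS_II_COMPONENTS <= components:
        return "Class III"
    if not CLASS_I_COMPONENTS <= components:
        return "Class II"
    return "Class I"


# An "open weight" release that forbids some uses gets no class at all.
restricted = classify({"fine_tuning"}, set())

# A permissively licensed model shipped with its code but without the
# paper or training data would land in the middle tier.
middle = classify(CLASS_III_PERMISSIONS, CLASS_II_COMPONENTS)
```

Note that, under this sketch, publishing training data without the Class II code does not raise a model above Class III, since each tier builds on the one below it.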
Openness, he said, has to do with the license a model and its artifacts are provided under, while completeness refers to what is included with the...