Gemma 3, the family of open AI models that Google released this week, quickly won praise for its impressive efficiency. But as several developers complained on X, Gemma 3’s license makes commercial use of the models risky. And the issue isn’t exclusive to Gemma 3.
Companies like Meta also apply custom, non-standard licensing terms to their openly available models, and those terms present legal challenges for the businesses that use them. Some companies, particularly smaller ones, worry that Google and others could “pull the rug” out from under them by asserting the more burdensome clauses.
Nick Vidal, head of community at the Open Source Initiative, a long-standing organization aiming to define and “steward” all things open source, told TechCrunch, “The restrictive and inconsistent licensing of so-called ‘open’ AI models is creating significant uncertainty, particularly for commercial adoption.”
“Despite the fact that these models are marketed as open, the actual terms impose a number of legal and practical obstacles that prevent businesses from integrating them into their products or services,” Vidal said. Open model developers choose proprietary licenses over industry-standard options like Apache 2.0 and MIT for a variety of reasons.
For instance, the AI startup Cohere has made it clear that it intends to support scientific, but not commercial, work built on its models. The Gemma and Llama licenses, however, contain terms that limit how businesses can use the models without risking legal repercussions. Meta, for instance, prohibits developers from using the “output or results” of Llama 3 models to improve any model other than Llama 3 or its “derivative works.”
Meta also prevents organizations with more than 700 million monthly active users from deploying Llama models without first obtaining a special, additional license. Gemma’s license generally has fewer restrictions. However, it does grant Google the authority to “restrict (remotely or otherwise) usage” of Gemma if Google believes the model is in violation of “applicable laws and regulations” or the company’s prohibited use policy.
These terms apply to more than just the original Gemma and Llama models. Models built on top of Gemma or Llama must also adhere to the respective licenses. In Gemma’s case, that includes models trained on synthetic data generated by Gemma.
Florian Brand, a research assistant at the German Research Center for Artificial Intelligence, believes that — despite what tech giant execs would have you believe — licenses like Gemma’s and Llama’s “cannot reasonably be called ‘open source.’”
“Most companies have a set of approved licenses, such as Apache 2.0, so any custom license is a lot of trouble and money,” Brand told TechCrunch. “Small businesses will stick to models with standard licenses because they don’t have the money or legal teams.” Brand pointed out that AI model developers with specialized licenses, such as Google, haven’t yet vigorously enforced their terms.
He added, however, that the threat is frequently sufficient to prevent adoption. Brand stated, “These restrictions affect the AI ecosystem — even AI researchers like me.” Moody’s director of machine learning Han-Chung Lee concurs that custom licenses like those associated with Gemma and Llama render the models “not usable” in numerous commercial contexts. So does Eric Tramel, a staff applied scientist at AI startup Gretel.
“Model-specific licenses make specific carve-outs for model derivatives and distillation, which causes concern about clawbacks,” Tramel said. “Imagine a business that is specifically producing model fine-tunes for its customers. What license should a Gemma-data fine-tune of Llama have? What would happen to all of their customers down the line?” The most likely scenario, according to Tramel, is that the models act as a kind of Trojan horse. “A model foundry can put out [open] models, wait to see what business cases develop using those models, and then strong-arm their way into successful verticals by either extortion or lawfare,” he said.
“For instance, Gemma 3 appears to be a solid release that has the potential to have a broad impact,” Tramel added. “But due to its license structure, the market cannot adopt it. As a result, businesses will probably stick with Apache 2.0 models, which may be weaker and less reliable.” To be clear, some models have been widely distributed despite their restrictive licenses. Llama, for example, has been downloaded hundreds of millions of times and built into products from major corporations, including Spotify.
However, according to Yacine Jernite, head of machine learning and society at AI startup Hugging Face, such models could be even more successful if they were permissively licensed. Jernite urged providers like Google to switch to open license frameworks and to “collaborate more directly” with users on widely accepted terms. Because “there is no consensus on these terms” and “many of the underlying assumptions haven’t yet been tested in courts,” Jernite said, “it all serves primarily as a declaration of intent from those actors.”
“[But if certain clauses] are interpreted too broadly, a lot of good work will find itself on uncertain legal ground, which is particularly scary for organizations building successful commercial products.”
Vidal said that there’s an urgent need for AI models that companies can freely integrate, modify, and share without fearing sudden license changes or legal ambiguity.
“The current landscape of AI model licensing is rife with confusion, restrictive terms, and misleading claims of openness,” Vidal said. “Rather than redefining ‘open’ to suit corporate interests, the AI industry should align with established open source principles to create a truly open ecosystem.”