The Model Openness Framework: A Practical Approach to AI Transparency
Vincent Caldeira
Chief Technology Officer, APAC at Red Hat · Technical Oversight Committee Member at FINOS · Green AI Committee Member at Green Software Foundation · Technical Advisor at OS-Climate · Technology Advisor at U-Reg
The original MOF research paper (March 2024) defined a three-tiered classification for AI models based on their openness:
- Class I (Open Science): the most complete tier, which adds the training datasets, data preprocessing code, and research paper to everything released in the lower classes
- Class II (Open Tooling): adds the full training, evaluation, and inference code needed to reproduce the model
- Class III (Open Model): the baseline tier, covering the model architecture, weights, and technical documentation under open licenses
While this research set a foundation for defining AI openness, the newly released MOF Implementation Framework translates it into practical evaluation criteria that organizations can adopt. It strengthens transparency requirements, provides guidance on licensing and reproducibility, and introduces an assessment process to standardize how AI models are classified.
AI Openness Is Not Binary: Moving Beyond Open vs. Closed Models
I believe that from a practical point of view, AI models exist on a spectrum of openness, not a simple open vs. closed divide. Some models release code but not training data; others open-source weights but restrict commercial use. The MOF framework acknowledges this complexity and provides structured criteria to evaluate transparency levels accurately.
Rather than enforcing full openness, the framework promotes completeness of disclosure, ensuring that enterprises can make informed decisions about the AI models they adopt.
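To make the idea concrete, here is a minimal sketch in Python of what completeness-of-disclosure evaluation could look like. The component names and class thresholds below are simplified assumptions for illustration only, not the official MOF checklist.

```python
# Illustrative only: these component names and class thresholds are simplified
# assumptions for this sketch, not the official MOF checklist.
MODEL_COMPONENTS = {"architecture", "weights", "technical_report"}
TOOLING_COMPONENTS = MODEL_COMPONENTS | {"training_code", "inference_code", "evaluation_code"}
SCIENCE_COMPONENTS = TOOLING_COMPONENTS | {"training_data", "data_preprocessing", "research_paper"}

def mof_class(released: set[str]) -> str:
    """Map the set of released components to an approximate MOF class."""
    if SCIENCE_COMPONENTS <= released:   # everything disclosed
        return "Class I - Open Science"
    if TOOLING_COMPONENTS <= released:   # code open, data withheld
        return "Class II - Open Tooling"
    if MODEL_COMPONENTS <= released:     # weights and docs only
        return "Class III - Open Model"
    return "Unclassified (incomplete disclosure)"

# Example: open weights and code, but undisclosed training data.
released = MODEL_COMPONENTS | {"training_code", "inference_code", "evaluation_code"}
print(mof_class(released))  # Class II - Open Tooling
```

The point of a scheme like this is that a model is placed by what it actually discloses, so "partially open" releases land in a defined tier rather than being marketed as simply "open."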
Key examples of this are DeepSeek-V3 and DeepSeek-R1, two recent open-source AI models that push the boundaries of open innovation while also demonstrating the limits of transparency in AI development.
DeepSeek-V3 and DeepSeek-R1: Advancing Open-Source AI with Partial Transparency
Technical Innovation Through Open Research
DeepSeek has contributed significant advancements in reinforcement learning (RL) and distillation techniques through its latest models:
- DeepSeek-R1 shows that strong reasoning behavior can be induced largely through large-scale RL, with the R1-Zero variant trained without an initial supervised fine-tuning stage
- The reasoning capabilities of DeepSeek-R1 were distilled into a family of much smaller dense models, lowering the cost of deploying capable reasoning models
- DeepSeek-V3 combines a mixture-of-experts architecture with efficiency-focused training techniques that substantially reduce pretraining cost at scale
These contributions accelerate innovation in the open-source AI ecosystem by allowing researchers and developers to build upon state-of-the-art training methodologies.
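As a reference point for what distillation means here, the sketch below shows a generic knowledge-distillation loss in PyTorch, blending hard-label cross-entropy with a temperature-scaled KL term against a teacher model. It illustrates the general technique only, not DeepSeek's actual training recipe, and the tensors are toy stand-ins.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Blend hard-label cross-entropy with soft-label KL to the teacher."""
    # Soft targets: student mimics the teacher's temperature-smoothed distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard supervised loss on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random tensors standing in for real model outputs.
student = torch.randn(4, 10)
teacher = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student, teacher, labels).item())
```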
Limited Transparency in Training and Tuning Data
Despite their technical openness, DeepSeek-V3 and DeepSeek-R1 do not fully disclose their training and fine-tuning datasets, which is a common issue in today’s AI landscape. The primary concerns include:
- Provenance: the sources and licensing status of the training corpora cannot be independently verified
- Bias: without visibility into the data, downstream users cannot audit the models for skews, gaps, or contamination
- Reproducibility: results cannot be fully replicated or validated without access to the underlying datasets
The MOF framework provides a structured approach to evaluating such models, recognizing their contributions to open-source innovation while also flagging areas where transparency is incomplete.
The Reality of Data Management: Why Full Openness Is Unrealistic
While transparency is critical, it is not always realistic for AI models to fully disclose their training data, due to:
- Privacy: datasets often contain personal or user-generated content that cannot legally be republished
- Intellectual property: licensed or proprietary data cannot be redistributed without violating its terms
- Security and competitive concerns: releasing full datasets can expose attack surfaces and hard-won strategic assets
Instead of requiring full data openness, the MOF framework emphasizes clear documentation of dataset sources, bias considerations, and processing methods. This enables organizations to assess risks, compliance, and ethical considerations without needing direct access to training data.
The DeepSeek models illustrate this balance—contributing significantly to the AI research community while retaining commercial and strategic control over fine-tuning data.
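One lightweight way to operationalize this kind of documentation is a structured disclosure record that ships with the model instead of the raw data. The sketch below is hypothetical: the field names are my own, chosen to mirror the documentation the MOF emphasizes (dataset sources, bias considerations, processing methods), not an official schema.

```python
from dataclasses import dataclass

@dataclass
class DatasetDisclosure:
    name: str
    sources: list[str]                # where the data came from
    collection_period: str            # when it was gathered
    processing_methods: list[str]     # filtering, dedup, PII scrubbing, etc.
    known_biases: list[str]           # documented skews or gaps
    license_notes: str                # licensing / copyright status
    raw_data_available: bool = False  # disclosure does not require access

# Hypothetical record for an undisclosed pretraining corpus.
disclosure = DatasetDisclosure(
    name="pretraining-corpus-v1",
    sources=["Common Crawl subset", "licensed book corpus"],
    collection_period="2021-2023",
    processing_methods=["language filtering", "near-duplicate removal", "PII redaction"],
    known_biases=["English-dominant", "web-text register"],
    license_notes="Mixed; per-source terms documented internally",
    raw_data_available=False,
)
```

A record like this lets a reviewer assess provenance, bias, and licensing risk even when `raw_data_available` is false.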
Transparency as a Foundation for AI Risk Management
AI models that lack full openness require stronger system-level guardrails. The MOF framework helps organizations manage AI risks in line with their AI usage and risk tolerance by promoting:
- Completeness of disclosure, so that gaps in transparency are explicit rather than hidden
- Clear documentation of dataset sources, bias considerations, and processing methods
- A standardized assessment process for classifying and comparing models
Additionally, organizations can implement AI system-level controls such as:
- Input and output filtering to block policy-violating prompts and responses
- Audit logging and monitoring to create an accountability trail for model usage
- Access controls and human-in-the-loop review for high-risk use cases
By integrating disclosure-driven risk assessments and system-level guardrails, enterprises can ensure accountability in AI deployments without relying on full data openness. This makes AI systems more transparent, compliant, and aligned with ethical and operational goals.
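As an illustration of such system-level controls, here is a minimal sketch that wraps an opaque model call behind an input policy check and audit logging. The `generate` function, the blocklist patterns, and the logger name are placeholders for this sketch, not a real product's API.

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai-guardrail")

# Placeholder policy: patterns an organization has decided to refuse.
BLOCKED_PATTERNS = [re.compile(p, re.I) for p in (r"\bssn\b", r"credit card")]

def generate(prompt: str) -> str:
    """Placeholder for a call to any model, open or closed."""
    return f"model output for: {prompt}"

def guarded_generate(prompt: str, user: str) -> str:
    # Input policy: refuse prompts matching blocked patterns.
    if any(p.search(prompt) for p in BLOCKED_PATTERNS):
        log.warning("blocked prompt from %s", user)
        return "Request declined by policy."
    output = generate(prompt)
    # Audit trail: record who asked what, for later risk review.
    log.info("user=%s prompt_len=%d output_len=%d", user, len(prompt), len(output))
    return output

print(guarded_generate("Summarize the MOF framework", user="analyst-1"))
```

The value of this pattern is that accountability lives in the system wrapping the model, so it applies equally to models whose internals and training data remain undisclosed.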
Comments

AI ‘openness’ is starting to feel like a marketing buzzword rather than a true commitment to transparency. If companies pick and choose what to disclose, are we really any better off than with fully closed models?
Technical Support Manager | Support Engineering Manager | Customer Success Leader | Linux Expertise, Troubleshooting, Coaching & Career Development | IT & Customer Support Operations Expert
1 month ago
The shift from viewing AI models as simply open or closed to recognizing a spectrum of openness is revolutionizing how we approach transparency and innovation. DeepSeek-V3 and DeepSeek-R1 highlight this complexity—advancing AI with significant technical contributions while navigating the limitations of full transparency due to privacy, intellectual property, and security concerns. The MOF framework offers a practical solution by emphasizing disclosure and risk assessment over unattainable full openness. This balance allows organizations to innovate responsibly, ensuring compliance and ethical integrity without stifling progress. As we stand at this crossroads, the question arises: Can embracing nuanced transparency drive accountability while still fueling the evolution of AI?
Data and Artificial Intelligence | Engineering Director | DataLake | MLOps | Big Data Analytics | Natural Language Processing | Computer Vision | Generative AI | LLM | CISO | Indian Air Force | Quantum Computing | QML
1 month ago
Indeed, the MOF promotes reproducibility of AI models, which is crucial for democratising AI, accelerating innovation cycles across the world, and building trust in AI. Thanks for sharing.
AI Leader · CEO/CTO · MBA · Founder · Xoogler
1 month ago
Open Source is (necessarily) binary though, with the part of the spectrum of openness we care about being the various license styles that all meet the OSD’s requirements (e.g. permissive MIT vs copyleft GPL).
Head of Product, GenAI Foundation Model Platforms
1 month ago
Good stuff. For the open science model, do you mean full transparency about the “training data” or do you mean full access? Those could be two different things, and while full transparency is manageable, granting full access to all the training data may not be practical.