"On the Validity of Knowledge with AI

"On the Validity of Knowledge with AI

What is epistemology?

Epistemology is the branch of philosophy that studies the nature, origin, limits, and validity of knowledge. It questions what we know, how we know it, and under what conditions knowledge can be considered valid. It encompasses questions such as:

  • What is scientific truth?
  • How do we distinguish mere belief from justified knowledge?
  • What are the criteria for the validity of a theory?

Epistemology can be general (analysis of the foundations of knowledge in general) or applied to specific disciplines (epistemology of physical sciences, mathematics, social sciences, etc.).

How does epistemology relate to semantics and ontologies?

This relationship can be viewed from two angles: semantics and epistemology on the one hand, and ontologies and epistemology on the other. In both cases, the roles can be summarized as follows:

  • Epistemology = the critical study of knowledge (its validity, origin, and limits)
  • Semantics = the study of the meaning of terms and expressions
  • Ontologies = structured models of knowledge, shaped by epistemological choices

In short: epistemology reflects on knowledge, semantics expresses it, and ontologies organize it.

Epistemology and the Scientific Method: Connections with Scientific Domains

Epistemology is closely linked to the scientific approach and scientific fields, as it provides the fundamental principles that guide the production and validation of scientific knowledge.

1. Epistemology and the Scientific Method

The scientific method relies on a rigorous methodology to produce reliable knowledge. It follows several key steps:

  • Observation and Questioning: Identifying a phenomenon to study.
  • Formulation of a Hypothesis: Proposing a possible explanation.
  • Experimentation and Data Collection: Testing the hypothesis through experimentation and observation.
  • Analysis and Validation: Confirming or refuting the hypothesis according to rigorous criteria.
  • Publication and Peer Review: Submitting the results to the critique of the scientific community.

Epistemology analyzes these steps and asks:

  • What constitutes a good scientific hypothesis?
  • What types of evidence are considered valid?
  • What is the distinction between science and pseudoscience?

Thus, it helps to improve and clarify the methods used in scientific research.

2. Epistemology and Scientific Fields

Epistemology directly influences various scientific fields, as each discipline has its own validity criteria and methods.

Formal Sciences (Mathematics, Logic, Computer Science)

These sciences are based on formal systems, where statements are established from axioms and rules of deduction. Epistemology examines:

  • The nature of mathematical truths (Platonism vs. Constructivism).
  • The role of formal models in computer science and artificial intelligence.

Experimental Sciences (Physics, Chemistry, Biology)

They rely on the observation of the real world and experimentation. Epistemology raises questions such as:

  • Are physical laws objective discoveries or human constructions?
  • Is a scientific theory definitive or always revisable? (e.g., the transition from Newtonian mechanics to general relativity).

Human and Social Sciences (Psychology, Sociology, Economics)

These disciplines study complex phenomena involving human behaviors and social interactions. They pose specific epistemological challenges:

  • Can the same methods used in the natural sciences be applied?
  • What role does subjective interpretation play in the production of knowledge?
  • What is the place of cognitive and cultural biases in research?

Information Sciences and Ontologies

Epistemology is crucial in information sciences, notably in artificial intelligence and ontology engineering:

  • Does an ontology reflect an objective reality or a convention?
  • How can knowledge be structured so that it is usable by machines?
  • How can biases be avoided and a faithful representation of the world be ensured?

3. Science, Truth, and Uncertainty

Epistemology reminds us that science is not an accumulation of definitive truths, but a dynamic process of improving knowledge.

  • A scientific theory is always provisional; it can be refuted or refined.
  • Science advances through falsification (Karl Popper): a hypothesis is scientific if it can be tested and potentially refuted.
  • Some disciplines, such as meteorology or economics, carry a greater margin of uncertainty, which calls for specific handling of models and predictions.

  • Epistemology is the foundation of the scientific approach, defining validity criteria and questioning the foundations of scientific disciplines.
  • Each science has its own epistemological challenges, depending on whether it relies on formal models, experiments, or interpretative analyses.
  • The sciences are constantly evolving, not towards an absolute truth, but towards increasingly precise and robust models.
  • Thus, epistemology helps us better understand how knowledge is constructed and how it can be used, criticized, and improved.

Epistemology and Data Science: Between Scientific Approach and Knowledge Formalization

Data science plays a pivotal role in the production and structuring of knowledge. It contributes both to the scientific approach, by enabling the analysis and validation of hypotheses, and to the formalization of knowledge, by organizing information to make it usable by humans and machines.

1. Data Science and the Scientific Approach

Data science disciplines—such as statistics, machine learning, and big data—enhance the scientific method by facilitating:

  • Exploration of complex phenomena: Analyzing large volumes of data to detect correlations and formulate hypotheses.
  • Experimentation and simulation: Testing models on extensive and varied datasets.
  • Automation of hypothesis validation: Implementing statistical tests and predictive models (a minimal sketch follows this list).
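
To make the last point concrete, here is a minimal sketch of what automating one step of hypothesis validation can look like, written in Python with NumPy and SciPy on purely synthetic data; the group sizes, effect size, and the 5% threshold are illustrative assumptions, not values taken from any real study.

```python
# Minimal sketch: automating one step of hypothesis validation with a
# two-sample t-test on synthetic data. All numbers are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)                        # fixed seed for reproducibility
control = rng.normal(loc=10.0, scale=2.0, size=200)    # baseline group
treated = rng.normal(loc=10.6, scale=2.0, size=200)    # group with a small shift

# Null hypothesis H0: both groups share the same mean.
t_stat, p_value = stats.ttest_ind(treated, control, equal_var=True)

alpha = 0.05                                           # conventional significance threshold
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < alpha:
    print("H0 rejected at the 5% level: the difference is unlikely to be chance alone.")
else:
    print("H0 not rejected: the data do not support a difference in means.")
```

The automation covers only the computation: choosing an appropriate test, judging the quality of the data, and interpreting the result remain human, epistemological decisions.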

Epistemological Issues Raised by Data Analysis

While data science facilitates empirical validation, it also raises questions:

  • Correlation vs. causation: Detecting relationships between variables does not imply a causal link. Epistemology reminds us that any correlation should be interpreted cautiously (see the sketch after this list).
  • Bias and data quality: Scientific analysis is reliable only if the data are representative and unbiased. History is replete with examples where biases led to misinterpretations.
  • Falsifiability and reproducibility: Data analysis should not become merely an exercise in hypothesis confirmation but should allow for testing refutable predictions.
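
As an illustration of the first point, the sketch below generates purely synthetic data in which two variables are strongly correlated only because both depend on a hidden confounder; the variable names and coefficients are invented for the example.

```python
# Minimal sketch: two variables can be strongly correlated without any causal
# link between them, because both depend on a hidden confounder (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

temperature = rng.normal(25, 5, n)                          # hidden confounder
ice_cream_sales = 3.0 * temperature + rng.normal(0, 5, n)   # driven by temperature
drownings = 0.5 * temperature + rng.normal(0, 2, n)         # also driven by temperature

r = np.corrcoef(ice_cream_sales, drownings)[0, 1]
print(f"corr(ice cream sales, drownings) = {r:.2f}")        # large, yet neither causes the other
```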

2. Data Science and Knowledge Formalization

Data science not only aids in scientific validation but also in structuring and formalizing knowledge, notably through:

  • Databases: Relational, graph-based, NoSQL.
  • Ontologies and the Semantic Web: Organizing knowledge as graphs of concepts linked by formal relationships (e.g., OWL, RDF); a small example follows this list.
  • Natural Language Processing (NLP): Extracting knowledge from texts and making it available in a usable form.
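
As a small illustration of the second point, the sketch below declares a few explicit RDF triples with the Python rdflib library; the namespace and the concepts it uses (Epistemology, Knowledge, and so on) are illustrative assumptions, not an existing vocabulary.

```python
# Minimal sketch: representing knowledge as explicit RDF triples with rdflib.
# The namespace and concepts are invented for this example.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/knowledge#")   # hypothetical namespace
g = Graph()
g.bind("ex", EX)

g.add((EX.Epistemology, RDF.type, EX.Discipline))
g.add((EX.Epistemology, RDFS.label, Literal("Epistemology")))
g.add((EX.Epistemology, EX.studies, EX.Knowledge))
g.add((EX.Ontology, RDF.type, EX.KnowledgeModel))
g.add((EX.Ontology, EX.organizes, EX.Knowledge))

print(g.serialize(format="turtle"))               # readable by humans and machines
```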

Epistemological Issues in Knowledge Formalization

  • Modeling and reduction of reality: A database or ontology is always a simplification of the world. How can we ensure that this formalization is accurate and relevant?
  • Interpretation of concepts: The terms used to structure knowledge carry meanings that can vary across disciplines and contexts.
  • Access to referents and meaning: Machines manipulate symbols (unique identifiers), but how can we ensure that these symbols correspond to the concepts we have in mind?

In summary, data science significantly contributes to both the scientific method and the formalization of knowledge. However, it is essential to remain vigilant about the epistemological challenges it presents to ensure rigorous and meaningful knowledge production.

3. Towards a Data Epistemology?

  • Data science contributes to the acquisition, validation, and structuring of knowledge, but it requires epistemological vigilance to avoid the pitfalls of biases, hasty interpretations, and overly simplistic models.
  • Epistemology reminds us that data are never neutral: they are collected, selected, and interpreted based on specific objectives.
  • The production of knowledge always relies on a dialogue between humans and machines: while algorithms can extract trends, it is always humans who provide meaning and validate the relevance of models.
  • Therefore, an epistemological reflection on data science is essential to ensure a rigorous and informed use of the knowledge produced.

Artificial Intelligence (AI) and Large Language Models (LLMs) in the Epistemological Context and Data Science

The rise of artificial intelligence (AI), particularly Large Language Models (LLMs) like GPT, raises fundamental questions about the production, validation, and structuring of knowledge. These models sit at the intersection of data science, epistemology, and knowledge formalization, but their functioning raises several critical issues.

1. LLMs: Between Statistical Processing and Meaning Production

Large Language Models (LLMs) are trained on vast text corpora and generate content based on word sequence probabilities. Their operation relies on three fundamental characteristics:

  • Statistical Language Modeling: They do not comprehend concepts as humans do but produce sentences based on the frequencies and structures observed in training data (see the sketch after this list).
  • Absence of Direct Referent: Unlike humans, who link words to external realities (referents), LLMs manipulate symbols without direct access to the real world.
  • Implicit Learning of Regularities: By observing millions of documents, these models capture linguistic and discursive regularities without formal validation of the knowledge they produce.
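
To illustrate the first characteristic, here is a deliberately tiny sketch of statistical language modeling: a bigram model that estimates next-word probabilities from raw frequencies in a three-sentence corpus. Real LLMs use neural networks over tokens and vastly larger corpora, but the epistemological point is the same: the output reflects observed regularities, not access to referents.

```python
# Minimal sketch of statistical language modeling: estimate P(next word | word)
# from bigram counts in a tiny, invented corpus.
from collections import Counter, defaultdict

corpus = [
    "knowledge is justified true belief",
    "knowledge is structured by ontologies",
    "science is a dynamic process",
]

bigrams = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        bigrams[prev][nxt] += 1

def next_word_distribution(word):
    """Probability of each possible next word, estimated from raw counts."""
    counts = bigrams[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_distribution("is"))
# e.g. {'justified': 0.33..., 'structured': 0.33..., 'a': 0.33...}
```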

Epistemological Issue: An LLM does not construct structured knowledge based on evidence or falsifiable hypotheses. It mimics credible discursive forms without guaranteeing their validity.

2. AI Facing the Challenges of Knowledge Validation and Interpretation

Data science and AI have profoundly transformed knowledge production but pose several challenges:

  • Veracity and Bias: LLMs do not verify the accuracy of the information they produce, which can lead to errors or biases present in their training data.
  • Lack of Explicit Reasoning: Unlike symbolic approaches (ontologies, formal logic), LLM-type models do not follow explicit logic to relate facts to each other.
  • Production of "Unfounded Knowledge": These models can generate content that seems plausible without being scientifically or philosophically justified.

Fundamental Issue: LLMs do not produce knowledge in the epistemological sense; they generate statistically coherent linguistic assemblies. Therefore, a critical approach is necessary regarding the content they produce.

3. AI and Knowledge Structuring: A Challenge for Ontologies and the Semantic Web

Ontologies (OWL, RDF, etc.) aim to structure knowledge by associating concepts with explicit relationships, whereas Large Language Models (LLMs) operate on a statistical and non-formal basis. This presents a problem:

  • Ontologies: Enable rigorous modeling of knowledge but are often rigid and require significant human intervention.
  • LLMs: Are flexible and capable of processing massive data volumes but do not guarantee formal consistency or the validity of the knowledge produced.

Can the two approaches be combined? Some research seeks to use LLMs to automate the creation and enrichment of ontologies (e.g., entity extraction, concept alignment, definition generation). However, the risks of errors and inconsistencies remain high.
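
The sketch below shows one hedged way such a combination is often imagined: a language model proposes candidate triples extracted from text, and every suggestion passes through an explicit validation step before entering the ontology. The function suggest_candidate_triples is a hypothetical placeholder for an LLM call, and the data structure and review step are assumptions made for illustration, not a description of any specific tool.

```python
# Hedged sketch of an LLM-assisted ontology-enrichment loop with human validation.
# `suggest_candidate_triples` stands in for a real LLM extraction call.
from dataclasses import dataclass

@dataclass
class CandidateTriple:
    subject: str
    predicate: str
    obj: str
    source_text: str          # provenance kept for the reviewer

def suggest_candidate_triples(text: str) -> list[CandidateTriple]:
    """Placeholder for an LLM extraction call; returns a canned suggestion here."""
    return [CandidateTriple("Epistemology", "studies", "Knowledge", text)]

def human_review(candidate: CandidateTriple) -> bool:
    """Placeholder for the indispensable validation step: a person (or a formal
    consistency check) accepts or rejects each suggestion."""
    print(f"Review: ({candidate.subject}, {candidate.predicate}, {candidate.obj}) "
          f"[from: {candidate.source_text!r}]")
    return True               # in a real workflow this would be an explicit decision

suggestions = suggest_candidate_triples("Epistemology studies the validity of knowledge.")
accepted = [c for c in suggestions if human_review(c)]
print(f"{len(accepted)} triple(s) accepted for insertion into the ontology.")
```

The key design choice is that nothing enters the formal model without passing the review step, which keeps the epistemological responsibility on the human side.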

Epistemological Issue: If AI becomes a tool for formalizing knowledge, how can we ensure that models capture conceptual distinctions and do not produce a biased or erroneous ontology?

4. Towards a Hybridization Between Symbolic Reasoning and Statistical AI?

The current challenges in AI within the knowledge domain encourage consideration of a hybridization between symbolic reasoning and statistical approaches:

  • Complementarity of AI and Ontologies: Use AI to extract and suggest relationships between concepts, but validate these relationships through formal structures (see the sketch after this list).
  • Convergence of Semantic Web and Deep Learning: Develop models integrating ontologies to provide a more robust interpretation of the results produced.
  • Interpretability and Explanation: LLMs should be accompanied by mechanisms explaining the source and structure of the information generated.
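
As a minimal illustration of what "validate these relationships through formal structures" can mean, the sketch below checks a suggested relation against the domain and range declared in a small ontology fragment, using rdflib; the namespace, classes, and the deliberately naive consistency rule are illustrative assumptions.

```python
# Hedged sketch: accept a suggested triple only if the subject and object types
# match the domain and range declared for the predicate in the ontology.
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.org/knowledge#")   # hypothetical namespace
g = Graph()

# Ontology fragment: 'authoredBy' relates a Document to a Person.
g.add((EX.authoredBy, RDF.type, RDF.Property))
g.add((EX.authoredBy, RDFS.domain, EX.Document))
g.add((EX.authoredBy, RDFS.range, EX.Person))
# Known instance data.
g.add((EX.article42, RDF.type, EX.Document))
g.add((EX.alice, RDF.type, EX.Person))

def is_consistent(subject, predicate, obj) -> bool:
    """Naive check: subject/object types must match the predicate's domain/range."""
    domain = g.value(predicate, RDFS.domain)
    rng = g.value(predicate, RDFS.range)
    ok_domain = domain is None or (subject, RDF.type, domain) in g
    ok_range = rng is None or (obj, RDF.type, rng) in g
    return ok_domain and ok_range

print(is_consistent(EX.article42, EX.authoredBy, EX.alice))   # True: types match
print(is_consistent(EX.alice, EX.authoredBy, EX.article42))   # False: types reversed
```

Such a check validates only structural consistency; whether the suggested relation is actually true remains a matter for human and scientific judgment.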

The key question remains epistemological: Can the construction and validation of knowledge be automated without human intervention? Or should AI remain a tool to assist in the formalization and interpretation of knowledge, with indispensable human oversight?

Conclusion: AI, a Powerful but Not Autonomous Tool in Knowledge Production

  • AI and LLMs do not replace the scientific and epistemological approach. They allow for the manipulation and organization of knowledge but cannot guarantee its validity without critical human intervention.
  • The challenge is not only technological but also philosophical and methodological. Knowledge is not limited to statistical correlations; it relies on definitions, principles of validity, and formal structuring.
  • The future likely lies in the integration between AI and knowledge modeling, while maintaining epistemological rigor. AI can assist in the formalization of knowledge, but validation must remain a scientific and human process.

Emmanuel Mouquet

Exploring life, Social Entrepreneur at Zeee

2 weeks ago

Great insights, thanks Nicolas! It would be interesting to get your views on what some people call the "AI mad cow disease" which could happen as LLMs will use more and more content generated by LLMs themselves. Will LLMs be able to distinguish "validated data" from LLMs generated ones without human validation? Do we face a risk of bigger and bigger LLMs' hallucinations?

Nicolas Figay

Model Manager | Enterprise Architecture & ArchiMate Advocate | Expert in MBSE, PLM, STEP Standards & Ontologies | Open Source Innovator(ArchiCG)

3 weeks ago
Shaun Kenyon

Making Space Compliance Easier for Everyone. Training AI on Space Tech. ex-Surrey, ex-Spire Global.

3 weeks ago

A wonderful, lightspeed trip through the issues facing us in AI-powered knowledge solutions, and yet all bases covered. Wish I could write this succinctly.

Fred Simkin

Developing and delivering knowledge based automated decisioning solutions for the Industrial and Agricultural spaces.

3 weeks ago

Great piece

Dickson Lukose

Knowledge Scientist

3 weeks ago

Thoroughly enjoyed reading it. Agree with your conclusion!
