From Concepts to Conceptualizations for Knowledge Hypergraphs
Concepts form the very core of all our work on representing, accumulating, and manipulating knowledge with algorithms. Without them, we would simply have no framework to even start thinking about knowledge. Concepts are the atoms or nodes with which we create “molecules” like knowledge graph triples, “tissues” like knowledge hypergraphs, and “organs” like algorithms that learn and reason. And concepts help us distinguish knowledge graphs from other graph-structured things.
But concepts are not stored, static objects – like marbles in a box – that spontaneously arise as if by magic.
In a recent post, Alan Morrison synthesizes and reminds us of fundamental characteristics of concepts that are a consequence of their dynamic nature. He underscores the importance of an earlier post – essential reading! – by Gadi Singer [my former boss : )] about the foundational processes by which concepts are created: conceptualization.
Gadi’s thinking mirrors closely the fundamental shift in basic psychological research (see in particular the extensive, fascinating work of Lawrence Barsalou) from a view that emphasizes “semantic memory” (a repository of objective, stable, modality- and context-independent, already-formed concepts or facts) to a "generative" one that focuses not on concepts as objects but on robust operations or methods for constructing and shaping them.
This newer, second view emphasizes how humans operate on sensory and proprioceptive inputs along with prior knowledge (as inputs) to adaptively create and modify dynamic, grounded, sometimes idiosyncratic, and usually situation-specific combinations of inputs forming elements of knowledge, now called conceptualizations (with an "s") to emphasize the stark contrast from fixed, objective concepts.
This dramatic shift is from a focus on the products (concepts and conceptualizations -- which we represent as knowledge graph nodes and hypergraphs) to an emphasis on the variable processes and mechanisms that create and adapt them (conceptualization or concept formation).
Conceptualization or concept formation plays a key role in enabling us to represent not only seemingly fixed, nearly-universal concepts but also distinct, frequently-contradictory, changing conceptualizations of what at first glance seems to be the same thing. These varying conceptualizations are called -- when we associate them with people, groups of people, or schools of thought -- perspectives or points of view, i.e., idiosyncratic or systematically divergent ways of conceiving the same thing. Some simple examples of different perspectives are the often dramatic differences between buyers and sellers when they think of "a good deal", between doctors and patients when they think of "abdominal pain", or between children and professional athletes when they think of "a good player". An even more subtle and interesting example is the difference between expert physicians working on problems inside or outside of their areas of specialization.
These are not examples of differing terminology or simple ambiguity – the concepts in question refer to or denote the same entities and often use the same labels. Different people simply think of the same things in different ways.
The clearest example that I have to hand is rice.
In all of these cases, rice denotes the same entity: the seeds of the rice plant. In that sense, the term rice is not really ambiguous because it doesn't refer to different things – but it does have different meanings and different implications for different people. According to what they know and what they're doing, people have different conceptualizations or "mental models" of rice.
This is a fundamental problem for knowledge representation and for AI generally because the standard, naïve assumption is that we are trying to represent a single, coherent, absolute truth about how the world "really" is. But if we simply aggregate all of the features mentioned, we will have concepts with contradictory or unknown attributes and values:
Which of these assertions or attributes represent the one true concept of rice?
We have no answer to this question. The key idea here is that one single, fixed, universal concept of rice will not suffice. A simple collection of non-contradictory knowledge graph triples or hypergraphs won't ever be sufficient to enable different perspectives or conceptualizations.
To represent knowledge effectively, we need not only to store but also to assemble triples and shape different but related concepts – usually on the fly – for different users and different purposes, in different situations. AI is inching reluctantly in this direction, but we are not there yet for concepts or knowledge.
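A minimal Python sketch may make the contrast between naive aggregation and on-the-fly assembly more concrete. Everything in it is an assumption made for illustration: the perspective tags, the predicates, the attribute values, and the function names `conceptualize` and `naive_aggregate` are hypothetical and do not come from any particular knowledge graph system.

```python
from collections import defaultdict

# Hypothetical triples, each tagged with the perspective that asserts it.
# The attribute values are illustrative only.
TRIPLES = [
    ("rice", "is_a", "crop", "farmer"),
    ("rice", "needs", "flooded paddies", "farmer"),
    ("rice", "is_a", "carbohydrate source", "nutritionist"),
    ("rice", "is_a", "ingredient", "chef"),
    ("rice", "cooking_time_minutes", "18", "chef"),
]


def conceptualize(subject, perspective, triples=TRIPLES):
    """Assemble a perspective-specific view of `subject` on the fly."""
    view = defaultdict(list)
    for s, p, o, who in triples:
        if s == subject and who == perspective:
            view[p].append(o)
    return dict(view)


def naive_aggregate(subject, triples=TRIPLES):
    """Pool every assertion about `subject`, ignoring who asserts it."""
    view = defaultdict(list)
    for s, p, o, _who in triples:
        if s == subject:
            view[p].append(o)
    return dict(view)


chef_view = conceptualize("rice", "chef")
pooled = naive_aggregate("rice")
print(chef_view)       # {'is_a': ['ingredient'], 'cooking_time_minutes': ['18']}
print(pooled["is_a"])  # ['crop', 'carbohydrate source', 'ingredient']
```

The pooled view ends up with three competing values for the same predicate, while the perspective-specific view stays small and coherent. A real system would need far richer ways of selecting and weighting triples than a single tag.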
Methods for Assembling Concepts and Conceptualizations
Static concepts and adaptive conceptualizations (with an "s"), then, are the output of cognitive processes that:
These foundational mechanisms of calibrating, prioritizing, and selectively aggregating knowledge triples underlie the different conceptualization processes that we use to assemble, shape, and adapt hypergraphs for both concepts and conceptualizations.
Right now, we can identify at least four "generative" methods for conceptualization, i.e., for the off-line or on-line construction of knowledge hypergraphs. We'll need significant progress in distinguishing, automating, mixing, and evaluating all of these kinds of conceptualization to make AI really work.
Extensional conceptualization. We can define knowledge hypergraphs as we would sets: by declaring that the collection of instances x, y, z, etc. constitutes a group that we want to label, like a software engineer's enumerations. Definition by declaration requires no evidence or criteria. One clear example is the concept of month as one of {January, February, March, …, December}. Since there are no criteria offered for defining this ad hoc concept, we cannot judge whether the list is correct, coherent, or complete – only that it is as defined. And we have no criteria to correctly add or remove items from such a collection.
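As a small illustration of definition by declaration, here is a sketch using Python's standard `enum` module. The helper name `is_month` is hypothetical; the point is only that membership is decided by looking a label up in the declared list, with no criteria involved.

```python
from enum import Enum

# The concept is nothing more than its declared members.
Month = Enum(
    "Month",
    [
        "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
        "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
    ],
)


def is_month(label):
    """Membership is a lookup in the declared list, not a test of features."""
    return label.upper() in Month.__members__


print(is_month("March"))    # True
print(is_month("Smarch"))   # False
print(len(Month))           # 12
```

Nothing in the code can tell us whether a thirteenth item would belong: the declaration is the whole definition.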
Intensional conceptualization. We can define knowledge hypergraphs as a collection of features: by describing what instances will be like, i.e., by the types of triples or predicates included. Any instances that have all (or nearly all) of the defining features are instances of that concept and receive the label that we have given it. Definition by description makes explicit the criterial features that define a category. Queries for relational databases also work this way: the query encompasses a list of fields with specific values (defining an ad hoc, unlabeled concept) and only rows that match the query are returned as instances of that query-concept.
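Definition by description can be sketched as a filter over feature dictionaries, much like a query over rows. The features, the instances, and the `min_fraction` parameter (a hypothetical way of expressing "nearly all") are assumptions made for this example.

```python
def matches(instance, criteria, min_fraction=1.0):
    """True if `instance` has at least `min_fraction` of the criterial features."""
    hits = sum(1 for key, value in criteria.items() if instance.get(key) == value)
    return hits >= min_fraction * len(criteria)


# Hypothetical criterial features for a concept we choose to label "bird".
BIRD = {"has_feathers": True, "lays_eggs": True, "can_fly": True}

# Hypothetical instances described by their features.
INSTANCES = {
    "robin": {"has_feathers": True, "lays_eggs": True, "can_fly": True},
    "penguin": {"has_feathers": True, "lays_eggs": True, "can_fly": False},
    "bat": {"has_feathers": False, "lays_eggs": False, "can_fly": True},
}

strict = [name for name, feats in INSTANCES.items() if matches(feats, BIRD)]
lenient = [name for name, feats in INSTANCES.items() if matches(feats, BIRD, 0.66)]
print(strict)   # ['robin']
print(lenient)  # ['robin', 'penguin']
```

Unlike the extensional case, the criteria here tell us how to judge a new instance we have never seen before.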
Associative conceptualization. We can also define knowledge hypergraphs as a collection of instances that share some relation (or predicate) to a given target. Examples include seemingly ad hoc concepts like things to take on a vacation or surgical equipment – Barsalou's ad hoc and goal-derived categories. Definition by association makes explicit the target concept (camping, surgery) but not the kinds of relations that are considered relevant. And quite often the instances of a concept defined this way share few tangible similarities: tents, fishing tackle, food, etc. are not overtly or perceptibly similar, for example.
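One way to picture definition by association is to collect every subject that has any relation at all to the target. The triples and predicate names below are hypothetical, and `associated_with` is an illustrative helper rather than an established API.

```python
# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("tent", "used_for", "camping"),
    ("fishing tackle", "brought_on", "camping"),
    ("food", "consumed_during", "camping"),
    ("scalpel", "used_in", "surgery"),
    ("forceps", "used_in", "surgery"),
]


def associated_with(target, triples):
    """Collect every subject linked to `target`, whatever the predicate is."""
    return {s for s, _predicate, o in triples if o == target}


print(sorted(associated_with("camping", TRIPLES)))
# ['fishing tackle', 'food', 'tent']
```

The predicate is deliberately ignored, which mirrors the point in the text: the target is explicit, but the kinds of relations that count are not.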
This "definition" by association seems to be a common approach in language models, as well: they model target concepts as strings, then identify associated strings within a window of x tokens on either side of the target string. Leveraging a collection of co-occurring strings as a "concept" or "definition" of a target string provides usable and useful results on many tasks but raises the question of how a contextual string that we can't interpret might "define" the concept of another string that we don't know. It might all just be gibberish. This specific case seriously calls into question whether we are in fact talking about real concepts rather than just patterns of uninterpreted behaviors in the case of LLMs.
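The windowed co-occurrence idea can be sketched with simple counts. This is only a count-based toy under stated assumptions (whitespace tokenization, a hypothetical `context_profile` helper, a made-up sentence); it is not a description of how any actual language model is trained.

```python
from collections import Counter


def context_profile(tokens, target, window=2):
    """Count the tokens that occur within `window` positions of `target`."""
    profile = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            start = max(0, i - window)
            profile.update(tokens[start:i])               # left context
            profile.update(tokens[i + 1:i + 1 + window])  # right context
    return profile


# Made-up example sentence, split on whitespace.
tokens = "we pack a tent for camping and pack food for camping trips".split()
profile = context_profile(tokens, "camping", window=2)
print(profile["for"], profile["tent"], profile["we"])  # 2 1 0
```

The resulting profile is just a bag of neighboring strings. Nothing in it says what "camping" or "tent" means, which is the concern raised above.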
Prototypical conceptualization. We can also define knowledge hypergraphs in terms of subjective similarity to some instance (or group of instances) that we take as a prototype -- like distance to a clustering centroid. Definition by prototype makes explicit some target (e.g., robins) as prototypical of a concept (e.g., birds) but does not make explicit the specific features that are considered relevant -- the objective, defining features. Using prototypes, a concept like bird is defined as anything similar to robins or a concept like politician might be defined as anyone similar to Mikhail Gorbachev or to Donald Trump.
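Definition by prototype can be sketched as a distance threshold around a chosen instance. The feature vectors, the threshold, and the helper names are all hypothetical, and Euclidean distance is just one of many possible similarity measures.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_member(instance, prototype, max_distance):
    """An instance belongs to the concept if it is close enough to the prototype."""
    return distance(instance, prototype) <= max_distance


# Hypothetical features: (relative size, flies, sings), each scaled to 0..1.
ROBIN = (0.2, 1.0, 1.0)      # chosen as the prototype for "bird"
SPARROW = (0.15, 1.0, 0.8)
OSTRICH = (1.0, 0.0, 0.0)

print(is_member(SPARROW, ROBIN, max_distance=0.5))  # True
print(is_member(OSTRICH, ROBIN, max_distance=0.5))  # False
```

Which features go into the vector and where the threshold sits are exactly the choices that this kind of definition leaves implicit, so an ostrich can fall outside "bird" simply because of the prototype we picked.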
We think of, understand, and model both human and machine knowledge in terms of concepts that we store (for algorithms) as nodes in knowledge graphs and conceptualizations as hypergraphs. Concepts and conceptualizations, then, form the very core of all our work on representing, accumulating, and manipulating knowledge with algorithms.
Our progress in building knowledge systems will be accelerated or stymied to the extent that we can understand conceptualization operations – how concepts are built, structured, and adapted – not merely by how many concepts we've accumulated or how frequently we encountered them.
Interested in research, monitoring, and investigation of everything related to the Earth, the Earth’s atmosphere, and the links with the universe, the hourglass
8 months ago: Nice
25 years experience | scalable work systems deployment | project driven | world scale
11 months ago: Do you think that new techniques in computational semantics that are grounded in a complete and consistent ontology could leverage LLMs to generate conceptualizations for a given context?
Founder Proprietor at Knowledge Enabler Systems
1 year ago: Yes, concepts are dynamic, evolving and adaptive but most models of concepts are passive. Object-Oriented Analysis and Design of software provides us the means for creating and activating the concepts by adding "operations" to the software objects of concepts. By describing "concepts" as atoms and molecules the emphasis is given solely to ONE KIND OF concepts. The realm of concepts has two kinds of concepts, namely 1) stand-alone concepts and 2) LINKING concepts. The LINKING concepts are many and significant. All the verbs or predicates are the LINKING concepts. The atom and molecule analogy does NOT have sufficiently prominent LINKING components in the analogy. We have to find an appropriate analogy in which LINKING elements are as prominent and variegated as the nodes or vertices in a network or a graph. In a knowledge graph there are many edges having multiple subject vertices and object vertices. So, a knowledge graph is actually a HYPERGRAPH. With all these extensions we can now model concepts as a dynamic, adapting and evolving system of concepts.
Terminology Consultant and Trainer at BIK Terminology
1 year ago: Remember our conversation about ISO 704, Mike Dillinger, PhD? Essentially, you summarized a good chapter here. The one distinction that terminologists make is that when we analyse existing concepts, we start by identifying the domain. That way, it is clear whether we are dealing with rice from the agricultural, the nutritionist's or the chef's perspective.
Perfect Knowledge Conception AGI plus The Living Paradigm Connector
1 year ago: Great article! The taste of Intelligence Theory is almost palpable. Intelligence Theory is based on the concept of Information Theory, where you can bravely expand meaningfully the riddles of input and go down different paths through the power of deduction. There's also Systems Theory, Graph Theory, and something new I think: Conception Theory or Concept Theory. Maybe we need to teach Data Science to an AI agent. I think principles can be taught but we need to teach AI to look really hard at the "contextual lay of the land" for applications like Artificial Science or Artificial Technical Invention.