Personal Portals: Semantic, Rule-Based Personalization
A while ago I created a personal paper on paper.li about “artificial intelligence” to track news stories about AI. This is a very limited form of personalization, but it does a decent job of getting me some relevant, personalized information. However, I still have to do quite a bit of manual sifting through everything that is pushed at me, since thousands of stories contain the phrase “artificial intelligence” (with some frequency, and/or in the title, etc.). I also cannot customize my paper to a more specific subject matter that might involve conjunctions, disjunctions, and negations, nor can I expect my topics to be considered semantically, i.e., not literally.
Ideally, I would like something that allows me to get specific stories about artificial intelligence (AI), for example, about the latest in a specific AI subject, namely natural language processing (NLP), but not all such stories either. For example, I might be interested in catching AI/NLP stories that are specifically discussing semantic and ontological approaches to NLP. I also do not want stories that talk about AI and NLP in the context of chatbots. And let us say that I am a bit more particular on this very specific paper: I want all such stories in the context of Asia (perhaps I want to track all such work that is happening in Asia).
I am very demanding, I know. But I have a feeling we all are (about various other stories, of course). So ideally, I want a system that would get stories that satisfy the following condition:
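Pieced together from the description above, the condition reads roughly: artificial intelligence AND natural language processing AND (semantic approaches OR ontological approaches) AND NOT chatbots AND Asia. Reading "semantic and ontological" as a disjunction is an assumption on my part. As a minimal sketch, such a rule could be encoded in Python as a small expression tree; the names `Topic`, `And`, `Or`, `Not`, `matches`, and `PAPER_RULE` are hypothetical, and each story is assumed to have already been reduced to a set of its key topics.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Topic:
    name: str  # a subject that must be a key topic of the story


@dataclass(frozen=True)
class And:
    parts: Tuple["Rule", ...]  # conjunction: every part must hold


@dataclass(frozen=True)
class Or:
    parts: Tuple["Rule", ...]  # disjunction: at least one part must hold


@dataclass(frozen=True)
class Not:
    part: "Rule"  # negation: the part must not hold


Rule = Union[Topic, And, Or, Not]


def matches(rule, key_topics):
    """Return True if the set of key topics satisfies the rule."""
    if isinstance(rule, Topic):
        return rule.name in key_topics
    if isinstance(rule, And):
        return all(matches(p, key_topics) for p in rule.parts)
    if isinstance(rule, Or):
        return any(matches(p, key_topics) for p in rule.parts)
    if isinstance(rule, Not):
        return not matches(rule.part, key_topics)
    raise TypeError(f"unknown rule node: {rule!r}")


# Hypothetical encoding of the condition for this particular paper.
PAPER_RULE = And((
    Topic("artificial intelligence"),
    Topic("natural language processing"),
    Or((Topic("semantic approaches"), Topic("ontological approaches"))),
    Not(Topic("chatbots")),
    Topic("Asia"),
))
```

Calling `matches(PAPER_RULE, topics)` with the key topics of a story then says whether the story belongs in the paper. How those key topics are obtained is the hard part, and it is what the rest of this post is about.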
Given the above condition, I also expect the system to be smart enough to consider the subjects/topics in it semantically, or conceptually, and not literally. Specifically, I expect the system to do the following:
- Not push at me every article that simply mentions “artificial intelligence”. By putting that phrase in the condition, I expect the system to decide whether artificial intelligence is a key subject/topic in the story. Of course, the same holds for Asia: it must not merely be mentioned in passing; Asia (or a semantically related entity) must be a key subject in the story.
- Get me stories where natural language processing is a key topic (along with AI), even if the phrase ‘natural language processing’ is never mentioned but semantically/conceptually related phrases are used instead (e.g., computational linguistics or natural language understanding).
That would be nice. Wouldn’t it?
Incidentally, the first requirement above essentially asks the system for good semantic precision, and the second asks for good semantic recall.
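To make the two requirements a little more concrete, here is a small Python sketch. It assumes a hand-written table mapping surface phrases to concepts (a stand-in for a real ontology) and a deliberately crude notion of "key topic": a concept counts as key only if its phrases occur at least `min_mentions` times. The names `CONCEPT_OF`, `key_concepts`, and `min_mentions`, as well as the threshold value, are illustrative assumptions and do not describe any existing system.

```python
from collections import Counter

# Hypothetical phrase-to-concept table. Several surface phrases map to the
# same concept, which is what gives semantic recall.
CONCEPT_OF = {
    "artificial intelligence": "artificial intelligence",
    "natural language processing": "natural language processing",
    "computational linguistics": "natural language processing",
    "natural language understanding": "natural language processing",
}


def key_concepts(text, min_mentions=2):
    """Return the concepts that look like key topics of the text.

    A concept is kept only if its phrases occur at least `min_mentions`
    times in total, so a single passing mention is ignored. This is a
    crude stand-in for a real salience measure (semantic precision).
    """
    lowered = text.lower()
    counts = Counter()
    for phrase, concept in CONCEPT_OF.items():
        counts[concept] += lowered.count(phrase)
    return {concept for concept, n in counts.items() if n >= min_mentions}
```

With this table, a story that talks about computational linguistics and natural language understanding is credited with the concept "natural language processing" even though that exact phrase never appears, while a story that mentions artificial intelligence once in passing is credited with nothing.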