A Semantic Retrieval System for Web Platforms

Gilles-Antoine Nys

Geo-ICT Analyst, TL, PhD

Published: October 4, 2019

Imagine that you manage a platform of services on the Web. As an ambitious manager, you naturally want to grow your business and increase the number of your customers. How, then, can you be sure that customers find the services relevant to their needs when they visit your website? And, given that they may know nothing about your field of expertise, how can you be sure that you understand each other and that you sell them the service they are actually searching for?

The problem is that customers do not have sufficient knowledge of what you are able to achieve. After all, “you are the expert”. On the other hand, a good manager needs to improve their impact on the market while keeping investments in balance. It is important to understand potential users’ needs in order to better present your service and make the sale. The difficulty lies in the fact that a semantic gap might occur in your communication, particularly if your advertising is done through a static web page.

One of our research projects proposes to reduce the semantic gap between customers and service providers on a web market platform. The study aimed to create an algorithm that understands users’ queries and responds to them automatically with an expert answer. This was done using Natural Language Processing (NLP) and ontologies. While the former covers ways to program computers to process and analyze large amounts of natural language data, the latter structure knowledge within well-defined graph structures.

To put things in context, in a previous paper we proposed a knowledge base that defines what a processing chain, or service, is and what its constituent parts are. This knowledge management was illustrated in the scope of a remote sensing services platform. Within this topic, we explored the possibility of setting up an algorithm that gathers the terms expressed by users and then structures them in another kind of knowledge graph for reuse. Such a knowledge graph, also called a thesaurus, structures natural language and formalizes the relations between terms (e.g. synonyms, broader and narrower terms). Finally, this knowledge can be mined to support communication and provide a relevant answer in a variety of uses.
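To make the idea of a thesaurus more concrete, here is a minimal sketch in Python of how synonyms and broader/narrower relations could be stored and queried. It is not the data model of the paper: the class names (Concept, Thesaurus), the methods (add, resolve, expand) and the example terms are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class Concept:
    """One thesaurus entry: a preferred label plus its relations."""
    label: str
    synonyms: Set[str] = field(default_factory=set)
    broader: Set[str] = field(default_factory=set)
    narrower: Set[str] = field(default_factory=set)


class Thesaurus:
    """Hypothetical in-memory thesaurus (illustrative only)."""

    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self._index: Dict[str, str] = {}  # any known term -> preferred label

    def add(self, label: str, synonyms: Iterable[str] = (),
            broader: Iterable[str] = ()) -> None:
        concept = self.concepts.setdefault(label, Concept(label))
        self._index[label.lower()] = label
        for term in synonyms:
            concept.synonyms.add(term)
            self._index[term.lower()] = label
        for parent in broader:
            concept.broader.add(parent)
            # Keep the inverse (narrower) relation in sync.
            parent_concept = self.concepts.setdefault(parent, Concept(parent))
            self._index.setdefault(parent.lower(), parent)
            parent_concept.narrower.add(label)

    def resolve(self, term: str) -> Optional[str]:
        """Map a user term (label or synonym) to its preferred label."""
        return self._index.get(term.lower())

    def expand(self, term: str) -> Set[str]:
        """Return the preferred label, its synonyms and its narrower labels."""
        label = self.resolve(term)
        if label is None:
            return set()
        concept = self.concepts[label]
        return {label} | concept.synonyms | concept.narrower


thesaurus = Thesaurus()
thesaurus.add("vegetation", synonyms=["plant cover"])
thesaurus.add("forest", synonyms=["woodland"], broader=["vegetation"])
print(thesaurus.resolve("Woodland"))             # forest
print(sorted(thesaurus.expand("plant cover")))   # ['forest', 'plant cover', 'vegetation']
```

Expanding a user term to its synonyms and narrower terms is one simple way such a structure could help match a layperson's vocabulary to the expert vocabulary used to describe services.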

The following picture shows the workflow from the user’s query in the upper left corner to the classification of the services that best answer the initial query in the bottom left corner. Within this process, the relevant part of the query is extracted and improves the structure of the databases as it goes. Therefore, the more the tool is used, the more useful it becomes. Machine learning, you know.

Without going deeper into details, queries are first processed by a Part-of-Speech Tagging module: every term is given a tag specifying its role in the sentence (verb, adjective, noun, etc.). Secondly, based on these tags, exceptions are filtered out, as many terms might bring fuzziness (e.g. the term “state”, which might refer to a “country” or to a “condition”).
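The two steps above (tagging, then filtering out fuzzy terms) can be sketched as follows. This is only an illustration under strong assumptions: the tagger is a toy dictionary lookup rather than a trained Part-of-Speech model, and the lexicon, the list of ambiguous terms and the function names (tag, filter_terms) are hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical toy lexicon; a real system would rely on a trained POS tagger.
TOY_LEXICON = {
    "i": "PRON", "need": "VERB", "a": "DET", "the": "DET", "map": "NOUN",
    "of": "ADP", "flooded": "ADJ", "areas": "NOUN", "in": "ADP",
    "state": "NOUN", "belgium": "PROPN",
}

# Assumed list of terms whose meaning is too fuzzy to use directly.
AMBIGUOUS_TERMS = {"state"}

# Tags considered to carry the content of a query.
CONTENT_TAGS = {"NOUN", "PROPN", "ADJ", "VERB"}


def tag(query: str) -> List[Tuple[str, str]]:
    """Attach a part-of-speech tag to every word (unknown words default to NOUN)."""
    tokens = re.findall(r"[A-Za-z]+", query)
    return [(tok, TOY_LEXICON.get(tok.lower(), "NOUN")) for tok in tokens]


def filter_terms(tagged: List[Tuple[str, str]]):
    """Keep content words; set ambiguous ones aside instead of using them."""
    kept, set_aside = [], []
    for token, pos in tagged:
        if pos not in CONTENT_TAGS:
            continue  # drop function words such as determiners
        if token.lower() in AMBIGUOUS_TERMS:
            set_aside.append(token)
        else:
            kept.append((token, pos))
    return kept, set_aside


tagged = tag("I need a map of the state of flooded areas in Belgium")
kept, set_aside = filter_terms(tagged)
print(kept)       # [('need', 'VERB'), ('map', 'NOUN'), ('flooded', 'ADJ'), ('areas', 'NOUN'), ('Belgium', 'PROPN')]
print(set_aside)  # ['state']
```

Setting the ambiguous terms aside, rather than silently discarding them, is a design choice of this sketch: it leaves room to disambiguate them later from context.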

After that, again thanks to these tags, filters are applied to separate the different possibilities: some terms will be considered spatial information, others not. The spatial terms are matched against the GeoNames database (the GeoNames geographical database covers all countries and contains over eleven million placenames that are available for download free of charge). Thanks to this spatial contextualization, the areas of interest of the query and of the services are compared, which ultimately affects the classification of the services. The terms that do not carry spatial information serve a different purpose: they enhance and structure the knowledge of the web platform.
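As an illustration of this spatial step, the sketch below separates spatial from non-spatial terms with a lookup table and then ranks services by how much of the query's area of interest they cover. Everything here is assumed for the example: the small TOY_GAZETTEER dictionary merely stands in for a GeoNames lookup (its bounding box is rough and not a GeoNames record), the service names and extents are invented, and the overlap ratio is one possible scoring choice, not necessarily the one used in the study.

```python
from typing import Dict, List, NamedTuple, Tuple


class BBox(NamedTuple):
    """Axis-aligned bounding box in degrees (west, south, east, north)."""
    west: float
    south: float
    east: float
    north: float

    def area(self) -> float:
        width = max(0.0, self.east - self.west)
        height = max(0.0, self.north - self.south)
        return width * height

    def intersection(self, other: "BBox") -> "BBox":
        # May yield an "inverted" box when there is no overlap; area() is then 0.
        return BBox(
            max(self.west, other.west), max(self.south, other.south),
            min(self.east, other.east), min(self.north, other.north),
        )


# Hypothetical stand-in for a gazetteer lookup (rough, illustrative box).
TOY_GAZETTEER: Dict[str, BBox] = {
    "belgium": BBox(2.5, 49.5, 6.5, 51.5),
}

# Hypothetical services, each declaring its own area of interest.
SERVICES: Dict[str, BBox] = {
    "crop-monitoring-france": BBox(-5.0, 42.0, 8.0, 51.0),
    "ice-survey-iceland": BBox(-25.0, 63.0, -13.0, 67.0),
    "flood-mapping-benelux": BBox(2.5, 49.5, 7.5, 53.5),
}


def split_spatial(terms: List[str]) -> Tuple[Dict[str, BBox], List[str]]:
    """Separate terms found in the gazetteer from all the others."""
    spatial: Dict[str, BBox] = {}
    other: List[str] = []
    for term in terms:
        box = TOY_GAZETTEER.get(term.lower())
        if box is None:
            other.append(term)
        else:
            spatial[term] = box
    return spatial, other


def rank_services(query_box: BBox) -> List[Tuple[str, float]]:
    """Score each service by the share of the query area that it covers."""
    query_area = query_box.area()
    scores = []
    for name, box in SERVICES.items():
        shared = query_box.intersection(box).area()
        scores.append((name, shared / query_area if query_area else 0.0))
    return sorted(scores, key=lambda item: item[1], reverse=True)


spatial, other = split_spatial(["flooded", "areas", "Belgium"])
print(other)                              # ['flooded', 'areas']
print(rank_services(spatial["Belgium"]))
# [('flood-mapping-benelux', 1.0), ('crop-monitoring-france', 0.75), ('ice-survey-iceland', 0.0)]
```

In a fuller pipeline, the non-spatial terms returned by split_spatial would be the ones passed on to enrich the platform's knowledge base.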

As explained above, users may lack knowledge in a particular domain but still need to explore it. As not every existing resource is structured in a way that suits this domain or application, there was a need to create a dedicated knowledge base. In order to structure this knowledge, a wider reference ontology was used as a basis: the UNESCO Thesaurus (a controlled and structured list of terms used in subject analysis and retrieval of documents and publications in the fields of education, culture, natural sciences, social and human sciences, communication and information). Data mining and knowledge building were used. Many more details can be found in the scientific paper with the following DOI: 10.5194/isprs-archives-XLII-2-W13-1593-2019
