LM-KBC Challenge @ ISWC 2024

Have you ever looked at seemingly simple entity relationships and attributes and thought, "How hard could it be to answer questions based on that relationship?" We (Nadeen Fathallah, Nicole Obretincheva, and I) did.


As we delved into the intricacies of the LM-KBC Challenge @ ISWC 2024, we quickly realized that even the most straightforward connections can turn into a tangled web of complexities, especially when a Large Language Model is asked to find the answers.

The Illusion of Simplicity

Let's consider a few examples:

  • Countries and Borders: While it may seem straightforward to map countries to their neighbours, factors like historical changes, disputed territories, and islands without land borders can complicate matters.
  • People and Places: Tracking where people were born or died might appear simple, but consider the challenges of historical data, migration, individuals with uncertain origins, and cities whose names have changed over time.
  • TV Shows and Episodes: While this relationship might seem straightforward, factors like reboots, spin-offs, and special episodes can introduce nuances as well. Should the total number of episodes be looked up under "series" or "seasons"? What if there are gaps of several years between series?

Additionally, neither the entities nor the relationships nor their properties always come in flat structures. For example:

  • Hierarchical Structures: Relationships often form hierarchies. A company may have multiple listings, or its listings may only be identifiable through parent-subsidiary relationships, all while accounting for discrepancies such as variations in listing identifiers, currencies, or stock symbols.
  • Many-to-Many Relationships: Several people may win the same award in the same year or across different years, and an award ceremony may not be held at all in a given year.
  • Set Theory in Action: Understanding the intersection, union, and negation of multiple responses to the same question is crucial for accurate analysis. For example, the same person may have won an award in several categories in the same year, or been nominated across different categories.
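To make the last two points concrete, here is a minimal Python sketch of treating answers as sets. It is an illustration only, not the approach from our paper: the award, categories, and people are invented placeholders, and the helper names `winners_by_year` and `winners_in` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (award, year, category, winner). All names are invented.
WINS = [
    ("Award X", 2001, "Best Film", "Person A"),
    ("Award X", 2001, "Best Director", "Person A"),
    ("Award X", 2001, "Best Actor", "Person B"),
    ("Award X", 2003, "Best Film", "Person C"),
]


def winners_by_year(wins):
    """Group winners into one set per (award, year), so duplicates collapse."""
    grouped = defaultdict(set)
    for award, year, _category, winner in wins:
        grouped[(award, year)].add(winner)
    return grouped


def winners_in(grouped, award, year):
    """Return the winners for a year; an empty set means no known winner."""
    return grouped.get((award, year), set())


grouped = winners_by_year(WINS)
w2001 = winners_in(grouped, "Award X", 2001)  # Person A counted once, not twice
w2002 = winners_in(grouped, "Award X", 2002)  # no ceremony in the toy data
w2003 = winners_in(grouped, "Award X", 2003)

all_winners = w2001 | w2003   # union: anyone who won in either year
repeat_winners = w2001 & w2003  # intersection: won in both years
only_2001 = w2001 - w2003     # difference: won in 2001 but not in 2003
```

Representing each answer as a set keeps a person who won in two categories from being counted twice, and it makes "no answer" (the empty set) an explicit, valid result instead of an error.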

Our Approach & Lessons Learned

  • The gap between what LLMs "know" and what they can accurately express is vast; bridging it requires strategic design and patience.
  • Real-world data is rarely clean or straightforward; it demands resilience and adaptability in problem-solving.

It was fun to participate in the challenge and to collaborate. A big thank you to my co-authors and collaborators Nadeen Fathallah and Nicole Obretincheva, and to our challenge hosts Jan-Christoph Kalo, Bohui Zhang, Simon Razniewski, and Tuan-Phong Nguyen. For details on our approach, results, and findings, please refer to our paper, "Navigating Nulls, Numbers and Numerous Entities: Robust Knowledge Base Construction from Large Language Models".
