Automated text classification and validation of thousands of news articles enhances performance of AI model for German construction technology company

Client Profile.

A German technology company that uses its in-house construction leads data platform to supply exhaustive, well-researched construction project data across the USA and Europe. Its clients range from small businesses to Fortune 500 companies operating in the real estate, construction, and building materials manufacturing space.

Business Need.

The company captured data on upcoming and existing construction projects in real time from multilingual, multi-format online publications across Europe and the USA. The unstructured data, parsed using automated crawlers, included property type, project start and end dates, site location, size, cost, and construction phases. The captured data was auto-classified into relevant segments using AI algorithms. The 20% of the data that was too complex to be auto-tagged was labelled manually.

The company partnered with HitechDigital to:

  • verify and validate the auto-classified text to ensure accuracy and compensate for the limits of tool performance
  • append missing information
  • manually annotate the 20% of the data that couldn’t be auto-classified
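
The split between these three tasks can be pictured as a simple routing step. The sketch below is illustrative only: the `Article` record, the `auto_tagged` flag, the field names, and the queue names are hypothetical and are not taken from the client's platform.

```python
from dataclasses import dataclass, field

# Field names loosely follow the data points listed in the case study
# (property type, dates, location, size, cost, phase). The exact schema
# is an assumption made for this example.
REQUIRED_FIELDS = (
    "property_type", "start_date", "end_date",
    "location", "size", "cost", "phase",
)


@dataclass
class Article:
    article_id: str
    tags: dict = field(default_factory=dict)  # auto-classified field -> value
    auto_tagged: bool = True  # False when the tool could not tag the article


def route(article: Article) -> str:
    """Return the (hypothetical) work queue an article belongs to."""
    if not article.auto_tagged:
        # The roughly 20% of complex articles go to human annotators.
        return "manual_annotation"
    missing = [name for name in REQUIRED_FIELDS if not article.tags.get(name)]
    if missing:
        # Auto-tagged, but some fields still need to be appended by hand.
        return "append_missing"
    # Fully tagged articles still get verified and validated.
    return "validate"
```

In this sketch every article ends up in front of a person; the routing only decides how much manual work it needs.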

Challenges.

  • Identifying and understanding contextual information in the construction articles and labelling it by category, such as project size, phase, location, owner, architects, and start and end dates
  • Processing an input volume of hundreds of articles within the stipulated 24-hour timeline while conforming to stringent quality checks
  • Training a skilled and experienced team to understand architectural data and validate auto-classified information

Solution.

HitechDigital’s data specialists classified and labelled more than 10,000 construction-related articles. The labelled data was then passed through a validation, verification, and data append process to ensure the accuracy and credibility of the annotations.

The verified and validated data improved the accuracy of the AI algorithms and ensured better and more relevant search results on construction project data.

Approach.

  • KPIs, SOPs and other metrics were defined and documented based on project assessment and understanding of business needs.
  • The client provided domain and functional training to an experienced team of text annotators and quality check specialists to meet project requirements.
  • The auto-classified input data was accessed over the client portal through secure credentials.
  • Manual validation and verification processes were carried out to check accuracy of the auto-tagged data.
  • Missing information was appended to the classified text wherever necessary; these modifications and corrections served to improve future tool performance.
  • The 20% of articles that were complex and couldn’t be auto-classified were annotated manually.
  • A two-step quality check process was applied to each batch of articles to ensure labelling accuracy, which in turn would enhance AI model performance.
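
One way to track the accuracy checks described above is to compare the auto-tagged values with the values that remain after manual validation. The following is a minimal sketch under that assumption; the function name, the dictionary layout, and the sample values are hypothetical, and the case study does not state how accuracy was actually measured.

```python
from collections import Counter


def field_accuracy(auto_tags, validated_tags):
    """Per-field share of articles where the auto tag matched the validated tag.

    Both arguments are lists of dicts (field name -> value), aligned by
    article. This layout is an assumption made for the example.
    """
    matches = Counter()
    totals = Counter()
    for auto, validated in zip(auto_tags, validated_tags):
        for name, validated_value in validated.items():
            totals[name] += 1
            if auto.get(name) == validated_value:
                matches[name] += 1
    return {name: matches[name] / totals[name] for name in totals}


# Made-up sample batch of two articles.
auto = [
    {"phase": "planning", "location": "Berlin"},
    {"phase": "tender", "location": None},
]
validated = [
    {"phase": "planning", "location": "Berlin"},
    {"phase": "construction", "location": "Munich"},
]
print(field_accuracy(auto, validated))  # {'phase': 0.5, 'location': 0.5}
```

A per-field figure of this kind could serve as one of the documented KPIs and would show which categories the tool most often gets wrong.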

Technology Used.

Secure login to the client portal through a web browser

Business Impact.

Enhanced text labelling accuracy improved the performance of the model

Turnaround time was reduced significantly, from days to a few hours

Increased algorithmic accuracy resulted in higher customer acquisition

The offshoring model saved the client 50% on project cost


Further details are available in the HitechDigital case study.
