Having Fun with Opait Report Miner

Step One: Acquiring Data Files

We used publicly available quarterly reports from the Home Depot website for the years 2013-2015: one PDF report for each quarter (10-Q and 10-K), for a total of 12 documents. Each report had standard financial tables such as Cash Flows, Balance Sheets, Profit & Loss Statements and so on.

Step Two: Defining the Data Model

We opened the 2013-Q1 report in Opait Report Miner, located the Cash Flow table and drew two selection rectangles (rubber bands) to tag that table for extraction.

Step Three: Building the Model

We used the Opait Report Miner automation API to run this model against all 12 quarters, picking a single column from each and building a 12-column data table. We then plotted the Net Earnings value for each quarter.
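The table-building step can be sketched in Python. This is only a minimal sketch and does not use the Opait Report Miner API: it assumes each quarter's extracted column is already available as a mapping from row label to value. The name `build_table`, the `QuarterColumn` type and the sample numbers are hypothetical placeholders, not real reported figures.

```python
from typing import Dict, List, Optional

# Hypothetical shape of one extracted column: row label -> value.
QuarterColumn = Dict[str, float]


def build_table(columns: Dict[str, QuarterColumn]) -> Dict[str, List[Optional[float]]]:
    """Merge per-quarter columns into one table keyed by row label.

    Quarters are ordered by their key (for example "2013-Q1" before
    "2013-Q2"). A row that is missing from a quarter is filled with None.
    """
    quarters = sorted(columns)
    labels: List[str] = []
    for quarter in quarters:
        for label in columns[quarter]:
            if label not in labels:
                labels.append(label)  # keep first-seen row order
    return {
        label: [columns[quarter].get(label) for quarter in quarters]
        for label in labels
    }


# Placeholder values for illustration only (assumed, not from the reports).
sample = {
    "2013-Q2": {"Net Earnings": 3.0, "Depreciation": 1.5},
    "2013-Q1": {"Net Earnings": 2.0, "Depreciation": 1.0},
    "2013-Q3": {"Net Earnings": 2.5},
}

table = build_table(sample)
# One value per quarter, in chronological order, ready for plotting.
print(table["Net Earnings"])
```

With 12 quarterly columns as input, each row of the resulting table holds 12 values, and the Net Earnings row is the series that gets plotted.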

Home Depot's business is indeed seasonal, with maximum earnings during the summer building months. Earnings also trend upward year over year. All this insight from drawing two rubber bands! The power of RPA at work.
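Both observations can also be checked numerically rather than by eye. The following is a rough sketch, assuming the Net Earnings series is keyed by labels of the form `YYYY-Qn`; the function names and the sample values are hypothetical placeholders, not real reported figures.

```python
from statistics import mean
from typing import Dict, List


def quarterly_averages(series: Dict[str, float]) -> Dict[str, float]:
    """Average each quarter (Q1..Q4) across all years to expose seasonality."""
    buckets: Dict[str, List[float]] = {}
    for key, value in series.items():
        _year, quarter = key.split("-")
        buckets.setdefault(quarter, []).append(value)
    return {quarter: mean(values) for quarter, values in sorted(buckets.items())}


def yearly_totals(series: Dict[str, float]) -> Dict[str, float]:
    """Sum the quarters of each year to expose the year-over-year trend."""
    totals: Dict[str, float] = {}
    for key, value in series.items():
        year, _quarter = key.split("-")
        totals[year] = totals.get(year, 0.0) + value
    return dict(sorted(totals.items()))


# Placeholder values for illustration only (assumed, not from the reports).
net_earnings = {
    "2013-Q1": 1.0, "2013-Q2": 2.0, "2013-Q3": 1.5, "2013-Q4": 1.0,
    "2014-Q1": 1.5, "2014-Q2": 2.5, "2014-Q3": 2.0, "2014-Q4": 1.5,
}

averages = quarterly_averages(net_earnings)
peak_quarter = max(averages, key=averages.get)  # quarter with the highest average

totals = list(yearly_totals(net_earnings).values())
growing = all(earlier < later for earlier, later in zip(totals, totals[1:]))

print(peak_quarter, growing)
```

A seasonal business shows one quarter consistently above the others in `averages`, and a growing one shows strictly increasing yearly totals.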

Data mining can be simple and fun! 

About Opait Software

Opait Software specializes in high-quality extraction of structured data, such as fields, tables, sections and paragraphs, from unstructured documents in many file formats. Automatic identification and extraction of tabular data, together with tagging and filtering of NLP elements in PDF documents, enables advanced analytics, RPA automation and semantic search over data trapped in PDF and other unstructured documents. These data-mining products are particularly suited to financial modeling and analytics. Applications include automatic processing of statements, remittances, bills, financial reports and contracts.

