The Music Index is a publication that indexes music-related articles, news, reviews, and obituaries. It was first published in print in 1949 by Harmonie Park Press, and is now published electronically by EBSCO Information Services. The online version of The Music Index covers 1973 to the present, and includes full text for over 170 journals. The print versions look like the sample below. Is it possible to process digital copies (images) of past publications and convert them into highly structured records that can be ingested into a database? Margaret Richter
About us
Historical records, whether in special collections of libraries or basements of companies, hold a wealth of information waiting to be extracted. However, mere digitization is not enough; the data has to be extracted in a structured manner. Doxie is unlocking this rich data source by building custom pipelines to extract information from the most challenging data sources.
- Industry
- IT Services and Consulting
- Company size
- 2-10 employees
- Headquarters
- San Jose
- Type
- Privately Held
- Founded
- 2021
- Specialties
- Digital Data Extraction
Locations
- Primary
San Jose, US
Doxie AI employees
Posts
-
EBSCO Information Services and Doxie AI have been working on an exciting project that is producing some great results. We will be releasing more information about this over the next several weeks. Stay tuned. Margaret Richter
-
Backstage Library Works Thank you for inviting us! It was a great meeting at #NAGARA2024, showcasing the possibilities of using #AI and #ML to extract and bring structure to unstructured data.
What in the world could an 80’s children’s toy have to do with #NAGARA2024 and #digitization? If you couldn’t catch Thomas Forsythe’s lunchtime comments today at NAGARA, stop by our table to learn more. We’ll also be presenting later this afternoon at 4:25; sit in on the presentation by Beth Brevik and Anna Newman in Block 3, Session 14 to discuss preparing your archival collections for digitization.
-
Doxie AI will be at #ALAAC24 in San Diego along with our partners Backstage Library Works. https://lnkd.in/exdwe3fu Digitization initiatives can take a big step towards providing both access and insight by extracting highly structured data (XML, CSV, etc.) from their digital assets. Doxie AI uses cutting-edge #AI models to extract domain-specific, highly accurate data and metadata that go well beyond simple OCR. Learn why Bancroft Library (Berkeley), EBSCO Information Services, U of W and Backstage have used Doxie.AI's custom data extraction services. Beth Ann Goodwill Casey Cheney, PMP Alexandra Parran
-
Doxie AI reposted
The Bancroft Library worked with Doxie.AI, a company started by MIDS alums, to extract fielded data from Japanese American internment records. “There is great potential for machine learning and AI in libraries. There is a lot of discussion right now in library forums around what AI [and machine learning] can do to help us work better and faster,” said Mary Elings, the Interim Deputy Director and Head of Technical Services at the Bancroft Library. More: https://lnkd.in/gHPrpqud
Data Science Helps Bancroft Library Organize Historic Japanese-American Confinement Records
ischool.berkeley.edu
-
Doxie AI reposted
Here's a nice use of ML to capture historical data from over 100,000 Americans thanks to UC Berkeley School of Information, Bancroft Library and Doxie AI https://lnkd.in/gxpg75-Q
Expanding Access to WWII Japanese American Incarceree Data Using Machine Learning
https://www.youtube.com/
-
The Library Corporation's mission is "... providing the latest technology for your libraries and embrace a service mission that breaks through the software and hardware." Doxie AI specializes in performing highly accurate and customized data and metadata extraction using the latest computer vision and natural language #ai #technology. Based in the heart of Silicon Valley, Doxie AI can convert unstructured data such as images, PDFs, and audio into curated information ready for research. We would be honored to become a partner. Kindly reach out. Below is a small sample of our work. Justin Duewel-Zahniser Annette Harwood Murphy Sam Brenizer
-
Doxie AI reposted
UDOP, a new generative model by Microsoft useful for document intelligence tasks, is now available in the Transformers library. See below for more info:
There's a new, powerful document AI model in the Transformers library: UDOP (short for Universal Document Processing) by Microsoft Research.
A recent trend in document AI is the move towards generative (GPT-like) models, which are trained to generate structured text given an image of a PDF or similar document. This is more general and end-to-end compared to the BERT-like models like LayoutLM, as they just take in an image as input and produce text as output, without relying on any OCR engine or painful subword token-level classification. Hence they can be trained to generate JSON given a document image, answer questions users might have, or generate whatever useful text people may want given an image. Examples of these are Donut and Pix2Struct which are already available in Hugging Face, as well as models like GPT-4V, Gemini Pro Vision, or Claude-3 which came out yesterday. Document AI is going to move more and more towards these end-to-end Transformer models.
UDOP is similar in the sense that it also has this vision encoder, text decoder architecture, but it extends this with the use of a traditional OCR engine to combine the best of LayoutLMv3 and Donut in a single model. The model is pre-trained with both a text and vision decoder, allowing it to learn the layout structure of documents.
Docs: https://lnkd.in/eVG_dHJv Checkpoints: https://lnkd.in/efgx4MR4 Demo notebooks: https://lnkd.in/e-tsdUrw #documentai #microsoft #huggingface #artificialintelligence
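For anyone who wants to try UDOP on their own scans, below is a minimal sketch of loading the model from the Transformers library. It follows the documented usage pattern for the microsoft/udop-large checkpoint; the image path and the question string are placeholders, and the processor's built-in OCR step assumes Tesseract is installed.

```python
# Minimal sketch: document question answering with UDOP via Hugging Face
# Transformers, using the "microsoft/udop-large" checkpoint. The image file
# and the question below are placeholders for illustration.
from PIL import Image
from transformers import UdopProcessor, UdopForConditionalGeneration

processor = UdopProcessor.from_pretrained("microsoft/udop-large")
model = UdopForConditionalGeneration.from_pretrained("microsoft/udop-large")

image = Image.open("scanned_page.png").convert("RGB")  # placeholder document image
prompt = "Question answering. What is the title of this article?"

# The processor runs OCR on the image by default, tokenizes the recognized
# words together with the prompt, and prepares the bounding boxes and pixel
# values the model expects.
encoding = processor(images=image, text=prompt, return_tensors="pt")

# UDOP is an encoder-decoder model, so the answer is generated as free text
# rather than read off token-level classifications.
predicted_ids = model.generate(**encoding, max_new_tokens=32)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```

Because the output is generated text, the same loop can be prompted to emit structured strings (for example field/value pairs) instead of answers to questions, which is closer to the fielded-extraction use cases described in the posts above.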