Anthropic的动态

查看Anthropic的公司主页,图片

580,187 位关注者

Claude can now view images within a PDF, in addition to text. Enable the feature preview to get started: https://claude.ai/new?fp=1. This helps Claude 3.5 Sonnet more accurately understand complex documents, such as those laden with charts or graphics. The Anthropic API now also supports PDF inputs in beta: https://lnkd.in/emvau9Ez

Napoleon Skolarikis

Founder @ WizaLabs | I save money and retain customers for Shopify brands

3 周

I love anthropic

We recently published a blog showing examples of how you can potentially use computer-use to help automate SEO and marketing tasks! https://starterseoaudit.com/blog/using-anthropic-claude-35-computer-use-for-seo/

Kunal Agarwala

GenAI Lead - Sr Manager

3 周

Anthropic This is great.. Internally combining text and image both to pass on to the llm.. This is exactly what we have done multiple times in projects as it gave us better results.. But I am also confused about how it works.. in the doc it says, 1. First the doc gets converted to to image 2. Second, text is extracted and combined with Image So far we have using OCR services like Textract the text as a first step in every IDP pipeline and to do that, we always rasterize the doc to image first.. I would say thats internal to Textract.. But how do you guys do that? In your second step when you extract the text, do you use the image from first step or you use native pdf functionality to extract.. ? I would assume its through image.. as one of the most common use case is scanned pdf.. But this is great.. For one of use cases, user won’t have to do itnon their own.. but would you say from a design angle, it still better to first extract the text and store it as it has many other downstream use cases and we wont have to pass along the text any more to Claude, as it is doing it anyway..

Ivan Hernanz Cianca

Chief Information Officer (CIO) / Chief Technology Officer (CTO) con más de 20 a?os de experiencia

2 周

"?Totalmente de acuerdo! En el ámbito de document intelligence, estamos viendo cómo los OCR tradicionales están siendo rápidamente superados por modelos multimodales avanzados. Estos modelos no solo reconocen texto, sino que también comprenden el contexto visual, estructuras y patrones de los documentos, lo cual eleva considerablemente la precisión y el valor de la extracción de datos. Los modelos multimodales pueden interpretar tablas, gráficos y otros elementos visuales de los documentos, algo que los OCR convencionales no alcanzan a hacer bien sin una preprocesamiento extenso. Así, en lugar de una simple lectura de texto, obtenemos una 'comprensión' profunda, ideal para aplicaciones empresariales complejas. ?? ?El futuro? Document intelligence será un entorno de ‘comprensión’ total, donde los multimodelos ofrecerán un análisis detallado en tiempo real, optimizando procesos y facilitando una toma de decisiones más inteligente. ????"

Florian Bansac

Disfold.com - Boost Your Investing & Swing Trading With Algos & Signals, Brute Force & AI Tools

4 小时前

Is it also in the API? Come share what you build and learn with us in the AI Agents group on linkedin: https://www.dhirubhai.net/groups/6672014

回复
Matt Lane

Product Strategist and Conceptual Software Designer

3 周

Anthropic, downloading reports, etc. (artifacts) other than .tsx, e.g., non-branded pdf, would be awesome.

Maranatha Poirier

IT Champion & AI Advocate for Growing Organizations | You have a goal, let's make it happen

2 周

Woah, that is very cool. PDF is a printer language and text can be pulled directly out of the file. Whereas, an image of text requires OCR to recognize and pull the text out. The ability to so quickly and effectively recognize the content of images as easily as PDFs is huge for working with large document repositories.

Kabeer Singh Thockchom

AI & Data @ EY | GenAI Products and Financial Quantitative Modeling | Passionate SaFe 6.0 Product Owner / Product Manager & Full Stack Developer | Building AI Agents

3 周

Claude is just the best for text generation and summarization, we are using it as the last step in several pipelines

Shay Irani

Global Technology & Digital Transformation Leader

3 周

I wonder if the new model can transcribe a tech journal of sorts in a “for dummies” edition ??

查看更多评论

要查看或添加评论,请登录