GenAI Weekly — Edition 21

Your Weekly Dose of Gen AI: News, Trends, and Breakthroughs

Stay at the forefront of the Gen AI revolution with Gen AI Weekly! Each week, we curate the most noteworthy news, insights, and breakthroughs in the field, equipping you with the knowledge you need to stay ahead of the curve.

Click subscribe to be notified of future editions


Checkbox extraction from PDFs using LLMWhisperer

From the Unstract blog:

Processing PDF forms involves extracting and interpreting elements like checkboxes and radio buttons, which is essential for automating business processes. These forms can be native PDFs, scanned paper documents, or images taken with smartphones. This article explores methods for extracting and interpreting both native text and scanned image formats of PDF documents.
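Once a layout-preserving extractor such as LLMWhisperer has converted a form to plain text, the remaining work is interpreting checkbox markers in that text. The sketch below is illustrative only: the exact marker glyphs ([X], [ ], ☑, ☐) an extractor emits depend on the tool and the source document, and the label format is an assumption.

```python
import re

# Hypothetical marker sets; real extractor output may differ.
CHECKED = {"[x]", "[X]", "☑", "■"}

def parse_checkboxes(text: str) -> dict[str, bool]:
    """Map each labelled checkbox line to True (checked) or False."""
    result = {}
    # Match a checkbox marker followed by its label, e.g. "[X] Married"
    pattern = re.compile(r"(\[[xX ]?\]|[☑☐■□])\s+(.+)")
    for line in text.splitlines():
        m = pattern.search(line.strip())
        if m:
            marker, label = m.group(1), m.group(2).strip()
            result[label] = marker in CHECKED
    return result

form_text = """
Marital status:
[X] Married
[ ] Single
"""
print(parse_checkboxes(form_text))  # {'Married': True, 'Single': False}
```

Separating extraction (the hard, layout-sensitive part) from interpretation (simple pattern matching) keeps the downstream logic testable regardless of which extractor produced the text.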


Meta plans to release a 405-billion-parameter Llama-3 model by end of July

Matthias Bastian writing for The Decoder:

The 405-billion-parameter model will be multimodal, capable of processing both images and text, reports The Information. This means that the model should be able to generate new images from a combination of images and text, for example. Previous Llama models were limited to text generation.
There were rumors that Meta would not make the weights of the 400 billion model available. AI leaker Jimmy Apples reported on X about alleged objections from Facebook co-founder Dustin Moskovitz to Mark Zuckerberg.
Despite these objections, Meta "apparently at the moment of this update" decided to publish the model, including the weights, as open source, according to Jimmy Apples.

My take on this: I wonder how performance scales with the number of parameters. This release should give us more insight in that regard.


Amazon AI chatbot Rufus is now live for all US customers

Sarah Perez writing for Techcrunch:

Amazon’s AI-powered shopping assistant, named “Rufus,” is now live for all U.S. customers in the Amazon mobile app, the retailer announced on Friday. The assistant, which lives on the bottom right of the app’s main navigation bar, is designed to offer customers help with finding products, performing product comparisons and getting recommendations on what to buy.
Rufus was initially available in beta only to select customers in the U.S. within the Amazon mobile app ahead of Friday’s launch. Now, Amazon says all shoppers in the U.S. can try it after testing the chatbot across “tens of millions of questions.”
First announced in February, the AI chatbot has been trained on Amazon’s product catalog, customer reviews, community Q&As and other public information found around the web — though Amazon isn’t disclosing specifically which websites’ data was used to help its assistant make better recommendations, or whether that included other retail websites.
Rufus itself is powered by an internal large language model (LLM) specialized for shopping, allowing customers to ask questions about products, including things like factors to consider when buying, how items are different from other products, or how well the product holds up, as described by customer reviews and other expert analysis pulled from around the web.


My take on this: This tremendously reduces the friction of having to contact a human customer-support agent. It is possibly the largest roll-out of such a chatbot to date.


AWS App Studio promises to generate enterprise apps from a written prompt

Ron Miller writing for Techcrunch:

App Studio promises to help you create an enterprise software application from a written prompt. That’s correct: You simply describe the program you want, and AWS says it will write the code for you without the need for any professional developers.
“App Studio is for technical folks who have technical expertise but are not professional developers, and we’re enabling them to build enterprise-grade apps,” Sriram Devanathan, GM of Amazon Q Apps and AWS App Studio, told TechCrunch.
Amazon defines enterprise apps as having multiple UI pages with the ability to pull from multiple data sources, perform complex operations like joins and filters, and embed business logic in them.
It is aimed at IT professionals, data engineers and enterprise architects, even product managers who might lack coding skills but have the requisite company knowledge to understand what kinds of internal software applications they might need. The company is hoping to enable these employees to build applications by describing the application they need and the data sources they wish to use.

My take on this: Reviews of GitHub Copilot Workspace, a similar “description to app” system, say it’s not there yet. I’d wait for more reviews of App Studio as well.


OpenAI reportedly nears breakthrough with “reasoning” AI, reveals progress framework

Benj Edwards writing for Ars Technica:

OpenAI's five levels—which it plans to share with investors—range from current AI capabilities to systems that could potentially manage entire organizations. The company believes its technology (such as GPT-4o that powers ChatGPT) currently sits at Level 1, which encompasses AI that can engage in conversational interactions. However, OpenAI executives reportedly told staff they're on the verge of reaching Level 2, dubbed "Reasoners."

Bloomberg lists OpenAI's five "Stages of Artificial Intelligence" as follows:

  • Level 1: Chatbots, AI with conversational language
  • Level 2: Reasoners, human-level problem solving
  • Level 3: Agents, systems that can take actions
  • Level 4: Innovators, AI that can aid in invention
  • Level 5: Organizations, AI that can do the work of an organization

A Level 2 AI system would reportedly be capable of basic problem-solving on par with a human who holds a doctorate degree but lacks access to external tools. During the all-hands meeting, OpenAI leadership reportedly demonstrated a research project using their GPT-4 model that the researchers believe shows signs of approaching this human-like reasoning ability, according to someone familiar with the discussion who spoke with Bloomberg.

My take on this: What constitutes AGI isn’t very clear—at least as of now.


Vision language models are blind

Pooyan Rahmanzadehgervi et al:

Large language models with vision capabilities (VLMs), e.g., GPT-4o and Gemini-1.5 Pro are powering countless image-text processing applications and scoring high on existing vision-understanding benchmarks. Yet, we find that VLMs fail on 7 visual tasks absurdly easy to humans such as identifying (a) whether two circles overlap; (b) whether two lines intersect; (c) which letter is being circled in a word; and (d) counting the number of circles in an Olympic-like logo. The shockingly poor performance of four state-of-the-art VLMs suggests their vision is, at best, like that of a person with myopia seeing fine details as blurry, and at worst, like an intelligent person who is blind making educated guesses.

My take on this: We have a long way to go. But, I’m sure, in some dark corner somewhere, someone is toiling away to ensure LLMs can figure out if circles overlap or not.
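For contrast, the circle-overlap task that trips up these state-of-the-art VLMs reduces to a one-line geometric test: two circles overlap exactly when the distance between their centres is less than the sum of their radii.

```python
import math

def circles_overlap(c1, r1, c2, r2):
    """Two circles overlap iff the distance between their
    centres is less than the sum of their radii."""
    return math.dist(c1, c2) < r1 + r2

print(circles_overlap((0, 0), 1.0, (1.5, 0), 1.0))  # True  (1.5 < 2.0)
print(circles_overlap((0, 0), 1.0, (3.0, 0), 1.0))  # False (3.0 > 2.0)
```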


Coders' Copilot code-copying copyright claims crumble against GitHub, Microsoft

Matthew Connatser writing for The Register:

Claims by developers that GitHub Copilot was unlawfully copying their code have largely been dismissed, leaving the engineers for now with just two allegations remaining in their lawsuit against the code warehouse.
The class-action suit against GitHub, Microsoft, and OpenAI was filed in America in November 2022, with the plaintiffs claiming the Copilot coding assistant was trained on open source software hosted on GitHub and as such would suggest snippets from those public projects to other programmers without care for licenses – such as providing appropriate credit for the source – thus violating the original creators' intellectual property rights.

[…]

The anonymous programmers have repeatedly insisted Copilot could, and would, generate code identical to what they had written themselves, which is a key pillar of their lawsuit since there is an identicality requirement for their DMCA claim. However, Judge Tigar earlier ruled the plaintiffs hadn't actually demonstrated instances of this happening, which prompted a dismissal of the claim with a chance to amend it.

The amended complaint argued that unlawful code copying was an inevitability if users flipped Copilot's anti-duplication safety switch to off, and also cited a study into AI-generated code in an attempt to back up their position that Copilot would plagiarize source, but once again the judge was not convinced that Microsoft's system was ripping off people's work in a meaningful way.

Hmm.


AMD to Buy European AI Lab Silo in Race Against Nvidia

Ian King writing for Bloomberg (via Yahoo Finance):

Advanced Micro Devices Inc. has agreed to buy Silo AI for $665 million in cash, adding a maker of artificial intelligence models that will help its push to close the gap on Nvidia Corp.
The US chipmaker is acquiring the Helsinki-based company, which describes itself as Europe’s largest private AI lab and has customers that include Allianz SE, Unilever Plc and Bayerische Motoren Werke AG unit Rolls-Royce, it said in a statement on Wednesday. Co-founder and Chief Executive Officer Peter Sarlin will continue to lead his team, which will become part of AMD’s Artificial Intelligence Group.
Santa Clara, California-based AMD is seen as Nvidia’s closest potential competitor in the fast-growing market for hardware used to develop new software and services powered by AI. Graphics chips, which the two companies specialize in, have proven the most efficient means to train the large language models that underpin user-facing services like OpenAI’s ChatGPT and Microsoft Corp.’s Copilot.

My take on this: Hardware companies are turning into software companies in the AI melee.


Goldman Sachs: AI Is Overhyped, Wildly Expensive, and Unreliable

Jason Koebler writing for 404 Media:

Investment giant Goldman Sachs published a research paper about the economic viability of generative AI which notes that there is “little to show for” the huge amount of spending on generative AI infrastructure and questions “whether this large spend will ever pay off in terms of AI benefits and returns.”
The paper, called “Gen AI: too much spend, too little benefit?” is based on a series of interviews with Goldman Sachs economists and researchers, MIT professor Daron Acemoglu, and infrastructure experts. The paper ultimately questions whether generative AI will ever become the transformative technology that Silicon Valley and large portions of the stock market are currently betting on, but says investors may continue to get rich anyway. “Despite these concerns and constraints, we still see room for the AI theme to run, either because AI starts to deliver on its promise, or because bubbles take a long time to burst,” the paper notes.

My take on this: GenAI stories exist on a very wide spectrum.


If you've made it this far and follow my newsletter, please consider exploring the platform we're currently building: Unstract—a no-code LLM platform that automates unstructured data workflows.


For the extra curious

