GenAI Weekly — Edition 21

Your Weekly Dose of Gen AI: News, Trends, and Breakthroughs

Stay at the forefront of the Gen AI revolution with Gen AI Weekly! Each week, we curate the most noteworthy news, insights, and breakthroughs in the field, equipping you with the knowledge you need to stay ahead of the curve.

Click subscribe to be notified of future editions


Checkbox extraction from PDFs using LLMWhisperer

From the Unstract blog:

Processing PDF forms involves extracting and interpreting elements like checkboxes and radio buttons, which is essential for automating business processes. These forms can be native PDFs, scanned paper documents, or images taken with smartphones. This article explores methods for extracting and interpreting both native text and scanned image formats of PDF documents.
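Once a layout-preserving extractor such as LLMWhisperer has converted a form to plain text, the remaining work is interpreting checkbox markers in that text. The sketch below is illustrative only: the exact marker glyphs ([X], [ ], ☑, ☐) an extractor emits depend on the tool and the source document, and the label format is an assumption.

```python
import re

# Hypothetical marker sets; real extractor output may differ.
CHECKED = {"[x]", "[X]", "☑", "■"}

def parse_checkboxes(text: str) -> dict[str, bool]:
    """Map each labelled checkbox line to True (checked) or False."""
    result = {}
    # Match a checkbox marker followed by its label, e.g. "[X] Married"
    pattern = re.compile(r"(\[[xX ]?\]|[☑☐■□])\s+(.+)")
    for line in text.splitlines():
        m = pattern.search(line.strip())
        if m:
            marker, label = m.group(1), m.group(2).strip()
            result[label] = marker in CHECKED
    return result

form_text = """
Marital status:
[X] Married
[ ] Single
"""
print(parse_checkboxes(form_text))  # {'Married': True, 'Single': False}
```

Separating extraction (the hard, layout-sensitive part) from interpretation (simple pattern matching) keeps the downstream logic testable regardless of which extractor produced the text.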


Meta plans to release a 405-billion-parameter Llama-3 model by end of July

Matthias Bastian writing for The Decoder:

The 405-billion-parameter model will be multimodal, capable of processing both images and text, reports The Information. This means that the model should be able to generate new images from a combination of images and text, for example. Previous Llama models were limited to text generation.
There were rumors that Meta would not make the weights of the 400 billion model available. AI leaker Jimmy Apples reported on X about alleged objections from Facebook co-founder Dustin Moskovitz to Mark Zuckerberg.
Despite these objections, Meta "apparently at the moment of this update" decided to publish the model, including the weights, as open source, according to Jimmy Apples.

My take on this: I wonder how performance scales with the number of parameters. This release should give us more insight in that regard.


Amazon AI chatbot Rufus is now live for all US customers

Sarah Perez writing for Techcrunch:

Amazon’s AI-powered shopping assistant, named “Rufus,” is now live for all U.S. customers in the Amazon mobile app, the retailer announced on Friday. The assistant, which lives on the bottom right of the app’s main navigation bar, is designed to offer customers help with finding products, performing product comparisons and getting recommendations on what to buy.
Rufus was initially available in beta only to select customers in the U.S. within the Amazon mobile app ahead of Friday’s launch. Now, Amazon says all shoppers in the U.S. can try it after testing the chatbot across “tens of millions of questions.”
First announced in February, the AI chatbot has been trained on Amazon’s product catalog, customer reviews, community Q&As and other public information found around the web — though Amazon isn’t disclosing specifically which websites’ data was used to help its assistant make better recommendations, or whether that included other retail websites.
Rufus itself is powered by an internal large language model (LLM) specialized for shopping, allowing customers to ask questions about products, including things like factors to consider when buying, how items are different from other products, or how well the product holds up, as described by customer reviews and other expert analysis pulled from around the web.


My take on this: This tremendously reduces the friction of having to contact a human customer-support agent. It is possibly the largest roll-out of such a chatbot to date.


AWS App Studio promises to generate enterprise apps from a written prompt

Ron Miller writing for Techcrunch:

App Studio promises to help you create an enterprise software application from a written prompt. That’s correct: You simply describe the program you want, and AWS says it will write the code for you without the need for any professional developers.
“App Studio is for technical folks who have technical expertise but are not professional developers, and we’re enabling them to build enterprise-grade apps,” Sriram Devanathan, GM of Amazon Q Apps and AWS App Studio, told TechCrunch.
Amazon defines enterprise apps as having multiple UI pages with the ability to pull from multiple data sources, perform complex operations like joins and filters, and embed business logic in them.
It is aimed at IT professionals, data engineers and enterprise architects, even product managers who might lack coding skills but have the requisite company knowledge to understand what kinds of internal software applications they might need. The company is hoping to enable these employees to build applications by describing the application they need and the data sources they wish to use.

My take on this: Reviews of GitHub Copilot Workspace, a similar “description to app” system, say it’s not there yet. I’d wait for more reviews of App Studio as well.


OpenAI reportedly nears breakthrough with “reasoning” AI, reveals progress framework

Benj Edwards writing for Ars Technica:

OpenAI's five levels—which it plans to share with investors—range from current AI capabilities to systems that could potentially manage entire organizations. The company believes its technology (such as GPT-4o that powers ChatGPT) currently sits at Level 1, which encompasses AI that can engage in conversational interactions. However, OpenAI executives reportedly told staff they're on the verge of reaching Level 2, dubbed "Reasoners."

Bloomberg lists OpenAI's five "Stages of Artificial Intelligence" as follows:

  • Level 1: Chatbots, AI with conversational language
  • Level 2: Reasoners, human-level problem solving
  • Level 3: Agents, systems that can take actions
  • Level 4: Innovators, AI that can aid in invention
  • Level 5: Organizations, AI that can do the work of an organization

A Level 2 AI system would reportedly be capable of basic problem-solving on par with a human who holds a doctorate degree but lacks access to external tools. During the all-hands meeting, OpenAI leadership reportedly demonstrated a research project using their GPT-4 model that the researchers believe shows signs of approaching this human-like reasoning ability, according to someone familiar with the discussion who spoke with Bloomberg.

My take on this: What constitutes AGI isn’t very clear—at least as of now.


Vision language models are blind

Pooyan Rahmanzadehgervi et al:

Large language models with vision capabilities (VLMs), e.g., GPT-4o and Gemini-1.5 Pro are powering countless image-text processing applications and scoring high on existing vision-understanding benchmarks. Yet, we find that VLMs fail on 7 visual tasks absurdly easy to humans such as identifying (a) whether two circles overlap; (b) whether two lines intersect; (c) which letter is being circled in a word; and (d) counting the number of circles in an Olympic-like logo. The shockingly poor performance of four state-of-the-art VLMs suggests their vision is, at best, like that of a person with myopia seeing fine details as blurry, and at worst, like an intelligent person who is blind making educated guesses.

My take on this: We have a long way to go. But, I’m sure, in some dark corner somewhere, someone is toiling away to ensure LLMs can figure out if circles overlap or not.
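For contrast, the circle-overlap task that trips up these state-of-the-art VLMs reduces to a one-line geometric test: two circles overlap exactly when the distance between their centres is less than the sum of their radii.

```python
import math

def circles_overlap(c1, r1, c2, r2):
    """Two circles overlap iff the distance between their
    centres is less than the sum of their radii."""
    return math.dist(c1, c2) < r1 + r2

print(circles_overlap((0, 0), 1.0, (1.5, 0), 1.0))  # True  (1.5 < 2.0)
print(circles_overlap((0, 0), 1.0, (3.0, 0), 1.0))  # False (3.0 > 2.0)
```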


Coders' Copilot code-copying copyright claims crumble against GitHub, Microsoft

Matthew Connatser writing for The Register:

Claims by developers that GitHub Copilot was unlawfully copying their code have largely been dismissed, leaving the engineers for now with just two allegations remaining in their lawsuit against the code warehouse.
The class-action suit against GitHub, Microsoft, and OpenAI was filed in America in November 2022, with the plaintiffs claiming the Copilot coding assistant was trained on open source software hosted on GitHub and as such would suggest snippets from those public projects to other programmers without care for licenses – such as providing appropriate credit for the source – thus violating the original creators' intellectual property rights.

[…]

The anonymous programmers have repeatedly insisted Copilot could, and would, generate code identical to what they had written themselves, which is a key pillar of their lawsuit since there is an identicality requirement for their DMCA claim. However, Judge Tigar earlier ruled the plaintiffs hadn't actually demonstrated instances of this happening, which prompted a dismissal of the claim with a chance to amend it.

The amended complaint argued that unlawful code copying was an inevitability if users flipped Copilot's anti-duplication safety switch to off, and also cited a study into AI-generated code in an attempt to back up their position that Copilot would plagiarize source, but once again the judge was not convinced that Microsoft's system was ripping off people's work in a meaningful way.

Hmm.


AMD to Buy European AI Lab Silo in Race Against Nvidia

Ian King writing for Bloomberg (via Yahoo Finance):

Advanced Micro Devices Inc. has agreed to buy Silo AI for $665 million in cash, adding a maker of artificial intelligence models that will help its push to close the gap on Nvidia Corp.
The US chipmaker is acquiring the Helsinki-based company, which describes itself as Europe’s largest private AI lab and has customers that include Allianz SE, Unilever Plc and Bayerische Motoren Werke AG unit Rolls-Royce, it said in a statement on Wednesday. Co-founder and Chief Executive Officer Peter Sarlin will continue to lead his team, which will become part of AMD’s Artificial Intelligence Group.
Santa Clara, California-based AMD is seen as Nvidia’s closest potential competitor in the fast-growing market for hardware used to develop new software and services powered by AI. Graphics chips, which the two companies specialize in, have proven the most efficient means to train the large language models that underpin user-facing services like OpenAI’s ChatGPT and Microsoft Corp.’s Copilot.

My take on this: Hardware companies are turning into software companies in the AI melee.


Goldman Sachs: AI Is Overhyped, Wildly Expensive, and Unreliable

Jason Koebler writing for 404 Media:

Investment giant Goldman Sachs published a research paper about the economic viability of generative AI which notes that there is “little to show for” the huge amount of spending on generative AI infrastructure and questions “whether this large spend will ever pay off in terms of AI benefits and returns.”
The paper, called “Gen AI: too much spend, too little benefit?” is based on a series of interviews with Goldman Sachs economists and researchers, MIT professor Daron Acemoglu, and infrastructure experts. The paper ultimately questions whether generative AI will ever become the transformative technology that Silicon Valley and large portions of the stock market are currently betting on, but says investors may continue to get rich anyway. “Despite these concerns and constraints, we still see room for the AI theme to run, either because AI starts to deliver on its promise, or because bubbles take a long time to burst,” the paper notes.

My take on this: GenAI stories exist on a very wide spectrum.


If you've made it this far and follow my newsletter, please consider exploring the platform we're currently building: Unstract—a no-code LLM platform that automates unstructured data workflows.


For the extra curious

