New Inference Framework Speeds Up LLMs Without Raising Costs
Embedded Computing Design
The leading source of in-depth tech knowledge, news, views, & instructional design content for the electronics engineer.
Large language models (LLMs) are some of today’s most impactful technologies. They’re what make advanced chatbots and generative AI possible, but as their functionality grows, so too do their costs and complexity. A new framework from Stanford researchers could change that.
In a recent research paper, the team unveiled a modular inference framework called Archon. Inference is the stage where LLMs draw on what they’ve learned in training to determine appropriate responses or make predictions based on new data. This requires a considerable amount of complex computation, so it’s often either slow or expensive. Archon speeds it up without raising costs. Read more.
Power Modules Result in Smaller (and Lighter) Vehicles
For automakers, weight reduction in electric vehicles (EVs) is at or near the top of the engineering priority list. EVs are traditionally heavier than their non-electric counterparts, so even if weight reductions come in small increments, every little bit counts.
Three new dc-to-dc converter power modules from Vicor can remove significant weight by reducing the size of some components or eliminating them completely. Read more.
Third Party IP Block Licensing from Sondrel
Sondrel has made its in-house IP available for licensing, including a suite of IP blocks for standard SoC management that handle device start-up, clock and reset control, and power domain handling. The SoC Management Suite is divided into three parts: the PMU (Power Management Unit), the URG (Universal Reset Generator), and the UCG (Universal Clock Generator). Read more.
Co-Founder of Altrosyn and Director at CDTECH | Inventor | Manufacturer
2 weeks ago
The focus on efficiency gains in LLMs through new inference frameworks raises questions about whether this prioritizes speed over other crucial aspects like explainability and bias mitigation. Recent research on "AI for Social Good" emphasizes the need for ethical considerations alongside performance improvements. How would these power modules impact the accessibility of electric vehicles in developing nations with limited infrastructure?