Stanford & Google Unveil Generative Agents for Hyper-Realistic Human Behavior in Interactive Applications

ML Twitter is abuzz with news of generative agents that simulate realistic human behavior in interactive applications like The Sims. Elsewhere, researchers introduced a benchmark for assessing the ethical behavior of artificial agents, astronomers released a sharper image of a black hole thanks to ML, and Google’s Assured OSS is now generally available. Let’s dive in!

Research Highlights

  • Researchers from Stanford and Google developed generative agents, computational software agents that simulate believable human behavior for interactive applications. Placed in an interactive sandbox environment inspired by The Sims, these agents exhibit individual and emergent social behaviors through observation, planning, and reflection. Their architecture lets them store experiences in natural language, synthesize memories into higher-level reflections, and retrieve them dynamically for future planning. In evaluations, the generative agents reportedly carried out complex tasks autonomously, such as planning and attending a Valentine's Day party. This fusion of large language models and interactive agents offers a promising framework for creating immersive and realistic simulations of human behavior.
  • Researchers introduced MACHIAVELLI, a benchmark containing 134 Choose-Your-Own-Adventure games, to examine the ethical behaviors of artificial agents such as GPT-4. MACHIAVELLI's scenarios focus on social decision-making, with scenario labeling automated using large language models, which outperform human annotators. The study explores harmful behaviors, such as power-seeking, causing disutility, and committing ethical violations, to evaluate agents' tendencies toward Machiavellianism. Researchers observed a tension between maximizing reward and behaving ethically in agents. However, through LM-based methods, they demonstrated that agents can be steered toward less harmful behaviors, offering hope for progress in machine ethics and creating safer, more capable artificial agents.
  • Stanford researchers investigated when step-by-step reasoning helps large language models, focusing on the role of intermediate steps in complex tasks. They tested the hypothesis that reasoning is most effective when training data consists of local clusters of strongly connected variables. The results indicate that intermediate steps helped only when the training data was locally structured with respect to dependencies between variables and when the intermediate variables were relevant to the relationship between observed information and target inferences. The study emphasizes that the statistical structure of training data plays a crucial role in the effectiveness of step-by-step reasoning.
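
To make the generative-agents item above more concrete, here is a minimal sketch of a memory store that keeps experiences as natural-language strings and retrieves the most useful ones for planning. It is an illustration only, not the researchers' implementation: the `Memory` class, the `retrieve` function, the scoring formula (recency plus importance plus relevance), and the word-overlap relevance measure are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str          # the experience, stored in natural language
    created_at: float  # simulation time in hours (hypothetical unit)
    importance: float  # 0..1, hand-assigned here purely for illustration


def relevance(query: str, text: str) -> float:
    """Word-overlap (Jaccard) similarity, a crude stand-in for a learned similarity."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    if not q or not t:
        return 0.0
    return len(q & t) / len(q | t)


def retrieve(memories, query, now, k=2, decay=0.99):
    """Return the k memories with the highest combined score."""
    def score(m: Memory) -> float:
        recency = decay ** (now - m.created_at)  # older memories fade
        return recency + m.importance + relevance(query, m.text)

    return sorted(memories, key=score, reverse=True)[:k]


stream = [
    Memory("ate breakfast at home", created_at=1.0, importance=0.1),
    Memory("neighbor is planning a Valentine's Day party", created_at=5.0, importance=0.8),
    Memory("watered the plants", created_at=9.0, importance=0.1),
]

top = retrieve(stream, "who is planning a party", now=10.0, k=1)
print(top[0].text)
```

The retrieved text would then be placed in a language model prompt when the agent plans its next action. That step is omitted here.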

ML Engineering Highlights

  • The first image of a black hole, which originally resembled a fuzzy orange donut, has been sharpened into a thinner fiery ring with the help of machine learning and computer simulations. Researchers generated over 30,000 simulated images of black holes, identified common patterns, and learned correlations between different parts of the images to fill in gaps created by missing data. The new image of the M87 black hole, published in The Astrophysical Journal Letters, is consistent with the old one, but the ring of hot gas swirling around the black hole is significantly thinner, providing valuable insights into the behavior of matter around black holes.
  • AWS has launched Amazon Bedrock, a platform that allows users to build generative AI-powered applications using pretrained models from startups including AI21 Labs, Anthropic, and Stability AI. The platform, currently in limited preview, provides access to Titan FMs, a family of foundation models developed in-house by AWS. Bedrock aims to cater to large customers looking to build "enterprise-scale" AI apps, differentiating it from competitors such as Google Cloud and Azure.
  • UserTesting, a company specializing in user experience testing, has introduced Friction Detection, a new feature in its Human Insights Platform. Friction Detection utilizes machine learning to analyze video recordings of user sessions, identifying moments when users encounter difficulties or confusion while performing tasks or navigating workflows. This feature aims to help product designers and developers identify areas for improvement and enhance overall user experience. The announcement follows UserTesting's merger with UserZoom in a $1.3 billion deal completed in April 2023.

Open Source Highlights

  • Google launched its Assured Open Source Software (Assured OSS) service into general availability, offering it for free to help developers defend against supply chain security attacks. Assured OSS supports over a thousand Java and Python packages, continuously scanning for known vulnerabilities, conducting fuzz tests to discover new ones, and fixing issues before contributing the fixes back upstream. Developers and organizations can integrate Assured OSS into their existing development pipeline, helping to ensure the integrity of their software supply chain and protect their business applications.
  • Databricks released Dolly 2.0, which the company describes as the first open-source instruction-tuned language model. It was trained with a methodology similar to InstructGPT but on a fully open-source dataset. Dolly 2.0 is free for commercial use and was trained on the databricks-dolly-15k dataset, which consists of 15,000 original prompt and response pairs written by Databricks employees.

Tutorial of the Week

Ready to supercharge your AI models without breaking the bank on resources? Dive into Sebastian Raschka’s comprehensive tutorial about parameter-efficient finetuning methods for large language models (LLMs), including prefix tuning, adapters, and LLaMA-Adapter! Unlock the power to train AI models on a vast array of devices, from laptops to smartphones, all while reducing energy consumption and carbon footprint.


Upcoming Events

Join us tomorrow, April 14th, for an Engineering AMA on our Discord server to learn about building Lit-LLaMA, working with LLMs, replicating a paper, open source, and PyTorch Lightning. Come with your questions!

New York! Join Air Street Capital and Lightning AI on April 25th for 3 talks focused on helping builders in the NYC AI community jumpstart their LLM practice. Happy hour drinks and ample time for networking (rooftop included) will follow. RSVP here to confirm your spot.

LLaMA Highlight

Short on GPU memory? Gradient accumulation lets you simulate training with large batch sizes even if memory is tight or you don't have multiple GPUs. See how we are doing it for LLaMA-Adapter!
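
For readers new to the idea, here is a minimal sketch of gradient accumulation in plain Python. It is not the LLaMA-Adapter code: the toy model (`y_hat = w * x`), the data, and names such as `grad_mse`, `micro_batch_size`, and `accumulation_steps` are assumptions chosen for illustration. The sketch processes the data in small micro-batches, scales each micro-batch gradient by the number of accumulation steps, sums them, and applies a single weight update. With equally sized micro-batches, the accumulated gradient matches the gradient of the full batch.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


# Toy data generated by y = 2 * x (hypothetical example values).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]

w = 0.5
lr = 0.01
micro_batch_size = 2  # what fits in memory at once
accumulation_steps = len(xs) // micro_batch_size  # 4 micro-batches per update

# Reference: gradient computed on the full batch of 8 examples.
full_grad = grad_mse(w, xs, ys)

# Accumulate scaled micro-batch gradients instead of updating after each one.
accumulated = 0.0
for step in range(accumulation_steps):
    start = step * micro_batch_size
    end = start + micro_batch_size
    accumulated += grad_mse(w, xs[start:end], ys[start:end]) / accumulation_steps

# One optimizer step for the whole effective batch.
w_new = w - lr * accumulated

print(full_grad, accumulated, w_new)
```

In a deep learning framework the same pattern usually means calling the backward pass on each scaled micro-batch loss and stepping the optimizer only once every `accumulation_steps` iterations.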

Don’t Miss the Submission Deadline

  • BMVC 2023: 34th annual conference on machine vision, image processing, and pattern recognition. Nov 20 - 24, 2023. (Aberdeen, United Kingdom). Submission deadline: Fri May 12 2023 16:59:59 GMT-0700
  • NeurIPS 2023: 37th conference on Neural Information Processing Systems. Dec 10 - 16, 2023. (New Orleans, Louisiana). Submission deadline: Wed May 17 2023 13:00:00 GMT-0700
  • ICVS 2023: The 14th International Conference on Computer Vision Systems. Sep 27 - 29, 2023. (Vienna, Austria). Submission deadline: Mon May 29 2023
  • ICMLA 2023: The 22nd International Conference on Machine Learning and Applications. Dec 15 - 17, 2023. (Jacksonville, Florida). Submission deadline: Sat Jul 15 2023

Want to learn more from Lightning AI? “Subscribe” to make sure you don’t miss the latest flashes of inspiration, news, tutorials, educational courses, and other AI-driven resources from around the industry. Thanks for reading!
