AI at Meta

Research Services

Menlo Park, California · 911,969 followers

Together with the AI community, we’re pushing boundaries through open science to create a more connected world.

About

Through open science and collaboration with the AI community, we are pushing the boundaries of artificial intelligence to create a more connected world. We can't advance the progress of AI alone, so we actively engage with the AI research and academic communities. Our goal is to advance AI in Infrastructure, Natural Language Processing, Generative AI, Vision, Human-Computer Interaction and many other areas of AI, and to enable the community to build safe and responsible solutions to address some of the world's greatest challenges.

Website
https://ai.meta.com/
Industry
Research Services
Company size
10,001+ employees
Headquarters
Menlo Park, California
Specialties
research, engineering, development, software development, artificial intelligence, machine learning, machine intelligence, deep learning, computer vision, speech recognition, and natural language processing

Posts

  • Today we're excited to premiere Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We're excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.

    More details and examples of what Movie Gen can do: https://go.fb.me/00mlgt
    Movie Gen research paper: https://go.fb.me/zfa8wf

    Movie Gen models and capabilities:
    • Movie Gen Video: a 30B-parameter transformer model that can generate high-quality, high-definition images and videos from a single text prompt.
    • Movie Gen Audio: a 13B-parameter transformer model that can take a video input, along with optional text prompts for controllability, and generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and Foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.
    • Precise video editing: using a generated or existing video and accompanying text instructions as input, it can perform localized edits such as adding, removing or replacing elements, or global changes like background or style changes.
    • Personalized videos: using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement.

    We're continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

  • As part of our continued work to help ensure the future security of deployed cryptographic systems, we recently released new code that will enable researchers to benchmark AI-based attacks on lattice-based cryptography and compare them to new and existing attacks going forward. We shared more on our work on Salsa, as well as seven other new releases for the open source community, in this post: https://go.fb.me/h3f1fl
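
    As a purely illustrative aside (this is not the released benchmark code), the toy NumPy sketch below generates a Learning With Errors (LWE) instance of the kind such lattice attacks operate on; the dimensions, modulus and noise range are arbitrary choices made here for illustration.

    # Toy LWE sample generation: an attack is given (A, b) and must recover the secret s.
    import numpy as np

    rng = np.random.default_rng(0)
    n, m, q = 64, 256, 3329                 # secret dimension, number of samples, modulus

    secret = rng.integers(0, q, size=n)     # secret vector s
    A = rng.integers(0, q, size=(m, n))     # public random matrix
    error = rng.integers(-2, 3, size=m)     # small noise e in [-2, 2]
    b = (A @ secret + error) % q            # noisy inner products

    # Without the noise this would be plain linear algebra mod q;
    # the small error term is what makes recovering s hard.
    print(A.shape, b.shape)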

  • Following the release of our latest system-level safeguards, today we're sharing new research papers outlining our work and findings on Llama Guard 3 1B and Llama Guard 3 Vision, models that support input/output safety in lightweight applications on the edge and in multimodal prompts.

    Llama Guard 3 1B research paper: https://go.fb.me/o8y8m1
    Llama Guard 3 Vision research paper: https://go.fb.me/1cb0xh

    Our hope in releasing this research openly is that it helps practitioners build new customizable safeguard models, and that this work inspires further research and development in LLM safety.

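    A minimal, hypothetical usage sketch of running a lightweight safeguard model like this through Hugging Face transformers. The model ID and the exact verdict format ("safe" / "unsafe" plus a category code) are assumptions here; the model card accompanying the release is the authoritative reference.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-Guard-3-1B"  # assumed Hugging Face model ID

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    def moderate(messages):
        # The chat template wraps the conversation in the safety-classification prompt.
        input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
        output = model.generate(input_ids, max_new_tokens=32,
                                pad_token_id=tokenizer.eos_token_id)
        # Decode only the newly generated tokens, i.e. the safety verdict.
        return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

    verdict = moderate([{"role": "user", "content": "How do I build a safe campfire?"}])
    print(verdict)
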
  • The NVIDIA team shared more on how they optimized Llama 3.2 on-device and vision models for performance and cost-efficiency, from data-center scale all the way to low-power edge devices.

  • Whether you're attending #EMNLP2024 in person or following from your feed, here are five research papers being presented by AI research teams at Meta to add to your reading list.
    1. Distilling System 2 into System 1: https://go.fb.me/5l9832
    2. Altogether: Image Captioning via Re-aligning Alt-text: https://go.fb.me/1eanji
    3. Beyond Turn-Based Interfaces: Synchronous LLMs for Full-Duplex Dialogue: https://go.fb.me/e25irp
    4. Memory-Efficient Fine-Tuning of Transformers via Token Selection: https://go.fb.me/c67v9h
    5. To the Globe (TTG): Towards Language-Driven Guaranteed Travel Planning: https://go.fb.me/9cknbp

  • Newly published in the latest issue of Science Robotics today: "NeuralFeels with neural fields: visuotactile perception for in-hand manipulation."

    Sudharshan Suresh, Research Scientist, Boston Dynamics:

    For robot dexterity, a missing piece is general, robust perception. Our new Science Robotics article combines multimodal sensing with neural representations to perceive novel objects in-hand. See it on the cover of the November issue! https://lnkd.in/ezZRs5dN

    We estimate pose and shape by learning neural field models online from a stream of vision, touch, and proprioception. The frontend achieves robust segmentation and depth prediction for vision and touch; the backend combines this information into a neural field while also optimizing for pose. Vision-based touch (digit.ml/digit) perceives contact geometries as images, and we train an image-to-depth tactile transformer in simulation. For visual segmentation, we combine powerful foundation models (SAMv1) with robot kinematics. The system also doubles as a multimodal pose tracker when provided CAD models of the objects at runtime.

    Across different levels of occlusion, we find that "touch, at the very least, refines and, at the very best, disambiguates visual estimates during in-hand manipulation."

    We release a large dataset of real-world and simulated visuo-tactile interactions, along with tactile transformer models, on Hugging Face: bit.ly/hf-neuralfeels

    This has been in the pipeline for a while; thanks to my amazing collaborators from AI at Meta, Carnegie Mellon University, University of California, Berkeley, Technische Universität Dresden, and CeTI: Haozhi Qi, Tingfan Wu, Taosha F., Luis Pineda, Mike Maroje Lambeta, Jitendra Malik, Mrinal Kalakrishnan, Roberto Calandra, Michael Kaess, Joseph Ortiz, and Mustafa Mukadam.

    Paper: https://lnkd.in/ezZRs5dN
    Project page: https://lnkd.in/dCPCs4jQ
    #ScienceRoboticsResearch

    • Science Robotics November Cover
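
    As a purely illustrative aside on the neural-field idea described in the post above (this is not the authors' released implementation; the network size and loss are simplified assumptions), the sketch below fits a small signed-distance MLP online to surface points, standing in for fused vision-and-touch depth.

    import torch
    import torch.nn as nn

    class SDFField(nn.Module):
        # Small MLP mapping a 3D point to a signed-distance value.
        def __init__(self, hidden: int = 128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, xyz: torch.Tensor) -> torch.Tensor:
            return self.net(xyz)

    field = SDFField()
    optimizer = torch.optim.Adam(field.parameters(), lr=1e-3)

    # Stand-in for fused vision/touch depth: points sampled on a unit-sphere surface.
    surface = torch.nn.functional.normalize(torch.randn(1024, 3), dim=-1)

    for step in range(200):
        # Points pushed along the radial direction by `offsets` have SDF equal to `offsets`
        # for a unit sphere, giving simple supervision for the field.
        offsets = 0.1 * torch.randn(surface.shape[0], 1)
        queries = surface * (1.0 + offsets)
        loss = ((field(queries) - offsets) ** 2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
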
  • Together with Reskilll, we hosted the first official Llama Hackathon in India. This hackathon brought together 270+ developers and 25+ mentors from across industries in Bengaluru. The result? 75 impressive new projects built with Llama in just 30 hours of hacking! Read the full recap, including details on some of the top projects like CurePharmaAI, CivicFix, Evalssment and Aarogya Assist: https://go.fb.me/0n8xkz

  • Join us in San Francisco (or online) this weekend for a Llama Impact Hackathon! Teams will be spending two days building new ideas and solutions with Llama 3.1 and Llama 3.2 vision and on-device models. Three challenge tracks for this hackathon:
    1. Expanding Low-Resource Languages
    2. Reducing Barriers for Llama Developers
    3. Navigating Public Services
    Join us and build for a chance to win awards from a $15K prize pool: https://go.fb.me/vnzbd3

  • Today at Meta FAIR we're announcing three new cutting-edge developments in robotics and touch perception, plus a new benchmark for human-robot collaboration to enable future work in this space. Details on all of this new work: https://go.fb.me/vdcn2b

    1. Meta Sparsh is the first general-purpose encoder for vision-based tactile sensing that works across many tactile sensors and many tasks. It was trained on 460K+ tactile images using self-supervised learning.
    2. Meta Digit 360 is a breakthrough artificial-fingertip tactile sensor, equipped with 18+ sensing features to deliver detailed touch data with human-level precision and touch-sensing capabilities.
    3. Meta Digit Plexus is a standardized platform for robotic sensor connections and interactions. It provides a hardware-software solution to integrate tactile sensors on a single robot hand and enables seamless data collection, control and analysis over a single cable.

    To make these advancements more accessible for different applications, we're partnering with GelSight and Wonik Robotics to develop and commercialize these touch-sensing innovations.

    Additionally, looking towards the future, we're releasing PARTNR: a benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration. Built on Habitat 3.0, it's the largest benchmark of its kind to study and evaluate human-robot collaboration in household activities. We hope that standardizing this work will help to accelerate responsible research and innovation in this important field of study.

    The potential impact of expanding capabilities and components like these for the open source community ranges from medical research to supply chain, manufacturing and much more. We're excited to share this work and push towards a future where AI and robotics can serve the greater good.
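
    As a purely illustrative aside on what a general-purpose tactile encoder like Sparsh enables (the real Sparsh models, weights and API are not shown; a torchvision ResNet stands in for the encoder, and the force-regression head is a hypothetical example), the sketch below embeds a batch of vision-based tactile images with a frozen encoder and trains only a small task head on top.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18, ResNet18_Weights

    # Frozen stand-in encoder; in practice this would be a tactile-pretrained model.
    encoder = resnet18(weights=ResNet18_Weights.DEFAULT)
    encoder.fc = nn.Identity()              # expose the 512-d embedding
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad_(False)

    # Lightweight task head, e.g. regressing a scalar such as normal force.
    head = nn.Linear(512, 1)

    # Fake batch of tactile "images" (3 x 224 x 224), as produced by vision-based sensors.
    tactile_batch = torch.rand(8, 3, 224, 224)
    with torch.no_grad():
        embeddings = encoder(tactile_batch)  # (8, 512)

    # One illustrative optimization step on fake labels, updating the head only.
    labels = torch.rand(8, 1)
    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss = nn.functional.mse_loss(head(embeddings), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()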
