AI This Week: Microsoft’s Phi-3 Models, Copilot+ PCs, and Cutting-Edge AI Trends
TOP NEWS
Industry
Microsoft unveils new Phi-3 models, GitHub Copilot updates, AI PCs, and more
Let's go through each of the releases!
Small Language Models
Microsoft aims to lead the Small Language Model race with new additions to the Phi-3 family:
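To get a feel for the family, here is a minimal sketch of running a Phi-3 checkpoint with Hugging Face transformers. The model ID below is the earlier Phi-3-mini-4k-instruct, used purely as an assumed example rather than one of this week's new checkpoints.

```python
# Minimal sketch: running a Phi-3 checkpoint locally with Hugging Face transformers.
# The model ID is an assumed example (Phi-3-mini); swap in whichever variant you want.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

messages = [{"role": "user", "content": "Summarize this week's AI news in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```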
Microsoft Copilots and GitHub Copilot
New updates and features across the Copilot family:
New Copilot+ PCs
Introducing a new category of Windows PCs designed for AI:
Other Announcements
TRENDING SIGNALS
TOP REPOS
Crowdsourcing
Fabric is an open-source framework for augmenting humans with AI. It takes a modular approach to solving specific problems, built around a crowdsourced set of AI prompts that can be used anywhere.
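Fabric itself ships as a CLI, but the core idea is easy to see in a few lines: a "pattern" is a reusable, community-written prompt applied to whatever text you feed it. The sketch below illustrates that idea with the OpenAI Python client as a stand-in; it is not Fabric's actual API, and the pattern text here is made up for illustration.

```python
# Illustrative sketch only, not Fabric's actual API: the core idea is applying a
# reusable, crowdsourced prompt ("pattern") to arbitrary input text via an LLM.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A pattern is just a carefully written system prompt, shared and versioned publicly.
summarize_pattern = (
    "You are an expert summarizer. Extract the key points of the input "
    "as a short bulleted list, preserving technical detail."
)

def apply_pattern(pattern: str, text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": pattern},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(apply_pattern(summarize_pattern, "Microsoft unveiled new Phi-3 models this week..."))
```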
Multimodal Models
Lumina-T2X is a series of text-conditioned Diffusion Transformers capable of transforming textual descriptions into images, videos, detailed multi-view 3D images, and synthesized speech. The family can generate output in any modality, resolution, and duration within a single framework, while requiring relatively modest training resources.
RAG
Verba is Weaviate's open-source application designed to offer an end-to-end, streamlined, and user-friendly interface for Retrieval-Augmented Generation (RAG) out of the box. You can run Verba either locally with HuggingFace and Ollama or through LLM providers such as OpenAI, Cohere, and Google. It also supports customizable RAG frameworks, data types, chunking, and retrieval techniques.
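For readers new to RAG, the sketch below shows the bare loop that tools like Verba package up behind a UI: embed documents, retrieve the closest match for a query, and let an LLM answer from that context. This is not Verba's API; the models and documents are assumed examples.

```python
# Minimal RAG loop (illustrative only, not Verba's API): embed documents, retrieve
# the most relevant one for a question, then answer using only that context.
import numpy as np
from openai import OpenAI

client = OpenAI()
docs = [
    "Verba is Weaviate's open-source RAG application.",
    "Phi-3 is a family of small language models from Microsoft.",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(docs)

def answer(question: str) -> str:
    q_vec = embed([question])[0]
    # Cosine similarity; pick the single most relevant document as context.
    scores = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    context = docs[int(scores.argmax())]
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("What is Verba?"))
```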
Language Models
A new repository offers a detailed implementation of Llama 3, covering:
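As a flavor of what a from-scratch Llama 3 walkthrough involves, here is a minimal sketch (assumed, not taken from the repository) of RMSNorm, the normalization layer Llama-family models use in place of LayerNorm.

```python
# RMSNorm: normalize by the root-mean-square of the features (no mean subtraction),
# then apply a learned per-channel scale. Sketch only; a real implementation will differ.
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned per-channel scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

x = torch.randn(2, 8, 64)       # (batch, sequence, hidden)
print(RMSNorm(64)(x).shape)     # torch.Size([2, 8, 64])
```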
TOP PAPERS
Text-to-Image
Chameleon is a family of early-fusion, token-based mixed-modal models that can understand and generate images and text in any sequence. Its fully token-based architecture allows for seamless information integration across modalities, achieving state-of-the-art performance on image captioning and visual QA benchmarks.
3D Generation
CAT3D uses a multi-view diffusion model to generate multiple output views given one or more input views and camera poses. It outperforms prior work on few-view and single-image 3D reconstruction benchmarks and can generate high-quality 3D content in as little as one minute.
Language Models
This study compares LoRA and full finetuning on programming and mathematics tasks, showing that while LoRA underperforms full finetuning in the target domains, it better preserves the base model's performance outside them and provides stronger regularization.
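For context on what is being compared, the sketch below shows the basic LoRA idea (an illustrative minimal implementation, not the paper's code): the pretrained weight is frozen and only a low-rank update is trained, which is where both the parameter savings and the regularization effect come from.

```python
# LoRA in miniature: freeze the base weight W and learn a low-rank update B @ A,
# so only r * (in_features + out_features) parameters are trained.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # full finetuning would train these too
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts identical to base
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 8192 trainable parameters, vs 262,656 for full finetuning of this layer
```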
Subscribe to Newsletter: https://lnkd.in/guxfrUSM