The most notable new small models and embedding research:
- Hymba: NVIDIA's hybrid-head architecture merges transformer attention with state space models (SSMs). It outperforms larger models like Llama-3.2-3B with lower memory use and higher throughput.
- SlimLM: Built for smartphones, it handles document-assistance tasks directly on mobile devices.
- BlueLM-V-3B: A large multimodal model for mobile that excels at multilingual OCR and image-to-text with efficient embeddings.
- Jina CLIP v2: Offers multilingual, multimodal text and image embeddings using compact Matryoshka representations (a small truncation sketch follows this post).
Find more AI/ML news in our FOD: https://lnkd.in/eFX_Ghkw
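Jina CLIP v2's Matryoshka representations rely on embeddings whose leading dimensions already carry most of the signal, so vectors can be truncated to save memory and still be compared. A minimal, illustrative sketch of the truncate-and-renormalize idea, using made-up random vectors rather than Jina's actual API:

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dims: int) -> np.ndarray:
    """Keep only the first `dims` coordinates and re-normalize to unit length."""
    short = vec[:dims]
    return short / np.linalg.norm(short)

rng = np.random.default_rng(0)
a, b = rng.normal(size=(2, 1024))             # stand-ins for two full-size embeddings

for dims in (1024, 512, 128, 64):
    a_k, b_k = truncate_embedding(a, dims), truncate_embedding(b, dims)
    print(dims, float(a_k @ b_k))             # cosine similarity at each truncation
```

With a Matryoshka-trained model, the shorter prefixes keep similarity rankings close to the full-size embedding, which is what makes the compact representations useful for retrieval.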
TuringPost
Technology, Information and Media
Newsletter about AI and ML. Sign up for free to get your list of essential AI resources.
About us
Turing Post is everything you need to make smarter decisions about AI. We connect the dots to understand where AI comes from, its current impact on the world, and where it leads us. Or, hopefully, where we are driving it.
Bonus for those who have read this far: sign up now to receive your free AI essential kit with resources to master AI and ML: https://www.turingpost.com/subscribe
What to expect in your inbox?
- Froth on the Daydream: our weekly newsletter giving you a full picture of the ever-evolving AI landscape. We read over 150 newsletters so you don't have to.
- ML Series on Wednesdays: currently, a monumental FMOps series.
- Unicorn Chronicle: exclusive profiles and insights you won't find anywhere else. We have already covered OpenAI, Anthropic, Inflection, Hugging Face, and Cohere.
- Foreign AI Affairs: a global perspective on AI as we explore its advancements in China, Russia, Israel, Europe, and beyond.
And more is coming!
- Website: https://www.turingpost.com/
- Industry: Technology, Information and Media
- Company size: 2-10 employees
- Headquarters: New York
- Type: Joint venture
- Founded: 2023
- Specialties: Data Science, Machine Learning, Artificial Intelligence, Deep Learning, Neural Networks, GAN, Data Labeling, Feature Stores, Technology, Education, Startups, Investing, Research, AI, ML, Coding, MLOps, Computer Science, Big Data, Reinforcement Learning, Algorithms, Data Visualization, and Chatbot
- Locations: New York, US (primary)
Posts
-
Top 5 research papers of the week - all about models:
1. Tülu 3: https://lnkd.in/eVeFRZ8y
The Allen Institute for AI's Tülu 3 excels in open post-training with curated prompts, synthetic fine-tuning, and an advanced RLVR framework, outperforming Llama 3.1-Instruct and nearly rivaling proprietary systems on GSM8K and IFEval.
2. Marco-o1: https://lnkd.in/eQze84Ne
Alibaba's Marco-o1 uses Chain-of-Thought tuning and Monte Carlo Tree Search to boost performance. Achieving +6% MGSM gains and outperforming Google Translate on complex tasks, it redefines reasoning potential.
3. DeepSeek-R1-Lite: https://lnkd.in/gE6Q9rDt
DeepSeek launches R1-Lite-Preview, a reasoning AI for logic, math, and real-time problem-solving. It competes with OpenAI's o1-preview on the AIME and MATH benchmarks. DeepSeek plans to open-source its models.
4. Bi-Mamba: https://lnkd.in/e_kchygB
Bi-Mamba, from MBZUAI and Carnegie Mellon University, makes 1-bit modeling a reality. It reduces storage by 80%, saves energy, and matches full-precision models like Mamba-2. Designed for low-bit hardware, it shows that efficiency doesn't have to mean compromise (see the binarization sketch after this list).
5. Pixtral Large: https://lnkd.in/g5fDyH8s
Pixtral Large, with 124B parameters, redefines multimodal AI. From documents to high-res images, it handles enterprise challenges effortlessly, outpacing GPT-4o and Claude-3.5 Sonnet on key tests.
Find a complete list of the latest research papers in our free weekly digest: https://lnkd.in/eFX_Ghkw
Also, elevate your AI game with our free newsletter ↓ https://lnkd.in/dtfp4U4e
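For context on the Bi-Mamba item: here is a toy sketch of what 1-bit weights mean in general. A float matrix is replaced by a sign matrix plus a per-row scale, the classic XNOR-Net-style scheme. This only illustrates the idea of binarized weights; it is not Bi-Mamba's actual recipe.

```python
import torch

def binarize_weights(w: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Approximate a float weight matrix as alpha * sign(W).

    alpha is the mean absolute value of each row, so every weight needs only
    1 bit plus a shared per-row scale. Illustrative only, not Bi-Mamba's scheme.
    """
    alpha = w.abs().mean(dim=1, keepdim=True)   # one scale per output row
    w_bin = torch.sign(w)
    w_bin[w_bin == 0] = 1.0                     # avoid zeros in the sign matrix
    return w_bin, alpha

w = torch.randn(4, 8)
w_bin, alpha = binarize_weights(w)
reconstruction = alpha * w_bin
print("mean approximation error:", (w - reconstruction).abs().mean().item())
```

The storage savings quoted for 1-bit models come from exactly this kind of representation: the bulk of the parameters shrink from 16 or 32 bits each to a single bit plus a handful of scales.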
-
Let's investigate what makes the SAMURAI model great at segmenting and tracking video objects in real time.
The SAMURAI model for precise segmentation and tracking of objects in videos
-
What is Semi-Supervised Learning? You can use these flashcards to refresh your knowledge or share key ML and AI concepts with people who need an easy introduction to them. Here's Semi-Supervised Learning, a technique used to train ML models on a mix of labeled and unlabeled data. Find more flashcards on the methodologies behind training machine learning models here: https://lnkd.in/eb92qSXm
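As a quick illustration of the idea, here is a minimal self-training sketch with scikit-learn: a model is fit on the few labeled points, its confident predictions become pseudo-labels for the unlabeled ones, and training repeats. The dataset, label fraction, and threshold below are arbitrary choices for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Toy dataset: only ~10% of the points keep their labels, the rest are marked -1.
X, y = make_classification(n_samples=500, random_state=0)
rng = np.random.default_rng(0)
y_semi = y.copy()
y_semi[rng.random(len(y)) > 0.1] = -1          # -1 means "unlabeled" in scikit-learn

# Self-training: fit on labeled data, pseudo-label confident unlabeled points, repeat.
model = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
model.fit(X, y_semi)
print("accuracy against the true labels:", model.score(X, y))
```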
-
The free ultimate guide to fine-tuning LLMs by Ireland's Centre for AI introduces key theoretical aspects and practical approaches. This guide covers:
- History and growth of LLMs
- Fine-tuning methods
- A seven-stage pipeline for fine-tuning
- Useful frameworks
- Fine-tuning of multimodal LLMs
- Open issues
Check it out here: https://lnkd.in/ethzkw7U
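As one concrete example of a parameter-efficient fine-tuning method in this space, here is a minimal LoRA setup using Hugging Face's peft library. The base model and target modules are arbitrary choices for the sketch and are not taken from the guide itself.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load a small base model and attach low-rank adapters to its attention projection.
model = AutoModelForCausalLM.from_pretrained("gpt2")
lora = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection layer
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```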
-
Want to know more about Sparse Autoencoders (SAEs)? Here are 12 recent studies on SAEs:
- Sparse Autoencoders Find Highly Interpretable Features in LMs
- Compute Optimal Inference and Provable Amortisation Gap in SAEs
- Direct Preference Optimization Using Sparse Feature-Level Constraints
- Jumping Ahead
- Decoding Dark Matter
- 4 studies on SAEs in steering LLMs
...
Check this out for the full collection: https://lnkd.in/e9CjJSM4
12 Research Papers on Sparse Autoencoders
turingpost.com
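For readers new to the topic, a minimal sparse autoencoder looks roughly like this: an overcomplete ReLU encoder trained to reconstruct model activations under an L1 sparsity penalty, so each input is explained by a few interpretable features. This is a generic sketch, not the setup used in any of the papers above.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE: overcomplete hidden layer, ReLU features, L1 sparsity penalty."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        features = torch.relu(self.encoder(x))   # sparse feature activations
        recon = self.decoder(features)
        return recon, features

sae = SparseAutoencoder(d_model=256, d_hidden=1024)
x = torch.randn(32, 256)                         # stand-in for LLM activations
recon, feats = sae(x)
loss = ((recon - x) ** 2).mean() + 1e-3 * feats.abs().mean()  # reconstruction + L1
print(loss.item())
```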
-
Important findings on activation sparsity from Tsinghua University.
First, a brief overview: activation sparsity means that many neural network outputs are zero or very small, contributing little to the final result for a given input (a small measurement sketch follows this post). High activation sparsity leads to:
- Faster computation
- Easier training
- Better understanding of how an LLM works
So Tsinghua University's researchers developed a new metric, PPL-p% sparsity, to assess how sparse a model is. It works across different architectures (flexibility), accounts for how much performance drops, and identifies which neurons contribute less.
Here are their key findings:
- Deeper LLM architectures tend to be sparser, but we need a balance between depth and width.
- Larger models don't necessarily lead to much higher sparsity.
- Smaller models reach their maximum sparsity more quickly.
- The activation function matters: for example, ReLU tends to create higher sparsity than SiLU.
- The more data a model is trained on, the sparser it becomes, but this depends more on the activation function.
Paper: https://lnkd.in/ePFyjikv
Materials: https://lnkd.in/e8cuK_7X
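The PPL-p% metric itself is defined in the paper, but the underlying notion of activation sparsity is easy to measure directly: count the fraction of activations that are (near) zero. A small sketch on random inputs, illustrative only, showing why ReLU produces far more exact zeros than SiLU:

```python
import torch

def activation_sparsity(x: torch.Tensor, tol: float = 1e-3) -> float:
    """Fraction of activations whose magnitude is at or below `tol`."""
    return (x.abs() <= tol).float().mean().item()

pre_act = torch.randn(4096, 1024)                 # stand-in for pre-activation values
relu_out = torch.relu(pre_act)                    # exact zeros for all negative inputs
silu_out = torch.nn.functional.silu(pre_act)      # smooth, rarely exactly zero

print("ReLU sparsity:", activation_sparsity(relu_out))   # roughly 0.5 on this input
print("SiLU sparsity:", activation_sparsity(silu_out))   # much lower
```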