For those that couldn't join us last week, here's the recording of our meetup that covered Pixtral. There were some surprising things we were able to do with it:
- Answer questions about a complicated graph
- Explain what was in a photo
- Build a webpage from a sketch
The other fun part is the transcription engine having trouble with my southern pronunciation of Pixtral :)
https://lnkd.in/ee8pugMK
#pixtral #artificialintelligence #meetup
Huntsville AI's posts
Most relevant posts
-
Thanks to Nicolai Nielsen for making great videos about Keylabs Annotation Platform! If anyone’s interested in data science, your expert’s right here! Made a series of videos on:
- Setting up datasets with Keylabs
- Segmentation and autolabeling
- Facial recognition and frame interpolation
- Project workflows, annotation process, all that good stuff.
There’s an entire playlist here for anyone looking to learn more: https://lnkd.in/eCScSXpD
#AI #Keylabs #Computervision
Keylabs - Image Annotation - YouTube
youtube.com
-
A common challenge when creating a video is writing a compelling script with the appropriate tone. Here’s our solution: Synthesia's AI video assistant that can create videos:
- from an idea (prompt)
- from any public link
- from text documents, such as Word docs and PDFs
Put your magic hat on and join us for an in-depth exploration of this exciting feature on January 31st. More info on the link below!
-
Unlock the full potential of your recordings with LLMs and Speaker Diarization! Learn how to use AssemblyAI and #Haystack together to enhance your RAG application by detecting multiple speakers in audio recordings, and providing a transcript that attributes each utterance to the speaker.
Read more in this post: https://lnkd.in/eXGMTQEY
Build a RAG application with speaker labels: https://lnkd.in/gq8zfdbU
AssemblyAI x Haystack Integration: https://lnkd.in/gcMgfntf
#opensource #llm #rag #multimodality
Level up Your RAG Application with Speaker Diarization | Haystack
haystack.deepset.ai
-
Hugging Face launches Idefics2 vision-language model https://buff.ly/49Eew7v
* Hugging Face has released a new vision-language model called Idefics2.
* Idefics2 is a powerful tool that understands both images and text, excelling at tasks like visual question answering and image-based storytelling.
* Idefics2 is open-source (Apache 2.0 license), making it accessible to the community.
* Idefics2's advanced OCR capabilities allow it to process text within images.
* Idefics2 is easy to integrate and fine-tune due to its compatibility with Hugging Face Transformers.
* Idefics2 was trained on a diverse dataset, leading to its versatility in handling different use cases.
Hugging Face launches Idefics2 vision-language model
https://www.artificialintelligence-news.com
-
AI researchers write "TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation" https://lnkd.in/eMV5fS6z. #artificialintelligence #generativeai #tango #github
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation
pantomatrix.github.io
-
Solving ARC Puzzles with a Collaborative Agent + Human Workflow
In case you missed it: Val Andrei Fajardo created a task solver for ARC (Abstraction and Reasoning Corpus for AGI) that allows an agent to work with a human to solve an ARC task. Currently the best AI systems only get ~35% success, while humans get 85%. This human-in-the-loop collaboration creates training examples that can then enable the LLM to get stronger at reasoning. Defining this flow is simple with LlamaIndex workflows, and we have a full Streamlit demo as well. Check it out:
ARC-AGI: https://lnkd.in/grgPKDGM
Streamlit app: https://lnkd.in/gspATbZb
Github repo: https://lnkd.in/gTCDafWT
Workflows: https://lnkd.in/giseEZ5q
-
One of the most impressive aspects of this AI assistant is its lightning-fast response time. It delivers answers in just one and a half seconds, thanks to a combination of neural network models, including wav2lip, Mistral, and Whisper, working tirelessly under the hood. The code is on GitHub (https://lnkd.in/ePqi9HPR). I love that it's open-source and accessible to everyone. Great move!
-
https://lnkd.in/gQB5E-vV "It is harder to apply a watermark to text than to images, because word choice is essentially the only variable that can be altered. DeepMind’s watermark — called SynthID-Text — alters which words the model selects in a secret, but formulaic way that can be detected with a cryptographic key." #AIWatermark #SynthIDText
Google unveils invisible ‘watermark’ for AI-generated text
nature.com
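To make the quoted idea concrete, here is a toy sketch of keyed word choice. It is not SynthID-Text's actual algorithm, which the post does not describe. It assumes a hypothetical setup where the writer has several interchangeable words for each slot and prefers whichever one a secret key marks as "green"; the function names, the slot lists, and the demo key are all made up for illustration.

```python
import hashlib
import hmac
from typing import List, Sequence


def is_green(key: bytes, prev_word: str, word: str) -> bool:
    """Keyed pseudo-random coin flip for a (previous word, word) pair."""
    message = f"{prev_word}\x00{word}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return digest[0] % 2 == 0


def write_with_watermark(key: bytes, slots: Sequence[Sequence[str]]) -> List[str]:
    """For each non-empty slot of interchangeable words, prefer a green one.

    Falls back to the first candidate when the key marks none of them green.
    """
    words: List[str] = []
    prev = ""
    for candidates in slots:
        chosen = next((w for w in candidates if is_green(key, prev, w)), candidates[0])
        words.append(chosen)
        prev = chosen
    return words


def green_fraction(key: bytes, words: Sequence[str]) -> float:
    """Share of words the key marks green, given each word's predecessor."""
    if not words:
        return 0.0
    prevs = [""] + list(words[:-1])
    hits = sum(is_green(key, p, w) for p, w in zip(prevs, words))
    return hits / len(words)


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical secret, for illustration only
    slots = [["big", "large", "huge"], ["dog", "hound"], ["runs", "sprints", "dashes"]]
    text = write_with_watermark(key, slots)
    print(" ".join(text), green_fraction(key, text))
```

Anyone holding the key can recompute the green fraction: on long text written without the key it should hover near one half, while text written with the key skews higher. Without the key, the word choices look like ordinary synonym picks.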