Tech Insights 2024 Week 46
Do you apply to jobs on LinkedIn? Then it's time to buy Premium. Only Premium members can see if they are a Top Applicant, and only Top Applicants get prioritized by the new AI Hiring Assistant. Is there a risk that great talent is missed? Yes. Can large organizations optimize their recruitment processes? Absolutely. It will be very interesting to see how many companies adopt the AI Hiring Assistant and how it affects the job application process as a whole going forward.
Among other news this week: GitHub reports a 59% surge in GenAI projects year-over-year, finally making Python the most popular language over JavaScript. ByteDance (owners of TikTok) unveiled X-Portrait 2, which promises to change movie-making and social media forever once it is released. Black Forest Labs launched a new "RAW" mode which creates images that look very close to real photos, at least when it comes to shading and textures. NVIDIA launched four new tools for robot development, and Microsoft unveiled Magentic-One, a new multi-agent AI system.
WANT TO RECEIVE THIS NEWSLETTER AS A WEEKLY EMAIL?
If you prefer to receive this newsletter as a weekly email straight to your inbox, you can sign up at: https://techbyjohan.com/newsletter/ . You will receive one email per week, nothing else, and your contact details will never be shared with any third party.
THIS WEEK'S NEWS:
LinkedIn Launches AI Hiring Assistant for Recruiters
The News:
My take:
Read more:
Python Overtakes JavaScript as Most Popular Language on GitHub
The News:
What you might have missed: GitHub posted three important takeaways from this news:
My take: I believe that every organization and software development team can benefit from Generative AI, from amazing tools like v0, Replit and Cursor to amazing APIs such as the OpenAI Realtime API. There are very few processes that cannot benefit from AI. I meet companies every week that are just starting their AI journey, and once they see all the benefits they just want to accelerate. If you and your organization are not using Generative AI today, feel free to reach out to me and let's set up a meeting where I can give you pointers on where to start.
Social Media is About to Change Forever: ByteDance Unveils X-Portrait 2
The News:
My take: There are two sides to this. On the plus side, soon anyone with a smartphone will be able to create good-looking cartoons or movies at home, without having to consider lighting, expensive cameras or post-production tools. On the flip side, it won't be long before social media as we know it is messed up in ways we could never have imagined. It has now been 1.5 years since the Bold Glamour filter was launched on TikTok, one of the first filters to reconstruct the face entirely using a generative adversarial network (GAN). X-Portrait 2 continues in the same direction, but now uses a model that basically allows you to impersonate anyone or anything. With AI models such as Flux rapidly improving in quality (the new RAW mode launched last week is incredible), it's getting increasingly difficult to detect AI-generated images and videos.
Black Forest Labs Launches Flux Ultra and RAW Modes
The News:
My take: Flux is my current favorite among all AI image generators, and I have been wishing for higher resolutions for quite some time, so the new Ultra mode is very welcome. And while the RAW mode does give the images a better overall "look", the AI is still just as bad at generating fine details as previous versions. If you zoom in, you can quickly determine that the image is AI-generated. But it's getting there!
Read more:
ElevenLabs launches "Voice Design" for Custom and Unique Voices
The News:
My take: ElevenLabs is my current go-to system for text-to-speech, and the new Voice Design feature just made it even better. ElevenLabs is the only model I have found that creates long sections of Swedish text that I can actually listen to without going crazy over all the mispronunciations. If you are using ElevenLabs or any other text-to-speech engine, I think it's worth trying out the new Voice Design feature. It looks very promising!
Waymo Robotaxis Cost More and Have Longer Trip Times
The News:
My take: This study presents an interesting contrast to Waymo's recent announcements about providing 150,000 rides per week. Like all technology, Waymo's robotaxis will get much faster and much cheaper over time, so even if they are much slower and much more expensive today, this should change rapidly in the coming years.
Read more:
NVIDIA Launches Major Updates for Robot Development
The News:
My take: Wow, what a massive update for the robotics industry! NVIDIA is basically providing an entire ecosystem for humanoid robot development, from simulation to perception to control. Among all the new features, I think the most impressive is the new Cosmos Tokenizer, which can now process visual data 12x faster while maintaining the same or better quality. I have included links to Isaac Lab, the Cosmos tokenizer and NeMo Curator below if you are interested in learning more!
Read more:
Microsoft Unveils Magentic-One: Multi-Agent AI System
The News:
What you might have missed: All agents in Magentic-One are based on GPT-4o by default. However, the system is model-agnostic and can use different models to support different capabilities.
My take: Magentic-One sounds very good in theory. However, I could not find a single web page or YouTube video that has actually tested the framework to see how well it works for specific use cases, which is unusual. When OpenAI Swarm launched we had dozens of demonstrations posted within hours. The main benefit of Magentic-One seems to be ease of use, and if you are curious about exactly what makes Magentic-One better than AutoGen, Mehul Gupta made an excellent summary here.
Read more:
Oasis Launches First Playable Open-World AI Model
The News:
My take: Oasis definitely looks like a first step towards AI-powered game worlds that can be generated and modified in real time. However, much like the real-time Doom demo from a few weeks ago, this is a very early research prototype. For example, distant objects appear warped and fuzzy, and the model struggles over long contexts: objects seen on the horizon fall outside the context window and have changed when they come back into view. Still, for a first release I believe Oasis is amazing, and definitely worth a read if you are interested in game development and AI.
Read more: You can actually download and run Oasis yourself from their GitHub page.
Tencent Releases Hunyuan3D-1.0, First Open-Source Text and Image-to-3D Model
The News:
My take: While Hunyuan3D is technically "open source", its license agreement is very restrictive, making it unusable for most business applications. It is currently positioned more as a research and academic tool than as a truly open-source commercial solution. So if you need text-to-3D or image-to-3D, my recommendation is still to use Meshy3D, which many still consider state-of-the-art for text-to-3D and image-to-3D generation.
Read more:
Tencent Releases Hunyuan-Large, Industry's Largest Open-Source MoE Model
The News:
My take: These "almost open source" models are getting annoying. I think it would be better to call them "some source available but with significant restrictions". While the model weights and inference code for Hunyuan-Large are publicly available, the core training code and data remain proprietary. And the license terms are much more restrictive than traditional open-source licenses like Apache or MIT, making it closer to a permissive commercial license than "true" open source.
Read more: