Summer of AI: Silicon Valley Heating Up
June 30, 2024
PART I: When the Future Watches Us
As summer kicks off, Silicon Valley is abuzz with intense philosophical debates about artificial intelligence. The divide is sharper than ever, with two distinct camps: the messianic followers on a "mission," like OpenAI, and those who adopt a different perspective.
Yann LeCun: The Skeptic's Saviour
Amidst the clamor, Yann LeCun stands out as a refreshing voice, despite a recent embarrassing clash with Elon Musk on X (formerly known as Twitter). LeCun distances himself from the more outlandish theories by questioning whether artificial general intelligence (AGI) can be achieved merely by scaling up generative LLMs. He urges us to consider new architectures capable of building mental maps.
Zuckerberg: The AI Heretic
Mark Zuckerberg clearly distances himself from the messianic approaches in the AI field. He advocates for multiple artificial intelligences to diversify the market. In his own words, some companies seem to believe they are creating God.
The Fallen Angels of OpenAI
Meanwhile, 11 current and former OpenAI employees, along with other industry workers, penned an open letter condemning what they see as a policy of silence at major corporations. They call for greater freedom to voice their concerns and for protection against retaliation. The letter is endorsed by prominent AI authorities such as Stuart Russell and Turing Award winners Yoshua Bengio and Geoffrey Hinton.
The Prometheus Archetype Podcast
A trending podcast features Leopold Aschenbrenner, a former member of OpenAI's dismantled superalignment team. His controversial vision involves a superintelligence explosion powered by unlimited energy.
Athena in Action
In response, the brilliant Sabine Hossenfelder dismantles Leopold Aschenbrenner's essay with two words: energy and data. She bluntly states, "Honestly, I think these guys have totally lost their minds. They live in a techno-utopian bubble with groupthink written in capital letters."
PART II: Back to Earth... The Present
We Highlight "Virtual Rodent"
Google DeepMind, in collaboration with Harvard University, has developed a virtual biomechanical model of a rat that is controlled by neural networks inside a physics simulation. Trained with deep reinforcement learning, this virtual agent mimics the behavior of real rats whose brain activity was recorded as they moved. This virtual doppelgänger helps "interpret the structure of neural activity in movements and relate it to theoretical principles of motor control."
Remarkably, the neural network also performed zero-shot movements in sequences it was never trained on, closely aligning with the behavior of a biological brain, even under the theoretical energy-saving assumption.
Synthetic Data Generators
FLORENCE-2 VLM from Microsoft
Microsoft has released Florence-2, an open-source model under the MIT license that promises to revolutionize synthetic data generation. With a base model of only 0.23B parameters and a large model of 0.77B parameters, it is so lightweight that it can even be deployed on mobile devices.
Florence-2 was trained on the massive FLD-5B dataset, which contains over 5 billion annotations for 126 million images and was built with an iterative automated annotation strategy using specialized models. The model itself combines DaViT image embeddings with multitask text prompts. This combined embedding is then passed to a transformer encoder-decoder, generating outputs that range from captions and detailed descriptions to pixel-level segmentations and annotations.
This tool is invaluable for mass image labeling, enabling the training of new classes and the generation of custom synthetic data through fine-tuning.
NVIDIA Nemotron-4 340B
NVIDIA has introduced the Nemotron-4 340B family of generative models, optimized for its TensorRT-LLM library. These open models include base, instruct, and reward variants, designed to generate high-quality synthetic text data that enhances the training of custom LLMs.
Generative Model Releases
Claude 3.5 Sonnet from Anthropic: The standout among LLMs is Claude 3.5 Sonnet, which outperforms its predecessors and GPT-4o. It introduces "Artifacts," a feature that displays generated content such as code and documents in a dedicated window next to the conversation.
Google's Gemini 1.5 Pro: This model boasts an impressive context window of 2 million tokens and adds context caching, which lets repeated input tokens be reused across requests to reduce costs.
DeepSeek LLM 7B/67B: Available as open source for research, these models were trained on 2 trillion tokens in English and Chinese and outperform GPT-3.5. DeepSeek also offers its newer DeepSeek-V2 and DeepSeek-Coder-V2 models, with 236 billion parameters, through its API.
Qwen2: This family includes base and instruction-tuned versions, available in five sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B parameters. Its reported strengths include mathematical reasoning, with a cited accuracy of 39% on the GSM8K benchmark.
Meta's Impressive Open-Source Models
Mamba-2: The second version of the Mamba state-space model offers an efficient and scalable alternative to traditional Transformer models, improving computational efficiency and benchmark performance.
Alternatives to Copilot:
Mistral AI's Codestral 22B: An open-weight model proficient in over 80 programming languages, designed to boost developer productivity in code generation, test writing, and bug fixing.
LSP-AI: An open-source language server that acts as a backend for AI-driven editor functionality, enabling models such as Codestral to assist software engineers in code generation.
MULTIMEDIA: VIDEO, AUDIO
The First Ever Ad Created Entirely with SORA Launches
The world of advertising witnesses a milestone with the launch of the first-ever ad generated entirely by SORA. This achievement underscores the potential of AI in multimedia content creation, opening new possibilities and raising questions about the future of advertising.
KLING: SORA's New Contender
From China, Kuaishou introduces KLING, a text-to-video model capable of producing high-quality videos up to 2 minutes long. Although it is currently available only to those on a waiting list, expectations are sky high. You can check out the demos on its official YouTube channel to see its capabilities.
The model's visual coherence is impressive. One demo features a child on a bicycle in a 2-minute-long shot, seamlessly transitioning from spring to winter.
KLING also excels at animating images, such as the famous Afghan Girl cover from National Geographic.
Runway and Its Gen-3 Alpha
Runway, a familiar name in text-to-video technology, is raising the bar with the upcoming Gen-3 Alpha version. This promises high fidelity and realism, with fine temporal control, ideal for generating physical systems.
Luma and Its Dream Machine with Controversy
San Francisco startup Luma has launched Dream Machine, another video generation tool. Their demos feature "Monster Camp," but the characters and styles closely resemble those from Pixar's "Monsters, Inc." This raises potential copyright infringement issues, putting the spotlight on originality and rights in AI content generation.
Google V2A (Video to Audio)
Google introduces the V2A model, which adds an audio dimension to videos. This model not only adds sound effects but also character dialogue that matches the tone and context of the video, as well as full soundtracks. It marks a significant breakthrough for multimedia.
Stable Diffusion 3 (Text to Image)
Stable Diffusion 3, a leader in open-source image generation, is undergoing a paradigm shift with a new restrictive license. Stability AI's economic difficulties have led to changes in licensing conditions, resulting in a ban on SD3 derivatives on platforms like CivitAI. This is a turning point in the use and distribution of this historically significant open-source tool.
BIG TECH NEWS HIGHLIGHTS:
APPLE
OpenAI has announced that ChatGPT will be integrated into iOS, iPadOS, and macOS. However, its launch in Europe might be delayed due to privacy policy concerns. Instead of paying OpenAI for this integration, Apple is leveraging "exposure" on its millions of devices as a bargaining chip.
Apple's attempt to appropriate the acronym AI (artificial intelligence) by naming its upcoming feature suite Apple Intelligence has not been well received. The new system promises improvements to Siri and future health monitoring features. This anticipation has boosted Apple's market value to $3 trillion. It also led to changes in the company's privacy policies, with new sections on advertising and online tracking added.
The company also announced Genmoji for iOS 18, which will be available starting in September. This new feature uses AI to generate unique emojis based on user prompts, expanding the current selection of nearly 3,800 emojis compatible with Unicode 15.1.
META
Meta continues to release multiple open-source models while discussing a potential collaboration with Apple. However, its plans to train its artificial intelligence using data from users of its platforms without their consent have sparked controversy. This has led NOYB, the European Centre for Digital Rights, to file complaints in 11 EU countries against the parent company of Facebook and Instagram.
OpenAI: Energy Accelerationism Without Resilience
OpenAI is experiencing a significant loss of confidence. The Scarlett Johansson case, delayed product releases, layoffs, and the dismantling of the superalignment team have disappointed many of its loyal followers. Additionally, CEO Sam Altman taking on the role of head of the safety committee has added to the concerns.
The Scarlett Johansson case, tied to her role as the voice of Samantha in the movie "Her," exemplifies OpenAI's controversial policies. Johansson was asked to lend her voice to ChatGPT but turned down the job. Despite this, the project proceeded with a voice that mimicked Samantha's sensuality. The demos caused discomfort among some users and drew criticism over the apparent unauthorized imitation of Johansson's voice. OpenAI ultimately removed the Sky voice from the package, stating it was based on another actress. Johansson's subsequent legal challenge forced the company to delay the release of this update.
Experts believe this case could set a precedent for synthetic content, much like Bette Midler's successful lawsuit in the 1980s over an imitation of her singing voice in a commercial.
"HIS" or "Her"?
As mentioned earlier, AI workers have signed an open letter expressing concerns about the restrictive policies, incentives, and confidentiality agreements that companies impose on their employees. Altman claimed ignorance of such clauses being enforced on departing employees.
The departures of Jan Leike (now with Anthropic) and Ilya Sutskever (who has started his own AI company), both from the safety team, along with the dismissals of Leopold Aschenbrenner and Pavel Izmailov for alleged leaks, led to the dismantling of the superalignment team. The new safety committee is now chaired by CEO Sam Altman.
Sam Altman, whose personal investments are worth more than $2.8 billion according to the Wall Street Journal, has admitted he does not fully understand how his own technology works. Meanwhile, an OpenAI insider estimates a 70% chance that AI will destroy or harm humanity.
Despite these challenges, OpenAI continues to grow, with annualized revenues reaching $3.4 billion according to The Information. The company has also acquired new firms such as Rockset and Multi (formerly Remotion).
Elon Musk: From Lawsuit to Lawsuit
Tesla shareholders have filed a lawsuit in Delaware Chancery Court, alleging that Elon Musk has been diverting resources and talent from Tesla to his new venture, xAI. This lawsuit adds to Musk's growing collection of legal troubles.
Despite these issues, Musk continues to make headlines with his lucrative goal-based compensation plan from Tesla.
Always tireless in his predictions, Musk claims that thousands of humanoid robots will be working at Tesla by 2025 and that his mass-produced "Optimus" robot could become the company's biggest asset.
Regarding the Apple-OpenAI partnership, Musk has declared war in his own style. He has threatened to ban the use of iPhones and other Apple devices in his companies, even for visitors, claiming that the integration of ChatGPT in iPhones is like having "creepy spyware." Curiously, he withdrew his lawsuit against OpenAI just before a key hearing.
Musk has also announced his intention to build a supercomputing center with more than 100,000 Nvidia chips by 2025 in Memphis, Tennessee. Meanwhile, his social network X now allows sexually explicit content.
As a final touch, Musk starred in an embarrassing public argument on X with Yann LeCun, chief AI scientist at Meta. LeCun criticized Musk for his treatment of scientists and his conspiracy theories. Musk, true to form, responded with memes, accused LeCun of "going soft," and challenged him with, "What have you done in the last five years?" (Futurism).
Microsoft
The company is keeping a low profile, ending its underwater data center experiments and applying what it learned to other technologies. It is also abandoning the Copilot GPT Builder project after just three months.
Privacy concerns have arisen over Copilot's user tracking and over a new tool, the AI Recall function, which is intended to help you find anything you have previously searched for but risks being used as a spy tool.
That’s it for some of the top stories we’ve highlighted. From IA-ismo, we wish you a happy summer.
Subscribe to our newsletter: IA-ismo – Ethical, Legal, and Technological Challenges for AI Integration.