🚀 Stargate's $500B Gamble, DeepSeek's AI Breakthrough & Anthropic’s Next Leap 🤖
PLUS: Tencent’s 3D Gen, Google DeepMind’s Image Boost & AI Coworkers in 2025
👋 This week in AI.
Each week, I wade through the fire hose of AI noise and distil it into what you actually need to know.
🎵 Don’t feel like reading? Listen to two synthetic podcast hosts talk about it instead.
📰 Latest news
Stargate's $500 Billion Promise: Is Elon Undermining Trump's AI Vision?
The United States has announced Stargate, an ambitious artificial intelligence infrastructure project that aims to invest $500 billion over the next four years, starting with an initial $100 billion.
Led by SoftBank, Oracle, and OpenAI, the project plans to construct twenty data centres, each spanning 46,452 square metres. The aim is to strengthen AI capabilities in the United States, create jobs, and drive economic growth, with construction already underway in Texas.
However, the announcement has been met with public scepticism and controversy. Elon Musk, a member of the Trump administration's Department of Government Efficiency (DOGE), directly challenged the financial viability of Stargate on X (formerly Twitter).
In response to OpenAI's announcement, Musk posted "They don't actually have the money" and followed up with "SoftBank has well under $10B secured. I have that on good authority."
These claims contradict the project's stated financial backing, raising serious questions about its feasibility. Musk's role in DOGE adds a further layer of complexity to the situation.
Why It Matters
The Stargate project, if fully realised, represents a large investment in the future of AI within the US, with potential for job creation and economic advancements.
However, Musk's public dispute raises questions about the project's credibility. His position within the Trump administration, as part of DOGE, and his willingness to challenge Stargate so publicly could indicate friction between Musk and the administration itself.
The controversy could also shape how the public perceives the project.
📰 Article by the ABC on Stargate
DeepSeek: The Awakened Dragon & R1
DeepSeek, a Chinese AI startup backed by High-Flyer, takes a distinctive approach by prioritising fundamental AI research over immediate commercial applications.
This commitment has led to the development of innovative techniques, such as multi-head latent attention (MLA) and sparse mixture-of-experts (DeepSeekMoE), which enable more efficient model inference.
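For readers wondering why a sparse mixture-of-experts is cheaper to run, here is a minimal sketch of generic top-k expert routing. It is not DeepSeek's implementation: the experts are toy functions, the router scores are made-up numbers passed in by hand (a real router computes them from the input), and the names `top_k_route` and `moe_forward` are hypothetical. The point is only that each input activates a few experts rather than all of them.

```python
import math

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalise their weights."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    top = ranked[:k]
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(gate_logits[i] for i in top)
    exps = {i: math.exp(gate_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    weights = top_k_route(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Hypothetical experts: simple scalar functions, purely for illustration.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
# Made-up router scores for one input.
gate_logits = [0.1, 2.0, 2.0, -1.0]

output = moe_forward(3.0, experts, gate_logits, k=2)
print(output)
```

With `k=2`, only two of the four toy experts are evaluated for this input, which is where the inference saving comes from.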
Their R1 model, a large-scale open-weights model with 671 billion parameters, has already outperformed OpenAI's o1 on some benchmarks.
This is an impressive feat considering it comes at a much lower cost: just $2.9 per million tokens, compared with the $60 per million tokens that OpenAI charges.
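To put that gap in perspective, here is a quick back-of-the-envelope calculation. It uses only the two prices quoted above, taken at face value; the 5-million-token workload and the helper name `cost_usd` are made up for illustration.

```python
# Prices as quoted in this newsletter, in US dollars per million tokens.
R1_PRICE = 2.9
O1_PRICE = 60.0

def cost_usd(tokens, price_per_million):
    """Cost of processing `tokens` tokens at a given per-million price."""
    return tokens / 1_000_000 * price_per_million

# How many times more expensive the quoted o1 price is.
ratio = O1_PRICE / R1_PRICE

# Hypothetical workload of 5 million tokens.
workload = 5_000_000
print(f"Price ratio: about {ratio:.1f}x")
print(f"R1: ${cost_usd(workload, R1_PRICE):.2f}")
print(f"o1: ${cost_usd(workload, O1_PRICE):.2f}")
```

On these figures, the quoted o1 price works out to roughly 20 times the quoted R1 price.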
Beyond its benchmark results, DeepSeek R1 has also proved useful for practical tasks, such as generating a functional app to extract URLs from PDFs using just HTML and CSS.
Beyond code generation, R1 also exposes "reasoning tokens" that reveal the model's step-by-step thinking. These tokens help users understand and explore the model's reasoning process.
Why It Matters
DeepSeek's approach disrupts the conventional wisdom that Chinese companies are primarily application-focused.
Its emphasis on research and its open-source philosophy have the potential to reshape China's AI landscape and challenge the dominant players in the industry.
The price point is a significant factor and could drive down costs across the industry.
It is worth mentioning that the model is aligned with the Chinese government's policies, which includes refusing to talk about Tiananmen Square.
The development of the R1 model and its open-source release has created an exciting new avenue for practical AI application development. Its low cost combined with its reasoning capabilities makes it a strong competitor in the AI space.
📝 Read an interview with DeepSeek's CEO
📰 Article by CNBC about the cost of AI in China
AI Coworkers: Transforming Work in Early 2025
Interesting watch!
Anthropic is set to launch several key updates. Web access for Claude is coming soon, along with improved memory for better long-term assistance.
While voice mode and image generation are lower priorities, the focus is on releasing enhanced models within 3-6 months, using reinforcement learning for better reasoning.
These models should combine reasoning more fluidly with other capabilities. Fuelled by a 10x revenue jump to the billion-dollar range, Anthropic is rapidly scaling up its computing power, with hundreds of thousands of Trainium 2 chips by 2026 and potentially a million or more chips in the next few years.
The most impactful development is the anticipated arrival of powerful "virtual collaborators," expected in the first half of this year.
These AI agents are meant to perform any computer-based task a human can, managing long-term projects autonomously, from coding and testing to communication.
The aim is to transform work by giving people a tireless co-worker. In the short term, Anthropic plans to release models that act more as assistants; in the long term, it expects models that are better than humans at almost everything.
Why It Matters
The emergence of virtual collaborators signals a major shift in human-AI interaction.
This tech has the potential to dramatically enhance productivity by automating complex tasks, freeing humans to focus on high-impact work.
Anthropic is positioning itself as a leader in AI advancement while maintaining an ethical approach. Their innovations could redefine work and accelerate the adoption of AI agents.
The Future of 3D: Tencent's Hunyuan3D 2.0
Tencent's Hunyuan3D 2.0 is a new open-source system for creating high-quality 3D assets. It uses a two-stage approach: Hunyuan3D-DiT generates detailed 3D shapes, followed by Hunyuan3D-Paint for realistic textures.
This system captures intricate details often missed by other models and produces high-resolution textures.
The accompanying Hunyuan3D-Studio includes tools like sketch-to-3D and character animation.
Testing shows that Hunyuan3D 2.0 outperforms existing open and closed-source systems in geometry detail, texture quality and input image alignment.
Why It Matters
Hunyuan3D 2.0's open-source release and advanced two-stage creation process will drastically speed up 3D asset creation. By making this powerful tool accessible, Tencent is likely to spur innovation and accelerate the adoption of 3D tech.
It reduces development time, allowing creators to produce more imaginative and detailed content. Its superior performance means Hunyuan3D 2.0 is a leading tool in 3D creation.
Better Image Generation: DeepMind's New Breakthrough
Google DeepMind researchers, along with collaborators, have developed a new framework to improve the output of diffusion models, which are used to generate images, audio, and videos.
These models work by gradually "denoising" random data. Traditionally, the output quality is improved by adding more denoising steps, but this approach has its limits. This new framework optimises the starting point of this process by searching for the best initial noise to produce higher quality results.
Rather than just adding more denoising steps, the system keeps the number of steps constant and uses a search method to find better noise.
Testing shows this method consistently produces higher-quality outputs across multiple benchmarks, particularly excelling in text-prompt accuracy.
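The idea of searching over starting noise while holding the step count fixed can be sketched in a few lines. This is a toy illustration, not the researchers' framework: `toy_denoise` and `toy_verifier` are hypothetical stand-ins for a real diffusion sampler and a real quality scorer, and plain random search is just one simple search strategy chosen for the example.

```python
import random

def toy_denoise(noise, steps=10):
    """Stand-in sampler: a fixed number of steps that shrink the noise."""
    sample = list(noise)
    for _ in range(steps):
        sample = [0.9 * x for x in sample]
    return sample

def toy_verifier(sample):
    """Stand-in quality score (higher is better): closeness to a target."""
    target = [0.1] * len(sample)
    return -sum((s - t) ** 2 for s, t in zip(sample, target))

def search_initial_noise(num_candidates, dim=4, steps=10, seed=0):
    """Random search: try several starting noises, keep the best-scoring one."""
    rng = random.Random(seed)
    best_noise, best_score = None, float("-inf")
    for _ in range(num_candidates):
        noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # The number of denoising steps stays constant for every candidate.
        score = toy_verifier(toy_denoise(noise, steps))
        if score > best_score:
            best_noise, best_score = noise, score
    return best_noise, best_score

_, single_score = search_initial_noise(num_candidates=1)
_, searched_score = search_initial_noise(num_candidates=16)
print(single_score, searched_score)
```

Because both runs share a seed, the 16-candidate search includes the single-candidate noise, so its best score can never be worse. The extra compute goes into trying more starting points rather than adding denoising steps or retraining.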
Why It Matters
This new framework offers a more efficient method to improve the performance of diffusion models without the need for additional training.
By focusing on optimising the initial noise, this framework unlocks higher-quality outputs that would otherwise require time-consuming and costly retraining. This advancement opens the door to better and more efficient image generation.
The innovation of optimising the inference process rather than retraining highlights a key improvement to diffusion models.
📰 Article by Analytics India Mag