Scaling Hits a Wall ⚡ Microsoft’s Infinite Memory 🧠 OpenAI’s $100B Plan 🌐
Plus: OpenAI's Operator set for a January debut, Stripe enables AI transactions, and breakthroughs in COVID nanobodies
👋 Welcome to this week in AI.
📰 Latest news
AI Progress Slows: Labs Shift Approach to Scaling
OpenAI co-founder Ilya Sutskever has said that the gains from scaling up the pre-training of large language models (LLMs) with more data and computational power have plateaued.
After underwhelming results from OpenAI's Orion model, researchers are shifting to techniques like "test-time compute", which enhances AI performance during inference by evaluating multiple possibilities before choosing the best outcome.
OpenAI's o1 model employs this method, achieving significant performance gains without additional data or extended training.
Other AI labs, including Anthropic, xAI, and Google DeepMind, are adopting similar strategies.
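For the technically curious, here is a minimal sketch of one simple form of test-time compute, best-of-N sampling: generate several candidate answers at inference time, score each one, and keep the best. This is an illustration only. How o1 works internally has not been published, and `best_of_n`, `toy_generate`, and `toy_score` are hypothetical names standing in for a real model and a real verifier.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    generate: Callable[[random.Random], str],
    score: Callable[[str], float],
    n: int,
    seed: int = 0,
) -> Tuple[str, float]:
    """Sample n candidate answers and return the highest-scoring one with its score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    candidates: List[str] = [generate(rng) for _ in range(n)]
    best = max(candidates, key=score)
    return best, score(best)


# Toy stand-ins (assumptions, not a real model): a noisy "model" that guesses
# the value of 17 * 24, and a verifier that prefers guesses closer to 408.
def toy_generate(rng: random.Random) -> str:
    return str(17 * 24 + rng.choice([-20, -3, 0, 0, 5, 12]))


def toy_score(answer: str) -> float:
    # Negative distance from the correct answer: higher is better, 0 is exact.
    return -abs(int(answer) - 408)


if __name__ == "__main__":
    for n in (1, 4, 16):
        answer, quality = best_of_n(toy_generate, toy_score, n=n, seed=3)
        print(f"n={n:>2}  answer={answer}  score={quality}")
```

With a fixed seed, a larger `n` can only match or improve the best score, because the larger sample contains the smaller one. The trade-off is that every extra candidate costs extra compute at inference time rather than at training time.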
Why It Matters
This shift marks a notable change of direction in AI research and may offer a more efficient path to smarter systems.
By optimising models during inference, developers can enhance AI capabilities without the immense costs of scaling data and computation.
As Ilya Sutskever stated, "The 2010s were the age of scaling; now we're back in the age of wonder and discovery once again."
This new approach could democratise AI development, reduce reliance on expensive hardware, and accelerate the deployment of advanced applications across various industries.
AI That Never Forgets: Microsoft’s Vision for Proactive Companions by 2025
Microsoft AI CEO Mustafa Suleyman unveiled prototypes with "near-infinite memory," a feature enabling AI to remember and build upon past interactions indefinitely.
Expected by 2025, this capability marks an "inflection point" in AI development, transitioning systems from frustrating, shallow interactions to deeply engaging, evolving dialogues.
Suleyman highlights this advancement as critical to making AI truly useful and worth investing in for users.
Additionally, he anticipates a shift from reactive chatbots to proactive AI companions, capable of understanding user needs and forming lasting relationships.
These companions are envisioned to assist across diverse domains like personalised learning, healthcare advice, and creative support.
Backed by scaling laws, Microsoft has achieved a 99% reduction in AI model costs over two years, further driving accessibility and innovation.
Why It Matters
This development could redefine human-computer interaction by creating AI systems that "just don't forget."
Persistent memory allows for ongoing conversations, leading to more meaningful, personalised, and efficient experiences.
Proactive AI companions offer new possibilities, from tailored medical advice at zero marginal cost to customised learning paths and creative collaborations.
Suleyman's vision underscores the transformative potential of AI, powered by advances in computation and reasoning capabilities.
With rapidly declining costs and increased accessibility, AI is poised to become an integral, intuitive companion in everyday life, empowering individuals to achieve their goals more effectively.
Meet Operator: The AI Agent That Could Handle Your Tasks—and Even Earn for You
OpenAI is set to launch "Operator", an AI agent capable of autonomously controlling computers to perform multi-step tasks with minimal human oversight, in January.
According to a Bloomberg report, Operator can manage activities like booking flights or writing code by operating a web browser on the user's behalf.
CEO Sam Altman said during a recent Reddit AMA that agentic capabilities will "feel like the next giant breakthrough" in AI advancement.
The tool will be available as both a research preview and a developer API, putting OpenAI in direct competition with Anthropic and Google, which are developing similar agents.
Why It Matters
Operator signifies a major step towards AI systems that can handle menial and repetitive tasks, freeing users from routine computer interactions.
By automating activities such as data entry, scheduling, or even gaming tasks like "farming" to accrue points, Operator could enhance productivity and open new opportunities for users to generate value.
This shift towards agentic AI highlights the industry's move from passive chatbots to active agents that navigate the digital world on our behalf, promising increased efficiency and a transformative impact on how we interact with technology.
OpenAI Calls for AI Zones and $100B Data Centre
OpenAI has released a blueprint to enhance American AI infrastructure and international cooperation.
The plan proposes creating "AI Economic Zones" to fast-track AI infrastructure projects. OpenAI suggests forming a "North American AI Alliance" with democratic allies.
The blueprint advocates modernising the power grid through a National Transmission Highway Act, focusing on transmission, fibre, and natural gas.
Notably, OpenAI has discussed building a $100 billion, 5-gigawatt data centre—five times larger than any existing facility.
Why It Matters
These proposals highlight AI's potential to drive U.S. economic growth and technological advancement.
Accelerating AI projects and modernising infrastructure can enhance national AI capabilities.
OpenAI states, “AI presents an unmissable opportunity to reindustrialise the US and...revitalise the American Dream.”
Forming a North American AI Alliance positions the country to better compete with China's advancing AI industry, promoting technology shaped by democratic values.
Taken together, the initiatives are meant to secure a leading global position in AI for the US.
Stripe Empowers AI Agents with Payment Processing Abilities
Stripe has introduced an agent toolkit that integrates its financial services into LLM agent workflows, enabling AI agents to perform tasks like processing payments and managing invoices.
The toolkit supports popular frameworks, allowing developers to build multi-agent workflows where AI agents can execute financial transactions securely.
Separately, AI-to-AI transactions are beginning to emerge in the financial technology sector.
Companies are exploring ways to facilitate autonomous transactions between AI agents, highlighting the growing role of AI in financial operations.
A recent example is Coinbase’s first AI-to-AI crypto transaction.
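To make the idea of payment tools for agents concrete, here is a rough sketch of the general pattern: financial actions are exposed as named tools, and a dispatcher runs whichever tool call the agent emits, with a spending limit as a guardrail. This does not use Stripe's toolkit or any real payment API. `PaymentTools`, `Invoice`, `run_tool_call`, and the spending limit are all hypothetical, and the "backend" is just an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Invoice:
    customer: str
    amount_cents: int
    paid: bool = False


@dataclass
class PaymentTools:
    """Hypothetical in-memory payment backend exposed to an agent as tools."""

    max_amount_cents: int
    invoices: List[Invoice] = field(default_factory=list)

    def create_invoice(self, customer: str, amount_cents: int) -> int:
        # Guardrail: the agent may not create invoices above its limit.
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        if amount_cents > self.max_amount_cents:
            raise PermissionError("amount exceeds the agent's spending limit")
        self.invoices.append(Invoice(customer, amount_cents))
        return len(self.invoices) - 1  # the invoice id is its list index

    def pay_invoice(self, invoice_id: int) -> bool:
        self.invoices[invoice_id].paid = True
        return True

    def as_tools(self) -> Dict[str, Callable[..., Any]]:
        # The tool names here are what an agent would reference in a tool call.
        return {
            "create_invoice": self.create_invoice,
            "pay_invoice": self.pay_invoice,
        }


def run_tool_call(tools: Dict[str, Callable[..., Any]], name: str, **kwargs: Any) -> Any:
    """Dispatch one tool call of the kind an LLM agent might emit."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    return tools[name](**kwargs)


if __name__ == "__main__":
    backend = PaymentTools(max_amount_cents=50_000)
    tools = backend.as_tools()
    invoice_id = run_tool_call(tools, "create_invoice", customer="acme", amount_cents=12_500)
    run_tool_call(tools, "pay_invoice", invoice_id=invoice_id)
    print(backend.invoices[invoice_id])
```

Keeping the limit check inside the tool, rather than trusting the agent's prompt, means that a confused or manipulated agent still cannot exceed it.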
Why It Matters
Integrating Stripe's financial services into AI agent workflows significantly expands the functionality of AI agents, allowing businesses to automate complex financial operations securely and efficiently.
The emergence of AI-to-AI transactions points to a future where AI agents can conduct financial transactions autonomously, streamlining operations and enhancing productivity.
🎓 Studies
AI Agents Create Nanobodies Against New COVID Variants
Stanford researchers have introduced the Virtual Lab, an AI research platform where specialised AI agents collaborate with human scientists to tackle complex scientific challenges.
In a notable achievement, the Virtual Lab successfully designed 92 new nanobodies targeting recent COVID-19 variants.
Impressively, over 90% of these AI-designed molecules were stable and functioned as intended in laboratory tests.
Two of these nanobodies stood out by exhibiting improved binding to the latest JN.1 and KP.3 variants of SARS-CoV-2 while maintaining strong binding to the original virus, marking them as promising candidates for further investigation.
Why It Matters
The Virtual Lab's success demonstrates the tangible impact of AI-human collaboration in real-world applications.
By accelerating the design and validation of effective nanobodies against evolving COVID-19 strains, this platform showcases how AI can significantly enhance and expedite scientific research.
The ability of AI agents to work alongside scientists—conducting complex analyses and generating viable therapeutic candidates—opens new avenues for addressing pressing health challenges.
As AI systems continue to advance, their contribution to breakthroughs in medicine and other fields holds immense potential for improving global health outcomes.
Readers Prefer AI-Generated Poetry Over Human Classics
A University of Pittsburgh study found that AI-generated poetry is often preferred over human-written works by renowned poets like Shakespeare and Dickinson.
In the study, which involved over 1,600 participants, readers correctly identified AI-generated poems only 46.6% of the time.
AI poems scored higher on 13 qualitative measures such as rhythm, beauty, and emotional impact.
Interestingly, poems least likely to be considered human were actually by famous poets, while those most "human-like" were AI-generated.
Why It Matters
This study showcases AI's advanced capabilities in creative fields, demonstrating that AI can produce poetry that deeply resonates with readers.
The preference for AI-generated poems suggests that AI has achieved a level of creativity comparable to humans, opening new possibilities for its application in literature and the arts.
🔥 Hot tip
Chain-of-Thought Prompting: A Quick Guide
Introduction
Chain-of-Thought (CoT) prompting improves a language model's ability to reason through complex business problems by encouraging step-by-step thinking. It's effective for tasks like strategic planning, decision-making, and problem-solving within a business context.
How to Use Chain-of-Thought Prompting
Step 1: Identify a Complex Business Problem
Choose a business issue that requires sequential reasoning. Examples:
Assessing a potential merger
Developing an international expansion strategy
Analysing supply chain disruptions
Step 2: Craft a Clear Prompt with Instructions
Write a concise problem statement and instruct the model to respond step by step. Use phrases like:
"Explain your reasoning step by step:"
"Let's analyse this scenario logically:"
Step 3: Combine Instructions and Problem
Merge your instructions with the problem to create the final prompt. Example:
"Explain your reasoning step by step: Should our company invest in new technology that increases efficiency but requires significant upfront costs?"
Step 4: Review and Refine
After the model responds, review the output for logical flow and completeness. Adjust your prompt if necessary for clarity.
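If you build prompts in code, Steps 2 and 3 above can be sketched as a small helper that joins the instruction to the problem statement. This is a minimal illustration: `build_cot_prompt` and `COT_INSTRUCTION` are hypothetical names, and sending the prompt to a model is left to whichever client you already use.

```python
COT_INSTRUCTION = "Explain your reasoning step by step:"


def build_cot_prompt(problem: str, instruction: str = COT_INSTRUCTION) -> str:
    """Combine a step-by-step instruction with a problem statement (Step 3)."""
    problem = problem.strip()
    if not problem:
        raise ValueError("problem must not be empty")
    return f"{instruction.strip()} {problem}"


if __name__ == "__main__":
    prompt = build_cot_prompt(
        "Should our company invest in new technology that increases "
        "efficiency but requires significant upfront costs?"
    )
    print(prompt)
```

Swapping in the other phrase from Step 2 is a one-argument change, for example `build_cot_prompt(problem, instruction="Let's analyse this scenario logically:")`.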
🎥 Video tutorial on chain-of-thought prompting (6:15)