AI's Big Week: Research, Voice Assistants, and Image Generation Get Spicier
What you actually need to know about AI, weekly.
Welcome to this week in AI.
This week, Sakana AI unleashed an AI scientist, Google's Gemini challenges ChatGPT with advanced voice capabilities, and xAI's Grok-2 pushes the boundaries of image generation.
We'll also explore the transformative impact of AI on businesses and the substantial ROI early adopters are already experiencing.
Let's get into it!
If you're new here, welcome!
Subscribe to get your AI insights every Thursday (usually).
The AI Scientist: Sakana AI's AI-Powered Research
Sakana AI has unveiled "The AI Scientist," an AI system poised to democratise the field of scientific research.
This innovative system automates the entire research process, from brainstorming ideas to writing and even peer-reviewing papers.
In its initial demonstration, The AI Scientist focused on machine learning research, producing papers deemed "Weak Accept" quality for top conferences at a remarkably low cost of $15 per paper.
Why it Matters
This new system has the potential to greatly accelerate scientific discovery by enabling tireless AI research agents to tackle the world's most pressing problems.
The system's affordability could also democratise research, making it accessible to a wider range of individuals and institutions.
Rather than replacing human scientists, The AI Scientist can function as a powerful supplemental tool, collaborating with researchers to streamline tasks and enhance productivity.
This partnership could lead to new discoveries at an accelerated pace.
While the current version has limitations, future iterations incorporating multi-modal models promise even greater capabilities.
The AI Scientist is a leap towards a future where AI-driven innovation plays a role in scientific discovery.
Article by Decrypt
Example paper by The AI Scientist
Google Challenges ChatGPT With Gemini Live
Google has launched Gemini Live, a mobile conversational AI with advanced voice capabilities, directly competing with OpenAI's ChatGPT voice mode.
This new feature allows for more natural, hands-free conversations with 10 different human-like voice options, and users can interrupt and ask follow-up questions mid-response.
Gemini Live integrates directly with Google apps for context-aware answers and is now the default assistant on Google's Pixel 9.
Why it Matters
Gemini Live aims to deliver a truly conversational experience with AI. We're moving from text-based prompts to more natural, collaborative conversations.
This has the potential to change how we use AI for complex tasks, saving time and increasing productivity.
Things didn't go completely to plan. During the live event showcasing the new feature, Gemini failed twice before succeeding.
While OpenAI's Advanced Voice Mode is still in a limited alpha phase, Google's widespread rollout of Gemini Live positions them at the forefront of this exciting new interaction paradigm.
Blog by Google with features
First review by The Verge
AI-Powered Shift: From Generalisations to Specifics in Business
A recent Harvard Business Review article explores how advancements in AI are enabling businesses to move beyond broad generalisations and engage with the specific details and nuances of their operations.
AI's capacity to process massive datasets enables it to identify subtle patterns and trends, personalise customer experiences, and uncover hidden talents within organisations.
This shift towards specifics is already influencing various business areas, including strategy, talent management, leadership roles, and supply chain management.
Why it Matters
This shift, driven by AI's ability to "take in vast amounts of data and re-coordinate logistics in real-time", presents opportunities for businesses to gain a competitive edge.
By harnessing the power of AI to understand and respond to individual needs and circumstances, companies can improve customer satisfaction, optimise operations, and foster innovation.
However, this new focus on specifics also challenges traditional business models and hierarchical structures, requiring a greater emphasis on adaptability, agility, and decentralised decision-making.
As AI continues to evolve, businesses that embrace this shift towards specifics, recognising that "the whole business should be run in ways that go far beyond just which bots are managing its supply chain", will be better positioned to thrive in an increasingly complex and dynamic marketplace.
Read the article by Harvard Business Review
ChatGPT-4o Is Back on Top!
OpenAI's ChatGPT-4o has made a triumphant return to the top spot in Chatbot Arena, surpassing Google's Gemini with a score of 1314.
The model now excels at following instructions and tackling challenging prompts.
Why it Matters
While the battle for the top spot rages on, the gains in capability remain incremental.
This is set to change: OpenAI has hinted that a new model will be released soon, and CEO Sam Altman has been posting cryptically about Project Strawberry, which many believe will mark a phase change in reasoning capability.
X user tests out new model
Grok 2 Is Here & It's Spicy
Elon Musk's xAI has launched Grok-2, an advanced AI model with image generation capabilities that sets itself apart by having fewer restrictions than its competitors like DALL-E and Gemini.
This allows Grok-2 to generate images of politicians and copyrighted brands, a feature that has stirred both excitement and controversy.
Why it Matters
The launch of Grok-2 highlights xAI's push to become a dominant player. While the technical advancements of the model are notable, concerns remain about the potential for misuse, particularly in generating misleading or harmful content.
xAI is currently training its next frontier model on what it describes as the largest compute cluster on Earth, so more capable models are likely to follow.
Launch post by xAI
Research Shows Strong ROI for Early Adopters of AI
A new global research study by Google Cloud reveals the impressive return on investment (ROI) that generative AI is delivering across various industries.
The study found that a majority of executives (61%) are already utilising generative AI, with 86% of those early adopters reporting a revenue increase of over 6%.
Key figures from the report highlighting the benefits of generative AI:
Productivity: Nearly half of executives reported that employee productivity has at least doubled due to gen AI implementation.
Security: 56% of executives stated that gen AI has strengthened their organisation's security posture.
Business Growth: 77% of executives saw improved leads and customer acquisition thanks to gen AI solutions.
User Experience: 85% of executives reported increased user engagement and 80% noted improved user satisfaction.
The study also emphasises that strong C-suite support is crucial for successful gen AI adoption and scaling.
Additionally, it identifies a trend of reinvesting gains from gen AI back into technology, talent, and data quality to foster further innovation.
Why it Matters
This research underscores the transformative potential of generative AI across various business functions.
For businesses, it highlights the tangible benefits of early adoption, from increased revenue to enhanced security and improved user experience.
For individuals, it signals the growing importance of AI skills and the potential for AI to reshape the future of work.
By understanding and embracing generative AI, businesses and individuals can position themselves at the forefront of this technological wave, driving innovation and unlocking new opportunities for growth.
This report is based on a survey of 2,508 senior leaders of global enterprises ($10M+ revenue), conducted by Google Cloud and National Research Group from February 23-April 5, 2024.
Research paper from Google Cloud
Missed the last one?
That's a Wrap!
If you want to chat about what I wrote, you can reach me through LinkedIn.
If you liked it, give it a share!