5,000x Faster Medical Scans, Anthropic's Agents, State of AI 2024 Highlights
Plus, Ideogram's Infinite Canvas, BitNet's Efficient AI Models & More
Welcome to this week in AI.
This week's news: UCLA's SLIViT speeds up 3D medical scan analysis by 5,000x; Anthropic unveils Claude 3.5 AI agents automating complex tasks; and Microsoft's autonomous agents aim to boost enterprise efficiency.
Plus, explore Ideogram's Infinite Canvas redefining AI creativity, key insights from the State of AI 2024 report, and Microsoft's BitNet 1-Bit LLM bringing efficient AI models to local devices.
Let's get caught up!
Don't Miss Out
Join me at the QLD AI Meetup on October 30th in Brisbane to discuss redefining UX for AI, plus get my upcoming Ultimate Guide For Designing AI Systems.
SLIViT: Breakthrough Delivers Expert-Level 3D Medical Scans 5,000x Faster
Researchers at UCLA have developed SLIViT (SLice Integration by Vision Transformer), a new AI model that can analyse 3D medical scans with expert-level accuracy at a rate 5,000 times faster than human specialists.
The model is adaptable across different types of medical imaging, including MRIs, CT scans, and ultrasounds, making it versatile in clinical settings.
Unlike many AI models that require vast amounts of data, SLIViT can be trained on just a few hundred samples, thanks to its use of transfer learning from more common 2D medical datasets.
This breakthrough is not only efficient in terms of speed but also cost-effective, reducing the time and resources needed for complex medical diagnoses.
SLIViT's ability to handle smaller datasets without sacrificing accuracy sets it apart from current 3D models, which often require far larger datasets and longer training times.
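The general idea behind slice integration can be sketched in a few lines of numpy. This is a hedged illustration, not UCLA's actual implementation: a stand-in for a pretrained 2D backbone embeds each slice, and a transformer-style self-attention layer pools the slice embeddings into one volume-level feature, so only the small pooling stage needs 3D training data.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_slice(slice_2d, W):
    # Stand-in for a pretrained 2D backbone: flatten the slice and
    # project it to a feature vector (real SLIViT uses a ConvNet).
    return np.tanh(slice_2d.ravel() @ W)

def attention_pool(tokens):
    # Single-head self-attention over slice embeddings, then average
    # the attended tokens into one volume-level feature vector.
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ tokens).mean(axis=0)

# A toy 3D scan: 12 slices of 16x16 pixels (shapes are illustrative).
volume = rng.standard_normal((12, 16, 16))
W = rng.standard_normal((16 * 16, 32)) * 0.05  # hypothetical projection

tokens = np.stack([embed_slice(s, W) for s in volume])  # (12, 32)
feature = attention_pool(tokens)                        # (32,)
score = 1.0 / (1.0 + np.exp(-feature.sum()))            # toy diagnosis head
print(tokens.shape, feature.shape)
```

Because the 2D backbone does the heavy lifting and is pretrained on plentiful 2D data, the 3D-specific part of the model stays small, which is why a few hundred labelled volumes can suffice.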
Why it Matters
SLIViT's ability to analyse 3D medical images rapidly and with high accuracy could significantly improve diagnostic workflows in healthcare, reducing the time specialists need to spend on complex image analysis.
By being able to train on smaller datasets, the model is accessible to healthcare providers with limited resources, potentially bringing advanced diagnostic tools to more clinics and hospitals.
Moreover, SLIViT's flexibility across various imaging types ensures that it can be applied in a wide range of clinical scenarios, from detecting heart disease to identifying malignancies in CT scans, which could lead to faster and more accurate patient diagnoses globally.
Anthropic Unveils Claude 3.5 AI Agents for Multi-Step Task Automation
Highly recommend watching this.
Anthropic has introduced upgraded AI models, Claude 3.5 Sonnet and Claude 3.5 Haiku, that function as advanced AI agents capable of automating complex computer tasks.
These models bring a new "computer use" capability in public beta, allowing them to interact with computers much like humans do: viewing screens, moving cursors, clicking buttons, and typing text.
Accessible via Anthropic's API, Amazon Bedrock, and Google Cloud Vertex AI, these AI agents can navigate user interfaces and execute multi-step tasks without specialised tools.
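The shape of a computer-use request can be sketched in plain Python. This is a hedged sketch: the tool type `computer_20241022` and the beta header value `computer-use-2024-10-22` follow Anthropic's public beta announcement, while the model string, display dimensions, and prompt here are illustrative.

```python
import json

# Sketch of an Anthropic Messages API request body that enables the
# "computer use" beta tool; values marked illustrative are assumptions.
request_body = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "tools": [
        {
            "type": "computer_20241022",     # beta computer-use tool type
            "name": "computer",
            "display_width_px": 1280,        # illustrative screen size
            "display_height_px": 800,
        }
    ],
    "messages": [
        {"role": "user", "content": "Open the spreadsheet and sum column B."}
    ],
}

# The beta is opted into via an HTTP header on each request.
headers = {"anthropic-beta": "computer-use-2024-10-22"}

print(json.dumps(request_body["tools"], indent=2))
```

In this loop, the model does not act on the machine directly: it returns tool-use blocks describing actions (take a screenshot, move the cursor, click, type), and your own harness executes them and feeds the results back as tool results.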
Performance-wise, Claude 3.5 Sonnet has made significant strides, boosting its SWE-bench Verified score from 33.4% to 49%, outperforming models like GPT-4o.
Companies such as Replit and GitLab are utilising these AI agents to automate software testing and development processes, enabling their teams to focus on higher-value work.
Why it Matters
The advent of AI agents like Claude 3.5 Sonnet and Haiku represents a substantial leap in automating routine and complex tasks, allowing humans to dedicate more time to strategic and creative endeavours.
By handling workflows that involve dozens or even hundreds of steps, these AI models minimise the time and effort spent on repetitive processes.
For instance, Replit employs Claude 3.5 Sonnet to evaluate apps in real-time during development, streamlining the testing phase and freeing developers to concentrate on innovation.
With impressive performance metrics, such as a 49% score on SWE-bench Verified, these AI agents are not only efficient but also highly effective in executing tasks that traditionally required human intervention.
This shift enables businesses to reallocate human resources to higher-value activities like strategic planning, problem-solving, and customer engagement. The integration of these AI agents into various industries could lead to increased productivity and innovation, maximising both human potential and technological capabilities.
How Autonomous Agents Will Enhance Enterprise Business Efficiency
Microsoft is advancing its AI capabilities by introducing new autonomous agents through Copilot Studio and Dynamics 365, aiming to bring AI-first business processes to every organisation.
Next month, the ability to create autonomous agents with Copilot Studio will enter public preview, enabling more customers to reimagine critical business operations with AI.
These agents range from simple prompt-and-response to fully autonomous functions, executing and orchestrating business processes on behalf of individuals, teams, or entire functions.
Ten new autonomous agents are being introduced in Dynamics 365 to enhance capacity for sales, service, finance, and supply chain teams.
Internally, Microsoft is using Copilot and agents to transform business processes across every function. One sales team achieved a 9.4% higher revenue per seller and closed 20% more deals by using Copilot.
The customer service team is resolving cases nearly 12% faster. The marketing team has seen a 21.5% increase in conversion rates on Azure.com with a custom agent designed to assist buyers.
In Human Resources, an employee self-service agent is helping answer questions with 42% greater accuracy.
Why it Matters
The integration of autonomous agents into enterprise software, like Microsoft's Dynamics 365, is poised to reshape how large organisations operate.
These agents streamline critical functions such as sales qualification, customer service, and supply chain management by automating routine tasks, enabling faster decision-making, and enhancing overall productivity.
In enterprises that rely on complex software systems to run their operations, the introduction of AI agents can unlock new efficiencies, reduce administrative burdens, and optimise processes across teams.
As businesses look to scale and improve operational workflows, these agents represent a shift towards more intelligent, adaptable systems that can work seamlessly within existing enterprise infrastructures, leading to faster results and cost savings.
Microsoft blog post
Redefining AI Creativity: Ideogram's Infinite Canvas and New Tools
Ideogram, a Canadian AI image startup founded by former Google Brain researchers, has introduced innovative tools that transform user interaction with AI-generated images.
The centrepiece is the Infinite Canvas, a limitless, interactive workspace where users can generate, manipulate, and combine images in a visual environment, moving beyond the traditional text prompt interface.
This Canvas allows users to spread out images, compare generations, resize, reorder, and merge multiple AI-generated visuals into new composites. It also supports user-uploaded images, enhancing creative flexibility.
Alongside Canvas, Ideogram unveiled two new features, Magic Fill and Extend:
Magic Fill lets users edit specific areas of an image with text prompts, replacing objects, adjusting backgrounds, or fixing details with high resolution.
Extend expands images beyond their original borders while preserving style, ideal for resizing or adapting to different formats. Both tools run on Ideogram's proprietary models, including Ideogram 2.0.
Why it Matters
These advancements signal a shift in AI interaction paradigms, moving beyond the limitations of the prompt box to more intuitive, hands-on tools.
For creators and designers, this means enhanced control and efficiency in crafting and refining images, streamlining the creative process.
By providing a dynamic visual workspace and precise editing features, Ideogram empowers users to realise their creative visions with greater precision and ease.
State of AI 2024: Key Innovations and Industry Shifts Shaping the Future
The recently published State of AI Report 2024 highlights the most significant advancements in artificial intelligence across several domains: research, industry, policy, safety, and predictions.
A central theme is the rapid progress of AI technology, including multimodal models that integrate language, vision, and robotics, as well as AI applications in the sciences, such as AlphaFold 3, which models interactions between proteins and small molecules.
In the industry, NVIDIA solidifies its dominance as the most powerful player in the AI hardware space, reaching a $3 trillion valuation. Meanwhile, Chinese labs have demonstrated their ability to thrive despite sanctions, leading advancements in areas like computational efficiency and coding applications.
Policy efforts to regulate AI are growing at the national and regional levels, particularly in the US and EU. However, global AI governance remains slow, with few meaningful regulations beyond high-level voluntary commitments.
The safety landscape has shifted from cautious to accelerated as companies rush to commercialise AI products. Researchers remain concerned about long-term risks, especially sophisticated attacks and AI jailbreaks.
Why it Matters
AI is driving fundamental changes across industries and scientific disciplines, with innovations like AlphaFold 3 pushing the boundaries of drug discovery and biological research.
Enterprises are leveraging AI to scale operations, improve efficiency, and optimise supply chains.
NVIDIA's dominance in AI hardware suggests that companies that invest in AI infrastructure will have a competitive edge, while China's rise in AI innovation shows that global competition in AI development is intensifying.
Policy changes will increasingly impact AI deployment, as governments look to regulate the technology while companies aim to push forward.
This evolving landscape means that businesses and researchers must navigate both the opportunities and the risks associated with this fast-moving technology.
As AI continues to evolve, keeping up with its advancements will be essential for staying competitive.
Read the report (it's comprehensive)
BitNet 1-Bit LLM: Bringing Efficient AI Models to Local Devices
Microsoft's BitNet 1-Bit LLM is a new technique that drastically reduces the size of large language models, making it possible to run them efficiently on local devices.
By shrinking the model's weights to just 1.58 bits each (each weight takes one of the ternary values -1, 0, or +1), this method keeps the model's performance intact while dramatically cutting down on the processing power and energy needed.
BitNet can handle models with up to 100 billion parameters on a single CPU, which opens the door for running powerful AI models without the need for large cloud servers.
BitNet is now open-sourced, and HuggingFace has integrated it into their platform, allowing developers to fine-tune existing models using this technique.
It delivers impressive speedups of up to 6x on various CPUs, and energy savings of up to 70%. For example, a BitNet model was shown running on an Apple M2 chip, achieving a steady performance of 5-7 tokens per second.
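The core trick behind the "1.58 bits" figure can be sketched with the absmean quantizer described in the BitNet b1.58 paper: scale the weight matrix by its mean absolute value, then round every weight to the nearest of {-1, 0, +1}; three states per weight is log2(3) ≈ 1.58 bits. A minimal numpy sketch, with illustrative shapes:

```python
import numpy as np

def absmean_quantize(W, eps=1e-8):
    # BitNet b1.58-style quantizer: scale by the mean absolute weight,
    # then round and clip every weight to the ternary set {-1, 0, +1}.
    gamma = np.abs(W).mean() + eps
    W_q = np.clip(np.round(W / gamma), -1, 1)
    return W_q, gamma

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)) * 0.1   # toy full-precision weight matrix
W_q, gamma = absmean_quantize(W)

x = rng.standard_normal(8)
y = (W_q @ x) * gamma                   # matmul with dequantisation scale
print(np.unique(W_q), np.log2(3))       # ternary values, ~1.58 bits each
```

Because every weight is -1, 0, or +1, the matrix multiply reduces to additions and subtractions with a single rescale at the end, which is where the CPU speedups and energy savings come from.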
Why it Matters
BitNet's ability to run large AI models on everyday devices could make AI more accessible, reducing the need for expensive cloud infrastructure.
This means businesses and developers can deploy powerful AI tools in a more cost-effective and energy-efficient way.
By making AI more efficient and easier to run on local hardware, BitNet could drive innovation in fields where computing resources are limited, making AI solutions more practical for a broader range of applications.
BitNet GitHub repo