Andrew Ng’s new program launch, Why we’re poorly equipped to recognize voice clones, and Google DeepMind on 60 Minutes

Andrew Ng announces the launch of a new 5-course program. Researchers study why we are poorly equipped to recognize when an AI is speaking versus a human. Google DeepMind gets a spotlight on 60 Minutes. And much, much more is revealed in this edition of AI Minds!

Welcome (back) to AI Minds, a newsletter about the brainy and sometimes zany world of AI, brought to you by the Deepgram editorial team.

In this edition:

  • 🎥 Which LLM makes the best doctor, according to a doctor

  • 🔊 Why we’re poorly equipped to recognize voice clones

  • 🏥 Healthcare AI Research: How far we are from achieving Baymax

  • ⚡ Webinar - “Voice AI in 2025: From Robotic IVRs to Human-like Voice AI Agents”

  • 🔨 Webinar: “Build Enterprise-Ready Voice Experiences with Aura-2”

  • 🐦 Social Media Buzz: Ng’s new program launch, Elon’s Colossus 2, and more!

  • 📲 Three new, trending AI apps for you!

  • 🧠 Google DeepMind on 60 Minutes

  • 💥 Exploring the TextAttack Framework: Components, Features & Applications

  • 🔊 2025 State of Voice Report (Featured last week)

  • 📚 Deep Dive: AI Lifecycle Management

Thanks for letting us crash your inbox; let’s party. 🎉

Looking for a cutting-edge AI medical transcription model? Click here. 🥳

🎥  Which LLM Makes the Best Doctor?

Dr. Mikhail "Mike" Varshavski, D.O., is an actively practicing, board-certified family medicine doctor living in NYC. In this video, he puts LLMs to the test: Doctor Mike asks ChatGPT, Llama, Gemini, and Grok a series of medical questions to “see which has the best chance of replacing [him].” Check it out!

🔍  Why we’re poorly equipped to recognize voice clones and how far we are from Baymax

People are poorly equipped to detect AI-powered voice clones - Through a series of perceptual studies, the authors of this paper assess how realistic AI-generated voices sound in terms of identity matching and naturalness. They find that human participants cannot reliably tell AI-generated voices apart from recordings of real human speakers.

A Survey of LLM-based Agents in Medicine: How far are we from Baymax? - This survey provides a comprehensive review of LLM-based agents in medicine, examining their architectures, applications, and challenges. It analyzes the key components of medical agent systems, including system profiles, clinical planning mechanisms, medical reasoning frameworks, and external capacity enhancement.

⚡ Webinar - “Voice AI in 2025: From Robotic IVRs to Human-like Voice AI Agents”

Check out the webinar here!

About this talk:

AI-powered voice agents are poised to close the satisfaction gap that has long plagued traditional voice technologies like IVR systems.

Despite this gap, forward-thinking enterprises are increasing their investments in voice technology, recognizing that next-generation voice AI agents represent a fundamental shift in customer experience and operational efficiency.

Join Opus Research and Deepgram for this live, interactive webinar as we unveil findings from the 2025 State of Voice AI Report, exploring the most compelling business reasons to implement voice AI agents and what key improvements are required to unlock even greater adoption.

🔨 Webinar: “Build Enterprise-Ready Voice Experiences with Aura-2”

See how developers are building real-time, high-performance voice applications with Aura-2, Deepgram’s newest text-to-speech model, built on the same enterprise-grade runtime that powers our STT and speech-to-speech capabilities.

TUNE IN TO LEARN

  • 🔊 Why enterprise-ready TTS needs more than just a natural voice – Hear how Aura-2 handles specialized language, tone, and pacing with clarity and consistency.

  • 📈 How the Deepgram Enterprise Runtime powers scalable voice AI – Discover automated model adaptation, built-in hot-swapping, and flexible hosting.

  • 🔎 See Aura-2 in action (live demo) and get guidance for integrating it into your apps.

Sign up here!
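
If you want to try Aura-2 before the session, a single REST call is enough to hear it. Below is a minimal sketch against Deepgram’s text-to-speech endpoint; treat the `aura-2-thalia-en` voice name and the default MP3 output as assumptions to verify against the current API docs.

```python
# Minimal sketch: synthesize a short phrase with Deepgram TTS.
# Assumes the /v1/speak REST endpoint and the "aura-2-thalia-en" voice;
# check Deepgram's docs for the current list of Aura-2 voices.
import os
import requests

api_key = os.environ["DEEPGRAM_API_KEY"]

resp = requests.post(
    "https://api.deepgram.com/v1/speak",
    params={"model": "aura-2-thalia-en"},
    headers={
        "Authorization": f"Token {api_key}",
        "Content-Type": "application/json",
    },
    json={"text": "Hello from Aura-2, Deepgram's newest text-to-speech model."},
    timeout=30,
)
resp.raise_for_status()

# The endpoint returns raw audio bytes (MP3 by default).
with open("aura2_sample.mp3", "wb") as f:
    f.write(resp.content)
```

The same request shape works from any HTTP client or from the Deepgram SDKs; the webinar digs into the runtime features behind it, such as automated model adaptation, hot-swapping, and flexible hosting.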

📲 Three New, Trending AI Apps for You!

Media Semantics Character API - The Character API is "animation in the cloud". It takes a TTS voice as input and produces a live, lip-synced, talking character from it, complete with gestures and emotional response. Delivering multiple character styles, it can be used for everything from videos to fully interactive applications. Together with LLMs, the technology enables you to create embodied, "social" agents like a greeter, teacher, or virtual concierge.
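
To make the “social agent” idea concrete, the chain is essentially: generate a reply with an LLM, synthesize it with a TTS voice, then hand the audio to an animation step that returns a lip-synced character. The sketch below only illustrates that data flow; the function names and the render step are hypothetical placeholders, not the actual Character API surface.

```python
# Hypothetical pipeline sketch: LLM reply -> TTS audio -> animated character.
# These stubs illustrate the data flow only; they are not real API calls.

def generate_reply(user_message: str) -> str:
    # Placeholder for an LLM call that produces the agent's next line.
    return f"Happy to help with '{user_message}'. Here's what I can tell you..."

def synthesize_speech(text: str) -> bytes:
    # Placeholder for a TTS call returning audio bytes for the reply.
    return text.encode("utf-8")  # stand-in for real audio

def render_character(audio: bytes, style: str = "greeter") -> str:
    # Placeholder for the animation step: audio in, lip-synced character out.
    return f"<animated {style}: {len(audio)} bytes of speech>"

if __name__ == "__main__":
    reply = generate_reply("store hours")
    audio = synthesize_speech(reply)
    print(render_character(audio))
```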

ChatGPT for YouTube is a free Chrome extension that uses ChatGPT to generate text summaries of YouTube videos. This allows users to quickly understand the key points of a video without watching it in full.

Draw3D AI is a revolutionary AI tool that converts hand-drawn sketches into photorealistic images. Upload a sketch and Draw3D AI will automatically transform it into a realistic image using AI technology. It works with any detailed sketch - landscapes, animals, objects, etc. Bring your imagination to life!

🤖 Bonus Bits and Bytes!