Sam Altman’s Easter Eggs (New Models!), Shortcomings of LLM Benchmarks, and AI Medical Documentation

How we should've seen o1 coming, where current benchmarks fail, and why doctors are seriously checking out the latest AI models.

Jose Nicholas Francisco & Marcel Santilli
September 17, 2024

Welcome (back) to AI Minds, a newsletter about the brainy and sometimes zany world of AI, brought to you by the Deepgram editorial team.

In this edition:

🧠 Everything you need to know about OpenAI’s o1 model in 5 minutes
📉 Where LLM benchmarks (surprisingly?) fail
👁️ The unique hallucinations of Large Vision Language Models (LVLMs)
❗ Last chance to sign up for the Virtual Workshop on Voice AI Agents
🏥 Why AI is the future of medical documentation
💬 Must know: Building and applying conversational AI
🐣 Sam Altman’s social media Easter Eggs about new models
🐦 Twitter: Stanford Professor skeptical about Google’s brain wiring diagrams
📲 Three trending, new AI Apps for you!
📝 Free Transcription Tool from Deepgram!
🎤 New AI Minds Podcast Episode with Co-founders of NovaSquare Ltd!
💾 Making the ultimate AI cheating device for school
🚗 Washington Post on self-driving cars
🥊 Battle of the new models: Llama 3.1 versus Mistral NeMo versus o1
📖 Deep Dive: Attention Mechanisms and the origin of all these new LLMs

Thanks for letting us crash your inbox; let’s party. 🎉

Deepgram just released a brand new medical transcription model! Check it out here. 🥳

🎥 Everything You Need to Know about OpenAI’s o1 Model in 5 Minutes

OpenAI shocked the world by releasing its new o1 model much earlier than anticipated. This video quickly and cleverly reviews the new state-of-the-art model in a whimsical yet straightforward way.

🧑‍🔬 Where LLM Benchmarks Fail and How Large Vision Language Models Hallucinate

Inadequacies of Large Language Model Benchmarks in the Era of Generative Artificial Intelligence - Noticing preliminary inadequacies in various benchmarks, this paper embarks on a study to critically assess 23 state-of-the-art LLM benchmarks, using a novel evaluation framework through the lenses of people, process, and technology, under the pillars of functionality and security.

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback - Detecting and mitigating hallucinations in Large Vision Language Models (LVLMs) is expensive. This paper proposes handling both steps with fine-grained AI feedback.

This opportunity expires in three days! Sign up now 🚀

Master building voice AI agents in this practical, hands-on workshop hosted by Deepgram and Groq. Limited time: get 20% off with code AUGAI20 until 08/31. Sign up

When: Friday, September 20th | 9AM - 12PM PT

Where: Zoom

🏇 The Future of Medical Documentation and Conversational AI

Why AI is the Future of Medical Documentation - Medical transcription faces unique challenges due to specialized terminology, acronyms, and high-stakes consequences of errors. Inaccurate transcription can lead to patient safety risks, legal issues, and financial losses for healthcare providers. Here’s how AI provides the solution for these problems.

Must Know: Building and Applying Conversational AI - Building effective Conversational AI systems involves integrating key components such as LLMs, ASR, and NLG. Challenges include ensuring quality and accuracy, managing complexity for developers, and addressing privacy and security concerns. Overcoming these challenges is essential for creating robust and reliable systems. Here’s how the experts do it.

Sam Altman’s tweet Easter eggs are pretty cool I have to admit.
OpenAI’s Orion is coming…
theinformation.com/articles/opena…
— Amir Efrati (@amir)
3:56 PM • Sep 14, 2024

Automating AI research is exciting! But can LLMs actually produce novel, expert-level research ideas?
After a year-long study, we obtained the first statistically significant conclusion: LLM-generated ideas are more novel than ideas written by expert human researchers. x.com/i/web/status/1…
— CLS (@ChengleiSi)
3:30 PM • Sep 9, 2024

But, unfortunately, a complete “wiring diagram” of a brain still gives almost no idea of how it works for learning, memory, reacting, …
“We don’t even understand the [302 neuron] brain of a worm” scoffs Christof Koch, Chief Scientist of @AllenInstitute
— Christopher Manning (@chrmanning)
3:14 AM • Jul 17, 2024

Skipit is an AI-powered YouTube video summarizer that gives us instant summaries, saving us time by answering our questions in seconds instead of minutes. Key features include unlimited messages, interactive Q&A, and support for videos up to 12 hours long.

Advolve AI is a cutting-edge platform designed to revolutionize performance marketing through automation, data-driven insights, and advanced AI capabilities. With a focus on maximizing sales returns, Advolve AI simplifies the entire digital marketing process, from creative generation to performance attribution.

Pienso is a comprehensive solution for leveraging the capabilities of LLMs to meet the specific needs of your business. This app democratizes the power of AI, enabling customer-facing teams and their supporting data scientists and business analysts to generate real-time insights from their data.

📝 Free Transcription Forever! New Speech-to-Text AI Tool

Looking for a simple way to convert speech to text? Deepgram's free transcription tool is your ultimate solution. Whether it's conversations, audio files, or YouTube videos, our advanced AI transcription tool supports over 36 languages and dialects, making it the best free AI transcription tool available online. Discover how easy and efficient transcription can be with our tool.

🎤 The AI Minds Podcast!

Mathieu and Nicolas, Co-founders of NovaSquare Ltd, share their journey of building an AI platform that revolutionizes community engagement for streamers and content creators. They discuss Alicia’s evolution from a simple chat tool to a sophisticated system capable of enhancing viewer interaction and community cohesion.

🤖 Bonus Bits and Bytes!

If you’ve scrolled down this far, we’ve got some extra fun bonus content for you!

I Made the Ultimate Cheating Device - This video shows viewers how to essentially jailbreak a TI-84 calculator into a computer that can access GPT models, the internet, and more.
What the Washington Post thinks of Self-Driving Cars - Title says it all. Do Waymos impress or suck in the eyes of journalists?
Improvement or Stagnant? Llama 3.1 and Mistral NeMo - With all the talk about o1 coming out, it's easy to forget just how amazed the world was by Meta’s and Mistral’s latest models. Check out this article to see how all these models compare.
Attention was all we needed - With all of these new models coming out, it can feel rather breathtaking to see how far we’ve come from the original transformer model. This in-depth article reminds us from whence we came: Attention Mechanisms.