DeepSeek R1 Causes Big Tech to Panic, Karpathy’s Deep Learning Insights, and Vision-Language AI (VLMs)

How DeepSeek R1 shook the AI world in under a week, what Karpathy's latest deep learning insights mean, and why Vision-Language AI (VLMs) might be the next big thing.

Welcome (back) to AI Minds, a newsletter about the brainy and sometimes zany world of AI, brought to you by the Deepgram editorial team.

In this edition:

  • 🎥 Why DeepSeek R1 is causing Big Tech to panic

  • 💸 Research: The Financial LLMs Leaderboard “Open FinLLM”

  • 🌍 CultureVLM, a Multimodal Vision-Language AI finetuned on a cultured dataset

  • 🐶 Coding Tutorial: Implementing a Virtual Veterinarian using Deepgram API

  • 📳 The Best Voice AI Agents for Call Centers in 2025

  • 🐦 Social Media Buzz: The DeepSeek R1 Hype continues with Karpathy and Santiago

  • 📲 Three New, Trending AI Apps for You!

  • 🎙️ AI Minds Podcast with Abdulrahman Jamjoom, Co-Founder and CEO at Arini AI

  • 🎞️ Translating baby sounds using Google AI

  • 🧠 A Curated list of modern Generative AI Projects and Services

  • 🚑 Top 5 arXiv Papers on AI and Medicine

  • 📚 “Entropy in Machine Learning,” A glossary page

Thanks for letting us crash your inbox; let’s party. 🎉

Deepgram released a brand new medical transcription model! Check it out here. 🥳

🎥 Big Tech in panic mode... Did DeepSeek R1 just pop the AI bubble?

Video Description: “Chip stocks like Nvidia are in trouble after the DeepSeek R1 AI model has proven that it is possible to train and run state-of-the-art reasoning models with minimal hardware. Let's find out why China's latest AI model has big tech and Wall Street in panic mode.”

🔍 Research: Financial LLMs Leaderboard and Vision Language Models (VLMs)

Open FinLLM Leaderboard: Towards Financial AI Readiness - In collaboration with the Linux Foundation and Hugging Face, the authors of this paper create the Open FinLLM Leaderboard, an open platform for assessing and comparing LLMs' performance on a wide spectrum of financial tasks.

CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries - The authors of this paper introduce CultureVerse, a large-scale multimodal benchmark covering 19,682 cultural concepts, 188 countries/regions, 15 cultural concepts, and 3 question types, with the aim of characterizing and improving VLMs' multicultural understanding capabilities. They then propose CultureVLM, a model finetuned on this dataset.

⚡ Building an AI Veterinarian and Showcasing the best Voice AI Agents for Call Centers in 2025

Implementing a Virtual Veterinarian Using Deepgram API - In this tutorial, we reveal ways that an individual developer can use AI to build a rudimentary veterinarian aid. While we doubt that such an app will replace vets entirely, the point remains that AI is an incredible building block for any innovations you can creatively pursue. Use this tutorial to flesh out your AI coding skills!

The Best Voice AI Agents for Call Centers in 2025 - The title says it all! If you’re interested in how Voice AI Agents can revolutionize the communications industry, check out this article!

🐝 Social Media Buzz: DeepSeek R1 Hype

📲 Three New, Trending AI Apps for You!

Oscar Bedtime Stories is revolutionizing the way parents and children experience storytelling. This unique app transforms bedtime into an interactive and personalized adventure where children become the protagonists of their own tales.

Tandem GPT is your revolutionary AI language partner designed to make language learning more interactive, accessible, and enjoyable. Created with the goal of breaking down language barriers, Tandem GPT offers users the unique opportunity to engage in realistic conversations with an artificial intelligence that understands and responds in 10 different languages.

Pixelhunter is a game-changer in the realm of digital marketing and content creation—offering a sophisticated AI-powered solution for resizing images tailored to various social media platforms.

🎤 The AI Minds Podcast

Abdulrahman Jamjoom, Co-Founder and CEO at Arini AI, shares his journey from Saudi Arabia to Jordan and then to the U.S., recounting his experiences at Threads and his determination to create a startup before even graduating from Harvard, which eventually led him to co-found Arini AI.

Abdul dives into the innovative AI solution Arini AI offers, focusing on its ability to handle scheduling, rescheduling, confirming, and canceling dental appointments by analyzing phone calls and text messages, reducing the burden on dental office staff and enhancing patient satisfaction.

🤖 Bonus Bits and Bytes!

  • 🎞️ Translating baby sounds using Google AI - What if there were a way to convert a baby's cries into language that an AI model can understand? That is what Software Developer Senthil Komar is trying to do in this incredible video.

  • 🧠 A Curated list of modern Generative AI Projects and Services - If you haven’t seen this GitHub repository before, you’re in for an incredible treat. From ChatGPT extensions to innovative image generation models, you’re almost guaranteed to find the AI tool you’ve been looking for right here.

  • 🚑 Top 5 arXiv Papers on AI and Medicine - An in-depth look at the intersection between technology and healthcare.

  • 📚 “Entropy in Machine Learning” - This article dives deep into the essence of entropy within machine learning, unraveling its significance, from the foundational theories to its practical applications in improving predictive models.
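
For readers who want a quick feel for the entropy idea before clicking through, here is a minimal sketch of Shannon entropy over a list of class labels. It is our own illustration and is not taken from the glossary page; the function name `shannon_entropy` and the toy pet labels are assumptions made up for this example.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Return the Shannon entropy, in bits, of a sequence of class labels.

    Hypothetical helper for illustration only.
    """
    total = len(labels)
    if total == 0:
        return 0.0  # convention chosen here: an empty sample carries no uncertainty
    counts = Counter(labels)
    # H = sum over classes of -p * log2(p), where p is the class frequency
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# A 50/50 split is maximally uncertain for two classes.
print(shannon_entropy(["cat", "dog", "cat", "dog"]))  # 1.0
# A pure sample has no uncertainty at all.
print(shannon_entropy(["cat", "cat", "cat", "cat"]))  # 0.0
```

Lower entropy means a more predictable set of labels, which is why decision-tree style models tend to prefer splits that reduce it.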