Good Morning! Google just changed the game for AI speech by launching Gemini 3.1 Flash TTS, a model so natural it’s already topping the leaderboards while giving you "director"-level control over every word. Plus, I’ll show you how to recreate viral YouTube Shorts without filming, using AI.

Also in today’s AI newsletter:

  • Google Debuts Ultra-Expressive Gemini 3.1 TTS

  • Google Launches Gemini AI App on Mac

  • Adobe Launches Firefly AI Assistant to Automate Creative Workflows

  • How to Recreate Viral YouTube Shorts Without Filming

  • 4 new AI tools worth trying

AI MODELS

Google is rolling out a new text-to-speech (TTS) model based on Gemini 3.1 Flash. Designed to deliver the most natural and expressive voice output in Google's history, the model introduces "audio tags": simple text commands that let developers control style, tempo, tone, and accent.

  • Features an Elo rating of 1,211 on the Artificial Analysis rankings, beating out ElevenLabs v3 in overall quality while offering a superior quality-to-price ratio.

  • Supports over 70 languages and is capable of handling complex, multi-speaker dialogues within a single output.

  • Includes a "Free Tier" (where Google uses data for training) and a "Paid Tier" that offers data privacy for $1.00 per million text tokens and $20.00 per million audio tokens.

  • Enterprise-ready through Vertex AI and Google Vids, with a "Batch Mode" that slashes costs by 50% for high-volume users.
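Plugging the prices quoted above into a quick back-of-the-envelope estimator makes the quality-to-price comparison concrete. This is a minimal sketch using only the numbers stated in this issue; it assumes the 50% Batch Mode discount applies to both text and audio tokens, which isn't specified here:

```python
# Rough cost estimator for the Paid Tier prices quoted above.
# Assumption: the Batch Mode discount applies to both token types.

TEXT_PRICE_PER_M = 1.00    # USD per million text tokens
AUDIO_PRICE_PER_M = 20.00  # USD per million audio tokens

def tts_cost(text_tokens: int, audio_tokens: int, batch: bool = False) -> float:
    """Estimated USD cost for one generation job."""
    cost = (text_tokens / 1_000_000) * TEXT_PRICE_PER_M \
         + (audio_tokens / 1_000_000) * AUDIO_PRICE_PER_M
    if batch:
        cost *= 0.5  # Batch Mode slashes costs by 50%
    return round(cost, 4)

# Example: a 10k-token script producing ~500k audio tokens
print(tts_cost(10_000, 500_000))              # standard tier
print(tts_cost(10_000, 500_000, batch=True))  # with Batch Mode
```

Even before the batch discount, audio tokens dominate the bill, so long-form voice output is where the 50% cut matters most.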

By adding audio tags, Google is turning TTS from a black-box generation process into a controllable creative tool. Developers can now fine-tune AI performances with the nuance of a human voice actor, making this one of the most competitive options for high-quality, scalable voice applications, especially with a pricing model that undercuts major rivals like ElevenLabs.
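The exact tag vocabulary isn't documented in this issue, so here is only a hedged sketch of the idea: inline markers interleaved with the script. The `[bracket]` syntax and the `tag_line` helper below are hypothetical illustrations, not Google's official API; check the Gemini docs for the real tag names:

```python
# Hypothetical sketch of interleaving "audio tags" with a script.
# The [bracket] syntax is an assumption for illustration only.

def tag_line(text: str, *tags: str) -> str:
    """Prefix a script line with zero or more audio tags."""
    prefix = "".join(f"[{t}]" for t in tags)
    return f"{prefix} {text}".strip()

script = "\n".join([
    tag_line("Welcome back to the show.", "warm", "medium-pace"),
    tag_line("You won't believe what happened next...", "whisper"),
    tag_line("It was incredible!", "excited", "fast"),
])
print(script)
```

The point of this style of control is that delivery lives in the text itself, so the same pipeline that writes your script can also direct the performance.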

AI TOOLS

The new Gemini macOS app brings Google’s assistant directly to the desktop, allowing users to interact with the AI without ever switching tabs. It positions Gemini as a direct competitor to the Spotlight-style integrations from OpenAI, Anthropic, and Perplexity.

  • Features a new Option + Space shortcut that pulls up a floating chat bubble for instant access from any screen.

  • Includes a "share window" feature that allows Gemini to see and pull information from your active apps to help answer questions or provide context.

  • Deeply integrated with the Google ecosystem, allowing users to upload files, photos, and documents directly from Google Drive.

  • Supports native generation of images, videos (via Veo), and music (via Lyria) directly within the desktop interface.

The battle for the desktop is about more than convenience; it's about becoming the primary interface for work. While OpenAI and Anthropic are focusing on agents that can click buttons for you, Google is betting on deep integration with your existing files and a seamless, Spotlight-like experience. If Google can bridge the gap to autonomous task execution, Gemini could become the default brain for millions of Mac-using professionals.

AI TOOLS

Following its "Project Moonlight" preview, Adobe has officially launched the Firefly AI Assistant. This isn't just another chatbot; it’s an agentic layer that can jump between different Creative Cloud apps to execute multi-step tasks that used to take hours of manual clicking.

  • The assistant can orchestrate workflows across the entire Adobe ecosystem, including Photoshop, Premiere, Lightroom, Express, and Illustrator.

  • It introduces "Skills": pre-built, multi-step automations, like a social media asset skill that can instantly crop, expand, and optimize images for every platform simultaneously.

  • Users maintain granular control through context-aware sliders and buttons; for example, a single slider can adjust the density of a forest in a photo without you needing to manually mask a single leaf.

  • Adobe is integrating third-party video models Kling 3.0 and Kling 3.0 Omni directly into Firefly’s library, alongside new AI-powered audio tools for noise reduction and reverb adjustment.

Adobe is attacking the "blank page" and "learning curve" problems simultaneously. By unifying its massive catalog of professional tools under a single agentic interface, Adobe is removing the friction of mastering complex software. For the industry, this signals that being an "expert" in a specific tool like After Effects matters less than being an expert at directing the AI agent that controls it.

HOW TO AI

🗂️ How to Recreate Viral YouTube Shorts Without Filming

In this tutorial, you will learn how to reverse-engineer and generate viral short-form videos using AI tools like NotebookLM, OpenArt, and ElevenLabs, allowing you to produce high-quality, consistent content without ever picking up a camera.

🧰 Who is This For

  • Content creators looking to scale their channel without hiring actors

  • Researchers and marketers analyzing and replicating proven viral video formulas

  • Faceless channel owners who need consistent, cinematic AI video generation

  • Anyone who wants AI to handle scripting, storyboarding, and voiceovers

STEP 1: Analyze & Script

Find a YouTube channel that is crushing it with a specific style, and copy the links to 3 to 6 of their best-performing Shorts.

Head to Google NotebookLM, create a new notebook, and paste those links as your sources. In the chat box, give it a detailed prompt describing what is happening: “This is a handheld vlog style POV where a wife records her husband's daily interactions with a wolf. Create a similar 30-second short script with 10 scenes.”

Hit Enter, and NotebookLM will analyze the pacing, tone, and structure to build your script.

STEP 2: Generate Consistent Visuals with OpenArt

Now we need visuals. The biggest mistake beginners make is using "text-to-video," which often results in random, inconsistent clips. We will use "image-to-video" for maximum control.

Ask NotebookLM to generate detailed image prompts for every scene in your new script. Then go to OpenArt.ai, select "Images," and choose a realistic model (like Nano Banana Pro). Set your aspect ratio to vertical (9:16) for Shorts.

Generate your first image for scene one.

Once you have an image you like, use OpenArt's "Reference Image" feature and upload that first picture. Keep generating the rest of your scenes using this reference to ensure your characters, lighting, and environments look perfectly consistent.

STEP 3: Animate the Images into Video

Go back to NotebookLM and ask it to generate "video animation prompts" (like camera movements or subtle motions) for all your images.

In OpenArt, switch to the "Image to Video" section and select a top-tier video model. Upload your generated images one by one, paste the corresponding motion prompt, and set the duration to about 4 seconds.

Use the first-frame and last-frame control features to ensure smooth, natural transitions between your clips, avoiding the stiff, robotic look common in AI videos.

STEP 4: The Final Polish (Voice & Edit)

You have silent video clips. Now we need to add the emotion and assemble the final product.

Take your finished script and paste it into ElevenLabs to generate a realistic, natural-sounding voiceover (keep the speed at 1x to maintain authenticity). Finally, open a free editor like CapCut.

Import your AI video clips and your audio track, arranging everything so the narration syncs perfectly with the visuals. Add a few subtle sound effects, like footsteps or wind, to build immersion, then click export. Your video will be ready to post.
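If you'd rather script the final assembly than click through CapCut, ffmpeg's concat demuxer can stitch the clips and mux in the voiceover. The filenames below are placeholders, and this sketch only builds the concat list and the command line rather than invoking ffmpeg (note that `-c:v copy` requires all clips to share the same codec and resolution, which they will if they all come from the same OpenArt model and aspect ratio):

```python
# Sketch: assemble AI clips + voiceover via ffmpeg's concat demuxer.
# Filenames are placeholders; the command is built but not executed.

def build_concat_list(clips: list[str]) -> str:
    """Contents of the concat demuxer file: one 'file' line per clip."""
    return "\n".join(f"file '{c}'" for c in clips) + "\n"

def build_ffmpeg_cmd(list_file: str, voiceover: str, out: str) -> list[str]:
    return [
        "ffmpeg",
        "-f", "concat", "-safe", "0", "-i", list_file,  # concatenated clips
        "-i", voiceover,                                 # narration track
        "-map", "0:v", "-map", "1:a",                    # video from clips, audio from VO
        "-c:v", "copy", "-shortest", out,
    ]

clips = ["scene01.mp4", "scene02.mp4", "scene03.mp4"]
print(build_concat_list(clips))
print(" ".join(build_ffmpeg_cmd("clips.txt", "voiceover.mp3", "short.mp4")))
```

Write the concat list to `clips.txt`, run the printed command, and you get the same clip-plus-narration mux the CapCut step produces; sound effects are still easier to layer in an editor.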

OpenAI updates Agents SDK with native sandboxing and an in-distribution harness for deploying and testing agents on long-horizon tasks.

Google launches Skills, repeatable AI prompts that Chrome users can run with a keyboard shortcut; users can set up their own Skills or choose from 50+ presets.

Anthropic rolls out identity verification that may require Claude users to provide a government-issued photo ID and live selfie to access “certain capabilities”.

Apple plans to send ~200 people from its Siri team, a group internally regarded as lagging, to an AI coding bootcamp.

💳 Lovable Payments: Add payments to your app with just one chat

💎 Gemma 4: Google’s powerful small AI model

⚙️ HeyGen CLI: Create videos straight from your terminal with AI

💻 Holo 3: Open-source AI agent that can use computers like a human


THAT’S IT FOR TODAY

Thanks for making it to the end! I put my heart into every email I send, and I hope you are enjoying it. Let me know your thoughts so I can make the next one even better!

See you tomorrow :)

- Dr. Alvaro Cintas

Keep Reading