🧠 China open-sources Opus 4.5 level model

Good Morning! Alibaba just dropped Qwen 3.5 Small, a new family of lightweight LLMs built for on-device AI, ranging from just 0.8B to 9B parameters. Plus, I’ll show you how to run a private, free AI on your own computer in 5 minutes.

Plus, in today’s AI newsletter:

Alibaba Launches Qwen 3.5 Small Models
Claude Code Creator Warns of “Builder” Era
Claude Code Gets Voice Mode
How to Run a Private, Free AI On Your Own Computer
4 new AI tools worth trying

AI MODELS

📱 Alibaba Launches Qwen 3.5 Small Models

Alibaba’s Qwen team released Qwen 3.5 Small, designed around “More Intelligence, Less Compute.” Instead of scaling to massive parameter counts, the lineup focuses on efficient reasoning, native multimodality, and edge deployment.

Models range from 0.8B to 9B parameters, optimized for mobile, IoT, and local-first apps
4B model features native multimodal architecture (text + vision in one latent space)
9B model uses Scaled Reinforcement Learning to boost reasoning and reduce hallucinations
Available now on Hugging Face and ModelScope (Base + Instruct versions)

Qwen 3.5 signals a shift from “bigger is better” to smarter and more efficient. With strong reasoning and multimodality in compact sizes, these models make capable AI practical for on-device, privacy-first, and edge applications.

AI IMPACT

👨‍💻 Claude Code Creator Warns of “Builder” Era

Boris Cherny, chief architect behind Anthropic’s Claude Code, believes AI coding agents are rapidly replacing the need for hands-on software engineering. In a recent podcast, he predicted the job title “software engineer” may soon disappear altogether.

Cherny claims he hasn’t manually edited code since November, relying fully on AI
Predicts engineers will evolve into “builders” or product-focused operator
Says deep coding fundamentals may not matter within 1–2 years
Admits AI coding still requires human oversight for safety and correctness

If AI agents can autonomously handle production-level code, the structure of tech teams could fundamentally change. While new roles may emerge, the transition could disrupt millions of engineering jobs, and redefine what it means to “build” software.

AI TOOLS

🎙️ Claude Code Gets Voice Mode

Anthropic is rolling out voice mode to Claude Code, letting developers use push-to-talk to interact with their projects, with no extra cost and no impact on rate limits.

Hold space to talk, release to send (push-to-talk style)
Voice transcription streams directly at your cursor position
Mix typing and speaking in the same prompt seamlessly
Free to use, transcription tokens don’t count against rate limits (Pro, Max, Team, Enterprise)

This removes friction between thinking and coding. Instead of rewriting messy prompts or context-switching, developers can “think out loud” directly into their workflow, making AI-assisted coding faster and more natural.

HOW TO AI

🗂️ How to Run a Private, Free AI On Your Own Computer in 5 Minutes

In this tutorial, you’ll learn how to run large language models (LLMs) directly on your own computer using Docker Model Runner.

🧰 Who is This For

Developers who want to replace hosted AI APIs with local models.
Engineers building AI apps that need better privacy and cost control.
Backend developers integrating LLMs into production systems.
Anyone curious about running AI fully offline.

STEP 1: Install and Configure Docker Desktop

Head over to docker.com and download the free Docker Desktop application for your Mac or Windows computer.

Install the app and open it. Once you are in, click the Settings icon (the gear in the top right corner). Then on the left sidebar, click "AI".

Check the boxes for both "Enable Docker Model Runner" and "Enable host-side TCP support". Change the cores setting to "All".

STEP 2: Find and Download Your Free AI Model

On the left sidebar of the main Docker Desktop window, click the "Models" tab. At the top, click the "Docker Hub" tab to see a list of available open-source models you can download.

In the search bar, type a lightweight model name, like "small3" (using a smaller model ensures it runs smoothly on your computer without needing a massive, expensive graphics card).

Click the "Download" button next to the model. Wait a few moments for it to finish saving to your machine.

STEP 3: Run the Model and Start Chatting

Switch from the "Docker Hub" tab back to the "Local" tab (still under the Models section). You will now see your downloaded model listed here.

Click the "Run" button (or the play icon) next to your model.

An interactive chat screen will immediately open inside Docker. Type "Hello" and hit Enter to test it. You are now chatting with a private AI running entirely on your own hardware, with zero network latency.

STEP 4: Plug the Local AI into Your Workflows

Because you enabled TCP support in Step 1, this local AI acts exactly like a paid OpenAI account, but for free. It runs silently in the background on a local server at http://localhost:12434.

If you have existing code, a custom AI agent, or an automation framework (like Langchain) that normally connects to OpenAI, look for the "Base URL" setting in your script.

Change that Base URL to http://localhost:12434/v1 and keep the rest of your configuration exactly the same. Your automations will now pull from your free, open-source local model instead of charging your credit card!

Anthropic brings Claude's memory feature to free users, after launching it for paid users in 2025,

Sam Altman says OpenAI is amending its DOD contract to ensure “the AI system” is not “intentionally used for domestic surveillance of US persons and nationals”.

Anthropic's $60B+ in funding, half of which came just last month, from over 200 investors is now at risk due to the company's contract dispute with the Pentagon.

Nvidia partners with Cisco, Nokia, and others to build 6G networks based on open, software-defined AI radio access networking (AI-RAN) architecture.

💻 Perplexity Computer: An AI system that can use multiple models to handle long, complex tasks.

💼 Claude Cowork: Anthropic’s team AI platform with plug-ins

🤖 Hermes-Agent: An AI assistant that remembers things and works across different messaging apps.

🎥 Replit Animation: Turn text prompts into animated videos

Which image is real?

Option A |
Option B

THAT’S IT FOR TODAY

Thanks for making it to the end! I put my heart into every email I send, I hope you are enjoying it. Let me know your thoughts so I can make the next one even better!

See you tomorrow :)

- Dr. Alvaro Cintas