Happy Sunday! We just had another crazy week in AI. DeepSeek just dropped the world’s most powerful open-source AI, a free rival to Claude Opus and GPT-5.5, while Nvidia launched Nemotron 3 Nano Omni, a ~30-billion-parameter reasoning model.

And that's not all. Here are the most important AI moves you need to know this week.

DeepSeek has officially released DeepSeek-V4, a 1.6-trillion-parameter Mixture-of-Experts (MoE) model under a commercially friendly MIT License. It approaches, and in some cases surpasses, the performance of the world’s most advanced closed-source systems like GPT-5.5 and Claude Opus 4.7, at a radically compressed price point.

  • DeepSeek-V4-Pro is priced at roughly 1/6th the cost of Claude Opus 4.7 and 1/7th the cost of GPT-5.5. The smaller V4-Flash variant is nearly 98% cheaper than the premium proprietary models.

  • Features a native 1-million-token context window, using a radical new "Hybrid Attention Architecture" that requires only 10% of the KV-cache memory footprint of its predecessor.

  • While GPT-5.5 and Opus 4.7 still hold slight leads on direct reasoning benchmarks, V4-Pro-Max gets unusually close, scoring 83.4% on the agentic BrowseComp benchmark to beat Opus 4.7 and nearly match GPT-5.5.

  • Introduces three reasoning modes ("Non-think", "Think High", and "Think Max") so users can efficiently match compute effort to the difficulty of the task; see the API sketch below.

Try it now → chat.deepseek.com/
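
For developers, here's a minimal sketch of what selecting those modes could look like, assuming DeepSeek keeps its current OpenAI-compatible API and exposes the modes through the model name. The mode-suffixed identifiers are pure guesses, not confirmed names:

```python
# Minimal sketch, assuming DeepSeek's existing OpenAI-compatible API;
# the mode-suffixed model names below are guesses, not confirmed.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible endpoint
    api_key=os.environ["DEEPSEEK_API_KEY"],
)

resp = client.chat.completions.create(
    model="deepseek-v4-think-high",  # hypothetical; "-non-think" / "-think-max" variants assumed
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
print(resp.choices[0].message.content)
```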

Amazon Web Services (AWS) is taking a massive swing at the AI assistant market with the launch of Amazon Quick. Instead of waiting for you in a browser tab, Quick is a native desktop app that runs continuously in the background, connecting all your fragmented enterprise tools into a single, proactive AI layer.

  • Connects directly to your local files and third-party apps like Slack, Gmail, Zoom, Salesforce, and Microsoft 365, completely breaking free from vendor-locked ecosystems.

  • Operates proactively rather than reactively: by building a "knowledge graph" of your habits, it automatically preps for meetings, surfaces relevant files, and tracks project updates before you even ask (toy sketch of the idea after this list).

  • Can instantly generate polished presentations, interactive data dashboards, and functional no-code internal apps directly from natural language prompts.

  • Completely frictionless onboarding: you don't need a complex AWS console setup or cloud infrastructure; anyone can download it and start using it for free with just an email address.
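
Amazon hasn't published how Quick's graph works, but the general idea is easy to picture. Here's a toy sketch, purely illustrative and not Amazon's code, of a knowledge graph an assistant could walk to prep for a meeting:

```python
# Toy illustration of the "knowledge graph" idea -- not Amazon's code.
# Nodes are meetings, people, and files; edges record how they relate,
# so an assistant can walk one hop out from a meeting to prep for it.
from collections import defaultdict

graph = defaultdict(list)  # node -> list of (relation, node)

def link(a, relation, b):
    graph[a].append((relation, b))

link("meeting:roadmap-sync", "attendee", "person:dana")
link("meeting:roadmap-sync", "references", "file:q3-roadmap.pdf")
link("person:dana", "last_edited", "file:q3-roadmap.pdf")

def prep(meeting):
    """Collect everything one hop away from a meeting node."""
    return graph[meeting]

print(prep("meeting:roadmap-sync"))
```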

Anthropic has launched new creative connectors for Claude, allowing the AI to tap directly into apps like Adobe Creative Cloud, Affinity, Blender, and Ableton. This marks a major push into the creative industry following the recent launch of Claude Design.

  • The Adobe connector can pull from Photoshop, Premiere, and Express to manipulate images, videos, and designs directly within the Claude interface.

  • The Blender integration gives the 3D app’s Python API a natural-language interface, allowing users to debug scenes, build tools, and batch-apply object changes via chat (see the bpy sketch below).

  • The Ableton connector allows Claude to source information and answer complex queries directly from the music software’s official documentation.

  • Anthropic is also throwing its financial weight behind the open-source community, becoming a Corporate Patron of the Blender Development Fund to help keep the software free.

Try it now → https://claude.ai/
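
To make the Blender piece concrete: under the hood, "batch-apply object changes" means generating ordinary Blender Python (bpy) scripts. Here's a small example of the kind of script a prompt like "add subdivision to every mesh" could produce; the specific modifier choice is just for illustration:

```python
# Example of the kind of bpy script a chat interface could generate.
# Batch-applies a change to every mesh object in the open scene.
import bpy

for obj in bpy.data.objects:
    if obj.type == 'MESH':
        # Add a Subdivision Surface modifier if the object lacks one
        if not any(m.type == 'SUBSURF' for m in obj.modifiers):
            mod = obj.modifiers.new(name="Subdivision", type='SUBSURF')
            mod.levels = 2  # viewport subdivision level
```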

Now that GPT-5.5 is live, OpenAI has published a wealth of official tips on how to properly steer the model. The core message? Treat it as a brand-new intelligence, scrap your old legacy prompts, and stop being so polite.

  • Skip the pleasantries and be direct: GPT-5.5’s default style is efficient and task-oriented. It doesn't need conversational padding; just tell it exactly what the expected outcome and success criteria are.

  • Lean on system prompts: Put most of your operating rules and tool-specific guidance in the system instructions rather than the user prompt.

  • Drop the hand-holding: Remove detailed step-by-step process guidance. Legacy prompts over-specified the process because older models needed help staying on track. With GPT-5.5, over-explaining adds noise and restricts its ability to independently find the best solution path.

  • Stop defining output schemas: Remove JSON schema definitions from your text prompt and rely strictly on the API's Structured Outputs feature for validation (see the sketch after this list).
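
Putting the last two tips together, here's a minimal sketch using the OpenAI Python SDK's Structured Outputs support: a terse, rule-bearing system prompt, and a Pydantic model instead of a schema pasted into the prompt text. The "gpt-5.5" model name is taken from this newsletter, so treat it as a placeholder:

```python
# Sketch of the "system prompt + Structured Outputs" pattern with the
# OpenAI Python SDK; "gpt-5.5" is a placeholder model name.
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class Ticket(BaseModel):
    title: str
    priority: str  # e.g. "low" | "medium" | "high"

completion = client.beta.chat.completions.parse(
    model="gpt-5.5",  # placeholder; use whatever model id your account exposes
    messages=[
        # Direct system prompt with the operating rules; no step-by-step hand-holding.
        {"role": "system", "content": "Triage bug reports. Output only the schema fields."},
        {"role": "user", "content": "App crashes on login since this morning."},
    ],
    response_format=Ticket,  # schema enforced by the API, not the prompt text
)
print(completion.choices[0].message.parsed)
```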

Mistral AI has launched Mistral Medium 3.5, a powerful new 128-billion-parameter dense model with a 256k context window. Alongside the model, Mistral introduced "Work Mode" for Le Chat, taking autonomous coding agents out of local environments and dropping them straight into the cloud.

  • The model integrates reasoning, instruction-following, coding, and vision (handling variable image sizes) into a single powerhouse that can be deployed on as few as four GPUs.

  • It scored a massive 77.6% on the SWE-Bench Verified benchmark, outperforming heavyweights like Devstral 2 and Qwen3.5 in coding and agentic tasks.

  • Through the new Mistral Vibe CLI and Le Chat "Work Mode," users can run complex, multi-step agentic workflows in isolated cloud sandboxes that execute in parallel and ping you when finished.

  • Developers can effectively teleport active coding sessions between their local machines and cloud environments without losing state or approvals.

  • The model's open weights are available under a modified MIT license, allowing for both self-hosted and cloud-based deployment.
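
If you'd rather hit the hosted API than self-host the weights, a call through Mistral's official Python SDK would look something like this; the model identifier is an assumption, so check Mistral's docs for the real one:

```python
# Minimal sketch using Mistral's official Python SDK (mistralai);
# the model id below is assumed, not confirmed.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

resp = client.chat.complete(
    model="mistral-medium-latest",  # assumed alias for Medium 3.5; verify in the docs
    messages=[{"role": "user", "content": "Write a Python function that reverses a linked list."}],
)
print(resp.choices[0].message.content)
```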

Nvidia has launched Nemotron 3 Nano Omni, a ~30-billion-parameter reasoning model designed to act as the ultra-fast "brains" for multimodal agentic applications.

  • The model uses a hybrid mixture-of-experts (MoE) architecture that integrates vision and audio encoders directly into the system, completely eliminating the need for separate perception modules.

  • This unified approach allows the model to deliver up to 9x faster throughput than other open omni models currently on the market.

  • It's optimized for real-time agentic tasks, capable of rapidly interpreting live HD screen recordings, documents, and voice activity with extremely low latency.

  • Due to its smaller size, it can be compressed to run locally on high-end consumer hardware while also scaling efficiently for enterprise cloud deployments.

  • The model is available right now on Hugging Face, OpenRouter, and build.nvidia.com as an Nvidia NIM microservice.
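
build.nvidia.com serves hosted models through an OpenAI-compatible endpoint, so trying it can be a few lines of Python. Note the model id below is my guess at how the listing would be named, so check the catalog for the real one:

```python
# Sketch of querying the hosted NIM endpoint, which speaks the
# OpenAI-compatible API; the model id is a guessed listing name.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # NVIDIA's hosted API catalog endpoint
    api_key=os.environ["NVIDIA_API_KEY"],
)

resp = client.chat.completions.create(
    model="nvidia/nemotron-3-nano-omni",  # hypothetical id; check build.nvidia.com
    messages=[{"role": "user", "content": "Plan the steps to extract tables from a scanned PDF."}],
)
print(resp.choices[0].message.content)
```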

Thanks for making it to the end! I put my heart into every email I send. I hope you are enjoying it. Let me know your thoughts so I can make the next one even better.

See you tomorrow :)

Dr. Alvaro Cintas
