🤖 AI Weekly Recap (Week 50)

This week’s top AI news, breakthroughs, and game-changing updates

We just had another crazy week in AI! OpenAI doubled down on frontier-level agents with GPT-5.2, while Google quietly turned the browser itself into an AI app factory with Disco.

And that’s not all: here are the most important AI moves you need to know this week.

Chinese AI startup Z.ai (Zhipu AI) has open-sourced Phone Agent, the core framework behind AutoGLM, an AI agent that can directly operate smartphones using voice commands. The move comes amid growing privacy concerns around ByteDance’s closed AI phone and signals a push toward transparent, community-owned agentic AI.

→ Open-sourced the full Phone Agent framework on GitHub + Hugging Face
→ Includes AutoGLM-Phone-9B, a foundation model trained to understand and operate phone UIs
→ Enables AI to perform real actions like ordering coffee, shopping, messaging, and booking rides (see the sketch after this list)
→ Currently supports Android and works across 50+ major Chinese apps (WeChat, Taobao, Didi, Meituan)
→ Designed as an open alternative to closed AI phone ecosystems
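
For a sense of what UI-level phone automation involves, here is a minimal sketch of the usual screenshot → model → action loop. The model loader and action schema below are illustrative assumptions, not Phone Agent’s actual API; only the adb commands are standard Android tooling.

```python
# Hypothetical screenshot -> model -> action loop for a phone agent.
# The model interface (propose_action and its action schema) is assumed
# for illustration; the adb commands are standard Android tooling.
import subprocess

def screenshot(path="screen.png"):
    # Capture the current Android screen over adb.
    png = subprocess.run(["adb", "exec-out", "screencap", "-p"],
                         capture_output=True, check=True).stdout
    with open(path, "wb") as f:
        f.write(png)
    return path

def execute(action):
    # Translate a structured action into an adb input command.
    if action["type"] == "tap":
        subprocess.run(["adb", "shell", "input", "tap",
                        str(action["x"]), str(action["y"])], check=True)
    elif action["type"] == "type":
        subprocess.run(["adb", "shell", "input", "text",
                        action["text"]], check=True)

# Assumed usage, with a hypothetical loader for the open model weights:
# model = load_phone_agent("AutoGLM-Phone-9B")
# while (action := model.propose_action(goal, screenshot()))["type"] != "done":
#     execute(action)
```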

🧰 Who is This Useful For:

  • Developers building agentic AI that can act across real consumer apps

  • Researchers exploring embodied AI and UI-level automation

  • Companies wanting AI assistants without handing control to closed platforms

  • Privacy-conscious users and teams preferring open, auditable AI agents

Mistral has launched Devstral 2, a new generation of large coding-focused AI models, alongside Mistral Vibe, a natural-language command-line interface designed for vibe-coding and code automation. The release signals Mistral’s push to close the gap with larger AI labs like Anthropic and OpenAI by focusing on production-grade, context-aware developer workflows.

→ Devstral 2 is a 123B-parameter coding model built for real-world, enterprise-scale use
→ Requires 4× H100 GPUs (or equivalent) for deployment
→ Devstral Small (24B) enables local deployment on consumer hardware
→ Ships with open licenses: Devstral 2 (modified MIT), Devstral Small (Apache 2.0)
→ Free via Mistral’s API initially; paid pricing starts at $0.40 / $2.00 per million tokens (input/output)
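
As a quick back-of-the-envelope check on that pricing (using the listed rates; actual cost depends on your token mix):

```python
# Rough cost estimate at the listed Devstral 2 API rates.
INPUT_RATE = 0.40 / 1_000_000   # dollars per input token
OUTPUT_RATE = 2.00 / 1_000_000  # dollars per output token

def cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a session that sends 2M tokens of code context and
# gets back 500k tokens of generated code.
print(f"${cost(2_000_000, 500_000):.2f}")  # prints $1.80
```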

🧰 Who is This Useful For:

  • Developers automating large, complex codebases

  • Teams adopting vibe-coding workflows with natural language

  • Enterprises needing context-aware AI for production environments

  • Engineers looking for open-weight alternatives to closed coding models

Odyssey has unveiled Odyssey-2 Pro, its most advanced world model yet, capable of generating real-time, interactive video that simulates the world forward frame-by-frame. Unlike traditional video models that render fixed clips, Odyssey-2 produces a live, causal video stream that responds instantly to text (and soon audio), turning video from passive media into something you can actively shape.

→ Generates interactive video in real time, streaming at 20 FPS (1 frame every 50ms)
→ Fully causal + autoregressive, predicting each frame only from past frames and user input (sketched after this list)
→ Produces multi-minute continuous video that adapts live to prompts
→ Trained with a novel multi-stage pipeline to behave like a real-time world simulator
→ Learns physical dynamics (motion, lighting, contact, fluids) directly from video data
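
Conceptually, a causal autoregressive video model runs a loop like the one below: each new frame is conditioned only on the frames generated so far plus the latest user input, and at 20 FPS the model has a 50ms budget per frame. The model interface here is an illustrative stand-in, not Odyssey’s actual API.

```python
# Illustrative real-time loop for a causal, autoregressive world model.
# model.next_frame() is an assumed interface; the structure
# (past frames + live input -> next frame) is the point.
import time

FRAME_BUDGET = 1 / 20  # 20 FPS -> 50 ms per frame

def stream(model, get_user_input, display, n_frames=2400):
    history = []  # frames generated so far: the model's only visual context
    for _ in range(n_frames):  # 2400 frames ~= 2 minutes at 20 FPS
        start = time.monotonic()
        # Causality: condition on past frames and current input only,
        # never on future frames.
        frame = model.next_frame(history, get_user_input())
        history.append(frame)
        display(frame)
        # Sleep off whatever remains of the 50 ms real-time budget.
        time.sleep(max(0.0, FRAME_BUDGET - (time.monotonic() - start)))
```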

🧰 Who is This Useful For:

  • Researchers building world models and real-time simulators

  • Game developers exploring interactive, AI-generated environments

  • Film, media, and storytelling teams experimenting with emergent video

  • Education and training teams creating interactive learning experiences

  • Founders building next-gen products in gaming, simulation, or AI media

OpenAI has released GPT-5.2, its newest frontier model built for professional workflows, long-running agents, and high-stakes reasoning. The model delivers major upgrades over GPT-5.1 in intelligence, reliability, and long-context accuracy, and was pushed out early in response to strong benchmark gains from Google’s Gemini 3.

→ Sets new SOTA on SWE-Bench Pro (55.6%) and SWE-Bench Verified (80.0%), with major gains in multi-step agentic coding
→ Achieves near-perfect long-context accuracy on MRCRv2 “needle-in-haystack” tasks up to 256k tokens
→ Reaches new highs on ARC-AGI-1 / ARC-AGI-2, FrontierMath, and GPQA Diamond
→ Cuts error rates in half on chart reasoning and GUI benchmarks (CharXiv, ScreenSpot-Pro)
→ Delivers 98.7% tool-calling accuracy on Tau2-Bench Telecom, critical for autonomous agents
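
If GPT-5.2 is exposed through OpenAI’s standard tool-calling interface (an assumption, as is the model id and the example tool below), wiring it into an agent looks roughly like this:

```python
# Sketch of tool calling via the OpenAI Python SDK. The model id
# "gpt-5.2" and the example tool are assumptions for illustration;
# the tools / tool_calls shape is the standard Chat Completions format.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "check_line_status",  # hypothetical telecom-style tool
        "description": "Check the status of a customer's phone line.",
        "parameters": {
            "type": "object",
            "properties": {"line_id": {"type": "string"}},
            "required": ["line_id"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-5.2",  # assumed model id
    messages=[{"role": "user", "content": "Is line 42-A up?"}],
    tools=tools,
)

for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```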

🧰 Who is This Useful For:

  • Developers building long-running autonomous agents

  • Teams shipping complex, multi-step coding workflows

  • Enterprises relying on high-stakes reasoning and reliability

  • Researchers evaluating frontier-level reasoning and long-context models

Try it Now → https://chatgpt.com/

Cursor has launched Visual Editor, a new AI-powered design tool that brings vibe-coding to designers. The feature lets users design and modify the look and feel of web apps using both traditional design controls and natural-language prompts, all directly inside Cursor’s coding environment. The move expands Cursor beyond developers, aiming to unify design and code in a single AI-driven workflow.

→ Combines manual design controls with natural-language edits via Cursor’s AI agent
→ Designers can tweak fonts, colors, spacing, layouts, and components with real CSS precision
→ Changes are applied directly to the codebase, not abstract design layers
→ Built on Cursor’s integrated browser, enabling real-time visual feedback
→ Can inspect and modify any live website, surfacing its design system instantly

🧰 Who is This Useful For:

  • Designers working closely with production code

  • Product teams tired of designer–developer handoff friction

  • Startups wanting a single platform for design + engineering

  • Companies building brand-specific UIs without generic AI outputs

  • Designers adopting vibe-coding without sacrificing control

Runway has released GWM-1, its first world model, alongside a major update to Gen-4.5 that adds native audio and long-form, multi-shot video generation. With this launch, Runway joins the race to build general-purpose world simulators capable of reasoning about physics, time, and interaction, moving beyond static video generation.

→ GWM-1 predicts the world frame-by-frame, learning physics, geometry, lighting, and temporal dynamics
→ Designed as a general world model, pitched as more flexible than Google’s Genie-3
→ Built to generate simulations for agent training across robotics, life sciences, and more
→ Introduces three specialized variants: GWM-Worlds, GWM-Robotics, GWM-Avatars
→ Signals Runway’s shift from media tools toward simulation and agent foundations

🧰 Who is This Useful For:

  • AI researchers building world models and simulators

  • Robotics teams training agents in synthetic environments

  • Studios and creators producing long-form, audio-native AI video

  • Enterprises exploring avatars, simulation, and interactive media

  • Developers building next-gen agent training pipelines

Try it Now → https://runwayml.com/

Google has introduced Disco, a new Gemini-powered browser experiment that turns your open tabs into custom web apps. Using Gemini 3, Disco analyzes what you’re browsing and proactively suggests interactive mini-apps, called GenTabs, that you can generate and refine using natural language.

→ Creates custom web apps directly from your open browser tabs
→ Powered by Gemini 3, using live browsing context + Gemini chat history
→ GenTabs are suggested automatically based on what you’re researching or doing
→ Apps can be refined continuously using natural-language prompts
→ Generated experiences link back to original sources for transparency

🧰 Who is This Useful For:

  • Students visualizing complex topics while studying

  • Researchers working across many tabs at once

  • Professionals planning trips, meals, or projects from scattered sources

  • Power users who want AI-native workflows inside the browser

Join Waitlist → https://labs.google/disco

Thanks for making it to the end! I put my heart into every email I send, and I hope you’re enjoying it. Let me know your thoughts so I can make the next one even better! See you tomorrow.

- Dr. Alvaro Cintas