The Batch @ DeepLearning.AI

Intelligence extracted from The Batch @ DeepLearning.AI newsletters.

Topics

AI/ML, Technology, Product, Engineering, Startups, Regulation, Business

Key Insights from The Batch @ DeepLearning.AI

**GPT-5.5** tops the Artificial Analysis Intelligence Index (60 pts) and ARC-AGI-2 (85%) but ranks third on knowledge calibration behind **Gemini 3.1 Pro Preview** and **Claude Opus 4.7** due to an 85.53% hallucination rate.

The **GPT-5.5** API is priced at $5/$30 per million input/output tokens, roughly double GPT-5.4 rates, while **GPT-5.5 Pro** runs $30/$180 per million tokens with parallel reasoning inference.

**Andrew Ng** launched **AI Prompting for Everyone**, a no-technical-background course covering deep research mode, multi-document context, and agentic AI use across **ChatGPT**, **Claude**, and **Gemini**.

**Andrew Ng** ranks coding agent acceleration by task type (frontend fastest, then backend, then infrastructure, with research least accelerated) and adjusts team expectations accordingly.

**Z.ai** released **GLM-5.1**, a 754B parameter open-weights MoE model that can autonomously loop through planning, execution, and self-evaluation for up to eight hours on coding tasks.

**GLM-5.1** tops the Artificial Analysis Intelligence Index among open-weights models, priced at $1.40/$4.40 per million input/output tokens and available under MIT license on HuggingFace.

**Anthropic** released details about **Claude Mythos Preview**, a new AI model that outperforms Claude Opus 4.6 but poses significant cybersecurity risks due to its ability to exploit code vulnerabilities.

**Andrew Ng** argues against AI jobpocalypse predictions, citing a **Citadel Research** report showing rising software engineering job postings despite AI acceleration in coding.

**DeepLearning.AI** announced a new **SGLang** course for efficient LLM inference and the **AI Developer Conference** on April 28-29 in San Francisco, focused on the future of software engineering.

**Claude Code's** source code was accidentally leaked through an npm package containing a source map file, revealing over 512,000 lines of code across 1,900 files.

Latest issue: July 10, 2026

Restoration of Claude Fable 5, Gemini's Video Dev Engine, DeepSeek Speeds Up Speculative Decoding

Jul 10

OpenAI's GPT-5.6 Family, New Ways to Train Robots, Models Invoking Models

Jul 3

A New Generation Studies AI, Apple's Recipe for On-Device Models, GLM5.2 Tackles Open-Ended Problems

Jun 26

Testing Mythos and Fable, Moving Beyond SWE-bench, Nvidia's Open Contender

Jun 19

Mythos Begets Fable, Cursor's Composer 2.5, Agents Building Agents

Jun 12

Qwen3.7-Max Challenges Google for Third Place, AI Saves Whales, Fine-Tuning Breaks Copyright Alignment

Jun 5

Hermes vs. OpenClaw, Cybersecurity Alarms Ring, More-Interactive Conversations, Can Agents Do Human Work?

May 22

China Thwarts Meta’s Agentic Ambition, U.S. Evaluates Upcoming Models, AI Diagnoses Mammograms

May 15

Seedance Makes A Splash, Nvidia's AI-Guided Chip Designs, Helping Robots Not Forget

May 8

GPT-5.5 Outperforms (and Hallucinates), Kimi K2.6 Leads Open LLMs, AI Strains Climate Pledges, Strategic Thinking in LLMs vs. Humans

OpenAI's GPT-5.5 tops key objective benchmarks like ARC-AGI-2 and Artificial Analysis Intelligence Index but struggles with hallucinations, ranking third on knowledge calibration behind Gemini and Claude. The newsletter also promotes Andrew Ng's new course 'AI Prompting for Everyone' covering advanced prompting techniques for ChatGPT, Claude, and Gemini. Additional topics teased include Kimi K2.6 leading open LLMs and AI's strain on climate pledges.

May 1 | AI/ML | 3 insights

GLM 5.1 Thinks Strategically, Data-Center Revolt Intensifies, When Helpful LLMs Turn Unhelpful, Humanoid Robots Get to Work

Andrew Ng shares a framework for how coding agents accelerate different types of software work, ranking frontend development as most accelerated, followed by backend, infrastructure, and research. Z.ai released GLM-5.1, an open-weights 754B parameter mixture-of-experts model designed for long-running agentic coding tasks lasting up to eight hours. The newsletter also touches on data-center issues, unhelpful LLMs, and humanoid robots entering the workforce.

Apr 24 | AI/ML | 3 insights

Anthropic’s Claude Mythos Problem, Dark DNA Unveiled, Pitfalls for Assistive Models, Simulating Fluid Dynamics

Andrew Ng discusses the future of software engineering as AI agents accelerate coding, arguing against predictions of massive AI-driven job losses while highlighting expanding software engineering job postings. Anthropic releases details about Claude Mythos Preview, a new AI model with extraordinary cybersecurity capabilities that outperforms Claude Opus 4.6 but poses security risks due to its ability to identify and exploit code vulnerabilities.

Apr 10 | AI/ML | 3 insights

Claude Code’s Source Leaks, OpenAI Exits Video Generation, Gemini Adds Music Generation, LLMs Learn at Inference

This issue focuses on the rapid advancement of voice-based AI interfaces and their potential to become pervasive in applications. The main story covers the accidental exposure of Claude Code's source files through an npm package, which revealed over 512,000 lines of code.

Apr 3 | AI/ML | 3 insights

Nvidia's Open Salvo, OpenAI's Amazon Deal, Grok Cuts Video Prices, Recursive Language Models

Andrew Ng discusses how anti-AI coalitions are using public opinion research to find effective messaging against AI development, particularly focusing on warfare, environmental concerns, and job displacement arguments. Nvidia released Nemotron 3 Super 120B-A12B, the first open-source LLM leader from the US since Meta's Llama 4, designed for agentic applications with superior speed performance.

Mar 27 | AI/ML | 3 insights

Attacks On Data Centers, Qwen3.5 In All Sizes, DeepSeek's Huawei Play, Apple's Multimodal Tokenizer

Andrew Ng addresses widespread job insecurity across all career levels amid rapid AI advancement and geopolitical instability, recommending community-building and skill investment as stable foundations. The newsletter also reports on Iranian drone strikes targeting at least three Amazon Web Services data centers in Bahrain and the UAE, disrupting critical online services and marking a potential first in wartime targeting of cloud infrastructure.

Mar 20 | AI/ML | 3 insights