North Mini Code: Cohere's First Real Agent-Coding Model
Cohere just released a 30B MoE model trained specifically for agentic software engineering. It's Apache 2.0, beats models 4× its size, and actually works across multiple agent harnesses.
18 posts
Hugging Face, Meta PyTorch, Nvidia, and a dozen others just formed a committee to govern OpenEnv—the protocol layer trying to make agentic RL training actually interoperable.
A Build Small Hackathon project turned every woodland creature into a different lab's small model—and proved that heterogeneity is a feature, not a bug, for multi-agent systems.
A Build Small Hackathon entry proves small models shine where frontier models fail: running multi-agent simulations in real-time. Lessons on scarcity, JSON reliability, and reskinning history.
Google just shipped an entire agentic stack in one month: Gemini 3.5 for multi-step workflows, Gemini Omni for multimodal creation, proactive Search agents, Universal Cart, and hardware purpose-built for it all.
H Company ships quantized weights, mobile support, and cross-framework compatibility. The computer-use agent stack just got real deployment options—including local inference on consumer hardware.
IBM Research argues LLMs alone can't scale in enterprise workflows. Their secret weapon? Software primitives that guide models through complex, regulated tasks at 30× lower cost.
Google just shipped two very different models at I/O 2026: Omni for conversational video editing and 3.5 Flash for long-horizon agent tasks. Here's what the demos reveal.
A 10,000-person software shop cut requirements analysis from weeks to hours by encoding senior judgment into Codex. Their playbook: treat it as a desktop agent, not a code assistant.
Frontier models score below 50% on Kubernetes incident response. The new ITBench-AA benchmark from Artificial Analysis and IBM reveals the gap between agent demos and production IT work.
The AI agent field moves fast, and its vocabulary moves faster. Hugging Face's new glossary finally draws clear lines between harness, scaffold, and agent: distinctions that matter.
OpenAI wants to manage your money. Their new ChatGPT finance feature raises hard questions about AI capabilities, privacy theater, and whether we're solving problems that actually exist.
OpenAI's latest customer spotlight shows how Parloa is using GPT models to power voice agents that don't make you want to throw your phone. Real-time, reliable, and surprisingly capable.
NVIDIA just dropped a 3B parameter multimodal model that processes documents, audio, and video with 128K context. Let's dig into what makes this nano model surprisingly capable.
Google just announced TPU v8, but instead of one chip, they're shipping two: v8T for training and v8I for inference. Here's why the bifurcation matters for AI's next phase.
NVIDIA's new Nemotron-based dataset gives developers 4,800 demographically grounded Korean personas to build culturally aware AI agents—a blueprint for non-English AI.
Hugging Face just dropped Ecom-RLVE, a reinforcement learning framework that trains e-commerce agents in realistic but controllable environments. This is how we move from chatbots to actually useful shopping assistants.
TeamOut's new AI agent promises to plan company retreats through chat. But beneath the slick demo lies a fascinating tension: how do you build trust when the stakes are high and the details matter?