IBM Granite 4.1 — 8B That Punches at 32B
IBM dropped Granite 4.1 on April 30 in three sizes (3B, 8B, 30B), all Apache 2.0. The headline isn't the model count, it's the ratio: the 8B Instruct now consistently matches or beats the previous Granite 4.0 32B Mixture-of-Experts on tool calling and instruction following. Same architecture family, roughly a quarter of the compute at inference. Either MoE was masking weak training, or the new post-training pipeline is the real story.
IBM is also doing the boring useful work — 12 supported languages out of the box, 15 trillion training tokens, and a post-training stack that explicitly targets agent workloads: tool calling, instruction following, structured output. They didn't ship a chat toy; they shipped something you can plug into an agent harness on Monday morning.
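To make "plug into an agent harness" concrete, here is a minimal sketch of the kind of request an OpenAI-compatible serving layer (vLLM and similar) would pass to a tool-calling model. Everything here is an assumption for illustration: the model identifier, the `get_weather` tool, and the endpoint shape are not from the article.

```python
import json

def build_chat_request(user_msg: str) -> dict:
    """Build an OpenAI-style chat payload with one tool attached.

    The tool schema format (type/function/parameters) is the de facto
    standard that tool-calling chat templates consume; the model name
    below is a hypothetical placeholder, not a confirmed identifier.
    """
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",  # illustrative tool, invented for this sketch
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]
    return {
        "model": "ibm-granite/granite-4.1-8b-instruct",  # assumed model id
        "messages": [{"role": "user", "content": user_msg}],
        "tools": tools,
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

payload = build_chat_request("What's the weather in Boston?")
print(json.dumps(payload, indent=2))
```

POST that payload to a server's `/v1/chat/completions` route and a tool-capable model responds with either plain text or a structured `tool_calls` entry your harness dispatches on — that round trip is the whole "agent-ready" claim in practice.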
The context here matters. Sub-10B open-weight models that genuinely work for tool use are the bottleneck almost every enterprise agent project hits. Frontier APIs are too expensive to spam, the 70B+ open models are too slow to host cheaply, and the existing sub-10B options choke on multi-tool plans. Granite 4.1-8B fits on a single H100 with room to spare and reportedly handles tool chains that previously required a Llama-70B-class model.
The meta-story is that IBM has now quietly become one of the most consistent open-model shippers — not flashy, but Granite 1, 3, 4, 4.1 all have actual tool-use chops, and the Apache 2.0 license means the BigCo deployments aren't waiting for legal. https://research.ibm.com/blog/granite-4-1-ai-foundation-models