IBM quietly dropped something significant this morning.
Granite 4.1 is a family of open-source language models released today under the Apache 2.0 license. That means you can use it, modify it, and deploy it commercially -- for free. No API fees. No per-token costs. No vendor lock-in.
Three sizes: 3B, 8B, and 30B parameters. Built for enterprise use. And the benchmark numbers are getting attention.
The Number That Has Everyone Talking
The 8B model -- the mid-tier option -- is matching and beating IBM's previous 32B model on most standard benchmarks. Same tasks, one-quarter the size.
On ArenaHard (a benchmark that evaluates models on 500 real-world prompts), the new 8B scores 69.0. The old 32B model scores lower. On BFCL V3, which tests tool-calling ability -- the thing that makes AI useful for workflows -- the 8B scores 68.3 versus the old model's 64.7.
This matters because smaller models are cheaper to run. If you can get the same performance from a model that needs 8 billion parameters instead of 32 billion, you can run it on less hardware, with lower latency, at lower cost.
For a small business that wants to host its own AI rather than paying monthly for a cloud service, that gap is everything.
What Small Businesses Can Actually Do With This
Free and open-source AI isn't new. What's different here is the use case IBM built Granite 4.1 for: structured enterprise work.
The model was trained with particular focus on:
- Tool calling and function use. This is what lets AI connect to your existing software -- scheduling tools, CRMs, databases. The 8B model outperforms the previous generation here.
- Instruction following. It's filtered on six dimensions of response quality, including completeness, correctness, and calibration. Responses that hallucinate facts get cut before training. This is the part most open-source models still struggle with.
- Long context. The 8B and 30B versions support a 512K token context window, meaning you can feed it entire contracts, customer histories, or large datasets and have it reason across all of it.
The dense architecture -- no complex "mixture of experts" routing -- means cost and latency are predictable. You know what you're paying. You know how fast it'll respond.
Why Apache 2.0 Is the Important Part
Most "free" AI models come with restrictions. You can use them for research but not for products. Or you can deploy them, but you can't modify the underlying weights. Or commercial use requires a separate license with conditions.
Apache 2.0 has none of that. It's the same license that powers Linux, Kubernetes, and most of the open-source software running modern business infrastructure. Build it into a product, sell that product, modify the model to fit your specific data -- all of it is allowed.
For a small business, this is meaningful. You are not dependent on OpenAI's pricing changes, Anthropic's API availability, or Google's enterprise tier requirements. You own your deployment.
How to Actually Run It
Granite 4.1 is available on Hugging Face now. If you have a machine with a decent GPU -- or even a cloud instance -- you can run the 3B version on consumer hardware. The 8B requires more RAM but is within reach of most workstations built in the last two years.
For teams without IT infrastructure, the models can also be accessed via IBM's WatsonX platform, which wraps Granite in an enterprise interface with access controls and usage logging. That's the managed version for businesses that don't want to run their own servers.
IBM's blog post with full benchmark details and setup instructions is at ibm.com.
The Practical Takeaway
AI model pricing has been trending down for two years. Open-source quality has been catching up. Granite 4.1 is the latest evidence that the gap between what you can build for free and what you'd pay a vendor for is now small enough that the decision is mostly about implementation time, not capability.
If your business is already paying monthly for AI tools that handle structured tasks -- data extraction, document summarization, customer query routing, tool automation -- this is worth evaluating. You may be paying for something you can now run yourself.
The model is out today. The benchmarks are real. The license is clean.
IBM Granite 4.1 was released April 30, 2026. Details and benchmark comparisons are available at Firethering and via IBM's official model documentation on Hugging Face.