IRCNF

Microsoft launched its first in-house reasoning model — and it outscores Claude Sonnet 4.6 in blind tests

Microsoft AI

Microsoft announced MAI-Thinking-1 at Build 2026 on June 2 — its first flagship in-house reasoning model, built without OpenAI training data, and the opening move in what the company is calling its MAI model family. The announcement marks a meaningful shift: Microsoft has been the world's largest OpenAI customer and distributor, and now it has a competitive reasoning model of its own.

The timing is notable. MAI-Thinking-1 arrives as the AI model market is compressing, with strong open-source alternatives narrowing the gap to proprietary frontier models. Microsoft's decision to build and release its own model, rather than simply resell or fine-tune OpenAI's, reflects both economics (owning the model can reduce per-inference costs) and strategy: relying entirely on a supplier that is also a competitor is an exposure most companies would want to reduce.

What the benchmarks actually say

MAI-Thinking-1 is a 35-billion-active-parameter model with a 128K context window (256K in some configurations). Microsoft published the following benchmark results:

  • SWE-Bench Pro: 52.8% — Microsoft claims this matches Claude Opus 4.6 on coding tasks
  • AIME 2025: 97.0%
  • AIME 2026: 94.5%
  • LiveCodeBench v6: 87.7%

The headline claim, that independent human evaluators in blind tests preferred MAI-Thinking-1 over Anthropic's Claude Sonnet 4.6 for overall quality in single- and multi-turn tasks, is the kind of result that requires scrutiny. Human preference evaluations can vary substantially with prompt selection, evaluator pool, and task framing, and Microsoft hasn't published the full methodology. By contrast, SWE-Bench Pro is a concrete, reproducible benchmark, and a 52.8% score is competitive with the top tier of publicly available models.
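One reason methodology matters: the same preference win rate can be decisive or inconclusive depending on how many comparisons sit behind it. The sketch below uses a standard Wilson score interval to show this. The win counts and sample sizes are invented for illustration only; Microsoft has not published either, and `wilson_interval` is a helper written for this example.

```python
import math

def wilson_interval(wins, n, z=1.96):
    """Approximate 95% Wilson score interval for a win rate of wins/n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = wins / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical results: the same 55% win rate at two sample sizes.
small = wilson_interval(55, 100)       # 100 blind comparisons
large = wilson_interval(5500, 10000)   # 10,000 blind comparisons

print(f"n=100:    {small[0]:.3f} to {small[1]:.3f}")
print(f"n=10000:  {large[0]:.3f} to {large[1]:.3f}")
```

With 100 comparisons the interval still spans 50%, so a 55% preference would not rule out a tie; with 10,000 it would. Without the sample size, prompt set, and evaluator pool, a reader cannot tell which case applies.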

The "trained without OpenAI data" detail is significant both legally and technically. It positions MAI-Thinking-1 as not derived from GPT-family models and indicates that Microsoft's AI capabilities are not contingent on its OpenAI partnership. Whether that independence extends to the full model family or just this release is not yet clear.

The MRC protocol: networking for AI at scale

The second major announcement from Build that deserves attention is the Multipath Reliable Connection (MRC) protocol — an RDMA-based networking standard developed by OpenAI in collaboration with Microsoft, AMD, Broadcom, Intel, and NVIDIA.

MRC is designed to solve a specific and consequential problem: running synchronous AI training jobs across thousands of GPUs requires networking that can handle the simultaneous all-reduce communication patterns of large-scale training with high reliability and low latency. Current AI clusters use InfiniBand or RoCE (RDMA over Converged Ethernet); both have limitations in how they handle congestion and hardware failures during training runs.

MRC extends RoCE with multipath packet spraying — distributing traffic across many simultaneous paths rather than a single path — and SRv6 source routing, which allows the sender to explicitly specify packet routing across the network fabric. Combined, these allow MRC to route around congestion and hardware failures dynamically, without the training job halting or needing to restart from a checkpoint.
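The two mechanisms described above can be illustrated with a toy model. This is a minimal sketch and not MRC's actual algorithm: the topology, the hop names, the round-robin policy, and the `spray` and `spine_load` functions are all hypothetical. Each path is an explicit hop list chosen by the sender (loosely analogous to a source-routed segment list), and packets are sprayed across every path that avoids a failed hop.

```python
from collections import Counter
from itertools import cycle

# Hypothetical fabric: four sender-selected paths between the same two leaves.
PATHS = {
    "p0": ("leaf-1", "spine-1", "leaf-9"),
    "p1": ("leaf-1", "spine-2", "leaf-9"),
    "p2": ("leaf-1", "spine-3", "leaf-9"),
    "p3": ("leaf-1", "spine-4", "leaf-9"),
}

def spray(num_packets, paths, failed_hops=frozenset()):
    """Assign each packet a hop list, skipping paths that cross a failed hop."""
    healthy = [name for name, hops in paths.items()
               if not failed_hops.intersection(hops)]
    if not healthy:
        raise RuntimeError("no healthy path available")
    rotation = cycle(healthy)  # simple round-robin stand-in for spraying
    return [(seq, paths[next(rotation)]) for seq in range(num_packets)]

def spine_load(assignments):
    """Count packets per middle hop (the spine switch in this toy topology)."""
    return Counter(hops[1] for _, hops in assignments)

# All paths healthy: traffic is spread evenly across four spines.
healthy_load = spine_load(spray(8, PATHS))

# One spine fails: the sender keeps sending over the remaining three paths.
degraded_load = spine_load(spray(9, PATHS, failed_hops={"spine-3"}))

print(healthy_load)
print(degraded_load)
```

In this sketch a failure only shrinks the set of usable paths; the transfer itself continues, which is the property the article attributes to MRC. A real implementation would also need failure detection, congestion feedback, and handling of out-of-order arrival, none of which is modeled here.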

Crucially, MRC is already in production. OpenAI and Microsoft have deployed it across their largest training clusters, including systems built on NVIDIA GB200 hardware. The specification has been released to the Open Compute Project — the industry consortium that standardizes open hardware and networking designs — making it available for other operators to implement without licensing fees.

If MRC achieves broad adoption, it would represent Ethernet's most significant expansion into AI training infrastructure, a domain that InfiniBand has historically dominated at the highest performance tier. The consortium backing (AMD, Broadcom, Intel, NVIDIA, OpenAI, and Microsoft) gives it enough industry weight to be taken seriously by data center operators evaluating fabric architectures for new AI clusters.

What Microsoft's model independence means for the market

The partnership between Microsoft and OpenAI has been structured so that Microsoft resells OpenAI's models through Azure and integrates them into its products. MAI-Thinking-1 creates an alternative internal option. Microsoft hasn't said MAI replaces its OpenAI agreements — the two companies remain closely linked — but having a proprietary model gives Microsoft negotiating leverage, reduces its exposure to OpenAI's pricing decisions, and allows it to offer model serving at margins that depend on its own compute costs rather than OpenAI's API rates.

For enterprise customers currently using Azure OpenAI endpoints, the practical implication is a new option: a Microsoft-native model available via Microsoft Foundry (currently in private preview) that doesn't require routing through OpenAI's infrastructure. Whether enterprises prefer MAI-Thinking-1 to Claude or GPT-5 for their specific workloads will depend on independent evaluations beyond what Microsoft has published.

The model is not yet publicly available. Microsoft Foundry private preview access is the current entry point. The full availability timeline and pricing have not been announced.

Sources: Microsoft AI; Microsoft Blog; Neowin

Originally reported by Microsoft AI. Read the original article for additional details.
