Backed by Y Combinator

Inference Radar·2026-W21·May 21 — May 27, 2026·18 min read

Qwen3.7-Max Forces Runtimes Into Session Mode

The open-source inference stack spent the week hardening for agents: longer sessions, more tool calls, more multimodal inputs, and more low-bit memory pressure. The center of gravity was not a single model drop, but a broad shift toward serving systems that survive persistent workloads across cloud GPUs, Apple Silicon, mobile NPUs, and local desktops.

3,164 commits
2,779 PRs
1,045 issues
94 releases
71 active repos
Weekly activity by organization

RunAnywhere Labs

We build the engines, SDKs, and agents that put inference where latency, cost, and privacy want it — on-prem, cloud, edge, or in between.

© 2026 RunAnywhere, Inc.