Paper Summaries · 2026-05-06

Paper Summaries: Week 20

Three arXiv papers worth reading this week — what changed, what's useful, what's oversold.

By AI Signal Editorial

Three papers from this week's arXiv firehose actually moved the needle. The rest, as usual, are incremental refinements of last year's ideas dressed in new acronyms. Here is the short version of what to read and what to skip.

Sparse mixture-of-experts with token-conditional routing

The headline is a 30% reduction in active parameters at iso-quality on standard reasoning benchmarks. The mechanism is straightforward: a tiny gating network conditions on local context to route each token to a small set of experts, with a load-balancing loss to prevent expert collapse. What is interesting is not the result (sparse MoE has been the dominant scaling story for two years) but the routing stability under fine-tuning. Previous variants saw expert utilisation drift dramatically during downstream training; this paper proposes a frozen-router phase that holds gating fixed for the initial epochs of fine-tuning.
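To make the moving parts concrete, here is a minimal sketch of top-k routing, a load-balancing penalty, and a frozen-router switch. It is not the paper's implementation: every name (`routeToken`, `loadBalanceLoss`, `routerIsTrainable`, `frozenEpochs`) is hypothetical, and the loss uses the Switch Transformer-style formulation as a stand-in, since the summary above does not say which form the authors use.

```typescript
// Hypothetical sketch of token routing in a sparse MoE layer.
// All names are illustrative; none come from the paper.

type Routing = { experts: number[]; weights: number[] };

// Numerically stable softmax over a vector of gate logits.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Route one token: keep the k experts with the highest gate probability
// and renormalise their weights so they sum to 1.
function routeToken(gateLogits: number[], k: number): Routing {
  const probs = softmax(gateLogits);
  const experts = probs
    .map((p, i) => ({ p, i }))
    .sort((a, b) => b.p - a.p)
    .slice(0, k)
    .map((e) => e.i);
  const total = experts.reduce((s, i) => s + probs[i], 0);
  return { experts, weights: experts.map((i) => probs[i] / total) };
}

// Assumed Switch-style load-balancing loss:
//   numExperts * sum_e (fraction of tokens whose top expert is e) * (mean gate prob of e).
// It is 1 under perfectly uniform routing and approaches numExperts
// when every token collapses onto a single expert.
function loadBalanceLoss(batchLogits: number[][]): number {
  const numExperts = batchLogits[0].length;
  const frac = new Array<number>(numExperts).fill(0);
  const meanProb = new Array<number>(numExperts).fill(0);
  for (const logits of batchLogits) {
    const probs = softmax(logits);
    const top = probs.indexOf(Math.max(...probs));
    frac[top] += 1 / batchLogits.length;
    probs.forEach((p, e) => {
      meanProb[e] += p / batchLogits.length;
    });
  }
  return numExperts * frac.reduce((s, f, e) => s + f * meanProb[e], 0);
}

// Frozen-router phase: the gate receives no updates until `frozenEpochs` have passed.
function routerIsTrainable(epoch: number, frozenEpochs: number): boolean {
  return epoch >= frozenEpochs;
}
```

In a training loop, `routerIsTrainable` would decide whether gate parameters are included in the optimizer step. How many epochs to freeze is left open here, because the summary above does not give a number.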

A surprisingly clean negative result on synthetic data scaling

A team scaled a small model on increasing amounts of teacher-generated synthetic data and found a clear ceiling: beyond a certain ratio of synthetic to real tokens, the model gets measurably worse, even when the synthetic data is filtered for quality. This is one of the rare papers that reports an unwelcome result honestly: they expected a positive result, they got a negative one, they published anyway. The takeaway is not "synthetic data is bad" but "the synthetic-to-real ratio is a hyperparameter and you need to tune it."
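Treating the ratio as a hyperparameter can be as simple as a sweep. The sketch below assumes a fixed pool of real tokens with synthetic tokens added on top, and a caller-supplied `evaluate` function standing in for a full train-and-validate run. All names are hypothetical and nothing here reproduces the paper's setup or its numbers.

```typescript
// Hypothetical sketch: sweep the synthetic-to-real token ratio like any other hyperparameter.

type Mix = { realTokens: number; syntheticTokens: number };

// Hold real data fixed and add ratio * realTokens synthetic tokens on top.
function mixForRatio(realTokens: number, ratio: number): Mix {
  if (ratio < 0) throw new RangeError("ratio must be non-negative");
  return { realTokens, syntheticTokens: Math.round(realTokens * ratio) };
}

// Return the candidate ratio with the lowest validation loss.
// `evaluate` is a placeholder for training on the mix and measuring held-out loss.
function bestRatio(
  ratios: number[],
  realTokens: number,
  evaluate: (mix: Mix) => number
): { ratio: number; loss: number } {
  if (ratios.length === 0) throw new RangeError("need at least one candidate ratio");
  let best = { ratio: ratios[0], loss: Infinity };
  for (const ratio of ratios) {
    const loss = evaluate(mixForRatio(realTokens, ratio));
    if (loss < best.loss) best = { ratio, loss };
  }
  return best;
}
```

The point of including `0` among the candidates is that the real-only baseline is always in the comparison, so a ceiling like the one reported shows up as the sweep turning back toward smaller ratios.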

A benchmark to read carefully

A new "agentic reasoning" benchmark made the rounds this week. Skim it, but do not over-index on the numbers. The task distribution is heavily biased toward problem types that current frontier models already do well on, and the difficulty calibration looks weak. Useful as a smoke test, not as a leaderboard.

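// Example: a minimal RAG retrieval step
// Assumes `vectorStore`, `crossEncoder`, `query`, and `queryEmbedding` are defined elsewhere.
// Retrieve a broad candidate set, rerank it, then keep only the top 6 passages as context.
const hits = await vectorStore.search(queryEmbedding, { topK: 50 });
const reranked = await crossEncoder.rerank(query, hits);
const context = reranked.slice(0, 6).map((h) => h.text).join("\n---\n");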