Community Reports — April 2026

AI Model Degradation Reports

Tracking quality regressions and performance changes across major AI models. Reports are based on community observations, benchmark comparisons, and direct testing by the TokenCalculator team.

Last updated: April 6, 2026  ·  2 active reports
Status labels: Degraded  ·  Under Investigation  ·  Resolved  ·  Monitoring
Degraded  ·  Report #001

Claude (All Plans) — Peak-Hour Performance Degradation

Anthropic / claude.ai & API  ·  Reported: March 27, 2026  ·  Status: Active

Summary

Starting March 27, 2026, users across Claude Free, Pro, Max, and Team plans began reporting significantly faster depletion of their session and weekly usage allowances during weekday afternoon hours. Anthropic confirmed that a peak-hour usage adjustment is in effect: during the window of 1:00 PM – 7:00 PM UTC, Monday through Friday, each session consumes the weekly allowance at an accelerated rate. Enterprise plans are not affected. Weekends are fully off-peak and unaffected.

Affected Models

Claude Opus 4.6  ·  Claude Sonnet 4.6  ·  Claude Haiku 4.5  ·  All claude.ai plans except Enterprise

What Users Are Experiencing

  • Weekly message allowance depleting 2–3× faster than expected during weekday afternoons
  • "Usage limit reached" warnings appearing earlier in the day than before March 27
  • Normal drain rate observed on weekends and before/after the 1 PM–7 PM UTC window
  • Enterprise users report no change — consistent with the exemption
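To make the reported 2–3× figure concrete, here is a rough arithmetic sketch. It assumes a simple model in which each peak-hour message counts as `multiplier` ordinary messages against the weekly allowance. That accounting rule, the function name, and the sample numbers are illustrative assumptions, not Anthropic's published formula.

```python
def weekly_units_used(off_peak_msgs: int, peak_msgs: int, multiplier: float) -> float:
    """Estimate allowance consumed when peak-hour messages are weighted.

    Assumption: a peak-hour message costs `multiplier` times an off-peak one.
    """
    return off_peak_msgs + multiplier * peak_msgs


if __name__ == "__main__":
    # Hypothetical week: 100 off-peak messages and 50 peak-hour messages.
    for m in (1.0, 2.0, 3.0):
        print(m, weekly_units_used(100, 50, m))
```

Under this model the same 150 messages consume 150, 200, or 250 units depending on the multiplier, which is consistent with users hitting their limit earlier in the week without sending more messages.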

Peak Hours by Time Zone

City / Region     Peak Window (Local)     Zone
New York          9:00 AM – 3:00 PM       EDT (UTC−4)
San Francisco     6:00 AM – 12:00 PM      PDT (UTC−7)
São Paulo         10:00 AM – 4:00 PM      BRT (UTC−3)
London            2:00 PM – 8:00 PM       BST (UTC+1)
Paris / Berlin    3:00 PM – 9:00 PM       CEST (UTC+2)
Istanbul          4:00 PM – 10:00 PM      TRT (UTC+3)
New Delhi         6:30 PM – 12:30 AM      IST (UTC+5:30)
Beijing           9:00 PM – 3:00 AM       CST (UTC+8)
Tokyo / Seoul     10:00 PM – 4:00 AM      JST/KST (UTC+9)

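Weekends (Sat–Sun, UTC) are fully off-peak worldwide. Because the window is defined in UTC, Friday's window ends in the early hours of Saturday local time in New Delhi, Beijing, and Tokyo / Seoul.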
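The table rows can be reproduced, and any timestamp checked against the window, with a few lines of Python. This is a minimal sketch using fixed UTC offsets taken from the table, so it ignores daylight-saving changes. The function names are our own, and treating the 7:00 PM UTC boundary as exclusive is an assumption.

```python
from datetime import datetime, time, timedelta, timezone

# Window from the report: 1:00 PM - 7:00 PM UTC, Monday through Friday.
PEAK_START_UTC = time(13, 0)
PEAK_END_UTC = time(19, 0)


def is_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware datetime falls inside the peak window."""
    utc = moment.astimezone(timezone.utc)
    if utc.weekday() >= 5:  # 5 = Saturday, 6 = Sunday, judged in UTC
        return False
    # Assumption: the start is inclusive and the end is exclusive.
    return PEAK_START_UTC <= utc.time() < PEAK_END_UTC


def local_window(offset_hours: float) -> tuple[str, str]:
    """Express the UTC peak window as local clock times for a fixed offset."""
    local = timezone(timedelta(hours=offset_hours))
    monday = datetime(2026, 4, 6).date()  # any weekday gives the same clock times
    bounds = []
    for boundary in (PEAK_START_UTC, PEAK_END_UTC):
        utc_dt = datetime.combine(monday, boundary, tzinfo=timezone.utc)
        bounds.append(utc_dt.astimezone(local).strftime("%H:%M"))
    return bounds[0], bounds[1]


if __name__ == "__main__":
    print(local_window(-4))   # New York (EDT)
    print(local_window(5.5))  # New Delhi (IST)
    tokyo = timezone(timedelta(hours=9))
    # 2:00 AM Saturday in Tokyo is still 5:00 PM Friday in UTC.
    print(is_peak(datetime(2026, 4, 11, 2, 0, tzinfo=tokyo)))
```

The last call is worth noting: because the weekday is evaluated in UTC, a timestamp in the small hours of Saturday in East Asia can still fall inside Friday's window.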

How to Work Around It

  • Schedule heavy sessions for before 1 PM UTC on weekdays, or after 7 PM UTC
  • Use weekends for compute-intensive or multi-turn sessions
  • API users: use the Batch API for non-urgent workloads; it processes requests asynchronously and costs 50% less than standard API calls
  • Route light tasks to Haiku during peak hours; reserve Opus for after-hours
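The scheduling and routing advice above could be automated along these lines. This is one possible policy, not an official recommendation: the labels "haiku" and "opus" are plain strings rather than API model identifiers, and `plan_task` and `next_off_peak` are hypothetical helper names.

```python
from datetime import datetime, timezone

PEAK_START_HOUR_UTC = 13  # 1:00 PM UTC
PEAK_END_HOUR_UTC = 19    # 7:00 PM UTC


def is_peak(moment: datetime) -> bool:
    """True on weekdays (UTC) between the start hour and the end hour."""
    utc = moment.astimezone(timezone.utc)
    return utc.weekday() < 5 and PEAK_START_HOUR_UTC <= utc.hour < PEAK_END_HOUR_UTC


def next_off_peak(moment: datetime) -> datetime:
    """Return the earliest off-peak time at or after `moment`, in UTC."""
    utc = moment.astimezone(timezone.utc)
    if is_peak(utc):
        # Defer to the end of today's window.
        return utc.replace(hour=PEAK_END_HOUR_UTC, minute=0, second=0, microsecond=0)
    return utc


def plan_task(moment: datetime, heavy: bool) -> tuple[str, datetime]:
    """Pick a model label and a run time for a task.

    Policy in this sketch: light tasks run immediately on the smaller model;
    heavy tasks use the larger model and wait for off-peak if necessary.
    """
    if heavy:
        return "opus", next_off_peak(moment)
    return "haiku", moment.astimezone(timezone.utc)


if __name__ == "__main__":
    monday_peak = datetime(2026, 4, 6, 14, 30, tzinfo=timezone.utc)
    print(plan_task(monday_peak, heavy=True))
    print(plan_task(monday_peak, heavy=False))
```

A job queue could call `plan_task` when a task is submitted and sleep until the returned time; heavy tasks submitted outside the window run immediately.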

More details: See our full blog post on this change — Claude Peak Hours 2026: Why Your Weekly Limit Drains Faster on Weekday Afternoons.

Under Investigation  ·  Report #002

Gemini 3.1 Pro — Quality Regression vs. Gemini 3.0

Google DeepMind / Vertex AI & Google AI Studio  ·  Reported: April 2026  ·  Status: Under Investigation

Summary

A growing number of developers and researchers have reported that Gemini 3.1 Pro underperforms Gemini 3.0 on several key task categories despite the higher version number. The pattern — where a newer model version is subjectively worse than its predecessor on specific workloads — is sometimes called a "quality regression" and is a known phenomenon in LLM development cycles. Google has not officially acknowledged the regression.

Affected Models

Gemini 3.1 Pro  ·  Gemini 3.1 Flash

What the Reports Show

  • Instruction following: Gemini 3.1 Pro shows higher rates of ignoring explicit formatting and constraint instructions compared to 3.0 in community evaluations
  • Code generation: Several developers report that 3.0 produced more reliable, runnable code on identical prompts — 3.1 generates more plausible-looking but subtly broken outputs in some cases
  • Long-context coherence: At 500K+ tokens, 3.1 Pro shows more "attention drift" — losing track of earlier context — than 3.0 did at equivalent context lengths
  • Creative writing: Users who preferred 3.0's more consistent tone and structure rate 3.1's output as subjectively lower in quality
  • Reasoning tasks: Mixed results — 3.1 benchmarks higher on formal reasoning suites but community evaluations suggest worse real-world performance on ambiguous multi-step tasks

Why Does This Happen?

Quality regressions in LLM updates typically stem from one of several causes: changes to RLHF (reinforcement learning from human feedback) data that inadvertently overfit to a different set of preferences; architecture changes that improve some capabilities while degrading others; or inference-time optimizations (quantization, speculative decoding) that trade subtle quality for speed. A higher version number does not guarantee better performance on every task; it reflects a different set of trade-offs.
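As a generic illustration of how an inference-time optimization can cost a little precision, the sketch below applies symmetric 8-bit quantization to a handful of made-up weights. It is a toy example of the general technique only; nothing here is known about how any Gemini model is actually served, and the values are invented.

```python
def quantize_int8(values: list[float]) -> tuple[list[int], float]:
    """Symmetric quantization: map floats onto integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid a zero scale
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.8, -0.31, 0.05, 0.002, -1.2]  # invented values
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    for original, approx in zip(weights, restored):
        print(f"{original:+.4f} -> {approx:+.4f} (error {abs(original - approx):.4f})")
```

Each value comes back within half a quantization step of the original, but the smallest weight (0.002) rounds to exactly zero. Small per-value losses of this kind are cheap individually, and their combined effect on output quality is hard to predict without task-level evaluation.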

What You Can Do

  • If your workflow was optimized for Gemini 3.0, consider pinning to the 3.0 model version via the API if available
  • Run A/B evaluations on your specific task type before migrating production workloads to 3.1
  • For coding tasks showing regressions, Claude Sonnet 4.6 and GPT-5.4 are strong alternatives at comparable pricing
  • Submit feedback to Google AI Studio — documented quality regressions with specific examples are the most effective way to prompt a patch
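An A/B evaluation of the kind suggested above does not need heavy tooling. The sketch below is a minimal harness in which each model is represented by a plain Python callable; in practice you would wrap your own API client calls. The stub models, prompts, and checks are placeholders we made up, not real Gemini outputs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    prompt: str
    check: Callable[[str], bool]  # returns True if the output is acceptable


def pass_rate(model: Callable[[str], str], cases: list[Case]) -> float:
    """Fraction of cases whose output passes its check."""
    passed = sum(1 for case in cases if case.check(model(case.prompt)))
    return passed / len(cases)


def regressions(old: Callable[[str], str], new: Callable[[str], str],
                cases: list[Case]) -> list[str]:
    """Prompts that the old model passes and the new model fails."""
    return [c.prompt for c in cases
            if c.check(old(c.prompt)) and not c.check(new(c.prompt))]


def has_three_bullets(output: str) -> bool:
    return sum(line.startswith("- ") for line in output.splitlines()) == 3


# Stand-in models for demonstration only; replace with real API wrappers.
def old_stub(prompt: str) -> str:
    return "- a\n- b\n- c" if "bullets" in prompt else "42"


def new_stub(prompt: str) -> str:
    return "a, b, c" if "bullets" in prompt else "42"


if __name__ == "__main__":
    cases = [
        Case("List three bullets", has_three_bullets),
        Case("Answer with a number only", lambda out: out.strip().isdigit()),
    ]
    print("old:", pass_rate(old_stub, cases))
    print("new:", pass_rate(new_stub, cases))
    print("regressed:", regressions(old_stub, new_stub, cases))
```

Listing the specific prompts that regressed, rather than only the aggregate pass rate, also gives you the documented examples that make feedback to a provider actionable.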

Note: This report is based on community observations and third-party evaluations. Google has not officially confirmed a regression. We will update this report when official information is available. Compare models directly on our models page.

About this page: Degradation reports are maintained by the TokenCalculator team based on developer community reports, benchmark comparisons, and internal testing. Reports do not constitute official statements from any AI provider. Follow @tokencalculator on X for updates.