Weekly LLM API Reliability Report — May 26, 2026
This week's real-time monitoring covered four major LLM APIs. Three stayed at or above 98.8% uptime, while Gemini's free tier ran into its daily quota. Here's what happened.
Executive Summary
| Provider | Uptime (7d) | Avg Latency | Incidents | Confidence |
|---|---|---|---|---|
| Claude | 99.94% | 1,024ms | 1 | High |
| ChatGPT | 99.82% | 573ms | 2 | High |
| xAI (Grok) | 98.80% | 1,358ms | 3 | High |
| Gemini | 70.27% | 915ms | 2* | Medium** |
*Gemini 2.5 Pro hit daily quota limits (1,000 requests/day). Gemini 2.5 Flash recovered quickly.
**Sampling: Gemini is checked every 2–3 min (vs. every 30s for the others) because of rate limits, so its figure rests on fewer data points and carries higher variance.
Uptime Breakdown
🟢 Claude (Anthropic) — 99.94%
Status: Operational
Best performer this week. One minor degradation event (< 1 min) early on May 26 (UTC), which recovered on its own.
What happened: Haiku 3.5 had a transient issue at 02:00 UTC. There were no user reports, so we have high confidence it was a momentary glitch.
🟢 ChatGPT (OpenAI) — 99.82%
Status: Operational
Solid week. Two brief incidents: GPT-4 Turbo and GPT-4o Mini each degraded for under a minute.
What happened:
- May 26, 02:23 UTC: GPT-4 Turbo degraded for 12 seconds
- May 26, 00:40 UTC: GPT-4o Mini degraded for 19 seconds
Both resolved within 20 seconds. The pattern suggests brief load spikes rather than underlying issues.
🟡 xAI (Grok) — 98.80%
Status: Operational
Reliable, but slower than Claude/ChatGPT. Three incidents, all brief.
What happened:
- May 25, 22:54 UTC: Grok 3 timeout (30s) — 25-second event
- May 25, 22:38 UTC: Grok 3 Mini degraded (< 1 min)
- May 25, 22:20 UTC: Grok 3 Mini timeout — 19 seconds
All transient. Grok's latency is higher (1,358ms avg) but predictable.
🔴 Gemini (Google) — 70.27%
Status: Degraded (Rate-Limited)
⚠️ Important context: Gemini 2.5 Pro hit Google's daily quota (1,000 requests/day).
What happened:
- May 26, 01:40 UTC: Pro model exceeded daily limit. Status: 429 (rate-limited)
- Duration: 2+ hours (ongoing as of this report)
- Root cause: Google's free tier API has a hard 1,000-request-per-day limit
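- Fix: Wait for the daily quota to reset, or upgrade to a paid tier. Note that Google documents daily quota resets at midnight Pacific time rather than midnight UTC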
Gemini 2.5 Flash: Operating normally (100% uptime). It has a separate quota.
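The quota arithmetic behind this incident can be sketched as follows. This is a minimal illustration, assuming the 1,000 requests/day figure above and one request per health check; the function names are hypothetical and not part of our monitoring code.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def min_poll_interval_seconds(daily_quota: int) -> float:
    """Shortest polling interval (seconds) that stays within a daily quota.

    Assumes one API request per health check and a single monitored model.
    """
    if daily_quota <= 0:
        raise ValueError("daily_quota must be positive")
    return SECONDS_PER_DAY / daily_quota


def checks_per_day(interval_seconds: float) -> int:
    """Number of checks issued in 24 hours at a fixed polling interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return math.floor(SECONDS_PER_DAY / interval_seconds)


if __name__ == "__main__":
    print(min_poll_interval_seconds(1000))  # shortest safe interval for 1,000 req/day
    print(checks_per_day(30))               # requests needed at 30s sampling
    print(checks_per_day(120), checks_per_day(180))  # 2-3 min sampling
```

Under these assumptions, a 30-second interval would need 2,880 requests per day for a single model, well above a 1,000-request quota, while 2–3 minute sampling needs 480–720.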
Why the uptime looks low:
- We check Gemini every 2–3 minutes (vs 30s for others) due to rate-limiting
- Fewer data points = higher variance
- The 70% reflects rate-limit periods, not actual API failures
- Real availability: Likely 99%+ when quota is available
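To make the distinction between rate-limit periods and actual failures concrete, here is a hedged sketch that computes both a raw uptime figure and a quota-adjusted one that leaves 429 responses out of the denominator. The function name and the sample data are illustrative assumptions, not our production code or real measurements.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def uptime_summary(status_codes: Iterable[int]) -> Dict[str, Optional[float]]:
    """Return raw uptime and uptime with rate-limited (429) checks excluded."""
    counts = Counter(status_codes)
    total = sum(counts.values())
    ok = counts[200]
    rate_limited = counts[429]
    non_quota = total - rate_limited  # checks that were not blocked by quota

    return {
        "raw_uptime_pct": 100.0 * ok / total if total else None,
        "quota_adjusted_pct": 100.0 * ok / non_quota if non_quota else None,
    }


if __name__ == "__main__":
    # Made-up sample: 70 successful checks and 30 rate-limited ones.
    sample = [200] * 70 + [429] * 30
    print(uptime_summary(sample))
```

Reporting both numbers side by side keeps the headline figure honest while showing how much of the shortfall comes from quota rather than from the API itself.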
Latency Comparison
xAI (Grok): 1,358ms ████████████████
Claude: 1,024ms █████████████
Gemini: 915ms ███████████
ChatGPT: 573ms ███████
Observations:
- ChatGPT is fastest (573ms average)
- Claude is roughly 450ms slower than ChatGPT on average, but consistent
- Grok has the highest latency (1,358ms average), possibly a sign of newer infrastructure
- Gemini latency (915ms average) is solid when not rate-limited
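A text chart like the one above can be generated from the average latencies. The sketch below assumes one block per 85 ms, rounded up; that scale is our guess at a value that reproduces the bar lengths shown, and the function names are hypothetical.

```python
from typing import Dict, List

# Average latencies from the Executive Summary table, in milliseconds.
AVG_LATENCY_MS: Dict[str, int] = {
    "xAI (Grok)": 1358,
    "Gemini": 915,
    "Claude": 1024,
    "ChatGPT": 573,
}

MS_PER_BLOCK = 85  # assumed scale: one block per 85 ms, rounded up


def bar(latency_ms: int, ms_per_block: int = MS_PER_BLOCK) -> str:
    """Render a latency as a run of block characters (ceiling division)."""
    blocks = -(-latency_ms // ms_per_block)
    return "█" * blocks


def render_chart(latencies: Dict[str, int]) -> List[str]:
    """Return chart lines sorted from slowest to fastest provider."""
    ordered = sorted(latencies.items(), key=lambda item: item[1], reverse=True)
    return [f"{name}: {ms:,}ms {bar(ms)}" for name, ms in ordered]


if __name__ == "__main__":
    for line in render_chart(AVG_LATENCY_MS):
        print(line)
```

Sorting by latency before rendering keeps the bars in a monotonic order, which makes the ranking readable at a glance.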
What This Means for You
For Production Apps
Priority ranking by reliability:
- Claude — Best uptime, acceptable latency, most stable
- ChatGPT — Proven reliability, fastest, occasional transients
- Grok — Reliable but slower, consider for non-latency-critical paths
- Gemini — Avoid for production until you upgrade beyond free tier
For Cost Optimization
- Gemini free tier = Daily quota wall (1,000 req/day); move to a paid tier for sustained use
- Claude = Premium but most reliable
- ChatGPT = Volume pricing, occasional brief spikes that recovered within seconds this week
- Grok = Budget option, higher latency but stable
For Monitoring
Set up alerts for:
- ChatGPT: Consecutive errors (tends to spike then recover)
- Claude: Anything unusual (normally rock-solid)
- Grok: Latency thresholds (high baseline)
- Gemini: Rate-limit errors (quota management issue, not API issue)
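The alert types listed above could be expressed as simple predicates over recent health checks. This is a minimal sketch under stated assumptions: the `Check` record, the function names, and every threshold are hypothetical placeholders to tune for your own traffic, not recommendations from our data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Check:
    status_code: int
    latency_ms: float


def consecutive_errors(checks: Sequence[Check], threshold: int = 3) -> bool:
    """True if the most recent `threshold` checks all returned 5xx."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    recent = checks[-threshold:]
    return len(recent) == threshold and all(c.status_code >= 500 for c in recent)


def latency_breach(checks: Sequence[Check], limit_ms: float) -> bool:
    """True if mean latency of successful checks exceeds `limit_ms`."""
    ok = [c.latency_ms for c in checks if c.status_code == 200]
    return bool(ok) and sum(ok) / len(ok) > limit_ms


def rate_limited(checks: Sequence[Check]) -> bool:
    """True if the latest check was rejected with HTTP 429."""
    return bool(checks) and checks[-1].status_code == 429


if __name__ == "__main__":
    history = [Check(200, 520.0), Check(503, 0.0), Check(503, 0.0), Check(503, 0.0)]
    print(consecutive_errors(history))            # three 5xx in a row
    print(latency_breach(history, limit_ms=400))  # one successful check at 520 ms
    print(rate_limited([Check(429, 0.0)]))
```

Keeping rate-limit detection separate from error detection reflects the point above: a 429 is a quota management signal, not evidence of an API failure.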
Methodology & Confidence
How we measure:
- Real API calls every 30 seconds to each model (every 2–3 minutes for Gemini, due to rate limits)
- Health check: "Respond with exactly: OK" (minimal cost)
- Status codes: 200 = Operational, 429 = Degraded, 5xx = Outage
- Uptime % = (Operational checks / Total checks) × 100
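The status mapping and uptime formula above can be sketched in a few lines. This is an illustration rather than our production code: the names are hypothetical, and treating any other status code as `UNKNOWN` is an assumption, since the mapping above only covers 200, 429, and 5xx.

```python
from enum import Enum
from typing import Iterable


class Status(Enum):
    OPERATIONAL = "operational"
    DEGRADED = "degraded"
    OUTAGE = "outage"
    UNKNOWN = "unknown"  # assumption: codes outside the documented mapping


def classify(status_code: int) -> Status:
    """Map an HTTP status code to a health state per the rules above."""
    if status_code == 200:
        return Status.OPERATIONAL
    if status_code == 429:
        return Status.DEGRADED
    if 500 <= status_code <= 599:
        return Status.OUTAGE
    return Status.UNKNOWN


def uptime_percent(status_codes: Iterable[int]) -> float:
    """Uptime % = operational checks / total checks * 100."""
    statuses = [classify(code) for code in status_codes]
    if not statuses:
        raise ValueError("no checks recorded")
    operational = sum(1 for s in statuses if s is Status.OPERATIONAL)
    return 100.0 * operational / len(statuses)


if __name__ == "__main__":
    # Made-up sample: 9,994 successful checks and 6 server errors.
    print(uptime_percent([200] * 9994 + [503] * 6))
```

Note that under this formula a 429 counts against uptime even though the API itself is reachable, which is why a quota-limited provider can show a low percentage.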
Confidence levels:
- High (ChatGPT, Claude, Grok): 2,880+ data points/day (30s sampling)
- Medium (Gemini): 480–720 data points/day (2–3 min sampling due to rate limits)
Gemini's lower confidence is due to sparse sampling, not API unreliability.
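One way to quantify how sparse sampling widens the uncertainty is the standard error of a proportion. The sketch below assumes each check is an independent pass/fail trial, which is a simplification because real outages cluster in time; the function name and the 99% example value are hypothetical.

```python
import math


def standard_error_pct(uptime_pct: float, n_checks: int) -> float:
    """Standard error of an uptime percentage, in percentage points.

    Uses the binomial approximation sqrt(p * (1 - p) / n).
    """
    if n_checks <= 0:
        raise ValueError("n_checks must be positive")
    if not 0.0 <= uptime_pct <= 100.0:
        raise ValueError("uptime_pct must be between 0 and 100")
    p = uptime_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n_checks)


if __name__ == "__main__":
    dense = standard_error_pct(99.0, 2880)   # one day of 30s sampling
    sparse = standard_error_pct(99.0, 480)   # one day of 3 min sampling
    print(dense, sparse, sparse / dense)
```

For the same underlying uptime, 480 checks per day give a standard error about 2.45 times (the square root of 6) larger than 2,880 checks per day.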
Next Week
Watch for:
- Gemini quota reset: Does it re-stabilize once the daily quota resets?
- Grok latency: Is 1,358ms the baseline, or a temporary spike?
- ChatGPT transients: Do the brief spikes continue? The pattern so far suggests load management
Track real-time status at IsItDown.ai
Published by the Is It Down AI Team. Questions? Open an issue or reach out.