Commit Graph

45 Commits

Author SHA1 Message Date
Charles Dusek
048e25c2b2 fix(agents): avoid duplicate same-provider cooldown probes in fallback runs (#41711)
Merged via squash.

Prepared head SHA: 8be8967bcb4be81f6abc5ff078644ec4efcfe7a0
Co-authored-by: cgdusek <38732970+cgdusek@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-10 15:26:47 +03:00
Altay
531e8362b1 Agents: add fallback error observations (#41337)
Merged via squash.

Prepared head SHA: 852469c82ff28fb0e1be7f1019f5283e712c4283
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-10 01:12:10 +03:00
Peter Steinberger
7ab49a7fb7 test(regression): cover recent landed fix paths 2026-03-07 23:07:16 +00:00
Peter Steinberger
e83094e63f fix(agents): warn clearly on unresolved model ids (#39215, thanks @ademczuk)
Co-authored-by: ademczuk <andrew.demczuk@gmail.com>
2026-03-07 22:50:27 +00:00
Altay
6e962d8b9e fix(agents): handle overloaded failover separately (#38301)
* fix(agents): skip auth-profile failure on overload

* fix(agents): note overload auth-profile fallback fix

* fix(agents): classify overloaded failures separately

* fix(agents): back off before overload failover

* fix(agents): tighten overload probe and backoff state

* fix(agents): persist overloaded cooldown across runs

* fix(agents): tighten overloaded status handling

* test(agents): add overload regression coverage

* fix(agents): restore runner imports after rebase

* test(agents): add overload fallback integration coverage

* fix(agents): harden overloaded failover abort handling

* test(agents): tighten overload classifier coverage

* test(agents): cover all-overloaded fallback exhaustion

* fix(cron): retry overloaded fallback summaries

* fix(cron): treat HTTP 529 as overloaded retry
2026-03-07 01:42:11 +03:00
Vignesh Natarajan
d45353f95b fix(agents): honor explicit rate-limit cooldown probes in fallback runs 2026-03-05 20:03:06 -08:00
Altay
49acb07f9f fix(agents): classify insufficient_quota 400s as billing (#36783) 2026-03-06 01:17:48 +03:00
Altay
6859619e98 test(agents): add provider-backed failover regressions (#36735)
* test(agents): add provider-backed failover fixtures

* test(agents): cover more provider error docs

* test(agents): tighten provider doc fixtures
2026-03-06 00:42:59 +03:00
Peter Steinberger
1bd20dbdb6 fix(failover): treat stop reason error as timeout 2026-03-03 01:05:24 +00:00
Peter Steinberger
a2fdc3415f fix(failover): handle unhandled stop reason error 2026-03-03 01:05:24 +00:00
Charles Dusek
92199ac129 fix(agents): unblock gpt-5.3-codex API-key routing and replay (#31083)
* fix(agents): unblock gpt-5.3-codex API-key replay path

* fix(agents): scope OpenAI replay ID rewrites per turn

* test: fix nodes-tool mock typing and reformat telegram accounts
2026-03-02 03:45:12 +00:00
Ayane
5b562e96cb test: add missing ENETRESET test case 2026-03-02 02:08:27 +00:00
Ayane
76ed274aad fix(agents): trigger model failover on connection-refused and network-unreachable errors
Previously, only ETIMEDOUT / ESOCKETTIMEDOUT / ECONNRESET / ECONNABORTED
were recognised as failover-worthy network errors. Connection-level
failures such as ECONNREFUSED (server down), ENETUNREACH / EHOSTUNREACH
(network disconnected), ENETRESET, and EAI_AGAIN (DNS failure) were
treated as unknown errors and did not advance the fallback chain.

This is particularly impactful when a local fallback model (e.g. Ollama)
is configured: if the remote provider is unreachable due to a network
outage, the gateway should fall back to the local model instead of
returning an error to the user.

Add the missing error codes to resolveFailoverReasonFromError() and
corresponding e2e tests.

Closes #18868
2026-03-02 02:08:27 +00:00
Ramez
acbb93be48 fix(agents): comprehensive quota fallback fixes - session overrides + surgical cooldown logic (#23816)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: e6f2b4742b82b9fe44a7e103170c2f96565b09c5
Co-authored-by: ramezgaberiel <844893+ramezgaberiel@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-25 20:35:40 -05:00
Sid
156f13aa64 fix(agents): continue fallback loop for unrecognized provider errors (#26106)
* fix(agents): continue fallback loop for unrecognized provider errors

When a provider returns an error that coerceToFailoverError cannot
classify (e.g., custom error messages without standard HTTP status
codes), the fallback loop threw immediately instead of trying the
next candidate. This caused fallback to stop after 2 models even
when 17 were configured.

Only rethrow unrecognized errors when they occur on the last
candidate. For intermediate candidates, record the error as an
attempt and continue to the next model.

Closes #25926

Co-authored-by: Cursor <cursoragent@cursor.com>

* test: cover unknown-error fallback telemetry and land #26106 (thanks @Sid-Qin)

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-25 04:53:26 +00:00
Peter Steinberger
d2597d5ecf fix(agents): harden model fallback failover paths 2026-02-25 03:46:34 +00:00
Peter Steinberger
bf5a96ad63 fix(agents): keep fallback chain reachable on configured fallback models (#25922) 2026-02-25 01:46:20 +00:00
Vincent Koc
06f0b4a193 Tests: keep OpenRouter runnable with legacy cooldown markers 2026-02-24 19:12:08 -05:00
Peter Steinberger
1c753ea786 test: dedupe fixtures and test harness setup 2026-02-23 05:45:54 +00:00
Peter Steinberger
48f327c206 test: consolidate redundant suites and speed attachment tests 2026-02-23 04:55:43 +00:00
Vignesh Natarajan
5c7c37a02a Agents: infer auth-profile unavailable failover reason 2026-02-22 16:10:32 -08:00
Peter Steinberger
3c75bc0e41 refactor(test): dedupe agent and discord test fixtures 2026-02-22 20:04:51 +00:00
Peter Steinberger
e441390fd1 test: reclassify agent local suites out of e2e 2026-02-22 11:16:37 +00:00
Peter Steinberger
9131b22a28 test: migrate suites to e2e coverage layout 2026-02-13 14:28:22 +00:00
Rodrigo Uroz
b912d3992d (fix): handle Cloudflare 521 and transient 5xx errors gracefully (#13500)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: a8347e95c55c6244bbf2e9066c8bf77bf62de6c9
Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: Takhoffman <781889+Takhoffman@users.noreply.github.com>
Reviewed-by: @Takhoffman
2026-02-11 21:42:33 -06:00
cpojer
f06dd8df06 chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
cpojer
5ceff756e1 chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
Peter Steinberger
9a7160786a refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
Peter Steinberger
6d16a658e5 refactor: rename clawdbot to moltbot with legacy compat 2026-01-27 12:21:02 +00:00
Gustavo Madeira Santana
959ddae612 Agents: finish cooldowned provider skip (#2534)
* Agents: skip cooldowned providers in fallback

* fix: skip cooldowned providers during model failover (#2143) (thanks @YiWang24)
2026-01-26 22:05:31 -05:00
Peter Steinberger
aa2a1a17e3 test(auth): update auth profile coverage 2026-01-26 19:05:00 +00:00
Luke
be1cdc9370 fix(agents): treat provider request-aborted as timeout for fallback (#1576)
* fix(agents): treat request-aborted as timeout for fallback

* test(e2e): add provider timeout fallback
2026-01-24 11:27:24 +00:00
Peter Steinberger
21370fc09b fix: allow fallback on timeout aborts
Co-authored-by: Larus Ivarsson <larusivar@gmail.com>
2026-01-20 19:23:13 +00:00
Peter Steinberger
ec27c813cc fix(fallback): handle timeout aborts
Co-authored-by: Mykyta Bozhenko <21245729+cheeeee@users.noreply.github.com>
2026-01-18 07:52:44 +00:00
Roshan Singh
0f27cff247 Fix model fallback crash on undefined provider/model (#946) 2026-01-15 16:50:46 +00:00
Peter Steinberger
c379191f80 chore: migrate to oxlint and oxfmt
Co-authored-by: Christoph Nakazawa <christoph.pojer@gmail.com>
2026-01-14 15:02:19 +00:00
Peter Steinberger
3368284b2a fix: per-agent model fallbacks (#583) (thanks @mitschabaude-bot) 2026-01-13 06:50:41 +00:00
Gregor's Bot
9c0c4f50ec Agents: test per-agent model fallbacks override 2026-01-13 06:50:20 +00:00
Gregor's Bot
6729637f61 Config: support per-agent model fallbacks 2026-01-13 06:50:20 +00:00
Peter Steinberger
32115a8b98 test: expand auth fallback coverage 2026-01-13 04:12:16 +00:00
Peter Steinberger
2c2ca7f03b fix: treat credential validation errors as auth errors (#822) (thanks @sebslight) 2026-01-13 04:02:47 +00:00
Sebastian
c4014c0092 fix: treat credential validation failures as auth errors for fallback (#761) 2026-01-12 22:53:21 -05:00
Peter Steinberger
374aa856f2 refactor(agents): centralize failover handling 2026-01-09 21:31:18 +01:00
Peter Steinberger
65cb9dc3f7 fix(agents): fail over on billing/credits errors 2026-01-09 21:17:07 +01:00
Peter Steinberger
5755d85ad1 fix: harden Gmail hook model defaults (#472) (thanks @koala73) 2026-01-09 19:59:45 +01:00