fix(schedules): persist explicit DTSTART on RRule schedules at write time
closes #21362
`RRuleSchedule.to_rrule()` historically fell back to a hardcoded
`DEFAULT_ANCHOR_DATE = 2020-01-01` whenever the persisted rrule string
lacked an explicit `DTSTART`. Because dateutil's `xafter()` is O(n) in
the number of occurrences between `dtstart` and the query time, a
`FREQ=MINUTELY;INTERVAL=5` schedule walks ~660k occurrences from 2020
forward on every scheduler loop. With ~20-30 such deployments the
scheduler saturates a CPU core, which is the symptom described in the
original report.
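
The ~660k figure can be checked with plain datetime arithmetic. The sketch below is illustrative only: `occurrences_walked` is a hypothetical helper, not part of dateutil or this codebase, and the query time of 2026-04-06 is an assumption.

```python
from datetime import datetime, timedelta


def occurrences_walked(dtstart: datetime, now: datetime, interval: timedelta) -> int:
    """Count the occurrences a linear walk visits from dtstart up to now.

    Hypothetical helper for illustration; it models the O(n) cost, it does
    not call dateutil.
    """
    return (now - dtstart) // interval + 1


interval = timedelta(minutes=5)   # FREQ=MINUTELY;INTERVAL=5
now = datetime(2026, 4, 6)        # assumed query time

# Implicit legacy anchor: every occurrence since 2020 is walked.
print(occurrences_walked(datetime(2020, 1, 1), now, interval))  # 658657

# Explicit anchor one day back: only a few hundred occurrences.
print(occurrences_walked(datetime(2026, 4, 5), now, interval))  # 289
```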
The previous attempt on this branch — process-level caching of the
parsed rrule object — fixed the CPU symptom but traded it for ~37 MB
of retained memory per high-frequency rule, which is the wrong tradeoff
on small containers. This commit replaces that approach.
The right fix is to make `DTSTART` explicit on every persisted rrule:
1. **`normalize_rrule_string`** in `_internal/schemas/validators.py`
inspects an incoming rrule and either picks a phase-equivalent
recent anchor (for `FREQ=SECONDLY` / `FREQ=MINUTELY` without
`COUNT`) or injects the legacy `DTSTART:20200101T000000` (every
other shape, including `INTERVAL>1`, `COUNT=N`, `BY*` calendar
rules, and rrulesets). The recent-anchor case is provably
occurrence-set-equivalent forward of the new dtstart, and shrinks
dateutil's working set from millions of cached datetimes to ~tens.
The legacy case is semantically equivalent to the pre-fix
implicit-anchor parsing.
2. **`DeploymentScheduleCreate` / `DeploymentScheduleUpdate`** action
schemas (both server and client) gain a field validator that runs
the normalization on incoming `RRuleSchedule` payloads.
`DeploymentCreate` and `DeploymentUpdate` inherit it transitively
through their inline `schedules` lists. Crucially, the validator
lives on the **action** schemas (write path), not on `RRuleSchedule`
itself — if it lived there, every DB row deserialized into an
`RRuleSchedule` would get a fresh anchor injected on every load,
which would change daily and re-phase `INTERVAL>1` schedules. This
is the same drift bug the earlier draft PR #21361 hit.
3. **Alembic data migration** (paired SQLite + PostgreSQL) walks every
row in `deployment_schedule` and injects `DTSTART` for any RRule
that lacks one, using the same `normalize_rrule_string` helper. New
deployments arrive pre-normalized via the action schemas; existing
rows are backfilled once at upgrade.
4. **`DEFAULT_ANCHOR_DATE`** stays in both client and server schedule
modules as a defensive fallback for any rule that somehow still
reaches `to_rrule()` without a `DTSTART` (old clients on the wire,
YAML, test fixtures, mid-upgrade rows). It is no longer the
load-bearing path for the scheduler.
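
The recent-anchor idea in step 1 can be sketched as follows. This is not the actual `normalize_rrule_string` implementation: `phase_equivalent_anchor` and `to_dtstart_line` are hypothetical names, and the sketch covers only the grid arithmetic for a fixed-interval rule, not the decision about which rule shapes qualify.

```python
from datetime import datetime, timedelta

LEGACY_ANCHOR = datetime(2020, 1, 1)  # the implicit anchor described above


def phase_equivalent_anchor(
    now: datetime, interval: timedelta, legacy: datetime = LEGACY_ANCHOR
) -> datetime:
    """Return the latest occurrence of the legacy grid at or before `now`.

    Because the result is itself a point on the legacy grid, every
    occurrence after it is unchanged (hypothetical helper).
    """
    steps = (now - legacy) // interval
    return legacy + steps * interval


def to_dtstart_line(anchor: datetime) -> str:
    """Format an anchor in the DTSTART style used in this message."""
    return anchor.strftime("DTSTART:%Y%m%dT%H%M%S")


now = datetime(2026, 4, 6, 12, 3, 30)  # assumed write time
anchor = phase_equivalent_anchor(now, timedelta(minutes=5))
print(to_dtstart_line(anchor))  # DTSTART:20260406T120000
```

Computing the anchor once at write time and persisting it is what keeps the phase stable; recomputing it on every load would move the anchor and is the drift described in step 2.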
Tests cover: the helper directly (22 cases including phase-preservation
across all relevant rrule shapes and a memory smoke check), action-schema
integration on both server and client, and the
"deserialization-doesn't-mutate" invariant on both sides. Existing tests
that asserted exact rrule string equality on values flowing through
`DeploymentScheduleCreate` were updated to use `.endswith()` on the
user-supplied portion (they were silently asserting an implementation
detail).
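
A minimal sketch of the two assertion styles mentioned above. It assumes the normalized string is the `DTSTART` line prepended to the user-supplied rule; `inject_dtstart` is a hypothetical stand-in, not the real helper.

```python
LEGACY_DTSTART = "DTSTART:20200101T000000"


def inject_dtstart(rrule: str, dtstart_line: str = LEGACY_DTSTART) -> str:
    """Hypothetical stand-in: prepend DTSTART unless one is already present."""
    if "DTSTART" in rrule.upper():
        return rrule
    return f"{dtstart_line}\n{rrule}"


user_rule = "FREQ=DAILY;BYHOUR=9"
stored = inject_dtstart(user_rule)

# Assert on the user-supplied portion, not on exact string equality.
assert stored.endswith(user_rule)

# Re-processing an already-normalized value must leave it unchanged.
assert inject_dtstart(stored) == stored
```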
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>