Skip to content

Fix Cache.clean_expired purging still-valid entries (mtime != expiry)#69591

Open
cstrahan wants to merge 1 commit into
saltstack:masterfrom
cstrahan:cstrahan/fix-cache-clean-expired-mtime
Open

Fix Cache.clean_expired purging still-valid entries (mtime != expiry)#69591
cstrahan wants to merge 1 commit into
saltstack:masterfrom
cstrahan:cstrahan/fix-cache-clean-expired-mtime

Conversation

@cstrahan

Copy link
Copy Markdown

What does this PR do?

Fixes eauth tokens being evicted within seconds of issuance on 3008, regardless of token_expire.

The generic Cache.clean_expired() fallback (used by the localfs driver, which has no clean_expired() of its own) treated the driver's updated() value — the file mtime — as the entry's expiry. A file's mtime is always <= now, so every entry was flushed on every sweep. The master's periodic token maintenance (clean_expired_tokenscache.clean_expired("tokens")) therefore deleted all eauth tokens on each pass.

This PR makes clean_expired() consult each entry's stored _expires envelope instead of the mtime (matching how Cache.fetch() already detects that envelope), removing only genuinely-expired entries. It also fixes LoadAuth.mk_token passing the absolute expiry timestamp where Cache.store expects a duration in seconds.

Both defects were introduced together in 3008.0 by 78685e5c44 ("Provide token storage using the salt.cache interface"), which moved token storage onto the generic salt.cache interface. 3006.x/3007.x are unaffected (they used the localfs token driver, which never hit this path).

What issues does this PR fix or reference?

Fixes #69590

Previous Behavior

A freshly-minted token worked for the first request or two, then failed with Authentication error occurred. / HTTP 401 (TokenExpiredError in the master log). Tokens lived only until the next maintenance sweep (0–~60s), not token_expire (default 12h). This broke long-running salt-api operations — notably batch jobs, whose mid-run saltutil.find_job re-auth happened after the token was already swept.

New Behavior

Tokens remain valid until their real expiry (token_expire); only genuinely-expired cache entries are purged, and entries with no expiry envelope are left untouched.

Merge requirements satisfied?

[NOTICE] Bug fixes or features added to Salt require tests.

  • Docs
  • Changelog
  • Tests written/updated

Commits signed with GPG?

No

The generic Cache.clean_expired() fallback (used by the localfs driver,
which has no clean_expired() of its own) treated the driver's updated()
value -- the file mtime -- as the entry's expiry:

    ts = updated(bank, key)         # localfs: file mtime (always in the past)
    if ts is not None and ts <= time.time():
        flush(bank, key)

Because a file's mtime is always <= now, every entry was flushed on every
sweep. The master's periodic token maintenance (clean_expired_tokens ->
cache.clean_expired("tokens")) therefore deleted ALL eauth tokens on each
pass, regardless of their real expiry. A token thus lived only until the
next maintenance sweep (0-~60s), not token_expire (default 12h).

This broke long-running salt-api operations: a normal request uses its
token within ~1s, but a batch job runs gather + waits its per-chunk
timeout (default 60s) before issuing saltutil.find_job, by which point the
token had been swept -> "Authentication error occurred." -> HTTP 401
("Authentication denied" in pepper).

clean_expired() now consults each entry's stored _expires instead of the
mtime, removing only genuinely-expired entries and leaving entries without
an expiry envelope untouched.

Also fix LoadAuth.mk_token passing the absolute expiry timestamp where
Cache.store expects a duration (seconds); it placed _expires ~2x now in the
future so token entries were never cleaned even once the above is fixed.

Both defects were introduced together in 3008.0 by commit 78685e5
("Provide token storage using the salt.cache interface"), which moved
eauth token storage off the localfs token driver and onto the generic
salt.cache interface. That commit added the clean_expired() fallback (with
the mtime bug) and the mk_token cache.store(..., expires=...) call (with the
duration bug) in one go. Token eviction is therefore a regression: 3006.x
and 3007.x are unaffected (they stored tokens via the localfs token driver,
which never hit this path).

Introduced-by: 78685e5 ("Provide token storage using the salt.cache interface")
Fixes saltstack#69590
@cstrahan cstrahan requested a review from a team as a code owner June 28, 2026 00:07
@dwoz dwoz added the test:full Run the full test suite label Jun 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test:full Run the full test suite

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: eauth tokens evicted within seconds on 3008 (token_expire ignored)

2 participants