Fix intermittent installer Abort (exit 2) on salt-minion service re-registration #69579
Merged
dwoz approved these changes Jun 27, 2026
What does this PR do?
Silent installs intermittently failed with NSIS error level 2 -- a script-driven Abort, not the exit-path hang fixed previously. The only Section Abort reachable on the happy path is the salt-minion service registration: "ssm install salt-minion" (CreateService) returning non-zero.
The cause is an SCM race against the prior uninstall. The uninstaller's SimpleSC::RemoveService only *marks* the service for deletion; the SCM does not remove the HKLM\...\Services\salt-minion key until every open handle closes, which happens asynchronously after the uninstaller has exited. When the next install's CreateService races that pending delete, Windows returns ERROR_SERVICE_MARKED_FOR_DELETE (1072) or ERROR_SERVICE_EXISTS (1073), ssm exits non-zero, and the Section aborts. This is why it only fails occasionally and recovers on the next run -- it is purely a timing window between back-to-back uninstall/install.
Fix it at the point of failure: retry "ssm install" up to 5 times with a 2s pause instead of aborting on first failure. The condition is self-clearing once handles close, so the retry rides it out. This also covers real-world upgrades, where uninstall and reinstall are separate processes and nothing in the uninstaller can constrain a future installer invocation.
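The retry described above can be sketched in a language-neutral way. The following Python is an illustration only, not the NSIS code in this PR: `install_service_with_retry` and `run_install` are hypothetical names, and `run_install` stands in for whatever invokes "ssm install salt-minion" and returns its exit code. Whether the installer pauses after the final failed attempt is not stated here; this sketch assumes it does not.

```python
import time
from typing import Callable


def install_service_with_retry(
    run_install: Callable[[], int],
    attempts: int = 5,           # "up to 5 times" from the PR description
    pause_seconds: float = 2.0,  # "2s pause" from the PR description
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call run_install until it returns 0 or the attempts run out.

    Returns 0 on success, otherwise the exit code of the last attempt
    (or -1 if attempts is zero or negative and nothing was run).
    """
    exit_code = -1
    for attempt in range(1, attempts + 1):
        exit_code = run_install()
        if exit_code == 0:
            return 0
        # Assumption: no pause after the final failure, since nothing follows it.
        if attempt < attempts:
            sleep(pause_seconds)
    return exit_code
```

Because the pending delete clears on its own once the handles close, any non-zero exit is treated as retryable here; only after the last attempt fails would the caller abort.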
As defense-in-depth, also wait (bounded, 10s) in un.Uninstall for the SCM to actually remove the service key -- placed after the salt-minion.exe and ssm.exe taskkills, since the key cannot clear until those handles are gone. This narrows the race window for the common case but is not sufficient alone; the install-side retry remains the authoritative guard.
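The bounded wait on the uninstall side follows the usual poll-until-deadline pattern. This Python sketch is illustrative and is not the NSIS code in this PR: `wait_for_service_removal` and `service_key_exists` are hypothetical names, with `service_key_exists` standing in for a check that the service key is still present. The 10s bound comes from the description above; the 0.5s poll interval is an assumption.

```python
import time
from typing import Callable


def wait_for_service_removal(
    service_key_exists: Callable[[], bool],
    timeout_seconds: float = 10.0,  # bound stated in the PR description
    poll_seconds: float = 0.5,      # assumed poll interval, not from the PR
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the service key is gone or the timeout elapses.

    Returns True if the key disappeared, False if the wait timed out.
    """
    deadline = clock() + timeout_seconds
    while service_key_exists():
        if clock() >= deadline:
            return False  # give up; the install-side retry still covers this case
        sleep(poll_seconds)
    return True
```

A timeout here is not treated as an error, which matches the framing above: the wait only narrows the window, and the install-side retry remains the guard that has to hold.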
Merge requirements satisfied?
[NOTICE] Bug fixes or features added to Salt require tests.
Commits signed with GPG?
Yes