ci(test): enable pytest-xdist for cuopt Python tests #1486

Open
ramakrishnap-nv wants to merge 6 commits into main from feat/pytest-xdist-python-tests
@ramakrishnap-nv

Collaborator

Summary

  • Add pytest-xdist to test_python_common dependencies (propagated to conda envs and pyproject.toml files via dependency file generator)
  • Pass -n auto to the non-nightly cuopt pytest invocation in ci/run_cuopt_pytests.sh
  • Server tests (run_cuopt_server_pytests.sh) are unchanged — they bind a fixed port (localhost:18900) that would conflict across parallel workers

The script already changes directory into python/cuopt/cuopt, where xdist + coverage is known to work (noted in the existing comment on line 7).

Test plan

  • CI conda-python-tests job shows parallel test workers in the log
  • Test results and coverage report are complete (no missing tests)
  • conda-python-tests wall time is reduced compared to baseline
  • Server tests are unaffected

🤖 Generated with Claude Code

Add pytest-xdist to test_python_common dependencies and pass -n auto
to the non-nightly cuopt pytest invocation to parallelize tests across
available CPUs. Server tests are excluded due to fixed-port binding
(localhost:18900). The run_cuopt_pytests.sh script already cd-s into
python/cuopt/cuopt where xdist+coverage is known to work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Ramakrishna Prabhu <ramakrishnap@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Jun 26, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@ramakrishnap-nv ramakrishnap-nv self-assigned this Jun 29, 2026
@ramakrishnap-nv ramakrishnap-nv added the non-breaking (Introduces a non-breaking change) and improvement (Improves an existing functionality) labels Jun 29, 2026
@ramakrishnap-nv

Collaborator Author

/ok to test eb9a30f

@github-actions

github-actions Bot commented Jun 29, 2026


CI Test Summary

✅ All 31 test job(s) passed.

-n auto caused two failures:
- cudaErrorInvalidDevice: too many workers competing for the same GPU
- warmstart timeout: two workers starting gRPC servers on the same port

Switch to -n 2 to limit GPU contention. Fix the gRPC port conflict in
_start_grpc_server_fixture by adding the xdist worker ID (from
PYTEST_XDIST_WORKER) to the port, giving each worker a unique port
within the 100-unit gap between fixture classes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Ramakrishna Prabhu <ramakrishnap@nvidia.com>
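The port fix described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in _start_grpc_server_fixture: the function name `worker_port`, the constant `PORT_GAP`, and the base port 20000 in the usage note are hypothetical. The only details taken from the message are the PYTEST_XDIST_WORKER variable and the 100-unit gap between fixture classes; the sketch assumes worker IDs have the form gwN.

```python
import os
import re

# Assumption: xdist names its workers "gw0", "gw1", ... in PYTEST_XDIST_WORKER.
_WORKER_RE = re.compile(r"gw(\d+)")

# Gap between the base ports of neighbouring fixture classes (from the commit message).
PORT_GAP = 100


def worker_port(base_port, environ=None):
    """Return base_port shifted by the xdist worker index (hypothetical helper).

    Falls back to base_port when PYTEST_XDIST_WORKER is unset or not "gwN".
    """
    environ = os.environ if environ is None else environ
    match = _WORKER_RE.fullmatch(environ.get("PYTEST_XDIST_WORKER", ""))
    offset = int(match.group(1)) if match else 0
    if offset >= PORT_GAP:
        # A larger offset would run into the next fixture class's port range.
        raise ValueError(f"worker index {offset} does not fit in a gap of {PORT_GAP}")
    return base_port + offset
```

With a hypothetical base port of 20000, worker gw1 would get 20001, and a run without xdist would keep 20000. The guard only matters for very large worker counts, far above the -n 2 used here.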
@ramakrishnap-nv

Collaborator Author

/ok to test 10d42be

ramakrishnap-nv and others added 2 commits June 30, 2026 15:33
-n 2 passed cleanly; try -n 4 for more parallelism. The CI GPUs
(L4/24GB, H100/80GB, RTX Pro 6000/48GB) have enough VRAM for 4
concurrent CUDA contexts without contention.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Ramakrishna Prabhu <ramakrishnap@nvidia.com>

-n 4 is untested across all GPU types; stay conservative at -n 2
which is known clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Ramakrishna Prabhu <ramakrishnap@nvidia.com>
@ramakrishnap-nv ramakrishnap-nv marked this pull request as ready for review June 30, 2026 20:36
@ramakrishnap-nv ramakrishnap-nv requested review from a team as code owners June 30, 2026 20:36
@coderabbitai

coderabbitai Bot commented Jun 30, 2026


Walkthrough

This PR adds pytest-xdist to conda and pyproject test dependencies, runs non-nightly CI pytest jobs with -n 2, and updates cuOpt test server port selection to derive worker-specific ports from PYTEST_XDIST_WORKER.

Changes

pytest-xdist Parallel Test Execution

| Layer / File(s) | Summary |
| --- | --- |
| Add pytest-xdist dependency declarations: conda/environments/all_cuda-129_arch-aarch64.yaml, conda/environments/all_cuda-129_arch-x86_64.yaml, conda/environments/all_cuda-133_arch-aarch64.yaml, conda/environments/all_cuda-133_arch-x86_64.yaml, dependencies.yaml, python/cuopt/pyproject.toml, python/cuopt_self_hosted/pyproject.toml, python/cuopt_server/pyproject.toml | Adds pytest-xdist to conda environment dependency lists, the shared test_python_common package group, and test optional-dependencies in the cuopt, cuopt_self_hosted, and cuopt_server pyproject.toml files. |
| Enable parallel test execution and worker-aware ports: ci/run_cuopt_pytests.sh, ci/run_cuopt_server_pytests.sh, python/cuopt/cuopt/tests/linear_programming/test_cpu_only_execution.py, python/cuopt_server/cuopt_server/tests/utils/utils.py | Changes the non-nightly pytest commands in both CI scripts to use -n 2, adds worker-aware port calculation from PYTEST_XDIST_WORKER, and uses that port for the cuOpt server subprocess and default client port in tests. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • NVIDIA/cuopt#722: Both PRs modify the cuopt test-run CI scripts and change how pytest is invoked in CI, including pytest-xdist usage.

Suggested reviewers

  • jameslamb
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: enabling pytest-xdist for cuopt Python tests. |
| Description check | ✅ Passed | The description is on-topic and describes the xdist dependency and pytest parallelization changes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
Add -n 2 to run_cuopt_server_pytests.sh. Fix the port conflict by
introducing _worker_port() in utils.py, which offsets the base port
(5555) by the xdist worker ID (from PYTEST_XDIST_WORKER). Both the
cuoptproc fixture and RequestClient default port now use _worker_port()
so each worker gets its own server instance on a unique port. Falls
back to 5555 when not running under xdist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Ramakrishna Prabhu <ramakrishnap@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
python/cuopt_server/cuopt_server/tests/utils/utils.py (2)

20-23: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Duplicate worker-id parsing logic across test files.

The PYTEST_XDIST_WORKER → worker-id parsing here is essentially duplicated from python/cuopt/cuopt/tests/linear_programming/test_cpu_only_execution.py (_start_grpc_server_fixture). Consider extracting a shared helper (e.g., in a common test-utils module or xdist's own worker_id fixture/get_xdist_worker_id()) instead of reimplementing the gwN parsing in two places.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cuopt_server/cuopt_server/tests/utils/utils.py` around lines 20 - 23,
The worker-id parsing logic in _worker_port duplicates the same
PYTEST_XDIST_WORKER handling already present in _start_grpc_server_fixture, so
refactor both call sites to use a shared helper instead of re-parsing gwN in
multiple tests. Move the worker id extraction into a common test utility module
or switch both places to xdist’s built-in worker-id support (for example a
helper like get_xdist_worker_id() or the worker_id fixture), and update
_worker_port to consume that shared helper.
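One possible shape for the shared helper suggested in this comment is sketched below. The names `xdist_worker_index`, `server_port`, and `grpc_port` are hypothetical and do not appear in the PR; only the base port 5555 and the PYTEST_XDIST_WORKER variable come from the thread. The sketch parses the environment variable directly so it has no dependencies; pytest-xdist's `worker_id` fixture or `xdist.get_xdist_worker_id()` could supply the same "gwN" string instead.

```python
import os


def xdist_worker_index(environ=None):
    """Return N for xdist worker "gwN", or 0 outside xdist (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    worker = environ.get("PYTEST_XDIST_WORKER", "")
    suffix = worker[2:] if worker.startswith("gw") else ""
    return int(suffix) if suffix.isdecimal() else 0


def server_port(environ=None, base=5555):
    """Port for the cuOpt test server; 5555 is the base named in the commit."""
    return base + xdist_worker_index(environ)


def grpc_port(class_base, environ=None):
    """Port for a gRPC fixture class whose base port is class_base."""
    return class_base + xdist_worker_index(environ)
```

Both call sites then share one parser, so a change in how worker IDs are read only has to be made in one place.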

368-368: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Default port computed at import time, not call time.

port=_worker_port() is evaluated once when the class is defined (module import), unlike cuoptproc's use of _worker_port() at fixture-execution time. This happens to work today because PYTEST_XDIST_WORKER is set before worker-side test collection, but it's a fragile pattern relative to the runtime evaluation used elsewhere in this same file.

♻️ Suggested fix for lazy evaluation
-    def __init__(self, port=_worker_port()):
+    def __init__(self, port=None):
+        if port is None:
+            port = _worker_port()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cuopt_server/cuopt_server/tests/utils/utils.py` at line 368, The
default port for the class initializer is being computed too early because
__init__ uses port=_worker_port() at definition/import time instead of when an
instance is created. Update the initializer so _worker_port() is called lazily
inside the constructor path, matching the runtime evaluation pattern used by
cuoptproc in this file, and keep the default behavior unchanged for callers that
do not pass an explicit port.
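The difference between the two signatures can be seen in a small self-contained demonstration. Everything here is a stand-in: `_state` plays the role of PYTEST_XDIST_WORKER, and `EagerClient` and `LazyClient` are hypothetical classes, not the RequestClient from utils.py. It relies only on standard Python semantics: a default argument expression is evaluated once, when the `def` statement runs.

```python
# Hypothetical stand-in for the worker index that PYTEST_XDIST_WORKER would provide.
_state = {"worker": 0}


def _worker_port(base=5555):
    """Return the base port shifted by the current (stand-in) worker index."""
    return base + _state["worker"]


class EagerClient:
    # The default is computed once, while the class body is executed.
    def __init__(self, port=_worker_port()):
        self.port = port


class LazyClient:
    # The default is computed on every call that omits port.
    def __init__(self, port=None):
        self.port = _worker_port() if port is None else port


# Simulate the worker index becoming known only after the classes were defined.
_state["worker"] = 3

eager_port = EagerClient().port  # still 5555: frozen at definition time
lazy_port = LazyClient().port    # 5558: reflects the current worker index
```

An explicit `port` argument behaves the same in both classes; only the implicit default differs.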
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4d52f55b-597f-4924-be2f-21a9c38815d7

📥 Commits

Reviewing files that changed from the base of the PR and between 792b2f4 and ade01f2.

📒 Files selected for processing (2)
  • ci/run_cuopt_server_pytests.sh
  • python/cuopt_server/cuopt_server/tests/utils/utils.py
