
Improve diskann-disk test coverage #1193

Open
arrayka wants to merge 6 commits into main from u/arrayka/diskann-disk-tests3

Conversation

@arrayka arrayka commented Jun 20, 2026

Contributor

Adds 18 targeted unit tests to the diskann-disk crate covering previously
untested error paths, control flow branches, and pure functions.

Files with missing lines Coverage Δ
.../checkpoint/checkpoint_record_manager_with_file.rs 96.20% <100.00%> (+1.81%)
diskann-disk/src/search/pq/pq_scratch.rs 100.00% <100.00%> (+10.00%)
...kann-disk/src/search/provider/disk_sector_graph.rs 97.70% <100.00%> (+0.75%)
diskann-disk/src/storage/quant/generator.rs 94.31% <100.00%> (+1.64%)
diskann-disk/src/utils/kmeans.rs 96.61% <100.00%> (+5.62%)
diskann-disk/src/utils/partition.rs 92.85% <100.00%> (+0.34%)
...kann-disk/src/build/chunking/continuation/utils.rs 93.90% <97.87%> (+3.61%)

Add 18 targeted unit tests covering previously untested error paths,
control flow branches, and pure functions across 7 files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

codecov-commenter commented Jun 20, 2026


Codecov Report

❌ Patch coverage is 99.31034% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.03%. Comparing base (999fa5d) to head (31bb404).
⚠️ Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
...kann-disk/src/build/chunking/continuation/utils.rs 97.87% 2 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1193      +/-   ##
==========================================
+ Coverage   89.95%   90.03%   +0.07%     
==========================================
  Files         489      489              
  Lines       93127    93417     +290     
==========================================
+ Hits        83772    84105     +333     
+ Misses       9355     9312      -43     
Flag Coverage Δ
miri 90.03% <99.31%> (+0.07%) ⬆️
unittests 89.68% <99.31%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
.../checkpoint/checkpoint_record_manager_with_file.rs 96.20% <100.00%> (+1.81%) ⬆️
diskann-disk/src/search/pq/pq_scratch.rs 100.00% <100.00%> (+10.00%) ⬆️
...kann-disk/src/search/provider/disk_sector_graph.rs 97.70% <100.00%> (+0.75%) ⬆️
diskann-disk/src/storage/quant/generator.rs 94.33% <100.00%> (+1.65%) ⬆️
diskann-disk/src/utils/kmeans.rs 96.61% <100.00%> (+5.62%) ⬆️
diskann-disk/src/utils/partition.rs 92.85% <100.00%> (+0.34%) ⬆️
...kann-disk/src/build/chunking/continuation/utils.rs 93.90% <97.87%> (+3.61%) ⬆️

... and 2 files with indirect coverage changes
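
The headline percentages in the coverage diff above follow from dividing hits by total lines. A minimal sketch of that arithmetic, using a hypothetical helper name (`coverage_percent` is not part of Codecov or the crate):

```rust
/// Hypothetical helper: coverage as a percentage of lines that were hit.
fn coverage_percent(hits: u64, lines: u64) -> f64 {
    hits as f64 / lines as f64 * 100.0
}

/// Figures copied from the diff above: (hits, misses, lines) for base and head.
const BASE: (u64, u64, u64) = (83_772, 9_355, 93_127);
const HEAD: (u64, u64, u64) = (84_105, 9_312, 93_417);

/// Percentage-point change between base and head, before any rounding.
fn coverage_delta() -> f64 {
    coverage_percent(HEAD.0, HEAD.2) - coverage_percent(BASE.0, BASE.2)
}
```

With these inputs the base works out to roughly 89.95% and the head to roughly 90.03%. The unrounded difference is a little under 0.08 percentage points, which the report displays as +0.07%.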

@arrayka arrayka marked this pull request as ready for review June 20, 2026 01:09
@arrayka arrayka requested review from a team and Copilot June 20, 2026 01:09
@arrayka arrayka enabled auto-merge (squash) June 20, 2026 01:09
@arrayka arrayka changed the title from "Improve diskann-disk test coverage: 94.4% → 95.0%" to "Improve diskann-disk test coverage" Jun 20, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR increases unit test coverage in the diskann-disk crate by adding targeted tests for previously untested branches and error paths across utilities, PQ scratch handling, sector graph reconfiguration, quantization generator validation, continuation helpers, and checkpoint record management.

Changes:

  • Add unit tests for partition-count estimation edge cases (clamping, odd rounding, k_base multiplier).
  • Add tests covering error-path and basic-success behavior in k-means clustering, PQScratch query validation, quant generator parameter validation, sector graph reconfiguration/defaulting, continuation processing behavior, and checkpoint invalidation.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
diskann-disk/src/utils/partition.rs Adds tests for estimate_initial_partition_count edge cases.
diskann-disk/src/utils/kmeans.rs Adds tests for successful clustering output shape and cancellation error path.
diskann-disk/src/storage/quant/generator.rs Adds test for missing compressed-file validation when resuming with offset.
diskann-disk/src/search/provider/disk_sector_graph.rs Adds tests for reconfigure growth/no-op and block-size defaulting.
diskann-disk/src/search/pq/pq_scratch.rs Adds tests for rejecting undersized queries and accepting oversized queries.
diskann-disk/src/build/chunking/continuation/utils.rs Adds tests for early stop, yield-then-continue, and error propagation (sync/async).
diskann-disk/src/build/chunking/checkpoint/checkpoint_record_manager_with_file.rs Adds tests for fresh-state behavior, missing-file completion state, and invalidation semantics.

Comment thread diskann-disk/src/search/pq/pq_scratch.rs
Comment thread diskann-disk/src/storage/quant/generator.rs Outdated
Comment thread diskann-disk/src/build/chunking/continuation/utils.rs
Comment on lines +144 to +156
    #[test]
    fn test_pq_scratch_set_accepts_oversized_query() {
        let dim = 8;
        let mut pq_scratch = PQScratch::new(64, dim, 4, 256).unwrap();

        // Query longer than dim should succeed (only first `dim` elements used)
        let long_query: Vec<f32> = (1..=dim + 10).map(|i| i as f32).collect();
        pq_scratch.set(&long_query).unwrap();

        for (i, &val) in long_query.iter().enumerate().take(dim) {
            assert_eq!(pq_scratch.query_scratch[i], val);
        }
    }

Contributor

Is this expected behavior? Shouldn't the scratch fail for an incorrectly sized query?

@arrayka arrayka Jun 26, 2026

Contributor Author

Yes, it is the expected behavior, according to the documentation of pq_scratch.set():

    /// Copy the first `dim` elements of `query` into `query_scratch`.
    ///
    /// `query` must already be in full-precision `f32` representation; quantized
    /// inputs (e.g. `MinMaxElement`) should be decoded via `VectorRepr::as_f32`
    /// at the caller boundary before invoking this method.
    ///
    /// Accepts oversized `query` (only the first `dim` elements are used) for
    /// backwards compatibility with callers that hold alignment-padded buffers.
    /// Returns `DimensionMismatchError` if `query.len() < query_scratch.len()`.
    pub fn set(&mut self, query: &[f32]) -> ANNResult<()>
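
For readers following the thread, the documented contract can be sketched in isolation. This is a simplified illustration under stated assumptions, not the crate's implementation: `Scratch` and `DimensionMismatch` are hypothetical stand-ins for `PQScratch` and `DimensionMismatchError`, and the real type holds more state.

```rust
/// Hypothetical stand-in for the crate's dimension-mismatch error.
#[derive(Debug, PartialEq)]
struct DimensionMismatch {
    expected: usize,
    actual: usize,
}

/// Hypothetical, stripped-down scratch buffer holding only the query copy.
struct Scratch {
    query_scratch: Vec<f32>,
}

impl Scratch {
    fn new(dim: usize) -> Self {
        Self { query_scratch: vec![0.0; dim] }
    }

    /// Copies the first `dim` elements of `query`.
    /// Oversized input is accepted; only undersized input is rejected.
    fn set(&mut self, query: &[f32]) -> Result<(), DimensionMismatch> {
        let dim = self.query_scratch.len();
        if query.len() < dim {
            return Err(DimensionMismatch { expected: dim, actual: query.len() });
        }
        // Elements past `dim` are silently ignored.
        self.query_scratch.copy_from_slice(&query[..dim]);
        Ok(())
    }
}
```

Under this contract a six-element query passed to a four-element scratch succeeds and the last two elements are dropped, which is the behavior the new test pins down.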

Contributor

I followed the chain of calls to set() and new(), and I don't think supporting a larger input query dimension is actually needed here (the scratch length is derived from the dimension of the fp vectors in the PQ table, which is the same as the query dimension).

Can we fix this since we've found it? I'm asking because this is actually incorrect behavior; I'm not sure how it ended up being supported (it's probably my fault!). The f32 dimension is not larger than the dimension of minmax vectors; it's actually smaller :)
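
One possible shape for the stricter check requested here, as a sketch only: `set_strict` and `DimensionMismatch` are hypothetical names, and the actual fix in the crate may look different.

```rust
/// Hypothetical stand-in for the crate's dimension-mismatch error.
#[derive(Debug, PartialEq)]
struct DimensionMismatch {
    expected: usize,
    actual: usize,
}

/// Copies `query` into `query_scratch` only when the lengths match exactly.
/// Both undersized and oversized queries are rejected.
fn set_strict(query_scratch: &mut [f32], query: &[f32]) -> Result<(), DimensionMismatch> {
    if query.len() != query_scratch.len() {
        return Err(DimensionMismatch {
            expected: query_scratch.len(),
            actual: query.len(),
        });
    }
    query_scratch.copy_from_slice(query);
    Ok(())
}
```

With an exact-length check, the oversized-query test in this PR would need to assert an error instead of success.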
