feat: refactor InfiniCore CPU runtime to InfiniRT by spike-zhu · Pull Request #8 · InfiniTensor/InfiniRT

spike-zhu · 2026-06-24T02:10:52Z

Summary

Extends scripts/generate_public_headers.py so generated public runtime dispatch only emits functions supported by the enabled backend runtime headers.
Adds CPU runtime support for host allocation, memory info, stream, event, and async API entries in src/native/cpu/runtime_.h.
Keeps the CPU runtime API aligned with the generated C++ InfiniRT runtime surface used by downstream InfiniCore work.

Motivation

This prepares InfiniRT CPU runtime coverage for replacing InfiniCore runtime calls with InfiniRT runtime APIs.

Related: InfiniTensor/InfiniCore#1342

Type of Change

feat - new feature / new operator / new platform
fix - bug fix
perf - performance improvement (no behavioral change)
refactor - code restructuring without behavior change
test - adding or fixing tests only
docs - documentation only
build / ci - build system or CI configuration
chore - tooling, formatting, or other non-code changes
Breaking change (requires a ! in the Conventional Commits prefix or a BREAKING CHANGE: footer)

Platforms Affected

Smoke Test Result

Not rerun while updating this PR description.

Current PR checks:
- ruff: passed
- clang-format: passed

Previous manual InfiniCore single-op validation evidence from the original PR description:
- https://github.com/user-attachments/assets/c5642073-3ce5-43c0-8002-11e0c44ac0bb
- https://github.com/user-attachments/assets/076a3af7-738a-43b3-a90d-80fc29221aee

Test Results on Supported Platforms

Platform	Affected	Build / Smoke Result	Full Result / Notes
NVIDIA	No	N/A - not affected	N/A
Iluvatar	No	N/A - not affected	N/A
MetaX	No	N/A - not affected	N/A
Cambricon	No	N/A - not affected	N/A
Moore	No	N/A - not affected	N/A
Ascend	No	N/A - not affected	N/A
CPU	Yes	CI format checks passed; smoke not rerun in this update	Related downstream validation is in InfiniCore#1342

Full `pytest` output (optional)

N/A

Benchmark / Performance Impact

N/A. This PR changes runtime API coverage and dispatch generation, not performance-sensitive kernels.

Notes for Reviewers

The CPU async memory/copy entries return an error where CPU has no asynchronous implementation.
MemGetInfo reports values read from /proc/meminfo on non-Windows platforms and returns an invalid-value error if that information is unavailable.
This PR is intended to be reviewed together with InfiniCore#1342.
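
As a rough illustration of the MemGetInfo note above, the sketch below parses text in /proc/meminfo format and converts the kB figures to bytes. It is not the code from this PR: the helper names `ParseMemInfo` and `ReadHostMemInfo` are hypothetical, and using `MemAvailable` as the "free" figure is an assumption (the PR description does not say which fields are read). A `false` return stands in for the invalid-value error path.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the PR): extracts MemTotal and MemAvailable,
// both reported in kB, from /proc/meminfo-style text and converts to bytes.
// Returns false if either field is missing or an output pointer is null.
inline bool ParseMemInfo(std::istream& in, std::size_t* free_bytes,
                         std::size_t* total_bytes) {
  if (free_bytes == nullptr || total_bytes == nullptr) return false;

  bool has_total = false;
  bool has_available = false;
  unsigned long long total_kib = 0;
  unsigned long long available_kib = 0;

  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string key;
    unsigned long long kib = 0;
    if (!(fields >> key >> kib)) continue;  // Skip lines without "Key: N".
    if (key == "MemTotal:") {
      total_kib = kib;
      has_total = true;
    } else if (key == "MemAvailable:") {  // Assumption: used as "free".
      available_kib = kib;
      has_available = true;
    }
  }

  if (!has_total || !has_available) return false;
  // Outputs are written only when both fields were found.
  *total_bytes = static_cast<std::size_t>(total_kib) * 1024;
  *free_bytes = static_cast<std::size_t>(available_kib) * 1024;
  return true;
}

// Hypothetical wrapper: reads the real file on Linux-like systems. A caller
// could map a false return to its invalid-value error code.
inline bool ReadHostMemInfo(std::size_t* free_bytes,
                            std::size_t* total_bytes) {
  std::ifstream meminfo("/proc/meminfo");
  return meminfo.is_open() && ParseMemInfo(meminfo, free_bytes, total_bytes);
}
```

Keeping the parser separate from the file access lets it be exercised with an in-memory stream on platforms where /proc/meminfo does not exist.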

spike-zhu · 2026-06-25T13:27:14Z

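@voltjia Jiacheng, could you please check whether the overall approach for integrating the InfiniRT CPU runtime into InfiniCore looks right after these changes? I will polish the details afterwards. Thanks!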

voltjia · 2026-06-26T06:13:00Z

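> @voltjia Jiacheng, could you please check whether the overall approach for integrating the InfiniRT CPU runtime into InfiniCore looks right after these changes? I will polish the details afterwards. Thanks!

I took a look and it is basically fine. One point: a principle of this refactor is to reuse the CUDA Runtime API interfaces as much as possible. In other words, for some interfaces we need to check whether they exist in the CUDA Toolkit. For example, I could not find anything related to GetDeviceResourceSnapshot (I may have missed it). For the interfaces that are not in the CUDA Toolkit, we can put together a table and then decide whether they really need to be migrated into the new InfiniRT. Besides the interface names, the parameter lists need to be checked as well. Nothing else looks problematic for now.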

spike-zhu · 2026-06-29T01:32:10Z

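> > @voltjia Jiacheng, could you please check whether the overall approach for integrating the InfiniRT CPU runtime into InfiniCore looks right after these changes? I will polish the details afterwards. Thanks!

> I took a look and it is basically fine. One point: a principle of this refactor is to reuse the CUDA Runtime API interfaces as much as possible. In other words, for some interfaces we need to check whether they exist in the CUDA Toolkit. For example, I could not find anything related to GetDeviceResourceSnapshot (I may have missed it). For the interfaces that are not in the CUDA Toolkit, we can put together a table and then decide whether they really need to be migrated into the new InfiniRT. Besides the interface names, the parameter lists need to be checked as well. Nothing else looks problematic for now.

OK, I will also survey the CUDA Runtime API interfaces and list them out.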

spike-zhu · 2026-06-29T03:23:07Z

InfiniCore CPU Runtime API Migration Decision Table

CUDA API source: cuda_runtime_api.h, CUDA Toolkit 12.9.

InfiniCore API name	InfiniRT name	Corresponding CUDA API	CUDA API signature	Migrated to InfiniRT?
`setDevice(int device_id)`	`infini::rt::SetDevice(Device device)`	`cudaSetDevice`	`cudaError_t cudaSetDevice(int device);`	Migrated
`getDevice(...)` / `infinirtGetDevice(...)`	`infini::rt::GetDevice(Device* device)`	`cudaGetDevice`	`cudaError_t cudaGetDevice(int *device);`	Migrated
`getDeviceCount(int *count)`	`infini::rt::GetDeviceCount(int* count, Device::Type type)`	`cudaGetDeviceCount`	`cudaError_t cudaGetDeviceCount(int *count);`	Migrated; InfiniRT adds `Device::Type`
`deviceSynchronize()`	`infini::rt::DeviceSynchronize()`	`cudaDeviceSynchronize`	`cudaError_t cudaDeviceSynchronize(void);`	Migrated
`mallocDevice(void **p_ptr, size_t size)`	`infini::rt::Malloc(void** ptr, std::size_t size)`	`cudaMalloc`	`cudaError_t cudaMalloc(void **devPtr, size_t size);`	Migrated
`freeDevice(void *ptr)`	`infini::rt::Free(void* ptr)`	`cudaFree`	`cudaError_t cudaFree(void *devPtr);`	Migrated
`memcpy(void *dst, const void *src, size_t size, infinirtMemcpyKind_t kind)`	`infini::rt::Memcpy(void* dst, const void* src, std::size_t count, MemcpyKind kind)`	`cudaMemcpy`	`cudaError_t cudaMemcpy(void *dst, const void *src, size_t count, enum cudaMemcpyKind kind);`	Migrated
`memsetDevice(void *ptr, int value, size_t count)`	`infini::rt::Memset(void* ptr, int value, std::size_t count)`	`cudaMemset`	`cudaError_t cudaMemset(void *devPtr, int value, size_t count);`	Migrated
`mallocHost(void **p_ptr, size_t size)`	`infini::rt::MallocHost(void** ptr, std::size_t size)`	`cudaMallocHost`	`cudaError_t cudaMallocHost(void **ptr, size_t size);`	Migrated
`freeHost(void *ptr)`	`infini::rt::FreeHost(void* ptr)`	`cudaFreeHost`	`cudaError_t cudaFreeHost(void *ptr);`	Migrated
`memcpyAsync(void *dst, const void *src, size_t size, infinirtMemcpyKind_t kind, infinirtStream_t stream)`	`infini::rt::MemcpyAsync(void* dst, const void* src, std::size_t count, MemcpyKind kind, void* stream)`	`cudaMemcpyAsync`	`cudaError_t cudaMemcpyAsync(void *dst, const void *src, size_t count, enum cudaMemcpyKind kind, cudaStream_t stream);`	Migrated
`memsetDeviceAsync(void *ptr, int value, size_t count, infinirtStream_t stream)`	`infini::rt::MemsetAsync(void* ptr, int value, std::size_t count, void* stream)`	`cudaMemsetAsync`	`cudaError_t cudaMemsetAsync(void *devPtr, int value, size_t count, cudaStream_t stream);`	Migrated
`mallocAsync(void **p_ptr, size_t size, infinirtStream_t stream)`	`infini::rt::MallocAsync(void** ptr, std::size_t size, void* stream)`	`cudaMallocAsync`	`cudaError_t cudaMallocAsync(void **devPtr, size_t size, cudaStream_t hStream);`	Migrated
`freeAsync(void *ptr, infinirtStream_t stream)`	`infini::rt::FreeAsync(void* ptr, void* stream)`	`cudaFreeAsync`	`cudaError_t cudaFreeAsync(void *devPtr, cudaStream_t hStream);`	Migrated
`streamCreate(infinirtStream_t *stream_ptr)`	`infini::rt::StreamCreate(void** stream)`	`cudaStreamCreate`	`cudaError_t cudaStreamCreate(cudaStream_t *pStream);`	Migrated
`streamDestroy(infinirtStream_t stream)`	`infini::rt::StreamDestroy(void* stream)`	`cudaStreamDestroy`	`cudaError_t cudaStreamDestroy(cudaStream_t stream);`	Migrated
`streamSynchronize(infinirtStream_t stream)`	`infini::rt::StreamSynchronize(void* stream)`	`cudaStreamSynchronize`	`cudaError_t cudaStreamSynchronize(cudaStream_t stream);`	Migrated
`streamWaitEvent(infinirtStream_t stream, infinirtEvent_t event)`	`infini::rt::StreamWaitEvent(void* stream, void* event)`	`cudaStreamWaitEvent`	`cudaError_t cudaStreamWaitEvent(cudaStream_t stream, cudaEvent_t event, unsigned int flags);`	Migrated, but InfiniRT currently lacks the `flags` parameter
`eventCreate(infinirtEvent_t *event_ptr)`	`infini::rt::EventCreate(void** event)`	`cudaEventCreate`	`cudaError_t cudaEventCreate(cudaEvent_t *event);`	Migrated
`eventCreateWithFlags(infinirtEvent_t *event_ptr, uint32_t flags)`	`infini::rt::EventCreateWithFlags(void** event, uint32_t flags)`	`cudaEventCreateWithFlags`	`cudaError_t cudaEventCreateWithFlags(cudaEvent_t *event, unsigned int flags);`	Migrated
`eventRecord(infinirtEvent_t event, infinirtStream_t stream)`	`infini::rt::EventRecord(void* event, void* stream)`	`cudaEventRecord`	`cudaError_t cudaEventRecord(cudaEvent_t event, cudaStream_t stream);`	Migrated
`eventQuery(infinirtEvent_t event, infinirtEventStatus_t *status_ptr)`	`infini::rt::EventQuery(void* event, int* status)`	`cudaEventQuery`	`cudaError_t cudaEventQuery(cudaEvent_t event);`	Migrated; InfiniRT reports complete/not-ready through an output parameter
`eventSynchronize(infinirtEvent_t event)`	`infini::rt::EventSynchronize(void* event)`	`cudaEventSynchronize`	`cudaError_t cudaEventSynchronize(cudaEvent_t event);`	Migrated
`eventDestroy(infinirtEvent_t event)`	`infini::rt::EventDestroy(void* event)`	`cudaEventDestroy`	`cudaError_t cudaEventDestroy(cudaEvent_t event);`	Migrated
`eventElapsedTime(float *ms_ptr, infinirtEvent_t start, infinirtEvent_t end)`	`infini::rt::EventElapsedTime(float* ms, void* start, void* end)`	`cudaEventElapsedTime`	`cudaError_t cudaEventElapsedTime(float *ms, cudaEvent_t start, cudaEvent_t end);`	Migrated
`getMemInfo(int device_id, size_t *free_bytes, size_t *total_bytes)`	`infini::rt::GetMemInfo(Device device, std::size_t* free_bytes, std::size_t* total_bytes)`	`cudaMemGetInfo`	`cudaError_t cudaMemGetInfo(size_t *free, size_t *total);`	Migrated; InfiniRT adds a `Device` parameter
`getDeviceResourceSnapshot(int device_id, infinirtDeviceResourceSnapshot_t *snapshot)`	None	No direct counterpart	CUDA Runtime has no `GetDeviceResourceSnapshot` or aggregated resource-snapshot interface	Not migrated; stays in InfiniCore
`streamBeginCapture(infinirtStream_t stream, infinirtStreamCaptureMode_t mode)`	Not migrated for CPU yet	`cudaStreamBeginCapture`	`cudaError_t cudaStreamBeginCapture(cudaStream_t stream, enum cudaStreamCaptureMode mode);`	Not migrated for CPU at present; remains unsupported
`streamEndCapture(infinirtStream_t stream, infinirtGraph_t *graph_ptr)`	Not migrated for CPU yet	`cudaStreamEndCapture`	`cudaError_t cudaStreamEndCapture(cudaStream_t stream, cudaGraph_t *pGraph);`	Not migrated for CPU at present; remains unsupported
`graphDestroy(infinirtGraph_t graph)`	Not migrated for CPU yet	`cudaGraphDestroy`	`cudaError_t cudaGraphDestroy(cudaGraph_t graph);`	Not migrated for CPU at present; remains unsupported
`graphInstantiate(...)`	Not migrated for CPU yet	`cudaGraphInstantiate`	`cudaError_t cudaGraphInstantiate(cudaGraphExec_t *pGraphExec, cudaGraph_t graph, unsigned long long flags);`	Not migrated for CPU at present; remains unsupported. The parameters of the old InfiniCore interface do not fully match the newer CUDA prototype
`graphExecDestroy(infinirtGraphExec_t graph_exec)`	Not migrated for CPU yet	No CUDA Runtime API with exactly the same name	The usual counterpart is `cudaGraphExecDestroy(cudaGraphExec_t graphExec)`; confirm against the CUDA Toolkit version in use	Not migrated for CPU at present; remains unsupported
`graphLuanch(infinirtGraphExec_t graph_exec, infinirtStream_t stream)`	Not migrated for CPU yet	`cudaGraphLaunch`	`cudaError_t cudaGraphLaunch(cudaGraphExec_t graphExec, cudaStream_t stream);`	Not migrated for CPU at present; remains unsupported
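
The `eventQuery` row is the one place in the table where the calling convention, not just the parameter list, differs: `cudaEventQuery` signals "not ready" through its return value (`cudaErrorNotReady`), while the InfiniRT signature listed above uses an `int* status` output parameter. The sketch below shows one way a CPU backend could implement that shape. Everything in the `sketch` namespace (the `Status` and `EventStatus` enums, `CpuEvent`, and the specific status values) is hypothetical and chosen only for illustration; it is not the PR's implementation.

```cpp
#include <atomic>

namespace sketch {

// Hypothetical return codes; the real InfiniRT error type may differ.
enum Status { kSuccess = 0, kInvalidValue = 1 };

// Hypothetical values written to the `status` output parameter.
enum EventStatus { kEventComplete = 0, kEventNotReady = 1 };

// Hypothetical CPU event: a flag that is set once the recorded work is done.
struct CpuEvent {
  std::atomic<bool> complete{false};
};

// Mirrors the shape `EventQuery(void* event, int* status)` from the table.
// The return value reports only whether the query itself was valid; the
// event's readiness is reported through `status`.
inline Status EventQuery(void* event, int* status) {
  if (event == nullptr || status == nullptr) return kInvalidValue;
  auto* cpu_event = static_cast<CpuEvent*>(event);
  *status = cpu_event->complete.load(std::memory_order_acquire)
                ? kEventComplete
                : kEventNotReady;
  return kSuccess;
}

}  // namespace sketch
```

With this shape, a caller that polls an event can treat any non-success return as a real error, instead of having to special-case a "not ready" error code as CUDA callers do.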

* feat!: align runtime API and add runtime dispatch (#11)
* Align runtime API with generated wrappers
* Add default runtime dispatch specialization
* Refactor runtime dispatch namespace
* Use Abseil status for runtime device API
* Revert "Use Abseil status for runtime device API" This reverts commit a26ddff.
* Address runtime dispatch review feedback
* Keep runtime API list in generator
* Add TensorView constructor guard test
* Align runtime memcpy kind constants with CUDA API
* Use CUDA-style runtime memcpy constants
* Use CUDA-style runtime memcpy constants
* Move TensorView tests back into core test
* Remove standalone TensorView test target
* Remove standalone TensorView test file
* Use fully qualified runtime API names in README
* style: format runtime dispatch test
* feat: refactor InfiniCore CPU runtime to InfiniRT (#8) Co-authored-by: Jiacheng Huang <huangjiacheng0709@outlook.com>
* feat: add platform-adaptive runtime tests (#15)
* feat: add runtime backend API foundation (#14)

Co-authored-by: spike-zhu <74974704+spike-zhu@users.noreply.github.com>

spike-zhu marked this pull request as draft June 24, 2026 02:11

spike-zhu force-pushed the feat/extract-infinicore-runtime branch from 2e80f6b to 866fc8d Compare June 25, 2026 13:15

spike-zhu mentioned this pull request Jun 25, 2026

issue/1311 - feat: refactor InfiniCore cpu runtime to InfiniRT InfiniTensor/InfiniCore#1342

Draft

spike-zhu self-assigned this Jun 25, 2026

spike-zhu requested a review from voltjia June 25, 2026 13:27

spike-zhu force-pushed the feat/extract-infinicore-runtime branch from 866fc8d to 7fd37b5 Compare June 30, 2026 01:49

spike-zhu marked this pull request as ready for review June 30, 2026 01:51

voltjia requested changes Jun 30, 2026

This comment was marked as outdated.

spike-zhu force-pushed the feat/extract-infinicore-runtime branch 2 times, most recently from 1596dd5 to 187c34a Compare July 1, 2026 08:48

spike-zhu requested a review from voltjia July 1, 2026 08:49

voltjia mentioned this pull request Jul 3, 2026

fix: align CPU runtime extraction with runtime namespace #12

Merged

feat: refactor InfiniCore cpu runtime to InfiniRT

a2aea14

voltjia force-pushed the feat/extract-infinicore-runtime branch from 37c4913 to a2aea14 Compare July 3, 2026 02:43

voltjia approved these changes Jul 3, 2026

voltjia changed the title ~~feat: refactor InfiniCore cpu runtime to InfiniRT~~ feat: refactor InfiniCore CPU runtime to InfiniRT Jul 3, 2026

voltjia merged commit 568efd5 into master Jul 3, 2026
4 checks passed

voltjia deleted the feat/extract-infinicore-runtime branch July 3, 2026 03:15

This was referenced Jul 3, 2026

feat: add runtime backend API foundation #14

Merged

fix: align graph runtime API with runtime namespace #13

Merged
