HIVE-29359: Support Credential Vending in Hive Iceberg REST Catalog Client #6474

Open
difin wants to merge 1 commit into apache:master from difin:vended_credentials_client

@difin difin commented May 11, 2026

What changes were proposed in this pull request?

  • Extended the Gravitino LLAP qtest (TestIcebergRESTCatalogGravitinoLlapLocalCliDriver) to send a vended credentials header and run against a MinIO-backed S3 warehouse with Gravitino s3-secret-key vending and OAuth2. Host-side S3A and Iceberg S3FileIO are configured for the published MinIO port so that Tez/LLAP running on the host works reliably.

  • Updated Hive to pass vended credentials to executors using jobProperties and jobSecrets.

Why are the changes needed?

To enable vended credentials support with REST Catalog servers.

Does this PR introduce any user-facing change?

Yes. Users configuring an Iceberg REST catalog in Hive can set the header iceberg.catalog.<name>.X-Iceberg-Access-Delegation on REST requests to enable vended credentials.
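For illustration only, here is a minimal sketch of how a catalog-scoped setting could be turned into a request header. It assumes the client looks up the key iceberg.catalog.<name>.X-Iceberg-Access-Delegation and forwards its value unchanged. The class and method names are hypothetical and are not taken from this patch. The value vended-credentials is the one the Iceberg REST specification defines for this header.

```java
import java.util.HashMap;
import java.util.Map;

public class AccessDelegationHeaderSketch {

  static final String ACCESS_DELEGATION_HEADER = "X-Iceberg-Access-Delegation";

  // Hypothetical helper (not the PR's code): builds the extra REST request
  // headers for one catalog from session-level configuration.
  static Map<String, String> requestHeaders(Map<String, String> sessionConf, String catalogName) {
    String key = "iceberg.catalog." + catalogName + "." + ACCESS_DELEGATION_HEADER;
    Map<String, String> headers = new HashMap<>();
    String value = sessionConf.get(key);
    // Only send the header when the user configured a non-empty value.
    if (value != null && !value.trim().isEmpty()) {
      headers.put(ACCESS_DELEGATION_HEADER, value);
    }
    return headers;
  }

  public static void main(String[] args) {
    Map<String, String> sessionConf = new HashMap<>();
    sessionConf.put("iceberg.catalog.ice01.X-Iceberg-Access-Delegation", "vended-credentials");

    // prints {X-Iceberg-Access-Delegation=vended-credentials}
    System.out.println(requestHeaders(sessionConf, "ice01"));
    // prints {} because no header is configured for this catalog
    System.out.println(requestHeaders(sessionConf, "other"));
  }
}
```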

How was this patch tested?

Updated existing tests to cover vended credentials:

  • TestHiveRESTCatalogClient: new unit tests for vended credentials header mapping.

  • TestIcebergRESTCatalogGravitinoLlapLocalCliDriver with Gravitino + MinIO + OAuth2 + credentials vending.

@difin difin changed the title HIVE-29359: Support Credential Vending in Hive Iceberg REST Catalog C… HIVE-29359: Support Credential Vending in Hive Iceberg REST Catalog Client May 11, 2026
@difin difin force-pushed the vended_credentials_client branch from c092060 to db520b9 Compare May 20, 2026 00:15
@difin difin force-pushed the vended_credentials_client branch from db520b9 to d3274be Compare May 27, 2026 01:15
@difin difin force-pushed the vended_credentials_client branch from d3274be to d390532 Compare May 27, 2026 15:18
@difin difin force-pushed the vended_credentials_client branch from d390532 to f40686f Compare May 28, 2026 19:51
@difin difin force-pushed the vended_credentials_client branch from f40686f to c9d71f5 Compare May 28, 2026 22:31
@difin difin force-pushed the vended_credentials_client branch from c9d71f5 to e50bbd2 Compare June 4, 2026 20:46
@difin difin marked this pull request as ready for review June 4, 2026 20:46
@difin difin force-pushed the vended_credentials_client branch from e50bbd2 to 9198f81 Compare June 4, 2026 23:58
sonarqubecloud Bot commented Jun 5, 2026

"iceberg.catalog.ice01." + IcebergVendedCredentialUtil.SECRET_ACCESS_KEY)
.satisfies(map ->
assertThat(map.get(InputFormatConfig.VENDED_STORAGE_CREDENTIALS))
.isNotBlank());

I feel the prefix information is lost here.

table = readTableObjectFromFile(location, config);
}
checkAndSetIoConfig(config, table);
IcebergVendedCredentialUtil.applyFromJobConf(table, config);

Most Iceberg clients probably don't need to ser/de credentials on their own, but we do. Is that because we serialize an Iceberg table into Hadoop's configuration as an intermediate representation? I think we should store the credentials as they are and simply restore them, without refining or normalizing the content on the Hive side.

@difin difin Jun 22, 2026

You're right that most Iceberg clients don't need to ser/de credentials themselves. Hive does, because we serialize the Iceberg Table (SerializableTable) into JobConf for Tez/LLAP, and vended credentials on FileIO typically don't survive that round-trip. Executors rebuild the table from job conf and don't re-run REST loadTable, so we propagate credentials separately (VENDED_STORAGE_CREDENTIALS + S3A bucket keys) and restore them in deserializeTable via applyFromJobConf.

The main place we mutate vended credential content is the withConfigurationOverrides() method. REST catalogs can vend connectivity settings from their own network view (e.g. http://minio:9000 when the catalog runs in Docker), while the Hive session config sets a host-reachable endpoint (iceberg.catalog.ice01.s3.endpoint=http://host:9000). That method overrides only non-secret fields (s3.endpoint, s3.path-style-access) so that Iceberg FileIO and S3A agree on connectivity; vended keys are preserved. It runs at both store time (propagateToJob, so the blob on executors is self-contained) and restore time (applyFromJobConf, e.g. when commit still has the catalog-internal endpoint on FileIO from loadTable).
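To make the override rule concrete, here is a minimal sketch of the idea described above, not the actual implementation in this PR. It assumes the vended config and the session config are plain string maps and that session keys carry the iceberg.catalog.ice01. prefix. The class name, the signature, and the sample key values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class VendedCredentialOverrideSketch {

  // Non-secret connectivity keys that the session config is allowed to override.
  private static final Set<String> OVERRIDABLE_KEYS = Set.of("s3.endpoint", "s3.path-style-access");

  // Hypothetical signature: returns a copy of the vended config in which only
  // the connectivity keys are replaced by session-level values, if present.
  static Map<String, String> withConfigurationOverrides(
      Map<String, String> vended, Map<String, String> sessionConf, String catalogPrefix) {
    Map<String, String> result = new HashMap<>(vended);
    for (String key : OVERRIDABLE_KEYS) {
      String override = sessionConf.get(catalogPrefix + key);
      if (override != null) {
        result.put(key, override);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> vended = Map.of(
        "s3.endpoint", "http://minio:9000",
        "s3.access-key-id", "vended-key",
        "s3.secret-access-key", "vended-secret");
    Map<String, String> session = Map.of(
        "iceberg.catalog.ice01.s3.endpoint", "http://host:9000",
        "iceberg.catalog.ice01.s3.access-key-id", "session-key");

    Map<String, String> merged =
        withConfigurationOverrides(vended, session, "iceberg.catalog.ice01.");

    System.out.println(merged.get("s3.endpoint"));      // prints http://host:9000
    System.out.println(merged.get("s3.access-key-id")); // prints vended-key
  }
}
```

Because the function only overrides an allow-listed set of keys and returns a copy, calling it once at store time and again at restore time with the same session config yields the same result.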

@difin difin force-pushed the vended_credentials_client branch from 9198f81 to 0ff20b0 Compare June 25, 2026 20:34