Iceberg REST Catalog Overview #8 - Scan Plan Retrieval and Cancellation
Alex Merced
Co-Author of “Apache Iceberg: The Definitive Guide” | Head of DevRel at Dremio | LinkedIn Learning Instructor | Tech Content Creator
Previously, we explored how to submit a table scan request using Apache Iceberg’s REST Catalog API. Once a scan is submitted, clients need a way to retrieve the scan results or cancel the scan if it’s no longer needed.
This blog covers how to fetch the result of a scan plan and how to cancel a scan plan that is no longer needed.
Fetching the Result of a Scan Plan (GET /plan/{plan-id})
After submitting a scan request using /plan, the server might not return results immediately. Instead, it provides a plan-id, which the client uses to fetch results once the scan is ready.
Example Request to Fetch Scan Results
GET /v1/warehouse/namespaces/sales/tables/orders/plan/scan-12345 HTTP/1.1
Host: iceberg.catalog.com
Authorization: Bearer <your-access-token>
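As a minimal sketch, the same request can be assembled in Python with only the standard library. The helper name build_fetch_request and its parameters are hypothetical, chosen for illustration; the host, prefix, namespace, table, and plan-id are the example values from the request above.

```python
import urllib.request
from urllib.parse import quote


def build_fetch_request(base_url, prefix, namespace, table, plan_id, token):
    """Build (but do not send) the GET request for a scan plan's result.

    Hypothetical helper: the path layout mirrors the example request above.
    """
    url = "{}/v1/{}/namespaces/{}/tables/{}/plan/{}".format(
        base_url.rstrip("/"),
        quote(prefix, safe=""),
        quote(namespace, safe=""),
        quote(table, safe=""),
        quote(plan_id, safe=""),
    )
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": "Bearer " + token},
    )


# Example values taken from the request shown above.
request = build_fetch_request(
    "https://iceberg.catalog.com",
    "warehouse",
    "sales",
    "orders",
    "scan-12345",
    "<your-access-token>",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; the sketch stops short of that so it can be inspected without a running catalog.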
Understanding the Response Statuses
The response to this request reports one of four statuses:
1. Completed (Scan Results Available)
If the scan has been fully planned, the response includes the plan-tasks and file-scan-tasks required to execute the scan.
{
"status": "completed",
"plan-tasks": [
{
"file": "s3://data-lake/sales/orders.parquet",
"start": 0,
"length": 5242880
}
]
}
Action: Proceed with executing the scan using the provided tasks.
2. Submitted (Scan Still in Progress)
If the scan is still being processed, the response returns a “submitted” status.
{
"status": "submitted",
"plan-id": "scan-12345"
}
Action: Wait and retry the request later.
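One way to implement "wait and retry" is a polling loop with a growing delay. The sketch below is an assumption about client behavior, not something the API prescribes: poll_until_ready, the attempt limit, and the doubling delay are all hypothetical choices, and fetch_plan stands in for whatever function performs the GET request and returns the parsed JSON.

```python
import time


def poll_until_ready(fetch_plan, plan_id, max_attempts=5,
                     initial_delay=0.5, sleep=time.sleep):
    """Call fetch_plan(plan_id) until the status is no longer "submitted".

    fetch_plan is a caller-supplied function returning the parsed JSON body.
    The delay doubles after each "submitted" response (an assumed policy).
    """
    delay = initial_delay
    for attempt in range(max_attempts):
        response = fetch_plan(plan_id)
        if response.get("status") != "submitted":
            return response
        if attempt < max_attempts - 1:
            sleep(delay)
            delay *= 2
    raise TimeoutError(
        "plan {} still submitted after {} attempts".format(plan_id, max_attempts)
    )


# Demonstration with a fake fetch function instead of a real catalog.
canned = iter([
    {"status": "submitted", "plan-id": "scan-12345"},
    {"status": "submitted", "plan-id": "scan-12345"},
    {"status": "completed", "plan-tasks": []},
])
delays = []  # records the requested sleeps instead of actually waiting
result = poll_until_ready(lambda plan_id: next(canned), "scan-12345",
                          sleep=delays.append)
print(result["status"], delays)
```

Injecting the sleep function keeps the loop testable and lets a real client swap in its own scheduling.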
3. Failed (Error Occurred)
If the scan fails, an error response is returned.
{
"status": "failed",
"error": "Table not found"
}
Action: Check the error message and troubleshoot accordingly.
4. Cancelled (Scan No Longer Valid)
If the scan has been cancelled, the response includes “cancelled” status.
{
"status": "cancelled",
"message": "The plan-id is no longer valid."
}
Action: Discard the plan-id and do not retry.
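The four statuses and their recommended actions can be summarized as a small dispatch function. This is only an illustrative sketch: next_action and the action labels it returns are hypothetical names, and the sample responses are the ones shown in this section.

```python
# Maps each status described above to a short label for the client's next step.
ACTIONS = {
    "completed": "execute",       # run the scan with the returned tasks
    "submitted": "retry",         # planning still in progress; ask again later
    "failed": "troubleshoot",     # inspect the error message
    "cancelled": "discard",       # the plan-id is no longer valid; do not retry
}


def next_action(response):
    """Return the action label for a parsed plan response (hypothetical helper)."""
    status = response.get("status")
    if status not in ACTIONS:
        raise ValueError("unexpected plan status: {!r}".format(status))
    return ACTIONS[status]


samples = [
    {"status": "completed", "plan-tasks": []},
    {"status": "submitted", "plan-id": "scan-12345"},
    {"status": "failed", "error": "Table not found"},
    {"status": "cancelled", "message": "The plan-id is no longer valid."},
]
for sample in samples:
    print(sample["status"], "->", next_action(sample))
```

Raising on an unknown status, rather than silently retrying, makes an unexpected server response visible early.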
Cancelling a Scan Plan (DELETE /plan/{plan-id})
If a scan is no longer needed, clients should explicitly cancel it to release server resources.
Example Request to Cancel a Scan Plan
DELETE /v1/warehouse/namespaces/sales/tables/orders/plan/scan-12345 HTTP/1.1
Host: iceberg.catalog.com
Authorization: Bearer <your-access-token>
If the cancellation is successful, the server returns HTTP 204 No Content.
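A client-side sketch of the cancellation call might look like the following. The helper names build_cancel_request and cancellation_succeeded are hypothetical; the URL is the example plan URL from the request above, and the only success signal assumed is the HTTP 204 status described here.

```python
import urllib.request


def build_cancel_request(plan_url, token):
    """Build (but do not send) the DELETE request that cancels a scan plan."""
    return urllib.request.Request(
        plan_url,
        method="DELETE",
        headers={"Authorization": "Bearer " + token},
    )


def cancellation_succeeded(status_code):
    """A successful cancellation is signalled by HTTP 204 No Content."""
    return status_code == 204


cancel_request = build_cancel_request(
    "https://iceberg.catalog.com/v1/warehouse/namespaces/sales/tables/orders/plan/scan-12345",
    "<your-access-token>",
)
print(cancel_request.get_method(), cancel_request.full_url)
print(cancellation_succeeded(204))
```

With a live catalog, the status code would come from the response object returned by urllib.request.urlopen(cancel_request).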
When Should You Cancel a Scan?
The scan is still in “submitted” status, and the results are no longer needed.
The scan was initiated, but no plan tasks were fetched.
The client is shutting down or switching to a different query plan.
No Need to Cancel If:
Best Practices for Managing Scan Plans
Conclusion
Fetching and cancelling scan plans gives Iceberg users greater control over query execution and resource utilization. By polling for results only while they are needed and cancelling plans that are not, teams can keep scan planning efficient and release server resources promptly in large-scale data lakehouse environments.