Compatibility with Apache Iceberg

Note

This document describes how to read and write Havasu tables using open source Apache Iceberg. If you access your tables only through Havasu, you can skip this document.

Havasu is compatible with the Apache Iceberg open table format. Users can read and write Havasu tables with any Apache Iceberg API. This works because Havasu preserved backward compatibility with the table format when it added geospatial support.

Reading GEOMETRY Data Using Apache Iceberg

Apache Iceberg does not support the GEOMETRY data type, so it reads a GEOMETRY column as a BINARY column. Users can read the data with any Apache Iceberg API, but the geometries are returned as binary values. These values can be deserialized into geometry objects in Sedona with ST_GeomFromWKB, or with any other function the query engine provides for parsing EWKB values.

SELECT ST_GeomFromWKB(geom) FROM wherobots.test_db.test_table
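When no geometry function is available on the reading side, the binary value can also be inspected by hand. The following is a minimal sketch, assuming the value is a 2D point encoded as WKB or EWKB in the commonly documented layout (byte-order flag, type word with optional SRID flag, then coordinates). The helper name `decode_ewkb_point` is hypothetical and is not part of Iceberg, Sedona, or Havasu; real data should go through a full geometry parser.

```python
import struct

# EWKB flag bits carried in the high bits of the geometry type word.
EWKB_Z_FLAG = 0x80000000
EWKB_M_FLAG = 0x40000000
EWKB_SRID_FLAG = 0x20000000
WKB_POINT = 1


def decode_ewkb_point(buf: bytes):
    """Decode a 2D (E)WKB point and return (srid_or_None, x, y).

    Hypothetical helper for illustration; handles points only.
    """
    # First byte is the byte-order flag: 1 = little endian, 0 = big endian.
    order = "<" if buf[0] == 1 else ">"
    (type_word,) = struct.unpack_from(order + "I", buf, 1)
    offset = 5

    srid = None
    if type_word & EWKB_SRID_FLAG:
        # An optional 4-byte SRID follows the type word.
        (srid,) = struct.unpack_from(order + "I", buf, offset)
        offset += 4

    if type_word & (EWKB_Z_FLAG | EWKB_M_FLAG):
        raise ValueError("only 2D points are handled in this sketch")
    if type_word & 0x1FFFFFFF != WKB_POINT:
        raise ValueError("not a point geometry")

    x, y = struct.unpack_from(order + "dd", buf, offset)
    return srid, x, y


# Stand-in for a value read from the BINARY column: POINT (1 2), little endian.
raw = struct.pack("<BIdd", 1, WKB_POINT, 1.0, 2.0)
print(decode_ewkb_point(raw))  # (None, 1.0, 2.0)
```

The sketch rejects anything other than a 2D point instead of guessing, since a plain Iceberg reader gives no type information beyond BINARY.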

Writing GEOMETRY Data Using Apache Iceberg

Havasu stores geometry data in BINARY columns in EWKB format by default, so users can write serialized geometry data into Havasu tables through Iceberg. For example, the following statement writes serialized geometry values into a Havasu table:

INSERT INTO wherobots.test_db.test_table
VALUES (1, 'a', ST_AsBinary(ST_GeomFromText('POINT (1 2)'))), (2, 'b', ST_AsBinary(ST_Point(2, 3)))
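A writer that has no geometry functions at all has to produce the bytes itself. The sketch below shows, under the assumption of the commonly documented EWKB layout, what the serialized value for a 2D point looks like. The function name `encode_ewkb_point` is hypothetical. For a 2D point without an SRID, the bytes are identical to plain WKB.

```python
import struct

EWKB_SRID_FLAG = 0x20000000  # set in the type word when an SRID is embedded
WKB_POINT = 1


def encode_ewkb_point(x, y, srid=None):
    """Serialize a 2D point as little-endian (E)WKB bytes.

    Hypothetical helper for illustration; not a Havasu or Iceberg API.
    """
    if srid is None:
        # byte order (1 = little endian), type word, x, y
        return struct.pack("<BIdd", 1, WKB_POINT, float(x), float(y))
    # With an SRID: set the flag in the type word and add the SRID after it.
    return struct.pack(
        "<BIIdd", 1, WKB_POINT | EWKB_SRID_FLAG, int(srid), float(x), float(y)
    )


# Bytes that could be bound to the BINARY column for POINT (1 2).
value = encode_ewkb_point(1, 2)
print(len(value), value.hex())
```

Hand-built bytes like these carry no validation, so a mistake in the type word or length produces exactly the kind of corrupted value that Havasu cannot read.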

Warning

Data files written by Apache Iceberg do not have spatial statistics metadata, so Havasu cannot optimize spatial range queries on these files. It is also possible to write corrupted values into the geometry column, which makes the data files unreadable by Havasu. It is recommended to always use Havasu to read and write Havasu tables.


Last update: October 3, 2023 07:45:18