Skip to content

Working with Geometry Data

Havasu natively supports geometry data type and allows users to use geometry functions in Sedona for manipulating geometry data. This tutorial describes how to use geometry data type and geometry functions in Havasu.

Create Table using CREATE TABLE Command

To create a geospatial table in Sedona, use sedona.sql(...) to run a CREATE TABLE command:

sedona.sql("CREATE DATABASE IF NOT EXISTS wherobots.test_db")
sedona.sql("CREATE TABLE wherobots.test_db.test_table (id bigint, data string, geom geometry)")
sedona.sql("CREATE DATABASE IF NOT EXISTS wherobots.test_db")
sedona.sql("CREATE TABLE wherobots.test_db.test_table (id bigint, data string, geom geometry)")
sedona.sql("CREATE DATABASE IF NOT EXISTS wherobots.test_db");
sedona.sql("CREATE TABLE wherobots.test_db.test_table (id bigint, data string, geom geometry)");

This will create an empty table. Notice that Havasu introduced a new data type geometry to represent geospatial data. Data in columns with geometry type will be loaded as GeometryUDT values in Sedona, user can use any ST_ funtions provided by Sedona to manipulate the geospatial data.

Create Table using DataFrame

User can also create a table using a DataFrame. The geometry column in the DataFrame will goes into the GEOMETRY column in the created geospatial table.

df.writeTo("wherobots.test_db.test_table").create()
df.writeTo("wherobots.test_db.test_table").create()
df.writeTo("wherobots.test_db.test_table").create();

Havasu supports a full range of SQL DDL and DML commands which Apache Iceberg supports, such as ALTER TABLE, DROP TABLE, etc.

Writing Geometry Data

Once your table is created, you can insert data using INSERT INTO:

sedona.sql("INSERT INTO wherobots.test_db.test_table VALUES (1, 'a', ST_GeomFromText('POINT (1 2)')), (2, 'b', ST_Point(2, 3))")
sedona.sql("INSERT INTO wherobots.test_db.test_table VALUES (1, 'a', ST_GeomFromText('POINT (1 2)')), (2, 'b', ST_Point(2, 3))")
sedona.sql("INSERT INTO wherobots.test_db.test_table VALUES (1, 'a', ST_GeomFromText('POINT (1 2)')), (2, 'b', ST_Point(2, 3))");

You may have noticed that we use ST_GeomFromText and ST_Point provided by Sedona-Enterprise to construct geometry values and directly inserted them into the Havasu table.

Users can also write a DataFrame containing geometry data to a Havasu table:

df.writeTo("wherobots.db_name.table_name").create()
df.writeTo("wherobots.db_name.table_name").create()
df.writeTo("wherobots.db_name.table_name").create();

Or insert data into an existing Havasu table:

df.writeTo("wherobots.db_name.table_name").append()
df.writeTo("wherobots.db_name.table_name").append()
df.writeTo("wherobots.db_name.table_name").append();

Havasu also supports a range of table update operations, such as UPDATE, DELETE, MERGE INTO, etc. Please refer to Apache Iceberg - Spark Writes for details.

Reading Geometry Table

Havasu tables can be queried using SQL SELECT statements:

df = sedona.sql("SELECT * FROM wherobots.db_name.table_name")
df = sedona.sql("SELECT * FROM wherobots.db_name.table_name")
Dataset<Row> df = sedona.sql("SELECT * FROM wherobots.db_name.table_name");

Or using DataFrame API:

df = sedona.table("wherobots.db_name.table_name")
df = sedona.table("wherobots.db_name.table_name")
Dataset<Row> df = sedona.table("wherobots.db_name.table_name");

Havasu also supports time travel and inspecting metadata of table. Please refer to Apache Iceberg - Queries for details.

Processing Geometry Data using Sedona SQL

Users can use any ST_ functions provided by Sedona to manipulate geometry data read from Havasu.

# Read and manipulate geospatial data in a Havasu table
df = sedona.table("wherobots.db_name.table_name")
df.withColumn("wkt", expr("ST_AsText(geom)")).show()
df2 = df.withColumn("geom", expr("ST_Buffer(geom, 0.01)"))

# Write geospatial data back to a Havasu table
df2.writeTo("wherobots.db_name.table_name").append()
// Read and manipulate geospatial data in a Havasu table
val df = sedona.table("wherobots.db_name.table_name")
df.withColumn("wkt", expr("ST_AsText(geom)")).show()
val df2 = df.withColumn("geom", expr("ST_Buffer(geom, 0.01)"))

// Write geospatial data back to a Havasu table
df2.writeTo("wherobots.db_name.table_name").append()
// Read and manipulate geospatial data in a Havasu table
Dataset<Row> df = sedona.table("wherobots.db_name.table_name");
df.withColumn("wkt", expr("ST_AsText(geom)")).show();
Dataset<Row> df2 = df.withColumn("geom", expr("ST_Buffer(geom, 0.01)"));

// Write geospatial data back to a Havasu table
df2.writeTo("wherobots.db_name.table_name").append();

Further Reading

For more information, please refer to Geometry Support in Havasu.


Last update: February 9, 2024 03:04:11