5-minute Tutorials
Streaming Query Replication of PostgreSQL To Snowflake
Below is a 5-minute tutorial to test Streaming Query Replication from Postgres to Snowflake
Step 1: CREATE Postgres and Snowflake Peers
Step 2: Create and populate tables on the Postgres PEER
Below script helps creates and populate pgbench_history
with dummy data on your PostgreSQL peer.
curl -O https://peerdb-sample-data.s3.us-east-2.amazonaws.com/import_data_for_tutorial.sh
chmod +x import_data_for_tutorial.sh
# "postgres://user:password@hostname:5432/dbname"
# is the connection string of your PostgreSQL peer.
./import_data_for_tutorial.sh "postgres://user:password@hostname:5432/dbname"
Step 3: Create pgbench_history table on the destination (snowflake) PEER
CREATE TABLE pgbench_history (tid integer, bid integer,
aid integer, delta integer, mtime timestampntz, filler text);
Step 4: Kick off MIRROR with 8 threads and 50000 rows per partition
CREATE MIRROR postgres_to_snowflake_tutorial
FROM postgres_peer TO snowflake_peer FOR
$$SELECT * FROM public.pgbench_history WHERE mtime BETWEEN {{.start}} AND {{.end}}$$
WITH (
watermark_column = 'mtime',
watermark_table_name = 'pgbench_history',
mode = 'append',
parallelism = 8,
refresh_interval = 10,
num_rows_per_partition = 50000,
destination_table_name = 'public.pgbench_history'
);
Step 5: Monitor the MIRROR
You can connect to localhost:8085
to gain full visibility into the different jobs and steps that PeerDB performs under the hood to manage the MIRROR.
Step 6: Validate the MIRROR
In 1-2 minutes the MIRROR should complete syncing data. Now validate the data on both postgres and snowflake peers. Number of rows should match
SELECT count(*) FROM postgres_peer.pgbench_history;
SELECT count(*) FROM snowflake_peer.public.pgbench_history;