The table below shows supported data type mappings from PostgreSQL to PostgreSQL, BigQuery, Snowflake, ClickHouse, and Elasticsearch destinations.
| PostgreSQL (source) | PostgreSQL | BigQuery | Snowflake | ClickHouse | Elasticsearch |
|---|---|---|---|---|---|
| smallint | smallint | INTEGER | INTEGER | Int16 | long |
| integer | integer | INTEGER | INTEGER | Int32 | long |
| bigint | bigint | INTEGER | INTEGER | Int64 | long |
| float4 | float4 | FLOAT | FLOAT | Float32 | float |
| double precision | double precision | FLOAT | FLOAT | Float64 | float |
| boolean | bool | BOOLEAN | BOOLEAN | Bool | boolean |
| "char" | CHAR | STRING | STRING | FixedString(1) | text |
| varchar | varchar | STRING | STRING | String | text |
| date | date | DATE | DATE | Date | date |
| json | json | JSON | VARIANT | String | unnested subdocument |
| jsonb | jsonb | JSON | VARIANT | String | unnested subdocument |
| numeric | numeric | BIGNUMERIC | NUMBER | Decimal | text |
| text | text | STRING | STRING | String | text |
| timestamp | timestamp | TIMESTAMP | TIMESTAMP_NTZ | DateTime64(6) | date |
| timestamp with time zone | timestamp with time zone | TIMESTAMP | TIMESTAMP_TZ | DateTime64(6) | date |
| time | time | TIME | TIME | String | date |
| bit | bit | BYTES | BINARY | String | text |
| bit varying | varbit | BYTES | BINARY | String | text |
| bytea | bytea | BYTES | BINARY | String | text |
| geography | geography | GEOGRAPHY | GEOGRAPHY | String | Coming soon! |
| geometry | geometry | GEOGRAPHY | GEOMETRY | String | Coming soon! |
| inet | inet | STRING | STRING | String | text |
| macaddr | macaddr | STRING | STRING | String | text |
| cidr | cidr | STRING | STRING | String | text |
| hstore | hstore | JSON | VARIANT | String | Coming soon! |
| uuid | uuid | STRING | STRING | UUID | Coming soon! |
Array types are mapped as follows:

| PostgreSQL (source) | PostgreSQL | BigQuery | Snowflake | ClickHouse |
|---|---|---|---|---|
| ARRAY<INT2> | ARRAY<INT2> | ARRAY<INT> | VARIANT | Array<Int16> |
| ARRAY<INT4> | ARRAY<INT4> | ARRAY<INT> | VARIANT | Array<Int32> |
| ARRAY<INT8> | ARRAY<INT8> | ARRAY<INT> | VARIANT | Array<Int64> |
| ARRAY<FLOAT4> | ARRAY<FLOAT4> | ARRAY<FLOAT> | VARIANT | Array<Float32> |
| ARRAY<DOUBLE PRECISION> | ARRAY<DOUBLE PRECISION> | ARRAY<DOUBLE PRECISION> | VARIANT | Array<Float64> |
| ARRAY<BOOL> | ARRAY<BOOL> | ARRAY<BOOL> | VARIANT | Array<Bool> |
| ARRAY<VARCHAR> | ARRAY<VARCHAR> | ARRAY<STRING> | VARIANT | Array<String> |
| ARRAY<TEXT> | ARRAY<TEXT> | ARRAY<STRING> | VARIANT | Array<String> |
| ARRAY<DATE> | ARRAY<DATE> | ARRAY<DATE> | VARIANT | String |
| ARRAY<TIMESTAMP> | ARRAY<TIMESTAMP> | ARRAY<TIMESTAMP> | VARIANT | String |
| ARRAY<TIMESTAMPTZ> | ARRAY<TIMESTAMPTZ> | ARRAY<TIMESTAMP> | VARIANT | String |
We recognise that there are various approaches to handling certain data types. Here are some decisions we’ve taken for PeerDB.
For Snowflake, we map PostgreSQL's `numeric` type as follows:

- `numeric` with no specified precision and scale is mapped to `NUMBER(38,20)`.
- `numeric` with precision or scale beyond 38 and 37 respectively is mapped to `NUMBER(38,20)`.
- `numeric` with precision and scale within the above limits is mapped to `NUMBER(precision, scale)`.

For BigQuery, we map PostgreSQL's `numeric` type as follows:

- `numeric` with no specified precision and scale is mapped to `BIGNUMERIC(38,20)`.
- `numeric` with precision or scale beyond 38 and 37 respectively is mapped to `BIGNUMERIC(38,20)`.
- `numeric` with precision and scale within the above limits is mapped to `BIGNUMERIC(precision, scale)`.

For ClickHouse, we map PostgreSQL's `numeric` type as follows:

- `numeric` with no specified precision and scale is mapped to `Decimal(76,38)`.
- `numeric` with precision or scale beyond 76 and 38 respectively is mapped to `Decimal(76,38)`.
- `numeric` with precision and scale within the above limits is mapped to `Decimal(precision, scale)`.
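As an illustration, here is a minimal Python sketch of the precision/scale logic described above. The function and its limits table are hypothetical, for illustration only, and are not PeerDB's internal API:

```python
# Hypothetical sketch of the numeric mapping rules above, not PeerDB's code.
# Each destination: (type name, max precision, max scale, default (p, s)).
LIMITS = {
    "snowflake": ("NUMBER", 38, 37, (38, 20)),
    "bigquery": ("BIGNUMERIC", 38, 37, (38, 20)),
    "clickhouse": ("Decimal", 76, 38, (76, 38)),
}

def map_numeric(destination, precision=None, scale=None):
    type_name, max_p, max_s, default = LIMITS[destination]
    # Unspecified precision/scale, or either out of range: fall back to default.
    if precision is None or scale is None or precision > max_p or scale > max_s:
        precision, scale = default
    return f"{type_name}({precision},{scale})"

print(map_numeric("snowflake"))           # NUMBER(38,20)
print(map_numeric("bigquery", 40, 10))    # BIGNUMERIC(38,20): precision > 38
print(map_numeric("clickhouse", 30, 10))  # Decimal(30,10)
```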
PeerDB detects invalid shapes (for example, a `linestring` with only one point) among the PostGIS values it pulls and writes them as `NULL` on the destination. We keep a log of this data, which can be retrieved if needed.
Valid geospatial data is written to BigQuery and Snowflake in Well-Known Text (WKT) format, while PostgreSQL destinations receive it as it was received from the source.
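For intuition, here is a minimal sketch of this validate-then-serialize step, assuming the `shapely` library (which PeerDB does not necessarily use internally). A linestring with a single point typically fails to parse at all, which the `try/except` covers:

```python
# Illustrative only: parse a PostGIS value (hex-encoded WKB, as PostgreSQL
# sends it) and serialize it to WKT, mirroring the behaviour described above.
from shapely import wkb

def postgis_to_wkt(hex_wkb):
    try:
        geom = wkb.loads(hex_wkb, hex=True)
    except Exception:
        return None                 # unparseable shape -> NULL on destination
    if geom.is_empty or not geom.is_valid:
        return None                 # invalid shape -> NULL on destination
    return geom.wkt                 # WKT string for BigQuery/Snowflake

# Hex-encoded WKB for POINT(1 2):
print(postgis_to_wkt("0101000000000000000000F03F0000000000000040"))  # POINT (1 2)
```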
PeerDB writes `HSTORE` data as `JSON` on BigQuery. All intricacies of the `HSTORE` data type are preserved, such as:

- `NULL` values: `'"a"=>NULL'` is written as `{"a":null}`.
- Empty keys: `'""=>1'` is written as `{"":1}`.
- Duplicate keys: `'"a"=>"1", "a"=>"2"'` is written as `{"a":2}` (the last value wins).
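A rough Python sketch of this conversion follows. It is a simplified toy parser illustrating the rules above, not PeerDB's implementation; it ignores escape sequences and keeps quoted values as text:

```python
import json
import re

# Toy hstore -> JSON converter illustrating the rules above.
PAIR = re.compile(r'"([^"]*)"\s*=>\s*(NULL|"[^"]*"|[^,\s]+)')

def hstore_to_json(value):
    out = {}
    for key, raw in PAIR.findall(value):
        if raw == "NULL":
            out[key] = None              # '"a"=>NULL' -> {"a":null}
        elif raw.startswith('"'):
            out[key] = raw[1:-1]         # quoted value, kept as text here
        else:
            out[key] = json.loads(raw)   # bare value such as 1
    return json.dumps(out, separators=(",", ":"))

print(hstore_to_json('"a"=>NULL'))           # {"a":null}
print(hstore_to_json('""=>1'))               # {"":1}
print(hstore_to_json('"a"=>"1", "a"=>"2"'))  # {"a":"2"} - the last key wins
```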
On Snowflake, `HSTORE` data is written as a `VARIANT` data type; it is formatted as JSON and can be queried as such: `snowflake_hstore_column:key`.
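For example, one could query a key from such a column with the snowflake-connector-python package. All connection parameters, table, and column names below are placeholders:

```python
import snowflake.connector

# Placeholder credentials and identifiers; adjust for your environment.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="my_password",
    database="MY_DB", schema="PUBLIC",
)
cur = conn.cursor()
# Snowflake's VARIANT path syntax extracts the value stored under key "a".
cur.execute('SELECT my_hstore_column:"a" FROM my_table')
for (value,) in cur.fetchall():
    print(value)
```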
PeerDB removes `null` values from BigQuery arrays, because BigQuery does not allow `null` elements in arrays at insertion time.
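The effect is equivalent to this one-line filter (a sketch, not PeerDB's actual code):

```python
# An array pulled from PostgreSQL may contain NULL elements (None in Python)...
source_array = [1, None, 2, None, 3]
# ...but the value written to BigQuery has them stripped out.
bigquery_array = [x for x in source_array if x is not None]
print(bigquery_array)  # [1, 2, 3]
```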
We currently rely on Elasticsearch's dynamic mapping to infer destination types, so the mapping above may not be accurate in all cases. We are working on enabling explicit mappings for Elasticsearch.