Heartbeat Table For CDC
Mitigate slot growth with a heartbeat
table
Consider a scenario where you kick off a CDC mirror from PostgreSQL to a data warehouse to sync one table. The mirror creates a logical replication slot in PostgreSQL to stream changes. Now, this table rarely gets rows ingested to it on the source. However, the replication slot keeps growing because of the WAL logs generated by the changes in other tables. This can lead to the database disk getting filled up.
In order to mitigate this, you can create a heartbeat
table in the source database. This table will have a single row that gets updated every minute.
By including this table in the CDC mirror, PeerDB will pick up changes to this table, sync them and flush the slot periodically, keeping the slot size in check.
The below is an example of what we’re talking about. Ensure that the heartbeat table has required permissions to be a part of the mirror.
PeerDB Implementation
PeerDB has the facility to perform periodic updates on your heartbeat table. This has the benefit of not needing to install an extension like pg_cron. Currently, updates are done every 12 minutes. This will soon be configurable. PeerDB will require update permission on the heartbeat table:
In PeerDB UI, head over to Settings
in the sidebar. Search for WAL
in the search bar on the right.
Edit the current value fields of the following settings:
PEERDB_ENABLE_WAL_HEARTBEAT
: Set this totrue
(no quotes or anything)PEERDB_WAL_HEARTBEAT_QUERY
: Set this to the update command to be run periodically:
This sets up updates to the heartbeat table every 12 minutes.
Manual Implementation
Periodic update can also be done with the below pg_cron command, for example:
Include the heartbeat table in the mirror
Now, you can include it as part of the tables in the mirror either via the Create Mirror UI (table selector) or through the SQL Layer as shown below.
Mitigate slot growth with emit messages
If setting up heartbeat tables is unideal for your use case,
you can also rely on pg_logical_emit_message()
- a system function for emitting a logical decoding message into WAL.
Note that this is supported in PostgreSQL versions 14 and above, but NOT on AWS RDS as of now. Please use the heart beat table method for AWS RDS.
PeerDB picks up on the message as part of its WAL processing and flushes the slot at the frequency of the CDC sync interval.
Permissions to be granted, if needed, to call this function can be done with:
Edit the current value fields of the following settings in the Settings tab in UI:
PEERDB_ENABLE_WAL_HEARTBEAT
: Set this totrue
(no quotes or anything)PEERDB_WAL_HEARTBEAT_QUERY
: Set this to the emit_message function call command to be run periodically:
Currently, PeerDB emits a message every 12 minutes. This will soon be configurable.