ACE Functions
ACE provides functions that compare the data from one object to the data on other object, and optionally repairs the differences it finds. ACE functions include:
| Command | Description |
|---|---|
| ACE table-diff | Compare two tables to identify differences. |
| ACE repset-diff | Compare two replication sets to identify differences. |
| ACE schema-diff | Compare two schemas to identify differences |
| ACE spock-diff | Compare two sets of spock meta-data to identify differences. |
| ACE table-repair | Repair data inconsistencies identified by the table-diff function. |
| ACE table-rerun | Rerun a diff to confirm that a fix has been correctly applied. |
ACE Diff Functions
ACE diff functions compare two objects and identify the differences; the output is a report that contains a:
- Summary of compared rows
- Mismatched data details
- Node-specific statistics
- Error logs (if any)
If you generate an html report, ACE generates an interactive report with:
- Colour-coded differences
- Expandable row details
- Primary key highlighting
- Missing row indicators
Common use cases for the ACE diff functions include:
- Performing routine content verification.
- Performing a performance-optimized large table scan.
- Performing a focused comparison between nodes, tables, or schemas.
As a best practice, you should experiment with different block sizes and CPU utilisation to find the best performance/resource-usage balance for your workload. Making use --table-filter for large tables to reduce comparison scope and generating HTML reports will make analysis of differences easier.
As you work, ensure that diffs have not overrun the MAX_ALLOWED_DIFFS limit; if your diffs surpass this limit, table-repair will only be able to partially repair the table.
ACE table-diff
Use the table-diff command to compare the tables in a cluster and produce a csv, json, or html report showing any differences.
The syntax is:
$ ./pgedge ace table-diff cluster_name schema.table_name [options]
cluster_nameis the name of the pgEdge cluster in which the table resides.schema.table_nameis the schema-qualified name of the table that you are comparing across cluster nodes.
Optional Arguments
Include the following optional arguments to customize ACE table-diff behavior:
-dor--dbnameis a string value that specifies the database name;dbnamedefaults to the name of the first database in the cluster configuration.--block-rowsis an integer value that specifies the number of rows to process per block.- Min: 1000
- Max: 100000
- Default: 10000
- Higher values improve performance but increase memory usage.
- This is a configurable parameter in
ace_config.py.
-mor--max-cpu-ratiois a float value that specifies the maximum CPU utilisation; the accepted range is 0.0-1.0. The default is 0.6.- This value is configurable in
ace_config.py.
- This value is configurable in
--batch-sizeis an integer value that specifies the number of blocks to process per multiprocessing worker (default:1).- The higher the number, the lower the parallelism.
- This value is configurable in
ace_config.py.
-oor--outputspecifies the output type; choose fromhtml,json, orcsvwhen including the--outputoption to select the output type for a report. By default, the report is written todiffs/<YYYY-MM-DD>/diffs_<HHMMSSmmmm>.json. If the output mode is csv or html, ACE will generate colored diff files to highlight differences.-nor--nodesspecifies a comma-delimited subset of nodes on which the command will be executed. ACE allows up to a three-way node comparison. We do not recommend simultaneously comparing more than three nodes at once.-qor--quietsuppresses messages about sanity checks and the progress bar instdout. If ACE encounters no differences, ACE will exit without messages. Otherwise, it will print the differences to JSON instdout(without writing to a file).-tor--table-filteris aSQL WHEREclause that allows you to filter rows for comparison.
ACE table-diff Command Examples
The following example reports a difference when comparing a table (public.foo) across all nodes and generates an html report:
$ ./pgedge ace table-diff demo public.foo --output=html
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
✔ Table public.foo is comparable across nodes
Getting primary key offsets for table...
Starting jobs to compare tables...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3/3 [ 0:00:00 < 0:00:00 , ? it/s ]
⚠ TABLES DO NOT MATCH
⚠ FOUND 1 DIFFS BETWEEN n1 AND n2
Diffs written out to diffs/2025-04-08/diffs_072159340.json
HTML report generated: diffs/2025-04-08/diffs_072159340.html
TOTAL ROWS CHECKED = 5
RUN TIME = 0.40 secondsThe following example reports a difference when comparing a table (public.foo) across nodes n1 and n2, with a custom block size (50000):
$ ./pgedge ace table-diff demo public.foo --nodes="n1,n2" --block-rows=50000
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
✔ Table public.foo is comparable across nodes
Getting primary key offsets for table...
Starting jobs to compare tables...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3/3 [ 0:00:00 < 0:00:00 , ? it/s ]
⚠ TABLES DO NOT MATCH
⚠ FOUND 1 DIFFS BETWEEN n1 AND n2
Diffs written out to diffs/2025-04-08/diffs_072804313.json
TOTAL ROWS CHECKED = 5
RUN TIME = 0.40 secondsACE repset-diff
Use the repset-diff command to loop through the tables in a replication set and produce a csv, json, or html report showing any differences. The syntax is:
$ ./pgedge ace repset-diff cluster_name repset_name [options]
cluster_nameis the name of the cluster in which the replication set is a member.repset_nameis the name of the replication set in which the tables being compared reside.
Optional Arguments
-dor--dbname=db_nameis the name of the database in which to run therepset-diffcommand; the default isnone.-mor--max_cpu_ratiospecifies the percentage of CPU power you are allotting for use by ACE. A value of1instructs the server to use all available CPUs, while.5means use half of the available CPUs. The default is.6(or 60% of the CPUs).--block_rowsspecifies the number of tuples to be used at a time during table comparisons. Ifblock_rowsis set to1000, then a thousand tuples are compared per job across tables.-oor--outputspecifies the output type; choose fromhtml,json, orcsvwhen including the--outputparameter to select the output type for a report. By default, the report is written todiffs/<YYYY-MM-DD>/diffs_<HHMMSSmmmm>.json. If the output type is csv or html, ACE will generate coloured diff files to highlight differences.-nor--nodesspecifies a comma-delimited list of nodes on which the command will be executed.--batch-sizeis an integer value that specifies the number of blocks to process per multiprocessing worker (default:1).-qor--quietsuppresses output from ACE; this defaults toFalse.--skip_tables=table_nameinstructs ACE to not evaluate the specified table for differences.--skip_file=file_nameallows you to specify the name of a file that contains a list of tables that you would like to skip.
ACE repset-diff Example
The following example reports a difference when comparing the default repset across all nodes:
$ ./pgedge ace repset-diff demo default
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
CHECKING TABLE public.foo...
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
✔ Table public.foo is comparable across nodes
Getting primary key offsets for table...
Starting jobs to compare tables...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3/3 [ 0:00:00 < 0:00:00 , ? it/s ]
⚠ TABLES DO NOT MATCH
⚠ FOUND 1 DIFFS BETWEEN n1 AND n2
Diffs written out to diffs/2025-04-08/diffs_090241529.json
TOTAL ROWS CHECKED = 5
RUN TIME = 0.40 secondsACE schema-diff
Use the schema-diff command to compare the schemas in a cluster and report differences in a .json format report; the syntax is:
$ ./pgedge ace schema-diff cluster_name schema_name [options]
schema_nameis the name of the schema you will be comparing.cluster_nameis the name of the cluster in which the table resides.
Optional Arguments
-n=node_listspecifies a list of nodes on which the schema will be compared;node_listis a comma-delimited list of node names. If omitted, the default is all nodes.--dbname=db_namespecifies the name of the database in which you would like to run the diff; defaults tonone.--ddl_onlyinstructs ACE to check for only DDL differences.--skip_tables=table_nameinstructs ACE to not evaluate the specified table for differences.--skip_file=file_nameallows you to specify the name of a file that contains a list of tables that you would like to skip.-qor--quietsuppresses output from ACE; this defaults toFalse.
ACE schema-diff Example
The following example demonstrates using the schema-diff command to check for differences in the public schema in a cluster named demo on nodes n1 and n2:
$ ./pgedge ace schema-diff demo public -nodes=n1,n2
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
Comparing nodes 127.0.0.1:6432 and 127.0.0.1:6433:
✔ No differences foundACE spock-diff
Use the spock-diff command to compare the meta-data on two cluster nodes, and produce a report showing any differences. The syntax is:
$ ./pgedge ace spock-diff cluster_name [options]
cluster_nameis the name of the cluster in which the table resides.
Optional Arguments
-n=node_listspecifies a list of nodes on which spock will be compared;node_listis a comma-delimited list of node names. If omitted, the default is all nodes.-qor--quietsuppresses output from ACE; this defaults toFalse.
ACE table-repair
The ACE table-repair function fixes data inconsistencies identified by the table-diff functions. ACE table-repair uses a specified node as the source of truth to correct data on other nodes. Common use cases for table-repair include:
- Spock Exception Repair for exceptions arising from insert/update/delete conflicts during replication.
- Network Partition Repair to restore consistency across nodes after a network partition fails.
- Temporary Node Outage Repair to bring a node up to speed after a temporary outage.
The function has a number of safety and audit features that you should consider before invoking the command:
- Dry run mode allows you to test repairs without making changes.
- Report generation produces a detailed repair audit trail of all changes made.
- Include the Upsert-Only option to prevent data deletion.
- Transaction safety ensures that all changes are atomic. If, for some reason your repair fails midway, the entire transaction will be rolled back, and no changes will be made to the database.
When using table-repair, remember that:
- Table-repair is intended to be used to repair differences that arise from incidents such as spock exceptions, network partition irregularities, temporary node outages, etc. If the 'blast radius' of a failure event is too large -- say, millions of records across several tables, even though table-repair can handle this, we recommend that instead you do a dump and restore using native PostgreSQL tooling.
- Table-repair can only repair rows found in the diff file. If your diff exceeds
MAX_ALLOWED_DIFFS, table-repair will only be able to partially repair the table. This may even be desirable if you want to repair the table in batches; you can perform adiff->repair->diff->repaircycle until no more differences are reported. - You should invoke
ACE table-repairwith--dry-runfirst to review proposed changes. - Use
--upsert-onlyor--insert-onlyfor critical tables where data deletion may be risky. - You should verify your table structure and constraints before repair.
The command syntax is:
./pgedge ace table-repair <cluster_name> <schema.table_name> --diff-file=<diff_file> <--source-of-truth>[options]cluster_nameis the name of the cluster in which the table resides.diff_fileis the path and name of the file that contains the table differences.schema.table_nameis the schema-qualified name of the table that you are repairing.-sor--source-of-truthis a string value specifying the node name to use as the source of truth for repairs. Note: If you are performing a repair that specifies the--bidirectionalor--fix-nullsoption, the--source-of-truthis not required.
Optional Arguments
--dry-runis a boolean value that simulates repair operations without making changes. The default isfalse.--upsert_only(or-u) - Set this option totrueto specify that ACE should make only additions to the non-source of truth nodes, skipping anyDELETEstatements that may be needed to make the data match. This option does not guarantee that nodes will match when the command completes, but can be useful if you want to merge the contents of different nodes. The default value isfalse.--generate_report(or-g) - Set this option totrueto generate a .json report of the actions performed; Reports are written to files identified by a timestamp in the format:reports/<YYYY-MM-DD>/report_<HHMMSSmmmm>.json. The default isfalse.--dbnameis a string value that specifies the database name; dbname defaults tonone.--quietis a boolean value that suppresses non-essential output. The default isfalse.--generate-reportis a boolean value that instructs the server to create a detailed report of repair operations. The default isfalse.--upsert-onlyis a boolean value that instructs the server to only perform inserts/updates, and skip deletions. The default isfalse.-ior--insert-onlyis a boolean value that instructs the server to only perform inserts, and skip updates and deletions. Note: This option usesINSERT INTO ... ON CONFLICT DO NOTHING. If there are identical rows with different values, this option alone is not enough to fully repair the table. The default isfalse.-bor--bidirectionalis a boolean value that must be used with--insert-only. Similar to--insert-only, but inserts missing rows in a bidirectional manner. For example, if you specify--bidirectionalis a boolean value that instructs ACE to apply differences found between nodes to create a distinct union of the content. In a distinct union, each row that is missing is recreated on the node from which it is missing, eventually leading to a data set (on all nodes) in which all rows are represented exactly once. For example, if you are performing a repair in a case where node A has rows with IDs 1, 2, 3 and node B has rows with IDs 2, 3, 4, the repair will ensure that both node A and node B have rows with IDs 1, 2, 3, and 4.
--fix-nullsis a boolean value that instructs the server to fix NULL values by comparing values across nodes. For example, if you have an issue where a column is not being replicated, you can use this option to fix the NULL values on the target nodes. This does not need a source of truth node as it consults the diff file to determine which rows have NULL values. However, it should be used for this special case only, and should not be used for other types of data inconsistencies.--fire-triggersis a boolean value that instructs triggers to fire when ACE performs a repair; note thatENABLE ALWAYStriggers will fire regardless of the value of--fire-triggers. The default isfalse.
ACE table-repair Command Examples
The following commands first perform a table-repair dry run of the public.foo table, specifying a diff file (--diff-file=diffs/2025-04-08/diffs_090241529.json) and using node n1 as the source of truth:
[rocky@ip-172-31-15-12 pgedge]$ ./pgedge ace table-repair demo public.foo --diff-file=diffs/2025-04-08/diffs_090241529.json --source-of-truth=n1 --dry_run=True
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
######## DRY RUN ########
Repair would have attempted to upsert 0 rows and delete 1 row on n2
######## END DRY RUN ########After performing the dry run, we change the --dry_run flag to False, confirming that we want to apply the changes we reviewed in the first command iteration:
$ ./pgedge ace table-repair demo public.foo --diff-file=diffs/2025-04-08/diffs_090241529.json -s=n1 --dry_run=False
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
✔ Successfully applied diffs to public.foo in cluster demo
*** SUMMARY ***
n2 UPSERTED = 0 rows
n2 DELETED = 1 rows
RUN TIME = 0.00 secondsThe following example performs a unidirectional insert-only repair on the public.foo table. In a situation where node 2 is missing a row when compared to node 1, including the --insert-only option inserts the missing rows from node 1 to node 2:
$ ./pgedge ace table-repair demo public.foo diffs/2025-04-09/diffs_101804246.json --source-of-truth=n1 --insert-only=True
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
✔ Successfully applied diffs to public.foo in cluster demo
*** SUMMARY ***
n2 INSERTED = 1 rows
RUN TIME = 0.00 secondsThe following example performs a bidirectional insert-only repair. If you have a network partition between node 1 and node 2, and they each separately received new records, including the --bidirectional option will insert the missing records from node 1 to node 2 and vice versa:
$ ./pgedge ace table-repair demo public.foo diffs/2025-04-09/diffs_103544698.json --source-of-truth=n1 --insert-only=True --bidirectional=True
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
Performing bidirectional repair:
Overall progress: 0%| | 0/100 [00:00<?, ?%/s]
Processing node pair n1/n2
Processing node n1
Processing node n2
Overall progress: 100%|██████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 16786.62%/s]
✔
Successfully completed bidirectional repair in public.foo in cluster demoACE table-repair Report Example
{
"time_stamp": "08/07/2024, 13:20:19",
"arguments": {
"cluster_name": "demo",
"diff_file": "diffs/2024-08-07/diffs_131919688.json",
"source_of_truth": "n1",
"table_name": "public.acctg_diff_data",
"dbname": null,
"dry_run": false,
"quiet": false,
"upsert_only": false,
"generate_report": true
},
"database": "lcdb",
"changes": {
"n2": {
"upserted_rows": [],
"deleted_rows": [],
"missing_rows": [
{
"employeeid": 1,
"employeename": "Carol",
"employeemail": "carol@pgedge.com"
},
{
"employeeid": 2,
"employeename": "Bob",
"employeemail": "bob@pgedge.com"
}
]
}
},
"run_time": 0.1
}Within the report:
time_stampdisplays when the function was called.argumentslists the syntax used when performing the repair.databaseidentifies the database the function connected to.runtimetells you how many seconds the function took to complete.
The changes property details the differences found by ACE on each node of your cluster. The changes are identified by node name (for example, n2) and by type:
- The
upserted_rowssection lists the rows upserted. Note that if you have specifiedUPSERT only, the report will include those rows in themissing_rowssection. - The
deleted_rowssection lists the rows deleted on the node. - The
missing_rowssection lists the rows that were missing from the node. You will need to manually add any missing rows to your node.
ACE table-rerun
The table-rerun function allows you to rerun a previous table-diff operation to verify fixes or check if inconsistencies persist after repairs. You can use table-rerun to:
- perform a post-repair verification to confirm that a table-repair run has resolved the diffs identified previously.
- Verify if diffs identified by the table-diff function still exist after the replication lag window has elapsed.
When using ACE table-rerun, you should:
- Include the
hostdbprocessing option for very large tables and diffs to improve performance. - Compare results using the original diff file to confirm that differences were resolved after a replication lag window.
The syntax is:
$ ./pgedge ace table-rerun <cluster_name> --diff_file=/path/to/diff_file.json schema.table_name
cluster_nameis a string value that specifies the name of the cluster as defined in your configuration file.schema.table_nameis a string value that specifies the fully qualified table name (e.g., "public.users")'.diff_fileis a string value that specifies the path to the JSON diff file from a previous table-diff operation.
Optional Arguments
-dor--dbnameis a string value that specifies the database name; this defaults to the first database in the cluster config file.-qor--quietis a boolean value that suppresses non-essential output.-bor--behavioris a string value that specifies the processing behavior [multiprocessingorhostdb].multiprocessing(the default) uses parallel processing for faster execution.hostdbuses the host database to create temporary tables for faster comparisons. This is useful for very large tables and diffs.
ACE table-rerun Command Examples
To perform a table-rerun of a previous diff (specifying a diff file with the --diff-file=diffs/2025-04-08/diffs_090241529.json clause):
$ ./pgedge ace table-rerun demo --diff-file=diffs/2025-04-08/diffs_090241529.json public.foo
✔ Cluster demo exists
✔ Connections successful to nodes in cluster
✔ Table public.foo is comparable across nodes
Starting jobs to compare tables ...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1 [ 0:00:00 < 0:00:00 , ? it/s ]
✔ TABLES MATCH OK
TOTAL ROWS CHECKED = 2
RUN TIME = 0.40 seconds