Managing Data Consistency with ACE

ACE (the Anti-Chaos Engine)

ACE is a powerful tool designed to ensure and maintain data consistency across the nodes of a pgEdge cluster. It helps you identify and resolve data inconsistencies, schema differences, and replication configuration mismatches between nodes.

Key features of ACE include:

  • Table-level data comparison and repair
  • Replication set level verification
  • Automated repair capabilities
  • Schema comparison
  • Spock configuration validation
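
As a rough illustration, the features above correspond to ACE subcommands such as table-diff, table-repair, repset-diff, schema-diff, and spock-diff. The sketch below drives a check from Python via the pgedge CLI; the cluster name, table name, and exact argument forms are assumptions, so verify them against your installed version's help output.

```python
# A minimal sketch, not an authoritative reference: argument order and
# subcommand spellings can differ between versions, so confirm them with the
# help output of your installed pgedge/ACE CLI before relying on this.
import subprocess

CLUSTER = "demo"          # hypothetical cluster name
TABLE = "public.orders"   # hypothetical table to check

def ace(*args: str) -> None:
    """Run one ACE subcommand through the pgedge CLI, failing loudly on error."""
    subprocess.run(["./pgedge", "ace", *args], check=True)

# Table-level data comparison; the repair, replication-set, schema, and Spock
# checks listed above follow the same pattern with their own subcommands
# (table-repair, repset-diff, schema-diff, spock-diff).
ace("table-diff", CLUSTER, TABLE)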

ACE Deployment Considerations

ACE is very efficient at comparing tables. It first looks at table sizes and then, based on the specified runtime options, splits the work into multiple processes that run in parallel (a simplified sketch follows the list below). How quickly a table-diff command completes depends on:

  • the configuration of the machine you're running ACE on – how many cores, how much memory, and so on
  • the size of your table
  • the network latency between the ACE node and your database nodes
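
To make the splitting concrete, here is a simplified sketch of the general approach, not ACE's actual code: the table's primary-key range is cut into fixed-size blocks, each block is hashed on both nodes, and only blocks whose hashes differ need row-level inspection. The table name my_table, its integer id primary key, the block size, and the hashing SQL are all illustrative assumptions.

```python
# Simplified, illustrative sketch (assumes psycopg 3 and an integer primary key).
import psycopg

BLOCK_SIZE = 10_000  # rows per comparison block; a stand-in for ACE's runtime options

BLOCK_HASH_SQL = """
    SELECT md5(string_agg(t::text, '' ORDER BY id))
    FROM my_table t
    WHERE id >= %s AND id < %s
"""

def block_boundaries(min_id, max_id, block_size=BLOCK_SIZE):
    """Split the primary-key range [min_id, max_id] into half-open blocks."""
    return [(lo, min(lo + block_size, max_id + 1))
            for lo in range(min_id, max_id + 1, block_size)]

def block_hashes(conninfo, boundaries):
    """Compute one hash per primary-key block on a single node."""
    with psycopg.connect(conninfo) as conn, conn.cursor() as cur:
        return [cur.execute(BLOCK_HASH_SQL, bounds).fetchone()[0]
                for bounds in boundaries]

def mismatched_blocks(conninfo_a, conninfo_b, boundaries):
    """Return the primary-key ranges whose hashes differ between two nodes."""
    hashes_a = block_hashes(conninfo_a, boundaries)
    hashes_b = block_hashes(conninfo_b, boundaries)
    return [bounds for bounds, ha, hb in zip(boundaries, hashes_a, hashes_b)
            if ha != hb]
```

Only the blocks flagged as mismatched then need row-by-row comparison, which is the kind of work ACE spreads across parallel processes.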

ACE uses the cluster definition JSON file to connect to nodes and execute SQL statements. It does not use native connection pooling, except when initializing a database connection before forking a process, because psycopg connections are not process-safe. For faster runtime performance, it may therefore be desirable to set up a connection pooler such as pgBouncer or pgCat separately and point the cluster JSON file at it.
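
As a rough illustration of that pattern (not ACE's actual code), the sketch below opens one fresh psycopg connection per worker process instead of sharing a connection across a fork, and the connection string points at a hypothetical pgBouncer endpoint rather than directly at Postgres. All host, port, database, and table names here are assumptions.

```python
# Illustrative only: one connection per worker process, created after fork(),
# with the conninfo aimed at a pooler (pgBouncer on port 6432 in this sketch).
from multiprocessing import Pool
import psycopg

CONNINFO = "host=pgbouncer.internal port=6432 dbname=appdb user=ace"  # placeholder

_conn = None  # each worker process gets its own connection

def init_worker():
    """Pool initializer: runs once inside every worker after it is forked."""
    global _conn
    _conn = psycopg.connect(CONNINFO)

def count_block(bounds):
    """Do some per-block work (here just a row count) on this worker's connection."""
    lo, hi = bounds
    with _conn.cursor() as cur:
        cur.execute("SELECT count(*) FROM my_table WHERE id >= %s AND id < %s", (lo, hi))
        return cur.fetchone()[0]

if __name__ == "__main__":
    blocks = [(0, 10_000), (10_000, 20_000), (20_000, 30_000)]
    with Pool(processes=3, initializer=init_worker) as pool:
        print(pool.map(count_block, blocks))
```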

ACE cannot be used on a table without a primary key, because primary keys are the basis for range partitioning, hash calculation and several other critical functions in ACE.

If the tables across nodes are vastly different, ACE caps the maximum number of reported differences at 10,000 rows. If ACE encounters 10,000 differences, it terminates early and warns you that it has reached the maximum number of allowed differences. This guardrail prevents ACE from consuming CPU resources in situations where a backup and restore might be needed to bring the nodes back into sync.
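
In effect, the guardrail is an early-exit check that fires while differences accumulate. The sketch below shows the shape of that check with a stand-in constant; the real limit and mechanism live inside ACE.

```python
# Illustrative early-exit guardrail; MAX_DIFF_ROWS is a stand-in for ACE's cap.
MAX_DIFF_ROWS = 10_000

def collect_diffs(per_block_diffs, max_diffs=MAX_DIFF_ROWS):
    """Accumulate row differences block by block, stopping once the cap is hit."""
    diffs = []
    for rows in per_block_diffs:
        diffs.extend(rows)
        if len(diffs) >= max_diffs:
            print("Warning: maximum allowed differences reached; stopping early.")
            break
    return diffs[:max_diffs]

# Tiny demonstration with an artificially small cap of 4 differences.
print(len(collect_diffs([[1, 2, 3], [4, 5], [6]], max_diffs=4)))  # prints 4
```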