MySQL’s binary log records every change for replication and point-in-time recovery, but the format it uses to record those changes determines whether replicas stay consistent. Three formats are available. One of them has a silent correctness problem that surfaces only when non-deterministic SQL runs on a replica, at which point the divergence is already committed to disk.

Situation

The binary log (binlog) is the backbone of MySQL replication and PITR. Every write that commits on the primary is written to the binlog. Replicas consume the binlog and replay those writes locally. The format controls how each write is recorded: as the original SQL statement, as the actual row values that changed, or as a combination of both selected automatically.

Engineers provisioning a new MySQL server or migrating from an older version frequently encounter the format question without a clear default rationale. MySQL 5.7 defaulted to STATEMENT. MySQL 8.0 changed the default to ROW. The reason for that change is the correctness problem in STATEMENT format, and understanding it clarifies why ROW is the right default for most production workloads.

You can check the current format on any running server:

SELECT @@binlog_format;

The Problem

STATEMENT format logs the SQL text that ran on the primary. When the replica applies the statement, it re-executes that SQL. For most deterministic DML this is fine. The problem appears with non-deterministic functions: UUID(), RAND(), NOW(), SYSDATE(), user-defined functions, and some stored procedure patterns.

Consider this insert:

INSERT INTO orders (id, session_token, created_at)
VALUES (42, UUID(), NOW());

On the primary, UUID() generates a specific UUID and NOW() captures the current timestamp. That statement is written to the binlog verbatim. On the replica, the statement re-executes — but UUID() generates a different UUID and NOW() captures a different time. The primary and replica now hold different data for the same row. The replica has not errored. It has silently diverged.

The same problem appears with RAND(), triggers that call non-deterministic functions, and stored procedures whose output depends on server state. MySQL logs a warning in STATEMENT mode when it detects a non-deterministic statement, but the warning is easy to miss in a busy log.

How the Three Formats Work

FormatWhat is loggedSafe for non-deterministic SQLBinlog size
STATEMENTSQL text of the changeNoSmall
ROWBefore and after values for each rowYesLarge for bulk operations
MIXEDAutomatically ROW when unsafe, STATEMENT otherwiseYesModerate

ROW format logs the actual column values that changed for every row. For a statement that updates 10,000 rows, ROW format writes 10,000 row images to the binlog. This is verbose. A bulk DELETE or UPDATE that touches millions of rows produces a proportionally large binlog event. Binlog disk usage and replication bandwidth both increase relative to STATEMENT.

The tradeoff is correctness: ROW format replicas always apply the exact values the primary committed. There is no re-execution, no non-determinism, no divergence risk.

MIXED format attempts to get the best of both: it uses STATEMENT by default and switches to ROW automatically when MySQL detects that the statement is unsafe for statement-based replication. The detection covers most known unsafe patterns, but coverage is not exhaustive — some stored procedure and trigger combinations can still produce unsafe MIXED-format behavior in edge cases.

MySQL 8.0 default: ROW. The MySQL 8.0 Reference Manual documents this change explicitly, noting that ROW is safer for replication consistency and required for some features including multi-source replication and certain crash-safe replica configurations.

Changing the format at runtime (requires SUPER or BINLOG_ADMIN privilege):

-- Session level
SET SESSION binlog_format = 'ROW';

-- Global level (takes effect for new connections)
SET GLOBAL binlog_format = 'ROW';

For a permanent change, set it in the MySQL configuration file:

[mysqld]
binlog_format = ROW

Note that changing the global binlog format does not affect the current session’s format. Each session that was open before the change continues using the old format until reconnected.

In Practice

The MySQL 8.0 Reference Manual, in the chapter “Binary Logging Formats,” explicitly documents the non-deterministic function risk in STATEMENT mode and lists the categories of unsafe statements. The change from STATEMENT to ROW as the MySQL 8.0 default is documented in the MySQL 8.0 release notes and the replication chapter of the manual.

The binlog size growth with ROW format is documented behavior: the MySQL documentation notes that ROW format generates more log data for statements that modify many rows, particularly for bulk DELETE, UPDATE, and INSERT…SELECT operations. The practical implication is that teams migrating from STATEMENT to ROW should audit their batch operations and ensure binlog retention and disk capacity accounts for the larger volume.

Where It Breaks

ScenarioWhat breaksWhy
STATEMENT with non-deterministic functionsReplica silently diverges from primaryDifferent values for UUID, RAND, NOW on re-execution
ROW format with bulk multi-row operationsBinlog grows very large; replication bandwidth spikesOne row image written per changed row
MIXED with complex stored procedures or triggersUnsafe pattern not detected; falls back to STATEMENTMySQL’s unsafe-detection does not cover all trigger and procedure edge cases

What to Do Next

  • Problem: STATEMENT format silently breaks replica consistency when any non-deterministic function appears in DML, and the divergence is committed before the error is visible.
  • Solution: Set binlog_format = ROW in the MySQL configuration for all production servers; MySQL 8.0 defaults to this already.
  • Proof: Check SELECT @@binlog_format on all replicas and the primary; run SHOW REPLICA STATUS and verify Seconds_Behind_Source stays near zero after the format change.
  • Action: This week, run SELECT @@binlog_format on every MySQL instance in production. For any instance running STATEMENT or MIXED, review whether non-deterministic functions appear in the application’s DML patterns before the next major version upgrade.

ROW format is not a performance optimization — it is a correctness requirement for any workload that uses non-deterministic SQL. The binlog size cost is real but manageable. Replica divergence is not.