Pythian Blog: Technical Track

pt-online -schema corner case: When you don’t want to replicate your alter activities across the whole cluster.

Intro

When we want to perform a heavy ALTER TABLE against  a several gigabytes table the options are quite a few but the one I tend to use the most is pt-online-schema-change (pt-osc). This tool is very mature and solid and helped me in the past with Galera Cluster, standalone and regular asynchronous MySQL topologies for many years. This time I encountered something that it’s worth keeping in mind when we’re performing a given DDL command in a regular primary-secondary cluster with one requisite: do NOT replicate this ALTER TABLE among the cluster. 

The best practice in these situations is to perform any DDL change in the replica, then perform the proper promotion/demotion of servers and continue with the ALTER operations in the new replicas. Now, in this particular case, instance’s promotion was not an option as the replicas had a hard role designed as backup servers so any DDL would have to happen keeping the original roles.

Findings

In the case of an ALTER TABLE of this nature to be performed with pt-osc the natural way to go is this:

./pt-online-schema-change --alter="FORCE" --statistics --print --recursion-method=none --execute --set-vars sql_log_bin=OFF

Executing the above command will perform the command without any issue and it won’t replicate any temporary table or trigger creation so the alter will run isolated on the target master.. Right? Well, no. 

Whether it’s true that no command like above will be replicated, there is a caveat here: Any query created by the trigger itself will be inserted in the binlog as part of the original event. This is, in case of an INSERT-statement happening in the original table during the pt-osc execution will create a single event that will contain the original statement plus the INSERT-statement on the temporary table identified by _OriginalTableName_new since the variable defined only affects to the active session and not the whole instance.

The problem here is obvious, as soon as that transaction comes into the replica it will break complaining about trying to apply an event in a non existing table (the temporary one as it’s not been replicated thanks to our –set-vars sql_log_bin=OFF) and skipping this event would skip the original DML query too.

Not everything is lost, keep reading…

Testing

Let’s perform a full test to prove my words. For this test i have a dummy script just writing data  in the background and the pt-osc version is 3.3.1

  1. Binlog format
    mysql> show global variables like 'binlog_format';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | binlog_format | ROW   |
    +---------------+-------+
    1 row in set (0.00 sec)
    

     

  2. The primary status
    mysql> show master status\G
    *************************** 1. row ***************************
                 File: mysql-bin.000003
             Position: 166814
         Binlog_Do_DB: 
     Binlog_Ignore_DB: 
    Executed_Gtid_Set: 
    1 row in set (0.00 sec)
    

     

  3. The replica status
    mysql> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: mysql57-01
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mysql-bin.000003
              Read_Master_Log_Pos: 166814
                   Relay_Log_File: mysql57-02-relay-bin.000008
                    Relay_Log_Pos: 167027
            Relay_Master_Log_File: mysql-bin.000003
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    …/…
    

     

  4. Pt-osc execution
    root@mysql57-01:~# ./pt-online-schema-change --alter="FORCE" --statistics --print --recursion-method=none --execute --set-vars sql_log_bin=OFF h=localhost,D=dummydb,t=dummyt,u=root,p=letmein
    No slaves found.  See --recursion-method if host mysql57-01 has slaves.
    Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
    .../...
    2022-06-24T13:22:08 Dropped triggers OK.
    # Event              Count
    # ================== =====
    # INSERT                 1
    # mysql_warning_1062     1
    Successfully altered `dummydb`.`dummyt`.
    

     

  5. New replica status
    mysql> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: mysql57-01
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mysql-bin.000003
              Read_Master_Log_Pos: 1192694
                   Relay_Log_File: mysql57-02-relay-bin.000008
                    Relay_Log_Pos: 888957
            Relay_Master_Log_File: mysql-bin.000003
                 Slave_IO_Running: Yes
                Slave_SQL_Running: No
    …/…
                       Last_Errno: 1146
                       Last_Error: Error executing row event: 'Table 'dummydb._dummyt_new' doesn't exist'
    …/…
    

     

  6. Conflictive binlog entry (example)
    BEGIN
    /*!*/;
    # at 889320
    #220624 13:22:08 server id 1  end_log_pos 889378 CRC32 0x648d1d8b       Table_map: `dummydb`.`dummyt` mapped to number 233
    # at 889378
    #220624 13:22:08 server id 1  end_log_pos 889441 CRC32 0xa4727bb7       Table_map: `dummydb`.`_dummyt_new` mapped to number 230
    # at 889441
    #220624 13:22:08 server id 1  end_log_pos 889493 CRC32 0x10a18dfd       Write_rows: table id 233
    # at 889493
    #220624 13:22:08 server id 1  end_log_pos 889545 CRC32 0x3ed50ab7       Write_rows: table id 230 flags: STMT_END_F
    ### INSERT INTO `dummydb`.`dummyt`
    ### SET
    ###   @1=51
    ###   @2='89'
    ###   @3=1656076928
    ###   @4=338
    ### INSERT INTO `dummydb`.`_dummyt_new`
    ### SET
    ###   @1=51
    ###   @2='89'
    ###   @3=1656076928
    ###   @4=338
    # at 889545
    #220624 13:22:08 server id 1  end_log_pos 889576 CRC32 0x855d1d3f       Xid = 6257
    COMMIT/*!*/;
    

     

  7. Count of undesirable queries in binlog
    root@mysql57-01:~# mysqlbinlog --base64-output=decode-rows --verbose /var/log/mysql/mysql-bin.000003 > /tmp/bin03
    root@mysql57-01:~# cat /tmp/bin03 | grep "_new" | wc -l
    74
    

     

It’s clear that our triggers are writing data that shouldn’t be there. Let’s change the binlog format to STATEMENT:

  1. Binlog format
    mysql> show global variables like 'binlog_format';
    +---------------+-----------+
    | Variable_name | Value     |
    +---------------+-----------+
    | binlog_format | STATEMENT |
    +---------------+-----------+
    1 row in set (0.00 sec)
    

     

  2. The master status
    mysql> show master status\G
    *************************** 1. row ***************************
                 File: mysql-bin.000006
             Position: 154
         Binlog_Do_DB: 
     Binlog_Ignore_DB: 
    Executed_Gtid_Set: 
    1 row in set (0.00 sec)
    

     

  3. The replica status
    mysql> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: mysql57-01
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mysql-bin.000006
              Read_Master_Log_Pos: 154
                   Relay_Log_File: mysql57-02-relay-bin.000014
                    Relay_Log_Pos: 367
            Relay_Master_Log_File: mysql-bin.000006
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    …/…
    

     

  4. Pt-osc execution
    root@mysql57-01:~# ./pt-online-schema-change --alter="FORCE" --statistics --print --recursion-method=none --execute --set-vars sql_log_bin=OFF h=localhost,D=dummydb,t=dummyt,u=root,p=letmein
    No slaves found.  See --recursion-method if host mysql57-01 has slaves.
    Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
    …/…
    2022-06-24T13:37:31 Dropped triggers OK.
    # Event  Count
    # ====== =====
    # INSERT     1
    Successfully altered `dummydb`.`dummyt`.
    

     

  5. New replica status
    mysql> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: mysql57-01
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mysql-bin.000006
              Read_Master_Log_Pos: 332460
                   Relay_Log_File: mysql57-02-relay-bin.000014
                    Relay_Log_Pos: 332673
            Relay_Master_Log_File: mysql-bin.000006
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    …/…
    

     

  6. Count of undesirable queries in binlog
    root@mysql57-01:~# mysqlbinlog --base64-output=decode-rows --verbose /var/log/mysql/mysql-bin.000006 > /tmp/bin06
    root@mysql57-01:~# cat /tmp/bin06 | grep "_new" | wc -l
    0
    

     

Summary

If you have the particular need of running pt-osc in such a way that the effects of it must be isolated and affecting only that server, ensure the binlog format is STATEMENT. Otherwise it will end up breaking replication and the solution will be to create all those temporary tables manually and populate them to ensure that any DML command gets properly applied and so the original one too, avoiding any data integrity issue. In any case, keep in mind that the recommended value for binlog format is ROW for consistency reasons as STATEMENT could lead to undeterministic binlog entries that could insert different values in replicas.

No Comments Yet

Let us know what you think

Subscribe by email