MySQL Manual | 4.10.9 Troubleshooting Replication

4.10.9 Troubleshooting Replication

If you have followed the instructions, and your replication setup is not working, first check the following:

Is the master logging to the binary log? Check with SHOW MASTER STATUS. If it is, Position will be non-zero. If not, verify that the master was started with the log-bin option and that server-id is set.
Is the slave running? Do SHOW SLAVE STATUS and check that Slave_IO_Running and Slave_SQL_Running are both "Yes". If not, verify the options the slave was started with.
Check the error log for messages. Many users have lost time by not doing this early enough.
If the slave is running, did it establish a connection with the master? Do SHOW PROCESSLIST, find the I/O and SQL threads (see section 4.10.3 Replication Implementation Details to see how they display), and check their State column. If it says Connecting to master, verify the privileges for the replication user on the master, the master host name, your DNS setup, whether the master is actually running, and whether it is reachable from the slave.
If the slave was running, but then stopped: this usually happens when some query that succeeded on the master fails on the slave. This should never happen if you have taken a proper snapshot of the master and have never modified the data on the slave outside of the slave thread. If it does, it is a bug; read below on how to report it.
If a query that succeeded on the master refuses to run on the slave, and a full database resync (that is, deleting the slave's database and copying a new snapshot from the master) does not seem feasible, try the following:
- First see if the slave's table was different from the master's. Understand how it happened (it may be a bug: read the Changelogs in the online MySQL manual http://www.mysql.com/documentation to check if this is a known bug and if it is fixed yet). Then make the slave's table identical to the master's and run SLAVE START.
- If the above does not work or does not apply, try to understand if it would be safe to make the update manually (if needed) and then ignore the next query from the master.
- If you have decided you can skip the next query, do SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; SLAVE START; to skip a query that does not use AUTO_INCREMENT or LAST_INSERT_ID(), or SET GLOBAL SQL_SLAVE_SKIP_COUNTER=2; SLAVE START; otherwise. The reason queries that use AUTO_INCREMENT or LAST_INSERT_ID() are different is that they take two events in the binary log of the master.
- Make sure you are not running into an old bug by upgrading to the most recent version.
- If you are sure the slave started out perfectly in sync with the master, and no one has updated the tables involved outside of the slave thread, report the bug.
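
As a rough illustration of the slave status check and the skip step described above, the following shell sketch defines two helpers. The function names slave_threads_ok and skip_next_query_sql are hypothetical and are not part of MySQL. The first assumes it is fed the vertical (\G) output of SHOW SLAVE STATUS as "Field: value" lines; the second only prints the statements quoted in the text, so that you can review them before running them on the slave.

```shell
# Reads the output of SHOW SLAVE STATUS\G on stdin.
# Exits 0 only if both replication threads report "Yes".
# Assumed usage on the slave (credentials taken from your option file):
#   mysql -e 'SHOW SLAVE STATUS\G' | slave_threads_ok
slave_threads_ok() {
    awk -F': *' '
        {
            # Field names are padded with spaces in the vertical format.
            name = $1; value = $2
            gsub(/^[ \t]+|[ \t\r]+$/, "", name)
            gsub(/^[ \t]+|[ \t\r]+$/, "", value)
        }
        name == "Slave_IO_Running"  { io = value }
        name == "Slave_SQL_Running" { sql = value }
        END {
            printf "Slave_IO_Running=%s Slave_SQL_Running=%s\n", io, sql
            if (io == "Yes" && sql == "Yes") exit 0
            exit 1
        }
    '
}

# Prints the statements that skip the next query from the master.
# $1 = "yes" if that query uses AUTO_INCREMENT or LAST_INSERT_ID()
#      (two events in the binary log); anything else means one event.
skip_next_query_sql() {
    if [ "$1" = "yes" ]; then count=2; else count=1; fi
    printf 'SET GLOBAL SQL_SLAVE_SKIP_COUNTER=%d; SLAVE START;\n' "$count"
}
```

Printing the statements instead of executing them is a deliberate choice in this sketch: skipping a query is only safe after you have decided, as described above, that the slave's data will still match the master's.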

When you have determined that there is no user error involved, and replication still either does not work at all or is unstable, it is time to send us a bug report. We need to get as much information as possible from you to be able to track down the bug. Please do spend some time and effort preparing a good bug report.

If you have a repeatable way to demonstrate the bug, use mysqlbug to prepare a bug report and enter it into our bugs database at http://bugs.mysql.com/. If you have a phantom problem (one that does occur but that you cannot duplicate "at will"), which fortunately is rare, do the following:

Verify that there is no user error involved. For example, if you update the slave outside of the slave thread, the data will be out of sync, and you can have unique key violations on updates, in which case the slave thread will stop and wait for you to clean up the tables manually to bring them in sync.
Run the slave with log-slave-updates and log-bin; this keeps a log of all updates on the slave.
Save all evidence before resetting the replication. If we have no information or only sketchy information, it will take us a while to track down the problem. The evidence you should collect is:
- All binary logs on the master
- All binary logs on the slave
- The output of SHOW MASTER STATUS on the master at the time you discovered the problem
- The output of SHOW SLAVE STATUS on the slave at the time you discovered the problem
- Error logs on the master and on the slave
Use mysqlbinlog to examine the binary logs. For example, the following should help you find the problem query:
```
mysqlbinlog -j pos_from_slave_status /path/to/log_from_slave_status | head
```
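
The evidence list above could be gathered with something like the following sketch, run once on the master and once on the slave. The function name save_replication_evidence and all paths passed to it are hypothetical. It assumes that the binary logs sit in one directory, that they are named after the log-bin base name with a numeric extension, and that an index file named after the base name with an .index extension may sit next to them.

```shell
# Copies binary logs and the error log into an evidence directory.
# $1 = directory holding the binary logs (assumed layout)
# $2 = binary log base name, that is, the value given to log-bin
# $3 = path to the error log
# $4 = destination directory (created if missing)
save_replication_evidence() {
    logdir=$1; base=$2; errlog=$3; dest=$4
    mkdir -p "$dest" || return 1
    # Binary logs carry a numeric extension; the index file does not.
    cp -p "$logdir/$base".[0-9]* "$dest"/ || return 1
    if [ -f "$logdir/$base.index" ]; then
        cp -p "$logdir/$base.index" "$dest"/ || return 1
    fi
    cp -p "$errlog" "$dest"/ || return 1
}

# Assumed follow-up, not run by the function (output file names are
# illustrative; credentials are taken from your option file):
#   on the master: mysql -e 'SHOW MASTER STATUS\G' > "$dest/master_status.txt"
#   on the slave:  mysql -e 'SHOW SLAVE STATUS\G'  > "$dest/slave_status.txt"
```

The function copies the files instead of moving them, so the servers keep their logs and replication can still be inspected in place before it is reset.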

Once you have collected the evidence on the phantom problem, try hard to isolate it into a separate test case first. Then enter the problem into our bugs database at http://bugs.mysql.com/ with as much information as possible.
