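I recently set up MySQL master-slave replication and ran into a small problem along the way. I tracked it down and fixed it; these are my notes.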

1
Environment:

OS:
centos6.5
Linux host2 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Server IPs:
192.168.18.66
192.168.18.67

DB:
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.6.20 |
+-----------+

2
Master: 192.168.18.66
Slave: 192.168.18.67

3
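Edit the master server's configuration and add the following (note that replicate-do-db is a slave-side filter; on a server that does not itself replicate from another master it has no effect):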
server-id=10
log-bin=mysql-bin
binlog-ignore-db=mysql
binlog-ignore-db=information_schema
binlog-ignore-db=performance_schema
replicate-do-db=reptest

The master's configuration file /etc/my.cnf now reads:
[client]
#password = system
#port = 3306
default-character-set=utf8

[mysqld]

server-id=10
log-bin=mysql-bin
binlog-ignore-db=mysql
binlog-ignore-db=information_schema
binlog-ignore-db=performance_schema
replicate-do-db=reptest

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

port=3306
character_set_server=utf8
character_set_client=utf8
collation-server=utf8_general_ci
lower_case_table_names=1
max_connections=500

[mysql]
default-character-set=utf8

4
Edit the slave server's configuration:
server-id=20
relay_log=mysql-relay-bin
read_only

The slave's configuration file now reads:
[client]
#password=system
#port=3306
default-character-set=utf8

[mysqld]

server-id=20
relay_log=mysql-relay-bin
#read_only
#log_slave_updates=1

#master-host=192.168.18.66
#master-user=repl
#master-password=123
#master-port=3306
#master-connect-retry=60
#replicate_do_db=reptest
#replicate_ignore_db=mysql,information_schema,performance_schema

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
port=3306
character_set_server=utf8
character_set_client=utf8
collation-server=utf8_general_ci
lower_case_table_names=1
max_connections=500

[mysql]
default-character-set=utf8

For the replication-related MySQL options, see the following page:
http://dev.mysql.com/doc/refman/5.5/en/replication-options-slave.html

5
Create the replication user on the master:
mysql> grant replication slave on *.* to 'repl'@'%' identified by '123456';
flush privileges;


192.168.18.67 is the slave; it connects as user repl, with the password set in the grant above, to replicate.

mysql> select host,user,Repl_slave_priv from mysql.user where user='repl';
+---------------+------+-----------------+
| host | user | Repl_slave_priv |
+---------------+------+-----------------+
| 192.168.18.67 | repl | Y |
+---------------+------+-----------------+
1 row in set (0.00 sec)

6
Restart the master and slave servers.
Stop the master, then the slave:
mysqladmin -uroot shutdown -psystem
Start the slave, then the master:
/etc/init.d/mysql start

# /etc/init.d/mysql start
Starting MySQL.. SUCCESS!

7
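Export the master's data (take a snapshot)
1) Lock the master. Keep this session open until the dump finishes; the lock is released as soon as the client that took it disconnects.
flush tables with read lock;

2)
This step matters: note the File and Position values, which are needed later when starting the slave threads on the slave server.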
> show master status \G
*************************** 1. row ***************************
File: mysql-bin.000002
Position: 401
Binlog_Do_DB:
Binlog_Ignore_DB: mysql,information_schema,performance_schema
Executed_Gtid_Set:
1 row in set (0.00 sec)
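
Copying the File and Position values by hand is easy to get wrong, so here is a minimal sketch of pulling them out of the output above with awk. The function name parse_master_status is hypothetical (it is not part of MySQL), and the sketch assumes the `\G` layout shown above, one "Name: value" pair per line.

```shell
#!/usr/bin/env bash
# Read `SHOW MASTER STATUS \G` output on stdin and print "<file> <position>".
# Exits non-zero if either value is missing.
parse_master_status() {
    awk -F': *' '
        $1 ~ /^[ \t]*File$/     { file = $2 }
        $1 ~ /^[ \t]*Position$/ { pos  = $2 }
        END {
            if (file == "" || pos == "") exit 1
            print file, pos
        }
    '
}

# Possible usage (needs a reachable server, so it is left commented out):
# read -r file pos < <(mysql -uroot -p -e 'SHOW MASTER STATUS\G' | parse_master_status)
```

Run this while the read lock from sub-step 1 is still held, otherwise the coordinates may no longer match the dump.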

3)
# mysqldump -uroot -p reptest --triggers --routines --events > /home/zxw/master_reptest.sql

For reference, the dump file looks like this:
# ll /home/zxw/
total 4
-rw-r--r--. 1 root root 1910 Aug 25 13:50 master_reptest.sql
# nl /home/zxw/master_reptest.sql
1 -- MySQL dump 10.13 Distrib 5.6.20, for Linux (x86_64)
2 --
3 -- Host: localhost Database: reptest
4 -- ------------------------------------------------------
5 -- Server version 5.6.20-log

6 /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
7 /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
8 /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
9 /*!40101 SET NAMES utf8 */;
10 /*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
11 /*!40103 SET TIME_ZONE='+00:00' */;
12 /*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
13 /*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
14 /*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
15 /*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

16 --
17 -- Table structure for table `tbldata`
18 --

19 DROP TABLE IF EXISTS `tbldata`;
20 /*!40101 SET @saved_cs_client = @@character_set_client */;
21 /*!40101 SET character_set_client = utf8 */;
22 CREATE TABLE `tbldata` (
23 `id` int(11) DEFAULT NULL
24 ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
25 /*!40101 SET character_set_client = @saved_cs_client */;

26 --
27 -- Dumping data for table `tbldata`
28 --

29 LOCK TABLES `tbldata` WRITE;
30 /*!40000 ALTER TABLE `tbldata` DISABLE KEYS */;
31 INSERT INTO `tbldata` VALUES (1),(2),(3);
32 /*!40000 ALTER TABLE `tbldata` ENABLE KEYS */;
33 UNLOCK TABLES;

34 --
35 -- Dumping events for database 'reptest'
36 --

37 --
38 -- Dumping routines for database 'reptest'
39 --
40 /*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

41 /*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
42 /*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
43 /*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
44 /*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
45 /*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
46 /*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
47 /*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

48 -- Dump completed on 2014-08-25 13:50:48

4)
Unlock the database:
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)

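##########################################
### Copying the data directory instead ###
##########################################
# A second way to take a snapshot of the master database:
#mysqladmin -uroot shutdown
# Archive the data directory, for example /data/dbdata:
#cd /data
#tar zcvf dbdata.tar.gz dbdata
# Once the backup is done, the master can be started again:
#mysqld_safe --user=mysql &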

8
Restore the master snapshot on the slave server
1)
Create the database on the slave:
mysql> create database reptest;
Query OK, 1 row affected (0.00 sec)
2)
Copy the dump file to the slave server:
# scp :/home/zxw/master_reptest.sql /home/zxw/
3)
Import the master snapshot into the slave:
# mysql -uroot -psystem reptest < /home/zxw/master_reptest.sql
Warning: Using a password on the command line interface can be insecure.
4)
Verify:
# mysql -uroot -psystem
mysql> use reptest;
mysql>
mysql> show tables
-> ;
+-------------------+
| Tables_in_reptest |
+-------------------+
| tbldata |
+-------------------+
1 row in set (0.00 sec)

mysql> select * from tbldata;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
+------+
3 rows in set (0.00 sec)

mysql>

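##########################################
### Copying the data directory instead ###
##########################################
# Importing from a copy of the data directory
# The slave's data directory is swapped for the master's, so stop the service first:
#mysqladmin -uroot shutdown
# Back up the existing data directory:
#mv dbdata dbdata.bak
# Unpack the data directory copied from the master:
#tar zxvf dbdata.tar.gz
# Make sure file permissions and ownership are correct; the dbdata directory should be owned by mysql:mysql.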

9
1)
On the slave server, point it at the master to start synchronising data:
mysql> Change master to Master_host = '192.168.18.66', Master_port = 3306, Master_user = 'repl', Master_password = '123456', Master_log_file = 'mysql-bin.000002', Master_log_pos = 401;
Query OK, 0 rows affected, 2 warnings (0.11 sec)
mysql>
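This statement carries the master's address and port, the replication account provided by the master, and the master's binlog position. Master_log_file and Master_log_pos are the master's snapshot coordinates (the values seen in step 7, sub-step 2); the slave starts replicating from that position in that binlog.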
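
As an illustration of how those pieces fit together, the sketch below assembles the statement from its six inputs. build_change_master is a hypothetical helper, not a MySQL tool; it only prints text, and it assumes the values contain no quote characters, since it does no SQL escaping.

```shell
#!/usr/bin/env bash
# Print a CHANGE MASTER TO statement for the given connection details and
# binlog coordinates. Arguments: host port user password log_file log_pos.
build_change_master() {
    local host=$1 port=$2 user=$3 password=$4 log_file=$5 log_pos=$6
    # The position must be a plain non-negative integer.
    case $log_pos in
        ''|*[!0-9]*) echo "invalid log position: $log_pos" >&2; return 1 ;;
    esac
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%s, MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$host" "$port" "$user" "$password" "$log_file" "$log_pos"
}

# Possible usage with the values from this walkthrough:
# build_change_master 192.168.18.66 3306 repl 123456 mysql-bin.000002 401
```

Rejecting a non-numeric position early guards against pasting the file name and position in the wrong order.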

2)
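Start the slave threads to begin replicating:
start slave;
Once the slave has started replicating, two files appear in its data directory: master.info and relay-log.info. The slave uses them to keep track of how much of the master's binlog it has processed.
Run show processlist on both the master and the slave; seeing the replication user's connection there confirms that replication is working.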


Slave:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.18.66
Master_User: usrep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000004
Read_Master_Log_Pos: 1264
Relay_Log_File: mysql-relay-bin.000021
Relay_Log_Pos: 283
Relay_Master_Log_File: mysql-bin.000004
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1264
Relay_Log_Space: 1075
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_UUID: c03d6252-2a2f-11e4-9b48-000c291888ce
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
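
The two fields that matter most in this output are Slave_IO_Running and Slave_SQL_Running; both must be Yes. A minimal sketch of checking that automatically follows. The name slave_is_healthy is hypothetical, and the check deliberately ignores other fields such as Seconds_Behind_Master.

```shell
#!/usr/bin/env bash
# Read `SHOW SLAVE STATUS \G` output on stdin; succeed only if both
# replication threads report "Yes".
slave_is_healthy() {
    awk -F': *' '
        {
            key = $1; val = $2
            sub(/^[ \t]+/, "", key)     # \G output right-aligns field names
            sub(/[ \t\r]+$/, "", val)   # drop trailing whitespace
        }
        key == "Slave_IO_Running"  { io  = val }
        key == "Slave_SQL_Running" { sql = val }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Possible usage (needs a reachable server, so it is left commented out):
# mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | slave_is_healthy && echo "replication OK"
```

Matching the field name exactly keeps Slave_SQL_Running_State from being mistaken for Slave_SQL_Running.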


Master: the replication user's Binlog Dump connection is visible:
mysql> show processlist \G
*************************** 2. row ***************************
Id: 11
User: usrep
Host: 192.168.18.67:48746
db: NULL
Command: Binlog Dump
Time: 179
State: Master has sent all binlog to slave; waiting for binlog to be updated
Info: NULL
2 rows in set (0.00 sec)

Slave server:
Relevant files in the data directory:
# ll /var/lib/mysql/
-rw-rw----. 1 mysql mysql 128 Aug 28 11:32 master.info
-rw-rw----. 1 mysql mysql 59 Aug 28 11:32 relay-log.info

-rw-rw----. 1 mysql mysql 792 Aug 28 11:32 mysql-relay-bin.000020
-rw-rw----. 1 mysql mysql 283 Aug 28 11:32 mysql-relay-bin.000021
-rw-rw----. 1 mysql mysql 50 Aug 28 11:32 mysql-relay-bin.index

Master server:
Relevant files in the data directory:
-rw-rw----. 1 mysql mysql 1036 Aug 28 09:32 mysql-bin.000003
-rw-rw----. 1 mysql mysql 1264 Aug 28 11:04 mysql-bin.000004
-rw-rw----. 1 mysql mysql 76 Aug 28 09:32 mysql-bin.index

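That completes the setup.

Here are the problems I ran into during configuration:

Problem 1
1
Right after the setup came up, something was wrong: Slave_IO_Running: Connecting, meaning the I/O thread had not managed to connect to the master server process.
mysql> show slave status \G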
*************************** 1. row ***************************
Slave_IO_State: Connecting to master
Master_Host: 192.168.18.66
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 401
Relay_Log_File: host2-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 401
Relay_Log_Space: 120
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 2003
Last_IO_Error: error connecting to master ':3306' - retry-time: 60 retries: 1
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 0
Master_UUID:
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 140825 14:29:05
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)

Problem 1
2
The error log contains the following:
# tail -n 30 /var/lib/mysql/host2.err
2014-08-27 17:04:37 2384 [ERROR] Slave I/O: error connecting to master ':3306' - retry-time: 60 retries: 1, Error_code: 2003
2014-08-27 17:04:37 2384 [Warning] Slave SQL: If a crash happens this configuration does not guarantee that the relay log info will be consistent, Error_code: 0
2014-08-27 17:04:37 2384 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000003' at position 120, relay log './mysql-relay-bin.000001' position: 4
2014-08-27 17:05:12 2384 [Note] Error reading relay log event: slave SQL thread was killed
2014-08-27 17:05:12 2384 [Note] Slave I/O thread killed while connecting to master
2014-08-27 17:05:12 2384 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000003', position 120

Problem 1
3
I created a fully privileged user on the master and used it for replication from the slave; the result was the same.
Master:
mysql> grant all on *.* to 'usrep'@'%' identified by '123456';

Slave:
mysql> Change master to Master_host = '192.168.18.66', Master_port = 3306, Master_user = 'usrep', Master_password = '123456', Master_log_file = 'mysql-bin.000002', Master_log_pos = 401;
Starting the slave threads with usrep on the slave made no difference.

Problem 1
4
On the master, mysql -uusrep -p logs in to the master database successfully.
From the slave, mysql -h 192.168.18.66 -uusrep -p fails to reach the master database.
# mysql -h 192.168.18.66 -uroot -psystem
Warning: Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.18.66' (113)
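
The (113) in this error is the Linux errno EHOSTUNREACH ("No route to host"), which points at the network or a firewall rather than at MySQL credentials. A quick way to confirm that without involving MySQL at all is to probe the TCP port directly. The sketch below is one way to do that; port_open is a hypothetical helper, and it assumes bash (for the /dev/tcp pseudo-device) and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# Succeed if a TCP connection to host:port can be opened within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so it needs bash, not plain sh.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Possible usage from the slave:
# port_open 192.168.18.66 3306 && echo "reachable" || echo "blocked or down"
```

If the probe fails from the slave but succeeds on the master itself, the next place to look is the master's firewall.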

Problem 1
5
Check iptables on the master:
# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited


Chain FORWARD (policy ACCEPT)
target prot opt source destination
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Problem 1
6
Open /etc/sysconfig/iptables (the path varies by operating system). It contains:
# nl /etc/sysconfig/iptables
1 # Firewall configuration written by system-config-firewall
2 # Manual customization of this file is not recommended.
3 *filter
4 :INPUT ACCEPT [0:0]
5 :FORWARD ACCEPT [0:0]
6 :OUTPUT ACCEPT [0:0]
7 -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
8 -A INPUT -p icmp -j ACCEPT
9 -A INPUT -i lo -j ACCEPT
10 -A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
11 -A INPUT -j REJECT --reject-with icmp-host-prohibited
12 -A FORWARD -j REJECT --reject-with icmp-host-prohibited
13 COMMIT


Edit the file and add one line to open TCP port 3306. It has to go before the REJECT rule, because rules are evaluated in order:
-A INPUT -m state --state NEW -m tcp -p tcp --dport 3306 -j ACCEPT
The result:
# nl /etc/sysconfig/iptables
1 # Firewall configuration written by system-config-firewall
2 # Manual customization of this file is not recommended.
3 *filter
4 :INPUT ACCEPT [0:0]
5 :FORWARD ACCEPT [0:0]
6 :OUTPUT ACCEPT [0:0]
7 -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
8 -A INPUT -p icmp -j ACCEPT
9 -A INPUT -i lo -j ACCEPT
10 -A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
11 -A INPUT -m state --state NEW -m tcp -p tcp --dport 3306 -j ACCEPT
12 -A INPUT -j REJECT --reject-with icmp-host-prohibited
13 -A FORWARD -j REJECT --reject-with icmp-host-prohibited
14 COMMIT
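
Since iptables stops at the first matching rule, an ACCEPT placed after the catch-all REJECT would never be reached. Here is a small sketch that checks the ordering in a rules file. accept_before_reject is a hypothetical helper; it expects the raw file (not the nl-numbered listing) on stdin and only looks at the INPUT chain. It also reports failure when the file contains neither rule, even though such a chain with an ACCEPT policy would let the traffic through.

```shell
#!/usr/bin/env bash
# Read an iptables rules file on stdin; succeed only if an ACCEPT rule for
# the given TCP destination port appears before the first REJECT rule in
# the INPUT chain.
accept_before_reject() {
    local port=$1
    awk -v port="$port" '
        $1 == "-A" && $2 == "INPUT" {
            if ($0 ~ /-j REJECT/) exit    # first REJECT reached, stop looking
            if ($0 ~ ("--dport " port " ") && $0 ~ /-j ACCEPT/) { ok = 1; exit }
        }
        END { exit !ok }
    '
}

# Possible usage on the master:
# accept_before_reject 3306 < /etc/sysconfig/iptables && echo "3306 is open"
```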

Restart the iptables service:
# /etc/init.d/iptables restart

List the current iptables rules:
# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:mysql
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target prot opt source destination
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Problem 1
7
Start the slave threads on the slave again; the problem is resolved:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.18.66
Master_User: usrep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000004
Read_Master_Log_Pos: 1264
Relay_Log_File: mysql-relay-bin.000021
Relay_Log_Pos: 283
Relay_Master_Log_File: mysql-bin.000004
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1264
Relay_Log_Space: 1075
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_UUID: c03d6252-2a2f-11e4-9b48-000c291888ce
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)

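Master-slave replication is now set up, and testing confirms it runs normally.

Problem 2
I wanted to make the slave database read-only. I added the read_only parameter to the configuration file, tried various values, and stopped and started the slave many times, but it never had the expected effect: I could still connect to the slave directly and insert, update, and delete.
This one is quite strange; I will try again another day.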
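
One possible explanation, offered as an assumption rather than something these notes confirm: in MySQL 5.6, read_only does not restrict accounts that hold the SUPER privilege (nor the replication threads), and the sessions shown earlier log in as root. Testing with an ordinary account would tell the two cases apart. The sketch below checks SHOW GRANTS output for that privilege. has_super is a hypothetical helper that works purely on the text of the grants.

```shell
#!/usr/bin/env bash
# Read `SHOW GRANTS` output on stdin; succeed if the account holds SUPER
# (or ALL PRIVILEGES) on *.*, which would exempt it from read_only.
has_super() {
    grep -Eiq 'GRANT (.*[ ,])?(SUPER|ALL PRIVILEGES)([ ,].*)? ON \*\.\* TO'
}

# Possible usage on the slave (needs a reachable server, so commented out):
# mysql -uroot -p -N -e 'SHOW GRANTS' | has_super && echo "read_only will not apply to this account"
```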

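Disaster recovery

Master and slave out of sync
If the master and slave drift out of sync, replication has to be set up again. The steps are the same as above, minus editing the configuration files and creating the user.
Before reconfiguring, stop the replication threads on the slave: stop slave;

Recovering from the slave
If the master goes down, the slave can be promoted to master and the original master reused as the standby.
First stop the replication threads on the slave:
stop slave;
Add the replication user on the slave (the host part is the IP of the machine that will now replicate from it, i.e. the original master):
grant replication slave on *.* to repl@'slave_server_ip' identified by '123456';
flush privileges;
The server-id in my.cnf can stay as it is, as long as the ids do not clash. The promoted server does need binary logging (log-bin) enabled, which the slave configuration shown earlier does not set.

Then follow the master-slave replication steps above.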

-----------------

Please credit the source when reposting:
blog.csdn.net/beiigang