这个月初,通过zabbix监控发现有1台MySQL数据库的从库内存剩余空间不断降低。检查以往的监控历史图表,发现由于内存占用不断增大,每隔一个多月,就会因为内存严重不足,导致这台服务器的1个mysql实例(端口:3316)重启。数据库实例重启之后,内存被大量释放,但经过一个多月,又会因为不断占用的内存再次重启mysql实例。
创新互联是一家集网站建设,坡头企业网站建设,坡头品牌网站建设,网站定制,坡头网站建设报价,网络营销,网络优化,坡头网站推广为一体的创新建站企业,帮助传统企业提升企业形象加强企业竞争力。可充分满足这一群体相比中小企业更为丰富、高端、多元的互联网需求。同时我们时刻保持专业、时尚、前沿,时刻以成就客户成长自我,坚持不断学习、思考、沉淀、净化自己,让我们为更多的企业打造出实用型网站。
上图是这台服务器的zabbix监控图形,近半年来3次内存枯竭的时间分别是去年10月27日、12月25日、今年2月8日。
执行“top”命令,可以快速找到是那个mysql实例的进程PID:
[root@DB3 data1]# top top - 14:06:03 up 829 days, 5 min, 1 user, load average: 0.00, 0.02, 0.05 Tasks: 167 total, 1 running, 166 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.2 us, 0.1 sy, 0.0 ni, 99.5 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st KiB Mem: 32520424 total, 14101408 used, 18419016 free, 144168 buffers KiB Swap: 0 total, 0 used, 0 free. 1240256 cached Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3348 mysql 20 0 6048008 4.340g 7436 S 1.0 14.0 48:20.58 mysqld 1039 mysql 20 0 32.919g 5.942g 0 S 0.3 19.2 923:58.09 mysqld 18301 root 20 0 123692 1672 1176 R 0.3 0.0 0:00.02 top 1 root 20 0 41108 2384 1232 S 0.0 0.0 8:53.72 systemd 2 root 20 0 0 0 0 S 0.0 0.0 0:05.82 kthreadd 3 root 20 0 0 0 0 S 0.0 0.0 1:30.15 ksoftirqd/0 5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H 7 root rt 0 0 0 0 S 0.0 0.0 0:21.02 migration/0 8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh 9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/0 10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/1 11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/2 12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/3 13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/4 14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/5 15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/6 16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/7 17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/8 18 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/9 19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/10 20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/11 21 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/12 22 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/13 23 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/14 24 root 20 0 0 0 0 S 0.0 0.0 373:05.02 rcu_sched
执行“ps -ef|grep mysql”命令,就可以根据上面的PID,找到是3316端口的mysql实例:
[root@DB3 data1]# ps -ef|grep mysql mysql 1039 26090 0 2017 ? 15:23:58 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/mysql/data/ --plugin-dir=/usr/local/mysql/lib/mysql/plugin --user=mysql --log-error=/mysql/data/mysql-error.log --open-files-limit=65535 --pid-file=/mysql/data/mysql.pid --socket=/mysql/data/mysql.sock mysql 3348 5231 0 Mar15 ? 00:48:20 /usr/local/mysql/bin/mysqld --defaults-extra-file=/mysql/data1/my.cnf --basedir=/usr/local/mysql --datadir=/mysql/data1/ --plugin-dir=/usr/local/mysql/lib/mysql/plugin --user=mysql --log-error=/mysql/data1/mysql-error.log --open-files-limit=65535 --pid-file=/mysql/data1/mysql.pid --socket=/mysql/data1/mysql.sock --port=3316 root 5231 1 0 2016 ? 00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --defaults-extra-file=/mysql/data1/my.cnf --datadir=/mysql/data1/ --user=mysql root 18327 17987 0 14:06 pts/2 00:00:00 grep --color=auto mysql root 26090 1 0 2016 ? 00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/mysql/data/ --pid-file=/mysql/data/mysql.pid
3316这个实例是rabbitMQ持久化存储数据用的,最初我怀疑可能与持久化出问题有关。
但是,运维同事根据数据库的错误日志,认为是mysql的配置参数设置不当有关。
[root@DB3 ~]# cd /mysql/data1/ [root@DB3 data1]# ll -tr total 1607352 drwx------ 2 mysql mysql 6 Nov 28 2016 test drwx------ 2 mysql mysql 4096 Nov 28 2016 performance_schema drwx------ 2 mysql mysql 4096 Nov 28 2016 mysql -rwxr-xr-x 1 root root 127 Nov 28 2016 start_mysql_3316.sh -rw-rw---- 1 mysql mysql 56 Nov 28 2016 auto.cnf drwx------ 2 mysql mysql 4096 Nov 28 2016 activemq -rw-rw---- 1 mysql mysql 536870912 Mar 7 13:09 ib_logfile1 -rw-r--r-- 1 root root 2455 Mar 15 09:57 my.cnf -rw-rw---- 1 mysql mysql 266186542 Mar 15 18:05 mysql-bin.000012 -rw-rw---- 1 mysql mysql 60 Mar 15 18:05 mysql-bin.index srwxrwxrwx 1 mysql mysql 0 Mar 15 18:05 mysql.sock -rw-rw---- 1 mysql mysql 5 Mar 15 18:05 mysql.pid -rw-r----- 1 mysql root 24166 Mar 15 18:05 mysql-error.log -rw-rw---- 1 mysql mysql 17395510 Mar 19 13:49 mysql-bin.000013 -rw-rw---- 1 mysql mysql 79691776 Mar 19 13:49 ibdata1 -rw-rw---- 1 mysql mysql 536870912 Mar 19 13:49 ib_logfile0 -rw-rw---- 1 mysql mysql 204840084 Mar 19 13:49 mysql-slow.log [root@DB3 data1]# tail -90 mysql-error.log 2018-02-07 19:59:28 12413 [Note] InnoDB: Compressed tables use zlib 1.2.3 2018-02-07 19:59:28 12413 [Note] InnoDB: Using Linux native AIO 2018-02-07 19:59:28 12413 [Note] InnoDB: Using CPU crc32 instructions 2018-02-07 19:59:28 12413 [Note] InnoDB: Initializing buffer pool, size = 26.0G 2018-02-07 19:59:30 12413 [Note] InnoDB: Completed initialization of buffer pool 2018-02-07 19:59:30 12413 [Note] InnoDB: Highest supported file format is Barracuda. 2018-02-07 19:59:30 12413 [Note] InnoDB: The log sequence numbers 1625987 and 1625987 in ibdata files do not match the log sequence number 12657781043 in the ib_logfiles! 2018-02-07 19:59:30 12413 [Note] InnoDB: Database was not shutdown normally! 2018-02-07 19:59:30 12413 [Note] InnoDB: Starting crash recovery. 2018-02-07 19:59:30 12413 [Note] InnoDB: Reading tablespace information from the .ibd files... 2018-02-07 19:59:30 12413 [Note] InnoDB: Restoring possible half-written data pages 2018-02-07 19:59:30 12413 [Note] InnoDB: from the doublewrite buffer... InnoDB: 1 transaction(s) which must be rolled back or cleaned up InnoDB: in total 127441 row operations to undo InnoDB: Trx id counter is 10797056 InnoDB: Last MySQL binlog file position 0 793055160, file name mysql-bin.000011 2018-02-07 19:59:32 12413 [Note] InnoDB: 128 rollback segment(s) are active. InnoDB: Starting in background the rollback of uncommitted transactions 2018-02-07 19:59:32 7fd34afa8700 InnoDB: Rolling back trx with id 10003975, 127441 rows to undo InnoDB: Progress in percents: 12018-02-07 19:59:32 12413 [Note] InnoDB: Waiting for purge to start 2018-02-07 19:59:32 12413 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.27-76.0 started; log sequence number 12657781043 22018-02-07 19:59:33 12413 [Note] Recovering after a crash using /mysql/data1/mysql-bin 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 1002018-02-07 19:59:33 12413 [Note] InnoDB: Rollback of trx with id 10003975 completed 2018-02-07 19:59:33 7fd34afa8700 InnoDB: Rollback of non-prepared transactions completed 2018-02-07 19:59:51 12413 [Note] Starting crash recovery... 2018-02-07 19:59:51 12413 [Note] Crash recovery finished. 2018-02-07 19:59:51 12413 [Note] Server hostname (bind-address): '*'; port: 3316 2018-02-07 19:59:51 12413 [Note] IPv6 is available. 2018-02-07 19:59:51 12413 [Note] - '::' resolves to '::'; 2018-02-07 19:59:51 12413 [Note] Server socket created on IP: '::'. 2018-02-07 19:59:51 12413 [Warning] 'user' entry 'root@db3' ignored in --skip-name-resolve mode. 2018-02-07 19:59:51 12413 [Warning] 'user' entry '@db3' ignored in --skip-name-resolve mode. 2018-02-07 19:59:51 12413 [Warning] 'proxies_priv' entry '@ root@db3' ignored in --skip-name-resolve mode. 2018-02-07 19:59:51 12413 [Note] Event Scheduler: Loaded 0 events 2018-02-07 19:59:51 12413 [Note] /usr/local/mysql/bin/mysqld: ready for connections. Version: '5.6.27-76.0-log' socket: '/mysql/data1/mysql.sock' port: 3316 Source distribution 180315 18:04:56 mysqld_safe Number of processes running now: 0 180315 18:04:56 mysqld_safe mysqld restarted 2018-03-15 18:04:57 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). 2018-03-15 18:04:57 0 [Warning] 'ERROR_FOR_DIVISION_BY_ZERO' is deprecated and will be removed in a future release. 2018-03-15 18:04:57 0 [Warning] 'NO_ZERO_DATE' is deprecated and will be removed in a future release. 2018-03-15 18:04:57 0 [Warning] 'NO_ZERO_IN_DATE' is deprecated and will be removed in a future release. 2018-03-15 18:04:57 0 [Note] /usr/local/mysql/bin/mysqld (mysqld 5.6.27-76.0-log) starting as process 3348 ... 2018-03-15 18:04:57 3348 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead. 2018-03-15 18:04:57 3348 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead. 2018-03-15 18:04:57 3348 [Note] Plugin 'FEDERATED' is disabled. 2018-03-15 18:04:57 3348 [Warning] The option innodb (skip-innodb) is deprecated and will be removed in a future release 2018-03-15 18:04:57 3348 [Warning] The option innodb (skip-innodb) is deprecated and will be removed in a future release 2018-03-15 18:04:57 3348 [Note] InnoDB: Using atomics to ref count buffer pool pages 2018-03-15 18:04:57 3348 [Note] InnoDB: The InnoDB memory heap is disabled 2018-03-15 18:04:57 3348 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2018-03-15 18:04:57 3348 [Note] InnoDB: Memory barrier is not used 2018-03-15 18:04:57 3348 [Note] InnoDB: Compressed tables use zlib 1.2.3 2018-03-15 18:04:57 3348 [Note] InnoDB: Using Linux native AIO 2018-03-15 18:04:57 3348 [Note] InnoDB: Using CPU crc32 instructions 2018-03-15 18:04:57 3348 [Note] InnoDB: Initializing buffer pool, size = 2.0G 2018-03-15 18:04:57 3348 [Note] InnoDB: Completed initialization of buffer pool 2018-03-15 18:04:57 3348 [Note] InnoDB: Highest supported file format is Barracuda. 2018-03-15 18:04:57 3348 [Note] InnoDB: The log sequence numbers 1625987 and 1625987 in ibdata files do not match the log sequence number 12972128167 in the ib_logfiles! 2018-03-15 18:04:57 3348 [Note] InnoDB: Database was not shutdown normally! 2018-03-15 18:04:57 3348 [Note] InnoDB: Starting crash recovery. 2018-03-15 18:04:57 3348 [Note] InnoDB: Reading tablespace information from the .ibd files... 2018-03-15 18:04:58 3348 [Note] InnoDB: Restoring possible half-written data pages 2018-03-15 18:04:58 3348 [Note] InnoDB: from the doublewrite buffer... InnoDB: 1 transaction(s) which must be rolled back or cleaned up InnoDB: in total 103450 row operations to undo InnoDB: Trx id counter is 11109120 InnoDB: Last MySQL binlog file position 0 266186542, file name mysql-bin.000012 2018-03-15 18:04:59 3348 [Note] InnoDB: 128 rollback segment(s) are active. InnoDB: Starting in background the rollback of uncommitted transactions 2018-03-15 18:04:59 7eff9bd1d700 InnoDB: Rolling back trx with id 10797063, 103450 rows to undo InnoDB: Progress in percents: 12018-03-15 18:04:59 3348 [Note] InnoDB: Waiting for purge to start 2 3 42018-03-15 18:04:59 3348 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.27-76.0 started; log sequence number 12972128167 5 6 7 8 9 10 11 12 13 14 15 16 172018-03-15 18:04:59 3348 [Note] Recovering after a crash using /mysql/data1/mysql-bin 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 1002018-03-15 18:05:00 3348 [Note] InnoDB: Rollback of trx with id 10797063 completed 2018-03-15 18:05:00 7eff9bd1d700 InnoDB: Rollback of non-prepared transactions completed 2018-03-15 18:05:03 3348 [Note] Starting crash recovery... 2018-03-15 18:05:03 3348 [Note] Crash recovery finished. 2018-03-15 18:05:03 3348 [Note] Server hostname (bind-address): '*'; port: 3316 2018-03-15 18:05:03 3348 [Note] IPv6 is available. 2018-03-15 18:05:03 3348 [Note] - '::' resolves to '::'; 2018-03-15 18:05:03 3348 [Note] Server socket created on IP: '::'. 2018-03-15 18:05:03 3348 [Warning] 'user' entry 'root@db3' ignored in --skip-name-resolve mode. 2018-03-15 18:05:03 3348 [Warning] 'user' entry '@db3' ignored in --skip-name-resolve mode. 2018-03-15 18:05:03 3348 [Warning] 'proxies_priv' entry '@ root@db3' ignored in --skip-name-resolve mode. 2018-03-15 18:05:04 3348 [Note] Event Scheduler: Loaded 0 events 2018-03-15 18:05:04 3348 [Note] /usr/local/mysql/bin/mysqld: ready for connections. Version: '5.6.27-76.0-log' socket: '/mysql/data1/mysql.sock' port: 3316 Source distribution
在上面mysql错误日志里,第4行“Initializing buffer pool, size = 26.0G”引起了他的注意,并据此对mysql配置文件做了修改。
[root@DB3 data1]# tail -16 my.cnf # INNODB # innodb-flush-method = O_DIRECT innodb-log-files-in-group = 2 innodb-log-file-size = 512M innodb-flush-log-at-trx-commit = 1 innodb-file-per-table = 1 innodb-buffer-pool-size = 2G # LOGGING # log-error = /mysql/data1/mysql-error.log log-queries-not-using-indexes = 1 slow-query-log = 1 slow-query-log-file = /mysql/data1/mysql-slow.log #slave slave_skip_errors=1062
第7行的参数“innodb-buffer-pool-size”当时配置的有几十G,他把这个参数修改为“2G”。
这个数据库实例用于rabbitMQ持久化,而rabbitMQ是提供给电销使用的。
在电销部门下午6点下班之后,他通过“ps -ef|grep mysql”找到这个实例,再“kill -9”杀掉这个实例,没想到,这个实例就立即自己启动了。
电销相关的开发同事检查电销业务没有问题后,我们就下班了。
第二天检查zabbix监控,不仅内存被大量释放,而且,也没有再不断被占用。