Description: Under heavy workload mysql 5.6.16 (replicated on 5.6.13 and higher)
throws connection errors "Lost connection to MySQL server at
'sending authentication information', system error: 32".
This was tested on OSX 10.8 and 10.9. The issue is not seen
on windows or other unix servers like centos. How to repeat: This was tested and the issue was repeatable with a simple database
import through the mysql client.The database dump had around 1000 tables
and 200 MB in size. While the dump is being sourced other mysql client(s)
trying to connect (even if it were a different user than the one running
the import) you get the authentication error. "mysql -uroot -p Enter password: ERROR 2013 (HY000): Lost connection to MySQL server at
'sending authentication information', system error: 32" Also observed that passing in the host causes mysql to
throw a different exception: "mysql -uroot -p -h127.0.0.1 Enter password: ERROR 2013 (HY000): Lost connection to MySQL server at
'reading authorization packet', system error: 0" Running mysql under debug (mysql-dubug) and generating a trace
shows the below when the issue occurs: [login_connection: info: login_connection called by thread 11 login_connection: info: New connection received on TCP/IP (1089) login_connection: info: Host: unknown host ip: 127.0.0.1 acl_authenticate: info: com_change_user_pkt_len=0 my_message_sql: error: error: 1158 message:
'Got an error reading communication packets' my_message_sql: error: error: 1043 message: 'Bad handshake' one_thread_per_connection_end: info: thd 0x7f824d245200 block_pthread 1
The interesting piece of trace being "login_connection: info:
Host: unknown host ip: 127.0.0.1" where it sees the host as
"unknown host" which appears to be the cause of the authentication error. This could also be seen under the performance schema where the HOST
is seen as NULL while multiple clients try to connect: mysql> select * from hosts; +-----------+---------------------+-------------------+ | HOST | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS | +-----------+---------------------+-------------------+ | NULL | 18 | 35 | | localhost | 1 | 2 | +-----------+---------------------+-------------------+ 2 rows in set (0.00 sec) The counter "Aborted_connects" keeps increasing. From the JDBC side, tested this with the recent connector/j and mariadb
java client and both fail with similar exceptions: [Caused by: java.io.EOFException:
unexpected end of stream, read 0bytes from 4 at org.mariadb.jdbc.internal.common.packet.buffer.ReadUtil.readFully
(ReadUtil.java:84) [mariadb-java-client.jar:] at org.mariadb.jdbc.internal.common.packet.buffer.ReadUtil.readFully
(ReadUtil.java:92) [mariadb-java-client.jar:] at org.mariadb.jdbc.internal.common.packet.RawPacket.nextPacket
(RawPacket.java:77) [mariadb-java-client.jar:] at org.mariadb.jdbc.internal.common.packet.SyncPacketFetcher.getRawPacket
(SyncPacketFetcher.java:67) [mariadb-java-client.jar:] at org.mariadb.jdbc.internal.mysql.MySQLProtocol.connect
(MySQLProtocol.java:468) [mariadb-java-client.jar:] ... 69 more"] Also observed that bouncing the mysql instance clears up the problem.
It was also seen that terminating all the mysql clients like jdbc
(shutting down the app servers) also helps with the issue and you are
able to login once again. my.cnf that was used to test this: [ # For advice on how to change settings please see # http://dev.mysql.com/doc/refman/5.6/en/server-configuration-defaults.html [mysqld] interactive_timeout=180 wait_timeout=180 max_connect_errors=9999 skip-name-resolve query_cache_size=0 bind-address=127.0.0.1 # Remove leading # and set to the amount of RAM for the most important data # cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%. innodb_buffer_pool_size = 2G # Remove leading # to turn on a very important data integrity option: logging # changes to the binary log between backups. # log_bin # These are commonly set, remove the # and set as required. basedir = /usr/local/mysql datadir = /usr/local/mysql/data port = 3306 server_id = kris_dev socket = /tmp/mysql.sock # Remove leading # to set options mainly useful for reporting servers. # The server defaults are faster for transactions and fast SELECTs. # Adjust sizes as needed, experiment to find the optimal values. join_buffer_size = 128M sort_buffer_size = 2M read_rnd_buffer_size = 2M sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES [client] default-character-set=utf8 ]
Workaround
This was tested on machine specs ranging from 4GB, 8GB and 16 GB and 32 GB RAM.
Things were a little better on the higher specs (16GB and above) but looked severe
on anything 8GB and below. Further debugging and tracing has led to the mysql option "innodb_file_per_table"
which is turned on by default on Mysql 5.6 to be causing this issue. Turning off
this option has revealed positive results so far and no occurrence of
system error: 32.
It was also noticed that increasing open file limits (ulimit -n)
on the higher spec machines helped with better results while running with
"innodb_file_per_table" turned on.
Tools used: mysql-debug (trace option), dtruss
Logged this under: http://bugs.mysql.com/bug.php?id=71960
Hope this helps!
No comments:
Post a Comment