Pythian Blog: Technical Track

Debugging MySQL on Docker

In a recent post, my colleague and teammate Peter Sylvester explained how we could customize MySQL's configuration when running it as a Docker container. Today I want to show you how to debug a Dockerized MySQL process. Let me start by showing you how I am starting my test container:

[fipar@coltrane ~]$ sudo docker run --memory-swappiness=1 -p 3308:3306 --name=mysql1 -e MYSQL_ROOT_PASSWORD=password -d mysql/mysql-server:5.7
Unable to find image 'mysql/mysql-server:5.7' locally
Trying to pull repository docker.io/mysql/mysql-server ...
5.7: Pulling from docker.io/mysql/mysql-server
Digest: sha256:eb3aa08c047efcb3e6bfcc3a28b80a2ec8c67b4315712b26679b0b22320f0b4a
Status: Downloaded newer image for docker.io/mysql/mysql-server:5.7
44c275611eaafa1b64864f89421d6513a01091f72358f45bda9f942b11b95f11

We can now attach to it to run a simple query. We could also do this by connecting via TCP/IP, but here I am demonstrating how we can attach to the container to run commands inside it:

[fipar@coltrane ~]$ sudo docker exec -it mysql1 bash
bash-4.2# mysql -ppassword -e 'select @@version'
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------+
| @@version |
+-----------+
| 5.7.22    |
+-----------+

One common task we must perform when debugging MySQL problems is, unsurprisingly, attaching a debugger to mysqld. While this is not something one usually does in a production MySQL, gathering stack traces can be a useful last action when the database is hung, and your next step would be to restart it. This way, we may end up with valuable information to diagnose why MySQL was hung in the first place, either for our own analysis or a useful bug report.

Before we can do this, we need to obtain the PID on the host system that corresponds to our container:

[fipar@coltrane ~]$ sudo docker inspect --format '{{.State.Pid}}' mysql1
15994
[fipar@coltrane ~]$ ps -ef|grep 15994
mysql    15994 15976  0 19:11 ?        00:00:00 mysqld
fipar    16390 15880  0 19:15 pts/0    00:00:00 grep --color=auto 15994

A tool I often use to gather stack traces and automatically aggregate them for easier analysis is pt-pmp, so naturally, that was my first attempt in this case, too:

[fipar@coltrane ~]$ sudo pt-pmp --pid 15994
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Thu May 24 19:16:08 UYT 2018
Cannot access memory at address 0x1946
Cannot access memory at address 0x193e
Cannot access memory at address 0x1946
Cannot access memory at address 0x1946
Cannot access memory at address 0x193e
Cannot access memory at address 0x1946
Cannot access memory at address 0x193e
warning: Target and debugger are in different PID namespaces; thread lists and other data are likely unreliable. Connect to gdbserver inside the container.
      1 ::??,??
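The last warning in that output says the debugger and its target are in different PID namespaces. As a minimal sketch of how one could confirm this from the host, the functions below compare the namespace link that Linux exposes under /proc for a target PID with the one for the current shell. The function names and the PROC_ROOT variable are assumptions made for this example, not part of Docker or pt-pmp.

```shell
#!/bin/sh
# Sketch: compare the PID namespace of a target process with our own.
# PROC_ROOT is a hypothetical override (defaults to /proc); it exists only so
# the functions can be exercised against a fake directory tree.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the PID namespace identifier (for example "pid:[4026531836]")
# for a given PID, or for the literal string "self".
pid_namespace_of() {
    readlink "$PROC_ROOT/$1/ns/pid"
}

# Exit 0 when the target shares our PID namespace, 1 when it does not,
# and 2 when either namespace link could not be read.
same_pid_namespace() {
    target_ns=$(pid_namespace_of "$1") || return 2
    own_ns=$(pid_namespace_of self) || return 2
    [ "$target_ns" = "$own_ns" ]
}
```

With these functions loaded in a root shell (reading another user's namespace links normally requires root), `same_pid_namespace 15994` would be expected to return 1 for the containerized mysqld, and 0 for a process started directly on the host.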

 

This failed because processes running in Docker containers live in their own namespaces, so gdb running on the host can't reliably attach to them. Fortunately, we can use the nsenter command to 'enter' the corresponding namespaces and attach gdb to mysqld from there. Before we can do that, however, gdb must be installed in the container. The following snippet shows how we can:

  • Install gdb inside the container
  • Attach it to mysqld to collect stack traces
  • Run the collected traces (now on the host OS) through pt-pmp for aggregation
[fipar@coltrane ~]$ sudo docker exec -it mysql1 yum -y install gdb
(...snip...)
Installed:
  gdb.x86_64 0:7.6.1-110.el7
Complete!
[fipar@coltrane ~]$ sudo nsenter -t 15994 -m -p gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p 1 > 15994.stack.traces
[fipar@coltrane ~]$ pt-pmp 15994.stack.traces
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
     10 __io_getevents_0_4(libaio.so.1),LinuxAIOHandler::collect,LinuxAIOHandler::poll,os_aio_handler,fil_aio_wait,io_handler_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      3 pthread_cond_wait,os_event::wait_low,srv_worker_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 sigwaitinfo(libc.so.6),timer_notify_thread_func,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 sigwait(libpthread.so.0),signal_hand,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_event::wait_low,srv_purge_coordinator_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_event::wait_low,buf_resize_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_event::wait_low,buf_dump_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,compress_gtid_table,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,Per_thread_connection_handler::block_until_new_connection,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,srv_monitor_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,srv_error_monitor_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,lock_wait_timeout_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,ib_wqueue_timedwait,fts_optimize_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,dict_stats_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_event::timed_wait,os_event::wait_time_low,buf_flush_page_cleaner_coordinator,start_thread(libpthread.so.0),clone(libc.so.6)
      1 poll(libc.so.6),Mysqld_socket_listener::listen_for_connection_event,mysqld_main,__libc_start_main(libc.so.6),_start
      1 nanosleep(libpthread.so.0),os_thread_sleep,srv_master_thread,start_thread(libpthread.so.0),clone(libc.so.6)
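For repeated use, the PID lookup and the nsenter step could be wrapped in a small helper. The following is only a sketch: the function names and argument order are made up for this example, it assumes mysqld is PID 1 inside the container (as it is in the test container above), that gdb is already installed there, and that it runs as root or with equivalent privileges.

```shell
#!/bin/sh
# Sketch only: function names and arguments are hypothetical.

# Print the host PID of a container's main process.
container_host_pid() {
    docker inspect --format '{{.State.Pid}}' "$1"
}

# Write backtraces for all threads of the container's PID 1 to a file.
# $1 = container name, $2 = output file
collect_stack_traces() {
    container=$1
    outfile=$2

    host_pid=$(container_host_pid "$container") || return 1
    # Docker reports a PID of 0 for a container that is not running.
    if [ -z "$host_pid" ] || [ "$host_pid" = "0" ]; then
        echo "container $container does not appear to be running" >&2
        return 1
    fi

    # Enter the mount and PID namespaces of the container, then attach gdb
    # to PID 1 as seen from inside that PID namespace.
    nsenter -t "$host_pid" -m -p gdb -ex "set pagination 0" \
        -ex "thread apply all bt" -batch -p 1 > "$outfile"
}
```

After `collect_stack_traces mysql1 mysql1.stack.traces`, the resulting file can be passed to pt-pmp on the host in the same way as 15994.stack.traces above.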

As you can see, even though we have to jump through some extra hoops, we were able to attach gdb to mysqld inside a container. While my example only gathered stack traces for aggregation, this approach opens the door for us to perform any debugging activity we would typically do on a process that's running directly on our host OS.

Conclusion

Containers offer several operational benefits, including standardized deployments, better isolation and consistency between different environments. However, like anything in life, those benefits don't come for free. If you're a pager-carrying person who responds to incidents and is responsible for keeping systems running correctly, or for diagnosing and resolving problems when they are not, the takeaway I would like you to get from this post is: prepare yourself for troubleshooting before your stack hits production. Like any layer of abstraction, containers make some of our tracing and debugging work more difficult, and it is best to get familiar with how those practices work in a containerized environment without the stress of having to deal with a production problem.
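One way to act on that advice is a small preflight check that can be run long before an incident. The sketch below verifies that the tools used in this post are present: docker, nsenter and pt-pmp on the host, and gdb inside the container. All function names are hypothetical, and it assumes the container image ships an sh binary.

```shell
#!/bin/sh
# Sketch: names are illustrative and not part of any tool.

# Succeed when a command is available on the host.
host_has() {
    command -v "$1" >/dev/null 2>&1
}

# Succeed when a command is available inside a running container.
# $1 = container name, $2 = command to look for
container_has() {
    # The tool name is passed as a positional argument to the inner sh.
    docker exec "$1" sh -c 'command -v "$1"' sh "$2" >/dev/null 2>&1
}

# Print one line per prerequisite; the exit status is the number missing.
check_debug_readiness() {
    container=$1
    missing=0
    for tool in docker nsenter pt-pmp; do
        if host_has "$tool"; then
            echo "ok: $tool on host"
        else
            echo "MISSING: $tool on host"
            missing=$((missing + 1))
        fi
    done
    if container_has "$container" gdb; then
        echo "ok: gdb in $container"
    else
        echo "MISSING: gdb in $container"
        missing=$((missing + 1))
    fi
    return "$missing"
}
```

Running `check_debug_readiness mysql1` against a freshly started container would be expected to report gdb as missing until it has been installed there.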
