Profiling newer versions of MariaDB
MariaDB is continuously evolving, and to make it more scalable, many age-old data constructs are being upgraded or revamped into newer, more scalable ones. This series of changes has helped it scale better than most of the open-source databases available. If you have been profiling MariaDB for some time, it is important to widen your profiling scope to cover these new hot spots.
Mutex vs Latch
There are a lot of use-cases where the flow doesn't need a mutex (exclusive access); what it really needs is a latch (multiple readers, single writer). MariaDB has started identifying such use-cases and has ported them from the original mutexes to latches. So the traditional mechanism for tracking mutex hot spots needs to be widened.
Most users probably track MariaDB performance bottlenecks with performance_schema settings that enable only the mutex instruments.
Since such settings capture only mutexes and not locks/latches, the information obtained from performance_schema.events_waits_summary_global_by_event_name will not present a complete picture.
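For reference, a mutex-only hot-spot query of this kind might look like the sketch below. It assumes performance_schema is enabled and the wait/synch/mutex/% instruments are switched on; the column aliases, the conversion to milliseconds, and the LIMIT are illustrative choices, not necessarily the exact query behind the numbers in this post.

```sql
-- Top mutex wait events by total wait time (mutex instruments only).
-- SUM_TIMER_WAIT is reported in picoseconds; dividing by 1e9 gives milliseconds.
SELECT EVENT_NAME,
       SUM_TIMER_WAIT / 1000000000 AS total_wait_ms,
       COUNT_STAR
  FROM performance_schema.events_waits_summary_global_by_event_name
 WHERE EVENT_NAME LIKE 'wait/synch/mutex/%'
   AND COUNT_STAR > 0
 ORDER BY SUM_TIMER_WAIT DESC
 LIMIT 6;
```

Because the filter matches only wait/synch/mutex/% events, anything that has been converted to a latch silently drops out of this result.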
Let's look at how the hot spots have changed from 10.5 to 10.6 to 10.8.
MariaDB 10.5:
| wait/synch/mutex/innodb/log_sys_mutex | 10993543.1228 | 73893323 |
| wait/synch/mutex/innodb/lock_mutex | 409445.6364 | 72562319 |
| wait/synch/mutex/innodb/redo_rseg_mutex | 282889.2468 | 60315800 |
| wait/synch/mutex/sql/LOCK_table_cache | 87841.0034 | 30053316 |
| wait/synch/mutex/innodb/buf_pool_mutex | 31164.3056 | 24091341 |
| wait/synch/mutex/innodb/log_flush_order_mutex | 30715.0940 | 6204400 |
MariaDB 10.6:
| wait/synch/mutex/innodb/log_sys_mutex | 8196724.5317 | 94451895 |
| wait/synch/mutex/sql/LOCK_table_cache | 53218.7635 | 38989014 |
| wait/synch/mutex/innodb/log_flush_order_mutex | 22115.2731 | 5461808 |
| wait/synch/mutex/innodb/buf_pool_mutex | 19749.9868 | 24328777 |
| wait/synch/mutex/threadpool/group_mutex | 9871.5299 | 73233384 |
| wait/synch/mutex/sql/THD::LOCK_thd_data | 6153.8019 | 91115395 |
- log_sys_mutex contention continues.
- lock_mutex and redo_rseg_mutex have disappeared. Does that mean their contention has been completely resolved in 10.6?
MariaDB 10.8:
| wait/synch/mutex/innodb/buf_pool_mutex | 48952.9588 | 20786569 |
| wait/synch/mutex/sql/LOCK_table_cache | 34947.7470 | 43650073 |
| wait/synch/mutex/threadpool/group_mutex | 14281.8536 | 84727091 |
| wait/synch/mutex/innodb/fil_system_mutex | 7047.8210 | 5240336 |
| wait/synch/mutex/sql/THD::LOCK_thd_data | 6963.7040 | 103323350 |
| wait/synch/mutex/innodb/flush_list_mutex | 6182.9708 | 4925339 |
- log_sys_mutex contention also disappeared.
If you are used to the old way of mutex profiling, you should now expand the scope and start tracking rw-locks as well.
lock_mutex has been ported to a latch, so its contention is no longer visible with mutex tracing.
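Widening the scope can be as simple as relaxing the event-name filter. The sketch below assumes the rw-lock instruments (wait/synch/rwlock/%) are enabled alongside the mutex ones; the aliases and the LIMIT are illustrative.

```sql
-- Top synchronization wait events across both mutexes and rw-locks (latches).
-- SUM_TIMER_WAIT is reported in picoseconds; dividing by 1e9 gives milliseconds.
SELECT EVENT_NAME,
       SUM_TIMER_WAIT / 1000000000 AS total_wait_ms,
       COUNT_STAR
  FROM performance_schema.events_waits_summary_global_by_event_name
 WHERE (EVENT_NAME LIKE 'wait/synch/mutex/%'
        OR EVENT_NAME LIKE 'wait/synch/rwlock/%')
   AND COUNT_STAR > 0
 ORDER BY SUM_TIMER_WAIT DESC
 LIMIT 10;
```

Ranking mutexes and rw-locks in one result makes it easier to spot a hot spot that has merely moved from one category to the other.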
| wait/synch/rwlock/innodb/lock_latch | 11823.5208 | 92213015 |
As part of this transition, profiling of some important mutexes (now converted to latches) has been removed. One such case is redo_rseg_mutex, which is now a latch but is no longer instrumented for performance profiling. I tried local changes to cover it and found that it still represents significant contention (MDEV-27935).
| wait/synch/rwlock/innodb/trx_rseg_latch_key | 342616.7699 | 98834024 |
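One way to check whether a given latch is instrumented at all is to look it up in performance_schema.setup_instruments. This is only a sketch: the LIKE pattern is an assumption about how the InnoDB rw-lock instruments are named, based on the event names shown in this post.

```sql
-- List InnoDB rw-lock instruments and whether they are enabled and timed.
-- A latch with no row here cannot show up in the wait summary tables.
SELECT NAME, ENABLED, TIMED
  FROM performance_schema.setup_instruments
 WHERE NAME LIKE 'wait/synch/rwlock/innodb/%'
 ORDER BY NAME;
```

If a latch you care about is missing from this list, the absence of wait events for it says nothing about whether it is contended.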
In the same way, contention for log_sys_mutex is not visible with 10.8.3 because it has been ported to log_latch.
| wait/synch/rwlock/innodb/log_latch | 2993779.4089 | 103110276 |
Let's see how all these optimizations have helped.
- Together, these changes have significantly improved the performance of MariaDB.
More such mutexes are lined up to be ported to latches, which will help MariaDB scale further. It also opens up an opportunity to explore NUMA-optimized distributed latches.
If you have more questions/queries, do let me know. I will try to answer them.