As of 2019-01-30 14:52 UTC, you can still win the 500-point Bounty because none of the answers have helped!
My Laravel 5.7 website has been experiencing a few problems that I think are related to each other (but happen at different times):
- PDO::prepare(): MySQL server has gone away
- E_WARNING: Error while sending STMT_PREPARE packet. PID=10
- PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry (My database often seems to try to write the same record twice in the same second. I've been unable to figure out why or how to reproduce it; it doesn't seem to be related to user behavior.)
- Somehow, those first 2 types of errors only ever appear in my Rollbar logs but not in the text logs on the server or in my Slack notifications, as all errors are supposed to (and all others do).
For months, I've continued to see scary log messages like these, and I've been completely unable to reproduce these errors (and have been unable to diagnose and solve them).
I haven't yet found any actual symptoms or heard of any complaints from users, but the error messages seem non-trivial, so I really want to understand and fix the root causes.
I've tried changing my MySQL config to use max_allowed_packet=300M (instead of the default of 4M) but still get these exceptions frequently on days when I have more than a couple of visitors to my site.
I've also set (changed from 5M and 10M) the following because of this advice:
innodb_buffer_pool_chunk_size=218M
innodb_buffer_pool_size = 218M
As further background:
- My site has a queue worker that runs jobs (artisan queue:work --sleep=3 --tries=3 --daemon).
- There are a bunch of queued jobs that can be scheduled to happen at the same moment based on the signup time of visitors. But the most I see that have happened simultaneously is 20.
- There are no entries in the MySQL Slow Query Log.
- I have a few cron jobs, but I doubt they're problematic. One runs every minute but is really simple. Another runs every 5 minutes to send certain scheduled emails if any are pending. And another runs every 30 minutes to run a report.
- I've run various mysqlslap queries (I'm a complete novice though) and haven't found anything slow even when simulating hundreds of concurrent clients.
- I'm using Laradock (Docker).
- My server is DigitalOcean 1GB RAM, 1 vCPU, 25GB SSD. I've also tried 2GB RAM with no difference.
- The results from SHOW VARIABLES; and SHOW GLOBAL STATUS; are here.
My my.cnf is:
[mysql]
[mysqld]
sql-mode="STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION"
character-set-server=utf8
innodb_buffer_pool_chunk_size=218M
innodb_buffer_pool_size = 218M
max_allowed_packet=300M
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow_query_log.log
long_query_time = 10
log_queries_not_using_indexes = 0
Any ideas about what I should explore to diagnose and fix these problems? Thanks.
If you see this message randomly, possible reasons:
- Your MySQL is behind a proxy, and the two are using different timeout configs.
- You are using PHP's persistent connections.
You may try to dig into the problem with these steps:
- Make sure your connections to MySQL have a long enough timeout (e.g. proxy settings, MySQL's wait_timeout / interactive_timeout).
- Disable persistent connections on the PHP side.
- Run tcpdump, if you can, to see what happened when you got the error message.
Re Slowlog: Show us your my.cnf. Were the changes in the [mysqld] section? Test it via SELECT SLEEP(12); then look both in the file and the table.
Alternate way to find the query: Since the query is taking several minutes, do SHOW FULL PROCESSLIST; when you think it might be running.
How much RAM do you have? Do not have max_allowed_packet=300M unless you have at least 30GB of RAM. Else you are risking swapping (or even crashing). Keep that setting under 1% of RAM.
For further analysis of tunables, please provide (1) RAM size, (2) SHOW VARIABLES; and (3) SHOW GLOBAL STATUS;.
Re deleted_at: That link you gave starts with "The column deleted_at is not a good index candidate". You misinterpreted it. It is talking about a single-column INDEX(deleted_at). I am suggesting a composite index such as INDEX(contact_id, job_class_name, execute_at, deleted_at).
158 seconds for a simple query on a small table? It could be that there is a lot of other stuff going on. Get the PROCESSLIST.
Re Separate indexes versus composite: Think of two indexes: INDEX(last_name) and INDEX(first_name). You flip through the last_name index to find "James", then what can you do? Flipping through the other index for "Rick" won't help you find me.
Analysis of VARIABLES and GLOBAL STATUS
Observations:
The More Important Issues:
innodb_buffer_pool_size -- I thought you had it at 218M, not 10M. 10M is much too small. On the other hand, you seem to have less than that much data.
Since the RAM is so small, I recommend dropping tmp_table_size and max_heap_table_size and max_allowed_packet to 8M. And lower table_open_cache, table_definition_cache, and innodb_open_files to 500.
What causes so many simultaneous connections?
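The "under 1% of RAM" rule of thumb and the 8M suggestion can be checked with simple arithmetic. The following is only a rough sketch: the helper names parse_size and oversized are made up for illustration, the 1% threshold is the guideline from this answer rather than a MySQL-enforced limit, and the values are the ones quoted in the question's my.cnf (300M) and in this analysis (16M, 1024M of RAM).

```python
# Sketch: flag settings that exceed roughly 1% of RAM.
# parse_size() and oversized() are hypothetical helpers, not part of MySQL.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a my.cnf-style size such as '300M' or '16384' to bytes."""
    text = text.strip().upper()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def oversized(settings, ram, limit=0.01):
    """Return {name: fraction_of_ram} for settings above `limit` of RAM."""
    ram_bytes = parse_size(ram)
    flagged = {}
    for name, value in settings.items():
        fraction = parse_size(value) / ram_bytes
        if fraction > limit:
            flagged[name] = fraction
    return flagged


# max_allowed_packet comes from the my.cnf in the question;
# the two table sizes come from the analysis of the running server.
current = {
    "max_allowed_packet": "300M",
    "tmp_table_size": "16M",
    "max_heap_table_size": "16M",
}
# The suggestion above: drop all three to 8M.
suggested = {name: "8M" for name in current}

for name, fraction in oversized(current, "1024M").items():
    print(f"{name}: {fraction:.1%} of RAM")

# With 8M each, nothing should exceed the 1% guideline on a 1GB server.
print("still oversized after change:", oversized(suggested, "1024M"))
```

On a 1GB server, 300M is close to a third of RAM, while 8M stays just under the 1% line, which is why 8M is suggested.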
Details and other observations:
( innodb_buffer_pool_size / _ram ) = 10M / 1024M = 0.98%
-- % of RAM used for InnoDB buffer_pool
( innodb_buffer_pool_size ) = 10M
-- InnoDB Data + Index cache
( innodb_lru_scan_depth ) = 1,024
-- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixed by lowering lru_scan_depth
( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 375 / 638 = 58.8%
-- Pct of buffer_pool currently not in use -- innodb_buffer_pool_size is bigger than necessary?
( Innodb_buffer_pool_bytes_data / innodb_buffer_pool_size ) = 4M / 10M = 40.0%
-- Percent of buffer pool taken up by data -- A small percent may indicate that the buffer_pool is unnecessarily big.
( innodb_log_buffer_size / _ram ) = 16M / 1024M = 1.6%
-- Percent of RAM used for buffering InnoDB log writes. -- Too large takes away from other uses for RAM.
( innodb_log_file_size * innodb_log_files_in_group / innodb_buffer_pool_size ) = 48M * 2 / 10M = 960.0%
-- Ratio of log size to buffer_pool size. 50% is recommended, but see other calculations for whether it matters. -- The log does not need to be bigger than the buffer pool.
( innodb_flush_method ) = innodb_flush_method =
-- How InnoDB should ask the OS to write blocks. Suggest O_DIRECT or O_ALL_DIRECT (Percona) to avoid double buffering. (At least for Unix.) See chrischandler for caveat about O_ALL_DIRECT
( innodb_flush_neighbors ) = 1
-- A minor optimization when writing blocks to disk. -- Use 0 for SSD drives; 1 for HDD.
( innodb_io_capacity ) = 200
-- I/O ops per second capable on disk. 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF
-- Whether to log all Deadlocks. -- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( min( tmp_table_size, max_heap_table_size ) / _ram ) = min( 16M, 16M ) / 1024M = 1.6%
-- Percent of RAM to allocate when needing MEMORY table (per table), or temp table inside a SELECT (per temp table per some SELECTs). Too high may lead to swapping. -- Decrease tmp_table_size and max_heap_table_size to, say, 1% of ram.
( net_buffer_length / max_allowed_packet ) = 16,384 / 16M = 0.10%
( local_infile ) = local_infile = ON
-- local_infile = ON is a potential security issue
( Select_scan / Com_select ) = 111,324 / 264,144 = 42.1%
-- % of selects doing full table scan. (May be fooled by Stored Routines.) -- Add indexes / optimize queries
( long_query_time ) = 10
-- Cutoff (Seconds) for defining a "slow" query. -- Suggest 2
( Max_used_connections / max_connections ) = 152 / 151 = 100.7%
-- Peak % of connections -- increase max_connections and/or decrease wait_timeout
You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0. There is (according to a rumor) a 'bug' in the QC code that leaves some code on unless you turn off both of those settings.
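Each percentage in the details above is a plain ratio of two numbers taken from SHOW VARIABLES and SHOW GLOBAL STATUS. As a rough sketch (the pct helper and the dictionary layout are illustrative assumptions, and the figures are copied by hand from this analysis rather than read from a live server), a few of them can be reproduced like this:

```python
# Sketch: recompute a few of the ratios from the analysis.
# All figures are copied from the text; nothing is queried from MySQL.
M = 1024 * 1024


def pct(numerator, denominator):
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


ram = 1024 * M  # the 1GB droplet

variables = {
    "innodb_buffer_pool_size": 10 * M,
    "innodb_log_file_size": 48 * M,
    "innodb_log_files_in_group": 2,
    "max_connections": 151,
}
status = {
    "Innodb_buffer_pool_pages_free": 375,
    "Innodb_buffer_pool_pages_total": 638,
    "Select_scan": 111_324,
    "Com_select": 264_144,
    "Max_used_connections": 152,
}

ratios = {
    "buffer pool / RAM": pct(variables["innodb_buffer_pool_size"], ram),
    "buffer pool pages free": pct(
        status["Innodb_buffer_pool_pages_free"],
        status["Innodb_buffer_pool_pages_total"],
    ),
    "redo log / buffer pool": pct(
        variables["innodb_log_file_size"] * variables["innodb_log_files_in_group"],
        variables["innodb_buffer_pool_size"],
    ),
    "selects doing full scans": pct(status["Select_scan"], status["Com_select"]),
    "peak connections": pct(
        status["Max_used_connections"], variables["max_connections"]
    ),
}

for label, value in ratios.items():
    print(f"{label}: {value:.2f}%")
```

The peak-connections ratio is the one to watch here: MySQL permits max_connections + 1 clients (the extra slot is reserved for a privileged account), so 152 out of 151 means the connection limit was actually reached at some point.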
Abnormally small:
Abnormally large:
Abnormal strings: