I'm looking for help on how to optimize (if possible) the performance of a SQL
query used for reading wind information (see below) by changing the e.g. the database structure, query or something else?
I use a hosted database to store a table with more than 800,000 rows with wind information (speed and direction). New data is added each minute from an anemometer. The database is accessed using a PHP
script which creates a web page for plotting the data using Google's visualization API.
The web page takes approximately 15 seconds to load. I've added some time measurements in both the PHP
and Javascript
part to profile the code and find possible areas for improvements.
One part where I hope to improve is the following query which takes approximately 4 seconds to execute. The purpose of the query is to group 15 minutes of wind speed (min/max/mean) and calculate the mean value and total min/max during this period of measurements.
SELECT AVG(d_mean) AS group_mean,
MAX(d_max) as group_max,
MIN(d_min) AS
group_min,
dir,
FROM_UNIXTIME(MAX(dt),'%Y-%m-%d %H:%i') AS group_dt
FROM (
SELECT @i:=@i+1,
FLOOR(@i/15) AS group_id,
CAST(mean AS DECIMAL(3,1)) AS d_mean,
CAST(min AS DECIMAL(3,1)) AS d_min,
CAST(max AS DECIMAL(3,1)) AS d_max,
dir,
UNIX_TIMESTAMP(STR_TO_DATE(dt, '%Y-%m-%d %H:%i')) AS dt
FROM table, (SELECT @i:=-1) VAR_INIT
ORDER BY id DESC
) AS T
GROUP BY group_id
LIMIT 0, 360
...
$oResult = mysql_query($sSQL);
The table has the following structure:
1 ID int(11) AUTO_INCREMENT
2 mean varchar(5) utf8_general_ci
3 max varchar(5) utf8_general_ci
4 min varchar(5) utf8_general_ci
5 dt varchar(20) utf8_general_ci // Date and time
6 dir varchar(5) utf8_general_ci
The following setup is used:
- Database: MariaDB, 5.5.42-MariaDB-1~wheezy
- Database client version: libmysql - 5.1.66
- PHP version: 5.6
- PHP extension: mysqli
I strongly agree with the comments so far -- Cleanse the data as you put it into the table.
Once you have done the cleansing, let's avoid the subquery by doing...
I don't understand the purpose of the
LIMIT
. I'll guess that you want to a few days at a time. For that, I recommend you add (after cleansing) between theFROM
and theGROUP BY
.That would show 7 days, starting '2015-04-10' morning.
In order to handle a table of 800K, you would decidedly need (again, after cleansing):
To cleanse the 800K rows, there are multiple approaches. I suggest creating a new table, copy the data in, test, and eventually swap over. Something like...
Write the new select and test it.
By now
new
is missing the newer rows. You can either rebuild it and hope to finish everything in your one minute window, or play some other game. Let us know if you want help there.