How to improve wind data SQL query performance

2019-09-06 06:37发布

I'm looking for help on how to optimize (if possible) the performance of a SQL query used for reading wind information (see below) by changing the e.g. the database structure, query or something else?

I use a hosted database to store a table with more than 800,000 rows with wind information (speed and direction). New data is added each minute from an anemometer. The database is accessed using a PHP script which creates a web page for plotting the data using Google's visualization API.

The web page takes approximately 15 seconds to load. I've added some time measurements in both the PHP and Javascript part to profile the code and find possible areas for improvements.

One part where I hope to improve is the following query which takes approximately 4 seconds to execute. The purpose of the query is to group 15 minutes of wind speed (min/max/mean) and calculate the mean value and total min/max during this period of measurements.

SELECT  AVG(d_mean) AS group_mean, 
        MAX(d_max) as group_max, 
        MIN(d_min) AS
        group_min, 
        dir, 
        FROM_UNIXTIME(MAX(dt),'%Y-%m-%d %H:%i') AS group_dt 
FROM    (
    SELECT  @i:=@i+1, 
            FLOOR(@i/15) AS group_id, 
            CAST(mean AS DECIMAL(3,1)) AS d_mean, 
            CAST(min AS DECIMAL(3,1)) AS d_min, 
            CAST(max AS DECIMAL(3,1)) AS d_max, 
            dir, 
            UNIX_TIMESTAMP(STR_TO_DATE(dt, '%Y-%m-%d %H:%i')) AS dt 
            FROM table, (SELECT @i:=-1) VAR_INIT 
            ORDER BY id DESC
) AS T 
GROUP BY group_id
LIMIT 0, 360

...

$oResult = mysql_query($sSQL);

The table has the following structure:

1   ID      int(11)     AUTO_INCREMENT
2   mean    varchar(5)  utf8_general_ci
3   max     varchar(5)  utf8_general_ci
4   min     varchar(5)  utf8_general_ci
5   dt      varchar(20) utf8_general_ci    // Date and time
6   dir     varchar(5)  utf8_general_ci

The following setup is used:

  • Database: MariaDB, 5.5.42-MariaDB-1~wheezy
  • Database client version: libmysql - 5.1.66
  • PHP version: 5.6
  • PHP extension: mysqli

1条回答
We Are One
2楼-- · 2019-09-06 07:43

I strongly agree with the comments so far -- Cleanse the data as you put it into the table.

Once you have done the cleansing, let's avoid the subquery by doing...

SELECT  MIN(dt) as 'Start of 15 mins',
        FORMAT(AVG(mean), 1) as 'Avg wind speed',
        ...
    FROM table
    GROUP BY FLOOR(UNIX_TIMESTAMP(dt) / 900)
    ORDER BY FLOOR(UNIX_TIMESTAMP(dt) / 900);

I don't understand the purpose of the LIMIT. I'll guess that you want to a few days at a time. For that, I recommend you add (after cleansing) between the FROM and the GROUP BY.

    WHERE dt >= '2015-04-10'
      AND dt  < '2015-04-10' + INTERVAL 7 DAY

That would show 7 days, starting '2015-04-10' morning.

In order to handle a table of 800K, you would decidedly need (again, after cleansing):

INDEX(dt)

To cleanse the 800K rows, there are multiple approaches. I suggest creating a new table, copy the data in, test, and eventually swap over. Something like...

CREATE TABLE new (
    dt DATETIME, 
    mean FLOAT,
    ...
    PRIMARY KEY(dt)  -- assuming you have only one row per minute?
) ENGINE=InnoDB;

INSERT INTO new (dt, mean, ...)
    SELECT str_to_date(...),
           mean, -- I suspect that the CAST is not needed
           ...;

Write the new select and test it.

By now new is missing the newer rows. You can either rebuild it and hope to finish everything in your one minute window, or play some other game. Let us know if you want help there.

查看更多
登录 后发表回答