MySQL: what's the most efficient way to select

2019-02-15 05:33发布

问题:

I have a fairly large dataset and a query that requires two joins, so efficiency of the query is very important to me. I need to retrieve 3 random rows from the database that meet a condition based on the result of a join. Most obvious solution is pointed out as inefficient here, because

[these solutions] need a sequential scan of all the table (because the random value associated with each row needs to be calculated - so that the smallest one can be determined), which can be quite slow for even medium sized tables.

However, the method suggested by the author there (SELECT * FROM table WHERE num_value >= RAND() * (SELECT MAX(num_value) FROM table) LIMIT 1 where num_value is ID) doesn't work for me because some IDs might be missing (because some rows may have been been deleted by users).

So, what would be the most efficient way to retrieve 3 random rows in my situation?

EDIT: the solution does not need to be pure SQL. I also use PHP.

回答1:

Adding your RAND() call into the ORDER BY clause should allow you to ignore the ID. Try this:

SELECT * FROM table WHERE ... ORDER BY RAND() LIMIT 3;

After having performance issues pointed out, your best bet may be something along these lines (utilizing PHP):

$result = PDO:query('SELECT MAX(id) FROM table');
$max    = $result->fetchColumn();
$ids    = array();
$rows   = 5;

for ($i = 0; $i < $rows; $i++) {
    $ids[] = rand(1, $max);
}

$ids     = implode(', ', $ids);
$query   = PDO::prepare('SELECT * FROM table WHERE id IN (:ids)');
$results = $query->execute(array('ids' => $ids));

At this point you should be able to select the first 3 results. The only issue with this approach is dealing with deleted rows and you might have to either bump the $rows var or add some logic to do another query in case you didn't receive at least 3 results back.



回答2:

Since you don't want many results, there are a couple of interesting options using LIMIT and OFFSET.

I'm going to assume an id column which is unique and suitable for sorting.

The first step is to execute a COUNT(id), and then select random 3 numbers from 0 to COUNT(id) - 1 in PHP. (How to do that is a separate question, and the best approach depends on the number of rows total and the number you want).

The second step has two options. Suppose the random numbers you selected are 0, 15, 2234. Either have a loop in PHP

// $offsets = array(0, 15, 2234);
foreach ($offsets as $offset) {
    $rows[] = execute_sql('SELECT ... ORDER BY id LIMIT 1 OFFSET ?', $offset);
}

or build a UNION. Note: this requires sub-selects because we're using ORDER BY.

// $offsets = array(0, 15, 2234);
$query = '';
foreach ($offsets as $index => $offset) {
    if ($query) $query .= ' UNION ';
    $query .= 'SELECT * FROM (SELECT ... ORDER BY id LIMIT 1 OFFSET ?) Sub'.$index;
}
$rows = execute_sql($query, $offsets);


标签: mysql sql random