I am doing a subquery in which I have a calculated column involving random number generation. In the base query I select this column twice. MySQL 5.6 works as I expect, the calculated value being called once and fixed. The 5.7+/8.0+ execution seems to re-evaluate the subquery's column value individually for each selection. Is this correct behavior? What can I do to force it work as expected in newer versions of MySQL?
CREATE TABLE t (
`id` BIGINT(20) NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=InnoDB;
insert into t values();
insert into t values();
insert into t values();
insert into t values();
insert into t values();
SELECT
q.i,
q.r,
q.r
FROM (
SELECT
id AS i,
(FLOOR(RAND(100) * 4)) AS r
FROM t
) q;
MySQL 5.6 yields (values are the same):
+---+-----+-----+
| i | r | r |
+---+-----+-----+
| 1 | 0 | 0 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 2 | 2 |
| 5 | 1 | 1 |
+---+-----+-----+
while 5.7 yields (values are different):
+---+-----+-----+
| i | r | r |
+---+-----+-----+
| 1 | 0 | 2 |
| 2 | 3 | 2 |
| 3 | 1 | 1 |
| 4 | 2 | 1 |
| 5 | 2 | 0 |
+---+-----+-----+
As mentioned in The MySQL 8.0.0 Milestone Release is available,
I suppose that this is the cause of the behavior I am observing in newer versions of MySQL. The mentioned hint can be used with MySQL 8.0 to force RAND() be called only once:
This however is not available in 5.7. To achieve the desired behavior with 5.7, add
LIMIT <a very high number>
to the derived table definition (I'm using signed LONG_MAX below). Thanks to Roy Lyseng for this workaround.As philipxy mentioned in the comment, the result of a query expression must be strictly defined regardless of any optimizations being applied. Which means it is an optimizer bug in MySQL 5.7/8.0.