INNER JOIN Results from Select Statement using Doc

2019-01-12 08:06发布

Can you use Doctrine QueryBuilder to INNER JOIN a temporary table from a full SELECT statement that includes a GROUP BY?

The ultimate goal is to select the best version of a record. I have a viewVersion table that has multiple versions with the same viewId value but different timeMod. I want to find the version with the latest timeMod (and do a lot of other complex joins and filters on the query).

Initially people assume you can do a GROUP BY viewId and then ORDER BY timeMod, but ORDER BY has no effect on GROUP BY, and MySQL will return random results. There are a ton of answers out there (e.g. here) that explain the problem with using GROUP and offer a solution, but I am having trouble interpreting the Doctrine docs to find a way to implement the SQL with Doctrine QueryBuilder (if it's even possible). Why don't I just use DQL? I may have to, but I have a lot of dynamic filters and joins that are much easier to do with QueryBuilder, so I wanted to see if that's possible.

Sample MySQL to Reproduce in Doctrine QueryBuilder

SELECT vv.* 
FROM view_version vv
#inner join only returns where the result sets overlap, i.e. one record
INNER JOIN (
    SELECT MAX(timeMod) maxTimeMod, viewId
    FROM view_version
    GROUP BY viewId
) version ON version.viewId = vv.viewId AND vv.timeMod = version.maxTimeMod
#join other tables for filter, etc
INNER JOIN view v ON v.id = vv.viewId
INNER JOIN content_type c ON c.id = v.contentTypeId
WHERE vv.siteId=1
AND v.contentTypeId IN (2)
ORDER BY vv.title ASC;

Theoretical Solution via Query Builder (not working)

I am thinking that the JOIN needs to inject a DQL statement, e.g.

$em = $this->getDoctrine()->getManager();
$viewVersionRepo = $em->getRepository('GutensiteCmsBundle:View\ViewVersion');

$queryMax = $viewVersionRepo->createQueryBuilder()
    ->addSelect('MAX(timeMod) AS timeModMax')
    ->addSelect('viewId')
    ->groupBy('viewId');

$queryBuilder = $viewVersionRepo->createQueryBuilder('vv')
    // I tried putting the query in a parenthesis, to no avail
    ->join('('.$queryMax->getDQL().')', 'version', 'WITH', 'vv.viewId = version.viewId AND vv.timeMod = version.timeModMax')
    // Join other Entities
    ->join('e.view', 'view')
    ->addSelect('view')
    ->join('view.contentType', 'contentType')
    ->addSelect('contentType')
    // Perform random filters
    ->andWhere('vv.siteId = :siteId')->setParameter('siteId', 1)
    ->andWhere('view.contentTypeId IN(:contentTypeId)')->setParameter('contentTypeId', $contentTypeIds)
    ->addOrderBy('e.title', 'ASC');

$query = $queryBuilder->getQuery();
$results = $query->getResult();

My code (which may not match the above example perfectly) outputs:

SELECT e, view, contentType 
FROM Gutensite\CmsBundle\Entity\View\ViewVersion e 
INNER JOIN (
    SELECT MAX(v.timeMod) AS timeModMax, v.viewId 
    FROM Gutensite\CmsBundle\Entity\View\ViewVersion v 
    GROUP BY v.viewId
) version WITH vv.viewId = version.viewId AND vv.timeMod = version.timeModMax 
INNER JOIN e.view view 
INNER JOIN view.contentType contentType 
WHERE e.siteId = :siteId 
AND view.contentTypeId IN (:contentTypeId) 
ORDER BY e.title ASC

This Answer seems to indicate that it's possible in other contexts like IN statements, but when I try the above method in the JOIN, I get the error:

[Semantical Error] line 0, col 90 near '(SELECT MAX(v.timeMod)': Error: Class '(' is not defined.

1条回答
孤傲高冷的网名
2楼-- · 2019-01-12 08:40

A big thanks to @AdrienCarniero for his alternative query structure for sorting the highest version with a simple JOIN where the entity's timeMod is less than the joined table timeMod.

Alternative Query

SELECT view_version.* 
FROM view_version
#inner join to get the best version
LEFT JOIN view_version AS best_version ON best_version.viewId = view_version.viewId AND best_version.timeMod > view_version.timeMod
#join other tables for filter, etc
INNER JOIN view ON view.id = view_version.viewId
INNER JOIN content_type ON content_type.id = view.contentTypeId
WHERE view_version.siteId=1
# LIMIT Best Version
AND best_version.timeMod IS NULL
AND view.contentTypeId IN (2)
ORDER BY view_version.title ASC;

Using Doctrine QueryBuilder

$em = $this->getDoctrine()->getManager();
$viewVersionRepo = $em->getRepository('GutensiteCmsBundle:View\ViewVersion');

$queryBuilder = $viewVersionRepo->createQueryBuilder('vv')
    // Join Best Version
    ->leftJoin('GutensiteCmsBundle:View\ViewVersion', 'bestVersion', 'WITH', 'bestVersion.viewId = e.viewId AND bestVersion.timeMod > e.timeMod')
    // Join other Entities
    ->join('e.view', 'view')
    ->addSelect('view')
    ->join('view.contentType', 'contentType')
    ->addSelect('contentType')
    // Perform random filters
    ->andWhere('vv.siteId = :siteId')->setParameter('siteId', 1)
    // LIMIT Joined Best Version
    ->andWhere('bestVersion.timeMod IS NULL')
    ->andWhere('view.contentTypeId IN(:contentTypeId)')->setParameter('contentTypeId', $contentTypeIds)
    ->addOrderBy('e.title', 'ASC');

$query = $queryBuilder->getQuery();
$results = $query->getResult();

In terms of performance, it really depends on the dataset. See this discussion for details.

TIP: The table should include indexes on both these values (viewId and timeMod) to speed up results. I don't know if it would also benefit from a single index on both fields.

A native SQL query using the original JOIN method may be better in some cases, but compiling the query over an extended range of code that dynamically creates it, and getting the mappings correct is a pain. So this is at least an alternative solution that I hope helps others.

查看更多
登录 后发表回答