Selecting COUNT from different criteria on a table

Posted 2019-03-12 11:19

I have a table named 'jobs'. For a particular user a job can be active, archived, overdue, pending, or closed. Right now every page request generates 5 COUNT queries, and to optimize this I'm trying to reduce them to a single query. This is what I have so far, but it is barely faster than the 5 individual queries. Note that I've simplified the conditions in each subquery to make them easier to follow; the full query behaves the same way.

Is there a way to get these 5 counts in the same query without using the inefficient subqueries?

SELECT
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.status_id NOT IN (8,3,11) /* 8,3,11 being 'inactive' related statuses */
  ) AS active_count, 
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.due_date < '2011-06-14' AND
      jobs.status_id NOT IN(8,11,5,3) /* Grabs the overdue active jobs
                                      ('5' means completed successfully) */
  ) AS overdue_count,
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000'
  ) AS due_today_count

This goes on for 2 more subqueries but I think you get the idea.

Is there an easier way to collect this data, since it's basically 5 different COUNTs over the same subset of the jobs table?

The subset is 'creator_id = 5'; beyond that, each count needs just 1-2 additional conditions. Note that we're using Postgres right now but may move to MySQL in the near future, so if you can provide an ANSI-compatible solution I'd be grateful :)

3 answers
乱世女痞
#2 · 2019-03-12 11:47

Brief

SQL Server 2012 introduced the IIF logical function. Using SQL Server 2012 or greater you can now use this new function instead of a CASE expression. The IIF function also works with Azure SQL Database (but at the moment it does not work with Azure SQL Data Warehouse or Parallel Data Warehouse). It's shorthand for the CASE expression.

I find myself using the IIF function rather than the CASE expression when there is only one case. It saves writing CASE WHEN condition THEN x ELSE y END, which becomes IIF(condition, x, y). If several conditions may be met (multiple WHENs), consider a regular CASE expression rather than nested IIF functions.

IIF returns one of two values, depending on whether the Boolean expression evaluates to true or false in SQL Server.

Syntax

IIF ( boolean_expression, true_value, false_value )

Arguments

boolean_expression
A valid Boolean expression.

If this argument is not a Boolean expression, then a syntax error is raised.

true_value
Value to return if boolean_expression evaluates to true.

false_value
Value to return if boolean_expression evaluates to false.

Remarks

IIF is a shorthand way for writing a CASE expression. It evaluates the Boolean expression passed as the first argument, and then returns either of the other two arguments based on the result of the evaluation. That is, the true_value is returned if the Boolean expression is true, and the false_value is returned if the Boolean expression is false or unknown. true_value and false_value can be of any type. The same rules that apply to the CASE expression for Boolean expressions, null handling, and return types also apply to IIF. For more information, see CASE (Transact-SQL).

The fact that IIF is translated into CASE also has an impact on other aspects of the behavior of this function. Since CASE expressions can be nested only up to the level of 10, IIF statements can also be nested only up to the maximum level of 10. Also, IIF is remoted to other servers as a semantically equivalent CASE expression, with all the behaviors of a remoted CASE expression.


Code

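Using IIF, the query would look like the following (same logic as @rsbarro's answer). COUNT counts every non-NULL value, so the false branch must be NULL rather than 0; otherwise every row is counted. Note that IIF is T-SQL: PostgreSQL and MySQL do not provide it (MySQL has IF(condition, x, y) instead), so use CASE there.

SELECT 
    COUNT(
        -- NULL (not 0) for non-matching rows, so COUNT skips them
        IIF(jobs.status_id NOT IN (8,3,11), 1, NULL)
    ) as active_count,
    COUNT(
        IIF(jobs.due_date < '2011-06-14' AND jobs.status_id NOT IN(8,11,5,3), 1, NULL)
    ) as overdue_count,
    COUNT(
        IIF(jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000', 1, NULL)
    ) as due_today_count
FROM 
    "jobs"
WHERE 
     jobs.creator_id = 5 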
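One thing to keep in mind with this pattern is that COUNT tallies every non-NULL value, so the value produced for non-matching rows matters. The sketch below illustrates this with Python's built-in sqlite3 module and a tiny made-up jobs table. The sample rows are hypothetical, and CASE is used in place of IIF only because older SQLite builds lack iif(); the counting behaviour is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (creator_id INTEGER, status_id INTEGER)")
# Hypothetical sample data: statuses 1 and 2 are active, 8 is inactive.
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [(5, 1), (5, 2), (5, 8), (5, 8)],
)

row = conn.execute(
    """
    SELECT
        -- ELSE 0 is non-NULL, so COUNT tallies every row
        COUNT(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END),
        -- ELSE NULL is skipped by COUNT, so only matches are tallied
        COUNT(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE NULL END),
        -- SUM adds the 1s and 0s, so ELSE 0 is fine here
        SUM(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END)
    FROM jobs
    WHERE creator_id = 5
    """
).fetchone()

count_else_0, count_else_null, sum_else_0 = row
print(count_else_0, count_else_null, sum_else_0)  # 4 2 2
```

So either pair COUNT with a NULL false branch, or pair SUM with a 0 false branch.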
beautiful°
#3 · 2019-03-12 11:51

This is the typical solution. Use a CASE expression to break out the different conditions: a record that meets a condition gets a 1, otherwise a 0. Then do a SUM on the values.

  SELECT
    SUM(active_count) active_count,
    SUM(overdue_count) overdue_count,
    SUM(due_today_count) due_today_count
  FROM 
  (
    -- one row per job, with a 1/0 flag for each condition
    SELECT 
      CASE WHEN jobs.status_id NOT IN (8,3,11) THEN 1 ELSE 0 END active_count,
      CASE WHEN jobs.due_date < '2011-06-14' AND jobs.status_id NOT IN(8,11,5,3)  THEN 1 ELSE 0 END  overdue_count,
      CASE WHEN jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000' THEN 1 ELSE 0 END  due_today_count
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 ) t
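The derived table is not strictly required: the SUM can wrap each CASE directly. Here is a minimal sketch, assuming a made-up three-row jobs table in SQLite (dates stored as ISO-format text, which is an assumption of this example, not of the original schema), that checks both shapes give the same totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (creator_id INTEGER, status_id INTEGER, due_date TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (5, 1, "2011-06-10"),  # active and overdue
        (5, 5, "2011-06-01"),  # completed: active by the first rule, not overdue
        (5, 8, "2011-06-01"),  # inactive
    ],
)

# Shape used in this answer: flag each row in a derived table, then SUM.
nested = conn.execute("""
    SELECT SUM(active_count), SUM(overdue_count)
    FROM (
        SELECT
            CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END AS active_count,
            CASE WHEN due_date < '2011-06-14'
                  AND status_id NOT IN (8, 11, 5, 3) THEN 1 ELSE 0 END AS overdue_count
        FROM jobs
        WHERE creator_id = 5
    ) t
""").fetchone()

# Flattened shape: SUM applied directly to each CASE expression.
flat = conn.execute("""
    SELECT
        SUM(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END),
        SUM(CASE WHEN due_date < '2011-06-14'
                  AND status_id NOT IN (8, 11, 5, 3) THEN 1 ELSE 0 END)
    FROM jobs
    WHERE creator_id = 5
""").fetchone()

print(nested, flat)  # (2, 1) (2, 1)
```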

UPDATE: As noted, when the inner query t returns 0 rows, the result is a single row with NULL in every column. You have three options:

1) Add a HAVING clause so that no row is returned rather than a row of all NULLs

   HAVING SUM(active_count) is not null

2) If you want all zeros returned, then add COALESCE with a default of 0 to each of your sums

For example

 SELECT
      COALESCE(SUM(active_count), 0) active_count,
      COALESCE(SUM(overdue_count), 0) overdue_count,
      COALESCE(SUM(due_today_count), 0) due_today_count

3) Take advantage of the fact that COUNT(NULL) = 0, as rsbarro demonstrated. Note that the non-NULL value can be anything; it doesn't have to be a 1

For example

 SELECT
      COUNT(CASE WHEN 
            jobs.status_id NOT IN (8,3,11) THEN 'Manticores Rock' ELSE NULL
       END) as [active_count]
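The three behaviours on an empty result set can be checked quickly. Here is a rough sketch using Python's sqlite3 module; the table contents and the creator id 999 (which has no jobs) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (creator_id INTEGER, status_id INTEGER)")
conn.execute("INSERT INTO jobs VALUES (5, 1)")  # hypothetical single active job

sql = """
    SELECT
        SUM(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END),
        COALESCE(SUM(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END), 0),
        COUNT(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 END)
    FROM jobs
    WHERE creator_id = ?
"""

no_rows = conn.execute(sql, (999,)).fetchone()  # creator 999 has no jobs
one_row = conn.execute(sql, (5,)).fetchone()

print(no_rows)  # (None, 0, 0): plain SUM yields NULL, the other two yield 0
print(one_row)  # (1, 1, 1)
```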
Luminary・发光体
#4 · 2019-03-12 11:57

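I would use this approach: COUNT in combination with CASE WHEN. COUNT ignores NULLs, and a CASE with no ELSE yields NULL for rows that don't match, so each COUNT tallies only the rows that meet its condition.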

SELECT 
    COUNT(CASE WHEN 
        jobs.status_id NOT IN (8,3,11) THEN 1 
    END) as [Count1],
    COUNT(CASE WHEN 
        jobs.due_date < '2011-06-14' 
        AND jobs.status_id NOT IN(8,11,5,3) THEN 1
    END) as [COUNT2],
    COUNT(CASE WHEN
            jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000' THEN 1
    END) as [COUNT3]
FROM 
    "jobs"
WHERE 
     jobs.creator_id = 5 
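As a sanity check, the single-pass query can be compared against the original one-subquery-per-count form. The sketch below does that with Python's sqlite3 module. The sample rows are invented, and the dates are stored as ISO-format text so that string comparison orders them correctly, which is an assumption of this example. Only CASE and COUNT are used, so the same SQL should also work on Postgres and MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (creator_id INTEGER, status_id INTEGER, due_date TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (5, 1, "2011-06-10 00:00:00"),  # active, overdue
        (5, 1, "2011-06-14 09:00:00"),  # active, due today
        (5, 5, "2011-06-01 00:00:00"),  # completed: active by rule 1, not overdue
        (5, 8, "2011-06-01 00:00:00"),  # inactive
        (7, 1, "2011-06-01 00:00:00"),  # belongs to another creator
    ],
)

# The three simplified conditions from the question.
ACTIVE = "status_id NOT IN (8, 3, 11)"
OVERDUE = "due_date < '2011-06-14' AND status_id NOT IN (8, 11, 5, 3)"
DUE_TODAY = "due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000'"

# Original shape: one scalar subquery per count.
subqueries = f"""
    SELECT
        (SELECT COUNT(*) FROM jobs WHERE creator_id = 5 AND {ACTIVE}),
        (SELECT COUNT(*) FROM jobs WHERE creator_id = 5 AND {OVERDUE}),
        (SELECT COUNT(*) FROM jobs WHERE creator_id = 5 AND {DUE_TODAY})
"""

# Conditional aggregation: one pass over the creator's rows.
single_pass = f"""
    SELECT
        COUNT(CASE WHEN {ACTIVE} THEN 1 END),
        COUNT(CASE WHEN {OVERDUE} THEN 1 END),
        COUNT(CASE WHEN {DUE_TODAY} THEN 1 END)
    FROM jobs
    WHERE creator_id = 5
"""

expected = conn.execute(subqueries).fetchone()
actual = conn.execute(single_pass).fetchone()
print(expected, actual)  # (3, 1, 1) (3, 1, 1)
```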