Filter by date range (same month and day) across y

2020-04-05 07:42发布

问题:

I have a PostgreSQL database with a table holding dates. Now I need to find all rows within the date range 15/02 until 21/06 (day/month) across all years.

Example result:

1840-02-28
1990-06-21
1991-02-15
1991-04-25
1992-05-30
1995-03-04
1995-04-10
2001-02-03
2010-04-06

回答1:

Assuming (with a leap of faith) that you want dates between certain days of the year regardless of the year (like if you're sending out a batch of birthday cards or something), you can set up a test with this:

CREATE TABLE d (dt date);
COPY d FROM STDIN;
1840-02-28
1990-06-21
1991-02-15
1991-04-25
1992-05-30
1995-03-04
1995-04-10
2001-02-03
2010-04-06
\.

And you can use "row value constructors" to easily select the desired range:

SELECT * FROM d
  WHERE (EXTRACT(MONTH FROM dt), EXTRACT(DAY FROM dt))
           BETWEEN (2, 15) AND (6, 21);

Which yields:

     dt     
------------
 1840-02-28
 1990-06-21
 1991-02-15
 1991-04-25
 1992-05-30
 1995-03-04
 1995-04-10
 2010-04-06
(8 rows)


回答2:

Use a WHERE clause with the BETWEEN operator. See:

http://www.postgresql.org/docs/current/static/functions-comparison.html#FUNCTIONS-COMPARISON

and:

http://www.postgresql.org/docs/current/static/sql-select.html http://www.postgresql.org/docs/current/static/tutorial.html

If that doesn't help, please expand your question with:

  • The structure of the table(s) you're working with, either from psql's \d tablename command or the original CREATE TABLE statements;
  • Some sample contents
  • The query you're having problems with
  • Expected results


回答3:

You can use following syntax.

SELECT * FROM tableName WHERE dateColumnName BETWEEN '2012.01.01' AND '2012.08.14';

Just replace following;

tableName       - Name of the table you are going to access
dateColumnName  - Name of the column whch contains dates
2012.08.1       - Start date
2012.08.21      - End date 

When entering the two dates, carefully examine the example above. Enter in the same format, and enclose them inside ''s.

If you replace * mark with a column name, you can filter out values of that column only.

Hope that helps..



回答4:

I am pretty sure, @kgrittn's interpretation of the question is accurate and I love his elegant use of a row constructors. Even more so after I have tested a couple of alternatives, none of which could match the performance:

Tested with a real life table of 65426 rows; 32107 qualified. PostgreSQL 9.1.4, best of five with EXPLAIN ANALYZE:

SELECT * FROM tbl
WHERE  to_char(data, 'MMDD') BETWEEN '0215' AND '0621';

Total runtime: 251.188 ms

SELECT * FROM tbl
WHERE  to_char(data, 'MMDD')::int BETWEEN 215 AND 621;

Total runtime: 250.965 ms

SELECT * FROM tbl
WHERE  to_char(data, 'MMDD') COLLATE "C" BETWEEN '0215' AND '0621';

Total runtime: 221.732 ms
String comparison is faster with the "non-locale" C - more in the manual about collation support.

SELECT * FROM tbl
WHERE  EXTRACT(MONTH FROM data)*100 + EXTRACT(DAY FROM data)
       BETWEEN 215 AND 621;

Total runtime: 209.965 ms

SELECT * FROM tbl
WHERE  EXTRACT(MONTH FROM data) BETWEEN 3 AND 5
OR     EXTRACT(MONTH FROM data) = 2 AND EXTRACT(DAY FROM data) >= 15
OR     EXTRACT(MONTH FROM data) = 6 AND EXTRACT(DAY FROM data) <= 21;

Total runtime: 160.169 ms

SELECT * FROM tbl
WHERE  EXTRACT(MONTH FROM data) BETWEEN 2 AND 6
AND    CASE EXTRACT(MONTH FROM data) 
       WHEN 2 THEN EXTRACT(DAY FROM data) >= 15
       WHEN 6 THEN EXTRACT(DAY FROM data) <=21
       ELSE TRUE END;

Total runtime: 147.390 ms

SELECT * FROM tbl
WHERE  CASE EXTRACT(MONTH FROM data) 
       WHEN 3 THEN TRUE
       WHEN 4 THEN TRUE
       WHEN 5 THEN TRUE
       WHEN 2 THEN EXTRACT(DAY FROM data) >= 15
       WHEN 6 THEN EXTRACT(DAY FROM data) <= 21
       ELSE FALSE END;

Total runtime: 131.907 ms

@Kevin's solution with row constructors:

SELECT * FROM tbl
WHERE (EXTRACT(MONTH FROM data), EXTRACT(DAY FROM data))
       BETWEEN (2, 15) AND (6, 21);

Total runtime: 125.460 ms
Chapeau.


Faster with functional index

The only way to beat that is with indexes. None of the queries above can use a plain index on data. However, if read performance is crucial (and for a small cost on write performance) you can resort to a functional index:

CREATE INDEX ON tbl(EXTRACT(MONTH FROM data), EXTRACT(DAY FROM data));

SELECT * FROM tbl
WHERE (EXTRACT(MONTH FROM data), EXTRACT(DAY FROM data))
       BETWEEN (2, 15) AND (6, 21);

Total runtime: 85.895 ms

And that's where I can finally beat Kevin's query by a hair: with a single column index instead of the multi-column index needed in his case.

CREATE INDEX ON tbl(
CAST(EXTRACT(MONTH FROM data) * 100 + EXTRACT(DAY FROM data) AS int));

SELECT * FROM tbl
WHERE  (EXTRACT(MONTH FROM data) * 100 + EXTRACT(DAY FROM data))::int
       BETWEEN 215 AND 621;

Total runtime: 84.215 ms



回答5:

You can use simple condition >= and <= or similar or use between/and but the trick is to know your exact data type. Sometimes date fields contain time and that is where the query can go wrong so it is recommended to use some date related functions to remove the time issue. In SQL Server common function to do that is datediff function.