I'm trying to filter data between September 1st, 2010 and August 31st, 2013 in a Hive table. The column containing the date is in string format (yyyy-mm-dd). I can use month() and year() on this column. But how do I use them to filter data between the above dates? Any examples/sample code would be welcome!
There is no need to extract the month and year. Just use the unix_timestamp(string date, string pattern) function.
For example:
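A minimal sketch of that approach, assuming a hypothetical table named events with a string column event_date (both names are placeholders, not taken from the question). Note that the pattern argument uses Java-style date patterns, where the month is written MM; lowercase mm would mean minutes.

```sql
-- events and event_date are hypothetical names used for illustration.
-- Both sides are converted to seconds since the epoch, then compared as numbers.
SELECT *
FROM events
WHERE unix_timestamp(event_date, 'yyyy-MM-dd') >= unix_timestamp('2010-09-01', 'yyyy-MM-dd')
  AND unix_timestamp(event_date, 'yyyy-MM-dd') <= unix_timestamp('2013-08-31', 'yyyy-MM-dd');
```

Rows whose event_date does not match the pattern should yield NULL from unix_timestamp and therefore drop out of the result.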
Just like SQL, Hive supports the BETWEEN operator for a more concise statement:
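A sketch of the BETWEEN form, again assuming a hypothetical events table with a string column event_date. BETWEEN includes both endpoints, so September 1st, 2010 and August 31st, 2013 are themselves part of the range.

```sql
-- events and event_date are hypothetical names used for illustration.
-- BETWEEN a AND b is equivalent to: value >= a AND value <= b.
SELECT *
FROM events
WHERE unix_timestamp(event_date, 'yyyy-MM-dd')
      BETWEEN unix_timestamp('2010-09-01', 'yyyy-MM-dd')
          AND unix_timestamp('2013-08-31', 'yyyy-MM-dd');
```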
If you are unable to do this with Unix timestamps, you can write a Java UDF for Hive.
You have to convert the string to the required date format, and then you can get the required result.
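The code that originally accompanied this answer is not available, so the following is only one possible reading of the suggestion. It assumes Hive 0.12 or later, where the DATE type exists, and a hypothetical events table with a string column event_date.

```sql
-- events and event_date are hypothetical names used for illustration.
-- CAST(string AS DATE) expects yyyy-MM-dd input; non-matching strings become NULL.
SELECT *
FROM events
WHERE CAST(event_date AS DATE)
      BETWEEN CAST('2010-09-01' AS DATE)
          AND CAST('2013-08-31' AS DATE);
```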
The great thing about the yyyy-mm-dd date format is that there is no need to extract month() and year(): you can do comparisons directly on the strings.
Hive has a lot of good date parsing UDFs: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
Just doing the string comparison as Nigel Tufnel suggests is probably the easiest solution, although technically it's unsafe. But you probably don't need to worry about that unless your tables have historical data about the medieval ages (dates with only 3 year digits) or dates from scifi novels (dates with more than 4 year digits).
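The plain string comparison discussed above might look like the following sketch. It assumes a hypothetical events table with a string column event_date, and that every value is zero-padded with a four-digit year, so that alphabetical order matches chronological order.

```sql
-- events and event_date are hypothetical names used for illustration.
-- Relies on lexicographic ordering of fixed-width yyyy-MM-dd strings.
SELECT *
FROM events
WHERE event_date >= '2010-09-01'
  AND event_date <= '2013-08-31';
```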
Anyway, if you ever find yourself in a situation where you would want to do fancier date comparisons, or if your date format is not in a "biggest to smallest" order, e.g. the American convention of "mm/dd/yyyy", then you could use unix_timestamp with two arguments:
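A sketch for the American-style case, assuming a hypothetical table events_us with a string column us_date stored as mm/dd/yyyy. String comparison would sort such values by month first, so each value is parsed to a timestamp before comparing. The pattern uses MM for the month, since lowercase mm means minutes.

```sql
-- events_us and us_date are hypothetical names used for illustration.
-- The second argument tells unix_timestamp how to parse the string.
SELECT *
FROM events_us
WHERE unix_timestamp(us_date, 'MM/dd/yyyy')
      BETWEEN unix_timestamp('09/01/2010', 'MM/dd/yyyy')
          AND unix_timestamp('08/31/2013', 'MM/dd/yyyy');
```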