i want to put my logfiles into a hive (amazon Athena)
my regex is ok, says the tester: https://regex101.com/r/hF4fP8/11
my create table is this:
CREATE EXTERNAL TABLE IF NOT EXISTS webservicelogs.Test15 (
`day` int,
`month` string,
`year` int,
`hour` int,
`minute` int,
`second` int,
`offset` string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ('input.regex' = '\[(\d{2})\/([a-zA-Z]{3})\/(\d{4}):(\d{2}):(\d{2}):(\d{2})\s(\+\d{4})]' )
LOCATION 's3://getag-athena/Test/'
TBLPROPERTIES ('has_encrypted_data'='false')
the create table statement works
if i want to select the table this erros occures
SELECT * FROM "webservicelogs"."test15" limit 10;
Your query has the following error(s):
HIVE_CURSOR_ERROR: Number of matching groups doesn't match the number of columns
the Logfiles i want to parse is like this:
85.239.101.101 - - [07/Jan/2016:01:00:00 +0100] "POST /bpwsortsinfo1-3/services/Ortsinfo?wsdl HTTP/1.1" 200 467 "-" "Axis2" 449/1883 23 BP7 0