I have a table in athena with this structure
CREATE EXTERNAL TABLE `json_test`(
`col0` string ,
`col1` string ,
`col2` string ,
`col3` string ,
`col4` string ,
)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
'quoteChar'='\"',
'separatorChar'='\;')
A Json String like this is stored in "col4":
{'email': 'test_email@test_email.com', 'name': 'Andrew', 'surname': 'Test Test'}
I´m trying to make a json_extract query:
SELECT json_extract(col4 , '$.email') as email FROM "default"."json_test"
But the query returns empty values.
Any help would be appreciated.
The JSON needs to use double quotes ("
) for enclosing values.
Compare:
presto> SELECT json_extract('{"email": "test_email@test_email.com", "name": "Andrew"}' , '$.email');
_col0
-----------------------------
"test_email@test_email.com"
and
presto> SELECT json_extract('{''email'': ''test_email@test_email.com'', ''name'': ''Andrew''}', '$.email');
_col0
-------
NULL
(Note: ''
inside SQL varchar literal mean single '
in the constructed value, so the literal here is the same format that in the question.)
If your string value is a "JSON with single quotes", you can try to fix it with replace(string, search, replace) → varchar
The problem was the single quote char of the json string stored
{'email': 'test_email@test_email.com', 'name': 'Andrew', 'surname': 'Test Test'}
Changing to double quote
{"email": "test_email@test_email.com", "name": "Andrew", "surname": "Test Test"}
Athena Query works properly:
SELECT json_extract(col4 , '$.email') as email FROM "default"."json_test"