I have large CSV files that contain long strings, and I want to parse them in U-SQL.
@t1 =
SELECT
Regex.Match("ID=881cf2f5f474579a:T=1489536183:S=ALNI_MZsMMpA4voGE4kQMYxooceW2AOr0Q", "ID=(?<ID>\\w+):T=(?<T>\\w+):S=(?<S>[\\w\\d_]*)") AS p
FROM
(VALUES(1)) AS fe(n);
@t2 =
SELECT
p.Groups["ID"].Value AS gads_id,
p.Groups["T"].Value AS gads_t,
p.Groups["S"].Value AS gads_s
FROM
@t1;
OUTPUT @t2
TO "/inhabit/test.csv"
USING Outputters.Csv();
This fails with the following error:
Error E_CSC_USER_INVALIDCOLUMNTYPE: 'System.Text.RegularExpressions.Match' cannot be used as column type.
I know how to do it the SQL way with EXPLODE/CROSS APPLY/GROUP BY, but maybe it is possible to do it without these dances?
One more update:
@t1 =
SELECT
Regex.Match("ID=881cf2f5f474579a:T=1489536183:S=ALNI_MZsMMpA4voGE4kQMYxooceW2AOr0Q", "ID=(?<ID>\\w+):T=(?<T>\\w+):S=(?<S>[\\w\\d_]*)").Groups["ID"].Value AS id,
Regex.Match("ID=881cf2f5f474579a:T=1489536183:S=ALNI_MZsMMpA4voGE4kQMYxooceW2AOr0Q", "ID=(?<ID>\\w+):T=(?<T>\\w+):S=(?<S>[\\w\\d_]*)").Groups["T"].Value AS t,
Regex.Match("ID=881cf2f5f474579a:T=1489536183:S=ALNI_MZsMMpA4voGE4kQMYxooceW2AOr0Q", "ID=(?<ID>\\w+):T=(?<T>\\w+):S=(?<S>[\\w\\d_]*)").Groups["S"].Value AS s
FROM
(VALUES(1)) AS fe(n);
OUTPUT @t1
TO "/inhabit/test.csv"
USING Outputters.Csv();
This variant works fine, but I have a question: will the regex be evaluated three times per row? Is there any way to hint to the U-SQL engine that the function Regex.Match is deterministic?