USQL using ARRAY_AGG on user defined type

2019-08-20 08:36发布

问题:

UDT

[SqlUserDefinedType(typeof(StudentHistoryFormatter))]
    public struct StudentHistory
    {
        public StudentHistory(int i, double? score, string status):this()
        {
            InstitutionId = i;
            Score = score;
            Status = status;
        }

        int InstitutionId { get; set; }

        double? Score {get; set; }

        string Status { get; set; }

        public string Value()
        {
            return string.Format("{0},{1},{2}", InstitutionId, Score, Status);
        }
    }

For simplicity I did not even put the class in a namespace. I registered the assembly with the USQL database

USQL

@history =
    EXTRACT InstitutionId int,
            Score double,
            Status string
    FROM @"CoreData\Institution\history.csv"
    USING Extractors.Csv();

@historyMap =
   SELECT InstitutionId,
               ARRAY_AGG<StudentHistory>(new StudentHistory(InstitutionId, Score, Status)) AS History
        FROM @history
        GROUP BY InstitutionId;

Error

Severity Code Description Project File Line Suppression State Error E_CSC_USER_INVALIDCOLUMNTYPE: 'Microsoft.Analytics.Types.Sql.SqlArray' cannot be used as column type. Description: The column type must be a supported scalar, complex or user defined type. Resolution: Ensure the column type is a supported type. For a user defined type, make sure the type is registered, the type name is fully qualified, and the required assembly is referenced by the script.

回答1:

The error message is a bit unclear. Currently, the only supported item types in a SqlArray are the built-in scalar and complex (SqlArray, SqlMap) types. User-defined types are currently not supported.

You can model it either by creating your own ArrayofUDT user-defined type or serialize the UDT into the byte[].

Here is an example of the latter with ARRAY_AGG, assuming you have a ToBinary() method that allows you to convert the UDT to byte[]:

@data = SELECT key, ARRAY_AGG( myUDT.ToBinary() ) AS obj_array FROM @input GROUP BY key;

(see How to convert an object to a byte array in C# for information about object serialization options)

Then when you want to get the object back, you would use CROSS APPLY EXPLODE on the object array and use the byte[] to rehydrate the UDT instance. Pseudo-code:

@objects = 
  SELECT myUDT.FromBinary(data) 
  FROM @data CROSS APPLY EXPLODE (obj_array) AS t(data);

Please file a feature request at http://aka.ms/adlfeedback for UDT in complex types. I will file a bug against the error message.



标签: u-sql