I need complete list of SQL Server datatypes converted to hive in sqoop.
Like SQL Server's timestamp
is converted to hive string
, SQL Server varchar
is converted to hive string
, etc.
Where can I find this complete mapping?
I need complete list of SQL Server datatypes converted to hive in sqoop.
Like SQL Server's timestamp
is converted to hive string
, SQL Server varchar
is converted to hive string
, etc.
Where can I find this complete mapping?
For Java mapping
SELECT * FROM <TABLE-NAME> WHERE 1=0
Query is used. It will fetch no records but metadata.
Useful source code:
protected String getColNamesQuery(String tableName) {
// adding where clause to prevent loading a big table
return "SELECT t.* FROM " + escapeTableName(tableName) + " AS t WHERE 1=0";
}
Then it uses ResulSetMetadata
to find datatypes.
Useful source code for generic JDBC,
int cols = results.getMetaData().getColumnCount();
ArrayList<String> columns = new ArrayList<String>();
ResultSetMetaData metadata = results.getMetaData();
for (int i = 1; i < cols + 1; i++) {
String colName = metadata.getColumnLabel(i);
if (colName == null || colName.equals("")) {
colName = metadata.getColumnName(i);
if (null == colName) {
colName = "_RESULT_" + i;
}
}
columns.add(colName);
LOG.debug("Found column " + colName);
Mapping from --map-column-java
is overridden on this mapping.
For Hive Mapping
This mapping is used:
public static String toHiveType(int sqlType) {
switch (sqlType) {
case Types.INTEGER:
case Types.SMALLINT:
return "INT";
case Types.VARCHAR:
case Types.CHAR:
case Types.LONGVARCHAR:
case Types.NVARCHAR:
case Types.NCHAR:
case Types.LONGNVARCHAR:
case Types.DATE:
case Types.TIME:
case Types.TIMESTAMP:
case Types.CLOB:
return "STRING";
case Types.NUMERIC:
case Types.DECIMAL:
case Types.FLOAT:
case Types.DOUBLE:
case Types.REAL:
return "DOUBLE";
case Types.BIT:
case Types.BOOLEAN:
return "BOOLEAN";
case Types.TINYINT:
return "TINYINT";
case Types.BIGINT:
return "BIGINT";
default:
// TODO(aaron): Support BINARY, VARBINARY, LONGVARBINARY, DISTINCT,
// BLOB, ARRAY, STRUCT, REF, JAVA_OBJECT.
return null;
}
}
Check source code here.
Mapping from --map-column-hive
is overridden on this mapping.
Sqoop is preconfigured to map most SQL types to appropriate Java or Hive representatives. However the default mapping might not be suitable for everyone and might be overridden.
sqoop import ... --map-column-java id=String,value=Integer
See Sqoop user guide: https://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_controlling_type_mapping
Also some other config properties may affect type mapping, like sqoop.bigdecimal.format.string
, etc
Probably you need to study source codes for complete list of preconfigured mappings. But specifying them as --map-column-java or --map-column-hive gives you more control. You can generate mappings from target tables metadata.