Using CTAS we can leverage the parallelism that Polybase provides to load data into a new table in a highly scalable and performant way.
Is there a way to use a similar approach to load data into an existing table? The table might even be empty.
Creating an external table and using INSERT INTO ... SELECT * FROM ...
- I would assume that this goes through the head node and is therefore not in parallel?
I know that I could also drop the table and use CTAS to recreate it but then I have to deal with all the metadata again (column names, data types, distributions, ...).
You could use partition switching to do this, although remember not to use too many partitions with Azure SQL Data Warehouse. See 'Partition Sizing Guidance' here.
Bear in mind check constraints are not supported so the source table has to use the same partition scheme as the target table.
Full example with partitioning and switch syntax: