Table Normalization (Parse comma separated fields

2019-02-17 14:57发布

问题:

I have a table like this:

Device

DeviceId   Parts

1          Part1, Part2, Part3
2          Part2, Part3, Part4
3          Part1

I would like to create a table 'Parts', export data from Parts column to the new table. I will drop the Parts column after that

Expected result

Parts

PartId PartName

  1      Part1
  2      Part2
  3      Part3
  4      Part4

DevicePart

DeviceId PartId

  1      1
  1      2
  1      3
  2      2
  2      3
  2      4
  3      1

Can I do this in SQL Server 2008 without using cursors?

回答1:

-- Setup:

declare @Device table(DeviceId int primary key, Parts varchar(1000))
declare @Part table(PartId int identity(1,1) primary key, PartName varchar(100))
declare @DevicePart table(DeviceId int, PartId int)

insert @Device
values
    (1, 'Part1, Part2, Part3'),
    (2, 'Part2, Part3, Part4'),
    (3, 'Part1')

--Script:

declare @DevicePartTemp table(DeviceId int, PartName varchar(100))

insert @DevicePartTemp
select DeviceId, ltrim(x.value('.', 'varchar(100)'))
from
(
    select DeviceId, cast('<x>' + replace(Parts, ',', '</x><x>') + '</x>' as xml) XmlColumn
    from @Device
)tt
cross apply
    XmlColumn.nodes('x') as Nodes(x)


insert @Part
select distinct PartName
from @DevicePartTemp

insert @DevicePart
select tmp.DeviceId, prt.PartId
from @DevicePartTemp tmp 
    join @Part prt on
        prt.PartName = tmp.PartName

-- Result:

select *
from @Part

PartId      PartName
----------- ---------
1           Part1
2           Part2
3           Part3
4           Part4


select *
from @DevicePart

DeviceId    PartId
----------- -----------
1           1
1           2
1           3
2           2
2           3
2           4
3           1   


回答2:

You will need a Tally table to accomplish this without a cursor.

Follow the instructions to create a tally table here: Tally Tables by Jeff Moden

This script will put the table into your Temp database, so you probably want to change the "Use DB" statement

Then you can run the script below to insert a breakdown of Devices and Parts into a temp table. You should then be able to join on your part table by the part name (to get the ID) and insert into your new DevicePart table.

select *, 
--substring(d.parts, 1, t.n)
substring(d.parts, t.n, charindex(', ', d.parts + ', ',t.n) - t.n) 'Part'
into #devicesparts
from device d
cross join tally t
where t.n < (select max(len(parts))+ 1 from device)
and substring(', ' + d.parts, t.n, 1) = ', '


回答3:

Have a look at using fn_Split to create a table variable from the comma separated values. You can then use this to drive your insert.

EDIT: Actually, I think you may still need a cursor. Leaving this answer incase fn_Split helps.



回答4:

If there is a maximum number of parts per device then, yes, it can be done without a cursor, but this is quite complex.

Essentially, create a table (or view or subquery) that has a DeviceID and one PartID column for each possible index in the PartID string. This can be accomplished by making the PartID columns calculated columns using fn_split or another method of your choice. From there you do a multiple self-UNION of this table, with one table in the self-UNION for each PartID column. Each table in the self-UNION has only one of the PartID columns included in the select list of the query for the table.