SQL Table Creation from Joining Data

2019-09-17 19:31发布

问题:

Before I lay out the question, here’s the information concerning my data:

Table Name: dbo.DecodedCSVMessages_Staging
Columns: MMSI, Message_ID, Time, Vessel_Name, Ship_Type, IMO, Dimension_to_Bow, Dimension_to_stern, Dimension_to_port, Dimension_to_starboard, Draught, Longitude, Latitude

I need to create a New Table. This following is what I need in the table:

I am interested in all this data, but I only need Message_ID’s that are 1 or 3. ** Problem is, Message_ID’s 1 and 3 **lack the following: (Which is only available with Message_ID's 5.)

Vessel_Name,  Ship_Type, IMO, 
Dimension_to_Bow, 
Dimension_to_stern,
Dimension_to_port, 
Dimension_to_starboard, 
Draught

For Message_ID’s 1 and 3, those columns are marked NULL. All they have is

Longitude, 
Latitude, 
Time, 
MMSI

(which are all marked NULL for Message_ID's equaling 5)

MMSI is the primary key in this instance. Message_ID’s 1, 3 and 5 will all have MMSI numbers that represent a given ship. These MMSI’s though are reoccurring as each ship sends out multiple Message’s of type 1, 3 and 5. So say we have an MMSI of 210293000, This number will be alongside several Message_ID’s different types. So what I need to do is grab all the Message_ID’s that are 1 and 3 and append the information from the Message_ID’s that are 5 to the1’s and 3’s. So in that, the columns are no longer NULL.

Last but not least, I have to select only Message_ID 1’s and 3’s that fall within the following:

Where Latitude > 55 and Latitude <85 and Longitude > 50 and Longitude < 141;

Example of how a few columns look:

MMSI/ Message_ID /Time/Ship_type/Vessel_Name/Latitude/Longitude

21029300, 3, 2012-06-01, NULL, NULL, 56.528003, 85.233443

21029300, 5, 2012-07-01, 70, RIO_CUBAL, NULL, NULL

2109300, 1, 2012-08-01, NULL, NULL, 57.432345, 131.123343

2109300, 1, 2012-09-01, NULL, NULL, 62.432345, 121.123343

2109300, 1, 2012-09-02, NULL, NULL, 65.432345, 140.123343

21029300, 5, 2012-08-01, 70, RIO_CUBAL, NULL, NULL

The end result would be as follows from this data:

21029300, 3, 2012-06-01, 70, RIO_CUBAL, 56.528003, 85.233443

2109300, 1, 2012-08-01, 70, RIO_CUBAL, 57.432345, 131.123343

2109300, 1, 2012-09-01, 70, RIO_CUBAL, 62.432345, 121.123343

2109300, 1, 2012-09-02, 70, RIO_CUBAL, 65.432345, 140.123343

Thanks!

回答1:

In a single transaction, create the new table, use INSERT INTO ... SELECT ... to transfer the data from DecodedCSVMessages_Staging into the new table, then DELETE FROM ... to delete the old data from DecodedCSVMessages_Staging.



回答2:

Taking into account all the information from the comments section (especially about the one-time-thing note) you can try the following SQL statement. This will, however, only work if the relevant data of the "type 5" messages really is consistent as you stated!

SELECT Messages.MMSI
    ,Messages.Message_ID
    ,Messages.TIME
    ,Type5Messages.Vessel_Name
    ,Type5Messages.Ship_Type
    ,Type5Messages.IMO
    ,Type5Messages.Dimension_to_Bow
    ,Type5Messages.Dimension_to_stern
    ,Type5Messages.Dimension_to_port
    ,Type5Messages.Dimension_to_starboard
    ,Type5Messages.Draught
    ,Messages.Longitude
    ,Messages.Latitude
INTO [DataBaseName].[dbo].[YourNewTableName]
FROM dbo.DecodedCSVMessages_Staging Messages
LEFT OUTER JOIN (
    SELECT DISTINCT MMSI
        ,Vessel_Name
        ,Ship_Type
        ,IMO
        ,Dimension_to_Bow
        ,Dimension_to_stern
        ,Dimension_to_port
        ,Dimension_to_starboard
    FROM dbo.DecodedCSVMessages_Staging
    WHERE Messages.Message_ID = 5
    ) Type5Messages
    ON Messages.MMSI = Type5Messages.MMSI
WHERE Messages.Message_ID IN (1,3)
    AND Messages.Latitude > 55
    AND Messages.Latitude < 85
    AND Messages.Longitude > 50
    AND Messages.Longitude < 141;

In this statement you select all "type 1" and "type 3" rows and join them with all "type 5" rows that have the same MMSI value. If the "type 5" information is consistent regarding the selected columns, there will be only one entry for each "type 5" row and therefore every "type 1" and "type 3" row will appear only once. You should definitely do some testing, though.
The INTO [DataBaseName].[dbo].[YourNewTableName] line (which you will have to adjust naming-wise) creates a new table (if you choose a name that has not been used yet) with the original table's data types and inserts the selected rows. You might want to alter this table later on if you need indixes, keys, relationships or whatsoever.



回答3:

You can select the Type 1 and Type 3 messages, with added information from a join to the first corresponding Type 5 record in your data. (If there is no corresponding Type 5 record, you will get nulls for those fields.) Try this:

SELECT DISTINCT M13.MMSI, M13.Message_ID, M13.Time, M13.Latitude, M13.Longitude,
M5.Vessel_Name, M5.Ship_Type, M5.IMO, M5.Dimension_to_Bow
M5.Dimension_to_stern, M5.Dimension_to_port, 
M5.Dimension_to_starboard, M5.Draught
FROM dbo.DecodedCSVMessages_Staging M13
JOIN (
SELECT MMSI, Time, Vessel_Name, Ship_Type, IMO, Dimension_to_Bow
Dimension_to_stern, Dimension_to_port, Dimension_to_starboard, 
Draught
FROM dbo.DecodedCSVMessages_Staging
WHERE Message_ID = 5
ORDER BY Time
) M5
ON M5.MMSI = M13.MMSI
WHERE M13.Message_ID IN (1, 3)
AND M13.Latitude > 55
AND M13.Latitude < 85
AND M13.Longitude > 50
AND M13.Longitude < 141
ORDER BY M13.Time

If this returns the data you want, then create a new table and insert the records using INSERT INTO NewTable SELECT.

Please be aware that MMSI is not a primary key, either in the old table or in the new table. Primary keys (PK) have to be unique, and in this case you have multiple entries for each MMSI.

When you create your new table, you should add an integer IDENTITY field so that it will have a PK. This IDENTITY field isn't included in the insert statement, but it is automatically populated with an incrementing integer. That way each record gets a unique PK, which is highly desirable.

EDITED to use a DISTINCT on the outer query.