Before I lay out the question, here’s the information concerning my data:
Table Name: dbo.DecodedCSVMessages_Staging
Columns: MMSI, Message_ID, Time, Vessel_Name, Ship_Type, IMO, Dimension_to_Bow, Dimension_to_stern, Dimension_to_port, Dimension_to_starboard, Draught, Longitude, Latitude
I need to create a New Table. This following is what I need in the table:
I am interested in all this data, but I only need Message_ID’s that are 1 or 3. ** Problem is, Message_ID’s 1 and 3 **lack the following: (Which is only available with Message_ID's 5.)
Vessel_Name, Ship_Type, IMO,
Dimension_to_Bow,
Dimension_to_stern,
Dimension_to_port,
Dimension_to_starboard,
Draught
For Message_ID’s 1 and 3, those columns are marked NULL. All they have is
Longitude,
Latitude,
Time,
MMSI
(which are all marked NULL for Message_ID's equaling 5)
MMSI is the primary key in this instance. Message_ID’s 1, 3 and 5 will all have MMSI numbers that represent a given ship. These MMSI’s though are reoccurring as each ship sends out multiple Message’s of type 1, 3 and 5. So say we have an MMSI of 210293000, This number will be alongside several Message_ID’s different types. So what I need to do is grab all the Message_ID’s that are 1 and 3 and append the information from the Message_ID’s that are 5 to the1’s and 3’s. So in that, the columns are no longer NULL.
Last but not least, I have to select only Message_ID 1’s and 3’s that fall within the following:
Where Latitude > 55 and Latitude <85 and Longitude > 50 and Longitude < 141;
Example of how a few columns look:
MMSI/ Message_ID /Time/Ship_type/Vessel_Name/Latitude/Longitude
21029300, 3, 2012-06-01, NULL, NULL, 56.528003, 85.233443
21029300, 5, 2012-07-01, 70, RIO_CUBAL, NULL, NULL
2109300, 1, 2012-08-01, NULL, NULL, 57.432345, 131.123343
2109300, 1, 2012-09-01, NULL, NULL, 62.432345, 121.123343
2109300, 1, 2012-09-02, NULL, NULL, 65.432345, 140.123343
21029300, 5, 2012-08-01, 70, RIO_CUBAL, NULL, NULL
The end result would be as follows from this data:
21029300, 3, 2012-06-01, 70, RIO_CUBAL, 56.528003, 85.233443
2109300, 1, 2012-08-01, 70, RIO_CUBAL, 57.432345, 131.123343
2109300, 1, 2012-09-01, 70, RIO_CUBAL, 62.432345, 121.123343
2109300, 1, 2012-09-02, 70, RIO_CUBAL, 65.432345, 140.123343
Thanks!
In a single transaction, create the new table, use INSERT INTO ... SELECT ...
to transfer the data from DecodedCSVMessages_Staging
into the new table, then DELETE FROM ...
to delete the old data from DecodedCSVMessages_Staging
.
Taking into account all the information from the comments section (especially about the one-time-thing note) you can try the following SQL statement. This will, however, only work if the relevant data of the "type 5" messages really is consistent as you stated!
SELECT Messages.MMSI
,Messages.Message_ID
,Messages.TIME
,Type5Messages.Vessel_Name
,Type5Messages.Ship_Type
,Type5Messages.IMO
,Type5Messages.Dimension_to_Bow
,Type5Messages.Dimension_to_stern
,Type5Messages.Dimension_to_port
,Type5Messages.Dimension_to_starboard
,Type5Messages.Draught
,Messages.Longitude
,Messages.Latitude
INTO [DataBaseName].[dbo].[YourNewTableName]
FROM dbo.DecodedCSVMessages_Staging Messages
LEFT OUTER JOIN (
SELECT DISTINCT MMSI
,Vessel_Name
,Ship_Type
,IMO
,Dimension_to_Bow
,Dimension_to_stern
,Dimension_to_port
,Dimension_to_starboard
FROM dbo.DecodedCSVMessages_Staging
WHERE Messages.Message_ID = 5
) Type5Messages
ON Messages.MMSI = Type5Messages.MMSI
WHERE Messages.Message_ID IN (1,3)
AND Messages.Latitude > 55
AND Messages.Latitude < 85
AND Messages.Longitude > 50
AND Messages.Longitude < 141;
In this statement you select all "type 1" and "type 3" rows and join them with all "type 5" rows that have the same MMSI
value. If the "type 5" information is consistent regarding the selected columns, there will be only one entry for each "type 5" row and therefore every "type 1" and "type 3" row will appear only once. You should definitely do some testing, though.
The INTO [DataBaseName].[dbo].[YourNewTableName]
line (which you will have to adjust naming-wise) creates a new table (if you choose a name that has not been used yet) with the original table's data types and inserts the selected rows. You might want to alter this table later on if you need indixes, keys, relationships or whatsoever.
You can select the Type 1 and Type 3 messages, with added information from a join to the first corresponding Type 5 record in your data. (If there is no corresponding Type 5 record, you will get nulls for those fields.) Try this:
SELECT DISTINCT M13.MMSI, M13.Message_ID, M13.Time, M13.Latitude, M13.Longitude,
M5.Vessel_Name, M5.Ship_Type, M5.IMO, M5.Dimension_to_Bow
M5.Dimension_to_stern, M5.Dimension_to_port,
M5.Dimension_to_starboard, M5.Draught
FROM dbo.DecodedCSVMessages_Staging M13
JOIN (
SELECT MMSI, Time, Vessel_Name, Ship_Type, IMO, Dimension_to_Bow
Dimension_to_stern, Dimension_to_port, Dimension_to_starboard,
Draught
FROM dbo.DecodedCSVMessages_Staging
WHERE Message_ID = 5
ORDER BY Time
) M5
ON M5.MMSI = M13.MMSI
WHERE M13.Message_ID IN (1, 3)
AND M13.Latitude > 55
AND M13.Latitude < 85
AND M13.Longitude > 50
AND M13.Longitude < 141
ORDER BY M13.Time
If this returns the data you want, then create a new table and insert the records using INSERT INTO NewTable SELECT.
Please be aware that MMSI is not a primary key, either in the old table or in the new table. Primary keys (PK) have to be unique, and in this case you have multiple entries for each MMSI.
When you create your new table, you should add an integer IDENTITY field so that it will have a PK. This IDENTITY field isn't included in the insert statement, but it is automatically populated with an incrementing integer. That way each record gets a unique PK, which is highly desirable.
EDITED to use a DISTINCT
on the outer query.