Before I lay out the question, here’s the information concerning my data:
Table Name: dbo.DecodedCSVMessages_Staging
Columns: MMSI, Message_ID, Time, Vessel_Name, Ship_Type, IMO, Dimension_to_Bow, Dimension_to_stern, Dimension_to_port, Dimension_to_starboard, Draught, Longitude, Latitude
I need to create a New Table. This following is what I need in the table:
I am interested in all this data, but I only need Message_ID’s that are 1 or 3. ** Problem is, Message_ID’s 1 and 3 **lack the following: (Which is only available with Message_ID's 5.)
Vessel_Name, Ship_Type, IMO,
Dimension_to_Bow,
Dimension_to_stern,
Dimension_to_port,
Dimension_to_starboard,
Draught
For Message_ID’s 1 and 3, those columns are marked NULL. All they have is
Longitude,
Latitude,
Time,
MMSI
(which are all marked NULL for Message_ID's equaling 5)
MMSI is the primary key in this instance. Message_ID’s 1, 3 and 5 will all have MMSI numbers that represent a given ship. These MMSI’s though are reoccurring as each ship sends out multiple Message’s of type 1, 3 and 5. So say we have an MMSI of 210293000, This number will be alongside several Message_ID’s different types. So what I need to do is grab all the Message_ID’s that are 1 and 3 and append the information from the Message_ID’s that are 5 to the1’s and 3’s. So in that, the columns are no longer NULL.
Last but not least, I have to select only Message_ID 1’s and 3’s that fall within the following:
Where Latitude > 55 and Latitude <85 and Longitude > 50 and Longitude < 141;
Example of how a few columns look:
MMSI/ Message_ID /Time/Ship_type/Vessel_Name/Latitude/Longitude
21029300, 3, 2012-06-01, NULL, NULL, 56.528003, 85.233443
21029300, 5, 2012-07-01, 70, RIO_CUBAL, NULL, NULL
2109300, 1, 2012-08-01, NULL, NULL, 57.432345, 131.123343
2109300, 1, 2012-09-01, NULL, NULL, 62.432345, 121.123343
2109300, 1, 2012-09-02, NULL, NULL, 65.432345, 140.123343
21029300, 5, 2012-08-01, 70, RIO_CUBAL, NULL, NULL
The end result would be as follows from this data:
21029300, 3, 2012-06-01, 70, RIO_CUBAL, 56.528003, 85.233443
2109300, 1, 2012-08-01, 70, RIO_CUBAL, 57.432345, 131.123343
2109300, 1, 2012-09-01, 70, RIO_CUBAL, 62.432345, 121.123343
2109300, 1, 2012-09-02, 70, RIO_CUBAL, 65.432345, 140.123343
Thanks!
In a single transaction, create the new table, use
INSERT INTO ... SELECT ...
to transfer the data fromDecodedCSVMessages_Staging
into the new table, thenDELETE FROM ...
to delete the old data fromDecodedCSVMessages_Staging
.Taking into account all the information from the comments section (especially about the one-time-thing note) you can try the following SQL statement. This will, however, only work if the relevant data of the "type 5" messages really is consistent as you stated!
In this statement you select all "type 1" and "type 3" rows and join them with all "type 5" rows that have the same
MMSI
value. If the "type 5" information is consistent regarding the selected columns, there will be only one entry for each "type 5" row and therefore every "type 1" and "type 3" row will appear only once. You should definitely do some testing, though.The
INTO [DataBaseName].[dbo].[YourNewTableName]
line (which you will have to adjust naming-wise) creates a new table (if you choose a name that has not been used yet) with the original table's data types and inserts the selected rows. You might want to alter this table later on if you need indixes, keys, relationships or whatsoever.You can select the Type 1 and Type 3 messages, with added information from a join to the first corresponding Type 5 record in your data. (If there is no corresponding Type 5 record, you will get nulls for those fields.) Try this:
If this returns the data you want, then create a new table and insert the records using INSERT INTO NewTable SELECT.
Please be aware that MMSI is not a primary key, either in the old table or in the new table. Primary keys (PK) have to be unique, and in this case you have multiple entries for each MMSI.
When you create your new table, you should add an integer IDENTITY field so that it will have a PK. This IDENTITY field isn't included in the insert statement, but it is automatically populated with an incrementing integer. That way each record gets a unique PK, which is highly desirable.
EDITED to use a
DISTINCT
on the outer query.