I am working on fleet management. I am having large amount of writes on a location table with following columns
- date
- time
- vehicle no.
- long
- latitude
- speed
- userid (which is foreign key...)
Here this table is going to have write operation every 3 sec. Hence there will be millions of record in it.
So to retrieve faster data I AM PLANNING Partition.
Now my question:-
- How to handle foreign key? I heard that partition does not support foreign key
- Which column should be used for partition.
- is it necessary to have unique key as a partition column.
There will be trillions of record
@rc-Thanks man..what abt the performance ...see i am inserting data after every 3 sec so i have to run a check procedure every time i insert the data...so what about the performance???
2>I would like to go partition column as vehicle no.....is there any alternate way...
Read: MySQL Partitioning Limitations
1.) FKs are not supported on partitioned tables.
- One option is to create a stored procedure that inserts/updates the record and to verify inside the procedure that the user id passed is present in your users table before the insert takes place. You should set up the permissions on the table so that only the SP is allowed to update and insert to allow applications and/or users from backdooring the check. You will also need to take precautions when removing users from the users table.
2.) Which column you use for partitioning will depend on how your accessing the table. If your queries are always based on vechicle no., then it probably makes sense to do a hash partition on that column. If you're querying or reporting more on something like "what vehicles have been added this month" or you want to "roll" partitions out as they become a certain age, then partitioning on date may be the way to go. This is something you'll have to decided based on your usage.
3.) See the link above for more information.
Edit based on user question:
Inserting a record every 3 seconds is not a lot of throughput. Make sure you have a primary key on your users table in order for the check inside the procedure to be done efficiently. (This is true even if FKs were supported) The DB would be doing this check for you behind the scenes if you had support for FK's so in that sense, it's not hurting you. If the check ends up being a bottleneck, you may feel the need to drop it and possibly report errant user ids as a nightly batch process, but if you're user table is relatively small and indexed correctly I don't see this being an issue.
Another option would be to do the partitioning manually (i.e. sharding) with partitioned or non-partitioned tables. With the non-partitioned tables of course, you could use native foreign keys. For example you would split your vehicles table into multiple tables like: (assuming you want to use the vehicleNo as the "key")
VehiclesNosLessThan1000
VehiclesNosLessThan2000
VehiclesNosLessThan...
VehiclesNosLessThanMAX
Here you probably want to have an SP again so that the application/user doesn't have to know about the tables. The SP would be responsible for inserting/updating the correct table based on the vehicleNo passed in. You would also want an SP for selecting data so that the app/user doesn't have to know the table to select from either. For easy access to all the data, you can create a view that unions all the tables together.
Note that one benefit of this is that currently MyISAM locks an entire partitioned table during updates, not just the partition it is updating. Sharding a table this way alleviates that contention because the tables themselves are the "partitions".
Based on the limited data I have on what you're doing, I would probably write 2 stored procedures, 1 for selecting the data and 1 for updating/inserting the data and have your application use those for all access. Then I would try the regular partitioning via hash on vehicleNo first while enforcing the user_id key within the procedure. If this becomes an issue, you can easily migrate to sharding the data across multiple tables while not having to change the application because all the logic on how to retrieve and update the data is contained within the SPs.