Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 6 years ago.
I am working on LearnBiz Simulations, a self-funded start-up which makes simulations for the domain of management education.
(a) Our website is currently catering to about 16k people. Currently, size of database is 30 mb with a total of about 90 tables and each table may have anywhere between 5 to 50 columns. All our tables in the database, repeatedly have new rows of insertions, deletions or updations. But no columns are ever added. Does it create any issues with employing indexing techniques? There were a couple of other forums and videos which suggests that even inserting or deleting a row will render indexing useless!
(b) The maximum work on our website is mysql queries with some processing of lines of code ranging between 200-800 lines. Almost 95%+ of our queries have multiple equality where clause, to pin-point the row required. If I understand correctly, then employing indexing will help us speed up the process multiple times faster?
(c) There are some simulations which process data of many participants every minute. For such simulations, currently we are not able to cater even 50 participants at a time. Would employing indexing help such simulations bear many more users?
(d) The server we have currently is a VDS with GoDaddy (costs around $400 annually). To make our website faster, would it make sense for us to shift to a DDS (costing around $3000 annually) or get something like TokuDB? Is there a way to judge user capacity of current system, i.e. server + database + coding?
(a) Those forums and videos are worthless, mindlessly repeating half-truths (yes, there is an overhead when using indexes, it's just that the performance gain usually outweighs it many times)
(b) In majority of cases, although be careful to create indexes that are actually useful. MySQL documentation has a whole chapter on how to do that (in general you will want to have a look at the entire Optimization chapter too
(c) Make sure your benchmark is not simulating too heavy traffic. 50 real users at a time will not generate 50 connections each second for example. Again, you performance should increase after implementing indexes and optimizing your queries
(d) No amount of better resources will help if your database server is not configured properly (do you use query cache? do you allow MySQL to use enough memory to keep tables in memory? etc.)
To sum up: read about basic configuration of your MySQL server, so that it can utilize your resources effectively (defaults are usually not good enough), and also have a look at the Optimization chapter in the manual.
Regarding the indexes questions, an index is not only desirable, but it is needed to "speed up" things. As I understand it (in "layman" terms), the function of an index is to speed up searches and recoveries of data inside a table.
Reasons to use indexes:
- Identify uniquely each row on each table (the primary key is an index, after all)
- The indexes are sorted (even if the data is not)
- Speed up searches and filters: An index makes the recovery of data faster, since it "holds" the placing of the data in the table (it "pinpoints" the data you want to recover). Also it makes easier to the database engine to filter the data (it is always faster and simpler to filter sorted data than scrambled data)
- Optimize the way data is recovered when using related tables: Every foreign key must be indexed in order to speed up the queries that involve primary - foreign key relations
Some "thumb rules" I use to decide which fields need to be indexed:
- Every primary key is indexed (the obvious one: a primary key must be unique and not null)
- Every foreign key must be indexed (in order to make primary - foreign key relations efficient)
- Every numeric or date field on which I need to perform searches must be indexed. That said, I try avoid
double
(or any other floating point numeric type) fields to be indexed, since they are generaly used to store values not meant to be searched.
- Every
char
or varchar
field on which I need to perform searches must be indexed. Try to avoid indexes on text
fields, since they can hold very big values in them.
- Avoid indexing binary (
blob
) fields... it makes no sense
- Don't fall in the temptation of indexing everything. Take your time to decide which fields must be indexed and which fields must not be indexed.