First time here so forgive me for any faux pas. I have a question about the limitation of SQL as I am new to the code, and what I need I believe to be rather complex.
Is it possible to automate finding the optimal data for a specific query. For example, say I have the following columns:
1) Vehicle type (Text) e.g. car,bike,bus
2) Number of passengers (Numeric) e.g. 0-7
3) Was in an accident (Boolean) e.g. t or f
From here, I would like to get percentages. So if I were to select only cars with 3 passengers, what percentage of the total accidents does that account for.
I understand how to get this as a one off or mathematically calculate it, however my question relates how to automate this process to get the optimum number.
So, keeping with this example, say I look at just cars, what number of passengers covers the highest percentage of accidents?
At the moment, I am currently going through and testing number by number, is there a way to 'find' the optimal number? It is easy when it is just 0-7 like in the example, but I would naturally like to deal with a larger range and even multiple ranges. For example, say we add another variable titled:
4) Number of doors (numeric) e-g- 0-3
Would there be a way of finding the best combination of numbers from these two variables that cover the highest percentage of accidents?
So say we took: Car, >2 passengers, <3 doors on the vehicle. Out of the accidents variable 50% were true
But if we change that to:Car, >4 passengers, <3 doors. Out of the accidents variable 80% were true.
I hope I have explained this well. I understand that this is most likely not possible with SQL, however is there another way to find these optimum numbers?
Thanks in advance