I am working on an offline C# application that can find bus routes. I can extract the timetable/bus/route data. I am searching for the most simple solution that will work with basic data.
What algorithm can be used to find a route from bus stop "A" to bus stop "B"? Is there a open-source solution ready for C#/Java? Is the google GTFS format for database good for a simple solution? http://code.google.com/transit/spec/transit_feed_specification.html
Thanks for any help. I am stuck with this. I don't know where to start - how to store the data and how to find routes. I know about Dijkstra/A* but I have used them only on graphs that were not time dependent...
Conceptually, you take the same basic algorithm for evaluating distance between A and B, but instead of distance, you should be evaluating time. Dijkstra can do both, if you give it the proper inputs.
You're used to seeing a map as a measure of distance. However, the same map can be a measure of time as well; all you need is to add data about average speed, and the time it takes to cover a particular distance of a particular road will shake itself out. You can even visualize the map in terms of time; routes that take longer will be longer. Dijkstra doesn't care which it's evaluating, really; it just cares about finding the continuous route with the lowest number, and whether that number represents length or time is immaterial.
To incorporate speed, naive algorithms simply use the daytime speed limit and assume you never have to stop while going from A to B; more advanced algorithms can incorporate information about time of day and traffic patterns (which will impact the average speed you travel on that road at that time), and whether a road is a freeway or surface street (and thus make educated guesses about time spent stopped at an intersection). What you use depends on what you have available, but a basic 4- or 5-layer time of day dimension should be adequate for all but the absolute most time-critical applications. For each direction of each road in your map, you need the average speed during morning rush, daytime, evening rush and night, possibly with lunchtime numbers as well. Once you have that, it's a relatively basic change to a Dijkstra algorithm to pass in a time of day and have it evaluate routes based on time.
The problem you are working on is not a trivial task. So much so, that is has a name: the mixed integer nonlinear programming problem (MINLP). In the words of one author (Deb 1998):
In Deb's paper he proposes a genetic algorithm.
Your other option would be to use simulation. Just to throw something out there you can try right away-- choose thousands of random routes that start at your origin, and fish out the ones that work reasonably well at getting to the destination.
Picture the algorithm like this: You are trying to find the quickest route from stop A to stop B, starting at a certain time. You hire 1,000 people and arm them with a quarter to flip. You tell them to flip the coin every time they have a chance to get on or off a bus. Heads, get off (or get on, if already off). Tails, stay on (or keep waiting, if off). They each have an index card to write down the choices they make as they go. You go to point B and wait for the first guy to show up and take his card.
Try this as your data model.
Bus 1 goes to stops A, B and C. Bus 2 goes to stops B, D and E.
I would store a unique node based on both bus and stop, with distances to nodes being based on time. We would have node A1, B1, C1, B2, D2 and E2. In the special case of transfers apply the wait for the next bus as the distance between nodes. For example, if Bus 1 arrives at stop B 30 minutes before Bus 2, travel time from B1 to B2 is 30 minutes.
You should than be able to apply Dijkstra's Algorithm.
If you are interested in time information, then why not label the distance values on the graph edges using the time information instead of their physical distance from each other. That way you will search for the quickest route, instead of the shortest route. You can then use Dijkstra/A* in order to compute the your result.
I'm a little unclear on what you mean by time dependent. If you mean that that you need to answer queries of the form 'goes from x to y before 10am' then you can calculate what bus routes arrive before 10am, seems like a simple filter on the data. Then apply Dijkstra/A* to the data.
Just wanted to share my final approach on this problem. This was part of a university project, so it might not be fully suitable for real world use. It had to be reasonably fast to run on a Windows Mobile device.
I ended up using 4 tables (SQLite). One table keeps the list of buses, the second one keeps a list of stations. Another table keeps the combinations - what bus does stop at a specific station and where does it go from this station and how long (minutes) does it take. All combinations have to be stored. And the last table is a simple time-table. For every station it lists every bus that stops there and the time (I stored the time as integer value - 14:34 is 1434, to make it faster for comparing).
I used a bi-directional breadth first search algorithm. The neighbor stations (directly accessible) are retrieved for the start station and destination station. There is a path from station A to station X if these two "graphs" overlap on a station. For example, from station A you can get to stations B,C,D,E (by using specific buses). And from the destination station X you can get directly to N,C,Z. These two graphs overlap on station C. So a path A -> C -> X exists. If no paths are found in this first iteration, then the algorithm continues and spreads out the graphs (BFS) again.
Time is not taken into account in the first step - this makes it fast enough. You only get a list of possible paths with a list of buses that you have to use to take these paths. The times are evaluated in the last step, you go through the list of possible paths and check if the bus travels within the specific time (increasing the time every stop).
On a small city, with 250 stations and 100+ buses/railways, these approach works up to 3 changes (where you have to change the buses on the journey). It takes only seconds to calculate. But I had to load the whole database into memory (Dictionary) to speed up the queries, which were taking too long.
I don't think this would work for a large network though. But it works for a public transport of a single small-medium size city.
read this:
Multi-Modal Route Planning. Master's thesis, Universität Karlsruhe (TH), Fakultät für Informatik, 2009. online available at http://i11www.ira.uka.de/extra/publications/p-mmrp-09.pdf
the section on railway routing also applies for bus routing.
The gist of it: the naive approach of expanding space and time into a single graph does not work for large networks. There are smarter solutions.