This is a school project carried out in 2017.
The goal is to understand how does public transportation networks are thought to be efficients in terms of travel time and building cost.
We want to answer the following problem :
Given the geographical localisations of several subway stations, we would like to know how to find a "good" graph to connect them.
Here, "good" means that the average travel time and the building cost of the network should be reasonably lows.
In order to understand each of the following notations, please look at the formal definition of the problem from the report document.
However, here is an overview to help you get a quick mathematical understanding of the problem :
Then the problem sum up to :
We use data from Open Data RATP to get, for each station of the Parisian subway, its localisation and its frequentation over one year.
After preprocessing the raw data, we get a dataframe of 302 stations that looks like this :
Station name | Longitude | Latitude | Entries over one year |
---|---|---|---|
'Abbesses' | 2.3387128116588 | 48.884417645184 | 2417881 |
'Alésia' | 2.3267456737192 | 48.828398514348 | 5124204 |
'Alexandre Dumas' | 2.3949898158233 | 48.856174448968 | 3780611 |
Minimizing both the building cost and the average travel time of a transport network is an impossible problem because these two objectives are contradictory.
Here, we try a first strategy which is to start from the optimal solution for the cost : a minimal spanning tree.