Monday, March 05, 2012

I need to be there by 9: Transit times in Toronto aboard the TTC

I recently realized that with all of the open data initiatives pursued by the government, it's possible for amateurs to grab this data and process it themselves. That's, of course, the whole point of open data initiatives, but I hadn't realized until the last few months that that included me.

With my interest in transit, I decided to grab some of the online transit data and see what I could do with it. Since Google has managed to convince everyone to put their transit route and timing information online in the form of GTFS, I decided that a good first step in playing with transit data would be to grab this GTFS feed for the TTC and chart out transit times in the city. The result is below. It shows how early you have to leave from various parts of the city to reach either Queen or Osgoode subway stations before 9am.

Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

Why downtown during rush hour?
I decided to chart out transit times to downtown during rush hour because this is the most common transit pattern. Most people want to go downtown, and they need to go during rush hour. Even if you do want to go somewhere other than downtown, it's often fastest to still go downtown first because transit systems are usually optimized for taking people to and from downtown the fastest. Of course, the chart isn't representative of off-peak travel times. And sometimes transit systems are overly oriented towards moving people downtown, so it might actually easier to go downtown than to go to the local mall on transit, for example. I was thinking of maybe trying to calculate the transit times to various points in the city, or maybe calculate the transit time needed to reach 70th percentile of the city from any particular location, or maybe the time needed to get 5km away from any location, but doing that would require more complicated coding, and I was feeling lazy, so I opted for the simpler calculation of how long it takes to get downtown.

Simulation Setup
I took the GTFS data for Wednesday, February 15. I took all of the bus stops and transit stations in the data feed and used the transit time information to simulate how long it would take to move between the various stops. In the transit simulation, people are able to walk between stops for a distance of up to 100m at a time at a speed of 4km/h. In reality, some people are willing to much further than 100m if it saves them time, but the 100m restriction prevents the simulation from letting people do things like walk across a highway or across the Don Valley or something. During the simulation, people might walk for a total distance of more than 100m, but they will only walk up to 100m between any two stops. The walk time is simply the time needed to walk the straight line distance between the GPS coordinates of two stops--it doesn't take into consideration buildings in way, terrain, crossing the street, etc.

Weaknesses with the Model
One of the weaknesses of this approach is that it assumes that transit system adheres perfectly to the GTFS data. In reality, buses don't arrive and leave perfectly according to the schedule. People don't try to arrange their bus transfers so that they have exactly 5 seconds to make a transfer. Buses and subways are often late, so people normally would take earlier buses than what the chart shows because they need to ensure there's enough buffer time to make a transfer. There is no sensitivity analysis done on the data--so if I were to run the simulation again but calculate when people would need to leave to get to downtown by 8:55, the calculated transit times might end up being dramatically different. Also, the rule of thumb is that people consider waiting for a transfer to feel like twice as long as actually riding on transit, and that "feel" isn't incorporated into the model either.

Obvious Anomalies in the Map
If you look at the map, you'll notice that there are a lot of purple dots along the Bloor subway line. I suspect that those dots refer to a late-night/early-morning bus that runs when the subway is closed. Due to the 100m walking restriction, those stops can only be "reached" by those overnight buses, but those buses probably don't run during the rush hour since people can just walk a few hundred metres and take the subway instead. I haven't verified this yet.

Previously, I thought that GO Transit doesn't come often enough and that the distances within the city were small enough, that simulating GO Transit wouldn't be useful. But looking at the map now, it looks like that if GO trains come at least twice an hour, then they might provide a benefit for people living at the outer limits of the city, so I might have to incorporate GO data into the simulation later as well.

On the map, you can see the effect of subway lines in the suburbs. Around the subway stations, the transit times are much lower than in the surrounding areas. If you live near a subway station, then you get decent transit times to downtown. Otherwise, you need to take a bus to the station first, which increases your travel time. This effect is particularly prominent on the Sheppard subway and the eastern end of the Bloor subway. Overall, having a subway somewhere in the neighbourhood does seem to improve overall travel times though.

Another annoyance with the way this map is generated is that it's hard to separate the effect of frequency of service from the effect of speed of service. Are poor transit times in some areas caused by infrequent service so people have to wait a long time for transfers or are the times caused by buses moving slowly through congested parts of the city? It's also hard to compare public transit times vs. car times because there's no easy way to calculate how long it takes to drive through the city during rush hour. Oh well, I think it makes a good first step at analyzing some of this public data though.

No comments:

Post a Comment