About RentAtlas

A clear, iterative and reproducible project to help renters compare Montréal neighbourhoods using safety, parks & amenities, transit access and parking. Below I explain how the site was made, the data used, and known limitations so you can understand how the results were obtained.

How this site was built

This site is a lightweight front-end (HTML/CSS/Leaflet/JS) that reads a precomputed GeoJSON file containing the geometry of each FSA (Forward Sortation Area, the first three characters of a postal code) along with its score for each component. The data processing and score calculations are performed in Jupyter notebooks, mainly using geopandas, pandas and NumPy.

Index Breakdown

Safety - Crime Index: This index is computed mainly from the SPVM's open data, expressed per capita for each FSA. Proximity to fire stations is also included at a lower weight, as a proxy for emergency response times.
Parks - Amenities Index: This index uses Montreal's main public parks database. For each FSA, it measures the number of parks within its bounds, the percentage of the FSA's area that they collectively cover, and a graduated evaluation of the distance from the FSA's centroid to the nearest park. It also uses a federally provided 'Canopy' overlay to measure greenery coverage as a percentage of each FSA's land area. This was particularly helpful for boroughs outside of the city proper (where park data is missing), and it shows that the suburbs have plenty of overall greenery despite having fewer dedicated parks. Public pools are also factored in, using techniques similar to the parks grading scheme.
Transit Access Index: This index evaluates public transit stop density per FSA and proximity to stops, with extra proximity weighting for Metro stations, since they are more sought after. The proximity scoring is similar to the one used for parks.
Parking Index: This index measures the availability of street parking spots per capita, to give a rough estimate of how easily parking can be found on those streets. Suburbs are mostly excluded from this dataset, which unfortunately limits the accuracy of this index in those areas. For this reason, the Parking Index is weighted lower by default.
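
The "graduated evaluation" of proximity used for parks (and again for Metro stations) can be pictured as a step function over distance. The sketch below is only illustrative: the distance bands, the scores attached to them and the function name proximity_score are assumptions, not the values used in the notebooks.

```python
# Hypothetical distance bands (metres) and the score awarded inside each band.
# These numbers are placeholders for illustration, not the project's values.
BANDS = [(250, 1.0), (500, 0.75), (1000, 0.5), (2000, 0.25)]


def proximity_score(distance_m: float) -> float:
    """Return a graduated 0..1 score that drops as distance grows."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    for max_distance, score in BANDS:
        if distance_m <= max_distance:
            return score
    return 0.0  # farther than the last band


for d in (100, 600, 5000):
    print(d, proximity_score(d))
```

A stepped score like this keeps one very distant centroid from dominating the scale, at the cost of treating all distances within a band as equal.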

Processing steps (summary)

Collect raw data: FSA geographic and general data, crime incidents, park database, public transit stops, parking space registries and other secondary data sources. Ideally there would also be price, walkshed (or simply walkability) and job opportunity data.
Normalize each dataset's spatial data to FSAs (spatial joins, area-weighted where appropriate).
Compute per-FSA metrics: per-capita parking, transit stops per 1,000 residents, park area per capita, crime rate per 1,000 residents, etc.
Apply transformations to limit the impact of outliers and normalize distributions. Some outlier effects remain: places like Mont-Royal Park still get a full 1.0 score on greenery.
Scale transformed metrics to 0..1 using min-max scaling or percentile rank, depending on the indicator.
Combine component scores with user-adjustable weights to compute the overall score. The result is exported to the master FSA GeoJSON, which is the source of all data displayed on the map.
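
The transform, scale and combine steps above can be sketched with pandas and NumPy as follows. This is a minimal illustration under stated assumptions: the FSA codes, column names, values and weights are made up, and the helper min_max is hypothetical. The notebooks may organize these steps differently.

```python
import numpy as np
import pandas as pd

# Toy per-FSA metrics; the FSA codes, column names and values are made up.
df = pd.DataFrame(
    {
        "fsa": ["H1A", "H2B", "H3C", "H4D"],
        "transit_stops_per_1000": [2.0, 5.0, 9.0, 60.0],  # last row is an outlier
        "park_area_per_capita": [4.0, 10.0, 1.0, 25.0],
    }
).set_index("fsa")


def min_max(series: pd.Series) -> pd.Series:
    """Scale a series linearly so its minimum maps to 0 and its maximum to 1."""
    span = series.max() - series.min()
    if span == 0:
        return pd.Series(0.0, index=series.index)
    return (series - series.min()) / span


# 1. Compress heavy right tails before scaling.
transformed = np.log1p(df)

# 2. Scale each indicator: min-max for one, percentile rank for the other.
#    Note that rank(pct=True) yields values in (0, 1], so the lowest FSA
#    gets 1/n rather than 0.
scores = pd.DataFrame(
    {
        "transit": min_max(transformed["transit_stops_per_1000"]),
        "parks": transformed["park_area_per_capita"].rank(pct=True),
    }
)

# 3. Combine with weights, normalized so they sum to 1 (hypothetical values).
weights = pd.Series({"transit": 2.0, "parks": 1.0})
weights = weights / weights.sum()
scores["overall"] = (scores[["transit", "parks"]] * weights).sum(axis=1)

print(scores.round(3))
```

Normalizing the weights keeps the overall score within 0..1 whatever values the weight sliders take, as long as they are non-negative and not all zero.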

Known issues & limitations

Projection / geometry mismatches: mixing sources with different coordinate reference systems requires reprojection. Mistakes may have been made here, which would distort areas and densities.

Sparse or small-population FSAs: per-1,000 metrics can be unstable for FSAs with tiny populations, because small denominators can inflate rates.
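
A quick way to see the small-denominator effect is to compute rates for a few FSAs and flag those below a population cut-off. The sketch below uses made-up counts, and the MIN_POPULATION threshold is a hypothetical value chosen for illustration, not a rule used by the project.

```python
# Made-up incident counts and populations for three FSAs.
fsas = {
    "H1A": {"incidents": 120, "population": 30000},
    "H2B": {"incidents": 3, "population": 150},  # tiny population
    "H3C": {"incidents": 45, "population": 18000},
}

MIN_POPULATION = 1000  # hypothetical cut-off, not a value from the project


def rate_per_1000(incidents: int, population: int) -> float:
    """Incidents per 1,000 residents."""
    return 1000 * incidents / population


# FSAs whose rate rests on too few residents to be trusted.
flagged = [
    code for code, row in fsas.items() if row["population"] < MIN_POPULATION
]

for code, row in fsas.items():
    rate = rate_per_1000(row["incidents"], row["population"])
    note = "  (unstable: small population)" if code in flagged else ""
    print(f"{code}: {rate:.1f} per 1,000{note}")
```

Here three incidents among 150 residents produce a rate five times higher than 120 incidents among 30,000, which is why such FSAs deserve separate handling.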

Temporal mismatch: datasets are from different years (e.g. 2019 transit vs 2022 crime), so the combined score is not guaranteed to reflect a single point in time.

Parking assumptions: the parking index aggregates different parking types (permit, paid, free) using simplifying rules. Local variations (temporary parking bans, private lots) aren't fully captured and are well beyond the scope of this project.

Aggregation decisions: how point data (stops, incidents) are aggregated to FSAs affects results: simple counts, per capita rates, and spatial weighting all give slightly different outcomes.

These limitations mean the map should be used for exploratory comparison rather than as an authoritative decision-making tool. When possible, verify with local knowledge and cross-reference other sources when the results don't add up.

Reproducibility and code

All processing steps should be in the repo. Here are some key points to reproduce the pipeline:

Install dependencies: geopandas, pandas, numpy, shapely, rasterio (and more if needed).
Run the notebooks that download/clean each source, then the aggregation notebook that creates the per-FSA CSV and GeoJSON.
Export the final file to data/processed/fsa_master.geojson, which is the file the front end reads.
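
After exporting, a small sanity check on the file can catch a broken run before the front end loads it. The sketch below uses only the standard library. The property names in REQUIRED_PROPERTIES and the function check_master_geojson are hypothetical, so adjust them to whatever the aggregation notebook actually writes.

```python
import json
from pathlib import Path

# Hypothetical property names; the real export may use different ones.
REQUIRED_PROPERTIES = ("safety", "parks", "transit", "parking")


def check_master_geojson(path: str) -> list[str]:
    """Return a list of problems found in the exported GeoJSON (empty if none)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    problems = []
    if data.get("type") != "FeatureCollection":
        problems.append("top-level type is not FeatureCollection")
    for i, feature in enumerate(data.get("features", [])):
        props = feature.get("properties") or {}
        for name in REQUIRED_PROPERTIES:
            value = props.get(name)
            if not isinstance(value, (int, float)) or not 0 <= value <= 1:
                problems.append(f"feature {i}: '{name}' missing or outside 0..1")
    return problems


if __name__ == "__main__":
    path = "data/processed/fsa_master.geojson"
    if Path(path).exists():
        print(check_master_geojson(path) or "looks OK")
```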

What I learned and main issues encountered

Data cleaning is the majority of the work: missing attributes, inconsistent column names, and duplicated records required careful cleaning and plenty of trial and error.
Edge cases matter: industrial FSAs or very small islands around Montreal distort spatial averages and required manual exclusions or separate handling.
Transformations help: using log1p on outlier-heavy stats made the choropleth more informative than raw counts, particularly for my parks and transit layers.
Normalization choices change ranking: min-max scaling and percentile ranking produce different spreads. Percentile ranking gives a roughly uniform distribution and is more 'fair' in handling extreme outliers.
Performance: large GeoJSON files slow the browser down a lot. Strategies included simplifying geometries, along with other AI-suggested optimizations.
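
One common, low-risk way to shrink a GeoJSON file, alongside geometry simplification, is to round coordinates to a fixed number of decimals (five decimals of a degree is roughly one metre). The sketch below is an assumption about what such a step could look like, not a description of what this project does. The function names and the sample feature are made up.

```python
import json


def round_coords(coords, ndigits=5):
    """Recursively round every number in a GeoJSON coordinate array."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(c, ndigits) for c in coords]


def shrink_geojson(collection: dict, ndigits: int = 5) -> dict:
    """Return a copy of a FeatureCollection with rounded coordinates."""
    features = []
    for feature in collection["features"]:
        geometry = feature.get("geometry")
        if geometry and "coordinates" in geometry:
            geometry = {
                **geometry,
                "coordinates": round_coords(geometry["coordinates"], ndigits),
            }
        features.append({**feature, "geometry": geometry})
    return {**collection, "features": features}


# Made-up feature with more precision than a web map needs.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"fsa": "H1A"},
            "geometry": {
                "type": "Point",
                "coordinates": [-73.567891234, 45.501234567],
            },
        }
    ],
}

small = shrink_geojson(sample)
# Compact separators also drop the whitespace json.dumps adds by default.
print(len(json.dumps(sample)), len(json.dumps(small, separators=(",", ":"))))
```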

How you can help and contribute

If you find better data sources, notice a bug or want to improve the scoring methodology, let me know by email. Useful contributions include:

Improved parking datasets or municipal permit rules
Walkshed, employment or price info
Performance improvements for the front-end GeoJSON
Help integrating and marketing to Kijiji, Marketplace or other rental platforms that could use such a decision tool

Contact: XxGeoSlayerxX on GitHub or email at t_coull@live.concordia.ca