import plotly.io as pio

pio.renderers.default = "notebook_connected+plotly_mimetype"

Start by importing the necessary packages:

import pandas as pd
import plotly.express as px

districts = pd.read_csv("https://storage.googleapis.com/python-public-policy2/data/311_community_districts.csv.zip")
districts.head()

Map complaint counts by CD

Load the GeoJSON data using the requests package (nothing to do with 311 requests):

import requests

response = requests.get("https://data.cityofnewyork.us/resource/5crt-au7u.geojson")
shapes = response.json()
print("loaded")

# intentionally not outputting the data here since it's large
loaded

This is equivalent to the use of urlopen() and json.load() in the Plotly examples.
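For comparison, here is a minimal sketch of the standard-library approach, assuming only that the URL returns JSON. The helper name load_geojson is hypothetical and is not part of Plotly or of the code above.

```python
import json
from urllib.request import urlopen


def load_geojson(url):
    """Fetch a URL and parse its body as JSON using only the standard library.

    Hypothetical helper for illustration; it plays the same role as
    requests.get(url).json() in the cell above.
    """
    with urlopen(url) as response:
        # json.load() reads from the file-like response object and parses it
        return json.load(response)


# Usage (not run here, since it needs network access):
# shapes = load_geojson("https://data.cityofnewyork.us/resource/5crt-au7u.geojson")
```

Either approach ends with the same kind of nested Python object, so the choice mostly comes down to whether requests is already installed.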

Notes:

  • boro_cd is the property we’re looking for. Since it sits under each feature’s properties, we’ll specify it as the featureidkey using the path properties.boro_cd.

  • response.json() turns JSON data into nested Python objects: shapes is a dictionary, features is a list beneath it, etc.
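The nesting described in these notes can be explored with ordinary dictionary and list indexing. The sketch below uses a tiny, made-up stand-in called sample_shapes rather than the real download, so the values (and the choice to store boro_cd as a string) are assumptions for illustration only.

```python
# A tiny, made-up stand-in for the parsed GeoJSON; the real data has many
# more features and actual geometries.
sample_shapes = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"boro_cd": "101"}, "geometry": None},
        {"type": "Feature", "properties": {"boro_cd": "105"}, "geometry": None},
    ],
}

# The top level is a dictionary, and "features" is a list beneath it
print(type(sample_shapes).__name__)
print(type(sample_shapes["features"]).__name__)

# Each feature is a dictionary; "properties.boro_cd" describes this path
first_feature = sample_shapes["features"][0]
print(first_feature["properties"]["boro_cd"])

# Collect every district ID found in the sample
district_ids = [f["properties"]["boro_cd"] for f in sample_shapes["features"]]
print(district_ids)
```

Running the same lookups against shapes is a quick way to confirm which property name to match on before plotting.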

def plot_nyc(df):
    """This function makes a choropleth map of NYC, using a DataFrame with a boro_cd and a requests_per_capita column."""

    fig = px.choropleth_map(
        df,
        locations="boro_cd",  # column name to match on
        color="requests_per_capita",  # column name for values
        geojson=shapes,
        featureidkey="properties.boro_cd",  # GeoJSON property to match on
        center={"lat": 40.71, "lon": -73.98},
        zoom=9,
        height=600,
        title="Requests per capita across Community Districts",
    )

    fig.show()

Wrapping this Plotly code in a function makes it reusable for plotting different DataFrames.

plot_nyc(districts)

Midtown (boro_cd 105) is an outlier whose high value skews the color scale, making the other districts hard to tell apart. Let’s exclude it.

no_midtown = districts[districts["boro_cd"] != 105]
plot_nyc(no_midtown)
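One way to confirm an outlier before filtering it out is to rank the districts and compare the top value to the median. The sketch below uses made-up numbers in a hypothetical DataFrame named sample, not the real 311 data; only the column names match the ones used above.

```python
import pandas as pd

# Made-up values for illustration only; not the real 311 data
sample = pd.DataFrame(
    {
        "boro_cd": [101, 105, 301, 401],
        "requests_per_capita": [0.9, 4.2, 0.7, 0.5],
    }
)

# Rank districts from highest to lowest requests per capita
ranked = sample.sort_values("requests_per_capita", ascending=False)
print(ranked)

# Compare the top district to the median across all districts
top = ranked.iloc[0]
top_cd = int(top["boro_cd"])
ratio = top["requests_per_capita"] / sample["requests_per_capita"].median()
print(f"District {top_cd} is {ratio:.1f} times the median")

# Drop the top district, using the same filtering pattern as above
without_top = sample[sample["boro_cd"] != top_cd]
```

A large ratio suggests that a single district dominates the color range, which is the situation the filter above addresses.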