General notebook information

We’ll be doing calculations on the 2023 Yellow Taxi Trips data using pandas. As in Lecture 18, and contrary to the usual course policy, you’ll write the code in this Lab using generative AI only. You can use whatever tool you like.

Step 0

The data needs to be available on the machine where Python is running in order to process it, so let’s download it directly from the NYC Open Data site:

  1. Open https://data.cityofnewyork.us/resource/4b4i-vvec.csv in your browser, which should download the first thousand rows.

    • We’ll talk about getting more data when we get to APIs.

  2. Move the CSV to the same directory as this notebook.

  3. Rename the CSV to something meaningful.

  4. Confirm you can see the file in the VSCode Explorer. You may need to tell it to refresh (🔄 button).

Step 1

Print out the trip distances.

# AI code here
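
For comparison with whatever your AI tool produces, here is a minimal sketch of one way to print a column with pandas. It assumes the distance column is named `trip_distance`, so check the header row of your own file. The inline `SAMPLE_CSV` rows are made up for illustration so the snippet runs on its own. With your real download, you would pass your file name to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Made-up stand-in rows for illustration only. With the real file, use:
#   trips = pd.read_csv("your_file_name.csv")
SAMPLE_CSV = """trip_distance,payment_type
1.2,1
3.4,2
0.8,1
5.0,1
"""

trips = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Assumed column name; confirm it against the header of your CSV.
distances = trips["trip_distance"]

# to_string() prints every value instead of pandas' truncated preview.
print(distances.to_string(index=False))
```

Plain `print(distances)` also works, but pandas abbreviates long columns, which matters once you have a thousand rows.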

Step 2

Calculate the average ride distance.

# AI code here
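
A minimal sketch of the average calculation, again assuming a `trip_distance` column and using made-up inline rows so it can run without the downloaded file:

```python
import io

import pandas as pd

# Made-up stand-in rows for illustration only. With the real file, use:
#   trips = pd.read_csv("your_file_name.csv")
SAMPLE_CSV = """trip_distance,payment_type
1.0,1
2.0,2
3.0,1
6.0,1
"""

trips = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Series.mean() skips missing values (NaN) by default.
average_distance = trips["trip_distance"].mean()

print(f"Average ride distance: {average_distance:.2f}")
```

Because `mean()` ignores missing values by default, it is worth asking whether rows with a blank or zero distance should count toward the average in your data.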

Step 3

Your turn! Calculate the percentage of trips that were paid for by credit card. The data dictionary will be helpful; see the Attachment on the dataset page.

# AI code here
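
One possible approach, sketched under two assumptions you should verify against the data dictionary: that the payment column is named `payment_type`, and that the code `1` means credit card. The inline rows are made up for illustration.

```python
import io

import pandas as pd

# Made-up stand-in rows for illustration only. With the real file, use:
#   trips = pd.read_csv("your_file_name.csv")
SAMPLE_CSV = """trip_distance,payment_type
1.2,1
3.4,2
0.8,1
5.0,1
"""

trips = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Assumption: payment_type code 1 is "credit card"; confirm in the data dictionary.
CREDIT_CARD_CODE = 1

# Comparing a column to a value gives True/False per row; the mean of those
# booleans is the fraction of rows that are True.
is_credit_card = trips["payment_type"] == CREDIT_CARD_CODE
percent_credit_card = is_credit_card.mean() * 100

print(f"Paid by credit card: {percent_credit_card:.1f}%")
```

Taking the mean of a boolean column avoids counting and dividing by hand, and it keeps the denominator tied to the same rows as the numerator.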

Step 4

Save a random sample of the trips to a new CSV.

# AI code here
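
A minimal sketch of sampling and saving, using made-up inline rows. The output file name `taxi_trips_sample.csv` and the sample size are hypothetical choices, and the file is written to the system temporary directory here only so the snippet runs anywhere. In the Lab you would likely write next to the notebook instead.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Made-up stand-in rows for illustration only. With the real file, use:
#   trips = pd.read_csv("your_file_name.csv")
SAMPLE_CSV = """trip_distance,payment_type
1.2,1
3.4,2
0.8,1
5.0,1
2.5,2
"""

trips = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Draw 3 rows at random, without replacement. Fixing random_state makes the
# same rows come back on every run; drop it for a fresh sample each time.
sample = trips.sample(n=3, random_state=0)

# Hypothetical output location and name.
output_path = Path(tempfile.gettempdir()) / "taxi_trips_sample.csv"

# index=False leaves the original row numbers out of the file.
sample.to_csv(output_path, index=False)

print(f"Wrote {len(sample)} rows to {output_path}")
```

`DataFrame.sample` also accepts `frac=` (for example `frac=0.1` for 10% of the rows) if a proportion is more natural than a fixed count.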

Step 5

Write a paragraph or two of reflection on Lecture 18 and this Lab, specifically around the strict use of generative AI.

  • What worked well?

  • What didn’t work well?

  • Did this change how you’re thinking about generative AI?

YOUR RESPONSE HERE

Step 6

Submit.