We’ll be doing calculations on the 2023 Yellow Taxi Trips data using pandas. As in Lecture 18, and contrary to the course policy, you’ll be writing the code in this Lab using generative AI only. You can use whatever tool you like.
Step 0
The data needs to be available on the machine where Python is running in order to process it, so let’s download it from the NYC Open Data site directly:
Open https://data.cityofnewyork.us/resource/4b4i-vvec.csv in your browser, which should download the first thousand rows. We’ll talk about getting more data when we get to APIs.
Move the CSV to the same directory as this notebook.
Rename the CSV something meaningful.
Confirm you can see the file in the VSCode Explorer. You may need to tell it to refresh (🔄 button).
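Before moving on, it can help to confirm that Python sees the file too, not just the Explorer. The following is a minimal sketch, not the required approach: the file name `yellow_taxi_trips.csv` and the helper `load_trips` are hypothetical, so substitute whatever name you gave the CSV.

```python
from pathlib import Path

import pandas as pd

# Hypothetical file name: use whatever you renamed the download to.
CSV_PATH = Path("yellow_taxi_trips.csv")


def load_trips(path=CSV_PATH):
    """Read the downloaded CSV into a DataFrame, failing early if it is missing."""
    if not Path(path).exists():
        raise FileNotFoundError(f"{path} not found; is it next to the notebook?")
    return pd.read_csv(path)
```

Calling `load_trips()` in a notebook cell and checking `.shape` on the result should show roughly a thousand rows if the download worked as described above.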
Step 1
Print out the trip distances.
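For comparison with what your AI tool produces, here is a minimal sketch of one way to do this. It assumes the distance column is named `trip_distance` (check the header row of your CSV), and it uses a few made-up rows in place of the real file so that it runs on its own.

```python
import io

import pandas as pd

# Made-up stand-in rows so the sketch runs without the download.
# In the Lab, read your renamed file with pd.read_csv(...) instead.
csv_text = "trip_distance,payment_type\n1.2,1\n0.8,2\n5.4,1\n"
trips = pd.read_csv(io.StringIO(csv_text))

# Assumption: the distance column is called "trip_distance".
distances = trips["trip_distance"]

# to_string() prints every value instead of pandas' truncated preview.
print(distances.to_string(index=False))
```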
# AI code here
Step 2
Calculate the average ride distance.
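One plausible shape for the answer is a single call to `Series.mean()`. The sketch below again assumes a `trip_distance` column and uses made-up rows rather than the real download.

```python
import io

import pandas as pd

# Made-up stand-in rows so the sketch runs without the download.
# In the Lab, read your renamed file with pd.read_csv(...) instead.
csv_text = "trip_distance,payment_type\n1.0,1\n2.0,2\n6.0,1\n"
trips = pd.read_csv(io.StringIO(csv_text))

# mean() skips missing values by default, so blank distances are ignored.
average_distance = trips["trip_distance"].mean()
print(f"Average ride distance: {average_distance:.2f}")
```

The units are whatever the dataset records for distance; the data dictionary on the dataset page is the place to confirm them.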
# AI code here
Step 3
Your turn! Calculate the percent of trips that were paid for by credit card. The data dictionary will be helpful - see the Attachment on the dataset page.
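If you want something to check your result against, the sketch below shows one common pattern: compare the column to a code and take the mean of the resulting True/False values. It assumes the column is named `payment_type` and that code `1` means credit card; both are assumptions to verify against the data dictionary. The rows are made up.

```python
import io

import pandas as pd

# Assumption: payment_type code 1 means "Credit card"; confirm this in the
# data dictionary attached to the dataset page before relying on it.
CREDIT_CARD_CODE = 1

# Made-up stand-in rows so the sketch runs on its own.
# In the Lab, read your renamed download with pd.read_csv(...) instead.
csv_text = "trip_distance,payment_type\n1.0,1\n2.0,2\n6.0,1\n3.0,1\n"
trips = pd.read_csv(io.StringIO(csv_text))

# Comparing gives a True/False column; its mean is the fraction of True rows.
is_credit_card = trips["payment_type"] == CREDIT_CARD_CODE
percent_credit_card = is_credit_card.mean() * 100
print(f"{percent_credit_card:.1f}% of trips were paid by credit card")
```

Note that rows with a missing payment type count as "not credit card" here, since they stay in the denominator. Whether that is the right choice is worth deciding explicitly.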
# AI code here
Step 4
Save a random sample of the trips to a new CSV.
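A minimal sketch of this step, using `DataFrame.sample` and `DataFrame.to_csv`. The helper name `save_random_sample`, the default sample size, and the `seed` parameter are illustrative choices, not part of the Lab's requirements.

```python
import pandas as pd


def save_random_sample(trips, out_path, n=100, seed=None):
    """Write up to `n` randomly chosen rows of `trips` to `out_path` as CSV."""
    n = min(n, len(trips))  # sample() raises if n exceeds the row count
    sample = trips.sample(n=n, random_state=seed)
    sample.to_csv(out_path, index=False)  # index=False leaves out the row labels
    return sample
```

For example, `save_random_sample(trips, "trips_sample.csv", n=100, seed=42)` would write 100 rows next to the notebook. Passing a fixed `seed` makes the sample reproducible; leaving it as `None` gives a different sample on each run.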
# AI code here
Step 5
Write a paragraph or two of reflection on Lecture 18 and this Lab, specifically around the strict use of generative AI.
What worked well?
What didn’t work well?
Did this change how you’re thinking about generative AI?
YOUR RESPONSE HERE