I’m working on a computer science multi-part question and need an explanation and answer to help me learn.
1 Maryland Traffic Violations
Perform exploratory analysis on the Kaggle Maryland Traffic Violations dataset.
Answer the following questions:
1. Which vehicle colors are more likely to be involved in a traffic
violation?
2. Which car models are more likely to be involved in a traffic violation?
This is an open-ended question. I encourage you to try as many data preprocessing and exploratory analysis tasks as you can possibly do. I am ready to be
impressed.
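
As a starting point for both questions, here is a minimal pandas sketch that counts how often each color and model appears among the recorded violations. The column names `Color` and `Model` are assumptions about the CSV header, the helper `top_categories` is a hypothetical name, and the small `sample` frame is made up purely to show the call shape; check all of these against the real file. Note that raw counts only show which colors and models appear most often among violations. Saying which are "more likely" would also need a baseline, such as how many vehicles of each color or model are on the road.

```python
import pandas as pd

def top_categories(df: pd.DataFrame, column: str, n: int = 10) -> pd.DataFrame:
    """Return the n most frequent values of `column` with counts and shares."""
    # Normalize free-text entries so "black " and "BLACK" count as one value.
    cleaned = (
        df[column]
        .dropna()
        .astype(str)
        .str.strip()
        .str.upper()
    )
    cleaned = cleaned[cleaned != ""]
    counts = cleaned.value_counts().head(n)
    # Share is relative to all non-missing entries, not only the top n.
    return pd.DataFrame({"count": counts, "share": counts / len(cleaned)})

# Tiny made-up sample, only to illustrate usage; this is not real data.
sample = pd.DataFrame({
    "Color": ["BLACK", "black ", "SILVER", None, "WHITE", "BLACK"],
    "Model": ["ACCORD", "Accord", "CAMRY", "CIVIC", None, "ACCORD "],
})

color_summary = top_categories(sample, "Color")
model_summary = top_categories(sample, "Model")
print(color_summary)
print(model_summary)
```

On the real data, the same function could be called on the loaded DataFrame in place of `sample`. Model names are typically free text, so further cleaning (for example, merging spelling variants) may change the ranking.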
2 Comments:
1. You can download the data here: Traffic Violations in Maryland County | Kaggle. It’s about 500 MB uncompressed. A Kaggle
Notebook has a limit of 100 GB per dataset, and Google Colab has a limit
of 70 GB of storage.
2. You may use pluto, as it is a powerful server with few restrictions. To work
on a data science project on pluto, the easiest way is to install Anaconda
under your own directory. Then use an SSH tunnel to access your Notebook
from a browser anywhere, such as your home. You may Google ’SSH
Tunnel Jupyter Notebook’ for instructions.
3. You can also use your own computer.
4. R is also allowed for this homework.
5. The most relevant skill-set you may need for this assignment is Pandas.
You may find a quick tutorial here: Learn Pandas Tutorials | Kaggle
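
Since the file is about 500 MB uncompressed, one possible way to keep memory use down in pandas is to read only the columns needed and store repeated text values as categories. The sketch below assumes the header contains `Make`, `Model`, and `Color`; the helper name `load_vehicle_columns` is hypothetical, and the in-memory `demo_csv` is a made-up stand-in for the real file path.

```python
import io
import pandas as pd

# Assumed header names; verify them against the downloaded CSV.
VEHICLE_COLUMNS = ("Make", "Model", "Color")

def load_vehicle_columns(source, columns=VEHICLE_COLUMNS) -> pd.DataFrame:
    """Read only the listed columns, stored as memory-efficient categories."""
    return pd.read_csv(
        source,
        usecols=list(columns),
        dtype={name: "category" for name in columns},
    )

# Made-up two-row CSV standing in for the real file; not real data.
demo_csv = io.StringIO(
    "Description,Make,Model,Color\n"
    "example stop,HONDA,ACCORD,BLACK\n"
    "example stop,TOYOTA,CAMRY,SILVER\n"
)

vehicles = load_vehicle_columns(demo_csv)
print(vehicles.dtypes)
print(vehicles.memory_usage(deep=True))
```

With the real dataset, `source` would be the path to the downloaded CSV. Category columns pay off most when a column has few distinct values relative to its length, which is plausible for color but less certain for free-text model names.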

