You will find the following 3 files in the datasets.zip file:
– *artists.csv*
– *albums.csv*
– *tracks.csv*
You may want to first open the files in a text editor and inspect them carefully to get an idea of the fields in each data table. The column labels are self-explanatory.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import math
import thinkstats2
import thinkplot
%matplotlib inline
**Q1** Load the 3 data files (mentioned above) as dataframes: *artists_df*, *albums_df* and *track_df* respectively.
Then, show the first 5 records of the dataframes.
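As a hint, loading a CSV file and previewing it could look roughly like the sketch below. The in-memory text and its column names (`id`, `name`) are hypothetical stand-ins so the example runs on its own; the real files may have different columns.

```python
import io
import pandas as pd

# Hypothetical stand-in for artists.csv; the real file's columns may differ.
sample_csv = io.StringIO(
    "id,name\n"
    "a1,Artist One\n"
    "a2,Artist Two\n"
)

# With the real data this would be pd.read_csv("artists.csv").
artists_df = pd.read_csv(sample_csv)

# head() returns the first 5 rows by default (fewer if the frame is shorter).
first_rows = artists_df.head()
print(first_rows)
```

The same two calls can be repeated for *albums.csv* and *tracks.csv*.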
**Q1.1** Merge the *artists_df* and *albums_df* dataframes and name the new dataframe *art_alb_df* and show the first 5 rows.
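One possible way to merge is sketched below on toy data. It assumes that the albums table refers to its artist through a column called `artist_id` and that both tables have `id` and `name` columns; these names are assumptions, so check the actual headers first.

```python
import pandas as pd

# Toy frames; the column names 'id', 'name', 'artist_id' and 'album_type'
# are assumptions, not taken from the real files.
artists_df = pd.DataFrame({
    "id": ["a1", "a2"],
    "name": ["Artist One", "Artist Two"],
})
albums_df = pd.DataFrame({
    "id": ["b1", "b2", "b3"],
    "name": ["First", "Second", "Third"],
    "artist_id": ["a1", "a1", "a2"],
    "album_type": ["album", "single", "album"],
})

# Join each album to its artist. Columns that exist in both frames
# ('id' and 'name') get a suffix so they stay distinguishable.
art_alb_df = artists_df.merge(
    albums_df,
    left_on="id",
    right_on="artist_id",
    how="inner",
    suffixes=("_artist", "_album"),
)
print(art_alb_df.head())
```

An inner join drops artists without albums and albums without a matching artist; `how="left"` would keep every artist instead.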
**Q1.2** Show the artist name and the number of albums for the 20 artists with the most albums (**not singles**).
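A minimal sketch of the filter-then-count pattern follows. It assumes the merged frame has a `name_artist` column and an `album_type` column whose values include `"album"` and `"single"`; both are assumptions about the data.

```python
import pandas as pd

# Toy merged frame; column names and category labels are assumptions.
art_alb_df = pd.DataFrame({
    "name_artist": ["A", "A", "A", "B", "B", "C"],
    "album_type": ["album", "album", "single", "album", "single", "single"],
})

# Keep only rows that are albums, so singles are excluded from the count.
albums_only = art_alb_df[art_alb_df["album_type"] == "album"]

# Count albums per artist and sort from most to fewest.
album_counts = (
    albums_only.groupby("name_artist")
    .size()
    .sort_values(ascending=False)
)

# head(20) keeps at most the 20 largest counts.
top_artists = album_counts.head(20)
print(top_artists)
```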
**Q1.3** How many singles and how many albums has Alicia Keys released since 2010 (inclusive)?
Note that the release date of an album can be either just a year or a full date.
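Because the release date may be either a bare year or a full date, one option is to compare on the year only. The sketch below assumes the dates are strings that always start with a four-digit year, and uses invented rows with a placeholder artist name; the column names are assumptions as well.

```python
import pandas as pd

# Invented rows for illustration only; not real release data.
df = pd.DataFrame({
    "name_artist": ["Artist One"] * 4 + ["Artist Two"],
    "album_type": ["album", "single", "single", "album", "album"],
    "release_date": ["2012-11-22", "2010", "2009-12-15", "2003", "2016-01-01"],
})

# Take the first four characters as the year, which works for both
# "2010" and "2012-11-22" under the stated assumption.
release_year = df["release_date"].astype(str).str[:4].astype(int)

# Select one artist's releases from 2010 onwards (inclusive).
mask = (df["name_artist"] == "Artist One") & (release_year >= 2010)

# Count the selected releases by type (album vs. single).
counts = df.loc[mask, "album_type"].value_counts()
print(counts)
```

For the question itself, the placeholder name would be replaced by the artist's name exactly as it is spelled in the data.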

