For this project, use the IMDB Excel file that you submitted for Project 1. You may download a fresh copy of the original data set if you prefer, but make sure you re-create the PrimaryKey column. Use this PrimaryKey column for all pivot tables and associated visualizations that deal with counts.
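For anyone cross-checking the workbook outside Excel, the idea behind the PrimaryKey column can be sketched in plain Python. The rows, titles, and lower-case column names below are made up for illustration and are not taken from the actual IMDB file. The sketch assumes that a pivot-table Count skips blank cells, which is presumably why counts should be based on a column that is never blank.

```python
# Toy stand-in for rows of the IMDB sheet; every value here is made up.
rows = [
    {"movie_title": "Film A", "country": "USA", "title_year": 2014},
    {"movie_title": "Film B", "country": "USA", "title_year": 2015},
    {"movie_title": "Film C", "country": "UK", "title_year": 2015},
    {"movie_title": "Film D", "country": None, "title_year": 2015},  # blank cell
]

# Re-create the key: a unique, never-blank sequence number for each row.
for key, row in enumerate(rows, start=1):
    row["PrimaryKey"] = key


def count_non_blank(rows, column):
    """Count rows whose value in `column` is not blank (assumed pivot Count behavior)."""
    return sum(1 for row in rows if row[column] is not None)


count_by_key = count_non_blank(rows, "PrimaryKey")   # every row is counted
count_by_country = count_non_blank(rows, "country")  # the blank row is skipped
```

With this toy data, `count_by_key` is 4 while `count_by_country` is 3, so only the key-based count reflects every movie.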
Make sure that all visualizations include x- and y-axis labels, a title, a legend (if appropriate), and data labels (if appropriate). 105pts
Instructions:
Which countries produce the most movies?
1. Create a new worksheet called “byCountry_byYear”: Create a pivot table and associated visualization to show the total number of films made by each country for each of the last 5 years. Use the “best” visualization that shows totals for each country in each year. 10 pts
1a. Describe the initial patterns that you see.
1b. Modify the pivot to filter out the country with the most films across the last 5 years and describe the new patterns you see. 5pts
How many movies are produced by year?
2. Create a new worksheet called “byYear”: Create a pivot table and a bar chart that show the total number of films made within each year for all years. Identify the year in which the most movies were made and change the color of that bar to highlight it. (Don’t filter any years out.) 10pts
2a. Describe the basic patterns that you see. 5pts
What are the average budget and gross revenues by year for movies made in the U.S.?
3. Create a new worksheet called “Average_Budget_Revenue”: Create a pivot table, a bar chart, and a line chart that show both the average budget and average gross revenue by year for movies made in the United States. 15pts
3a. Which visualization is better? Explain why. Describe the patterns that you see. What are some potential issues with this analysis? (Remember the source of this data set.) 5pts
What are the average budget and average gross revenues by content rating?
4. Create a new worksheet called “Content Rating”. Create a pivot table and the best visualization that shows Average Budget and Average Gross Revenue by Content Rating. Filter your pivot and visualization by Country: USA and Title_Years: last 10 years. 10 pts
4a. Based on this information, if you were a movie producer, what would you want your content rating to be and why? 5pts
4b. Now add the Count of movies (make sure you use the PrimaryKey column) to the pivot table and apply the best conditional formatting to that column to highlight the differences in counts. Does this change your answer to 4a? Explain why or why not. 5pts
4c. Now add the Max Gross to the pivot table and apply the best conditional formatting to that column to highlight the differences in maximum gross. Does this change your answer to 4a? Explain why or why not. 5pts
What are the average budget and average gross revenues by Lead Actor?
5. Create a new worksheet called “Lead Actor”. Create a pivot table and the best visualization that shows Average Budget and Average Gross Revenue by Actor_1_name. Filter your pivot and visualization by Country: USA, Title_Years: last 10 years, and the Content_Rating you selected in 4a. 10 pts
5a. Based on this information, if you were a movie producer, who would you want your lead actor to be and why? 5pts
5b. Add the Total “Actor_1_facebook likes” to the pivot table and apply the best conditional formatting to that column to highlight the differences in total likes. Does this change your response to 5a? Why or why not? 5pts
Make your own advanced visualization.
6. Create your own advanced visualization. Clearly describe the problem or question that you’re trying to address, show your work, and explain the answer that you arrived at. Make sure the analysis is complex: it should not be something as simple as “the total number of movies in the data set” or even the “total number of movies by year”. Your analysis should be more interesting and more involved than that.
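As a way to sanity-check the Excel pivot tables from the earlier questions, the same filter-then-group logic can be sketched in plain Python. The example mirrors the shape of question 4 (averages, a key-based count, and a maximum per content rating). All rows are made up, the lower-case column names are assumptions rather than the real headers, and "last 10 years" is assumed to be measured back from the latest year in the data.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows; column names are assumed, not copied from the real file.
rows = [
    {"PrimaryKey": 1, "country": "USA", "title_year": 2015, "content_rating": "PG-13", "budget": 100, "gross": 300},
    {"PrimaryKey": 2, "country": "USA", "title_year": 2016, "content_rating": "PG-13", "budget": 200, "gross": 100},
    {"PrimaryKey": 3, "country": "USA", "title_year": 2016, "content_rating": "R", "budget": 50, "gross": 80},
    {"PrimaryKey": 4, "country": "UK", "title_year": 2016, "content_rating": "R", "budget": 40, "gross": 10},
    {"PrimaryKey": 5, "country": "USA", "title_year": 2001, "content_rating": "R", "budget": 30, "gross": 20},
]

# Filter step: USA only, and only the 10 most recent years (assumed definition).
latest_year = max(row["title_year"] for row in rows)
selected = [
    row for row in rows
    if row["country"] == "USA" and row["title_year"] > latest_year - 10
]

# Row-label step: group the remaining rows by content rating.
groups = defaultdict(list)
for row in selected:
    groups[row["content_rating"]].append(row)

# Values step: one summary per rating, counting movies by their unique key.
pivot = {
    rating: {
        "avg_budget": mean(r["budget"] for r in members),
        "avg_gross": mean(r["gross"] for r in members),
        "count": len({r["PrimaryKey"] for r in members}),
        "max_gross": max(r["gross"] for r in members),
    }
    for rating, members in groups.items()
}
```

In this toy data the "R" group rests on a single movie, which illustrates why an average can be misleading without the count beside it.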

