This dataset was provided by the World Bank for analysis of climate change. For this problem, we are going to clean up the data in preparation for an analysis. Human activity that increases GDP (gross domestic product), such as the production of goods and services, frequently produces carbon dioxide emissions. As such, we are going to look at the relationship between GDP and CO2 emissions.
The three tabs in the exam1_1.xlsx file are as follows:
- Data – this tab has the country code, country name, Series Code, Series Name, SCALE, Decimals, and years 1990 – 2011. You will be using a specific series to answer these questions as described below.
- Country – this tab has data about the country including the Capital City, its region, and the Income Group for the country.
- Series – this tab describes the data in the Data tab. It explains what each series is, gives its definition, and lists the sources of the data.
For this problem, you will be turning in a single Excel file with all of the information described below.
First, create a new tab named “GDP” (gross domestic product) which has just the list of countries and the GDP data. This information is in the series for GDP in dollars (GDP $).
Second, create a new tab named “Total CO2” which has just the list of countries and the CO2 data. This information is in the part of the data named “CO2 emissions, total (KtCO2)”.
Third, create a new tab named “GDP and CO2” that has the list of countries, the GDP from 1990 and 2008, and the CO2 from 1990 and 2008. Remove every country that is missing any GDP or CO2 value for 1990 or 2008. Remove all entries that are income groups (Lower Middle Income, for example). Remove all entries that are regions (Middle East, for example). All remaining entries should be individual countries.
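The cleanup rules in this step can also be written out as a short filter, which may help when checking the Excel result. The sketch below is only an illustration: the rows, the field names (`gdp_1990`, `co2_2008`, and so on), and the `AGGREGATES` list are made-up assumptions, not values or labels taken from the World Bank file.

```python
# Hypothetical rows standing in for the combined GDP and CO2 data.
# All names and numbers here are invented for illustration.
rows = [
    {"name": "Country A", "gdp_1990": 100.0, "gdp_2008": 250.0,
     "co2_1990": 10.0, "co2_2008": 18.0},
    {"name": "Country B", "gdp_1990": None, "gdp_2008": 90.0,
     "co2_1990": 4.0, "co2_2008": 6.0},
    {"name": "Lower middle income", "gdp_1990": 500.0, "gdp_2008": 900.0,
     "co2_1990": 40.0, "co2_2008": 70.0},
    {"name": "Middle East", "gdp_1990": 300.0, "gdp_2008": 700.0,
     "co2_1990": 30.0, "co2_2008": 55.0},
]

# Assumed list of income-group and region entries to drop; the real
# list has to be built by inspecting the workbook.
AGGREGATES = {"Lower middle income", "Middle East"}

# Every one of these values must be present for a row to be kept.
REQUIRED = ("gdp_1990", "gdp_2008", "co2_1990", "co2_2008")


def keep_row(row):
    """Return True only for individual countries with complete data."""
    if row["name"] in AGGREGATES:
        return False
    return all(row[key] is not None for key in REQUIRED)


cleaned = [row for row in rows if keep_row(row)]
print([row["name"] for row in cleaned])
```

In this made-up example only "Country A" survives: "Country B" is missing a 1990 GDP value, and the other two entries are aggregates rather than countries.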
Fourth, using GDP 1990 and CO2 1990, create a scatter chart that shows the relationship between those two variables. Leave this scatter plot in the “GDP and CO2” tab. Choose the appropriate x and y based on our notes.
Fifth, calculate the correlation coefficient by hand for GDP 1990 and CO2 1990. Put this in the “correlation” tab.
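As a reminder of what "by hand" involves, the sketch below computes the Pearson correlation coefficient from its definition: the sum of the products of the deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the sample numbers are assumptions for illustration only, not data from the workbook. Which variable plays the role of x and which y is left to the course notes.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up sample values, not taken from the World Bank data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
print(round(pearson_r(xs, ys), 4))  # 0.7746
```

In Excel, the same intermediate columns (deviations, products, squares) can be built cell by cell, and the final value can be compared against the built-in CORREL function.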
Sixth, calculate by hand the linear regression between GDP 1990 and CO2 1990 in the “linear 1990” tab. Check your answer by using the data analysis tool in Excel.
Seventh, calculate the linear regression of GDP 2008 and CO2 2008 in the “linear 2008” tab.
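The by-hand regression in the steps above comes down to two formulas: the slope is the sum of the products of the deviations divided by the sum of squared x deviations, and the intercept is the mean of y minus the slope times the mean of x. The sketch below shows that calculation. The function name `linear_fit` and the sample numbers are illustrative assumptions, and the choice of x and y is again left to the course notes.

```python
def linear_fit(xs, ys):
    """Least-squares line y = intercept + slope * x, computed by hand."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample values, not taken from the World Bank data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
slope, intercept = linear_fit(xs, ys)
print(round(slope, 4), round(intercept, 4))  # 0.6 2.2
```

The slope and intercept obtained this way should agree with the coefficients reported by the Regression option of Excel's data analysis tool.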
Based on your analysis, is there a relationship between GDP and CO2? Write a paragraph supporting your answer. Put this in your “analysis” tab.
When you turn in this question, you should have one file: exam1_1.xlsx with the following tabs:
Data, Country, Series, GDP, Total CO2, GDP and CO2, correlation, linear 1990, linear 2008, and analysis.
Here is the data file to download.

