The purpose of this assignment is to apply multiple regression
concepts, interpret multiple regression analysis models, and justify
business predictions based upon the analysis.
For this assignment, you will use the “Strength” dataset. You will
use SPSS to analyze the dataset and address the questions presented.
Findings should be presented in a Word document along with the SPSS
outputs.
The compressive strength (Y) of concrete is influenced by the mixing
proportions and by the time that it is allowed to cure, although the
exact relationship between the strength and the components is unknown.
The provided data includes the results of n = 1030 concrete strength
experiments that include the following:
- Strength (in MPa): The compressive strength of the concrete.
- Age (in days): The number of days the concrete was allowed to cure.
- Coarse_Aggregate (in kg/m³): The proportion of coarse aggregate in the mix.
- Fine_Aggregate (in kg/m³): The proportion of fine aggregate in the mix.
- Cement (in kg/m³): The proportion of cement in the mix.
- Slag (in kg/m³): The proportion of furnace slag in the mix.
- Superplasticizer (in kg/m³): The proportion of plasticizer in the mix.
- Water (in kg/m³): The proportion of water in the mix.
- Ash (in kg/m³): The proportion of fly ash in the mix.
Part 1:
Derive various transformations of compressive strength to determine
which transformation, if any, results in a variable that most closely
mimics a normal distribution. To do this, produce a Q-Q plot for each
transformation listed below, and decide which one should be used to
build the multiple linear regression model. Explain your answer and provide the SPSS
output as an illustration.
- Strength (no transformation)
- Square root of Strength
- Squared Strength
- (Natural) Log of Strength
- Reciprocal of Strength
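The comparison in Part 1 must be done in SPSS, but the idea behind it can be sketched outside SPSS as a rough illustration. The Python sketch below applies the five transformations and scores each one by the correlation between the sorted values and the theoretical normal quantiles, which is a numeric stand-in for judging how straight a Q-Q plot is. The data are synthetic and purely hypothetical (they are not the "Strength" dataset), and the names `qq_correlation`, `TRANSFORMS`, and `scores` are assumptions made for this example.

```python
import math
import random
from statistics import NormalDist, mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def qq_correlation(values):
    """Correlation between the sorted sample and standard normal quantiles.

    Values close to 1 correspond to a Q-Q plot that is close to a straight line.
    """
    n = len(values)
    ordered = sorted(values)
    # Plotting positions (i - 0.5) / n, one common convention among several.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return pearson(theoretical, ordered)


# The five candidate transformations listed in Part 1.
TRANSFORMS = {
    "none": lambda y: y,
    "sqrt": math.sqrt,
    "squared": lambda y: y ** 2,
    "log": math.log,
    "reciprocal": lambda y: 1.0 / y,
}

# Hypothetical stand-in data: built so that its square root is roughly normal.
random.seed(0)
strength = [random.gauss(6.0, 1.5) ** 2 for _ in range(500)]

scores = {
    name: qq_correlation([f(y) for y in strength])
    for name, f in TRANSFORMS.items()
}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:>10}: {score:.4f}")
print("closest to normal:", best)
```

A single number like this is only a summary. The assignment still asks for the Q-Q plots themselves, since the pattern of departure from the line (skewed tails, outliers) matters for the explanation.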
Part 2:
Based on the transformation selected in Part 1, build a multiple linear regression model with all eight predictors.
- Use t-tests to determine if any of the predictors significantly
affect the compressive strength of concrete. Explain why each variable
should or should not be included in the model. Assume α = 0.05. Show the
appropriate model results to explain your answer.
- If any predictors from question 1 are found to be not significant,
remove them and re-run the model to create a reduced model (RM). Are all
the remaining variables still statistically significant? Show the
appropriate model results to explain your answer.
- Based on the RM, should there be concern about multicollinearity
among the predictors selected? Show the appropriate model results to
explain your answer.
- After fitting the RM, derive the residual plot (standardized
residuals vs. standardized predicted values) and normal probability
plot. Interpret each plot.
- What is the coefficient of determination, R², of the RM? How would you interpret the R²?
- Based on the RM, for concrete whose estimated compressive strength is
currently 50 MPa, what would the new estimated strength be after a
10-day increase in curing time? Assume all other predictors are held constant.
- How would you interpret the intercept (constant) in the RM? Does the
interpretation make sense given the data you used to build the RM?
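For the curing-time question, the arithmetic depends on which transformation was chosen in Part 1, because the Age coefficient acts on the transformed scale rather than directly on MPa. A minimal Python sketch of that reasoning follows. The coefficient value 0.01 is a made-up placeholder, not a result from the "Strength" data, and the function name `strength_after_age_change` is hypothetical.

```python
import math

# Forward transformations (Part 1) and their inverses for back-transforming.
FORWARD = {
    "none": lambda y: y,
    "sqrt": math.sqrt,
    "squared": lambda y: y ** 2,
    "log": math.log,
    "reciprocal": lambda y: 1.0 / y,
}
INVERSE = {
    "none": lambda z: z,
    "sqrt": lambda z: z ** 2,
    "squared": math.sqrt,
    "log": math.exp,
    "reciprocal": lambda z: 1.0 / z,
}


def strength_after_age_change(current_mpa, age_coef, delta_days, transform):
    """Shift the transformed response by age_coef * delta_days, then back-transform.

    All other predictors are assumed to be held constant, so only the Age
    term of the fitted equation changes.
    """
    z_now = FORWARD[transform](current_mpa)
    z_new = z_now + age_coef * delta_days
    return INVERSE[transform](z_new)


# Placeholder Age coefficient (hypothetical, on the transformed scale).
age_coef = 0.01

new_sqrt = strength_after_age_change(50.0, age_coef, 10, "sqrt")
new_none = strength_after_age_change(50.0, age_coef, 10, "none")
print(f"sqrt model:          {new_sqrt:.3f} MPa")
print(f"untransformed model: {new_none:.3f} MPa")
```

With no transformation the change is simply the Age coefficient times 10. With any other transformation the change in MPa depends on the starting strength, which is why the question specifies 50 MPa.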
Part 3:
Given the following components and aging time below, what is the estimated compressive strength based on the RM?
- Age: 50 days
- Coarse_Aggregate: 900 kg/m³
- Fine_Aggregate: 600 kg/m³
- Cement: 300 kg/m³
- Slag: 200 kg/m³
- Superplasticizer: 7 kg/m³
- Water: 190 kg/m³
- Ash: 70 kg/m³
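The point estimate in Part 3 comes from substituting these values into the fitted RM equation and, if the response was transformed, converting the result back to MPa. The Python sketch below shows the mechanics only. Every coefficient in it is an invented placeholder, the choice of a square-root response is an assumption, and which predictors remain in the RM depends on your own SPSS results.

```python
# Mix from Part 3. Predictors dropped from the RM are simply ignored below.
mix = {
    "Age": 50,
    "Coarse_Aggregate": 900,
    "Fine_Aggregate": 600,
    "Cement": 300,
    "Slag": 200,
    "Superplasticizer": 7,
    "Water": 190,
    "Ash": 70,
}

# Hypothetical RM on the square-root scale: sqrt(Strength) = b0 + sum(b_k * x_k).
# These numbers are placeholders, not estimates from the "Strength" dataset.
intercept = 1.0
coefficients = {
    "Age": 0.01,
    "Cement": 0.01,
    "Slag": 0.008,
    "Water": -0.015,
    "Superplasticizer": 0.03,
    "Ash": 0.005,
}


def predict(intercept, coefficients, values, inverse):
    """Evaluate the linear predictor, then back-transform to the original units."""
    linear = intercept + sum(b * values[name] for name, b in coefficients.items())
    return inverse(linear)


# Inverse of the square-root transformation is squaring.
estimate = predict(intercept, coefficients, mix, lambda z: z ** 2)
print(f"estimated strength: {estimate:.4f} MPa")
```

Replace the placeholder intercept, coefficients, and inverse function with the ones from your own RM output before relying on any number this produces.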
Part 4:
What is a 95% confidence interval of the estimate in Part 3? How would you interpret the 95% confidence interval? (Hint: Use the SPSS scoring wizard to address this question.)
APA format is not required, but solid academic writing is expected.

