Application of Biostat to Big Data

1. Pick the 10-factor data we know from our Linear Regression HW!
Use Lecture_GLM_lasso.R to answer the following questions:
Which Xs did the Lasso find?
If you take the Lasso-found Xs and fit a linear regression to them, what do you find?
If you use linear regression, from scratch, to find your Xs, how do they compare with the Lasso-found Xs?
If you use RR alone, from scratch, to find your Xs, how does that compare with the Lasso-found Xs?
Which do you think is the “best” model? Why?
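
The workflow behind these questions could be sketched in R roughly as follows. This is a minimal sketch under several assumptions, not the content of Lecture_GLM_lasso.R, which is not shown here: it assumes the glmnet package, reads "RR" as ridge regression, interprets linear regression "from scratch" as backward stepwise selection with step(), and runs on simulated stand-in data (the names x, y, dat and x1 to x10 are hypothetical) rather than the actual 10-factor HW data.

```r
# Sketch only: simulated stand-in for the 10-factor HW data (hypothetical).
library(glmnet)  # assumed to be the package behind Lecture_GLM_lasso.R

set.seed(1)
n <- 100
x <- matrix(rnorm(n * 10), nrow = n, dimnames = list(NULL, paste0("x", 1:10)))
# Hypothetical truth: only x1, x2 and x3 affect y.
y <- 3 + 2 * x[, "x1"] - 1.5 * x[, "x2"] + x[, "x3"] + rnorm(n)
dat <- data.frame(y = y, x)

# 1) Lasso (alpha = 1), with lambda chosen by cross-validation.
cv_lasso <- cv.glmnet(x, y, alpha = 1)
b_lasso  <- as.matrix(coef(cv_lasso, s = "lambda.1se"))[, 1]
lasso_xs <- setdiff(names(b_lasso)[b_lasso != 0], "(Intercept)")

# 2) Ordinary linear regression refit on the Lasso-found Xs.
fit_refit <- lm(reformulate(lasso_xs, response = "y"), data = dat)

# 3) Linear regression "from scratch": backward stepwise selection by AIC.
fit_step <- step(lm(y ~ ., data = dat), direction = "backward", trace = 0)
step_xs  <- names(coef(fit_step))[-1]  # drop the intercept

# 4) Ridge regression (alpha = 0): shrinks coefficients without setting
#    them exactly to zero, so it does not select Xs by itself.
cv_ridge <- cv.glmnet(x, y, alpha = 0)
b_ridge  <- as.matrix(coef(cv_ridge, s = "lambda.1se"))[, 1]

print(lasso_xs)
print(step_xs)
print(round(cbind(lasso = b_lasso, ridge = b_ridge), 3))
print(summary(fit_refit)$coefficients)
```

Comparing lasso_xs with step_xs addresses the "how do they compare" questions, while the side-by-side coefficient table shows that the ridge fit keeps every X with a shrunken coefficient. Which model counts as "best" still depends on the criterion chosen (for example cross-validated error versus number of Xs kept).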

2. Let’s pick the dataset my_data_enet_for_SYSM590.csv, with 100 Xs and 36 observations.
The “true” coefficients in red in the Table below were used to “create” Y. After multiplying the Xs
by their corresponding coefficients, a random error from a Normal distribution with mean 0 and
std. dev. 1500 was added to the sum of the products. The intercept was set to 7500. That is, the “true” model
is created as follows: Y = 7500 + 2X1 – 0.5X2 + 0.25X3 + … + X100 + random error

Use Lecture_GLM_enet.R to find the “best” model.
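
The data-generating process described above, together with one possible way of searching for a "best" elastic-net model, could be sketched in R as follows. Everything here is an assumption for illustration: the sketch assumes the glmnet package (Lecture_GLM_enet.R itself is not shown), it keeps only the coefficients visible in the formula (2, -0.5 and 0.25 for X1 to X3, and 1 for X100) and sets every other coefficient to a placeholder 0 because the coefficient table is not available, it draws the Xs from a normal distribution with an arbitrary scale, and it takes "best" to mean the lowest cross-validated error over a small grid of alpha values. A real analysis would read my_data_enet_for_SYSM590.csv instead of simulating.

```r
# Sketch only: simulates data shaped like the assignment (36 rows, 100 Xs).
library(glmnet)  # assumed to be the package behind Lecture_GLM_enet.R

set.seed(590)
n <- 36
p <- 100
# Assumption: the distribution of the Xs is not given; N(0, 2000^2) is arbitrary.
x <- matrix(rnorm(n * p, sd = 2000), nrow = n,
            dimnames = list(NULL, paste0("X", 1:p)))

# Only the coefficients visible in the formula are used; the rest are
# placeholder zeros because the coefficient table is not available here.
beta <- numeric(p)
beta[c(1, 2, 3, 100)] <- c(2, -0.5, 0.25, 1)

# Y = 7500 + sum_j beta_j * X_j + error, with error ~ N(0, 1500^2).
y <- 7500 + drop(x %*% beta) + rnorm(n, mean = 0, sd = 1500)

# Fixed fold assignment so every alpha is compared on the same CV splits.
foldid <- sample(rep(1:6, length.out = n))
alphas <- c(0, 0.25, 0.5, 0.75, 1)  # 0 = ridge, 1 = Lasso, in between = elastic net

cv_fits <- lapply(alphas, function(a) cv.glmnet(x, y, alpha = a, foldid = foldid))
cv_err  <- sapply(cv_fits, function(f) min(f$cvm))

# Pick the alpha with the lowest cross-validated error and list the Xs it keeps.
best    <- which.min(cv_err)
b_best  <- as.matrix(coef(cv_fits[[best]], s = "lambda.min"))[, 1]
kept_xs <- setdiff(names(b_best)[b_best != 0], "(Intercept)")

print(data.frame(alpha = alphas, cv_mse = cv_err))
cat("best alpha:", alphas[best], "\n")
print(kept_xs)
```

With 100 Xs and only 36 observations, ordinary least squares has no unique solution, which is why a penalized fit is used here. Sharing one foldid across the alpha grid keeps the cross-validated errors comparable between fits.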


