MIS 650 GCU Difference Between Data Analysis & Hypothesis Testing Responses

discussion 1: Arcelia Rael

Malik (2019) writes that data distributions are important in data analysis because understanding how the variables in our data set behave lets us accurately answer probability questions about the data. Using R's normal distribution functions, which take the mean and standard deviation as parameters, alongside visual tools like histograms and plots, allows us to understand our data better.

One way to check whether data is normally distributed is the Shapiro-Wilk test, which tests the null hypothesis that the sample comes from a normal distribution. For example, STHDA (n.d.) notes that when using the function shapiro.test(), a p-value > .05 means the data does not differ significantly from a normal distribution, so we can assume normality. Additionally, we can assess normality visually with density plots, Q-Q plots, and histograms overlaid with a curve. For example, when assessing distribution using a histogram, we expect the curve to be shaped like a bell, as this implies that the continuous probability distribution is symmetric and peaks at the mean.
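
The checks described above can be sketched in R roughly as follows. This is only an illustration on simulated data: the vector name obs, the seed, and the sample parameters are assumptions made for the example and do not come from the cited sources.

```r
# Simulated sample (assumption: 200 draws from a normal distribution
# with mean 50 and sd 10); replace obs with real data.
set.seed(42)
obs <- rnorm(200, mean = 50, sd = 10)

# Shapiro-Wilk test: the null hypothesis is that obs is normally distributed.
sw <- shapiro.test(obs)
print(sw$p.value)

# p > 0.05 means we fail to reject normality; it does not prove it.
looks_normal <- sw$p.value > 0.05

# Visual check 1: histogram on the density scale with a fitted normal curve.
# The mean and sd are computed first because curve() reuses the name x
# for its own plotting grid.
m <- mean(obs)
s <- sd(obs)
hist(obs, freq = FALSE, main = "Histogram with normal curve")
curve(dnorm(x, mean = m, sd = s), add = TRUE)

# Visual check 2: Q-Q plot; points close to the line suggest normality.
qqnorm(obs)
qqline(obs)
```

With large samples the Shapiro-Wilk test can flag very small departures from normality, so it is usually read together with the plots rather than on its own.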

Other functions in R that work with the normal distribution include dnorm(), pnorm(), and qnorm(). Malik (2019) writes that with the dnorm() function we are able to find the probability density at each point in our data set given the mean and standard deviation. We can use pnorm() to find the probability that a variable is less than or greater than a given value x, and qnorm() to find the value that corresponds to a given probability.
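
To make the roles of these functions concrete, here is a small sketch. The mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration only.

```r
# Hypothetical parameters for illustration: mean 100, standard deviation 15.
mu <- 100
sigma <- 15

# dnorm(): height of the density curve at a value (a density, not a probability).
density_at_mean <- dnorm(mu, mean = mu, sd = sigma)

# pnorm(): cumulative probability P(X <= q);
# lower.tail = FALSE gives P(X > q) instead.
p_below <- pnorm(115, mean = mu, sd = sigma)
p_above <- pnorm(115, mean = mu, sd = sigma, lower.tail = FALSE)

# qnorm(): inverse of pnorm(), the value at a given cumulative probability.
median_value <- qnorm(0.5, mean = mu, sd = sigma)

print(c(density_at_mean, p_below, p_above, median_value))
```

Here 115 sits one standard deviation above the mean, so p_below is about 0.84 and p_above about 0.16, and median_value equals the mean because the distribution is symmetric.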

References

Malik, F. (2019, June 20). Ever wondered why normal distribution is so important? FinTech Explained. https://medium.com/fintechexplained/ever-wondered-why-normal-distribution-is-so-important-110a482abee3

STHDA. (n.d.). Normality test in R. Retrieved September 05, 2021, from http://www.sthda.com/english/wiki/normality-test-i…

discussion 2: David

Good afternoon class,

Data distribution is important to analytics because it identifies the values in a data set and their frequency, that is, how often each value occurs. Analysis relies on distributions to understand the relationship between data points and to test probability. An analyst can identify an appropriate sampling distribution by repeating the survey multiple times until it is determined that the sample distribution is adequate and matches the data-generating process. “R has four in built functions to generate normal distribution” (“Normal Distribution”, n.d.): dnorm(), pnorm(), qnorm(), and rnorm(). These functions do not test for normality on their own; they return the density, cumulative probability, quantiles, and random draws of a normal distribution, which an analyst can compare against the data. Values drawn with rnorm(), or densities computed with dnorm(), produce the familiar bell-shaped curve when graphed.
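
As a rough sketch of how values and their frequencies relate to the normal distribution, the example below simulates data with rnorm() and counts how often values fall in each range. The seed, the sample size, and the bin boundaries are assumptions chosen for illustration.

```r
# Assumption: 1,000 simulated values from a standard normal distribution.
set.seed(1)
draws <- rnorm(1000, mean = 0, sd = 1)

# Frequency table: how often values fall into each range.
bins <- cut(draws, breaks = c(-Inf, -2, -1, 0, 1, 2, Inf))
freq <- table(bins)
print(freq)

# Share of draws within one standard deviation of the mean.
# For a normal distribution the theoretical share is about 68%.
share_within_1sd <- mean(abs(draws) <= 1)
print(share_within_1sd)
```

The counts are highest in the bins next to the mean and fall off toward the tails, which is the bell shape in tabular form.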

Reference:

Normal Distribution. (n.d.). Retrieved from https://www.tutorialspoint.com/r/r_normal_distribu…

discussion 3:

Hey Class,

I hope you are doing well! A primary difference between hypothesis testing and exploratory data analysis (EDA) is that hypothesis testing requires that the analyst have some appropriate insight into or assumptions about the relationships between predictors and target whereas EDA requires no preconceived notions about the data.

An analyst would prefer to use EDA over hypothesis testing when (1) the data set has a relatively large number of predictors, and/or (2) the analyst is generally unfamiliar with the data. The purpose of EDA is to explore the data distributions and compare proportions of the various elements in order to uncover interesting relationships among predictors and between the predictors and the target.
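
A minimal sketch of the contrast in R might look like this. It assumes R's built-in mtcars data as a stand-in for an unfamiliar data set, with mpg treated as the target; those choices are for illustration only and are not part of the discussion prompt.

```r
# Assumption: the built-in mtcars data stands in for an unfamiliar data set,
# with mpg as the target and the other columns as predictors.
data(mtcars)

# EDA step 1: distribution of every variable (min, quartiles, mean, max).
print(summary(mtcars))

# EDA step 2: proportions of a categorical predictor (number of cylinders).
cyl_share <- prop.table(table(mtcars$cyl))
print(round(cyl_share, 2))

# EDA step 3: correlation of the target with each predictor.
cor_with_mpg <- cor(mtcars)["mpg", ]
print(sort(cor_with_mpg))

# Hypothesis test, by contrast: start from a stated assumption, here that
# mean mpg differs between automatic (am = 0) and manual (am = 1) cars.
tt <- t.test(mpg ~ am, data = mtcars)
print(tt$p.value)
```

The first three steps need no prior expectation about the data, while the t-test only makes sense once the analyst has a specific relationship in mind to test.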

I look forward to viewing your responses for the discussion this week!

God bless,

Kyle
