Problem I – Write your first name, middle name, and last name in capital letters. The letters in your full name make up your data set. If you do not have a middle name, or do not want to use your real one, make one up. Then, do the following:
- Write your data in order from A to Z and double check. For example, the student whose complete name is First Middle Last would have
| A | A | E | E | E | J | K | M | N | N | R | R | S | Y |
Your full name: ………………..
Letters in order, including repetitions:
|   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
- What is the type of your data? Circle, or list, all that apply:
Numerical, continuous, discrete, categorical, non-numerical, quantitative, qualitative
- What is the size of your data set?
- What scale of measurement is applicable to your data (nominal, ordinal, interval, ratio)? Support your answer briefly.
- Is the word “range,” with its actual definition in statistics, applicable to your data set? Even so, how could you make a statement about your data that involves a “range”?
- Is your data set a sample or a population? Support your answer briefly.
- Depending on your answer to Question 6 above, and recalling what we said in class, what is the correct notation to show the size of your data set in statistics?
- What is (are) the mode(s) of your data set, if any? Is your data set unimodal, bimodal, trimodal, …?
- What is the frequency of the mode? In case you have more than one mode, provide the frequency of each.
- Recalling the example discussed in class or provided in your eTextbooks, construct a complete (seven-column) “Frequency Distribution Table.” Use the following headings for your table:
Letter, Frequency (F), Relative Frequency (RF), Percent Relative Frequency (PRF), Cumulative Frequency (CF), Cumulative Relative Frequency (CRF), and Cumulative Percent Relative Frequency (CPRF).
- By examining appropriate rows and columns of the frequency table that you have constructed for Step 10 above, write down (in a small table) the fractions (in percentage) of your data set that the vowels A, E, I, O, and U comprise individually and collectively.
- Using the frequency table created in Step 10 above and, preferably, hand drawing on graph paper (show at least some work if you use technology):
  - (a) Construct a bar chart for the frequency (F) distribution. (See NOTE below.)
  - (b) Construct a bar chart for the percent relative frequency (PRF) distribution. (See NOTE below.)
  - (c) Compare your F distribution with your PRF distribution. Briefly explain your finding(s).
  NOTE: You may do Parts (a) and (b) displaying the categories from highest F (or PRF) to lowest F (or PRF) from left to right; the resulting bar chart is called a “Pareto Bar Chart.”
  Please note that each bar chart must have a descriptive title, and the x and y axes must have descriptive labels.
- Plot the points corresponding to the cumulative percent relative frequency (CPRF) distribution and connect them; the graph formed by the line segments is called an “ogive.” Please note that your ogive should be an increasing curve. The plot should have a title, and the axes should be appropriately labeled and marked with ticks.
- Construct a pie chart to graphically display the relative frequencies in percentages. (In case you use technology to produce the pie chart, show some calculations to demonstrate that you know what is involved in finding the share of each category from the whole circle.)
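The bookkeeping behind the seven-column table, the mode, the vowel shares, and the pie-chart angles can be sketched in a few lines of Python. This is only an illustration that uses the example letters listed earlier in Problem I; the function name `frequency_table` and the dictionary layout are assumptions made for this sketch, and your own name will give different numbers.

```python
from collections import Counter

# Example letters from the sample name in Problem I (already sorted A to Z).
letters = list("AAEEEJKMNNRRSY")

def frequency_table(data):
    """Return one row (dict) per category with the seven table columns."""
    n = len(data)
    counts = Counter(data)
    rows, cumulative = [], 0
    for letter in sorted(counts):
        f = counts[letter]
        cumulative += f
        rows.append({
            "Letter": letter,
            "F": f,                         # frequency
            "RF": f / n,                    # relative frequency
            "PRF": 100 * f / n,             # percent relative frequency
            "CF": cumulative,               # cumulative frequency
            "CRF": cumulative / n,          # cumulative relative frequency
            "CPRF": 100 * cumulative / n,   # cumulative percent relative frequency
        })
    return rows

table = frequency_table(letters)
for row in table:
    print(f"{row['Letter']}  {row['F']:>2}  {row['RF']:.4f}  {row['PRF']:6.2f}%  "
          f"{row['CF']:>2}  {row['CRF']:.4f}  {row['CPRF']:6.2f}%")

# Mode(s): every letter that reaches the highest frequency.
top = max(row["F"] for row in table)
modes = [row["Letter"] for row in table if row["F"] == top]

# Collective share of the vowels, in percent.
vowel_prf = sum(row["PRF"] for row in table if row["Letter"] in "AEIOU")

# Pie chart: each category's central angle is its relative frequency times 360 degrees.
angles = {row["Letter"]: 360 * row["RF"] for row in table}
```

The last CF entry should equal the size of the data set and the last CPRF entry should be 100%, which makes a quick check on a hand-built table.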
Problem II – Choose and write down 10 distinct (different) whole numbers (no modes) less than 100 such that your data set has a range of 76, a mean of 55, and a median of 50.
- (a) Show your data set and your work to demonstrate that your data set does have the statistical characteristics mentioned.
- (b) Calculate the midrange.
- (c) Estimate the standard deviation using the “range rule of thumb,” which is based on the fact that four standard deviations practically cover the span of the data (about 95% for normal distributions). Do not calculate the actual standard deviation value.
- (d) Determine the interquartile range (IQR).
- (e) Are the verified/computed values “statistics” or “parameters”? Explain your answer briefly.
- (f) Construct a boxplot for your data set. Are there any outliers?
- (g) Based on the appearance of your boxplot, is the distribution of your data set normal, close-to-normal, left skewed, or right skewed?
- (h) Using your estimate of the standard deviation, determine what percentage of the data points fall within one standard deviation of the mean. Briefly explain why your computed percentage is close to, or far from, what the “Empirical Rule” says.
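One way to check a candidate data set against the requirements of Problem II is sketched below. The data set `candidate` is just one hypothetical choice among many that work, and the helper name `summarize` is made up for this sketch. The quartiles are taken as the medians of the lower and upper halves of the sorted data, which is an assumption; other quartile conventions can give a slightly different IQR.

```python
from statistics import mean, median

# A hypothetical candidate data set; many other choices also satisfy the constraints.
candidate = [20, 30, 40, 45, 49, 51, 60, 70, 89, 96]

def summarize(data):
    """Compute the summary values asked for in Problem II."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]              # lower half of the sorted data
    upper = xs[half + n % 2:]      # upper half (skips the middle value when n is odd)
    data_range = xs[-1] - xs[0]
    m = mean(xs)
    sd_estimate = data_range / 4   # range rule of thumb: s is roughly range / 4
    inside = [x for x in xs if m - sd_estimate <= x <= m + sd_estimate]
    return {
        "distinct": len(set(xs)) == n,
        "range": data_range,
        "mean": m,
        "median": median(xs),
        "midrange": (xs[0] + xs[-1]) / 2,
        "sd_estimate": sd_estimate,
        "iqr": median(upper) - median(lower),
        "pct_within_1sd": 100 * len(inside) / n,
    }

summary = summarize(candidate)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this particular candidate, the range is 76, the mean is 55, the median is 50, the midrange is 58, and the rule-of-thumb standard deviation is 19.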
Problem III – In problem 84 of Chapter 1 of Illowsky’s eTextbook (Table 1.37), which was one of your homework problems, the class intervals or bins have been listed under “Age.” We are interested in knowing the mean age of the chief executive officers (CEOs) involved in the study.
- (a) Can we calculate the exact mean age of the CEOs studied, based on the information provided in the table? Briefly explain your answer.
- (b) To estimate the mean age of the CEOs studied, we can resort to the class intervals under Age (the left-most column of the table); see Section 2.5 of Illowsky’s eTextbook. Consider the midpoint (midrange) of each class interval to represent the age of each CEO in that class interval. For example, the three CEOs in the class interval “40-44” are considered to be (40 + 44)/2 = 42 years of age each. Then, three values of our data set would be 42, 42, 42, noting that the class frequency is 3. Find the midpoints of the remaining class intervals and note their corresponding class frequencies to complete your data set; you may devote one column to the midpoint values. Then, find the estimated mean CEO age and report the value to two decimal places.
- (c) Find an estimate for the median age. Support your answer by relating it to the actual definition of the median.
- (d) Having estimated the mean and the median in Parts (b) and (c) above, respectively, estimate the “midrange value” for the data represented in the table, as the third measure of the center of the data.
- (e) Among the estimates found in Parts (b), (c), and (d) above, which one is likely to be the most accurate? Support your answer with a brief explanation.
- (f) Draw a histogram based on the information provided in the table.
- (g) Based on the appearance of your histogram (drawn for Part (f) above), state whether the frequency distribution is almost normal, skewed to the left, or skewed to the right; explain your answer briefly.
- (h) Estimate the standard deviation to accompany the estimated mean, as the mean and standard deviation go hand in hand. Hand calculations are easy for the case at hand and are highly recommended for this quiz; please see the formulas provided in Section 2.7 of Illowsky’s eTextbook. Standard deviation is simply the square root of the variance.
  If you use technology to do the calculation, you should show some hand calculations to demonstrate that you know how it is done manually; otherwise, you will not earn full credit.
- (i) As a “sanity check,” show that your result for Part (h) is somewhere between one fourth of the range and one sixth of the range; how does it compare with the mean of the two bounds?
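The midpoint method used for the estimated mean and standard deviation can be sketched as follows. Only the “40-44” class with frequency 3 comes from the text above; the other two classes are made up for illustration and are not the values in Table 1.37. The function name `grouped_mean_and_sd` and the `sample` flag are also assumptions: with `sample=True` the variance is divided by n - 1, otherwise by n, so pick the one that matches your answer about sample versus population.

```python
from math import sqrt

# HYPOTHETICAL class intervals as (lower limit, upper limit, frequency).
# Only the "40-44" class with frequency 3 is taken from the assignment text;
# the remaining rows are invented and are NOT the values in Table 1.37.
classes = [(40, 44, 3), (45, 49, 5), (50, 54, 2)]

def grouped_mean_and_sd(classes, sample=True):
    """Estimate mean and standard deviation from class midpoints and frequencies."""
    freqs = [f for _, _, f in classes]
    midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]   # e.g. (40 + 44) / 2 = 42
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    # Sum of squared deviations from the mean, weighted by class frequency.
    ss = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))
    variance = ss / (n - 1 if sample else n)
    return mean, sqrt(variance)

est_mean, est_sd = grouped_mean_and_sd(classes)
print(f"estimated mean = {est_mean:.2f}, estimated SD = {est_sd:.2f}")
# prints: estimated mean = 46.50, estimated SD = 3.69
```

Replacing `classes` with the intervals and frequencies from the table gives a result to compare against your hand calculation.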

