Assignment #1: Quantitative Analysis
For this assignment, students should choose data from the quantitative analysis data sets below and analyze it using Excel or RStudio (RStudio earns BONUS points).
Data set:
Minnesota Healthcare Database.xlsx
Medicare National Data by County
MN Hospital Report Data by Care Unit FY2013
MN HCCIS Imaging Procedures 2013
Students will develop an analysis report in five main sections: introduction, research method (research questions/objective, data set, research method, and analysis), results, conclusion, and health policy recommendations. This is a 5-6 page individual project report.
Here are the main steps for this assignment.
TOPIC: Comparing average annual percent of diabetic Medicare enrollees age 65-75 having hemoglobin A1c between B and W (#1)
Step 2: Develop the research question and
Step 3: Run the analysis using Excel (RStudio for BONUS points) and report the findings following the assignment instructions.
The Report Structure:
Start with the cover page.
1. Cover page (1 page, including running head).
Please look at the example http://www.apastyle.org/manual/related/sample-experiment-paper-1.pdf (you can download the file from the class) and http://www.umgc.edu/library/libhow/apa_tutorial.cfm to learn more about the APA style.
On the title page, include:
- Title (the topic approved by your instructor)
- Student name
- Class name
- Instructor name
- Date
2. Introduction
Introduce the problem or topic being investigated. Include relevant background information, for example:
- Indicate why this is an issue or topic worth researching;
- Highlight how others have researched this topic or issue (whether quantitatively or qualitatively), and
- Specify how others have operationalized this concept and measured these phenomena.
Note: The introduction should be no more than one or two paragraphs.
Literature Review
There is no need for a literature review in this assignment.
3. Research Question or Research Hypothesis
What is the Research Question or Research Hypothesis?
***Just in time information: Here are a few points for Research Question or Research Hypothesis
There are basically two kinds of research questions: testable and non-testable. Neither is better than the other, and both have a place in applied research.
Examples of non-testable questions are:
How do managers feel about the reorganization?
What do residents feel are the most important problems facing the community?
Respondents’ answers to these questions could be summarized in descriptive tables and the results might be extremely valuable to administrators and planners. Business and social science researchers often ask non-testable research questions. The shortcoming with these types of questions is that they do not provide objective cut-off points for decision-makers.
In order to overcome this problem, researchers often seek to answer one or more testable research questions. Nearly all testable research questions begin with one of the following two phrases:
Is there a significant difference between …?
Is there a significant relationship between …?
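For example:
Is there a significant relationship between the age of managers and their attitudes towards the reorganization?
Is there a significant difference between white and minority residents with respect to what they feel are the most important problems facing the community?
A research hypothesis is a testable statement of opinion. It is created from the research question by replacing the words “Is there” with the words “There is,” and also replacing the question mark with a period. The hypotheses for the two sample research questions would be:
There is a significant relationship between the age of managers and their attitudes towards the reorganization.
There is a significant difference between white and minority residents with respect to what they feel are the most important problems facing the community.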
It is not possible to test a hypothesis directly. Instead, you must turn the hypothesis into a null hypothesis. The null hypothesis is created from the hypothesis by adding the words “no” or “not” to the statement. For example, the null hypotheses for the two examples would be:
There is no significant relationship between the age of managers and their attitudes towards the reorganization.
There is no significant difference between white and minority residents with respect to what they feel are the most important problems facing the community.
All statistical testing is done on the null hypothesis, never the hypothesis. The result of a statistical test will enable you to either:
1) reject the null hypothesis, or
2) fail to reject the null hypothesis. Never use the words “accept the null hypothesis.”
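The wording rules above (swap “Is there” for “There is”, replace the question mark with a period, then add “no”) can be sketched in a few lines. This is only an illustration in Python, which is not one of the assignment tools; the function names `to_hypothesis` and `to_null_hypothesis` are hypothetical, and the sketch assumes the question has exactly the form “Is there a ...?”.

```python
def to_hypothesis(question: str) -> str:
    """Turn a testable research question into a research hypothesis."""
    if not (question.startswith("Is there ") and question.endswith("?")):
        raise ValueError("expected a question of the form 'Is there ...?'")
    # Replace "Is there" with "There is" and the question mark with a period.
    return "There is " + question[len("Is there "):-1] + "."


def to_null_hypothesis(hypothesis: str) -> str:
    """Turn a hypothesis of the form 'There is a ...' into its null form."""
    prefix = "There is a "
    if not hypothesis.startswith(prefix):
        raise ValueError("expected a hypothesis of the form 'There is a ...'")
    # Add the word "no" in place of the article.
    return "There is no " + hypothesis[len(prefix):]


question = (
    "Is there a significant relationship between the age of managers "
    "and their attitudes towards the reorganization?"
)
hypothesis = to_hypothesis(question)
null_hypothesis = to_null_hypothesis(hypothesis)
print(hypothesis)
print(null_hypothesis)
```

In your report you would write these statements by hand; the sketch only shows that the null hypothesis is a mechanical rewording of the research question.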
*Source: StatPac for Windows Tutorial. (2017). User’s Guide; Formulating Hypotheses from Research Questions. Retrieved May 17, 2019 from https://statpac.com/manual/index.htm?turl=formulatinghypothesesfromresearchquestions.htm
What does significance really mean?
“Significance is a statistical term that tells how sure you are that a difference or relationship exists. To say that a significant difference or relationship exists only tells half the story. We might be very sure that a relationship exists, but is it a strong, moderate, or weak relationship? After finding a significant relationship, it is important to evaluate its strength. Significant relationships can be strong or weak. Significant differences can be large or small. It just depends on your sample size.”
To determine whether the observed difference is statistically significant, we look at two outputs of our statistical test:
P-value: The primary output of statistical tests is the p-value (probability value). It indicates the probability of observing the difference if no difference exists.

The p-value from the example above, 0.9926, indicates that, if there were truly no difference, we would expect to see a meaningless (random) difference of 5% or more in ‘hospital beds’ about 993 times in 1000 (0.9926 * 1000 = 992.6 ≈ 993). In other words, the observed difference is not statistically significant.
Note: This is an example from the week 1 exercise.

The p-value from the example above, 0.0001, indicates that we’d expect to see a meaningless (random) difference of 5% or more in ‘number of employees on payroll’ only about 0.1 times in 1000 (0.0001 * 1000 = 0.1).
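The “times in 1000” arithmetic and the reject / fail-to-reject decision can be sketched as follows. This is an illustrative Python sketch rather than part of the Excel or RStudio workflow; the function names are hypothetical, and the 0.05 cut-off is an assumption (a common convention that matches the 95% confidence intervals used in this handout), so use whatever level your analysis specifies.

```python
def expected_per_1000(p_value: float) -> float:
    """Express a p-value as an expected count out of 1000 repetitions."""
    return p_value * 1000


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule to the null hypothesis.

    alpha = 0.05 is an assumed significance level, not one set by the assignment.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# The two p-values quoted in the examples above.
for p in (0.9926, 0.0001):
    print(f"p = {p}: about {expected_per_1000(p):.1f} times in 1000 -> {decide(p)}")
```

Note that the function never returns “accept the null hypothesis”, in line with the wording rule given earlier.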
CI around Difference: A confidence interval around a difference that does not cross zero also indicates statistical significance. The graph below shows the 95% confidence interval around the difference between hospital beds in 2011 and 2012 (CI: [-40.82 ; 40.44]):

CI around Difference: A confidence interval around a difference that does not cross zero also indicates statistical significance. The graph below shows the 95% confidence interval around the difference between hospital beds in 2011 and 2012 (CI: [-382.16 ; 125.53]):

The boundaries of this confidence interval around the difference also show the upper [40.44] and lower [-40.82] bounds of the likely difference.
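The “does the interval cross zero” check can be written out explicitly. The sketch below is illustrative Python, not an Excel or RStudio procedure; `crosses_zero` is a hypothetical helper, the first interval is the hospital-beds interval quoted above, and the second interval is invented purely to show the opposite case.

```python
def crosses_zero(lower: float, upper: float) -> bool:
    """Return True when a confidence interval for a difference contains zero."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return lower <= 0.0 <= upper


beds_ci = (-40.82, 40.44)      # 95% CI quoted in the text above
hypothetical_ci = (2.5, 9.1)   # invented interval, for contrast only

for name, (lower, upper) in (("hospital beds", beds_ci),
                             ("hypothetical", hypothetical_ci)):
    if crosses_zero(lower, upper):
        verdict = "crosses zero: not statistically significant"
    else:
        verdict = "does not cross zero: statistically significant"
    print(f"{name}: [{lower}; {upper}] {verdict}")
```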
As a summary:
“Statistically significant means a result is unlikely due to chance.
The p-value is the probability of obtaining the difference we saw from a sample (or a larger one) if there really isn’t a difference for all users.
Statistical significance doesn’t mean practical significance. Only by considering context can we determine whether a difference is practically significant; that is, whether it requires action.
The confidence interval around the difference also indicates statistical significance if the interval does not cross zero. It also provides likely boundaries for any improvement to aide in determining if a difference really is noteworthy.
With large sample sizes, you’re virtually certain to see statistically significant results, in such situations, it’s important to interpret the size of the difference” (“Measuring U”, 2019).
*Resource: Measuring U. (2019). Statistically significant. Retrieved May 17, 2019 from https://measuringu.com/statistically-significant/
Small sample sizes often do not yield statistical significance; when they do, the differences themselves tend also to be practically significant; that is, meaningful enough to warrant action.
4. Research Method
Discuss the Research Methodology (in general). Describe the variable or variables that are being analyzed. Identify the statistical test you will select to analyze these data and explain why you chose this test. Summarize your statistical alternative hypothesis. This section includes the following sub-sections:
a) Describe the Dataset
Example:
The primary source of data will be HOSPITAL COMPARE MEDICARE DATA (citation). This dataset provides information on hospital characteristics, such as: number of staffed beds, ownership, system membership, staffing by nurses and non-clinical staff, teaching status, percentage of discharges for Medicare and Medicaid patients, and information regarding the availability of specialty and high-tech services, as well as Electronic Medical Record (EMR) use. (Describe the dataset in 2-3 lines; search for the dataset online and use the related website to find more information about the data.)
Also, describe the sample size; for example, “The writer is using 2013 Medicare data; this dataset includes 3000 observations for all of the hospitals in the US.”
b) Describe Variables
Next, review the database you selected and select a variable or variables that are a “best-fit.” That is, choose a variable that quantitatively measures the concept or concepts articulated in your research question or hypothesis.
Return to your previously stated Research Question or Hypothesis and evaluate it considering the variables you have selected. (See the sample Table 1).
Table 1. List of variables used for the analysis
| Variable | Definition | Description of code | Source | Year |
|---|---|---|---|---|
| Total Hospital Beds | Total facility beds set up and staffed at the end of the reporting period | Numeric | MN Data | 2013 |
| … | | | | |
| … | | | | |
Source: UMGC, 2019
***Just in time information:
To cite a dataset, you can use one of two approaches:
First, look at the note in the dataset, for example:
Medicare National Data by County. (2012). Dartmouth Atlas of Health Care, A
Second, use the online citation, for example:
Zare, H. (2019, May). MN Hospital Report Data. Data posted in University of Maryland University College HMGT 400 online classroom, archived at: http://campus.umgc.edu
See two examples describing the variables from Minnesota Data:
Table 2. Definition of variables used in the analysis
| Variable | Definition | Description of code | Source | Year |
|---|---|---|---|---|
| hospital_beds | Total facility beds set up and staffed at the end of the reporting period | Numeric | MN data | 2013 |
| year | FY | Categorical | MN data | 2013 |
Source: UMGC, 2019
c) Describe the Research Method for Analysis
First, describe the research method in general terms (e.g., state that this is a quantitative method and then explain the method in about one paragraph; if you already covered this in the introduction, you do not need to repeat it here).
Then, explain the statistical method you plan to use for your analysis (refer to the week 3 content on Biostatistics for information on the various statistical methods you can choose from).
Example:
Hypothesis: AZ hospitals are more likely to have lower readmission rates for PN compared to CA.
Research Method: To determine whether Arizona hospitals are more likely to have lower readmission rates than California hospitals, we will use a t-test to assess whether the difference between the two groups of hospitals is statistically significant (you can change the test depending on your analysis).
d) Describe the Statistical Package
Add one paragraph describing the statistical package you used, e.g., Excel or RStudio.
5. Results
Discuss your findings considering the following tips:
▪ Why did you need to examine the distribution of the data before any analysis (e.g., to check for outliers and to find the best-fit test; for example, if the data are not normally distributed, you cannot use a parametric test)? Add just 1 or 2 sentences.
▪ Did you eliminate outliers? (Please write 1 or 2 sentences, if applicable).
▪ How many observations do you have in your database, and how many for the selected variables? Report the % missing.
▪ When you are finished with this, go on to the next steps:
Present the results of your statistical analysis; include any relevant statistical information (summary tables, including N, mean, std. dev.). Make sure to completely and correctly name all your columns and rows, tables and variables. For this part you should have at least 1-2 tables and 1-2 figures (bar chart, pie chart, or scatter plot, depending on your variables). You can use a table like this:
Table 3. Descriptive analysis to compare % of BL in Medicare beneficiaries, MD vs. VA, 2013
| Variable | Obs. | Mean | SD | P-value |
|---|---|---|---|---|
| Per of Lipid in MD | 24 | 83.20 | 2.32 | 0.4064 |
| Per of Lipid in VA | 124 | 82.69 | 4.41 | |
Source: UMGC, 2019
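The preliminary checks and summary statistics requested in this section (number of observations, % missing, mean, standard deviation, and an outlier check) can be sketched as below. This is an illustrative Python sketch, not the Excel or RStudio procedure the assignment asks for. The `describe` helper is hypothetical, the `hospital_beds` values are invented, and flagging values beyond 1.5 times the interquartile range is one common outlier convention assumed here, not a rule set by the assignment.

```python
import statistics


def describe(values):
    """Summarize one variable; None marks a missing observation."""
    present = [v for v in values if v is not None]
    n_total, n = len(values), len(present)
    # Quartiles for the 1.5 * IQR outlier rule (an assumed convention).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "total": n_total,
        "n": n,
        "pct_missing": 100 * (n_total - n) / n_total,
        "mean": statistics.mean(present),
        "sd": statistics.stdev(present),  # sample standard deviation
        "outliers": [v for v in present if v < low or v > high],
    }


# Invented values for illustration only; not taken from the MN data.
hospital_beds = [25, 30, None, 28, 32, 27, None, 31, 29, 210]

summary = describe(hospital_beds)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With real data you would report these figures for each selected variable and state in 1 or 2 sentences whether any flagged outliers were removed.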
When you have tables and plots ready, think about your findings and state the statistical conclusion. That is, do the results present evidence in favor of the null hypothesis or evidence that contradicts the null hypothesis?
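As a rough illustration of how a table like Table 3 leads to a statistical conclusion, the sketch below computes a two-sample t statistic from the summary figures in that table. It is written in Python for illustration only and makes several assumptions: it uses Welch's unequal-variance form of the t-test (the handout does not say which form produced the tabled p-value), the helper name `welch_from_summary` is hypothetical, and the p-value uses a normal approximation because Python's standard library has no Student's t distribution. Excel or RStudio will report the exact t-based p-value, so expect small differences from this sketch.

```python
import math
from statistics import NormalDist


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic, degrees of freedom, and an approximate p-value."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Two-sided p-value from a normal approximation (an assumption of this sketch).
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Summary figures from Table 3: MD (n = 24) versus VA (n = 124).
t, df, p = welch_from_summary(83.20, 2.32, 24, 82.69, 4.41, 124)
print(f"t = {t:.3f}, df = {df:.1f}, approximate p = {p:.3f}")

alpha = 0.05  # assumed significance level
if p < alpha:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```

The approximate p-value is well above 0.05, so under these assumptions the sketch fails to reject the null hypothesis of no difference between MD and VA.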
6. Conclusion and Discussion
Review your research questions or hypothesis.
How has your analysis informed this question or hypothesis? Present your conclusion(s) from the results (presented above) and discuss the meaning of these conclusion(s) considering the research question or hypothesis presented in your introduction.
At the end of this section, add one or two sentences discussing the limitations (including biases) associated with this analysis and any other statements you think are important in understanding the results of this analysis.
References
Include a reference page listing the bibliographic information for all sources cited in this report. This information should be consistent with the requirements specified in the American Psychological Association (APA) format and style guide.

