PUB-550: Application and Interpretation of Public Health Data
Topic 1: Data Management and Descriptive Statistics
- Evaluate methods of data organization.
- Compare characteristics of correlational, experimental, and quasi-experimental (observational) statistics variables.
- Identify the four levels of measurement.
- Differentiate between a population and a sample, and a parameter and a statistic (descriptive and inferential).
- Explain the role of quantitative and qualitative methods and sciences in describing and assessing a population’s health.
- Evaluate public health data sources.
- Apply methods to calculate and communicate descriptive statistics.
Data Management |
A local community organization was interested in learning about general health behaviors in the area and the relationships between health behaviors and environmental and social determinants. They decided to conduct a brief survey based on a convenience sample of people visiting the local shopping mall. They offered a $5 incentive for completing the survey. The Topic 1 Example dataset includes 30 observations from this survey. Use this data to complete the relevant assignments in this course.
Education Level |
1 | Less than High School |
2 | Graduated High School |
3 | Graduated College |
Annual Income = US Dollars
ID | Sex | Smoker | Education_Level*** | Minutes_Exercise | Age | Employed | Annual_Income* | Neighborhood |
101 | Female | No | 2 | 90 | 45 | Yes | 51000 | B |
102 | Male | No | 2 | 50 | 58 | No | 23000 | C |
103 | Female | Yes | 3 | 65 | 31 | Yes | 35000 | B |
104 | Male | No | 1 | 20 | 54 | No | 10000 | C |
105 | Female | Yes | 1 | 50 | 30 | Yes | 28000 | B |
106 | Female | Yes | 2 | 25 | 18 | No | 5000 | C |
107 | Female | No | 3 | 110 | 39 | Yes | 46000 | A |
108 | Male | Yes | 1 | 50 | 37 | Yes | 36000 | B |
109 | Female | Yes | 2 | 40 | 44 | Yes | 51000 | C |
110 | Male | No | 2 | 80 | 24 | No | 12000 | A |
111 | Female | No | 3 | 120 | 42 | Yes | 78000 | A |
112 | Male | No | 1 | 80 | 50 | Yes | 34000 | D |
113 | Female | Yes | 1 | 60 | 20 | No | 15000 | B |
114 | Male | No | 3 | 150 | 35 | Yes | 28000 | B |
115 | Male | No | 2 | 75 | 61 | Yes | 28000 | A |
116 | Male | No | 1 | 80 | 59 | No | 24000 | B |
117 | Female | No | 2 | 110 | 36 | Yes | 55000 | D |
118 | Male | Yes | 3 | 80 | 35 | Yes | 62000 | B |
119 | Male | Yes | 2 | 100 | 29 | No | 32000 | D |
120 | Female | No | 1 | 0 | 32 | No | 7000 | C |
121 | Female | Yes | 2 | 50 | 26 | No | 17000 | B |
122 | Female | No | 3 | 200 | 42 | Yes | 64000 | D |
123 | Male | No | 2 | 60 | 52 | No | 5000 | A |
124 | Male | No | 1 | 65 | 49 | No | 14000 | D |
125 | Female | No | 1 | 40 | 21 | No | 20000 | C |
126 | Male | Yes | 3 | 65 | 48 | Yes | 72000 | A |
127 | Female | Yes | 3 | 70 | 40 | Yes | 85000 | A |
128 | Female | No | 1 | 45 | 53 | No | 15000 | B |
129 | Male | No | 3 | 75 | 46 | Yes | 64000 | C |
130 | Male | Yes | 3 | 50 | 42 | Yes | 27000 | B |
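As a rough illustration of the "calculate and communicate descriptive statistics" objective, the sketch below summarizes two columns of the table above using only Python's standard library. The choice of columns and the variable names are assumptions made for this example; the course itself works in Excel and SPSS.

```python
from statistics import mean, median, mode, stdev

# Minutes_Exercise and Smoker columns copied from the table above (IDs 101-130, in order).
minutes_exercise = [90, 50, 65, 20, 50, 25, 110, 50, 40, 80,
                    120, 80, 60, 150, 75, 80, 110, 80, 100, 0,
                    50, 200, 60, 65, 40, 65, 70, 45, 75, 50]
smoker = ["No", "No", "Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "No",
          "No", "No", "Yes", "No", "No", "No", "No", "Yes", "Yes", "No",
          "Yes", "No", "No", "No", "No", "Yes", "Yes", "No", "No", "Yes"]

n = len(minutes_exercise)

# Central tendency and spread for a ratio-level variable.
summary = {
    "n": n,
    "mean": mean(minutes_exercise),
    "median": median(minutes_exercise),
    "mode": mode(minutes_exercise),
    "sample_sd": stdev(minutes_exercise),  # uses n - 1 in the denominator
}

# Frequency and percentage for a nominal variable.
smoker_count = smoker.count("Yes")
smoker_pct = 100 * smoker_count / n

for name, value in summary.items():
    print(f"{name}: {value:.2f}" if isinstance(value, float) else f"{name}: {value}")
print(f"smokers: {smoker_count} of {n} ({smoker_pct:.1f}%)")
```

Because the survey used a convenience sample, these figures describe the 30 respondents only; they are statistics, not estimates of population parameters that can be trusted without further justification.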
Topic 1 DQ 1 |
Mixed methods research is becoming an important approach in generating public health evidence. Based on the resources supplied, discuss the benefits of a mixed methods approach. Include an explanation of the differences between qualitative and quantitative research and the purpose of each.
Mixed methods research has become increasingly popular; however, its definition has yet to be agreed upon (Ozawa & Pongpirul, 2014). Essentially, mixed methods studies incorporate quantitative and qualitative data to draw on the strengths of both types of research methods (Ozawa & Pongpirul, 2014). In health systems, mixed methods research is critical because it allows researchers to see issues from various perspectives, contextualize information, gain a better understanding of the issue, form results, quantify difficult measures, illustrate trends, and examine processes (Ozawa & Pongpirul, 2014).
Mixed methods research designs fall into four categories: the triangulation design, the embedded design, the explanatory design, and the exploratory design (Almalki, 2016). The triangulation design is practical because it gathers data from different sources using different methods, all of which work together as a well-organized design (Almalki, 2016). The embedded design requires fewer resources and produces less data, making it easier for researchers to grasp (Almalki, 2016). The explanatory design is easy to implement and keeps the research focused (Almalki, 2016). With the exploratory design, the separate stages are easy to apply, and the qualitative findings are acceptable to quantitative researchers (Almalki, 2016).
Quantitative researchers regard the world as existing outside of themselves. The purpose of quantitative research is to gain an understanding of the social world (Almalki, 2016). The qualitative approach gains a perspective on issues by investigating them in their own specific setting. Its purpose is to observe occurrences and bring meaning to them (Almalki, 2016). The differences between quantitative and qualitative research are as follows:
Quantitative Approach | Qualitative Approach |
Deductive | Inductive, with underlying assumptions reality is a social construct |
Subdivides reality into smaller, manageable pieces | Places emphasis on exploring and understanding |
Observations are made and hypotheses can be tested among variables | Variables are difficult to measure |
Primacy of subject matter | |
Conclusions are made with regard to the hypothesis, following a series of observations and analysis of data | Data collected will consist of an insider’s viewpoint |
(Almalki, 2016).
References
Almalki, S. (2016). Integrating Quantitative and Qualitative Data in Mixed Methods Research – Challenges and Benefits. Journal of Education and Learning. doi:10.5539/jel.v5n3p288. Retrieved from https://files.eric.ed.gov/fulltext/EJ1110464.pdf
Ozawa, S. & Pongpirul, K. (2014). 10 best resources on…mixed methods research in health systems. Health Policy and Planning. Retrieved from https://academic.oup.com/heapol/article/29/3/323/581455
The delivery of healthcare is becoming more complex, as evidenced by the rising number of individuals with comorbidities and the shift toward quality of care over quantity. Addressing the challenges generated by this complex system requires research that not only produces statistical data but also understands a population’s natural setting and provides insight into how the research can be applied to that setting. Mixed methods research is becoming an important approach in generating public health evidence because it combines qualitative and quantitative research. Qualitative research answers clinical questions regarding meaning and quality improvement and provides descriptive data, while quantitative research answers clinical questions regarding therapy, etiology, diagnosis, prevention, and prognosis and produces numerical data (Winona State University, 2014). Favorable characteristics of mixed methods research include consistency between the research question, purpose, and methodological choices; verifiable and transparent techniques that demonstrate trustworthiness; potential for replicability; opportunity for self-correction; and the ability to explain the phenomena under investigation (Hitchcock & Newman, 2012). Furthermore, the benefits of mixed methods include answering questions that qualitative or quantitative research cannot answer alone; providing a better understanding of connections or contradictions between qualitative and quantitative data; giving participants an opportunity to have a voice and share their experience across the research process, which is important within public health; and facilitating different avenues of exploration that enhance the quality of evidence and enable questions to be answered more deeply (Shorten & Smith, 2017). A mixed methods approach uses the combined strengths of qualitative and quantitative data. Its unique design is appropriate for addressing complex public health issues.
References
Hitchcock, J. H., & Newman, I. (2012). Applying an Interactive Quantitative-Qualitative Framework. Human Resource Development Review, 12(1), 36–52. https://doi.org/10.1177/1534484312462127
Shorten, A., & Smith, J. (2017). Mixed methods research: expanding the evidence base. Evidence Based Nursing, 20(3), 74–75. https://doi.org/10.1136/eb-2017-102699
Winona State University. (2014). Research Hub: Evidence Based Practice Toolkit: Levels of Evidence. Retrieved from Winona.edu website: https://libguides.winona.edu/c.php?g=11614&p=61584
Topic One, Discussion Question 2:
Statistics are ways to summarize data so that they answer a specific question (Corty, 2016). Several key terms help define statistics: population, sample, parameter, and statistic.
During research studies, investigators look for subjects to study. These subjects form large groups called populations (Corty, 2016). If the researcher wanted to look at only a smaller group drawn from this population, that group would be called a sample (Corty, 2016). For example, if I were to do a research study on obesity, I could use the state of Kentucky as my population. However, if I wanted to look only at Shelbyville, Kentucky, that would be a sample of Kentucky.
Data from either the sample or the population can be reduced to a single number, such as an average, to summarize the group (Corty, 2016). If the number characterizes the sample, it is called a statistic; if it characterizes the population, it is called a parameter. Sample statistics use Latin letters as their symbols and population parameters use Greek letters (Corty, 2016).
Then there are descriptive and inferential statistics. A descriptive statistic is a summary statement about a set of cases (Corty, 2016). It reduces a set of data to a meaningful value that describes the characteristics of the group being observed; for example, 63% of the class were female. Inferential statistics use a sample of cases to draw a conclusion about the larger population, reducing the data to a single value that supports an inference about the population (a generalization from the sample to the population); for example, students who are female at GCU have a 15% higher GPA on average than males (Corty, 2016).
Public health researchers often limit their analyses to descriptive statistics, reporting frequencies, means, and standard deviations (Guetterman, 2019). This leads to missed opportunities for more advanced analyses. “For example, knowing that patients have favorable attitudes about a treatment may be important and can be addressed with descriptive statistics. On the other hand, finding that attitudes are different (or not) between men and women and that difference is statistically significant may give even more actionable information to healthcare professionals” (Guetterman, 2019). This missing piece about differences can be addressed through inferential statistical tests (Guetterman, 2019). Therefore, both are extremely important to public health research.
References:
Corty, E. (2016). Using and interpreting statistics. A practical text for the behavioral, social, and health sciences 3rd Edition. Retrieved from https://viewer.gcu.edu/GGdEcj
Guetterman, T., (2019). Basics of statistics for primary care research. Family Medicine Community Health. 7(2). Retrieved from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6583801/
Practicing Application of Descriptive Statistics in Excel and SPSS |
Calculating Confidence Intervals |
Topic 2 DQ 1 |
P-values and confidence intervals are both used in hypothesis testing. Explain three reasons why it may be preferable to report a confidence interval over a P-value. Provide a specific example to justify your reasons.
Topic 2 DQ 1
du Prel et al. (2009) state: “P-values in scientific studies are used to determine whether a null hypothesis formulated before the performance of the study is to be accepted or rejected. In exploratory studies, p-values enable the recognition of any statistically noteworthy findings. Confidence intervals provide information about a range in which the true value lies with a certain degree of probability, as well as about the direction and strength of the demonstrated effect. This enables conclusions to be drawn about the statistical plausibility and clinical relevance of the study findings. It is often useful for both statistical measures to be reported in scientific articles, because they provide complementary types of information” (p. 335).
According to du Prel et al. (2009), “For example, there might be no difference between two antihypertensives with respect to their ability to reduce blood pressure. The alternative hypothesis (H1) then states that there is a difference between the two treatments. This can either be formulated as a two-tailed hypothesis (any difference) or as a one-tailed hypothesis (positive or negative effect). In this case, the expression ‘one-tailed’ means that the direction of the expected effect is laid down when the alternative hypothesis is formulated” (p. 335).
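To make the contrast concrete, the following sketch computes both a 95% confidence interval and a two-sided p-value for the same estimate. The numbers (a mean blood-pressure difference of 4.0 mmHg with a standard error of 1.5) are hypothetical, the function name is invented for this example, and a normal approximation is assumed rather than a t distribution.

```python
from statistics import NormalDist

def z_interval_and_p(estimate, std_error, null_value=0.0, confidence=0.95):
    """Return (lower, upper, two-sided p) under a normal approximation."""
    std_normal = NormalDist()
    # Critical value, e.g. about 1.96 for 95% confidence.
    z_crit = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    lower = estimate - z_crit * std_error
    upper = estimate + z_crit * std_error
    # Test statistic for H0: true value equals null_value.
    z = (estimate - null_value) / std_error
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return lower, upper, p_two_sided

# Hypothetical difference between two antihypertensives (mmHg).
lower, upper, p = z_interval_and_p(estimate=4.0, std_error=1.5)
print(f"95% CI: ({lower:.2f}, {upper:.2f}) mmHg")
print(f"two-sided p-value: {p:.4f}")
```

The p-value only says whether "no difference" is plausible. The interval is expressed in mmHg, shows the direction of the effect, and shows how small or large the true difference could reasonably be, which is the information needed to judge clinical relevance.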
Reference
du Prel, J. B., Hommel, G., Röhrig, B., & Blettner, M. (2009). Confidence interval or p-value?: part 4 of a series on evaluation of scientific publications. Deutsches Arzteblatt international, 106(19), 335–339. doi:10.3238/arztebl.2009.0335
Topic 2 DQ 2
The Central Limit Theorem is a fundamental theorem of statistics. In a nutshell, it says that for independent and identically distributed data whose variance is finite, the sampling distribution of the mean becomes more nearly normal (i.e., Gaussian) as the sample size grows (Chang, Wu, Ho, & Chen, 2008). The standardized sample mean $\bar{x}_n$, centered at the population mean $\mu$, will then approach a normal distribution. More formally,
$$\frac{\bar{x}_n - \mu}{\sigma_n} \xrightarrow{d} N(0, 1), \qquad \sigma_n = \frac{\sigma}{\sqrt{n}},$$
where $N(0, 1)$ is the standard normal distribution and the symbol “d” over the arrow means convergence in distribution. Here $\sigma_n$ is the standard deviation of the sampling distribution, $\sigma$ is the standard deviation of the entire population under study (which is often not known), and $n$ is the sample size. So, sample means vary less than individual measurements. (The square of the standard deviation is the variance.) The sampling distribution is a notional (imaginary) distribution from a very large number of samples, each one of size n, which approaches a normal distribution in the limit of large n. In practice, the Central Limit Theorem is usually treated as holding for n as low as 30, unless there are exceptional circumstances (e.g., when the population distribution is highly skewed), in which case higher values are needed. So, $\sigma_n$ measures how widely the sample means of size n vary around the population mean $\mu$. As expected, the results of Chang et al. (2008) suggest that the distribution of the sample mean better approximates the normal distribution as the sample size increases. The results indicate that the true distribution of the sample mean when the sample is taken from a highly skewed distribution better approximates the normal distribution as the thickness of the tail of the population distribution increases.
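The relationship between the spread of sample means and the sample size can be checked with a small simulation. The sketch below is only an illustration: it assumes an exponential population with mean 1 and standard deviation 1 (a skewed distribution chosen for this example, not taken from the cited study) and compares the observed standard deviation of simulated sample means with σ/√n.

```python
import random
from math import sqrt
from statistics import stdev

def sd_of_sample_means(n, replications=4000, seed=0):
    """Simulate `replications` samples of size n and return the SD of their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        sum(rng.expovariate(1.0) for _ in range(n)) / n  # one sample mean
        for _ in range(replications)
    ]
    return stdev(means)

population_sd = 1.0  # SD of an exponential distribution with rate 1

for n in (5, 30, 100):
    observed = sd_of_sample_means(n)
    expected = population_sd / sqrt(n)
    print(f"n={n:>3}: observed SD of means = {observed:.3f}, sigma/sqrt(n) = {expected:.3f}")
```

The observed and theoretical values should agree closely at every sample size, and both shrink as n grows, which is why sample means vary less than individual measurements.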
Chang, H. J., Wu, C. H., Ho, J. F., & Chen, P. (2008). On sample size in using central limit theorem for gamma distribution
Topic 3: Hypothesis Testing
- Evaluate the importance of hypothesis testing in statistics and public health research.
Hypothesis Testing |
Topic 3 DQ 1 |
Discuss the four potential outcomes of hypothesis testing and describe what is meant by type 1 and type 2 errors. Provide an example of when these errors might occur.
Topic 3 DQ 1
Banerjee et al. (2009) state: Hypothesis testing is an important activity of empirical research and evidence-based medicine. A well worked up hypothesis is half the answer to the research question. For this, both knowledge of the subject derived from extensive review of the literature and working knowledge of basic statistical concepts are desirable. The present paper discusses the methods of working up a good hypothesis and statistical concepts of hypothesis testing (p. 127).
Banerjee et al. (2009) continue: Just like a judge’s conclusion, an investigator’s conclusion may be wrong. Sometimes, by chance alone, a sample is not representative of the population. Thus, the results in the sample do not reflect reality in the population, and the random error leads to an erroneous inference. A type I error (false-positive) occurs if an investigator rejects a null hypothesis that is actually true in the population; a type II error (false-negative) occurs if the investigator fails to reject a null hypothesis that is actually false in the population. Although type I and type II errors can never be avoided entirely, the investigator can reduce their likelihood by increasing the sample size (the larger the sample, the lesser is the likelihood that it will differ substantially from the population) (p. 127).
Banerjee et al. (2009) add: False-positive and false-negative results can also occur because of bias (observer, instrument, recall, etc.). (Errors due to bias, however, are not referred to as type I and type II errors.) Such errors are troublesome, since they may be difficult to detect and cannot usually be quantified (p. 127).
Reference
Banerjee, A., Chitnis, U. B., Jadhav, S. L., Bhawalkar, J. S., & Chaudhury, S. (2009). Hypothesis testing, type I and type II errors. Industrial psychiatry journal, 18(2), 127–131. doi:10.4103/0972-6748.62274
Topic 3 DQ 1
The four potential outcomes of hypothesis testing are:
- Correct inference: Conclude that there is an association when one does exist in the population.
- Correct inference: Conclude that there is no association when one does not exist in the population.
- Incorrect inference (type 1): Conclude that there is an association when there actually is none (false positive).
- Incorrect inference (type 2): Conclude that there is no association when there is one (false negative) (Banerjee, Chitnis, Jadhav, Bhawalkar, & Chaudhury, 2009).
When the sample is not representative of the population, this leads to an erroneous inference and a type 1 or type 2 error. A type 1 error is a false positive: the investigator rejects a null hypothesis that is actually true (Banerjee, Chitnis, Jadhav, Bhawalkar, & Chaudhury, 2009). A type 2 error is the opposite, a false negative: the investigator fails to reject a null hypothesis that is actually false in the population. These errors are impossible to avoid completely, but their likelihood can be decreased by increasing the sample size (Banerjee, Chitnis, Jadhav, Bhawalkar, & Chaudhury, 2009).
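The two error types can be illustrated with a simulation. The sketch below is a simplified example under stated assumptions: a two-sided z-test with known σ = 1, α = 0.05, and a hypothetical true effect of 0.5 when the null hypothesis is false. The function name and all parameter values are chosen for illustration only.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_mean, n=25, null_mean=0.0, sigma=1.0,
                   alpha=0.05, trials=4000, seed=1):
    """Share of simulated studies in which a two-sided z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        z = (sample_mean - null_mean) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a type 1 error (false positive).
type_1_rate = rejection_rate(true_mean=0.0)

# H0 is false: every failure to reject is a type 2 error (false negative).
type_2_rate_n25 = 1 - rejection_rate(true_mean=0.5, n=25)
type_2_rate_n100 = 1 - rejection_rate(true_mean=0.5, n=100)

print(f"type 1 error rate (H0 true):         {type_1_rate:.3f}")
print(f"type 2 error rate (H0 false, n=25):  {type_2_rate_n25:.3f}")
print(f"type 2 error rate (H0 false, n=100): {type_2_rate_n100:.3f}")
```

In this setup the type 1 error rate stays near the chosen α regardless of sample size, while the type 2 error rate falls as n increases.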
Bibliography
Banerjee, A., Chitnis, U. B., Jadhav, S. L., Bhawalkar, J. S., & Chaudhury, S. (2009). Hypothesis testing, type I and type II errors. Industrial Psychiatry Journal, 18(2), 127–131.
Topic 3 DQ 2 |
Review the Healthy People 2020 website. Identify one of the health issues and propose a scenario that would use a z-test as the first step in the six steps of hypothesis testing. Discuss the remaining five steps based on your scenario, including clearly articulating the null and alternative hypotheses for your scenario.
Topic 3 DQ 2
Sphweb (n.d.) states: The Centers for Disease Control (CDC) reported on trends in weight, height and body mass index from the 1960s through 2002. The general trend was that Americans were much heavier and slightly taller in 2002 as compared to 1960; both men and women gained approximately 24 pounds, on average, between 1960 and 2002. In 2002, the mean weight for men was reported at 191 pounds. Suppose that an investigator hypothesizes that weights are even higher in 2006 (i.e., that the trend continued over the subsequent 4 years). The research hypothesis is that the mean weight in men in 2006 is more than 191 pounds. The null hypothesis is that there is no change in weight, and therefore the mean weight is still 191 pounds in 2006.
Sphweb (n.d.) continues: In order to test the hypotheses, we select a random sample of American males in 2006 and measure their weights. Suppose we have resources available to recruit n=100 men into our sample. We weigh each participant and compute summary statistics on the sample data. Suppose in the sample we determine the following:
- n=100
- x̄=197.1
- s=25.6
Sphweb (n.d.) continues: Do the sample data support the null or research hypothesis? The sample mean of 197.1 is numerically higher than 191. However, is this difference more than would be expected by chance? In hypothesis testing, we assume that the null hypothesis holds until proven otherwise. We therefore need to determine the likelihood of observing a sample mean of 197.1 or higher when the true population mean is 191 (i.e., if the null hypothesis is true or under the null hypothesis). We can compute this probability using the Central Limit Theorem. Specifically,
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{197.1 - 191}{25.6/\sqrt{100}} \approx 2.38$$
(Sphweb, n.d.).
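The probability described above can be computed directly from the figures in the scenario (n = 100, sample mean 197.1, s = 25.6, null mean 191). The sketch below is a minimal illustration; it assumes the sample standard deviation stands in for the population value and that a one-sided significance level of 0.05 is used, neither of which is stated in the passage.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the scenario above.
n = 100
sample_mean = 197.1
s = 25.6
null_mean = 191.0

alpha = 0.05  # assumed significance level (not given in the passage)

# Standard error of the mean and the z test statistic.
standard_error = s / sqrt(n)
z = (sample_mean - null_mean) / standard_error

# One-sided p-value: probability of a sample mean this large or larger under H0.
p_one_sided = 1 - NormalDist().cdf(z)

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.2f}")
print(f"one-sided p-value = {p_one_sided:.4f}")
print("reject H0" if p_one_sided < alpha else "fail to reject H0")
```

Under these assumptions the p-value is well below 0.05, so the sample supports the research hypothesis that mean weight in 2006 exceeds 191 pounds.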
Review of the “Nutrition and Weight Status” topic on the Healthy People 2020 website
Obesity in Adults (NWS-9)
- Healthy People 2020 objective NWS-9 tracks the proportion of adults with obesity (BMI ≥ 30).
- HP2020 Baseline: In 2005–2008, the rate of obesity was 33.9% among adults aged 20 years and over (age adjusted).
- HP2020 Target: 30.5%, a 10% improvement over the baseline.
- Most Recent: In 2013–2016, the rate of obesity was 38.6% among adults aged 20 years and over (age adjusted).
- Males aged 20 years and over had a lower rate of obesity than females (36.5% versus 40.5%, age adjusted) in 2013–2016. The rate for females was 11.0% higher than that for males.
- Among racial and ethnic groups, the non-Hispanic Asian population had the lowest (best) rate of obesity, 12.5% of adults aged 20 years and over (age adjusted) in 2013–2016. Rates (age adjusted) for other racial and ethnic groups were:
- 0% among the non-Hispanic black population; more than 3.5 times the best group rate
- 9% among the Hispanic population; more than 3.5 times the best group rate
- 1% among the non-Hispanic white population; 3 times the best group rate
Reference
Explore the Healthy People 2020 website.
URL:
https://www.healthypeople.gov/
http://www.real-statistics.com/hypothesis-testing/null-hypothesis/
Topic 4: The t-Test
Objectives:
- Differentiate the use of three types of t-tests.
- Explain the assumptions of the t-test.
- Interpret t-test results to determine the difference in means.
Application of the t-Test |
Topic 4 DQ 1 |
Compare the three types of t-tests by discussing when each is most appropriate to use and which types of questions each type of t-test best answers. Include specific examples to illustrate the appropriate use of each test.
Topic 4 DQ 1
In statistics, t-tests are a type of hypothesis test that allows you to compare means. They are called t-tests because each t-test boils your sample data down to one number, the t-value. If you understand how t-tests calculate t-values, you’re well on your way to understanding how these tests work.
In this series of posts, I’m focusing on concepts rather than equations to show how t-tests work. However, this post includes two simple equations that I’ll work through using the analogy of a signal-to-noise ratio (Editor, n.d.).
Both the signal and noise values are in the units of your data. If your signal is 6 and the noise is 2, your t-value is 3. This t-value indicates that the difference is 3 times the size of the standard error. However, if there is a difference of the same size but your data have more variability (6), your t-value is only 1. The signal is at the same scale as the noise (Editor, n.d.).
In this manner, t-values allow you to see how distinguishable your signal is from the noise. Relatively large signals and low levels of noise produce larger t-values. If the signal does not stand out from the noise, it’s likely that the observed difference between the sample estimate and the null hypothesis value is due to random error in the sample rather than a true difference at the population level (Editor, n.d.).
Many people are confused about when to use a paired t-test and how it works. I’ll let you in on a little secret. The paired t-test and the 1-sample t-test are actually the same test in disguise! As we saw above, a 1-sample t-test compares one sample mean to a null hypothesis value. A paired t-test simply calculates the difference between paired observations (e.g., before and after) and then performs a 1-sample t-test on the differences (Editor, n.d.).
Understanding that the paired t-test simply performs a 1-sample t-test on the paired differences can really help you understand how the paired t-test works and when to use it. You just need to figure out whether it makes sense to calculate the difference between each pair of observations (Editor, n.d.).
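Both ideas, the t-value as a signal-to-noise ratio and the paired t-test as a 1-sample t-test on differences, can be sketched in a few lines. The function names and the before/after measurements below are hypothetical and exist only for this illustration.

```python
from math import sqrt
from statistics import mean, stdev

def t_value(signal, noise):
    """t = signal / noise, where noise is the standard error of the mean."""
    return signal / noise

def one_sample_t(values, null_value=0.0):
    """1-sample t statistic: (sample mean - null value) / standard error."""
    standard_error = stdev(values) / sqrt(len(values))
    return t_value(mean(values) - null_value, standard_error)

def paired_t(before, after):
    """Paired t statistic: a 1-sample t-test on the pairwise differences."""
    differences = [b - a for b, a in zip(before, after)]
    return one_sample_t(differences, null_value=0.0)

# The two cases described in the text: same signal, different noise.
print(t_value(6, 2))  # signal 6, noise 2
print(t_value(6, 6))  # signal 6, noise 6

# Hypothetical before/after measurements for five participants.
before = [120, 130, 125, 140, 135]
after = [115, 128, 120, 138, 130]
t_paired = paired_t(before, after)
print(f"paired t = {t_paired:.3f} with {len(before) - 1} degrees of freedom")
```

Turning the resulting t statistic into a p-value requires the t distribution with n - 1 degrees of freedom, which is available in statistical software or a t table.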
Reference
Editor, M. B. (n.d.). Understanding t-Tests: 1-sample, 2-sample, and Paired t-Tests. Retrieved September 27, 2019, from https://blog.minitab.com/blog/adventures-in-statistics-2/understanding-t-tests-1-sample-2-sample-and-paired-t-tests.
Topic 4 DQ 2 |
Step 2 of hypothesis testing involves reviewing the assumptions of the test you selected. Discuss the three assumptions of the t-test. Provide an example of the assumption that is not robust to violations and a situation when the assumption is violated.
Hoekstra et al. (2012) state: Using a statistical test is one of the frequently mentioned methods of checking for violations of assumptions (for an overview of statistical methodology textbooks that directly or indirectly advocate this method, see e.g., Hayes and Cai, 2007). However, it has also been argued that it is not appropriate to check assumptions by means of tests (such as Levene’s test) carried out before deciding on which statistical analysis technique to use because such tests compound the probability of making a Type I error (e.g., Schucany and Ng, 2006). Even if one desires to check whether or not an assumption is met, two problems stand in the way. First, assumptions are usually about the population, and in a sample the population is by definition not known. For example, it is usually not possible to determine the exact variance of the population in a sample-based study, and therefore it is also impossible to determine that two population variances are equal, as is required for the assumption of equal variances (also referred to as the assumption of homogeneity of variances) to be satisfied. Second, because assumptions are usually defined in a very strict way (e.g., all groups have equal variances in the population, or the variable is normally distributed in the population), the assumptions cannot reasonably be expected to be satisfied (p. 1).
Hoekstra et al. (2012) continue: The assumptions of normality and of homogeneity of variances are required to be met for the t-test for independent group means, one of the most widely used statistical tests (Hayes and Cai, 2007), as well as for the frequently used techniques ANOVA and regression (Kashy et al., 2009). The assumption of normality is that the scores in the population in case of a t-test or ANOVA, and the population residuals in case of regression, be normally distributed. The assumption of homogeneity of variance requires equal population variances per group in case of a t-test or ANOVA, and equal population variances for every value of the independent variable for regression. Although researchers might be tempted to think that most statistical procedures are relatively robust against most violations, several studies have shown that this is often not the case, and that in the case of one-way ANOVA, unequal group sizes can have a negative impact on the technique’s robustness (e.g., Havlicek and Peterson, 1977; Wilcox, 1987; Lix et al., 1996) (p. 2).
Reference
Hoekstra, R., Kiers, H. A., & Johnson, A. (2012). Are assumptions of well-known statistical techniques checked, and why (not)?. Frontiers in psychology, 3, 137. doi:10.3389/fpsyg.2012.00137
Topic 5: ANOVA Testing
- Compare and contrast the types of ANOVA tests and their application.
- Apply the results of an ANOVA to determine statistical difference between means and potential interactions.
Application of ANOVA |
Topic 5 DQ 1 |
Compare the various types of ANOVA by discussing when each is most appropriate for use and which types of research questions each best answers. Include specific examples to illustrate the appropriate use of each test and how interaction is assessed using ANOVA.
Topic 5 DQ 1
Analysis of variance (ANOVA) is a family of statistical tests used to compare the means of two or more groups (Corty, 2016). There are two types of tests: the between-subjects, one-way ANOVA and the between-subjects, two-way ANOVA. ‘Between subjects’ means independent samples, and ‘way’ refers to an explanatory variable, which can be a grouping variable or an independent variable (Corty, 2016).
The one-way ANOVA is a statistical test used to compare the means of two or more independent samples when there is only one explanatory variable (Corty, 2016). A one-way ANOVA is most appropriate for assessing differences in one continuous variable across the levels of one grouping variable (Statistics Solutions, 2020). One-way ANOVA allows more than two groups to be compared at once, so more complex questions can be addressed (Corty, 2016). For example, a one-way ANOVA would be appropriate if the goal of the research is to assess differences in job satisfaction levels between ethnicities (Statistics Solutions, 2020). This example involves one dependent variable, job satisfaction, and one independent variable, ethnicity.
The two-way ANOVA allows researchers to examine the impact of two explanatory variables at one time (Corty, 2016). It is most appropriate when two influencing factors are studied at the same time, and it answers more complex questions involving multiple influencing factors (Corty, 2016). For example, a researcher studying factors that influence altruism is interested both in how children are reared and in what their nervous systems are like, that is, nurture vs. nature (Corty, 2016). The study was performed using adopted children. The study design is shown below, with the altruism level of the adoptive parents and of the birth parents as the two influencing factors:
 | Adoptive Parents High on Altruism | Adoptive Parents Medium on Altruism | Adoptive Parents Low on Altruism |
Birth Parents High Altruism | | | |
Birth Parents Low Altruism | | | |
(Corty, 2016)
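To show how a one-way ANOVA compares group means, the sketch below computes the F ratio by hand for one continuous variable across three independent groups. It uses the Minutes_Exercise values from the Topic 1 Example dataset grouped by Education_Level; pairing these two variables is an assumption made for illustration and is not an analysis assigned in the course.

```python
from statistics import mean

# Minutes_Exercise grouped by Education_Level (Topic 1 Example dataset).
groups = {
    "Less than High School": [20, 50, 50, 80, 60, 80, 0, 65, 40, 45],
    "Graduated High School": [90, 50, 25, 40, 80, 75, 110, 100, 50, 60],
    "Graduated College": [65, 110, 120, 150, 80, 200, 65, 70, 75, 50],
}

def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [x for sample in samples for x in sample]
    grand_mean = mean(all_values)
    # Between-groups variability: how far each group mean is from the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-groups variability: how far each score is from its own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

group_means = {name: mean(values) for name, values in groups.items()}
f_stat, df_b, df_w = one_way_anova_f(list(groups.values()))

for name, m in group_means.items():
    print(f"{name}: mean = {m:.1f} minutes")
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F ratio is then compared with the critical value of the F distribution for the same degrees of freedom. A two-way ANOVA extends the same partitioning of variability to two factors plus their interaction, which is more conveniently done in SPSS than by hand.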
References
Corty, E.W. (2016). Using and Interpreting Statistics: A Practical Text for the Behavioral, Social, and Health Sciences. (3rd Ed.) New York, NY: Worth Publishers
Statistics Solutions. (2020). The Various Forms of ANOVA. Retrieved from https://www.statisticssolutions.com/the-various-forms-of-anova/
Topic 5 DQ 2 |
Different types of software can be used for data management. Compare Excel and SPSS and discuss specific SPSS software features that make it preferable to Excel for data management. Provide examples illustrating when electing to use SPSS could be preferable to Excel and vice versa.
5 DQ 2
Several types of software can be used for data management and statistical analysis. Two are used in the PUB-550 course at Grand Canyon University. The first is Excel, a spreadsheet program that can also be used for statistical analysis. The other is SPSS (Statistical Package for the Social Sciences), a dedicated statistical analysis package (Statistics Solutions, 2019). Excel is easy to use: researchers arrange data in a table of rows and columns and then filter or summarize it with formulas (Mittermeier, 2019). Its primary purpose is to keep records of data and to turn them into charts and tables for formal presentations and reports. SPSS, by contrast, is built specifically for statistical analysis. Researchers have used it for decades for quantitative analysis, and it can import data from other databases and spreadsheets (Statistics Solutions, 2019). Excel performs analyses through formulas that the user is expected to know, whereas SPSS provides built-in tools to recode and transform variables without requiring the user to write formulas. SPSS is geared toward the social sciences, supporting comparative studies and statistical techniques at a large scale, although it is described as limited for analyzing large clinical data sets from the medical field (Statistics Solutions, 2019). Although both programs can help researchers analyze the data they have collected, SPSS is considerably more useful for the type of analysis that researchers in public health and other social sciences need.
SPSS is designed for data analysis in the social sciences and is, overall, the more user-friendly and forgiving of the two, because it helps avoid errors such as accidental overwrites or mis-sorted rows that are easy to make in Excel worksheets (Mittermeier, 2019).
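To make the idea of recoding a variable concrete, here is a minimal sketch written in Python rather than SPSS syntax. It assumes the Education_Level codes listed for the Topic 1 Example dataset (1, 2, 3) and the values for respondents 101 to 108; the function name and the "Missing" label are illustrative assumptions, not part of either program.

```python
# Hypothetical sketch: recode numeric Education_Level codes into text labels.
# The code-to-label mapping follows the Topic 1 Example dataset codebook;
# the names used here are illustrative assumptions.
EDUCATION_LABELS = {
    1: "Less than High School",
    2: "Graduated High School",
    3: "Graduated College",
}


def recode_education(codes):
    """Return a new list of labels, leaving the original codes untouched."""
    return [EDUCATION_LABELS.get(code, "Missing") for code in codes]


# Education_Level values for respondents 101-108 in the example dataset.
education_codes = [2, 2, 3, 1, 1, 2, 3, 1]
education_labels = recode_education(education_codes)
print(education_labels)
```

Because the function returns a new list, the original codes stay intact, which mirrors the advice to avoid overwriting source data.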
References
Mittermeier, E. (2019). Why you should move from Excel to SPSS. Retrieved from https://www.2×4.de/2019/06/11/why-you-should-move-from-excel-to-spss/
Statistics Solutions. (2019). SPSS statistics help. Retrieved from https://www.statisticssolutions.com/spss-statistics-help/
Topic 6: Regression
Objectives:
- Apply the steps of a regression analysis to determine the linear regression equation and its appropriateness based on the data.
- Interpret regression output to predict changes in a dependent variable based on changes in one or more predictor variables.
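The linear regression equation named in these objectives can be sketched as an ordinary least-squares fit. The example below is a rough illustration only: it uses Age and Minutes_Exercise for respondents 101 to 108 of the Topic 1 Example dataset, and the choice of those two variables, the function name, and the age of 40 used for the prediction are assumptions made for illustration.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("the predictor must vary")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Age (predictor) and Minutes_Exercise (outcome) for respondents 101-108.
age = [45, 58, 31, 54, 30, 18, 39, 37]
minutes_exercise = [90, 50, 65, 20, 50, 25, 110, 50]

slope, intercept = fit_line(age, minutes_exercise)
print(f"Minutes_Exercise = {intercept:.2f} + {slope:.2f} * Age")

# Predicted exercise minutes for a hypothetical 40-year-old respondent.
predicted_at_40 = intercept + slope * 40
print(f"Predicted minutes at age 40: {predicted_at_40:.1f}")
```

With only eight observations the fitted line is not meant to support any conclusion; it simply shows how the slope is read as the predicted change in the outcome per one-unit change in the predictor.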
Application of the Pearson Correlation Coefficient and the Chi-Square Test |
Topic 7 DQ 1
Correlation between two variables shows only that there is an association; it does not guarantee that one causes the other (Corty, 2016). If two variables vary together systematically, a cause-and-effect relationship may exist, but it does not have to: a third variable may be responsible for both (Corty, 2016). When an association is demonstrated, further analysis is needed to assess the strength of the relationship, and this is where the Pearson test comes in. The Pearson test calculates a correlation coefficient that summarizes the strength and direction of the linear relationship between two variables. Even a strong relationship does not by itself establish causation (Corty, 2016).
The data needed to calculate a Pearson correlation coefficient must be interval and/or ratio. For ordinal variables, the Spearman rank-order correlation can be used, and for two nominal variables, the chi-square test can be used (Corty, 2016).
A study that would be appropriate for the Pearson correlation coefficient would examine the relationship between age and the strength of a person's eyeglass prescription, since both variables are measured numerically rather than as categories.
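As a minimal sketch of how the Pearson coefficient is computed, the example below applies the standard formula to two ratio-level variables, Minutes_Exercise and Annual_Income, for respondents 101 to 108 of the Topic 1 Example dataset. The choice of variables and the function name are assumptions for illustration, and the result describes linear association only, not causation.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products of deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("each variable must vary")
    return sxy / math.sqrt(sxx * syy)


# Minutes_Exercise and Annual_Income for respondents 101-108.
minutes_exercise = [90, 50, 65, 20, 50, 25, 110, 50]
annual_income = [51000, 23000, 35000, 10000, 28000, 5000, 46000, 36000]

r = pearson_r(minutes_exercise, annual_income)
print(f"r = {r:.2f}")
```

The coefficient always falls between -1 and 1; values near either extreme indicate a strong linear relationship, and values near 0 indicate a weak one.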
References
Corty, E. (2016). Using and interpreting statistics: A practical text for the behavioral, social, and health sciences. New York: Worth Publishers.
Topic 7 DQ 2 |
Describe the conditions in which a nonparametric test would be a better selection than a parametric test. Illustrate your ideas with a specific example of when you would use each type of test using similar variables for each example.
Topic 7 DQ 2
Parametric tests should only be used when assumptions about the parameters are met (Corty, 2016). Nonparametric tests do not have to meet these same assumptions. There are two circumstances in which a nonparametric test should be used:
- The outcome variable is ordinal or nominal (Corty, 2016).
- If a nonrobust assumption of a parametric test is violated, the researcher can fall back on a nonparametric test instead (Corty, 2016).
Nonparametric tests are less restricted by assumptions and relatively simple to conduct, which makes them attractive. However, they often have less power than parametric tests, meaning they are less likely to reject a false null hypothesis (Corty, 2016). The reason is that nonparametric tests work with nominal or ordinal data (such as categories or ranks) rather than interval/ratio data. Nominal and ordinal data contain less information, which gives nonparametric tests less power (Corty, 2016). Generally, researchers prefer parametric tests, but when the assumptions are not met, nonparametric tests are used.
When the outcome is an ordinal variable or a rank, it is appropriate to use a nonparametric test. For example, consider a clinical trial in which study participants are asked to rate their symptom severity after six weeks on an assigned treatment. Symptom severity is measured on a 5-point ordinal scale with the following response options:
- Symptoms got much worse
- Symptoms are slightly worse
- No change
- Slightly improved
- Much improved
(Sullivan, 2016).
Outcomes that are ordinal, ranked, subject to outliers, or measured imprecisely are difficult to analyze with parametric tests without making major assumptions (Sullivan, 2016). For the example above, a nonparametric test is the appropriate and most effective choice.
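The post does not name a specific nonparametric test. One common choice for comparing two independent groups on an ordinal outcome is the Mann-Whitney U test, so the sketch below computes its U statistic as an assumption about what such an analysis might look like. The ratings are invented for illustration (coded 1 = much worse through 5 = much improved), the group names are hypothetical, and no p-value is calculated.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the full run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the U statistic for group_a, computed from its rank sum."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical symptom ratings (1 = much worse ... 5 = much improved).
treatment = [4, 5, 3, 4, 5, 2]
placebo = [2, 3, 1, 3, 2, 4]

u_treatment = mann_whitney_u(treatment, placebo)
u_placebo = mann_whitney_u(placebo, treatment)
print(u_treatment, u_placebo)
```

The two U values always sum to the product of the group sizes; a statistical package would compare the smaller one against its reference distribution to obtain a p-value.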
Parametric tests can be used when the outcome is a continuous interval or ratio variable and the test's assumptions are met. For example, a study analyzing the weight and height of firefighters could use a parametric test because both variables are continuous and measured on a ratio scale.
References
Corty, E. W. (2016). Using and interpreting statistics: A practical text for the behavioral, social, and health sciences (3rd ed.). New York, NY: Worth Publishers.
Sullivan, L. (2016). Nonparametric tests. Retrieved from http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Nonparametric/BS704_Nonparametric_print.html
Topic 8: Analyzing and Reporting Results
- Apply hypothesis testing steps to a data set.
- Communicate scientific information for public health practice.
- Select quantitative and qualitative data collection methods appropriate for a given public health context.
- Analyze quantitative and qualitative data using biostatistics, informatics, computer-based programming, and software, as appropriate.
- Interpret results of data analysis for public health research, policy, or practice.
Benchmark – Analyzing and Reporting Data |
Analyzing and Reporting Data – Overview
The purpose of this assignment is to give you experience conducting a basic secondary data analysis using real-world surveillance data. Secondary data analysis is faster and cheaper to conduct than primary data collection. However, there are also significant limitations. The data were likely collected for a different purpose and may not include the specific variables required to answer your question. The sampling strategy might not be random and may not be representative of your target population. These are examples of limitations to be aware of as you work with existing data.
A key question is whether the data should determine the research question, or if the research question should determine the type of data you use. In practice, you would want your research question or hypothesis to determine the dataset you select. In this assignment, you are limited to three datasets and may need to adjust your initial research question to accommodate one of the three datasets. Avoid “mining” for significant results and stick to your initial research question as much as possible. For this project, you will select one of the three example datasets to complete a basic analysis and communicate your findings through a scientific poster presentation.
These steps will help you get started:
- Review the websites for each of the three datasets listed below. Be sure to understand the purpose of the survey, the sample used in the survey, and the main focus areas of each survey. Review the documentation provided on the websites to get to know the story behind the data and understand the population before reviewing the data.
- Select the dataset that is most appropriate for your interest area.
- Open the data in SPSS and get to know the data by reviewing the variables in “Variable View” mode. This view will allow you to read the variable labels and response labels for each variable.
- Based on your research interest and question, select variables that will help increase your understanding about that topic.
- Arrange the data as needed to organize and clean data, allowing you to focus on your specific question. Remember to save your analytic data file as a new file in case you need to go back to the original file. It is good practice to continually save new versions of the data file as you work with and manipulate the data.
- Follow the hypothesis testing steps to carry out your secondary data analysis.
For additional information on conducting a secondary data analysis, read the Topic Material, “Conducting High-Value Secondary Dataset Analysis: An Introductory Guide and Resources.”
Dataset Documents
- Demographic and Health Survey
The Demographic and Health Survey is a global monitoring survey administered by USAID. The sample dataset is the model dataset put together by USAID to explore DHS data. The sample data is not from a specific country or year, but it gives you an idea of what can be obtained from various countries through these datasets. The datasets are free and publicly available once you register with USAID to access the DHS data. For the purpose of this assignment, treat this dataset as coming from a country of your choice. Access to the Model Questionnaire, Recode Manual, and Data Video Tutorials, including a video on the sampling strategy, is found at http://dhsprogram.com/data/model-datasets.cfm.
Note: You do not need to worry about weighting strategies for this assignment.
Use the http://dhsprogram.com/data/Using-DataSets-for-Analysis.cfm link to review the “Step-by-Step Introduction to Analyzing DHS Data” for tips on how to access your own dataset for future use and to see what resources are available to help you navigate the model dataset for this assignment.
- Youth Risk Behavior Surveillance System (YRBSS)
The Youth Risk Behavior Survey is a national survey monitoring health behaviors among youth and young adults. It is administered by the Centers for Disease Control and Prevention. The example dataset for this assignment comes from the National Survey (not combined) dataset for 2015. General information about the survey is found at https://www.cdc.gov/healthyYouth/data/yrbs/index.htm.
Documentation and questionnaires can be found by accessing the “YRBSS Data and Documentation” website at https://www.cdc.gov/healthyyouth/data/yrbs/data.htm.
Please read the 2015 YRBS Data User’s Guide, listed in the “National YRBS Datasets and Documentation” page at https://www.cdc.gov/healthyyouth/data/yrbs/pdf/2015/2015_yrbs-data-users_guide_smy_combined.pdf.
The dataset includes calculated variables not found in the questionnaire that you might find helpful in determining your analysis for this assignment. The crosswalk to match the questions with the dataset can be found by viewing the “YRBS Questionnaire Content – 1991-2017” found at https://www.cdc.gov/healthyyouth/data/yrbs/pdf/2017/yrbs_questionnaire_content_1991-2017.pdf.
- National Health Interview Survey (NHIS)
The NHIS began in 1957, and has been used to monitor the health of the United States ever since. It is a household-level survey administered by the U.S. Census Bureau. Key topics in the survey include doctor’s visits, medical conditions, health insurance, and health behaviors. General information about the survey, including the sample design and data collection procedures, can be found at https://www.cdc.gov/nchs/nhis/about_nhis.htm.
A Survey Description of the 2015 National Health Interview Survey can be found at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/srvydesc.pdf.
The sample dataset is from the 2015 adult survey at https://www.cdc.gov/nchs/nhis/nhis_2015_data_release.htm.
Some of the variables have been deleted to decrease the size of the file, but none of the observations have been dropped. Please review the “2015 National Health Interview Survey (NHIS) Public Use Data Release” document at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/readme.pdf.
Review the “2015 NHIS Public Use Variable Summary” at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/samadult_summary.pdf.
After you identify a few variables you are interested in, review the complete description of the variable in the variable layout document at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/samadult_layout.pdf.
Checking the variable frequencies will help you determine the range of answers for each variable of interest, including the number of missing observations. If the number missing is high, consider using another variable. Variable frequencies can be found at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/samadult_freq.pdf.
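A frequency check of this kind can also be sketched in a few lines of Python. The example is a hedged illustration: the responses are invented, and treating the codes 7, 8, and 9 as missing is an assumption made here for demonstration. The actual missing-value codes for any variable should be confirmed in the variable layout document.

```python
from collections import Counter


def frequency_table(values, missing_codes=(None,)):
    """Return (Counter of valid responses, number of missing responses)."""
    counts = Counter()
    n_missing = 0
    for value in values:
        if value in missing_codes:
            n_missing += 1
        else:
            counts[value] += 1
    return counts, n_missing


# Hypothetical responses to a yes/no item coded 1 = yes, 2 = no.
# Codes 7, 8, 9 and None are ASSUMED to mean missing for this illustration.
responses = [1, 2, 2, 1, 9, 2, None, 1, 7, 2]

counts, n_missing = frequency_table(responses, missing_codes=(None, 7, 8, 9))
share_missing = n_missing / len(responses)

for code in sorted(counts):
    print(f"code {code}: {counts[code]}")
print(f"missing: {n_missing} ({share_missing:.0%})")
```

If the share of missing responses is large, that is the signal to consider a different variable, as the instructions suggest.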
Topic 8 DQ 1 |
Reporting public health information requires a clear understanding of the various statistical methods used to draw conclusions. These methods are then communicated within the larger story surrounding the public health issue. Identify a public health report or article and discuss what you would do differently to improve understanding and application if you were the author. Post the permalink to your article or report in the Main Forum.
Topic 8 DQ 1
The world today has no shortage of public health issues. Researchers are constantly gathering more data to develop prevention and protection programs. To do this properly, it is necessary to have a clear understanding of statistical methods in order to draw accurate conclusions. Statistics can be found on nearly all public health issues, which is why it is critical for scientists and professionals to have a grasp of the statistical methods available.
One public health issue that is relevant today is obesity. The general population is aware that poor nutrition, lack of physical activity, and obesity cause a number of health-related issues; however, obesity is getting worse all over the world. An article that addresses the issue, published by the World Health Organization, is ‘Obesity and Overweight’. The link to the article is https://www.who.int/news-room/fact-sheets/detail/obesity-and-overweight.
Key facts from the article:
- Obesity has nearly tripled worldwide since 1975
- In 2016, more than 1.9 billion adults aged 18 and older were overweight; of these, over 650 million were obese
- 40 million children under the age of 5 were overweight or obese
- In 2016, over 340 million children and adolescents aged 5-19 were obese or overweight
(WHO, 2020).
The article references multiple statistics regarding population obesity. It also defines obesity and overweight so that readers understand what determines who falls into the obese category and who falls into the overweight category. WHO also shared recent global estimates:
- In 2016, more than 1.9 billion adults aged 18 years and older were overweight. Of these over 650 million adults were obese.
- In 2016, 39% of adults aged 18 years and over (39% of men and 40% of women) were overweight.
- Overall, about 13% of the world’s adult population (11% of men and 15% of women) were obese in 2016.
- The worldwide prevalence of obesity nearly tripled between 1975 and 2016
(WHO, 2020).
While the information is interesting and useful, it would help the reader if WHO discussed the methods used to produce the statistics. WHO also shared causes of obesity and prevention tactics. It would be helpful if WHO projected obesity rates for the next five years under two scenarios, one in which people follow obesity prevention guidelines and one in which they do not, while explaining the methods behind those projections. This would help the reader better understand the importance of the public health issue. The sample size and surveillance methods used to collect the obesity data should also be shared.
Reference
World Health Organization (WHO). (2020). Obesity and overweight. Retrieved from https://www.who.int/news-room/fact-sheets/detail/obesity-and-overweight