26 Jul
Discussion: Multiple Regression
This Discussion assists in solidifying your understanding of statistical testing by engaging in some data analysis. This week you will work with a real, secondary dataset to construct a research question, estimate a multiple regression model, and interpret the results.
Whether in a scholarly or practitioner setting, good research and data analysis should have the benefit of peer feedback. For this Discussion, you will post your response to the hypothesis test, along with the results. Remember that the goal is to obtain constructive feedback to improve the research and its interpretation, so please view this as an opportunity to learn from one another.
To prepare for this Discussion:
· Review this week’s Learning Resources and media program related to multiple regression.
· Create a research question, using the Afrobarometer Dataset or the HS Long Survey Dataset, that can be answered by multiple regression.
By Day 3
Use SPSS to answer the research question. Post your response to the following:
1. If you are using the Afrobarometer Dataset, report the mean of Q1 (Age). If you are using the HS Long Survey Dataset, report the mean of X1SES.
2. What is your research question?
3. What is the null hypothesis for your question?
4. What research design would align with this question?
5. What dependent variable was used and how is it measured?
6. What independent variables are used and how are they measured? What is the justification for including these predictor variables?
7. If you found significance, what is the strength of the effect?
8. Explain your results for a lay audience, and explain what the answer to your research question is.
Be sure to support your Main Post and Response Post with reference to the week’s Learning Resources and other scholarly evidence in APA Style.
Week Nine: Multiple Regressions
Posted on: Friday, July 22, 2022 9:31:03 AM EDT
As social scientists, we frequently have questions that require the use of multiple predictor variables. Moreover, we often want to include control variables (i.e., workforce experience, knowledge, education, etc.) in our model. Multiple regression allows the researcher to build on bivariate regression by including all of the important predictor and control variables in the same model. This, in turn, assists in reducing error and provides a better explanation of the complex social world.
Example: a local school system is trying to mitigate poor attendance. The researchers may look at several possible interventions. In the end, a study may find that a combination of interventions works better than any single one. This finding is a typical product of multiple regression. In addition, because data may need to be combined, a researcher can infer. "Infer" is a power word in the social sciences, as it empowers a researcher to synthesize and speculate based upon responsible use of data.
In the end, having concluded your analysis of a regression, what has been learned? In two sentences or fewer, what can you share with others?
Frankfort-Nachmias, C., Leon-Guerrero, A., & Davis, G. (2020). Social statistics for a diverse society (9th ed.). Thousand Oaks, CA: Sage Publications.
· Chapter 12, “Regression and Correlation” (pp. 401–457) (previously read in Week 8)
Wagner, III, W. E. (2020). Using IBM® SPSS® statistics for research methods and social science statistics (7th ed.). Thousand Oaks, CA: Sage Publications.
· Chapter 8, "Correlation and Regression Analysis"
· Chapter 11, “Editing Output” (previously read in Weeks 2, 3, 4, 5, 6, 7, and 8)
Walden University, LLC. (Producer). (2016g). Multiple regression [Video file]. Baltimore, MD: Author.
Note: The approximate length of this media piece is 7 minutes.
In this media program, Dr. Matt Jones demonstrates multiple regression using the SPSS software.
Multiple Regression Models
Learning Objective: Interpret regression results when the regression model has more than one predictor.
The Purpose of Control Variables
A control variable in a statistical model is a variable that we are attempting to “hold constant” while we examine the association among other variables in our model. In essence, we want to know if our independent variable of interest (e.g., grit) is associated with our dependent variable after factoring in other variables (e.g., personality factors) that could also be related to the dependent variable.
In addition to our independent variables of interest, in regression models, we also include control variables as predictor variables because we suspect the control variable is related to our outcome variable and could explain the association between our independent variable and the outcome.
For example, suppose we want to understand factors that might predict an individual's income. Education level seems like an obvious predictor variable that we would want to examine as it is probably a predictor of income. Might there be other variables, however, that would predict income besides education? And, if we find that education level is associated with income, could it partially be because those with more education are also likely to be older and more accomplished/established and therefore earn more money? For this reason, we probably want to include age as a control variable in our regression model predicting income with education.
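The effect of adding a control variable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, the helper names `solve_linear_system` and `ols` are hypothetical, and the course itself uses SPSS rather than code. The sketch fits income on education alone, then on education and age together, to show how the education coefficient changes once age is held constant.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations update both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(predictors, outcome):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: income depends on both education and age, and older
# respondents in this toy sample also tend to have more education.
education = [10, 12, 12, 14, 16, 16, 18, 20]
age = [25, 30, 45, 35, 40, 55, 50, 60]
income = [2.0 + 0.5 * e + 0.1 * a for e, a in zip(education, age)]

# Education alone: the slope also absorbs part of the age effect.
simple = ols([[e] for e in education], income)

# Education with age as a control: the slope is the education effect
# with age held constant.
controlled = ols([[e, a] for e, a in zip(education, age)], income)

print("education only:", round(simple[1], 3))
print("education, controlling for age:", round(controlled[1], 3))
```

Because the toy income values were built with an education effect of 0.5, the controlled model recovers that value, while the education-only model reports a larger slope that mixes in the age effect.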
Take a look at the output below from SPSS, which shows the results of a regression model based on data from the 2004 General Social Survey (http://sda.berkeley.edu/archive.htm). Use an alpha value of .05 to interpret the results.
Model Summary: results of a regression model based on data from the 2004 General Social Survey (predictors: highest year of school completed, age of respondent).
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| 1 | .256a | .066 | .064 | 2.276 |
a. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED, AGE OF RESPONDENT.
Interpreting the Regression Coefficient
So, how would we interpret the regression coefficient for education level in this model, if we are controlling for age? Researchers would say that, holding age constant, education level has a weak, positive association with income, β = .21, p < .05. Recall that a positive association indicates that as education level increases, income also increases. Another way to say the same thing is that education level predicts income above and beyond an individual's age.
Recall, too, that we need to look at the p-value for each predictor in the model in order to discern whether the predictor shows a statistically significant association with the outcome variable, and that we can use the standardized regression coefficients to gauge the effect size for each predictor. In our results below, we can see that each predictor (age and education level) is statistically significant, as each p-value is less than the alpha value of .05.
Coefficients: table showing both unstandardized coefficients (B, Std. Error) and standardized coefficients (Beta) for age of respondent and highest year of school completed.
| Model 1 | B | Std. Error | Beta | t | Sig. |
| (Constant) | 6.535 | .534 | | 12.245 | .000 |
| AGE OF RESPONDENT | .025 | .007 | .120 | 3.703 | .000 |
| HIGHEST YEAR OF SCHOOL COMPLETED | .202 | .031 | .209 | 6.470 | .000 |
Legend for Coefficients:
· Beta column: the standardized regression coefficient for age (.120) and for education level (.209); the closer this value is to 1, the stronger the effect size.
· Sig. column: the p-value for age (.000) and for education level (.000).
If you look at the standardized regression coefficients, you can see that each predictor shows a weak relationship with income, as each predictor has a standardized regression coefficient that is about .1 or .2; stronger effects would be indicated if the regression coefficients had values closer to 1. Of the two predictors, the education level has a greater value for its standardized regression coefficient, indicating that it is a stronger predictor of income than age.
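One reason standardized coefficients are used to gauge effect size is that they do not depend on the units of measurement. The sketch below illustrates this with the usual conversion, beta = B × SD(predictor) / SD(outcome). The data, the slope value, and the function name `standardized_beta` are all hypothetical and are not taken from the General Social Survey output.

```python
import statistics


def standardized_beta(b, x_values, y_values):
    """Convert an unstandardized slope to a standardized one:
    beta = b * sd(x) / sd(y)."""
    return b * statistics.stdev(x_values) / statistics.stdev(y_values)


# Hypothetical toy data and a hypothetical unstandardized slope.
years_of_school = [8, 10, 12, 12, 14, 16, 16, 18]
income_category = [5, 9, 7, 10, 8, 12, 9, 11]
b_per_year = 0.2

# Re-express schooling in months: the unstandardized slope shrinks by a
# factor of 12, but the standardized coefficient stays the same.
months_of_school = [12 * y for y in years_of_school]
b_per_month = b_per_year / 12

beta_years = standardized_beta(b_per_year, years_of_school, income_category)
beta_months = standardized_beta(b_per_month, months_of_school, income_category)

print(round(beta_years, 3), round(beta_months, 3))
```

The two printed values match, which is why Beta values from different predictors can be compared with each other while B values generally cannot.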
R-squared
Aside from looking at the individual regression coefficients and the p-values, another thing to note when you are discussing your multiple regression results is the R-squared value. R-squared is an important statistic that tells you the proportion of variability in the dependent variable that is accounted for by your model. In other words, it tells you how good a job your predictors are doing at predicting your outcome variable. The R-squared value ranges from 0 to 1 and can be expressed as a percent. In the output shown below, you can see that the R-squared value is .066.
Model Summary
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| 1 | .256a | .066 | .064 | 2.276 |
a. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED, AGE OF RESPONDENT.
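As a rough check on the Model Summary, R Square is the square of R, and it can also be computed directly as 1 minus the ratio of residual to total sum of squares. The following Python sketch shows both; the function name `r_squared` and the small observed/predicted lists are hypothetical, while .256 is the R value reported in the output.

```python
def r_squared(observed, predicted):
    """Proportion of variability in the outcome accounted for by the model:
    1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


# R as reported in the Model Summary; squaring it gives R Square.
multiple_r = 0.256
r2_from_output = multiple_r ** 2
percent_explained = round(r2_from_output * 100, 1)
print(round(r2_from_output, 3), percent_explained)

# Hypothetical toy values: perfect predictions explain all the variability,
# while predicting the mean for everyone explains none of it.
observed = [1.0, 2.0, 3.0, 4.0]
print(r_squared(observed, observed))
print(r_squared(observed, [2.5, 2.5, 2.5, 2.5]))
```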
Consider the following scenario when answering the question below.
Using the SPSS output above, and assuming an alpha level of .05, suppose we wanted to control for education level instead of age this time around.
Hint: Look at the p value in the “sig.” column of the output for age. Is that value less than the alpha value of .05?
How would we interpret the results if we were interested in predicting income with age while controlling for education level?
Holding education level constant, an increase in age predicts an increase in income.
Education level is a stronger predictor of income than age, indicating that age is not related to income after controlling for education level.
Age does not predict income after controlling for education level.
How Predictors Are Related to the Dependent Variable?
The above question emphasizes the fact that regardless of whether the researcher is thinking of age or education level as the control variable, the mathematical interpretation of how the predictors are related to the dependent variable does not change. When we interpret the coefficient for one predictor in the model, it is always in the context of holding the other variables “constant,” regardless of which variable, conceptually, we are thinking of as a control variable.
Sometimes, researchers include multiple predictors in a model and are not thinking of any of them, conceptually, as control variables. They are simply interested in how the predictors, together, are related to the outcome variable, or they may be interested in seeing which predictor variables show the strongest relationships with the outcome.
Consider the following scenario when answering the question below.
Suppose we wanted to predict the number of slices of pepperoni pizza people ate at a party based on how many slices their friends ate. Suppose we also gathered data on three (3) additional variables: individual's mood, how much they like pepperoni, and how hungry they reported being when they arrived at the party. Take a look at the correlation results below from SPSS, which is based on fictitious data.
Correlations
| | | number of slices | positive mood | friends' number of slices | like pepperoni | hunger |
| number of slices | Pearson Correlation | 1 | .036 | .791** | .806** | .080 |
| | Sig. (2-tailed) | | .864 | .000 | .000 | .702 |
| | N | 25 | 25 | 25 | 25 | 25 |
| positive mood | Pearson Correlation | .036 | 1 | .106 | .174 | .238 |
| | Sig. (2-tailed) | .864 | | .613 | .406 | .253 |
| | N | 25 | 25 | 25 | 25 | 25 |
| friends' number of slices | Pearson Correlation | .791** | .106 | 1 | .638** | .139 |
| | Sig. (2-tailed) | .000 | .613 | | .001 | .507 |
| | N | 25 | 25 | 25 | 25 | 25 |
| like pepperoni | Pearson Correlation | .806** | .174 | .638** | 1 | .198 |
| | Sig. (2-tailed) | .000 | .406 | .001 | | .343 |
| | N | 25 | 25 | 25 | 25 | 25 |
| hunger | Pearson Correlation | .080 | .238 | .139 | .198 | 1 |
| | Sig. (2-tailed) | .702 | .253 | .507 | .343 | |
| | N | 25 | 25 | 25 | 25 | 25 |
**. Correlation is significant at the 0.01 level (2-tailed).
Hint: Take a look at whether each variable is associated with the outcome variable of number of slices. Is the association between the variable and number of slices statistically significant?
Which of these three (3) variables would be most logical to control for in the regression model?
Hunger
How much individuals reported liking pepperoni
Positive mood
In the output shown below, which is based on predicting income with age and education level, the R-squared is .066.
Model Summary
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| 1 | .256a | .066 | .064 | 2.276 |
a. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED, AGE OF RESPONDENT.
Hint: Remember that to convert a decimal to a percent, you will need to move the decimal point two places to the right.
Which of the following is the appropriate interpretation of the R-squared value?
Age and education level account for 6.6% of the variability in income.
Income accounts for 6.6% of the variability in age and education level.
Age and education level account for 66% of the variability in income.
Income accounts for 66% of the variability in age and education level completed.
How to Create Dummy-Coded Variables
by Robin Kouvaras
Learning Objective: Interpret regression models with dummy-coded variables.
How to Create Dummy-Coded Variables
Dummy-coded variables are created using only the values 0 and 1. The general rule for dummy coding is that you need one (1) fewer dummy-coded variable than you have groups (# total groups – 1). So, for our variable of marital status, we would need two (2) dummy-coded variables because we have chosen to focus on three (3) marital status groups (3 – 1 = 2). The group for which we do not create a dummy-coded variable is typically called the reference category. Often the reference category will be the one that researchers want to compare to other groups. For our research, we might choose “married” as our reference category if we want to compare non-married individuals to married individuals.
Before we conduct our regression analyses in SPSS, then, we will need to create two (2) dummy-coded variables for marital status:
1. one variable for the divorced group
2. one variable for the never-married group
We will use a 1 to indicate membership in that category (e.g., to indicate that someone is divorced for the “divorced” dummy-coded variable) and 0 to indicate non-membership.
The table below shows how we would dummy-code our marital status variables.
Notice the Following
If the original value for an individual’s marital status is a 1 (indicating married), that individual would have a 0 for the “divorced” variable and a 0 for the “never married” variable. This is because they are not a “member” of either of these groups: they are not divorced, and they are not in the never-married category. This same logic holds for the remaining two (2) values of marital status. If an individual is divorced, for example, they get a 1 for the divorced variable and a 0 for the never-married variable.
Also, note that each individual in the data set will have a value (either a 0 or a 1) for each dummy-coded variable that the researcher creates.
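The dummy-coding rule described above can be sketched in Python as follows. This is an illustration only: the function name `dummy_code` and the five example respondents are hypothetical, and in the course the same step would be carried out in SPSS.

```python
def dummy_code(values, reference):
    """Create one 0/1 variable per category, skipping the reference category.

    Returns a dict mapping each non-reference category to a list of 0s and 1s,
    one entry per individual.
    """
    categories = sorted(set(values) - {reference})
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical respondents, with "married" as the reference category.
marital_status = ["married", "divorced", "never married", "married", "divorced"]
dummies = dummy_code(marital_status, reference="married")

for name, codes in dummies.items():
    print(name, codes)
```

With three groups and one reference category, the result has two dummy-coded variables, and married respondents receive a 0 on both.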
Interpreting the Coefficients for Dummy-Coded Variables
by Robin Kouvaras
Learning Objective: Interpret regression models with dummy-coded variables.
How to Interpret Regression Results
Now that you are familiar with how to create dummy-coded variables, we will discuss how to interpret your regression results. Below is the SPSS output using the marital status groups to predict the frequency of religious attendance using multiple regression. Below the regression output, there is also the SPSS output that shows the mean for religious attendance for each of the marital status groups.
SPSS output using the marital status groups to predict the frequency of religious attendance using multiple regression.
Coefficients
| Model 1 | B | Std. Error | Beta | t | Sig. |
| (Constant) | 4.328 | .095 | | 45.627 | .000 |
| Divorced | -1.239 | .206 | -.166 | -6.009 | .000 |
| Never Married | -1.190 | .174 | -.189 | -6.825 | .000 |
(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.)
Legend for Coefficients:
· Sig. column, Divorced row: p-value for the Divorced predictor variable.
· Sig. column, Never Married row: p-value for the Never Married predictor variable.
SPSS output that shows the mean for religious attendance for each of the marital status groups.
Descriptives: HOW OFTEN R ATTENDS RELIGIOUS SERVICES
| | N | Mean | Std. Deviation | Std. Error | 95% CI for the Mean, Lower Bound | 95% CI for the Mean, Upper Bound | Minimum | Maximum |
| MARRIED | 789 | 4.33 | 2.731 | .097 | 4.14 | 4.52 | 0 | 8 |
| DIVORCED | 212 | 3.09 | 2.687 | .185 | 2.73 | 3.45 | 0 | 8 |
| NEVER MARRIED | 332 | 3.14 | 2.484 | .136 | 2.87 | 3.41 | 0 | 8 |
| Total | 1333 | 3.83 | 2.728 | .075 | 3.69 | 3.98 | 0 | 8 |
Let’s focus on the unstandardized regression coefficients in the output. Each coefficient will indicate how that particular group compares to the reference category (e.g., married) on the dependent variable. The coefficient reflects the comparison between the mean value of the dependent variable for the reference category and the mean value for the group represented by that particular coefficient. For example, first, take a look at the unstandardized regression coefficient for “divorced” (-1.239). This value reflects how the divorced group compares to the married group on religious attendance and indicates that the mean religious attendance for the divorced group is 1.239 units lower than that for the married group.
A few more things about the output:
· If you subtract the mean for divorced (3.09) from the mean for married (4.33), you can see that you get the absolute value of the coefficient for the divorced variable: 4.33 – 3.09 = 1.24. (If you round 1.239, you get 1.24.)
· If the value had been positive (1.239 instead of -1.239), it would indicate that the divorced group had a higher mean than the married group on the dependent variable.
· Similar to when you are interpreting the coefficients for continuous predictor variables in a regression model, the difference between the reference category and the indicated group is only considered to be statistically significant if the p-value is less than alpha. In our results above, if we assume an alpha of .05 (or even .01), each predictor would be statistically significant, indicating that each group (divorced, never married) differs from the reference category of married on the dependent variable.
· Also similar to when you are interpreting the coefficients for continuous predictor variables in a regression model, you can use the absolute value of the standardized regression coefficients to gauge the effect size for each variable; values closer to 0 indicate weaker effects, and values closer to 1 indicate stronger effects.
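The link between the dummy-variable coefficients and the group means can be checked with a little arithmetic. The Python sketch below uses the group means from the Descriptives output (4.33, 3.09, 3.14); the variable names are hypothetical. With "married" as the reference category, the constant corresponds to the married mean, and each coefficient corresponds to that group's mean minus the married mean, so a group with a lower mean gets a negative coefficient.

```python
# Group means for religious attendance, taken from the Descriptives output.
group_means = {"married": 4.33, "divorced": 3.09, "never married": 3.14}
reference = "married"

# The constant in the dummy-coded regression is the reference group's mean.
intercept = group_means[reference]

# Each dummy coefficient is that group's mean minus the reference mean.
coefficients = {
    group: round(mean - intercept, 2)
    for group, mean in group_means.items()
    if group != reference
}

print("constant:", intercept)
for group, coefficient in coefficients.items():
    print(group, coefficient)
```

These differences agree, after rounding, with the magnitudes 1.239 and 1.190 and the constant 4.328 reported in the regression output.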
Before You Begin
Before reading this Skill Builder, be sure to review the following concepts:
· Steps of hypothesis testing
· Null hypothesis
· Alternative hypotheses
· p-value
· alpha
· Under what circumstances to reject the null hypothesis
· Effect size
· Practical significance/meaningfulness
Correlation Coefficients
Researchers who study adolescent peer relationships are often interested in whether peers shape one another's attitudes toward many different things, including drug and alcohol use, delinquent behavior, and academics. Suppose you are a researcher interested in whether adolescents and their peers shape each other's attitudes toward academics. Although it can be tricky to obtain data that would allow you to examine causal relationships between adolescents’ and peers’ academic attitudes, you can, as a start toward studying this topic, examine whether there are associations between adolescents’ and peers’ academic values. For example, do adolescents who value academics tend to have friends who also value academics?
Pearson’s Correlation
Pearson’s correlation is one method of examining associations among variables. It allows researchers to examine how one variable changes as another variable changes. For example, will one variable increase as the other variable increases? Or, will one variable decrease as the other variable increases?
The table below shows an example of SPSS output from Pearson’s correlation showing the association between adolescents’ value for the subject of English and their peers’ value for English (Loken, 2005). Students were asked to answer questions that would indicate how much value they place on English (e.g., how much they like English), and their peers were asked the same set of questions (Eccles et al., 1983). Scores on the English value variable range from a low of 1 to a high of 7, with higher scores indicating a greater degree of value for English.
References: Loken, E. (2005). Academic achievement in middle schoolers. Eccles et al. (1983). Achievement and achievement motivation in expectancies, values, and academic behaviors. W.H. Freeman.
Table: SPSS Output from Pearson's Correlation
| | | English Value | Peer Group's English Value |
| English Value | Pearson Correlation | 1 | .438** |
| | Sig. (2-tailed) | | .000 |
| | N | 67 | 63 |
| Peer Group's English Value | Pearson Correlation | .438** | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 63 | 63 |
**. Correlation is significant at the 0.01 level (2-tailed).
Pearson’s correlation is typically used when research scenarios meet these criteria:
· The researcher wants to examine the association between two variables
· Both variables can be considered to be continuous
Although the relationship between the two variables is not always linear, we will only focus on linear associations in this Skill Builder. While we will not focus on scatter plots here, examining a scatter plot is an effective way to see if the association is linear or curvilinear and to gauge the strength and direction of the association between the variables.
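For readers who want to see what the Pearson correlation coefficient measures, here is a minimal Python sketch that computes it from scratch. The function name `pearson_r` and the six pairs of scores are hypothetical (they are not the Loken data); SPSS performs the same calculation internally.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: the co-variation of x and y divided by the
    square root of the product of their sums of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / math.sqrt(ss_x * ss_y)


# Hypothetical scores on a 1-to-7 scale for six students and their peers.
student_value = [2, 3, 4, 5, 6, 7]
peer_value = [3, 2, 5, 4, 6, 6]

print(round(pearson_r(student_value, peer_value), 3))

# A perfectly linear increasing or decreasing pattern gives +1 or -1.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [6, 4, 2]))
```

A positive result means the two variables tend to increase together, a negative result means one tends to decrease as the other increases, and values near 0 indicate little linear association.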
The Strength and Direction of Correlation Coefficients
When researchers examine the correlation coefficients in their SPSS output, how do they interpret them to discern the strength and direction of the association between the two variables? First, when we examine our SPSS output, we should think about the null hypothesis that we are testing, and we will need to decide whether or not to reject the null hypothesis. The null and alternative hypotheses for a Pearson’s correlation test can be written as:
Null: ρ = 0
Alternative: ρ ≠ 0
Here ρ (rho) is the population correlation, not the p-value. The null hypothesis states that there is no association between the variables; that is, the Pearson’s correlation coefficient is equal to zero. The alternative hypothesis, on the other hand, specifies that there is an association between the variables: the Pearson’s correlation coefficient is not equal to zero. In our example of students’ and peers’ English value, SPSS is indicating a p-value of .000 (see the “Sig. (2-tailed)” value in the table below). If we assume that alpha was set at .05, we would reject the null hypothesis and conclude that the results are consistent with there being an association between students’ and peers’ value for English.
Correlations
SPSS Output: Students' and Peers' English Value
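To see why a correlation of .438 with 63 pairs leads to rejecting the null hypothesis, the correlation can be converted to a t statistic with n – 2 degrees of freedom, using t = r × sqrt((n – 2) / (1 – r²)). The Python sketch below does this with the values reported in the output; the function name is hypothetical, and the critical value of roughly 2.00 for a two-tailed test at alpha = .05 with about 60 degrees of freedom is an approximation from a standard t table, not something SPSS reports here.

```python
import math


def t_from_r(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r = 0.438  # Pearson correlation from the SPSS output
n = 63     # number of student/peer pairs from the SPSS output

t_statistic = t_from_r(r, n)
degrees_of_freedom = n - 2

# Approximate two-tailed critical value at alpha = .05 for about 60 degrees
# of freedom (assumption: read from a standard t table).
approximate_critical_t = 2.00

print(round(t_statistic, 2), degrees_of_freedom)
print("reject null:", abs(t_statistic) > approximate_critical_t)
```

Because the t statistic is well above the approximate critical value, this is consistent with the very small p-value that SPSS displays as .000.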
Simply looking at the p-value in order to interpret correlation results will not be sufficient, however. Now that we have concluded that there is evidence of an association between the variables, we need to figure out the direction and the strength of the association between the two variables.