This Discussion assists in solidifying your understanding of statistical testing by engaging in some data analysis. This week you will work with a real, secondary dataset to construct a research question, estimate a multiple regression model, and interpret the results.

Discussion: Multiple Regression

Whether in a scholarly or practitioner setting, good research and data analysis should have the benefit of peer feedback. For this Discussion, you will post your response to the hypothesis test, along with the results. Remember that the goal is to obtain constructive feedback to improve the research and its interpretation, so please view this as an opportunity to learn from one another.

To prepare for this Discussion:

· Review this week’s Learning Resources and media program related to multiple regression.

· Create a research question, using the Afrobarometer Dataset or the HS Long Survey Dataset, that can be answered by multiple regression.

By Day 3

Use SPSS to answer the research question. Post your response to the following:

1. If you are using the Afrobarometer Dataset, report the mean of Q1 (Age). If you are using the HS Long Survey Dataset, report the mean of X1SES.

2. What is your research question?

3. What is the null hypothesis for your question?

4. What research design would align with this question?

5. What dependent variable was used and how is it measured?

6. What independent variables are used and how are they measured? What is the justification for including these predictor variables?

7. If you found significance, what is the strength of the effect?

8. Explain your results for a lay audience: what is the answer to your research question?

Be sure to support your Main Post and Response Post with reference to the week’s Learning Resources and other scholarly evidence in APA Style.

Week Nine: Multiple Regressions

Posted on: Friday, July 22, 2022 9:31:03 AM EDT

As social scientists, we frequently have questions that require the use of multiple predictor variables. Moreover, we often want to include control variables (e.g., workforce experience, knowledge, education) in our model. Multiple regression allows the researcher to build on bivariate regression by including all of the important predictor and control variables in the same model. This, in turn, assists in reducing error and provides a better explanation of the complex social world.

Example: A local school system is trying to mitigate poor attendance. The researchers may look at several possible interventions. In the end, a study may find that a combination of interventions works better than any single one. This finding is a typical product of multiple regression. In addition, because multiple sources of data may need to be combined, a researcher can infer. “Infer” is a powerful word in the social sciences, as it empowers a researcher to synthesize and speculate based upon responsible use of data.

In the end, having concluded your analysis of a regression, what has been learned? In two sentences or fewer, what can you share with others?

Frankfort-Nachmias, C., Leon-Guerrero, A., & Davis, G. (2020). Social statistics for a diverse society (9th ed.). Thousand Oaks, CA: Sage Publications.

· Chapter 12, “Regression and Correlation” (pp. 401-457) (previously read in Week 8)

Wagner, III, W. E. (2020). Using IBM® SPSS® statistics for research methods and social science statistics (7th ed.). Thousand Oaks, CA: Sage Publications.

· Chapter 8, "Correlation and Regression Analysis"

· Chapter 11, “Editing Output” (previously read in Weeks 2, 3, 4, 5, 6, 7, and 8)

Walden University, LLC. (Producer). (2016g). Multiple regression [Video file]. Baltimore, MD: Author.

 

Note: The approximate length of this media piece is 7 minutes.

 

In this media program, Dr. Matt Jones demonstrates multiple regression using the SPSS software.

Multiple Regression Models

Topic 2 of 4

Learning Objective: Interpret regression results when the regression model has more than one predictor.

The Purpose of Control Variables

A control variable in a statistical model is a variable that we are attempting to “hold constant” while we examine the association among other variables in our model. In essence, we want to know if our independent variable of interest (e.g., grit) is associated with our dependent variable after factoring in other variables (e.g., personality factors) that could also be related to the dependent variable.

In addition to our independent variables of interest, in regression models, we also include control variables as predictor variables because we suspect the control variable is related to our outcome variable and could explain the association between our independent variable and the outcome.

For example, suppose we want to understand factors that might predict an individual's income. Education level seems like an obvious predictor variable that we would want to examine as it is probably a predictor of income. Might there be other variables, however, that would predict income besides education? And, if we find that education level is associated with income, could it partially be because those with more education are also likely to be older and more accomplished/established and therefore earn more money?  For this reason, we probably want to include age as a control variable in our regression model predicting income with education.
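To make the idea of a control variable concrete, here is a minimal Python sketch. It uses made-up numbers (not the General Social Survey data) in which income is constructed to depend on both education and age, and in which older respondents tend to have more education. The helper `fit_ols`, the variable names, and the data are all illustrative assumptions; in this course the estimation itself is done in SPSS.

```python
def fit_ols(predictor_rows, outcome):
    """Least-squares fit via the normal equations (X'X)b = X'y.

    Returns [intercept, b1, b2, ...]. Illustrative helper only.
    """
    X = [[1.0] + [float(v) for v in row] for row in predictor_rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, outcome)) for i in range(k)]
    aug = [row + [rhs] for row, rhs in zip(xtx, xty)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Hypothetical data: income is built as 2 + 0.5*education + 0.1*age
ages = [25, 30, 35, 40, 45, 50, 55, 60]
educ = [12, 14, 12, 16, 14, 18, 16, 20]
income = [2 + 0.5 * e + 0.1 * a for e, a in zip(educ, ages)]

education_only = fit_ols([[e] for e in educ], income)
with_age = fit_ols([[e, a] for e, a in zip(educ, ages)], income)

print("education slope, age ignored:   ", round(education_only[1], 3))
print("education slope, age controlled:", round(with_age[1], 3))
print("age slope:                      ", round(with_age[2], 3))
```

In this toy data the education-only slope comes out larger than 0.5 because education is also standing in for age. Once age is in the model, the education coefficient returns to the 0.5 that was built into the data.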

Take a look at the output below from SPSS, which shows the results of a regression model based on data from the 2004 General Social Survey (http://sda.berkeley.edu/archive.htm). Use an alpha value of .05 to interpret the results.

Model Summary (regression model based on data from the 2004 General Social Survey)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .256a | .066 | .064 | 2.276 |

a. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED, AGE OF RESPONDENT.

Interpreting the Regression Coefficient

So, how would we interpret the regression coefficient in this model for education level, if we are controlling for age? Researchers would say that, holding age constant, education level has a weak, positive association with income, β = .21, p < .05. Recall that a positive association indicates that as education level increases, income also increases. Another way to say the same thing is that education level predicts income above and beyond an individual's age.

Recall, too, that we need to look at the p-value for each predictor in the model in order to discern whether the predictor shows a statistically significant association with the outcome variable, and that we can use the standardized regression coefficients to gauge the effect size for each predictor. In our results below, we can see that each predictor (age and education level) is statistically significant, as each p-value is less than the alpha value of .05.

Coefficients (unstandardized and standardized coefficients for age of respondent and highest year of school completed)

| Model 1 | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 6.535 | .534 | | 12.245 | .000 |
| AGE OF RESPONDENT | .025 | .007 | .120 | 3.703 | .000 |
| HIGHEST YEAR OF SCHOOL COMPLETED | .202 | .031 | .209 | 6.470 | .000 |

Legend for Coefficients

· Beta column: the standardized regression coefficient for age (.120) and for education level (.209); the closer this value is to 1, the stronger the effect size.

· Sig. column: the p-value for age and the p-value for education level.

If you look at the standardized regression coefficients, you can see that each predictor shows a weak relationship with income, as each predictor has a standardized regression coefficient that is about .1 or .2; stronger effects would be indicated if the regression coefficients had values closer to 1. Of the two predictors, the education level has a greater value for its standardized regression coefficient, indicating that it is a stronger predictor of income than age.
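As a quick check on how the columns of the coefficients table fit together, each t value is the unstandardized coefficient divided by its standard error. The sketch below recomputes that ratio from the values printed in the table. The dictionary and variable names are illustrative, and small gaps from the reported t values are expected because the table shows B and Std. Error rounded to three decimals.

```python
# (B, Std. Error, reported t) copied from the Coefficients table
coefficients = {
    "(Constant)": (6.535, 0.534, 12.245),
    "AGE OF RESPONDENT": (0.025, 0.007, 3.703),
    "HIGHEST YEAR OF SCHOOL COMPLETED": (0.202, 0.031, 6.470),
}

# t is the coefficient divided by its standard error
t_ratios = {name: b / se for name, (b, se, _) in coefficients.items()}

for name, (b, se, reported_t) in coefficients.items():
    print(f"{name}: B / SE = {t_ratios[name]:.2f} (reported t = {reported_t})")
```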

R-squared

Aside from looking at the individual regression coefficients and the p-values, another thing to note when you are discussing your multiple regression results is the R-squared value. R-squared is an important statistic that tells you the proportion of variability in the dependent variable that is accounted for by your model. In other words, it tells you how good a job your predictors are doing at predicting your outcome variable. The R-squared value ranges from 0 to 1 and can be expressed as a percent. In the output shown below, you can see that the R-squared value is .066.

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .256a | .066 | .064 | 2.276 |

a. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED, AGE OF RESPONDENT.
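The arithmetic linking the R and R Square columns can be sketched in a few lines of Python. The values come from the Model Summary; the variable names are illustrative.

```python
r = 0.256                      # "R" column of the Model Summary
r_squared = r ** 2             # squaring R gives R Square
percent_explained = round(r_squared, 3) * 100

print(f"R squared = {r_squared:.3f}")
print(f"Share of variability accounted for: {percent_explained:.1f}%")
```

Squaring .256 and rounding to three decimals reproduces the .066 shown in the table, which corresponds to 6.6% when expressed as a percent.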

Consider the following scenario when answering the question below.

Using the SPSS output above, and assuming an alpha level of .05, suppose we wanted to control for education level instead of age this time around.

Hint: Look at the p-value in the “Sig.” column of the output for age. Is that value less than the alpha value of .05?

How would we interpret the results if we were interested in predicting income with age while controlling for education level?

Holding education level constant, an increase in age predicts an increase in income.

Education level is a stronger predictor of income than age, indicating that age is not related to income after controlling for education level

Age does not predict income after controlling for education level.

How Predictors Are Related to the Dependent Variable

The above question emphasizes the fact that regardless of whether the researcher is thinking of age or education level as the control variable, the mathematical interpretation of how the predictors are related to the dependent variable does not change. When we interpret the coefficient for one predictor in the model, it is always in the context of holding the other variables “constant,” regardless of which variable, conceptually, we are thinking of as a control variable.
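A small sketch can show what holding a variable “constant” means in practice. It plugs the unstandardized coefficients from the age-and-education output into a prediction equation. The function name is hypothetical, and the predicted values are in whatever units the income variable was coded in, which the output does not state.

```python
# Unstandardized coefficients (B) from the Coefficients table
INTERCEPT = 6.535
B_AGE = 0.025
B_EDUC = 0.202


def predicted_income(age, years_of_school):
    """Predicted income score from the fitted regression equation."""
    return INTERCEPT + B_AGE * age + B_EDUC * years_of_school


# Raise age by one year while keeping education fixed, at two education levels
age_effects = [
    predicted_income(41, educ) - predicted_income(40, educ)
    for educ in (12, 16)
]
print(age_effects)
```

Both differences equal the age coefficient (.025, up to floating-point error): the predicted change for one more year of age is the same at either education level. The same logic applies in reverse when education changes and age is held fixed.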

Sometimes, researchers include multiple predictors in a model and are not thinking of any of them, conceptually, as control variables. They are simply interested in how the predictors, together, are related to the outcome variable, or they may be interested in seeing which predictor variables show the strongest relationships with the outcome.

Consider the following scenario when answering the question below.

Suppose we wanted to predict the number of slices of pepperoni pizza people ate at a party based on how many slices their friends ate. Suppose we also gathered data on three (3) additional variables: individual's mood, how much they like pepperoni, and how hungry they reported being when they arrived at the party. Take a look at the correlation results below from SPSS, which is based on fictitious data.

Correlations

| | | number of slices | positive mood | friends' number of slices | like pepperoni | hunger |
|---|---|---|---|---|---|---|
| number of slices | Pearson Correlation | 1 | .036 | .791** | .806** | -.080 |
| | Sig. (2-tailed) | | .864 | .000 | .000 | .702 |
| | N | 25 | 25 | 25 | 25 | 25 |
| positive mood | Pearson Correlation | .036 | 1 | -.106 | .174 | .238 |
| | Sig. (2-tailed) | .864 | | .613 | .406 | .253 |
| | N | 25 | 25 | 25 | 25 | 25 |
| friends' number of slices | Pearson Correlation | .791** | -.106 | 1 | .638** | -.139 |
| | Sig. (2-tailed) | .000 | .613 | | .001 | .507 |
| | N | 25 | 25 | 25 | 25 | 25 |
| like pepperoni | Pearson Correlation | .806** | -.174 | .638** | 1 | -.198 |
| | Sig. (2-tailed) | .000 | .406 | .001 | | .343 |
| | N | 25 | 25 | 25 | 25 | 25 |
| hunger | Pearson Correlation | -.080 | .238 | -.139 | -.198 | 1 |
| | Sig. (2-tailed) | .702 | .253 | .507 | .343 | |
| | N | 25 | 25 | 25 | 25 | 25 |

**. Correlation is significant at the 0.01 level (2-tailed).

Hint: Take a look at whether each variable is associated with the outcome variable of number of slices.  Is the association between the variable and number of slices statistically significant?

Which of these three (3) variables would be most logical to control for in the regression model?

Hunger

How much individuals reported liking pepperoni

Positive mood
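One way to apply the hint is to screen each candidate variable by whether its correlation with the outcome (number of slices) is statistically significant. The sketch below does this with the values from the first row of the Correlations table. The names are illustrative, and a p-value displayed as .000 is entered as 0.000.

```python
ALPHA = 0.05

# (Pearson r with number of slices, two-tailed p-value) from the Correlations table
candidates = {
    "positive mood": (0.036, 0.864),
    "like pepperoni": (0.806, 0.000),
    "hunger": (-0.080, 0.702),
}

# Keep only the candidates significantly associated with the outcome
related_to_outcome = [
    name for name, (r, p_value) in candidates.items() if p_value < ALPHA
]
print(related_to_outcome)
```

A variable that is not associated with the outcome cannot account for the link between friends' slices and an individual's slices, which is why this screen is a reasonable first step when choosing a control.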

In the output shown below, which is based on predicting income with age and education level, the R-squared is .066. 

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .256a | .066 | .064 | 2.276 |

a. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED, AGE OF RESPONDENT.

Hint: Remember that to convert a decimal to a percent, you will need to move the decimal point two places to the right.

Which of the following is the appropriate interpretation of the R-squared value? 

Age and education level account for 6.6% of the variability in income.

Income accounts for 6.6% of the variability in age and education level

Age and education level account for 66% of the variability in income.

Income accounts for 66% of the variability in age and education level completed.

How to Create Dummy-Coded Variables

by Robin Kouvaras

Topic 2 of 5

Learning Objective: Interpret regression models with dummy-coded variables.

How to Create Dummy-Coded Variables

Dummy-coded variables are created by only using the values of 0 and 1. The general rule used for dummy coding is that you need one (1) fewer dummy-coded variables than you have groups (# total groups – 1). So, for our variable of marital status, we would need two (2) dummy-coded variables because we have chosen to focus on three (3) marital status groups (3 – 1 = 2). The group for which we do not create a dummy-coded variable is typically called the reference category. Often the reference category will be the one that researchers want to compare to other groups. For our research, we might choose “married” as our reference category if we want to compare non-married individuals to married individuals.

Before we conduct our regression analyses in SPSS, then, we will need to create two (2) dummy-coded variables for marital status:

1. one variable for the divorced group

2. one variable for the never-married group

We will use a 1 to indicate membership to that category (e.g., to indicate that someone is divorced for the “divorced” dummy-coded variable) and 0 to indicate non-membership.  

The table below shows how we would dummy-code our marital status variables.

Notice the Following

If the original value for an individual’s marital status is a 1 (indicating married), that individual would have a 0 for the “divorced” variable and a 0 for the “never married” variable. This is because they are not a “member” of either of these groups: they are not divorced, and they are not in the never-married category. This same logic holds for the remaining two (2) values of marital status. If an individual is divorced, for example, they get a 1 for the divorced variable and a 0 for the never-married variable.

Also, note that each individual in the data set will have a value (either a 0 or a 1) for each dummy-coded variable that the researcher creates. 
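The coding scheme described above can be sketched in Python. The function name `dummy_code` and the text labels are hypothetical; in SPSS the original variable would hold numeric codes, and the recoding would be done through the program's own recode tools.

```python
def dummy_code(statuses, reference="married"):
    """Return one dict of 0/1 indicators per person, omitting the reference group."""
    groups = sorted(set(statuses))
    # One fewer dummy variable than there are groups
    dummy_names = [group for group in groups if group != reference]
    return [
        {name: int(status == name) for name in dummy_names}
        for status in statuses
    ]


sample = ["married", "divorced", "never married", "married"]
coded = dummy_code(sample)

for status, row in zip(sample, coded):
    print(status, row)
```

Every person receives a value for both dummy variables, and a married person is identified by having a 0 on each.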

Interpreting the Coefficients for Dummy-Coded Variables

by Robin Kouvaras

Topic 3 of 5

Learning Objective: Interpret regression models with dummy-coded variables.

How to Interpret Regression Results

Now that you are familiar with how to create dummy-coded variables, we will discuss how to interpret your regression results. Below is the SPSS output using the marital status groups to predict the frequency of religious attendance using multiple regression. Below the regression output, there is also the SPSS output that shows the mean for religious attendance for each of the marital status groups.

SPSS output using the marital status groups to predict the frequency of religious attendance using multiple regression.

Coefficients

| Model 1 | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 4.328 | .095 | | 45.627 | .000 |
| Divorced | -1.239 | .206 | -.166 | -6.009 | .000 |
| Never Married | -1.190 | .174 | -.189 | -6.825 | .000 |

Legend for Coefficients

· Sig. column: the p-value for the Divorced predictor variable and the p-value for the Never Married predictor variable.

SPSS output that shows the mean for religious attendance for each of the marital status groups.

Descriptives: HOW OFTEN R ATTENDS RELIGIOUS SERVICES

| | N | Mean | Std. Deviation | Std. Error | 95% CI for Mean, Lower Bound | 95% CI for Mean, Upper Bound | Minimum | Maximum |
|---|---|---|---|---|---|---|---|---|
| MARRIED | 789 | 4.33 | 2.731 | .097 | 4.14 | 4.52 | 0 | 8 |
| DIVORCED | 212 | 3.09 | 2.687 | .185 | 2.73 | 3.45 | 0 | 8 |
| NEVER MARRIED | 332 | 3.14 | 2.484 | .136 | 2.87 | 3.41 | 0 | 8 |
| Total | 1333 | 3.83 | 2.728 | .075 | 3.69 | 3.98 | 0 | 8 |

Let’s focus on the unstandardized regression coefficients in the output. Each coefficient will indicate how that particular group compares to the reference category (e.g., married) on the dependent variable. The coefficient reflects the comparison between the mean value of the dependent variable for the reference category and the mean value for the group represented by that particular coefficient. For example, first, take a look at the unstandardized regression coefficient for “divorced” (-1.239). This value reflects how the divorced group compares to the married group on religious attendance and indicates that the mean religious attendance for the divorced group is 1.239 units lower than that for the married group. 

A few more things about the output:

· If you subtract the mean for divorced (3.09) from the mean for married (4.33), you can see that you get the absolute value of the coefficient for the divorced variable: 4.33 – 3.09 = 1.24. (If you round 1.239, you get 1.24.)

· If the value had been positive (1.239 instead of -1.239), it would indicate that the divorced group had a higher mean than the married group on the dependent variable.

· Similar to when you are interpreting the coefficients for continuous predictor variables in a regression model, the difference between the reference category and the indicated group is only considered to be statistically significant if the p-value is less than alpha. In our results above, if we assume an alpha of .05 (or even .01), each predictor would be statistically significant, indicating that each group (divorced, never married) differs from the reference category of married on the dependent variable.

· Also similar to when you are interpreting the coefficients for continuous predictor variables in a regression model, you can use the absolute value of the standardized regression coefficients to gauge the effect size for each variable; values closer to 0 indicate weaker effects, and values closer to 1 indicate stronger effects.
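The link between the dummy-variable coefficients and the group means can be checked with a short sketch. With only dummy-coded predictors in the model, the constant is the predicted mean for the reference category, and adding a group's coefficient gives that group's predicted mean. The numbers come from the two SPSS outputs above; the variable names are illustrative.

```python
# Values from the Coefficients output
constant = 4.328
coefficients = {"divorced": -1.239, "never married": -1.190}

# Group means from the Descriptives output
observed_means = {"married": 4.33, "divorced": 3.09, "never married": 3.14}

# Reference group: constant alone; other groups: constant plus their coefficient
predicted_means = {"married": constant}
for group, b in coefficients.items():
    predicted_means[group] = constant + b

for group, observed in observed_means.items():
    print(f"{group}: predicted {predicted_means[group]:.3f}, observed {observed}")
```

Each predicted mean agrees with the observed group mean once both are rounded to two decimals.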

Before You Begin

Before reading this Skill Builder, be sure to review the following concepts:

· Steps of hypothesis testing

· Null hypothesis 

· Alternative hypotheses

· p-value

· alpha

· Under what circumstances to reject the null hypothesis

· Effect size

· Practical significance/meaningfulness

Correlation Coefficients

Researchers who study adolescent peer relationships are often interested in whether peers shape one another's attitudes toward many different things, including drug and alcohol use, delinquent behavior, and academics. Suppose you are a researcher interested in whether adolescents and their peers shape each other's attitudes toward academics. Although it can be tricky to obtain data that would allow you to examine causal relationships between adolescents’ and peers’ academic attitudes, you can, as a start toward studying this topic, examine whether there are associations between adolescents’ and peers’ academic values. For example, do adolescents who value academics tend to have friends who also value academics?

Pearson’s Correlation

Pearson’s correlation is one method of examining associations among variables. It allows researchers to examine how one variable changes as another variable changes. For example, will one variable increase as the other variable increases? Or, will one variable decrease as the other variable increases?
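For readers who want to see what the statistic computes, here is a minimal Python sketch of Pearson’s r. The function name and the scores are made up for illustration (they are not the Loken data); the scores use a 1 to 7 scale.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical scores for six students and their peer groups
student_value = [2, 3, 4, 5, 6, 7]
peer_value = [3, 2, 5, 4, 6, 6]

r_example = pearson_r(student_value, peer_value)
print(f"r = {r_example:.2f}")
```

A positive r means the two sets of scores tend to rise together; a negative r means one tends to fall as the other rises.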

The table below shows an example of SPSS output from Pearson’s correlation showing the association between adolescents’ value for the subject of English and their peers’ value for English (Loken, 2005). Students were asked to answer questions that would indicate how much value they place on English (e.g., how much they like English) and their peers were also asked the same set of questions (Eccles et al., 1983). Scores on the English value variable range from a low of 1 to a high of 7, with higher scores indicating a greater degree of value for English.

References: Loken, E. (2005). Academic achievement in middle schoolers. Eccles et al. (1983). Achievement and achievement motivation in expectancies, values, and academic behaviors. W.H. Freeman.

Table: SPSS Output from Pearson's Correlation

| | | English Value | Peer Group's English Value |
|---|---|---|---|
| English Value | Pearson Correlation | 1 | .438** |
| | Sig. (2-tailed) | | .000 |
| | N | 67 | 63 |
| Peer Group's English Value | Pearson Correlation | .438** | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 63 | 63 |

**. Correlation is significant at the 0.01 level (2-tailed).

Pearson’s correlation is typically used when research scenarios meet these criteria: 

· The researcher wants to examine the association between two variables

· Both variables can be considered to be continuous

Although the relationship between the two variables is not always linear, we will only focus on linear associations in this Skill Builder. While we will not focus on scatter plots here, examining a scatter plot is an effective way to see if the association is linear or curvilinear and to gauge the strength and direction of the association between the variables.

The Strength and Direction of Correlation Coefficients

When researchers examine the correlation coefficients in their SPSS output, how do they interpret them to discern the strength and direction of the association between the two variables? First, when we examine our SPSS output, we should think about the null hypothesis that we are testing, and we will need to decide whether or not to reject the null hypothesis. The null and alternative hypotheses for a Pearson’s correlation test can be written as:

Null: ρ = 0

Alternative: ρ ≠ 0

The null hypothesis states that there is no association between the variables; that is, the Pearson’s correlation coefficient is equal to zero. The alternative hypothesis, on the other hand, specifies that there is an association between the variables – that the Pearson’s correlation coefficient is not equal to zero. In our example of students’ and peers’ English value, SPSS is indicating a p-value of .000 (see the “sig. (2-tailed)” value in the table below). If we assume that alpha was set at .05, we would reject the null hypothesis and conclude that the results are consistent with there being an association between students’ and peers’ value for English.
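As a rough cross-check on the decision above, a correlation can be converted to a t statistic with the standard formula t = r·√(n − 2) / √(1 − r²). The lesson does not state this formula, so treat the sketch as supplementary. The critical value of about 2.00 is an assumption taken from a standard t table for roughly 60 degrees of freedom at alpha = .05 (two-tailed).

```python
from math import sqrt

r = 0.438   # Pearson correlation from the SPSS table
n = 63      # number of student/peer pairs behind that correlation

degrees_of_freedom = n - 2
t_stat = r * sqrt(degrees_of_freedom) / sqrt(1 - r ** 2)

# Assumption: approximate two-tailed critical t for alpha = .05, df near 60
T_CRITICAL = 2.00
reject_null = abs(t_stat) > T_CRITICAL

print(f"t({degrees_of_freedom}) = {t_stat:.2f}, reject null: {reject_null}")
```

The t statistic is well beyond the critical value, which is consistent with the very small p-value that SPSS reports.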

Correlations

SPSS Output: Students' and Peers' English Value

Simply looking at the p-value in order to interpret correlation results will not be sufficient, however. Now that we have concluded that there is evidence of an association between the variables, we need to figure out the direction and the strength of the association between the two variabl
