Determine the appropriate statistical test for your two main variables of interest
· Analysis of variance (ANOVA) assesses whether the means of two or more groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means (quantitative variables) of groups (categorical variables). The null hypothesis is that there is no difference in the mean of the quantitative variable across groups (categorical variable), while the alternative is that there is a difference.
· A Chi-Square Test of Independence compares frequencies of one categorical variable for different values of a second categorical variable. The null hypothesis is that the relative proportions of one variable are independent of the second variable; in other words, the proportions of one variable are the same for different values of the second variable. The alternative hypothesis is that the relative proportions of one variable are associated with the second variable. Note: although it is possible to run large Chi-Square tables (e.g. 5 x 5, 4 x 6, etc.), the test is really only interpretable when your response variable has 2 levels (see Graphing decisions flow chart in bivariate graphing chapter).
· A correlation coefficient (r) assesses the degree of linear relationship between two variables. It ranges from +1 to -1. A correlation of +1 means that there is a perfect, positive, linear relationship between the two variables. A correlation of -1 means there is a perfect, negative, linear relationship between the two variables. In both cases, knowing the value of one variable, you can perfectly predict the value of the second. Note: Two categorical variables with 3 or more levels can be used to generate a correlation coefficient if the categories are ordered and the average (i.e. mean) can be interpreted. The scatterplot, on the other hand, will not be useful. In general, the scatterplot is not useful for discrete variables (i.e. those that take on a limited number of values). When we square r, it tells us the proportion of the variability in one variable that is described by variation in the second variable (aka RSquare or Coefficient of Determination).
· Please note: If you have a quantitative explanatory variable and a categorical response, you will eventually be using logistic regression. For now, categorize your explanatory variable and use a chi-square test as explained above.
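The decision rules in the list above can be sketched as a small lookup. This is only an illustration: the function name choose_test and the labels "categorical" and "quantitative" are made up for this example and are not part of SPSS or of the assignment.

```python
def choose_test(explanatory: str, response: str) -> str:
    """Suggest a test from the types of the two variables.

    Both arguments must be "categorical" or "quantitative"
    (hypothetical labels chosen for this sketch).
    """
    valid = {"categorical", "quantitative"}
    if explanatory not in valid or response not in valid:
        raise ValueError("types must be 'categorical' or 'quantitative'")

    if explanatory == "categorical" and response == "quantitative":
        return "ANOVA"
    if explanatory == "categorical" and response == "categorical":
        return "chi-square test of independence"
    if explanatory == "quantitative" and response == "quantitative":
        return "correlation"
    # Remaining case: quantitative explanatory, categorical response.
    # For now, the handout says to categorize the explanatory variable.
    return "categorize explanatory variable, then chi-square test of independence"


print(choose_test("categorical", "quantitative"))
```

For example, nicotine dependence (categorical explanatory) and cigarettes smoked per day (quantitative response) map to ANOVA, which matches the first sample write-up.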
The requirement of this assignment is to run the appropriate test, post the syntax used, and interpret your findings. In addition, use post hoc tests if appropriate. Please see the samples below for guidance in writing statistical findings.
Example Writeup:
· Example of how to write results for ANOVA:
· When examining the association between current number of cigarettes smoked (quantitative response) and past year nicotine dependence (categorical explanatory), an Analysis of Variance (ANOVA) revealed that among daily, young adult smokers (my sample), those with nicotine dependence reported smoking significantly more cigarettes per day (Mean=14.6, s.d. ±9.15) compared to those without nicotine dependence (Mean=11.4, s.d. ±7.43), F(1, 1313)=44.68, p=.0001.
· Post hoc ANOVA results: ANOVA revealed that among daily, young adult smokers (my sample), number of cigarettes smoked per day (collapsed into 5 ordered categories, which is the categorical explanatory variable) and number of nicotine dependence symptoms (quantitative response variable) were significantly associated, F (4, 1308)=11.79, p=.0001. Post hoc comparisons of mean number of nicotine dependence symptoms by pairs of cigarettes per day categories revealed that those individuals smoking more than 10 cigarettes per day (i.e. 11 to 15, 16 to 20 and >20) reported significantly more nicotine dependence symptoms compared to those smoking 10 or fewer cigarettes per day (i.e. 1 to 5 and 6 to 10). All other comparisons were statistically similar.
· Chi-Square Test of Independence
· When examining the association between lifetime major depression (categorical response) and past year nicotine dependence (categorical explanatory), a chi-square test of independence revealed that among daily, young adult smokers (my sample), those with past year nicotine dependence were more likely to have experienced major depression in their lifetime (36.2%) compared to those without past year nicotine dependence (12.7%), X2 =88.60, 1 df, p=.0001.
· Post hoc Chi-Square results: A Chi-Square test of independence revealed that among daily, young adult smokers (my sample), number of cigarettes smoked per day (collapsed into 5 ordered categories) and past year nicotine dependence (binary categorical variable) were significantly associated, X2 =45.16, 4 df, p=.0001. Post hoc comparisons of rates of nicotine dependence by pairs of cigarettes per day categories revealed that higher rates of nicotine dependence were seen among those smoking more cigarettes, up to 11 to 15 cigarettes per day. In comparison, prevalence of nicotine dependence was statistically similar among those groups smoking 11 to 15, 16 to 20, and > 20 cigarettes per day.
· Correlation
· Among daily, young adult smokers (my sample), the correlation between number of cigarettes smoked per day (quantitative) and number of nicotine dependence symptoms experienced in the past year (quantitative) was 0.17 (p=.0001), suggesting that only 3% (i.e. 0.17 squared) of the variance in number of current nicotine dependence symptoms can be explained by number of cigarettes smoked per day.
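As a rough illustration of the correlation write-up above, the sketch below computes Pearson's r by hand and then squares it. The two short lists are invented purely for illustration and are not the course data; only the 0.17 figure comes from the sample write-up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented example values (NOT the course data)
cigarettes_per_day = [5, 10, 15, 20, 25]
dependence_symptoms = [1, 3, 2, 5, 4]

r = pearson_r(cigarettes_per_day, dependence_symptoms)
print(f"r = {r:.2f}, r squared = {r ** 2:.2f}")

# The sample write-up reports r = 0.17, so r squared is about 0.03 (3%).
reported_r = 0.17
print(f"reported r squared = {reported_r ** 2:.4f}")
```

Squaring the reported r of 0.17 gives 0.0289, which is where the "only 3%" in the write-up comes from.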
Sample Submission:
ANOVA
In looking at the question assessing whether getting pregnant now would be one of the worst things, we saw no difference between men (M = 1.64, SD = .97) and women (M = 1.69, SD = 1.01), F(1, 4425) = 2.10, p = .148, partial eta squared < .001.
Univariate Analysis of Variance
Notes
| Output Created | 10-APR-2024 14:19:24 |
| Comments | |
| Input: Data | C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav |
| Input: Active Dataset | DataSet1 |
| Input: Filter | H1RP1<5.5 (FILTER) |
| Input: Weight | <none> |
| Input: Split File | <none> |
| Input: N of Rows in Working Data File | 4427 |
| Missing Value Handling: Definition of Missing | User-defined missing values are treated as missing. |
| Missing Value Handling: Cases Used | Statistics are based on all cases with valid data for all variables in the model. |
| Syntax | UNIANOVA H1RP1 BY BIO_SEX /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT ETASQ DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=BIO_SEX. |
| Resources: Processor Time | 00:00:00.06 |
| Resources: Elapsed Time | 00:00:00.07 |
Between-Subjects Factors
| | | N |
| BIOLOGICAL SEX-W1 | 1 | 2197 |
| | 2 | 2230 |
Descriptive Statistics
Dependent Variable: S8Q1 PREGNANT NOW ONE OF THE WORST-W1
| BIOLOGICAL SEX-W1 | Mean | Std. Deviation | N |
| 1 | 1.64 | .965 | 2197 |
| 2 | 1.69 | 1.007 | 2230 |
| Total | 1.67 | .986 | 4427 |
Tests of Between-Subjects Effects
Dependent Variable: S8Q1 PREGNANT NOW ONE OF THE WORST-W1
| Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
| Corrected Model | 2.040a | 1 | 2.040 | 2.097 | .148 | .000 |
| Intercept | 12279.740 | 1 | 12279.740 | 12621.511 | .000 | .740 |
| BIO_SEX | 2.040 | 1 | 2.040 | 2.097 | .148 | .000 |
| Error | 4305.178 | 4425 | .973 | | | |
| Total | 16590.000 | 4427 | | | | |
| Corrected Total | 4307.218 | 4426 | | | | |
a. R Squared = .000 (Adjusted R Squared = .000)
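To see where the F and partial eta squared values in the table above come from, the sketch below recomputes them from the sums of squares and degrees of freedom in the BIO_SEX and Error rows. The function name is hypothetical; the formulas are the standard one-way ANOVA definitions (F = mean square of the effect divided by mean square error; partial eta squared = SS effect / (SS effect + SS error)).

```python
def f_and_partial_eta_squared(ss_effect, df_effect, ss_error, df_error):
    """Return (F, partial eta squared) from sums of squares and df."""
    ms_effect = ss_effect / df_effect   # mean square for the effect
    ms_error = ss_error / df_error      # mean square error
    f_stat = ms_effect / ms_error
    partial_eta_sq = ss_effect / (ss_effect + ss_error)
    return f_stat, partial_eta_sq


# Values copied from the BIO_SEX and Error rows of the table above
f_stat, eta_sq = f_and_partial_eta_squared(2.040, 1, 4305.178, 4425)
print(f"F(1, 4425) = {f_stat:.2f}, partial eta squared = {eta_sq:.4f}")
```

The partial eta squared here is about .0005, which SPSS displays as .000 at three decimals. That is why a write-up may prefer "< .001" to ".000".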
In analyzing the getting pregnant now not being so bad question, there was a significant difference between men (M = 4.23, SD = .99) and women (M = 4.17, SD = 1.03), where men thought that getting pregnant now would not be as bad as women thought, F(1, 4425) = 4.26, p = .039, partial eta squared = .001.
Univariate Analysis of Variance
Notes
| Output Created | 10-APR-2024 14:33:07 |
| Comments | |
| Input: Data | C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav |
| Input: Active Dataset | DataSet1 |
| Input: Filter | H1RP2<5.5 (FILTER) |
| Input: Weight | <none> |
| Input: Split File | <none> |
| Input: N of Rows in Working Data File | 4427 |
| Missing Value Handling: Definition of Missing | User-defined missing values are treated as missing. |
| Missing Value Handling: Cases Used | Statistics are based on all cases with valid data for all variables in the model. |
| Syntax | UNIANOVA H1RP2 BY BIO_SEX /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT ETASQ DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=BIO_SEX. |
| Resources: Processor Time | 00:00:00.09 |
| Resources: Elapsed Time | 00:00:00.09 |
Between-Subjects Factors
| | | N |
| BIOLOGICAL SEX-W1 | 1 | 2197 |
| | 2 | 2230 |
Descriptive Statistics
Dependent Variable: S8Q2 PREGNANT NOW NOT SO BAD-W1
| BIOLOGICAL SEX-W1 | Mean | Std. Deviation | N |
| 1 | 4.23 | .989 | 2197 |
| 2 | 4.17 | 1.027 | 2230 |
| Total | 4.20 | 1.009 | 4427 |
Tests of Between-Subjects Effects
Dependent Variable: S8Q2 PREGNANT NOW NOT SO BAD-W1
| Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
| Corrected Model | 4.334a | 1 | 4.334 | 4.259 | .039 | .001 |
| Intercept | 78000.884 | 1 | 78000.884 | 76647.787 | .000 | .945 |
| BIO_SEX | 4.334 | 1 | 4.334 | 4.259 | .039 | .001 |
| Error | 4503.116 | 4425 | 1.018 | | | |
| Total | 82504.000 | 4427 | | | | |
| Corrected Total | 4507.451 | 4426 | | | | |
a. R Squared = .001 (Adjusted R Squared = .001)
DEMO FROM SEX WORKER SAMPLE
Univariate Analysis of Variance
Notes
| Output Created | 15-APR-2024 14:18:08 |
| Comments | |
| Input: Data | C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav |
| Input: Active Dataset | DataSet2 |
| Input: Filter | <none> |
| Input: Weight | <none> |
| Input: Split File | <none> |
| Input: N of Rows in Working Data File | 63 |
| Missing Value Handling: Definition of Missing | User-defined missing values are treated as missing. |
| Missing Value Handling: Cases Used | Statistics are based on all cases with valid data for all variables in the model. |
| Syntax | UNIANOVA firstsexrecoded BY condition /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT ETASQ DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=condition. |
| Resources: Processor Time | 00:00:00.02 |
| Resources: Elapsed Time | 00:00:00.01 |
[DataSet2] C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav
Between-Subjects Factors
| | | Value Label | N |
| Sample type | 1 | cohort | 30 |
| | 2 | prost | 28 |
Descriptive Statistics
Dependent Variable: firstsexrecoded
| Sample type | Mean | Std. Deviation | N |
| cohort | 1.8667 | .86037 | 30 |
| prost | 2.1429 | .70523 | 28 |
| Total | 2.0000 | .79472 | 58 |
Tests of Between-Subjects Effects
Dependent Variable: firstsexrecoded
| Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
| Corrected Model | 1.105a | 1 | 1.105 | 1.773 | .188 | .031 |
| Intercept | 232.829 | 1 | 232.829 | 373.645 | <.001 | .870 |
| condition | 1.105 | 1 | 1.105 | 1.773 | .188 | .031 |
| Error | 34.895 | 56 | .623 | | | |
| Total | 268.000 | 58 | | | | |
| Corrected Total | 36.000 | 57 | | | | |
a. R Squared = .031 (Adjusted R Squared = .013)
Looking at the first sexual experience variable, there was no significant difference between the cohort sample (M = 1.87, SD = .86) and the sex worker sample (M = 2.14, SD = .71), F(1, 56) = 1.77, p = .188, partial eta squared = .031.
DEMO SEX WORKER CHI SQUARE
Crosstabs
Notes
| Output Created | 15-APR-2024 14:33:53 |
| Comments | |
| Input: Data | C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav |
| Input: Active Dataset | DataSet2 |
| Input: Filter | <none> |
| Input: Weight | <none> |
| Input: Split File | <none> |
| Input: N of Rows in Working Data File | 63 |
| Missing Value Handling: Definition of Missing | User-defined missing values are treated as missing. |
| Missing Value Handling: Cases Used | Statistics for each table are based on all the cases with valid data in the specified range(s) for all variables in each table. |
| Syntax | CROSSTABS /TABLES=condition BY sex_o /FORMAT=AVALUE TABLES /STATISTICS=CHISQ /CELLS=COUNT /COUNT ROUND CELL. |
| Resources: Processor Time | 00:00:00.00 |
| Resources: Elapsed Time | 00:00:00.01 |
| Resources: Dimensions Requested | 2 |
| Resources: Cells Available | 524245 |
[DataSet2] C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav
Case Processing Summary
| | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
| Sample type * sexual orientation | 63 | 100.0% | 0 | 0.0% | 63 | 100.0% |
Sample type * sexual orientation Crosstabulation
Count
| | sexual orientation | | | Total |
| Sample type | hetero | bi | homo | |
| cohort | 30 | 2 | 0 | 32 |
| prost | 20 | 9 | 2 | 31 |
| Total | 50 | 11 | 2 | 63 |
Chi-Square Tests
| | Value | df | Asymptotic Significance (2-sided) |
| Pearson Chi-Square | 8.441a | 2 | .015 |
| Likelihood Ratio | 9.588 | 2 | .008 |
| Linear-by-Linear Association | 8.058 | 1 | .005 |
| N of Valid Cases | 63 | | |
a. 2 cells (33.3%) have expected count less than 5. The minimum expected count is .98.
In our study, we found a greater prevalence of heterosexuality in the cohort sample (94%; 30 of 32) than in the sex worker sample (65%; 20 of 31), chi-square = 8.44, df = 2, p = .015.
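The Pearson chi-square in the output above can be recomputed from the observed counts alone. The sketch below is a minimal hand calculation, not how SPSS is implemented; the function name is hypothetical. Expected counts are row total times column total divided by N. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail chi-square probability equals exp(-chi2 / 2); other df values need a statistics library.

```python
import math


def chi_square_independence(table):
    """Return (chi-square, df, minimum expected count) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, min_expected


# Observed counts from the crosstab above: rows = cohort, prost;
# columns = hetero, bi, homo
observed = [[30, 2, 0],
            [20, 9, 2]]

chi2, df, min_expected = chi_square_independence(observed)
p_value = math.exp(-chi2 / 2)  # valid ONLY because df == 2 here

# Row percentages for the "hetero" column, as quoted in the write-up
pct_hetero = [100 * row[0] / sum(row) for row in observed]

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
print(f"minimum expected count = {min_expected:.2f}")
print(f"percent hetero by sample: {pct_hetero[0]:.1f}%, {pct_hetero[1]:.1f}%")
```

The minimum expected count below 1 matches footnote a in the SPSS table and is worth keeping in mind when reading this result, since the chi-square approximation is usually considered less reliable with expected counts that small.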
Univariate Analysis of Variance
Notes
| Output Created | 17-APR-2024 14:12:57 |
| Comments | |
| Input: Data | C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav |
| Input: Active Dataset | DataSet1 |
| Input: Filter | H1RP5<5.5 (FILTER) |
| Input: Weight | <none> |
| Input: Split File | <none> |
| Input: N of Rows in Working Data File | 4393 |
| Missing Value Handling: Definition of Missing | User-defined missing values are treated as missing. |
| Missing Value Handling: Cases Used | Statistics are based on all cases with valid data for all variables in the model. |
| Syntax | UNIANOVA H1RP5 BY BIO_SEX /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT ETASQ DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=BIO_SEX. |
| Resources: Processor Time | 00:00:00.08 |
| Resources: Elapsed Time | 00:00:00.08 |
[DataSet1] C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
Between-Subjects Factors
| | | N |
| BIOLOGICAL SEX-W1 | 1 | 2178 |
| | 2 | 2215 |
Descriptive Statistics
Dependent Variable: S8Q5 RISK OF PREGNANCY W/O PROTECTION-W1
| BIOLOGICAL SEX-W1 | Mean | Std. Deviation | N |
| 1 | 3.12 | .999 | 2178 |
| 2 | 3.27 | .990 | 2215 |
| Total | 3.20 | .997 | 4393 |
Tests of Between-Subjects Effects
Dependent Variable: S8Q5 RISK OF PREGNANCY W/O PROTECTION-W1
| Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
| Corrected Model | 24.738a | 1 | 24.738 | 25.024 | <.001 | .006 |
| Intercept | 44927.538 | 1 | 44927.538 | 45447.377 | .000 | .912 |
| BIO_SEX | 24.738 | 1 | 24.738 | 25.024 | <.001 | .006 |
| Error | 4340.775 | 4391 | .989 | | | |
| Total | 49314.000 | 4393 | | | | |
| Corrected Total | 4365.513 | 4392 | | | | |
a. R Squared = .006 (Adjusted R Squared = .005)
In looking at men’s and women’s attitudes about the risk of unprotected sex, we found that men (M = 3.12, SD = 1.00) were less concerned about the risks than were women (M = 3.27, SD = .99), F(1, 4391) = 25.02, p < .001, partial eta squared = .006.
DATA FROM THE APRIL 17TH CLASS SESSION
Crosstabs
Notes
| Output Created | 17-APR-2024 14:20:11 |
| Comments | |
| Input: Data | C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav |
| Input: Active Dataset | DataSet1 |
| Input: Filter | H1RP5<5.5 (FILTER) |
| Input: Weight | <none> |
| Input: Split File | <none> |
| Input: N of Rows in Working Data File | 4393 |
| Missing Value Handling: Definition of Missing | User-defined missing values are treated as missing. |
| Missing Value Handling: Cases Used | Statistics for each table are based on all the cases with valid data in the specified range(s) for all variables in each table. |
| Syntax | CROSSTABS /TABLES=BIO_SEX BY H1NM5 /FORMAT=AVALUE TABLES /STATISTICS=CHISQ /CELLS=COUNT /COUNT ROUND CELL. |
| Resources: Processor Time | 00:00:00.08 |
| Resources: Elapsed Time | 00:00:00.09 |
| Resources: Dimensions Requested | 2 |
| Resources: Cells Available | 524245 |
Case Processing Summary
| | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
| BIOLOGICAL SEX-W1 * S12Q5 BIO MOM DISABLED-W1 | 4393 | 100.0% | 0 | 0.0% | 4393 | 100.0% |
BIOLOGICAL SEX-W1 * S12Q5 BIO MOM DISABLED-W1 Crosstabulation
Count
| | | S12Q5 BIO MOM DISABLED-W1 | | | | Total |
| | | 0 | 1 | 7 | 8 | |
| BIOLOGICAL SEX-W1 | 1 | 282 | 18 | 1878 | 0 | 2178 |
| | 2 | 254 | 24 | 1936 | 1 | 2215 |
| Total | | 536 | 42 | 3814 | 1 | 4393 |
Chi-Square Tests
| | Value | df | Asymptotic Significance (2-sided) |
| Pearson Chi-Square | 3.890a | 3 | .274 |
| Likelihood Ratio | 4.280 | 3 | .233 |
| Linear-by-Linear Association | 1.571 | 1 | .210 |
| N of Valid Cases | 4393 | | |
a. 2 cells (25.0%) have expected count less than 5. The minimum expected count is .50.
When looking at men’s and women’s responses to the question of whether their mom is disabled, we found no difference between men (86% reported no disabilities) and women (87% reported no disabilities), chi-square = 3.89, df = 3, p = .274.
Correlations
Notes
| Output Created | 17-APR-2024 14:32:21 |
| Comments | |
| Input: Data | C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav |
| Input: Active Dataset | DataSet1 |
| Input: Filter | H1RP5<5.5 (FILTER) |
| Input: Weight | <none> |
| Input: Split File | <none> |
| Input: N of Rows in Working Data File | 4393 |
| Missing Value Handling: Definition of Missing | User-defined missing values are treated as missing. |
| Missing Value Handling: Cases Used | Statistics for each pair of variables are based on all the cases with valid data for that pair. |
| Syntax | CORRELATIONS /VARIABLES=H1RP1 H1RP2 /PRINT=TWOTAIL NOSIG FULL /MISSING=PAIRWISE. |
| Resources: Processor Time | 00:00:00.14 |
| Resources: Elapsed Time | 00:00:00.08 |
Correlations
| | | S8Q1 PREGNANT NOW ONE OF THE WORST-W1 | S8Q2 PREGNANT NOW NOT SO BAD-W1 |
| S8Q1 PREGNANT NOW ONE OF THE WORST-W1 | Pearson Correlation | 1 | -.492** |
| | Sig. (2-tailed) | | <.001 |
| | N | 4393 | 4393 |
| S8Q2 PREGNANT NOW NOT SO BAD-W1 | Pearson Correlation | -.492** | 1 |
| | Sig. (2-tailed) | <.001 | |
| | N | 4393 | 4393 |
**. Correlation is significant at the 0.01 level (2-tailed).
The next analysis that I ran correlated the pregnant now being one of the worst variable with the pregnant now being not so bad variable; these variables were strongly negatively correlated, r = -.492, p < .001.
Correlations
Notes
| Output Created | 17-APR-2024 14:37:06 |
| Comments | |
| Input: Data | C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav |
| Input: Active Dataset | DataSet1 |
| Input: Filter | H1RP5<5.5 (FILTER) |
| Input: Weight | <none> |
| Input: Split File | <none> |
| Input: N of Rows in Working Data File | 4393 |
| Missing Value Handling: Definition of Missing | User-defined missing values are treated as missing. |
| Missing Value Handling: Cases Used | Statistics for each pair of variables are based on all the cases with valid data for that pair. |
| Syntax | CORRELATIONS /VARIABLES=H1RP1 H1RP3 /PRINT=TWOTAIL NOSIG FULL /MISSING=PAIRWISE. |
| Resources: Processor Time | 00:00:00.08 |
| Resources: Elapsed Time | 00:00:00.13 |
Correlations
| | | S8Q1 PREGNANT NOW ONE OF THE WORST-W1 | S8Q3 WILL SUFFER IF HIV POSITIVE-W1 |
| S8Q1 PREGNANT NOW ONE OF THE WORST-W1 | Pearson Correlation | 1 | .200** |
| | Sig. (2-tailed) | | <.001 |
| | N | 4393 | 4393 |
| S8Q3 WILL SUFFER IF HIV POSITIVE-W1 | Pearson Correlation | .200** | 1 |
| | Sig. (2-tailed) | <.001 | |
| | N | 4393 | 4393 |
**. Correlation is significant at the 0.01 level (2-tailed).
The next analysis that I ran correlated the pregnant now being one of the worst variable with the will suffer if HIV positive variable; these variables were moderately positively correlated, r = .200, p < .001.
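A correlation as small as .200 can still come out with p < .001 because the sample is large. The sketch below shows this with the usual t statistic for a Pearson correlation, t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom. It assumes this is the test behind the Sig. (2-tailed) column; the function name is hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


n = 4393  # N from the correlation tables above
for label, r in [("S8Q1 with S8Q2", -0.492), ("S8Q1 with S8Q3", 0.200)]:
    t = t_from_r(r, n)
    # r squared is the proportion of shared variance
    print(f"{label}: r = {r:.3f}, r squared = {r ** 2:.3f}, t({n - 2}) = {t:.2f}")
```

With over 4,000 cases, both t values are far from zero, so both correlations are statistically significant even though r = .200 corresponds to only about 4% shared variance.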