Chi Square Test of Independence in SPSS
The Chi Square Test
In this document, I am going to talk about a chi square example that corresponds to the example you saw in the Descriptive Statistics Crash Course (#1), t-Test Crash Course (#2), and ANOVA Crash Course (#3), where participants were asked to recall how much money they spent on textbooks the prior semester. However, for this Chi Square crash course, we are going to focus on nominal (categorical) variables rather than scaled variables (like ratio or interval). The good news is that this mini-lecture will sum up the basics of the chi square for you as we look at this study, but you can find additional information about the chi square in your textbooks.
On the final page are several questions based on this crash course. Answer these questions, then go into your “Crash Course in Statistics – The Chi Square Quiz #4” in your Canvas assessments menu and copy over your answers. Each Crash Course Quiz counts 5 points.
How, when, and why do a Chi Square?
Before we get to the example, let me give you some basic information about the chi square. The chi square is used to compare two or more levels of a categorical (nominal or ordinal) variable. I recommend looking at the crash course on descriptive statistics for more info on interval and ratio scales, but I want to highlight nominal and ordinal scales here.
Nominal scales are based on assigning items to categories. For example, you can have “yes” versus “no” categories, or “male” versus “female” categories, or “Honda” versus “Toyota” versus “Subaru” versus “Ford”. For nominal scales, these are just different categories, and one option isn’t “better” or “higher” than another (after all, which is higher: males or females?).
Ordinal scales have more order to them. That is, they are ranked. Thus there might be a better or worse ranking here (Pizza is ranked highest, Salad second highest, Sandwiches third highest, Liver lowest for food preference).
We might know the order, but we may not know how spread out those preferences are. That is, maybe pizza, salad, and sandwiches are all ranked very high in preference but liver is ranked really, really low! Or think about a race. The first, second, and third place finishers may come in a few seconds apart while the fourth-place finisher is over a minute behind. For this crash course, I want to focus only on the nominal scale.
A chi square essentially looks at the percentage of cases that fall into each category of a variable. Let’s say I look at gender as a variable in my study. I sample 200 people at random. I would expect to get about 100 men and 100 women, but I could also be off a little from that 50/50 ratio. That is, I could wind up with 95 men and 105 women just based on natural fluctuations in my data collection and selection procedure. The question becomes, “Are the numbers of men and women I observed significantly different from the numbers I expected to see?” As another example, I might look to see if there are more guilty than not guilty verdicts in a trial. If I poll jurors and find that 60% found guilt while 40% found no guilt, I might want to run a chi square to see if the difference is based on chance factors or something more.
Because we are looking at categorical variables (men versus women, or guilt versus no guilt), it is not appropriate to look at means and standard deviations. After all, what is a mean gender? No such thing, right? Thus, we cannot use a t-Test or ANOVA for this analysis, as those tests are based on mean scores. The chi square, however, is designed to look at percentages and frequencies of data.
There are two different types of chi squares:
1). Chi Square – Goodness of Fit: In this test, we look at only one variable. Thus, we can determine whether we have more men in our study than we should by chance, or we can see if there are more not guilty verdicts than we should have by chance.
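To make the goodness-of-fit idea concrete, here is a minimal Python sketch (not part of the SPSS workflow) that tests the 95 men versus 105 women example against an expected 50/50 split. The function names are made up for illustration, and the p-value shortcut used here only holds when there is exactly 1 degree of freedom.

```python
import math

def goodness_of_fit(observed, expected):
    """Return the chi-square statistic and its degrees of freedom."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus 1
    return chi2, df

def p_value_df1(chi2):
    """Upper-tail p value for a chi-square with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

observed = [95, 105]   # men, women actually sampled
expected = [100, 100]  # the 50/50 split we expected out of 200 people

chi2, df = goodness_of_fit(observed, expected)
p = p_value_df1(chi2)
print(f"chi2({df}) = {chi2:.2f}, p = {p:.3f}")  # chi2(1) = 0.50, p = 0.480
```

With p well above .05, a 95/105 split is the kind of difference we would expect from chance alone.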
Although I will briefly discuss this Goodness of Fit test in the lecture material, we will focus mostly on the second kind of chi square test in this crash course document.
2). Chi Square – Test of Independence: For this test, we want to see if two variables are independent. Consider juror gender and verdict in the same chi square model. We might want to know if juror gender has an impact on verdict or whether verdict is independent of gender. BOTH variables here are nominal in nature (two levels to gender: male versus female; two levels to verdict: guilty or not guilty).
The nice thing about chi squares is that there really are no assumptions about the shape of the distribution (we don’t need a bell-shaped curve like we do for t-Tests and ANOVAs). But the variables can’t be measured on interval or ratio scales. They have to be categories. Let’s run a chi square and interpret it in SPSS.
Textbook Study – How Much Did You Spend On Textbooks (High or Low Conditions)
Recall the basic set-up for our money spent on textbooks. Researchers ask participants to recall how much they spent on textbooks the prior semester, and have each participant write their answer on a survey sheet. In all conditions, the first ten answer slots are already filled in, presumably by other respondents. However, the researcher actually completed those ten slots, and manipulated the dollar amounts so that in the High Dollar Condition, the dollar amounts ranged from $350 to $450 (see Figure 1). In the Low Dollar Condition, amounts ranged from $250 to $350 (see Figure 2). For right now, we are going to omit the “Control” condition (we’ll get back to this third level later in this document) and focus on the High Dollar Condition and the Low Dollar Condition only.
Imagine we have eight real participants in the High Dollar Condition and eight real participants in the Low Dollar Condition (and no, we are not including the original dollar amounts on the sheet passed out by the researcher, as those are not real participants!).
However, instead of asking participants to recall how much they spent on textbooks, we ask them to look at the other responses on the survey and decide if other respondents spent more or less than average. We thus have a nominal dependent variable with two answer options: “higher than average” versus “lower than average”. See Figure 1 for the High Dollar Condition survey and Figure 2 for the Low Dollar Condition survey.
Consider the data below, noting that I have a dichotomous dependent variable. I will label “higher” as 1 and “lower” as 2 (though I could just as easily have labeled “higher” as 2 and “lower” as 1. For nominal variables, the exact value does not matter, just that different numbers represent different categories).
Condition A (High Dollar Condition)    Condition B (Low Dollar Condition)
1 (more than average)                  1 (more than average)
1 (more than average)                  1 (more than average)
1 (more than average)                  1 (more than average)
1 (more than average)                  2 (less than average)
1 (more than average)                  2 (less than average)
1 (more than average)                  2 (less than average)
1 (more than average)                  2 (less than average)
2 (less than average)                  2 (less than average)
Eyeballing this, it looks like participants selected the “Others spent higher than average” response (option #1) more frequently in the High Dollar Condition, while participants selected the “Others spent lower than average” response (option #2) more frequently in the Low Dollar Condition. But is this difference based simply on chance or something else? If it is based on chance, then I would say that both columns are pretty equal, and thus the High versus Low Dollar manipulation had no impact. If it is based on something other than chance (i.e. it is significant), then I can conclude that there is something else at work here (probably the High versus Low dollar amounts from prior participants!).
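Before turning to SPSS, the “eyeballing” step can be done more carefully by counting. The short Python sketch below tallies the sixteen responses from the table above into counts and row percentages, which is the same information a crosstabulation shows. The variable and function names are illustrative only.

```python
from collections import Counter

# Responses copied from the table above (1 = higher, 2 = lower than average)
high = [1, 1, 1, 1, 1, 1, 1, 2]  # Condition A: High Dollar Condition
low = [1, 1, 1, 2, 2, 2, 2, 2]   # Condition B: Low Dollar Condition
labels = {1: "Higher", 2: "Lower"}

def tabulate(responses):
    """Return {label: (count, row percentage)} for one condition."""
    counts = Counter(responses)
    n = len(responses)
    return {labels[code]: (counts[code], 100 * counts[code] / n) for code in labels}

table = {"High Dollar": tabulate(high), "Low Dollar": tabulate(low)}
for condition, row in table.items():
    cells = ", ".join(f"{name}: {count} ({pct:.1f}%)" for name, (count, pct) in row.items())
    print(f"{condition} -> {cells}")
# High Dollar -> Higher: 7 (87.5%), Lower: 1 (12.5%)
# Low Dollar -> Higher: 3 (37.5%), Lower: 5 (62.5%)
```

The counts describe the pattern, but they do not say whether it is larger than chance would produce. That is the job of the chi square test itself.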
Put another way, the chi square lets us answer the question, “Did our manipulation work, with those in the High Dollar Condition saying that the prior respondents spent a higher amount of money than average on books while those in the Low Dollar Condition said the prior respondents spent a lower amount of money than average on books?” Let’s see how to run this in SPSS. Since this example involves two different variables with categorical levels, we will run a chi square test of independence.
For the next section, I am going to open SPSS and run a chi square test of independence. I’ll use screenshots from SPSS as I go, but feel free to run these analyses yourself. Just set up your SPSS file like mine. (I also included this SPSS file for you in Canvas if you prefer to use that. It is called “Crash Course Quiz #4– Textbook Money (Chi Square Practice)”, but it is a short data set, so I recommend setting up your own SPSS file using the values from the table above.) I am just going to give you the basics here, but you can refer to other sources for some of the chi square output not covered in this lecture.
SPSS – Our Textbook Money Study
1. Click Analyze > Descriptive Statistics > Crosstabs… on the top menu. You will be presented with the Crosstabs dialog box. (Note that I changed the “Did you spend more” variable into a nominal variable, as denoted by the symbol next to its name. I also chose to display the dependent variable name as “HigherLower” rather than the mouthful phrase “Did others spend higher or lower than average”.)
2. Transfer one of the variables into the “Row(s):” box and the other variable into the “Column(s):” box. In our example we will transfer the “DollarCondition” variable (our independent variable) into the “Row(s):” box and “HigherLower” (our dependent variable) into the “Column(s):” box. There are two ways to do this. You can highlight the variable with your mouse and then use the arrow button to transfer it, or you can drag-and-drop the variables. How do you know which variable goes in the row box and which goes in the column box? There is no right or wrong way. It will depend on how you want to present your data, so feel free to try it either way.
3. If you want to display clustered bar charts, make sure that the “Display clustered bar charts” checkbox is ticked. I usually don’t need the chart, but if you want a visual aid showing what the data look like, the bar chart might help.
4. Click the Statistics button. Select the “Chi-square” and “Phi and Cramer’s V” options, then click “Continue”.
5. Click the Cells button. Select “Observed” from the “Counts” area and “Row” from the “Percentages” area.
6. After clicking “Continue”, click the OK button to generate your output.
Output of the Independent Chi Square in SPSS
You will see several tables for the chi square, but ignore the case processing table.
Condition (1 = High, 2 = Low) * Money Spent (1 = Higher, 2 = Lower) Crosstabulation
The nice thing about this table is that its name tells you exactly what you ran! As you can see, those in the High Dollar Condition chose the “Higher (than average)” option most frequently (7 of 8 participants, or 87.5%) while those in the Low Dollar Condition chose the “Lower (than average)” option most frequently (5 of 8 participants, or 62.5%). But is this statistically significant? For that, we look to the next table.
Chi-Square Tests
Focus on the Pearson Chi-Square row (and ignore the others). Based on Pearson, we can see that the chi square is significant, χ²(1) = 4.27, p = .039. Keep in mind that our degrees of freedom here equal 1. We calculate the df using the formula (k1 – 1) X (k2 – 1), where k refers to the number of levels of each variable. That is, k1 is the number of levels for Dollar Condition (which has 2 levels, High vs Low) and k2 is the number of levels for money spent (which also has 2 levels, Higher than average and Lower than average).
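As a rough check on the SPSS output for the High versus Low Dollar example, the Pearson chi square, its degrees of freedom, and Phi can be reproduced by hand from the eight responses in each condition. The Python sketch below assumes the standard Pearson formula (sum of (observed – expected)² / expected over all cells). The p-value shortcut only works for df = 1, and the names are illustrative.

```python
import math

# Observed counts: rows = Dollar Condition, columns = Higher / Lower
observed = [[7, 1],  # High Dollar Condition
            [3, 5]]  # Low Dollar Condition

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Count we would expect in this cell if the variables were independent
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (k1 - 1) X (k2 - 1)
p = math.erfc(math.sqrt(chi2 / 2))  # upper-tail p value, valid only when df = 1
phi = math.sqrt(chi2 / n)           # effect size for a 2 X 2 table

print(f"chi2({df}) = {chi2:.2f}, p = {p:.3f}, phi = {phi:.3f}")
# chi2(1) = 4.27, p = 0.039, phi = 0.516
```

These values match the Pearson Chi-Square and Phi figures that SPSS reports for this data set.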
Plugging the number of levels for each variable into the (k1 – 1) X (k2 – 1) formula gives (2 – 1) X (2 – 1), or 1 X 1 = 1.
Symmetric Measures
Our final table looks at symmetric measures. In the output, both Phi and Cramer’s V are high (.516 on a 0 to +1 scale). Both are significant at p = .039. However, we use the “Phi” row for designs like the one we just described (a 2 X 2 study design). Thus we would focus on that Phi row to assess our 2 X 2 design. Use Cramer’s V for all other chi squares (like a 2 X 3 design, or a 3 X 3 design).
Interpreting the Independent Chi Square in SPSS
If a significant result occurs, the write up looks like this:
A chi square test of independence was calculated comparing how much participants in the High Dollar Condition versus Low Dollar Condition thought others spent on average for books (higher than average versus lower than average). A significant relationship emerged, χ²(1) = 4.27, p = .039. Most participants in the High Dollar Condition (87.5%) thought other respondents spent more money than average on books while most participants in the Low Dollar Condition (62.5%) thought other respondents spent less money than average on books. Phi showed a large effect. This indicates that the High versus Low Dollar manipulation worked as intended.
If non-significant, the write up for the chi square is even easier. You simply write:
A chi square test of independence was calculated comparing how much participants in the High Dollar Condition versus participants in the Low Dollar Condition thought others spent on average for books (higher than average versus lower than average). No significant relationship was found, χ²(1) = 1.27, p = .351, and phi did not show a large effect. This indicates that the High versus Low Dollar manipulation did not work, as the frequencies of answers were roughly equally distributed among cells.
Another quick example – Three levels to your main dependent variable
Now, assume that we still have two levels to our independent variable of Dollar Condition: High and Low. However, we add another level to our dependent variable, with the available options including “Higher than average”, “Lower than average”, and a new “Average” option. Consider our SPSS output (next page). For the data on the next page, we have the following interpretation:
A chi square test of independence was calculated comparing how much participants in the High Dollar Condition versus Low Dollar Condition thought others spent on average for books (higher than average vs. lower than average vs. average). A significant relationship emerged, χ²(2) = 8.57, p = .014. Most participants in the High Dollar Condition (75%) thought other respondents spent more money than average on books while most participants in the Low Dollar Condition (62.5%) thought other respondents spent less money than average on books. Cramer’s V showed a large effect. This indicates that the High versus Low Dollar manipulation worked as intended.
Note that our df changed a bit. We now have three levels in k2, so (k1 – 1) X (k2 – 1) gives us (2 – 1) X (3 – 1), or 1 X 2 = 2. Also, note that the write up uses Cramer’s V rather than Phi (Phi is best used for a 2 X 2 table, but here we have a 2 X 3 table).
A final example – Three levels to your independent variable and three levels to your dependent variable
Finally, assume that we have three levels to our independent variable of Dollar Condition: High, Low, and Average. We also have three levels to our dependent variable, with the options including “Higher than average”, “Lower than average”, and “Just about average”. See Figure 1 for an example of the High Dollar Condition with three dependent variable answer options.
Figure 1: High Dollar Condition with Three Answer Options
If this 3 X 3 test is significant, we would conclude the following (based on the SPSS output below):
A chi square test of independence was calculated comparing how much participants in the High Dollar Condition versus Low Dollar Condition versus Average Dollar Condition thought others spent on average for books (higher than average vs. lower than average vs. average). A significant relationship emerged, χ²(4) = 19.79, p = .001. Most participants in the High Dollar Condition (75%) thought other respondents spent more money than average on books, most participants in the Low Dollar Condition (62.5%) thought other respondents spent less money than average on books, and most participants in the Average Dollar Condition (75%) thought other respondents spent an average amount on books. Cramer’s V showed a large effect. This indicates that the High versus Low versus Average Dollar manipulation worked as intended.
Notice that our df is now 4. Using (k1 – 1) X (k2 – 1), we now have (3 – 1) X (3 – 1), or 2 X 2 = 4.
Crash Course In Statistics – The Chi Square – Quiz #4 (Coaster, Summer 2023)
Instructions: In your prior Crash Course Quizzes (#2 and #3), you focused on a study looking at the excitation-transfer theory, or the idea that when a person becomes aroused physiologically there is a subsequent period of time when the person will continue to experience a high state of residual arousal yet be unaware of it. If additional stimuli are encountered during this time, the individual may mistakenly attribute their residual arousal from the previous stimuli to the new stimuli. Using that same study design, complete the questions below and transfer your answers to your Crash Course in Statistics – The Chi Square Quiz #4 in Canvas (1 point per question).
IMPORTANT: The answer options on Canvas may not be in the same order you see them below, so make sure to copy over the CONTENT of the answer and not simply the answer letter (A, B, C, D, or E).
Chi Square Crash Course Quiz Part A
You conduct a similar study using the same two groups we used for the t-Test. Recall that in that study, participants were asked how much they would like to take a woman on a date. However, some participants provided their ratings after riding a rollercoaster while others provided their ratings while waiting in line to ride a rollercoaster. (Note: For the Chi Square Crash Course #4 Part A, ignore the “Waiting for food” condition you saw in the ANOVA crash course quiz.) But you wonder whether your participants are already in a relationship, as that might impact their assessments of whether they would want to date the woman in the dating profile. Thus you ask them, “Are you currently in a romantic relationship?” with 1 = Currently single and 2 = In a relationship. You hope that there are no differences in relationship status between participants in the just rode condition and the waiting to ride condition, so you run a chi square to assess this possibility. (Note: If you want to run these analyses yourself, look for the SPSS file called “#4 Chi Square Crash Course Data Coaster Summer A” in Canvas. Running the analysis is not required as the data are presented below, but it is definitely recommended if you want some SPSS practice!) You get the following data:
1). How many participants in the Just rode and Waiting to ride conditions are in a relationship?
A. A total of 11 participants (36.7%) in the just rode condition are in a relationship while 21 participants (30%) in the waiting to ride condition are in a relationship.
B. A total of 19 participants (63.3%) in the just rode condition are in a relationship while 9 participants (30%) in the waiting to ride condition are in a relationship.
C. A total of 19 participants (63.3%) in the just rode condition are in a relationship while 21 participants (70%) in the waiting to ride condition are in a relationship.
D. A total of 20 participants (33.3%) in the just rode condition are in a relationship while 40 participants (66.7%) in the waiting to ride condition are in a relationship.
E. A total of 30 participants (100%) in the just rode condition are in a relationship while 30 participants (10%) in the waiting to ride condition are in a relationship.
2). We used a chi square above to see if participants’ relationship status (currently single versus in a relationship) differed across our two rollercoaster conditions. Could we use a t-Test to also assess this possibility? Select the appropriate answer.
A. Yes, we can run a t-Test. The t-Test relies on continuous variables, and the relationship-status dependent variable is continuous (scaled).
B. Yes, we can run a t-Test. Since the new relationship-status dependent variable is a dichotomous or nominal variable, a t-Test is appropriate to use.
C. No, we cannot run a t-Test. Since the new relationship-status dependent variable here is a continuous (scaled) response, we cannot run a t-Test. A chi square is more appropriate for this new dependent variable.
D. No, we cannot run a t-Test. Since the new relationship-status dependent variable is a categorical-based response (or a dichotomous or nominal variable), we cannot run a t-Test. A chi square is more appropriate.
E. There is not enough information in this study to decide if we can run a t-Test.
3). Which of the following represents the correct way to write out the results for this chi square in an APA formatted results section?
A. A chi square test of independence was calculated to determine whether the relationship status of the participants differed across the just rode and waiting to ride conditions. A significant difference between conditions failed to emerge, χ²(4) = 0.30, p = .584. In the just rode condition, 19 participants (or 63.3% of those in this condition) stated they were in a relationship. In the waiting to ride condition, 21 participants (or 70% of those in this condition) stated they were in a relationship. Phi showed a weak effect. This indicates that participant relationship status was similar across both rollercoaster conditions.
B. A chi square test of independence was calculated to determine whether the relationship status of the participants differed across the just rode and waiting to ride conditions. A significant difference between conditions failed to emerge, χ²(1) = 0.30, p = .584. In the just rode condition, 21 participants (or 70% of those in this condition) stated they were in a relationship. In the waiting to ride condition, 19 participants (or 63.3% of those in this condition) stated they were in a relationship. Phi showed a weak effect. This indicates that participant relationship status was similar across both rollercoaster conditions.
C. A chi square test of independence was calculated to determine whether the relationship status of the participants differed across the just rode and waiting to ride conditions. A significant difference between conditions failed to emerge, χ²(1) = 0.30, p = .584. In the just rode condition, 19 participants (or 63.3% of those in this condition) stated they were in a relationship. In the waiting to ride condition, 21 participants (or 70% of those in this condition) stated they were in a relationship. Phi showed a weak effect. This indicates that participant relationship status was similar across both rollercoaster conditions.
D. A chi square test of independence was calculated to determine whether the relationship status of the participants differed across the just rode and waiting to ride conditions. A significant difference between conditions emerged, χ²(1) = 0.30, p = .005. In the just rode condition, 19 participants (or 63.3% of those in this condition) stated they were in a relationship. In the waiting to ride condition, 21 participants (or 70% of those in this condition) stated they were in a relationship. Phi showed a strong effect. This indicates that participant relationship status was significantly higher in the waiting to ride condition than in the just rode condition.
E. A chi square test of independence was calculated to determine whether the relationship status of the participants differed across the just rode and waiting to ride conditions. A significant difference between conditions emerged, χ²(1) = 0.30, p = .005. In the just rode condition, 21 participants (or 70% of those in this condition) stated they were in a relationship. In the waiting to ride condition, 19 participants (or 63.3% of those in this condition) stated they were in a relationship. Phi showed a strong effect. This indicates that participant relationship status was significantly higher in the just rode condition than in the waiting to ride condition.
Chi Square Crash Course Quiz Part B
You design a new study in which you look at all three conditions from the One-Way ANOVA crash course quiz (Just rode, Waiting to ride, or Waiting for food). However, you also want to see if participants experience different levels of physiological arousal. Therefore you alter your dependent variable to the following: “Which of the following three options best describes your current heartrate? 1 = Beating very fast, 2 = Beating moderately fast, 3 = Beating slow”. You get the following data (Note: This does differ from the tables used for questions 1, 2, and 3! If you want to run the data yourself, use the “#4 Chi Square Crash Course Data Coaster Summer B” file in Canvas. This is not required, but recommended!):
4). You assume that participants in the Just rode condition will report having a very fast heartrate, participants in the Waiting to ride condition will report having a moderately fast heartrate, and participants in the Waiting for food condition will report having a slow heartrate. Focusing on those cells, choose the option below that best represents the crosstabulation table.
A. A total of 20 participants (66.7%) in the Just rode condition said their heart was beating very fast. A total of 6 participants (63.3%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 1 participant (56.7%) in the Waiting for food condition said their heart was beating slow.
B. A total of 20 participants (66.7%) in the Just rode condition said their heart was beating very fast. A total of 19 participants (63.3%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 17 participants (56.7%) in the Waiting for food condition said their heart was beating slow.
C. A total of 30 participants (66.7%) in the Just rode condition said their heart was beating very fast. A total of 30 participants (63.3%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 30 participants (56.7%) in the Waiting for food condition said their heart was beating slow.
D. A total of 20 participants (30%) in the Just rode condition said their heart was beating very fast. A total of 6 participants (43.3%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 12 participants (26.7%) in the Waiting for food condition said their heart was beating slow.
E. A total of 20 participants (100%) in the Just rode condition said their heart was beating very fast. A total of 6 participants (100%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 1 participant (10%) in the Waiting for food condition said their heart was beating slow.
5). Which of the following represents the correct way to write out the results for this chi square in an APA formatted results section?
A. A chi square test of independence was calculated to see if participants’ heartrate differed depending on their condition. A significant relationship emerged, χ²(1) = 42.08, p < .001. A total of 20 participants (66.7%) in the Just rode condition said their heart was beating very fast. A total of 19 participants (63.3%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 17 participants (56.7%) in the Waiting for food condition said their heart was beating slow. Cramer’s V was very strong. This indicates that participant heartrate was fastest when they finished riding the rollercoaster followed by participants who were about to ride the rollercoaster. Heartrates were slowest for those waiting for food (i.e. not waiting to ride a rollercoaster).
B. A chi square test of independence was calculated to see if participants’ heartrate differed depending on their condition. A significant relationship emerged, χ²(4) = 42.08, p < .001. A total of 20 participants (66.7%) in the Just rode condition said their heart was beating very fast. A total of 19 participants (63.3%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 17 participants (56.7%) in the Waiting for food condition said their heart was beating slow. Cramer’s V was very strong. This indicates that participant heartrate was fastest when they finished riding the rollercoaster followed by participants who were about to ride the rollercoaster. Heartrates were slowest for those waiting for food (i.e. not waiting to ride a rollercoaster).
C. A chi square test of independence was calculated to see if participants’ heartrate differed depending on their condition. A significant relationship emerged, χ²(4) = 42.08, p < .001. A total of 20 participants (30%) in the Just rode condition said their heart was beating very fast. A total of 6 participants (43.3%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 12 participants (26.7%) in the Waiting for food condition said their heart was beating slow. Cramer’s V was very strong. This indicates that participant heartrate was fastest when they finished riding the rollercoaster followed by participants who were about to ride the rollercoaster. Heartrates were slowest for those waiting for food (i.e. not waiting to ride a rollercoaster).
D. A chi square test of independence was calculated to see if participants’ heartrate differed depending on their condition. A significant relationship emerged, χ²(4) = 90.00, p .05. A total of 20 participants (66.7%) in the Just rode condition said their heart was beating very fast. A total of 19 participants (63.3%) in the Waiting to ride condition said their heart was beating moderately fast. A total of 17 participants (56.7%) in the Waiting for food condition said their heart was beating slow. Cramer’s V was very weak. This indicates that participant heartrate did not differ significantly between the Just rode, Waiting to ride, and Waiting for food conditions.
