Descriptive Statistics
The most important part of statistics is accurately and effectively communicating the results of a study. For this lab, you will conduct a statistical study of your choosing and report your results.
Lab Procedure:
Think of a topic or question that you’re interested in exploring further that can be studied through data.
It is recommended, but not required, that you pick a topic in your field of study.
Find a source for your data (or use one of the suggested data source sites listed below).
The data must be quantitative and ratio-level (See Chapter 1).
You will need between 30 – 60 data values.
You will need to provide a direct link to the source of the data.
Using Excel, perform the following statistical analyses:
Calculate the relevant and necessary descriptive statistics (See Chapter 2)
Use Excel to generate a frequency histogram (See Chapter 2)
It should be accurate, with the proper number of classes
All axes should be properly labeled
See textbook and Canvas resources for examples of properly formatted histograms
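The lab itself must be completed in Excel, but a short script can serve as a cross-check on the descriptive statistics and the frequency distribution. The sketch below is illustrative only: the 30 data values are hypothetical, the helper name `frequency_table` is made up for this example, and the choice of six equal-width classes is an assumption (use whatever rule Chapter 2 gives for the number of classes).

```python
import statistics

# Hypothetical sample of 30 ratio-level values (illustration only, not real data)
data = [12, 15, 18, 20, 22, 22, 25, 25, 27, 28,
        30, 30, 31, 33, 35, 35, 36, 38, 40, 41,
        42, 45, 45, 48, 50, 52, 55, 58, 60, 72]

# Common descriptive statistics
summary = {
    "n": len(data),
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "stdev": statistics.stdev(data),  # sample standard deviation
    "min": min(data),
    "max": max(data),
}

def frequency_table(values, num_classes):
    """Count how many values fall in each of num_classes equal-width classes."""
    low, high = min(values), max(values)
    width = (high - low) / num_classes
    counts = [0] * num_classes
    for v in values:
        # Clamp so the maximum value lands in the last class
        idx = min(int((v - low) / width), num_classes - 1)
        counts[idx] += 1
    return [(low + i * width, low + (i + 1) * width, counts[i])
            for i in range(num_classes)]

# Six classes is an assumption made for this example
table = frequency_table(data, 6)
for lower, upper, count in table:
    print(f"{lower:6.1f} to {upper:6.1f}: {count}")
```

The class counts printed here should match the bar heights of the Excel histogram when the same class boundaries are used.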
Write a report that is suitable to be published in a student newspaper or online news site.
The report should be 2 – 4 paragraphs in length
Excel Example download
EXCEL Lab Report Example download
The audience for your article should be the general public. This is an exercise in creating authentic and professional work, not just outlining steps you did for an assignment. Write this as if it were assigned to you by a supervisor at a new job, not an instructor in a stat class.
As shown in the sample report, yours should include an introduction/background, statistical analysis (both in words and with graphics), and a conclusion based on the analysis.
Your written report should include both a frequency distribution table and the frequency histogram that you created using your data in Excel. (See sample report)
These figures should be titled, formatted, and appropriately sized (see sample report)
Please review the Excel Lab Scoring rubric for additional expectations
MAT308 Excel Lab Scoring Rubric
To submit, you will upload two files to the Canvas assignment link.
The Excel file needs to be uploaded as an actual Excel file (.xls or .xlsx, NOT a PDF)
The written report should be submitted as a PDF file. For help converting a Word document to a PDF, use Google.
Excel Tutorial Videos:
Excel Tutorial: Excel Basics
Excel Tutorial: Histograms
Excel Tutorial: Pie Charts
Excel Tutorial: Frequency Histogram
Excel Tutorial: Measures of Central Tendency
Excel Tutorial: Standard Deviation
Links to Sample Data Sites (Not required to choose one of these):
Centers for Disease Control and Prevention
World Health Organization
Kaggle
Bureau of Labor Statistics
Bureau of Economic Analysis
U.S. Census Bureau
U.S. Small Business Data
Fed Reserve Economic Data
The New York Times
MAT308 Inferential Statistics
Week 4 Assignment
SEA Project – Topic Selection
Each student will select a unique question that they would like to answer. Each question will be answered by collecting, organizing and analyzing data using methods learned in this class. Specifically, each student is required to analyze data using a hypothesis test. Each student must have a unique topic. Students who are repeating the course must select a new topic. The following list is meant to stimulate ideas: these topics may not be used by students.
Sample ideas:
Are gender and political party related? (Test of Association hypothesis test)
Does the number of customers waiting in line vary each day of the week? (Goodness of Fit hypothesis test)
Does a specific method of learning improve grades? (Two sample hypothesis test for means)
Does summer training reduce the rate of fall sports injuries? (Two sample hypothesis test for proportions)
Week 5 Assignment
Reese’s Pieces
Print out the document above, complete the Part 1 questions, and submit them in Canvas by Wednesday night at 11:59 PM. I will gather all of the class data and post it in the Announcements on Thursday. Once I post the summary, please complete Part 2 and submit it in Canvas by Sunday night at 11:59 PM.
Topic: Proportions Activity: Reese’s Pieces
Background Information: The goal of a confidence interval is to estimate a population parameter based on a sample statistic. All confidence intervals have the form: point estimate ± margin-of-error.
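The form "point estimate ± margin of error" can be illustrated for a single proportion. The sketch below assumes the normal-approximation interval with z = 1.96 for 95% confidence, which may differ from the method used in class, and the count of 12 orange candies out of 25 is hypothetical. The function name is made up for this example.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n                              # point estimate
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)    # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 12 orange candies out of 25 (not real class data)
low, high = proportion_confidence_interval(12, 25)
print(f"95% interval for the proportion of orange: {low:.3f} to {high:.3f}")
```

With these hypothetical numbers the interval runs from roughly 0.28 to 0.68, which shows how wide the margin of error is for a sample of only 25.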
Part 1
Example 1: Colors of Reese’s Pieces Consider the population of the Reese’s Pieces candies manufactured by Hershey. Suppose that you want to learn about the distribution of colors of these candies but that you can only afford to take a sample of 25 candies.
a) Take a random sample of 25 candies and record the number and proportion of each color in your sample. (Use this applet to get your sample http://www.rossmanchance.com/applets/OneProp/OneProp.htm?candy=1)
b) Is the proportion of orange candies among the 25 that you selected a parameter or a statistic?
c) Is the proportion of orange candies manufactured by Hershey’s process a parameter or a statistic? What symbol represents it?
d) Do you know the value of the proportion of orange candies manufactured by Hershey?
e) Do you know the value of the proportion of orange candies among the 25 that you selected?
f) Do you suspect that every student in the class obtained the same proportion of orange candies in his/her sample?
Part 2
Class data from Part 1 of the Reese’s Pieces Lab.
h) Did everyone obtain the same number of orange candies in their samples?
i) If every student was to estimate the population proportion of orange candies by the proportion of orange candies in his/her sample, would everyone arrive at the same estimate?
j) Based on what you have learned about random sampling and having the benefit of seeing the sample results of the entire class, take a guess concerning the population proportion of orange candies.
k) Again assuming that each student had access only to her/his sample, would most estimates be reasonably close to the true parameter value? Would some estimates be way off? Explain.
l) In what way would the dotplot have looked different if each student had taken a sample of 10 candies instead of 25? (If unsure, you can check this using the Reese’s Pieces applet: http://www.rossmanchance.com/applets/OneProp/OneProp.htm?candy=1)
m) In what way would the dotplot have looked different if each student had taken a sample of 75 candies instead of 25?
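The effect of sample size on the spread of sample proportions can also be explored with a small simulation, as an alternative to the applet. This is a sketch under stated assumptions: the population proportion of 0.5 is a placeholder (the true value is not given here), and the function name, the number of simulated samples, and the seed are arbitrary choices for illustration.

```python
import random
import statistics

def simulate_sample_proportions(n, p=0.5, num_samples=1000, seed=0):
    """Draw num_samples random samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n
            for _ in range(num_samples)]

# Standard deviation of the simulated sample proportions for each sample size
spread = {n: statistics.pstdev(simulate_sample_proportions(n))
          for n in (10, 25, 75)}

for n, sd in spread.items():
    print(f"n = {n:2d}: spread of sample proportions = {sd:.3f}")
```

Comparing the three printed spreads gives a numerical counterpart to how the dotplots for samples of 10, 25, and 75 candies would differ.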
Week 9 Assignment
SEA – Project Proposal
Each student will post a project proposal to Canvas as a Word document. In your proposal, answer the following questions:
What is your research question? What is your population and your variables? Why are you interested in this question?
How will you collect your data? State your method of sampling using terms discussed in class.
What bias or confounding variables might limit your study?
How will you display your data? State the type of graph you will use.
What type of analysis will you perform? Examples:
Are you determining if two variables are related? (Test of Association hypothesis test)
Are you comparing counts for different categories? (Goodness of Fit hypothesis test)
Are you comparing the means of two sets of data? (Two sample hypothesis test for means)
Are you comparing proportions for two sets of data? (Two sample hypothesis test for proportions)
Please click on the link above to submit your SEA Project Proposal.
Week 11 Assignment
If the Mars Company sorters are working properly then any difference between the color percentage in an actual package of M&Ms and the color percentage posted on the website should be due to random chance. You will investigate this hypothesis in this lab.
Please open this document.
Introduction:
Chi Square Modeling Using M & M’s Candies
Have you ever wondered why the package of M&Ms you just bought never seems to have enough of your favorite color? Or, why is it that you always seem to get the package of mostly brown M&Ms? What’s going on at the Mars Company? Is the number of the different colors of M&Ms in a package really different from one package to the next, or does the Mars Company do something to ensure that each package gets the correct number of each color of M&M? You’ve probably stayed up nights pondering this!
One way that we could determine if the Mars Co. is true to its word is to sample a package of M&Ms and do a type of statistical test known as a “goodness of fit” test. These types of statistical tests allow us to determine if any differences between our observed measurements (counts of colors from our M&M sample) and our expected values (what the Mars Co. claims) are simply due to chance sampling error or to some other reason (e.g., the Mars Co.’s sorters aren’t really doing a very good job of putting the correct number of M&Ms in each package). The goodness of fit test we will be doing today is called a Chi Square Analysis. This test is generally used when we are dealing with discrete data (i.e., count data, or non-continuous data). We will be calculating a statistic called Chi square, or χ², and using a table to determine the probability of getting a particular χ² value. Remember, our probability values tell us how likely it is to see differences at least this large if chance alone (sampling error) were at work.
The Chi Square test (χ²) is often used in science to test whether the data you observe from an experiment are consistent with the data that you would predict from the experiment. This investigation will help you to use the Chi Square test by allowing you to practice it with a population of familiar objects, M&M candies.
Objectives: After this investigation you should be able to:
• write a null hypothesis that pertains to the investigation;
• determine the degrees of freedom (df) for an investigation;
• calculate the χ² value for a given set of data;
• use the critical values table to determine if the calculated value is equal to or less than the critical value;
• determine if the Chi Square value exceeds the critical value and whether the null hypothesis is rejected or not rejected.
M&M DATA (Individual)
Here are the percentages given by M&M on their website for each color.
• Brown = 12%
• Green = 15%
• Red = 12%
• Yellow = 15%
• Orange = 23%
• Blue = 23%
1) Open 2 bags of M&Ms. (If you do not have 2 bags of M&Ms, email me and I will send you a set of data.)
2) Separate the M&Ms into color categories and count the number of each color.
3) Record your M&M color totals in the data table.
Table 1
Color | Brown | Red | Yellow | Green | Orange | Blue | Total Number of M&Ms
Count |       |     |        |       |        |      |
4) Calculate the expected number of M&Ms in your package by multiplying the total number of M&Ms in the package by the color percent listed on page 1 of the activity.
For example, if your package contains 500 M&Ms and you want to find the expected number of red M&Ms, you will need to multiply 500 by 12% (500 x 0.12 = 60). Record your calculations in the data table.
5) Calculate the difference between the observed and expected numbers for each M&M color. Record your calculations in the data table.
6) Square the difference between the observed and expected. Record your calculations in the data table.
7) Divide the square of the difference by the expected. Record your calculations in the data table.
8) Total all the answers from step 7 to determine the chi-square (χ²) value. Record the chi-square (χ²) in the data table.
Colors | Observed (o) | 4) Expected (e) | 5) o − e | 6) (o − e)² | 7) (o − e)² / e
Brown  |              |                 |          |             |
Red    |              |                 |          |             |
Yellow |              |                 |          |             |
Green  |              |                 |          |             |
Orange |              |                 |          |             |
Blue   |              |                 |          |             |
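Steps 4 through 8 can be sketched in a few lines of code as a way to check hand calculations. The color percentages are the ones listed in this lab. The observed counts are hypothetical, chosen only for illustration, and the function name is made up for this example.

```python
# Color proportions as listed in this lab
claimed = {"Brown": 0.12, "Red": 0.12, "Yellow": 0.15,
           "Green": 0.15, "Orange": 0.23, "Blue": 0.23}

# Hypothetical observed counts (illustration only, not real data)
observed = {"Brown": 14, "Red": 10, "Yellow": 17,
            "Green": 13, "Orange": 25, "Blue": 21}

def chi_square_statistic(observed, claimed):
    """Sum of (o - e)^2 / e over all color categories."""
    total = sum(observed.values())
    chi_sq = 0.0
    for color, o in observed.items():
        e = total * claimed[color]   # step 4: expected count
        diff = o - e                 # step 5: observed minus expected
        chi_sq += diff ** 2 / e      # steps 6-8: square, divide by e, and total
    return chi_sq

chi_sq = chi_square_statistic(observed, claimed)
print(f"chi-square = {chi_sq:.4f}")
```

For these hypothetical counts (100 candies in total) the statistic comes out to about 1.55.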
Analysis Questions:
1. What are the null and alternative hypotheses?
Ho:
Ha:
Now you must determine the probability that the difference between the observed and expected values (as summarized by the calculated value of chi square) occurred simply by chance. To do this you will need to compare the calculated value of chi-square with the appropriate value from the Chi Square Distribution Table on the next page. Examine the table. Note the term “degrees of freedom.” For this statistical test the degrees of freedom is equal to the number of classes (color categories) minus one. Complete the following to determine the degrees of freedom for the M&M analysis:
______ (# of color categories) – 1 = ______ (degrees of freedom)
The reason why it is important to consider degrees of freedom is that the value of the chi-square statistic is calculated as a sum over all classes, with each class contributing its squared difference divided by its expected count. The natural increase in the value of chi-square with an increase in classes must be taken into account. Scan across the row corresponding to 5 degrees of freedom. Values of the chi-square are given for several different probabilities ranging from 0.95 on the left to 0.001 on the right. Note that the chi-square increases as the probability decreases. Notice that a chi-square value of about 1.15 or larger would be expected by chance in 95% (0.95) of the cases, whereas one of 11.07 or larger would be expected in only 5% (0.05) of the cases. Use the chi-square value calculated and recorded on the data table to determine the probability for the M&M analysis. If the exact chi-square value is not listed in the table, estimate the probability. Record your answer below.
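As a cross-check on reading the table, the upper-tail probability for a chi-square value can be computed directly. The sketch below uses only the Python standard library and a series expansion of the incomplete gamma function. The helper name `chi2_upper_tail` is made up for this example, and it is intended for moderate values such as those in this lab rather than as a general-purpose routine.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom >= x), via a series expansion."""
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z)
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

# With 6 color categories there are 5 degrees of freedom
p_at_critical = chi2_upper_tail(11.07, 5)   # standard critical value for p = 0.05
print(f"P(chi-square >= 11.07) with 5 df = {p_at_critical:.3f}")
```

A computed chi-square below 11.07 with 5 degrees of freedom corresponds to a probability above 0.05, and a value above 11.07 corresponds to a probability below 0.05.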
2. Draw your Chi Square Curve and put in the critical value (p = 0.05).
3. What is the χ² value for your data?
4. Do you reject or fail to reject your null hypothesis? Explain why.
Week 12 Assignment
SEA Project – Data Collection
Each student will post an Excel spreadsheet to Canvas that provides original data organized in a chart, and displays data in a graph. Guidelines for data collection:
Data must be publicly available, or you must have permission in writing to share that data publicly.
Projects are not going to be submitted to the IRB for approval and thus must focus on publicly available data or surveys.
Sources must be properly cited.
Do not simply use statistics provided in a published study.
No experiments on humans or animals are permitted.
Please click the link above to submit the Data Collection portion of the SEA Course Project.
Week 15 Assignment
Purpose: The Structured External Assignment promotes understanding of the course material. Collecting, organizing, and analyzing data using statistical methods is the cornerstone of this course.
Process: There are four milestones for this project: topic selection, proposal, data collection, and final draft. The rough draft and the final draft are polished, written reports in APA format. Since each milestone is used to complete remaining milestones, each milestone must be completed prior to completing the next milestone.
1. Topic selection
Each student will select a unique question that they would like to answer. Each question will be answered by collecting, organizing and analyzing data using methods learned in this class. Specifically, each student is required to analyze data using a hypothesis test. Each student must have a unique topic. Students who are repeating the course must select a new topic. The following list is meant to stimulate ideas: these topics may not be used by students.
Sample ideas:
• Are gender and political party related? (Test of Association hypothesis test)
• Does the number of customers waiting in line vary each day of the week? (Goodness of Fit hypothesis test)
• Does a specific method of learning improve grades? (Two sample hypothesis test for means)
• Does summer training reduce the rate of fall sports injuries? (Two sample hypothesis test for proportions)
2. Project proposal
Each student will post a project proposal to Canvas as a Word document. In your proposal, answer the following questions:
• What is your research question? What is your population and your variables? Why are you interested in this question?
• How will you collect your data? State your method of sampling using terms discussed in class.
• What bias or confounding variables might limit your study?
• How will you display your data? State the type of graph you will use.
• What type of analysis will you perform? Examples:
o Are you determining if two variables are related? (Test of Association hypothesis test)
o Are you comparing counts for different categories? (Goodness of Fit hypothesis test)
o Are you comparing the means of two sets of data? (Two sample hypothesis test for means)
o Are you comparing proportions for two sets of data? (Two sample hypothesis test for proportions)
3. Data collection
Each student will post an Excel spreadsheet to Canvas that provides original data organized in a chart, and displays data in a graph. Guidelines for data collection:
• Data must be publicly available, or you must have permission in writing to share that data publicly.
• Projects are not going to be submitted to the IRB for approval and thus must focus on publicly available data or surveys.
• Sources must be properly cited.
• Do not simply use statistics provided in a published study.
• No experiments on humans or animals are permitted.
4. Final draft
Each student will submit a Word document to Canvas, where it will be checked by Turnitin for plagiarism. Final drafts must include the sections indicated in the rubric.