Assignment #6: End of Chapter 3 and Chapter 4

Note: Please copy and paste the R steps that perform your calculations into your homework document.

Problem 1
Based on Problems 4.4, 4.5, and 4.6, p. 189. File: Fertility.csv. Read the description of the fields in this file on p. 189. In the following, I am asking you to do something similar to Problems 4.4 and 4.6. Refer to this video: Backward Elimination 4.2.

a. Create a correlation matrix, using cor(), to discover which variables (other than LowAFC) appear to correlate most strongly with the variable MeanAFC. Create a pairs plot to accompany your correlation matrix.

b. Choose 4 to 6 variables (not LowAFC) and create a multi-variable linear model to predict the response variable MeanAFC. Use the backwards elimination technique to reduce the model to just a set of significant variables (use a 0.05 level of significance) based on the p-values for the coefficients. Write the equation for the model and briefly describe the meaning of the terms in the equation.

c. Repeat part b, but use the step() command, which uses the AIC (Akaike Information Criterion) to automatically perform the backwards elimination. Discuss any differences between this model and the one in part b.

d. Similar to Problem 4.6, select 4 to 6 variables from your correlation matrix or pairs plot to create a multi-variable linear model to predict the response variable Embryos. Use either a by-hand backwards elimination or the step() command to reduce the model to a set of significant variables (you may use a significance level of 0.05). Write the equation for the model and briefly describe the meaning of the terms in the equation.

Problem 2
Problem 4.9. File: FirstYearGPA.csv. Complete Problem 4.9 on page 190. In modeling, we often use this idea of cross validation, splitting data into a training set and a testing set (the authors say testing sample and holdback sample), including when using artificial neural nets.
Although you can do what the book suggests, I would suggest a random way to split the data frame into a training set and a testing set. The only reason the authors do not split randomly here is so that everyone gets the same answers when following their instructions. Assuming that you read the file into a data frame called gpa, the following code snippet (similar to one I did in a video, Cross Validation 4.3) will randomly mix the rows and then split the data into the two sets, with 75% of the rows in the training set and 25% of the rows in the testing set:
> rows frac split gpamix gpatrain gpatest
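Only the variable names of that snippet are shown (rows, frac, split, gpamix, gpatrain, gpatest). A minimal sketch of one way such a random 75/25 split could be written with those names follows. The right-hand sides, the use of floor(), the set.seed() call, and the small stand-in data frame are all assumptions made for illustration, not the code from the video.

```r
# Assumption: in the homework, gpa would come from reading FirstYearGPA.csv,
# e.g. gpa <- read.csv("FirstYearGPA.csv").
# A made-up stand-in data frame is used here so the sketch runs on its own;
# its columns (id, y) are hypothetical and are not the real file's fields.
set.seed(1)                              # optional: makes the shuffle repeatable
gpa <- data.frame(id = 1:20, y = rnorm(20))

rows     <- nrow(gpa)                    # number of rows in the data frame
frac     <- 0.75                         # fraction of rows for the training set
split    <- floor(frac * rows)           # position of the last training row
gpamix   <- gpa[sample(rows), ]          # randomly mix (shuffle) the rows
gpatrain <- gpamix[1:split, ]            # first 75% of the shuffled rows
gpatest  <- gpamix[(split + 1):rows, ]   # remaining 25% of the shuffled rows
```

Because the rows are shuffled before they are cut, each run without set.seed() gives a different split, so fitted coefficients and holdout results will vary slightly from one student to the next.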