In this homework, you will be performing linear regression on a data set of your choice.
Perform the following for this homework:

1. Find a data set of your choice from a valid source. You may use a previously used data set. If you have categorical data, you may want to create dummy variables.
2. Split up your data set into a training set and a scoring set. Rename the data sets appropriately. The scoring data set should not include the column that you are trying to predict.
3. Import both data sets into RapidMiner and check the ranges on all attributes. If some observations in the scoring data set for an attribute lie below or above the training data set’s lower or upper bound for that respective attribute, then remove these observations that are outside this range. Take a screenshot of the loaded data sets.
4. Set the role of the attribute in the training stream that you are trying to predict as a label.
5. Perform linear regression by adding the “Linear Regression” operator to the training stream and adding the “Apply Model” operator to connect the training stream to the scoring stream. Take a screenshot of the final process stream.
6. Run the model and take screenshots of both the linear regression results (i.e., table with regression coefficients) and the results of predictions made on the scoring data set. Evaluate and interpret your results. Examine your attribute coefficients and the predictions made in the scoring data set. In your interpretation of results, you should include answers to the following questions:
   a) Which attributes have the greatest weight?
   b) What would the resulting mathematical formula be for the regression line?
   c) Were any attributes dropped from the data set as non-predictors? If so, which ones and why do you think they weren’t effective predictors?
   d) What can you conclude from the predictions made?

Submission Instructions: Please type up your homework using the homework template posted on Blackboard under Assignments. You should include at least four screenshots: (1) data set loaded in RapidMiner, (2) final process stream, (3) linear regression results (i.e., table with regression coefficients), and (4) results of predictions made on the scoring data set. Remember to interpret your results and answer all questions above in step 6. Only a softcopy submission is needed through the assignment link posted on Blackboard.
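
The range check described in step 3 is done inside RapidMiner for this assignment, but the logic may be easier to see written out. The following is only a minimal sketch in plain Python under stated assumptions: the function names `attribute_ranges` and `filter_to_training_range`, the attribute names, and the toy values are all hypothetical and not part of the assignment. It computes the minimum and maximum of each attribute in the training set, then keeps only the scoring observations whose values fall within those bounds.

```python
def attribute_ranges(training_rows):
    """Return {attribute: (min, max)} computed over the training rows."""
    ranges = {}
    for row in training_rows:
        for name, value in row.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


def filter_to_training_range(scoring_rows, ranges):
    """Keep scoring rows whose every attribute lies within the training range."""
    kept = []
    for row in scoring_rows:
        in_range = all(
            ranges[name][0] <= value <= ranges[name][1]
            for name, value in row.items()
            if name in ranges
        )
        if in_range:
            kept.append(row)
    return kept


# Hypothetical toy data: "price" is the label, so it appears only in training.
training = [
    {"sqft": 900, "age": 30, "price": 150},
    {"sqft": 1500, "age": 10, "price": 260},
    {"sqft": 2100, "age": 5, "price": 340},
]
scoring = [
    {"sqft": 1200, "age": 20},  # inside both training ranges
    {"sqft": 2500, "age": 8},   # sqft above the training maximum
    {"sqft": 1000, "age": 40},  # age above the training maximum
]

ranges = attribute_ranges(training)
filtered = filter_to_training_range(scoring, ranges)
print(filtered)
```

In this toy case only the first scoring observation survives, because the other two each have one attribute outside the range seen in training. Removing such observations keeps the model from extrapolating beyond the data it was fitted on.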