August 10, 2023

modernstatisticswithr.com: Fit a kNN classification model to the wine data, using pH, alcohol, fixed.acidity, and residual.sugar as explanatory variables. Evaluate its performance using 10-fold cross-validation, using AUC to choose the best k.

1. Drills with R on K-NN models

To solve the problem, you’ll need to load the data and libraries with:

# Import data about white and red wines:
white <- read.csv("https://tinyurl.com/winedata1", sep = ";")
red <- read.csv("https://tinyurl.com/winedata2", sep = ";")

# Add a type variable:
white$type <- "white"
red$type <- "red"

# Merge the datasets:
wine <- rbind(white, red)
wine$type <- factor(wine$type)

install.packages("caret", dependencies = TRUE)
library(caret)

# To visualize the results you need the following:
install.packages("MLeval", dependencies = TRUE)
library(MLeval)

For the submission:

Provide, in plain text, the commands that you used to solve the problem.

Attach the figure produced by the command: plots$roc

Provide the output of the command: plots$optres[[1]][13,]

Attach the figure produced by the command: plots$cc
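The following is a minimal sketch of one way to set up the model described in the task, assuming the wine data frame built by the commands above. The helper name fit_knn_auc and its n_k argument are hypothetical and not part of the assignment; centering and scaling the predictors is an added assumption, made because kNN is sensitive to variable scale.

```r
library(caret)

# Hypothetical helper (not from the assignment): fit kNN with 10-fold
# cross-validation and choose k by the cross-validated AUC.
fit_knn_auc <- function(data, n_k = 15) {
  tc <- trainControl(method = "cv", number = 10,
                     summaryFunction = twoClassSummary, # reports ROC (AUC), Sens, Spec
                     classProbs = TRUE,                 # class probabilities are needed for AUC
                     savePredictions = TRUE)            # kept so the fit can be plotted later
  train(type ~ pH + alcohol + fixed.acidity + residual.sugar,
        data = data,
        method = "knn",
        preProcess = c("center", "scale"),  # assumption: put predictors on a common scale
        metric = "ROC",                     # caret's name for the AUC under twoClassSummary
        tuneLength = n_k,                   # number of candidate k values to try
        trControl = tc)
}

# Usage with the wine data built above:
# m <- fit_knn_auc(wine)
# m$bestTune                  # k with the highest cross-validated AUC
# plots <- MLeval::evalm(m)   # then inspect plots$roc, plots$optres[[1]][13, ], plots$cc
```

Because the folds are drawn at random, calling set.seed() before fitting makes the chosen k reproducible.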

2. Dissimilarities between data objects

This project demonstrates how to measure dissimilarities between data objects. The topics are mostly covered in Chapter 6, "Statistical Machine Learning", of "Practical Statistics for Data Scientists". Cover the following in the project:

Find some example data and show how to calculate

Euclidean distance

L1 distance
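As an illustration of the two calculations, here is a small sketch in base R. The vectors x and y are made-up values, not taken from any dataset, and all variable names are chosen only for this example.

```r
# Two made-up observations (hypothetical values, not from any dataset)
x <- c(1, 2, 3)
y <- c(4, 6, 3)

# Euclidean (L2) distance: square root of the summed squared differences
euclidean <- sqrt(sum((x - y)^2))   # sqrt(9 + 16 + 0) = 5

# L1 (Manhattan) distance: sum of the absolute differences
manhattan <- sum(abs(x - y))        # 3 + 4 + 0 = 7

# Cross-check with base R's dist(), which compares the rows of a matrix
pts <- rbind(x, y)
dist_l2 <- as.numeric(dist(pts, method = "euclidean"))
dist_l1 <- as.numeric(dist(pts, method = "manhattan"))
```

With real data the variables usually sit on different scales, so standardizing them first (for example with scale()) keeps one variable from dominating either distance.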

Prove or disprove that the Euclidean and L1 distances satisfy

Positivity: d(x,y) >= 0 for all x and y, and d(x,y) == 0 only if x == y.

Symmetry: d(x,y) == d(y,x) for all x and y.

Triangle inequality: d(x,z) <= d(x,y) + d(y,z) for all points x, y, and z.
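A numerical check cannot replace a proof, but it can catch a wrong conjecture before you try to prove it. The sketch below tests the three properties on random points; the function names l1, l2 and check_axioms are hypothetical, and the small tolerance is an assumption made to absorb floating-point rounding.

```r
set.seed(42)

l1 <- function(a, b) sum(abs(a - b))        # L1 (Manhattan) distance
l2 <- function(a, b) sqrt(sum((a - b)^2))   # Euclidean distance

# Hypothetical helper: TRUE if no violation is found on random triples
check_axioms <- function(d, n_trials = 1000, dim = 4, tol = 1e-12) {
  ok <- TRUE
  for (i in seq_len(n_trials)) {
    x <- rnorm(dim); y <- rnorm(dim); z <- rnorm(dim)
    ok <- ok &&
      d(x, y) >= 0 && d(x, x) == 0 &&            # positivity
      isTRUE(all.equal(d(x, y), d(y, x))) &&     # symmetry
      d(x, z) <= d(x, y) + d(y, z) + tol         # triangle inequality
  }
  ok
}

check_axioms(l1)   # TRUE
check_axioms(l2)   # TRUE
```

Finding no counterexample is only evidence. The proofs themselves rest on |a + b| <= |a| + |b| for the L1 distance and on the Cauchy-Schwarz inequality for the Euclidean distance.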

Explain whether it is possible, and why or why not, to

rearrange the data so that Euclidean distance has the same meaning as Hamming distance

show that the measure d = 1 - cos(x,y) satisfies positivity, symmetry, and the triangle inequality
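Both questions can be explored numerically before writing the argument. The sketch below assumes one possible reading of "rearranging" the data, namely coding each attribute as 0/1, and uses small hand-picked vectors; all names and values are hypothetical.

```r
# Hamming vs. Euclidean on 0/1-coded data (made-up vectors)
a <- c(1, 0, 1, 1, 0)
b <- c(0, 0, 1, 0, 1)
hamming   <- sum(a != b)      # number of positions that differ: 3
sq_euclid <- sum((a - b)^2)   # squared Euclidean distance: also 3

# Cosine dissimilarity d = 1 - cos(x, y)
cos_dist <- function(u, v) 1 - sum(u * v) / (sqrt(sum(u^2)) * sqrt(sum(v^2)))

x <- c(1, 0); y <- c(1, 1); z <- c(0, 1)
cos_dist(x, z)                     # 1
cos_dist(x, y) + cos_dist(y, z)    # about 0.586, smaller than cos_dist(x, z)
cos_dist(x, 2 * x)                 # 0, although x and 2 * x are different points
```

On 0/1 data each differing position contributes exactly 1 to the squared difference, so the squared Euclidean distance counts mismatches just as the Hamming distance does. For the cosine measure, the triple x, y, z violates the triangle inequality, and x versus 2 * x shows that a distance of zero does not imply equal points.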

Draw conclusions about what is important when choosing the distance measure for the evaluation of dissimilarities between data objects.

Assignments 1 and 2 are to be submitted as two separate papers in APA format.

Requirements: 5-7 pages
