Computer Science Classification with K-Nearest Neighbors
Classification with K-Nearest Neighbors
The k-nearest neighbor (kNN) is used for pattern classification, regression models, and is ideal for data mining. Some real-world examples of its use include determining credit card ratings, identifying who’s likely to default on a loan, detecting unusual patterns in credit card usage, or predicting the future value of stocks.
Step 1: To perform a classification using the k-nearest neighbors algorithm, complete the following:
Access https://archive.ics.uci.edu/datasets?Task=Classification&NumInstances=1000-inf&NumAttributes=10-100&skip=0&take=10&sort=desc&orderBy=NumHits&search=
Note: There are about 120 datasets that are suitable for use in a classification task. For this part of the exercise, you must choose one of these datasets, provided it includes at least 10 attributes and 10,000 instances.
Discuss the origin of the data and assess whether it was obtained in an ethical manner.
Step 2: For your selected dataset, build a classification model as follows, this can be done in either RStudio or Python:
Explain the dataset and the type of information you wish to gain by applying a classification method.
Explain the k-nearest neighbors algorithm and how you will be using it in your analysis (list the steps, the intuition behind the mathematical representation, and address its assumptions). Assume k = 5 using the Euclidian distance. Explain the value of k.
Import the necessary libraries, then read the dataset into a data frame and perform initial statistical exploration.
Clean the data and address unusual phenomena (e.g., normalization, outliers, missing data, encoding); use illustrative diagrams and plots and explain them.
Formulate two questions that can be answered by applying a classification method using the k-nearest neighbors method.
Split the data into 80% training and 20% testing sets.
Train the k-nearest neighbors classifier on the training set using the following parameters: k = 5, metric = ‘minkowski’, p = 2.
Make classification predictions.
Interpret the results in the context of the questions you asked.
Validate your model using a confusion matrix, accuracy score, ROC-AUC curves, and k-fold cross validation. Then explain the results.
Include all mathematical formulas used and graphs representing the final outcomes.
Step 3: Prepare a comprehensive technical report as an RMarkdown document or Jupyter notebook, including all code, code comments, all outputs, plots, and analysis.
Make sure the project documentation contains:
a) Problem statement
b) Algorithm of the solution
c) Analysis of the findings
d) References
Collepals.com Plagiarism Free Papers
Are you looking for custom essay writing service or even dissertation writing services? Just request for our write my paper service, and we'll match you with the best essay writer in your subject! With an exceptional team of professional academic experts in a wide range of subjects, we can guarantee you an unrivaled quality of custom-written papers.
Get ZERO PLAGIARISM, HUMAN WRITTEN ESSAYS
Why Hire Collepals.com writers to do your paper?
Quality- We are experienced and have access to ample research materials.
We write plagiarism Free Content
Confidential- We never share or sell your personal information to third parties.
Support-Chat with us today! We are always waiting to answer all your questions.