March 14, 2022


Discuss machine learning, the fundamentals of the algorithms mentioned in the attachment below and others you can find out about or recall from your own previous experience.

Discuss how to evaluate an ML approach: what is a confusion matrix, and what measures are used?

1 page, APA format, in-text citations, references included

MACHINE_LEARNING.docx

What is machine learning (ML)?

Machine learning is the process of providing data to an algorithm in order to obtain a prediction or an optimization that applies not to the data provided but to a new example of the same kind of data. This differs from using a computer to calculate a number, say a mean or a standard deviation: in that case the result applies to the data supplied, whereas in ML the “learned” process is applied to a new set of data rather than to the original set.

There are two major steps in ML. The first is data gathering (everything from collecting the data to transforming it and preparing it for modeling), and the second is the inference step (the one where the algorithm actually makes a prediction or estimate).

There are two major types of ML approaches: supervised and unsupervised learning. In the supervised approach, the type of outcome is determined in advance; whether it is a number, a label, or a set of numbers, there is an expected output. In unsupervised learning, the output is not determined in advance: the model runs and information is extracted after the process ends, but no specific output is expected (Larrañaga et al., 2006).
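The contrast between the two approaches can be sketched in R. This is a minimal illustration, not a method from the attachment: it uses R's built-in `iris` data, a simple nearest-centroid rule as a stand-in supervised method, and `kmeans` as a stand-in unsupervised method. The `new_flower` values are made up.

```r
# Built-in iris data: four numeric measurements plus a known Species label.
features <- iris[, 1:4]
labels   <- iris$Species

# Supervised: the known labels are part of the training input.
# Here each class is summarized by the mean of its measurements.
centroids <- aggregate(features, by = list(Species = labels), FUN = mean)

# Hypothetical new observation (made-up measurements).
new_flower <- c(5.0, 3.4, 1.5, 0.2)

# Predict the label of the class whose centroid is closest.
dists <- apply(centroids[, -1], 1,
               function(ctr) sqrt(sum((ctr - new_flower)^2)))
predicted <- as.character(centroids$Species[which.min(dists)])
print(predicted)

# Unsupervised: only the measurements are supplied, no labels.
# The output is a cluster number per row, to be interpreted afterwards.
set.seed(42)
clusters <- kmeans(features, centers = 3)$cluster
print(table(clusters))
```

In the supervised half the answer is one of the known species names; in the unsupervised half the cluster numbers carry no meaning until someone inspects them.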

Supervised models:

As described above, these are models that have an expected output type: the value is not known, but the class of output is defined.

1. Bayesian classifier: This is a basic classification scheme in which Bayes' theorem combines the prior probability of each class with the data on hand to calculate a posterior probability for each possible classification; the classification with the highest posterior probability is chosen as the best. One form of Bayesian classifier is naïve Bayes. It is called “naïve” because it assumes that the predictive variables (the available data) are completely independent of one another given the class, which is probably not a very realistic assumption, hence the name. This is the classifier we will use for our practical example below.

2. Classification tree: This is also an intuitive solution, in which the training data is taken through a series of decisions and the decision points are stored, as shown below.

Those decision points can then be used for new data.

3. Neural networks: These are based on the idea of representing the network of neurons in a human brain. An input is passed to a series of artificial neurons, which classify the input using an assigned threshold. Neural networks can be multilayered, and more complex classification schemes can be assigned to each neuron.

4. Support Vector Machines: In this algorithm the data is classified by partitioning it with a boundary. With two predictive variables the boundary is a line and the partition is easy to see: whatever lies on one side gets one label and whatever lies on the other side gets a different label. Real classification problems usually involve many predictive variables, so the partition is made in many dimensions, where the boundary is called a hyperplane. Since humans are limited to three dimensions, these hyperplanes are hard to visualize, but SVMs perform very well and are a powerful classification tool.
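The posterior calculation described for the naïve Bayes classifier in item 1 can be sketched by hand in R. All probabilities below are made-up numbers for illustration, and the names `prior`, `lik_owns_home`, and `lik_employed` are hypothetical; this is not how the `e1071` package is implemented internally.

```r
# Hypothetical prior probability of each class, P(class).
prior <- c(good = 0.7, bad = 0.3)

# Hypothetical likelihoods of two observed feature values, P(feature | class).
lik_owns_home <- c(good = 0.6, bad = 0.2)
lik_employed  <- c(good = 0.9, bad = 0.5)

# Naive assumption: features are independent given the class,
# so the joint likelihood is the product of the per-feature likelihoods.
unnormalized <- prior * lik_owns_home * lik_employed

# Bayes' theorem: divide by the total so the posteriors sum to 1.
posterior <- unnormalized / sum(unnormalized)
print(round(posterior, 3))

# The class with the highest posterior probability is the prediction.
best <- names(which.max(posterior))
print(best)
```

With these numbers the "good" class wins because both its prior and its likelihoods are larger.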

Unsupervised models

In the unsupervised model, the approach is to discover new patterns with no specific output expected. Clustering algorithms are examples of unsupervised models. The nearest neighbor approach can be made unsupervised if no specific labels are provided and the algorithm assigns labels to the data itself. If labels are provided to the system, it is a supervised approach.

Nearest neighbor: In this approach, the label for a particular data point is assigned based on the label already assigned to its closest neighbor. This extends to k-nearest neighbors, in which the label is assigned based on the labels of the point's k closest neighbors.
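A minimal base-R sketch of the k-nearest-neighbors idea, in its supervised form, follows. The training points, labels, and the helper name `knn_predict` are all hypothetical, and the sketch assumes Euclidean distance and a majority vote among the k neighbors.

```r
# Hypothetical training points (two numeric features each) and their labels.
train_x <- matrix(c(1, 1,
                    1, 2,
                    2, 1,
                    6, 6,
                    7, 6,
                    6, 7), ncol = 2, byrow = TRUE)
train_y <- c("a", "a", "a", "b", "b", "b")

knn_predict <- function(new_point, x, y, k = 3) {
  # Repeat the new point once per training row so the subtraction lines up.
  new_rows <- matrix(new_point, nrow(x), ncol(x), byrow = TRUE)
  # Euclidean distance from the new point to every training point.
  dists <- sqrt(rowSums((x - new_rows)^2))
  # Labels of the k closest training points.
  nearest <- y[order(dists)[1:k]]
  # Majority vote; on a tie, the first label in alphabetical order wins.
  names(which.max(table(nearest)))
}

print(knn_predict(c(2, 2), train_x, train_y, k = 3))
print(knn_predict(c(6, 5), train_x, train_y, k = 3))
```

Setting `k = 1` reduces this to the plain nearest neighbor rule described above.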

R HANDS ON

Follow the steps to produce predictions with the naïve Bayes method. The datasets will be provided for you in the class portal.

> install.packages("e1071")

> library(e1071)

> credit_training <- read.csv("credit_training.csv", stringsAsFactors = TRUE)

> credit_test <- read.csv("credit_test.csv", stringsAsFactors = TRUE)

> history_test <- read.csv("history_test.csv", stringsAsFactors = TRUE)

> history_training <- read.csv("history_training.csv", stringsAsFactors = TRUE)

# NOTE that myuniquenumber is an integer between 300 & 600

# you must pick the value and set that variable before running the next lines

> training <- credit_training[1:myuniquenumber, ]

> history <- history_training[1:myuniquenumber, ]

> clsf <- naiveBayes(training, history, laplace = 0)

> predictions <- predict(clsf, credit_test)

> table(predictions)

#copy and paste the output

> table(history_test)

#copy and paste output
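The two tables above give label counts only. To see how predicted labels line up with true labels, they can be cross-tabulated into a confusion matrix. The sketch below uses small made-up label vectors, not the class datasets, so that it runs on its own; the label names "good" and "bad" and the choice of "good" as the positive class are assumptions.

```r
# Hypothetical labels, made up for illustration; not the class data.
actual    <- factor(c("good", "good", "bad", "bad", "good", "bad", "good", "bad"),
                    levels = c("bad", "good"))
predicted <- factor(c("good", "bad", "bad", "good", "good", "bad", "good", "bad"),
                    levels = c("bad", "good"))

# Confusion matrix: rows are predicted labels, columns are actual labels.
confusion <- table(predicted = predicted, actual = actual)
print(confusion)

# Treat "good" as the positive class.
tp <- confusion["good", "good"]  # predicted good, actually good
tn <- confusion["bad", "bad"]    # predicted bad, actually bad
fp <- confusion["good", "bad"]   # predicted good, actually bad
fn <- confusion["bad", "good"]   # predicted bad, actually good

# Common evaluation measures derived from the four counts.
accuracy  <- (tp + tn) / sum(confusion)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
print(c(accuracy = accuracy, precision = precision, recall = recall))
```

With the class datasets, and assuming `history_test` holds a single column of true labels in the same row order as `credit_test`, the same table could be built with `table(predicted = predictions, actual = history_test[[1]])`.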

REFERENCES

Larrañaga, P., Calvo, B., Santana, R., Bielza, C., Galdiano, J., Inza, I., Lozano, J. A., Armañanzas, R., Santafé, G., Pérez, A., & Robles, V. (2006). Machine learning in bioinformatics. Briefings in Bioinformatics, 7(1), 86–112. https://doi.org/10.1093/bib/bbk007
