1) (35 points) Consider a corpus that contains the five documents in Table 1. You may use Python for this question; if you do, submit your Python code as well.
Table 1. Corpus
Doc1: Decide which attribute the decision tree algorithm would choose.
Doc2: A decision tree is a classification algorithm that is widely used in machine learning.
Doc3: Making a decision to put a tree is very difficult due to lack of power for the decision.
Doc4: Language decision varies from person to person and time to time.
Doc5: Decision trees are different from binary trees or binary search trees.
a) Build a term-document matrix based on the raw count of each term for the corpus in Table 1, after removing stopwords and lemmatizing the sentences. Use only nouns and verbs to build the term-document matrix.
b) Build a term-document matrix based on the tf-idf weight of each term for the corpus in Table 1, after removing stopwords and lemmatizing the sentences. Use only nouns and verbs to build the term-document matrix.
Show how you calculated tf-idf.
Use the stopwords provided by NLTK, listed here:
{'of', 'against', 'll', 'they', 'aren', 'our', 'that', 'shouldn', 'only', 'shan', 'o', "isn't", 'been', "weren't", "you've", 'myself', 'as', 'once', 'my', 'both', 'too', 'be', 'should', 'hadn', 'in', 'does', "you'll", 'during', 'herself', 'will', 'any', 'was', 'how', 'which', "didn't", 'but', 'had', 'more', 'needn', 'further', 'whom', 'mustn', 'no', 'did', "aren't", 'or', 'on', 'down', 'them', 'to', 'same', "shouldn't", "should've", "mightn't", "it's", 'between', 'before', 'he', 'here', "hadn't", 'have', 'if', "you're", 'haven', 'under', 'nor', 't', 'can', 're', 'it', 'y', 'where', 'then', 'she', 'own', 'hers', 'is', 'isn', 'each', 'don', 'now', 'by', 'than', "hasn't", 'his', 'who', 'above', 'this', "mustn't", 'their', "couldn't", 'there', 'couldn', 'over', "you'd", 'm', 'doing', 'when', 'into', 'i', 'other', 'a', 'ours', 'because', 'we', 'an', 'weren', 'most', 'for', 'wasn', "won't", 'up', "shan't", 'while', 'your', 'am', 'through', 'after', "don't", 'theirs', 'ain', 'him', 'having', 'until', 'those', 'yourself', 'off', 'just', 'below', 'didn', "wouldn't", "that'll", 'out', 'mightn', 'ma', 'wouldn', 'such', 'won', 'all', 'the', 'has', 'ourselves', 'doesn', 'some', 'few', 'these', 'and', "needn't", "doesn't", 'what', 'with', 'very', 'himself', 'do', 'again', 'd', 'yours', 'are', "wasn't", 'not', 'being', 'were', 'from', 'me', 've', 'why', 'itself', 's', 'so', 'hasn', 'her', "she's", 'you', "haven't", 'themselves', 'its', 'at', 'yourselves', 'about'}
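The two matrices in parts a) and b) could be computed along the following lines. This is only a minimal sketch under stated assumptions, not a reference solution. The token lists in `docs` were prepared by hand (stopwords removed, words lemmatized, only nouns and verbs kept) and are an assumption: an actual NLTK pipeline using `pos_tag` and `WordNetLemmatizer` may tag words such as "learning" or "search" differently. The question does not fix a tf-idf variant, so the sketch assumes tf is the raw count and idf = log10(N / df), where N is the number of documents and df is the number of documents containing the term. The function names `count_matrix` and `tfidf_matrix` are hypothetical.

```python
import math
from collections import Counter

# ASSUMPTION: hand-prepared noun/verb lemmas, not the output of an NLTK tagger.
docs = {
    "Doc1": ["decide", "attribute", "decision", "tree", "algorithm", "choose"],
    "Doc2": ["decision", "tree", "classification", "algorithm", "use",
             "machine", "learning"],
    "Doc3": ["make", "decision", "put", "tree", "lack", "power", "decision"],
    "Doc4": ["language", "decision", "vary", "person", "person", "time", "time"],
    "Doc5": ["decision", "tree", "tree", "search", "tree"],
}


def count_matrix(docs):
    """Return (sorted vocabulary, {term: [raw count in each document]})."""
    per_doc = {name: Counter(tokens) for name, tokens in docs.items()}
    vocab = sorted({t for tokens in docs.values() for t in tokens})
    matrix = {t: [per_doc[name][t] for name in docs] for t in vocab}
    return vocab, matrix


def tfidf_matrix(matrix, n_docs):
    """Weight each raw count by idf = log10(N / df) (assumed variant)."""
    weighted = {}
    for term, row in matrix.items():
        df = sum(1 for c in row if c > 0)  # documents containing the term
        idf = math.log10(n_docs / df)
        weighted[term] = [c * idf for c in row]
    return weighted


vocab, counts = count_matrix(docs)
tfidf = tfidf_matrix(counts, len(docs))

# Print the tf-idf matrix with terms as rows and documents as columns.
print("term".ljust(16) + "".join(name.rjust(8) for name in docs))
for term in vocab:
    print(term.ljust(16) + "".join(f"{w:8.3f}" for w in tfidf[term]))
```

Under these assumptions, "decision" occurs in all five documents, so its idf is log10(5/5) = 0 and its tf-idf weight is 0 everywhere, while a term confined to one document, such as "person" in Doc4, gets idf = log10(5/1). Other common variants (for example, a log-scaled tf or a natural-log idf) would change the numbers but not the procedure.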