IT446 Data mining
College of Computing and Informatics

Question One (2 Marks)
Learning Outcome(s): 4. Evaluate the performance of data mining algorithms.

An online streaming platform wants to analyze whether there is a significant association between users’ subscription plans (Basic, Premium, Family) and their preferred device for streaming (Smartphone, Tablet, Laptop). The platform randomly samples 300 subscribers and records their subscription plans and preferred streaming devices. Based on the data, can the streaming platform conclude that there is a relationship between subscription plans and preferred streaming devices among its users? (Use the chi-square test with a significance level (α) of 0.05.)

           Smartphone   Tablet   Laptop
Basic          50         30       20
Premium        60         40       20
Family         30         20       30

Question Two (2 Marks)
Learning Outcome(s): 2. Demonstrate a wide range of clustering, estimation, prediction, and classification algorithms to solve a specific program or application.

It is important to define/select similarity measures in data analysis. However, there is no commonly accepted subjective similarity measure. Results can vary depending on the similarity measures used. Nonetheless, seemingly different similarity measures may be equivalent after some transformation.

Suppose we have the following 2-D data set:

Consider the data as a pair of data points. Given a new data point, x = (1.4, 1.6), as a query, rank the database points based on similarity with the query using Euclidean distance, Manhattan distance, supremum distance, and cosine similarity.

Question Three (2 Marks)
Learning Outcome(s): 3. Employ data mining and data warehousing techniques to solve real-world problems.

Suppose a group of 12 sales price records has been sorted as follows:
5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215.
Partition them into three bins by each of the following methods:
(a) equal-frequency (equal-depth) partitioning
(b) equal-width partitioning

Question Four (2 Marks)
Learning Outcome(s): 1. Define different data mining tasks, problems, and the algorithms most appropriate for addressing them.

Suppose that a data warehouse consists of four dimensions (date, spectator, location, and game) and two measures (count and charge), where charge is the fare that a spectator pays when watching a game on a given date. Spectators may be students, adults, or seniors, with each category having its own charge rate.
(a) Draw a star schema diagram for the data warehouse.
(b) Starting with the base cuboid [date, spectator, location, game], what specific OLAP operations should be performed to list the total charge paid by student spectators at GM Place in 2010?
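
For Question One, the chi-square test of independence can be sketched in a few lines of plain Python. This is a minimal sketch, not a prescribed solution: the function name `chi_square_statistic` is illustrative, and the critical value 9.488 (4 degrees of freedom, α = 0.05) is taken from a standard chi-square table rather than from the assignment text.

```python
# Observed counts from the Question One table
# (rows: plan; columns: Smartphone, Tablet, Laptop).
observed = {
    "Basic": [50, 30, 20],
    "Premium": [60, 40, 20],
    "Family": [30, 20, 30],
}


def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(rows):
        for j, obs in enumerate(row):
            # Expected count under independence of plan and device.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Assumption: critical value for df = 4 at alpha = 0.05, read from a standard table.
CRITICAL_VALUE = 9.488

statistic, dof = chi_square_statistic(observed)
reject_null = statistic > CRITICAL_VALUE
print(f"chi-square = {statistic:.3f}, df = {dof}, reject H0: {reject_null}")
```

With these counts the statistic is about 12.72 on 4 degrees of freedom, which exceeds 9.488, so the null hypothesis of independence would be rejected at α = 0.05.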
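
For Question Two, the four measures can be written as small helper functions. The data set itself does not appear in the question text, so the points in `hypothetical_points` below are placeholders invented purely for illustration and must be replaced with the actual database points. Only the query x = (1.4, 1.6) comes from the question. The helper names are likewise assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def supremum(a, b):
    """Supremum (L-infinity, Chebyshev) distance."""
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(points, query, measure, larger_is_more_similar=False):
    """Return point names ordered from most to least similar to the query."""
    return sorted(
        points,
        key=lambda name: measure(points[name], query),
        reverse=larger_is_more_similar,
    )


query = (1.4, 1.6)  # from the question

# HYPOTHETICAL placeholder points, NOT the assignment's data set.
hypothetical_points = {"p1": (1.0, 2.0), "p2": (2.0, 1.0), "p3": (1.5, 1.5)}

for label, measure in [("Euclidean", euclidean),
                       ("Manhattan", manhattan),
                       ("Supremum", supremum)]:
    print(label, rank(hypothetical_points, query, measure))

# Cosine is a similarity, so larger values rank first.
print("Cosine", rank(hypothetical_points, query, cosine_similarity,
                     larger_is_more_similar=True))
```

The three distances rank ascending (smaller means more similar), while cosine similarity ranks descending, which is why `rank` takes an explicit direction flag.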
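
For Question Three, both partitioning methods can be sketched as follows. The function names are illustrative, and the equal-width version assumes the common convention that each bin is closed on the left and that the maximum value falls into the last bin.

```python
# Sorted sales prices from Question Three.
prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]


def equal_frequency_bins(sorted_values, k):
    """Split sorted values into k bins holding the same number of items.

    Assumes len(sorted_values) is divisible by k, as it is here (12 / 3).
    """
    size = len(sorted_values) // k
    return [sorted_values[i * size:(i + 1) * size] for i in range(k)]


def equal_width_bins(values, k):
    """Split values into k bins that each span (max - min) / k."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    bins = [[] for _ in range(k)]
    for v in values:
        # Clamp so that the maximum value lands in the last bin.
        index = min(int((v - lo) / width), k - 1)
        bins[index].append(v)
    return bins


frequency_bins = equal_frequency_bins(prices, 3)
width_bins = equal_width_bins(prices, 3)
print("equal-frequency:", frequency_bins)
print("equal-width:    ", width_bins)
```

Equal-frequency partitioning yields three bins of four values each. Equal-width partitioning uses a width of (215 - 5) / 3 = 70, so nine values fall into the first bin, only 92 into the second, and 204 and 215 into the third, which shows how outliers skew equal-width bins.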
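
For Question Four (b), one plausible sequence of OLAP operations is to roll up date to the year level, roll up game to "all", and then select student, GM Place, and 2010. The sketch below mimics that sequence on a tiny in-memory fact table. Every row in `fact_rows` is hypothetical, and the concept hierarchies (date to year, game to all) are assumptions, since the question does not specify them.

```python
from collections import defaultdict

# HYPOTHETICAL base-cuboid rows, invented for illustration only:
# (date, spectator status, location, game, count, charge)
fact_rows = [
    ("2010-03-14", "student", "GM Place", "hockey", 1, 20.0),
    ("2010-07-02", "student", "GM Place", "basketball", 1, 15.0),
    ("2010-07-02", "adult", "GM Place", "basketball", 1, 40.0),
    ("2009-11-20", "student", "GM Place", "hockey", 1, 18.0),
    ("2010-05-05", "student", "Other Arena", "hockey", 1, 22.0),
]


def roll_up(rows):
    """Roll up date to year and game to 'all', summing the charge measure."""
    totals = defaultdict(float)
    for date, status, location, _game, _count, charge in rows:
        year = int(date[:4])  # date -> year
        totals[(year, status, location)] += charge  # game dimension dropped
    return totals


def select_cell(cuboid, year, status, location):
    """Slice/dice: pick one cell of the rolled-up cuboid."""
    return cuboid.get((year, status, location), 0.0)


cuboid = roll_up(fact_rows)
total_charge = select_cell(cuboid, 2010, "student", "GM Place")
print("Total charge, students at GM Place in 2010:", total_charge)
```

In a real warehouse the same steps would be expressed as roll-up and dice operations on the cube, or as a grouped SQL query over the fact table joined to its dimension tables.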
