Faculty of Engineering and Technology
Ramaiah University of Applied Sciences

Department: Computer Science and Engineering
Programme: B. Tech
Semester/Batch: 8th / 2016
Course Code: CSC409A
Course Title: Data Analytics
Course Leader: Mohan Kumar K N

Assignment 01
Register No:
Name of Student:

Marking Scheme
(Columns: Sections, Marking Scheme, Max Marks, First Examiner Marks, Moderator Marks)

Part A
A1.1  Data Analytics and its applications                  04
A1.2  Real world use cases: industries, technologies       04
A1.3  Barriers to adoption: technical and non-technical    04
A1.4  Future of analytics: best guess                      04
A1.5  Justification and stance taken                       04
Part A Max Marks                                           20

Part B.1
B1.1  Phase 2 tools (any two)                              08
B1.2  Phase 4 tools (any two)                              08
B1.3  Kinds of use scenarios                               04
Part B.1 Max Marks                                         20

Part B.2
B2.1  Introduction to the recommended method(s)            06
B2.2  Suggestion and selection of the attributes           08
B2.3  Justification                                        06
Part B.2 Max Marks                                         20

Part B.3
B3.1  Recommend relevant solution                          08
B3.2  Issues with information retrieval                    04
B3.3  Justification                                        08
Part B.3 Max Marks                                         20

Part B.4
B4.1  Introduction to big data platform                    04
B4.2  Problem solving approach                             04
B4.3  Design and implementation                            04
B4.4  Performance analysis                                 04
Part B.4 Max Marks                                         20

Total Assignment Marks                                     100

Course Marks Tabulation
(Columns: Component - CET B Assignment, First Examiner, Remarks, Second Examiner, Remarks)
A
B.1
B.2
B.3
B.4
Marks (Max 50)
Marks (out of 25)

Signature of First Examiner          Signature of Second Examiner

Please note:
1. Documentary evidence for all the components/parts of the assessment, such as the reports, photographs, and laboratory exam / tool tests, must be attached to the assignment report in proper order.
2. The First Examiner is required to mark comments in RED ink, and the Second Examiner's comments should be in GREEN ink.
3. The marks for all the questions of the assignment have to be written only in the Component CET B: Assignment table.
4. If the variation between the marks awarded by the first examiner and the second examiner lies within +/- 3 marks, then the marks allotted by the first examiner are considered final. If the variation is more than +/- 3 marks, then both examiners should resolve the issue in consultation with the Chairman BoE.

Assignment 1
Term 1

Instructions to students:
1. The assignment consists of 5 questions: Part A has 1 question, Part B has 4 questions.
2. The maximum marks are 100.
3. The assignment has to be neatly word processed as per the prescribed format.
4. The maximum number of pages should be restricted to 25.
5. Restrict your report for Part A to 5 pages only.
6. Restrict your report for Part B to a maximum of 20 pages.
7. The printed assignment must be submitted to the course leader.
8. Submission date: 31/03/2020
9. Submission after the due date is not permitted.
10. IMPORTANT: It is essential that all the sources used in preparation of the assignment are suitably referenced in the text.
11. Marks will be awarded only to the sections and subsections clearly indicated as per the problem statement/exercise/question.

Preamble:
The course is intended to teach the design, development, analysis and evaluation of Data Analytics applications. Employing appropriate techniques, methods and technology in various domains of computing is discussed. Data mining algorithms, tuning them for a given application, and actionable interpretations are emphasized. The course helps to solve practical applications with data analysis, turning business intelligence into real-world outcomes. Students are trained to analyze, visualize and interpret the data and its associated implicit insights.

PART A                                                     20 Marks

Data Analytics is the science of analyzing data to convert information into useful knowledge. This helps us to understand the world better and, in many contexts, enables better decisions.
Technological advances and associated changes in daily life have produced rapidly expanding new sources of content, data and information. Although many opportunities exist, big data and data analytic technologies also present many challenges, such as understanding data, quality of data, security and real-time integration. Most organizations face an imbalance between the number of data analysts and the amount of data being produced.

Debate the statement "Data deluge in information and starving for knowledge in Data Analytics". Your debate should include:
A1.1 Introduction to Data Analytics and its applications
A1.2 Illustration with real world examples
A1.3 Discussion on the barriers to adoption
A1.4 Discussion of the future of analytics
A1.5 Stance taken and justification

PART B                                                     80 Marks

B.1                                                        20 Marks
The data analytics lifecycle defines the analytics process and best practices spanning from discovery to project completion. Consider the data preparation and model building phases of the data analytics lifecycle, select relevant tools for each phase, and defend your selection with a suitable example. Perform the following:
B1.1 Discuss data preparation phase tools
B1.2 Discuss model building phase tools
B1.3 Justify with suitable scenarios

B.2                                                        20 Marks
A data science team is working on a book recommendation problem. The books are available in different categories. If a customer buys a book, he should be recommended other books and categories of books matching his preference:
B2.1 Model different method(s) to address the above issue.
B2.2 Identify suitable attributes.
B2.3 Justify your solution by comparison.

B.3                                                        20 Marks
A certain company A wants to market its new product. Manual marketing is a time-consuming and costly process. The model should spread the product information like a virus (viral marketing).
B3.1 Recommend a solution
B3.2 Discuss issues
B3.3 Justification

B.4                                                        20 Marks
Inverted index: An inverted index is a mapping from the text in documents to the documents that contain it. Mainly used in search engines, it provides faster lookup on text searches.
The output file must contain a list of all words and the number of times each occurs. The Map method can read the input file and output (word, filename) as the key-value pair. The Reducer method can use a hash map of (filename, count) to count the occurrences of each filename for a particular word key. Solve the problem using Big data (Hadoop/R). Your report should include:
B4.1 Introduction to Big data platform
B4.2 Problem solving approach
B4.3 Design and implementation
B4.4 Performance analysis

❧❧❧

Detailed Marking Scheme
(Columns: Question No., Tasks / Steps involved, Marks allotted for steps, Instructor's expected solution, Total allotted marks for the question)

Question A                                                 Total marks: 20
Steps involved:
Data Analytics and its applications                        4
Illustrate with real world use cases                       4
Discuss the barriers to adoption                           4
Discuss the future of analytics                            4
Justification and stance taken                             4
Instructor's expected solution:
· Definition and any of its applications
· Real life examples
· Technical and non-technical
· With respect to best guess
· Based on facts and figures

Question B1                                                Total marks: 20
Steps involved:
Discussion of Phase 2 tools                                8
Discussion of Phase 4 tools                                8
Discuss which kinds of use scenarios                       4
Instructor's expected solution:
· Name any 2-3 examples
· Name any 2-3 examples
· Explain it with an example

Question B2                                                Total marks: 20
Steps involved:
Introduction to the recommended method(s). Why are you suggesting it?    6
How does it select the attributes?                         8
Justification                                              6
Instructor's expected solution:
· Decision making
· Based on problem statement and its requirements

Question B3                                                Total marks: 20
Steps involved:
B3.1 Recommend relevant analysis                           4
B3.2 Describe the issues                                   6
B3.3 Discuss it with example                               6
B3.4 Justification                                         4
Instructor's expected solution:
· Model a Decision Support system
· Based on problem statement and its requirements

Question B4                                                Total marks: 20
Steps involved:
B4.1 Introduction to big data platform                     4
B4.2 Problem solving approach                              4
B4.3 Design and implementation                             6
B4.4 Discuss its performance                               6
Instructor's expected solution:
· HDFS (Hadoop Distributed File System) big data platform
· Problem solving approach: MapReduce is used
· Design and implementation using Hadoop/R, using Java platform
· Performance
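
Illustration for question B.4: the map and reduce steps in the problem statement can be sketched in a few lines. This is only a minimal in-memory sketch in plain Python, not a Hadoop or R job; the function names (map_phase, shuffle, reduce_phase, build_inverted_index) and the sample documents are hypothetical and chosen for illustration. The shuffle step stands in for the grouping by key that a MapReduce framework performs between the two phases.

```python
from collections import defaultdict


def map_phase(filename, text):
    """Map step: emit one (word, filename) pair per word occurrence."""
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()[]")
        if word:
            yield word, filename


def shuffle(pairs):
    """Group emitted filenames by word key (done by the framework in a real job)."""
    grouped = defaultdict(list)
    for word, filename in pairs:
        grouped[word].append(filename)
    return grouped


def reduce_phase(word, filenames):
    """Reduce step: build a hash map of (filename, count) for one word key."""
    counts = defaultdict(int)
    for filename in filenames:
        counts[filename] += 1
    return word, dict(counts)


def build_inverted_index(documents):
    """Run map, shuffle and reduce over a dict of {filename: text}."""
    pairs = [
        pair
        for filename, text in documents.items()
        for pair in map_phase(filename, text)
    ]
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, names) for word, names in sorted(grouped.items()))


# Hypothetical sample input, used only for illustration.
SAMPLE_DOCS = {
    "doc1.txt": "big data needs big tools",
    "doc2.txt": "data analytics tools",
}

if __name__ == "__main__":
    for word, postings in build_inverted_index(SAMPLE_DOCS).items():
        print(word, postings)
```

In an actual Hadoop job the same logic would be split between a Mapper and a Reducer, with the input read from HDFS; the sketch only shows the shape of the key-value pairs at each stage.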
