Descriptive Analytics
1. Solve the following questions on Google Colab or Databricks using Spark SQL
a. Search the internet for a big dataset of at least 1 GB.
b. Create a DataFrame from the dataset.
c. Using the DataFrame, implement the following aggregations:
i. Aggregation with grouping
ii. Aggregation with pivoting
iii. Aggregation with rollups and cubes
d. Spark SQL supports the following window functions. Apply each of them to the DataFrame:
i. Ranking functions
1. rank
2. dense_rank
3. percent_rank
4. row_number
5. ntile
ii. Analytic functions
1. cume_dist
2. first_value
3. last_value
4. lag
5. lead
Deliverables:
• One PDF file containing the following:
o A cover page which includes Student ID, name, HW number, and date
o A description of the big dataset and its source.
o Each SQL statement and a snapshot of its output
o Problems you faced, if any.
