A few questions related to R
The Prostate Dataset
The prostate dataset comes from a study of 97 men with prostate cancer who were due to receive a radical prostatectomy.
The data contain the following variables:
lcavol: log(cancer volume in cm³)
lweight: log(prostate weight in g)
age: age in years
lbph: log(benign prostatic hyperplasia amount)
svi: seminal vesicle invasion
lcp: log(capsular penetration)
gleason: Gleason score
pgg45: percentage of Gleason scores 4 or 5
lpsa: log(prostate specific antigen in ng/mL)
Question 1
Validate that the prostate data frame contains 97 observations.
Hint: First install the faraway package (if you haven’t already) as instructed in Lesson 1, Slide 49. The following R statement will load the prostate data frame:
data("prostate", package = "faraway")
Use the nrow() function to see how many observations (rows) the data frame has. For example, the following statement prints the number of observations in the cars data frame: nrow(cars).
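As a rough sketch of the checking pattern described in the hint, the lines below compare a row count against an expected value. They use the built-in cars data frame (which has 50 rows) rather than prostate, and the names nObs and expectedRows are purely illustrative.

```r
# cars ships with base R's datasets package, so no install is needed.
data("cars")

# Number of observations (rows) in the data frame.
nObs <- nrow(cars)
nObs

# dim() reports rows and columns together.
dim(cars)

# Compare the actual count with the documented one (50 for cars).
expectedRows <- 50
nObs == expectedRows
```

The same comparison should work for any data frame once it has been loaded.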
Question 2
Calculate descriptive statistics of each of the variables.
Hint: Use the summary() function. For example: summary(cars).
Question 3
Create a new data frame that includes the following variables: lcavol, lweight, age and lpsa.
Use this new data frame for all questions below.
Hint: In the following example, we select two variables (agegp and alcgp) from the esoph data frame and name the new data frame esophSubDf:
esophSubDf <- esoph[c("agegp", "alcgp")]
Question 4
Calculate descriptive statistics of each of the variables using the new data frame.
Question 5
Create a scatter plot matrix for all the variables using the new data frame.
Hint: Use the pairs() function (see Lesson 2, Slide 50).
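For illustration only, here is a minimal sketch of pairs() on a subset of the built-in mtcars data frame. The choice of columns (mpg, wt, hp), the name carsSubDf, and the plot title are assumptions made for the example, not part of the assignment.

```r
# Select a few numeric columns from the built-in mtcars data frame.
carsSubDf <- mtcars[c("mpg", "wt", "hp")]

# Draw every pairwise scatter plot in a single grid.
# Each off-diagonal panel plots one variable against another.
pairs(carsSubDf, main = "Scatter plot matrix (mtcars subset)")
```

Passing a data frame that holds only the variables of interest keeps the grid small and readable.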
Question 6
Create a (Pearson) correlation matrix for all the variables.
Hint: Use the cor() function (see Lesson 2, Slide 48).
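A small sketch of cor() on the built-in cars data frame follows. Pearson is the default method in cor(); it is spelled out here only to make the choice visible. The name corMatrix is illustrative.

```r
# Pearson correlation matrix for every pair of columns in cars.
corMatrix <- cor(cars, method = "pearson")
corMatrix

# A correlation matrix is symmetric, with 1s on the diagonal
# (each variable correlates perfectly with itself).
isSymmetric(corMatrix)
diag(corMatrix)
```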
Question 7
Show the same matrix again, but round the correlations (use two decimal places).
Hint: Use the round() function. The following example calculates the correlation matrix for the cars data frame and rounds the numbers:
round(cor(cars),2)
Question 8
Create a regression model:
The predictor variable (X) should be lpsa.
The outcome variable (Y) should be lcavol.
Show the summary of the model.
Hint: Use the lm() and summary() functions (see Lesson 2, Slide 51).
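As a hedged illustration of the lm() formula syntax, the sketch below fits a simple regression on the built-in cars data frame, with dist as the outcome and speed as the predictor. In an R formula the outcome goes to the left of ~ and the predictor to the right. The name carsModel is an assumption for this example.

```r
# Outcome (Y) on the left of ~, predictor (X) on the right.
carsModel <- lm(dist ~ speed, data = cars)

# Coefficient table, R-squared, and residual summary.
summary(carsModel)

# Just the fitted intercept and slope.
coef(carsModel)
```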
Question 9
Visualize the two variables and the model you just created by doing the following:
Create a scatter plot. Put lcavol on the y-axis and lpsa on the x-axis. Include the regression line and label the axes.
Hint: See Lesson 2, Slide 52.
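One possible way to draw a scatter plot with a fitted line in base R is sketched below, again on the built-in cars data frame. The axis labels, title, and line styling are illustrative choices, and the course slide may use a different approach.

```r
# Fit the model first so its line can be added to the plot.
carsModel <- lm(dist ~ speed, data = cars)

# Scatter plot: predictor on the x-axis, outcome on the y-axis.
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)",
     main = "Stopping distance vs. speed")

# abline() accepts a fitted lm object and draws its regression line.
abline(carsModel, col = "red", lwd = 2)
```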
Question 10
Update the regression model by adding a second predictor: age.
Show the regression model summary.
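A minimal sketch of adding a second predictor is shown below, using the built-in mtcars data frame (mpg as the outcome, wt and hp as predictors). The model names are illustrative. Predictors are joined with + on the right-hand side of the formula, and update() is an alternative that extends an existing model.

```r
# Starting model with one predictor.
baseModel <- lm(mpg ~ wt, data = mtcars)

# Same outcome, now with a second predictor added using +.
twoPredModel <- lm(mpg ~ wt + hp, data = mtcars)
summary(twoPredModel)

# Equivalent: extend the existing model instead of retyping the formula.
updatedModel <- update(baseModel, . ~ . + hp)
all.equal(coef(updatedModel), coef(twoPredModel))
```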
