#INSTRUCTIONS

#Type the code for each question

#Include answers to ALL questions in the script as a comment (with a #).

#If code or answers are missing from this script, points will be deducted.

### REGRESSIONS ###

install.packages("readxl")

library(readxl)

wine = read.csv("https://docs.google.com/spreadsheets/d/e/2PACX-1vSYO2sPUeEvW1CE7wysL88oeFlfODEgQD2rl5aJqt9YlCwjs58pvibRucgDzavO-rCFs6VCgaEY2NzF/pub?gid=1979851222&single=true&output=csv", header=TRUE)

attach(wine)

View(wine)

#Our data is named “wine” and we are going to explore what factors contribute to wine prices from the Bordeaux region in France.

#To view more information about this data & variables please visit: https://bookdown.org/egarpor/SSS2-UC3M/multlin-exa…

# 1. Let’s get familiar with our data by viewing a summary of it

# 2. Plot the two variables “AGST” and “Price” using the plot() code
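
# A minimal sketch of steps 1 and 2 on R's built-in `cars` data (columns speed and dist),
# not on the wine data. The same pattern should carry over once wine, AGST, and Price
# are swapped in. `demo_summary` is a hypothetical name used only for this illustration.

demo_summary <- summary(cars)      # min, quartiles, mean, and max of every column

demo_summary

plot(cars$speed, cars$dist,        # scatterplot: first argument on the x axis, second on the y axis
     xlab = "speed", ylab = "dist")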

#Use the cor() code to look at the correlations between all of the variables.

# *3. Which two variables have the strongest positive correlation? (Look for numbers closest to 1)

# *4. Use cor() code to look at the correlation between Price, AGST, and Age. What is it?
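
# cor() accepts a numeric data frame and returns a matrix of pairwise correlations.
# A minimal sketch on the built-in `mtcars` data (not the wine data); `demo_cor` is a
# hypothetical name. Selecting columns by name limits the matrix to the variables of interest.

demo_cor <- cor(mtcars[, c("mpg", "wt", "hp")])

demo_cor                  # the diagonal is 1; off-diagonal entries are the pairwise correlations

demo_cor["mpg", "wt"]     # pull out the correlation for a single pair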

# AGST stands for the average growing season temperature in celsius. Price is the price of the wine.

# 5. Create a linear regression to predict “Price” based on “AGST”. Name this regression “model1”.

# *6. View a summary of our model

#What is the intercept, AGST coefficient, and R squared? What do they mean?

#The intercept is

#The AGST coefficient is

#This means that

#The R squared is
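
# A minimal sketch of fitting and reading a one-predictor regression, using the built-in
# `cars` data rather than the wine data. `demo_model` and `demo_fit_summary` are hypothetical
# names. The formula reads "response ~ predictor".

demo_model <- lm(dist ~ speed, data = cars)

demo_fit_summary <- summary(demo_model)

demo_fit_summary                    # full table: coefficients, standard errors, p-values, R squared

coef(demo_model)["(Intercept)"]     # predicted dist when speed is 0

coef(demo_model)["speed"]           # change in predicted dist for a one-unit increase in speed

demo_fit_summary$r.squared          # share of the variance in dist explained by the model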

# 7. Add a fitted line to our model1 regression using the abline() code

#You can change the color of the fitted line with the codes below

abline(model1, col = 5)

abline(model1, col = "purple")
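
# abline() draws on top of an existing plot, so the scatterplot has to be drawn first.
# A minimal sketch on the built-in `cars` data (not the wine data); `demo_line_model` is a
# hypothetical name.

demo_line_model <- lm(dist ~ speed, data = cars)

plot(cars$speed, cars$dist)                        # draw the points first

abline(demo_line_model, col = "purple", lwd = 2)   # then add the fitted line; lwd sets the line width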

# 8. Use the resid() code to view the residuals of model1. Our goal is to get them as close to 0 as possible.

# *9. Find the mean of the residuals of model1. Again, this should be close to 0. What is the mean of residuals for model 1?

# *10. Look at a histogram of the residuals for model1. Are they normally distributed?
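
# A minimal sketch of steps 8 to 10 on the built-in `cars` data (not the wine data).
# `demo_resid_model` and `demo_resid` are hypothetical names. For a least-squares fit
# that includes an intercept, the residuals average to zero up to rounding error.

demo_resid_model <- lm(dist ~ speed, data = cars)

demo_resid <- resid(demo_resid_model)   # one residual (actual minus fitted) per row

head(demo_resid)

mean(demo_resid)                        # a tiny number such as 1e-16 counts as zero

hist(demo_resid)                        # a roughly symmetric bell shape suggests normality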

# *11. Predict the price for an AGST of 17.12 using the predict() code. What did you get?

#You could type out the formula instead

-3.4178+0.6351*17.12
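
# predict() needs the new value in a data frame whose column name matches the predictor
# used in the model. A minimal sketch on the built-in `cars` data (not the wine data);
# `demo_pred_model`, `demo_pred`, and `demo_manual` are hypothetical names, and the
# value 15 is an arbitrary illustration.

demo_pred_model <- lm(dist ~ speed, data = cars)

demo_pred <- predict(demo_pred_model, newdata = data.frame(speed = 15))

demo_manual <- coef(demo_pred_model)[["(Intercept)"]] + coef(demo_pred_model)[["speed"]] * 15

demo_pred      # prediction from predict()

demo_manual    # same number from intercept + slope * value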

### MULTIPLE REGRESSION ###

# *12. Run a regression to predict “Price” based on “AGST” and “HarvestRain”

# What is the intercept, and what does it mean?

# What are the coefficients for AGST and HarvestRain? Interpret both of them. Are they both significant ***?

# What is the Adjusted R squared?

#The intercept is

#The coefficient for AGST is

#Coefficient for HarvestRain is

#This means that

#Significance ***

#The adjusted R squared is
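
# A minimal sketch of a two-predictor regression on the built-in `mtcars` data (not the
# wine data). `demo_multi` and `demo_multi_summary` are hypothetical names. Predictors
# are joined with + on the right-hand side of the formula.

demo_multi <- lm(mpg ~ wt + hp, data = mtcars)

demo_multi_summary <- summary(demo_multi)

coef(demo_multi_summary)            # columns: Estimate, Std. Error, t value, Pr(>|t|)

demo_multi_summary$adj.r.squared    # R squared penalized for the number of predictors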

# *13. Multiple Regression (all variables) AGST, HarvestRain, WinterRain, Age, FrancePop

# Run a multiple regression including all variables. Which ones are significant?

#Significance
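
# The stars printed by summary() come from the Pr(>|t|) column. A minimal sketch on the
# built-in `mtcars` data (not the wine data) that pulls those p-values out directly.
# `demo_full` and `demo_p` are hypothetical names, and 0.05 is an assumed cutoff.

demo_full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

demo_p <- coef(summary(demo_full))[, "Pr(>|t|)"]

demo_p            # one p-value per term, including the intercept

demo_p < 0.05     # TRUE marks terms significant at the assumed 0.05 level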