CISC482 - Lecture 11

Regression 2

Dr. Jeremy Castagno

Class Business

Schedule

  • Topic Ideas - Should be turned in! Will grade soon!
  • Reading 5-3: Mar 01 @ 12PM, Wednesday
  • HW4 - Mar 08 @ Midnight
  • Proposal: Mar 22, Wednesday

Today

  • Assumptions of Linear Regression
  • Multiple Linear Regression

Review

Review Simple Linear Regression

  • What is the model for linear regression: \(\hat{y} = ?\)
  • What is a residual?
  • What is the loss function used to solve for the parameters?
    • \(L(\beta_0, \beta_1) = \sum\limits_{i=1}^{n}[y_i - \hat{y}_i]^2\)
    • \(L(\beta_0, \beta_1) = \sum\limits_{i=1}^{n}[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2\)

Formulas?

\(\hat{\beta}_1\) (slope) = \(\frac{\sum\limits_{i=1}^{n}[(x_i-\bar{x})(y_i- \bar{y})]}{\sum\limits_{i=1}^{n} (x_i - \bar{x})^2}\)

\(\hat{\beta}_0\) (intercept) = \(\bar{y} - \hat{\beta}_1 \bar{x}\)


\(\hat{\beta}_1\) (slope) = \(\frac{\text{Cov}(x,y)}{s_x^2} = r\frac{s_y}{s_x}\)

\(\hat{\beta}_0\) (intercept) = \(\bar{y} - \hat{\beta}_1 \bar{x}\)
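As a sanity check, these closed-form estimates are easy to compute directly with numpy. A minimal sketch, using made-up example data:

import numpy as np

# hypothetical example data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# closed-form least squares estimates from the formulas above
beta_1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
beta_0 = y.mean() - beta_1 * x.mean()
print(f"Slope: {beta_1:.2f}; Intercept: {beta_0:.2f}")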

Assumptions

Assumptions of Linear Regression

  • \(x\) and \(y\) have a linear relationship.
  • The residuals of the observations are independent.
  • The mean of the residuals is 0 and the variance of the residuals is constant.
  • The residuals are approximately normally distributed.
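These four assumptions amount to writing the model with an explicit error term:

\(y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\), where the \(\varepsilon_i\) are independent with mean 0 and constant variance \(\sigma^2\) (and, for the last assumption, approximately normally distributed).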

Linear Relationship?

Residual Plot

Code
import seaborn as sns

# x and y are the data arrays defined earlier
ax = sns.residplot(x=x, y=y)  # fits a simple regression and plots its residuals
ax.set_xlabel("X")
ax.set_ylabel("Residuals");

Independence of Error

  • The distances from the regression line to the points (residuals) should generally be random.
  • You do not want to see patterns.
  • Was the previous plot of residual showing patterns, or was it more or less random?

Residuals are independent?

Examples of Dependence

  • Time dependence can often be assessed by analyzing a scatter plot of the residuals over time (see the sketch after this list).
  • Spatial dependence can often be assessed by analyzing a map of where the data was collected along with further inspection of the residuals for spatial patterns.
  • Dependencies between observational units must be assessed in context of the study.
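A minimal sketch of the time check; the residuals and time arrays here are stand-ins (in practice they come from your fitted model and your data):

import numpy as np
import matplotlib.pyplot as plt

residuals = np.random.normal(size=100)  # stand-in for residuals from a fitted model
time = np.arange(100)                   # stand-in for observation timestamps

plt.scatter(time, residuals)
plt.axhline(0, color='r', linestyle='--')
plt.xlabel("Time")
plt.ylabel("Residuals");  # look for trends or cycles over time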

Example - Time

Example - Space

Discussion

  • In these graphs we saw time and spatial dependencies in our data set.
  • We saw this by plotting residual plots and looking for any patterns.
  • Recall the model was: \(MPG = m \cdot Weight + b\)
  • Does the independence of error assumption hold in this model?
  • Does that mean we should never use this model?

Mean and Variance of Error

Keeping error low and consistent

  • The residuals of a fitted least squares model (with an intercept) have a mean of 0. Always.
  • A mean of 0 means that, on average, the predicted value equals the observed value.
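A quick numeric check of this fact, as a sketch on synthetic data:

import numpy as np
from sklearn.linear_model import LinearRegression

x = np.arange(50)
y = 2 * x + 1 + np.random.normal(scale=3, size=50)  # synthetic data
X = x[:, np.newaxis]

reg = LinearRegression().fit(X, y)
residuals = y - reg.predict(X)
print(residuals.mean())  # ~0, up to floating-point error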

Example, Residual 0

Example, Residual 1

Tip

Look at the y-axis scale!

Consistent Variance

  • A linear regression should have a constant variance for all levels of the input.
  • That means the spread of the errors should be the same at all levels of the input.
  • Level of input = position along the x-axis
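To see what a violation looks like, here is a sketch that generates data whose noise grows with the input; the residual plot fans out instead of staying a constant-width band:

import numpy as np
import seaborn as sns

n = 200
x = np.arange(n)
noise = np.random.normal(loc=0, scale=0.1 * x + 1)  # noise spread grows with x
y = 1.5 * x + 4 + noise
ax = sns.residplot(x=x, y=y)  # fan shape = non-constant variance
ax.set_xlabel("X")
ax.set_ylabel("Residuals");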

Example, Residual 3

Example, Residual 4

Caution

We can clearly see a huge jump in the variance around x = 15.

Example, MPG Prediction

Normality of Errors

  • As long as the previous assumptions hold, you are going to get a good simple linear regression model:
    • \(x\) and \(y\) have a linear relationship
    • Errors are independent (no spatial, time, or other feature dependence)
    • Residual mean of 0, constant variance
  • However, if the errors are also normally distributed, more awesomeness can be done: Interval Estimates (a quick normality check is sketched below)
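One common check for normality is a Q-Q plot of the residuals; a sketch with scipy (the residuals array here is a stand-in for residuals from a fitted model):

import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

residuals = np.random.normal(size=100)  # stand-in for real residuals
stats.probplot(residuals, dist="norm", plot=plt)
plt.show();  # points close to the line suggest approximately normal errors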

Interval Estimates

Full Example

  • Use the numpy and scikit-learn libraries to fit our models
  • scikit-learn is a library made for machine learning. It's awesome!

Creating data

import numpy as np
import seaborn as sns

n = 250
x = np.arange(n)
noise = np.random.normal(loc=0, scale=5, size=n)
y = (1.5 * x + 4) + noise  # true slope 1.5, true intercept 4
ax = sns.scatterplot(x=x, y=y)
ax.set_xlabel("X")
ax.set_ylabel("Y");

Creating data

Fitting the data

from sklearn.linear_model import LinearRegression
model = LinearRegression()

X = x[:, np.newaxis] # n X 1 matrix
reg = model.fit(X, y) # X needs to be a matrix

slope, intercept  = reg.coef_[0], reg.intercept_
print(f"Slope: {slope:.1f}; \nIntercept: {intercept:.1f}")

ax = sns.scatterplot(x=x, y=y)
ax.axline((0, intercept), slope=slope, color='r', label='Regressed Line');
ax.set_xlabel("X")
ax.set_ylabel("Y");

Fitting the data

Slope: 1.5; 
Intercept: 3.9

Residual Plot

from yellowbrick.regressor import ResidualsPlot
model = LinearRegression()
visualizer = ResidualsPlot(model)

visualizer.fit(X, y)  # fit the model and record its residuals
visualizer.show();

Residual Plot

Multiple Linear Regression

Definition

  • Dataset has multiple input features
  • Incorporate more than one input feature into a single regression equation -> multiple linear regression
  • \(\hat{y} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k\)
  • \(x_1, ..., x_k\) = input features
  • \(\hat{y}\) = predicted feature
  • \(\beta_0\) = y-intercept. \(\beta_1, ..., \beta_k\) = slopes
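For reference, stacking the features into a matrix \(X\) (with a leading column of ones for the intercept) gives the compact form \(\hat{y} = X\beta\), and least squares has the closed-form solution \(\hat{\beta} = (X^T X)^{-1} X^T y\); scikit-learn solves the same least-squares problem for us numerically.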

Example

pp = sns.pairplot(data=df,
                  height=7,
                  y_vars=['body_mass_g'],
                  x_vars=['bill_length_mm', 'flipper_length_mm'])

Example

3D plot

X = df[['bill_length_mm', 'flipper_length_mm']].values
y = df['body_mass_g'].values

def plot_plane(x_coef, y_coef, intercept, ax, n=50):
    # build a grid over the observed feature ranges (n = grid resolution)
    x = np.linspace(X[:, 0].min(), X[:, 0].max(), n)
    y = np.linspace(X[:, 1].min(), X[:, 1].max(), n)
    x, y = np.meshgrid(x, y)
    eq = x_coef * x + y_coef * y + intercept  # plane: z = b1*x + b2*y + b0
    ax.plot_surface(x, y, eq, color='red', alpha=0.5)

fig = plt.figure(figsize = (10, 7))
ax = plt.axes(projection ="3d")
ax.scatter(X[:, 0], X[:, 1], y, label='penguins')
ax.set_xlabel("Bill Length")
ax.set_ylabel("Flipper Length")
ax.set_zlabel("Body Mass")
ax.legend();

3D plot

Performing Regression

model = LinearRegression()
reg = model.fit(X, y) # X needs to be a matrix

slope_bill, slope_flipper, intercept = reg.coef_[0], reg.coef_[1], reg.intercept_
print(f"Bill Slope: {slope_bill:.1f}; Flipper Slope: {slope_flipper:.1f}; \nIntercept: {intercept:.1f}");
Bill Slope: 1.5; Flipper Slope: 48.9; 
Intercept: -5836.3
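A quick way to gauge how well the plane fits (not shown above) is the coefficient of determination, which LinearRegression exposes via score; continuing from the code above:

r_squared = reg.score(X, y)  # R^2: fraction of variance in body mass explained
print(f"R^2: {r_squared:.2f}")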

Visualize Plane

fig = plt.figure(figsize = (10, 7))
ax = plt.axes(projection ="3d")
ax.scatter(X[:, 0], X[:, 1], y, label='penguins')
ax.set_xlabel("Bill Length")
ax.set_ylabel("Flipper Length")
ax.set_zlabel("Body Mass")
plot_plane(slope_bill, slope_flipper, intercept, ax=ax)
ax.legend();

Visualize Plane

Simple Polynomial Regression

  • Special case of multiple linear regression
  • Include powers of a single feature as inputs in the regression equation
  • Simple polynomial linear regression
  • \(\hat{y} = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_k x^k\)
  • Quadratic: \(\hat{y} = \beta_0 + \beta_1 x + \beta_2 x^2\)

Example Quadratic

Code
n = 30
x = np.arange(n)
noise = np.random.normal(loc=0, scale=5, size=n)
y = (0.1 * x**2 + 0.3 * x + 4) + noise  # true beta_2=0.1, beta_1=0.3, beta_0=4
sns.scatterplot(x=x, y=y);

Fitting a Quadratic (numpy)

coeffs = np.polyfit(x, y, deg=2)
print(coeffs); # beta_2, beta_1, beta_0
[0.1 0.3 3.0]

Plotting your regressed polynomial function

Code
regressed_fn = np.poly1d(coeffs)  # callable polynomial built from the fitted coefficients
y_hat = regressed_fn(x)
ax = sns.scatterplot(x=x,y=y)
ax.plot(x, y_hat, color='red', linestyle='--');

Fitting a Quadratic (SciKit Learn)

  • Scikit-learn can also fit polynomials
  • You must first create the additional columns manually
  • In other words, create a second column that is the first column squared!
  • Helpful class called PolynomialFeatures

Fitting a Quadratic (SciKit Learn)

from sklearn.preprocessing import PolynomialFeatures
pf = PolynomialFeatures(degree=2, include_bias=False)
X = x[:, np.newaxis] # n X 1 matrix
x_features = pf.fit_transform(X)  # columns: [x, x^2]
print("Transformed Features:\n", x_features[:5, :])
model = LinearRegression()
reg = model.fit(x_features, y)
print("Coefficients")
print(reg.coef_, reg.intercept_); # [beta_1, beta_2], beta_0
Transformed Features:
 [[0.0 0.0]
 [1.0 1.0]
 [2.0 4.0]
 [3.0 9.0]
 [4.0 16.0]]
Coefficients
[0.3 0.1] 2.965347007117529

Plotting your regressed polynomial function

Code
y_hat = reg.predict(x_features)  # predictions from the fitted scikit-learn model
ax = sns.scatterplot(x=x,y=y)
ax.plot(x, y_hat, color='red', linestyle='--');

Class Activity

Class Activity

Practice Multiple Linear Regression