How to draw multiple regression lines in R
To create multiple regression lines in a single plot using ggplot2, we can use the geom_jitter function along with the geom_smooth function. geom_smooth draws a separate regression line for each model, each in its own color, and geom_jitter plots the points with a small random offset so that overlapping points remain distinguishable.
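As a rough sketch of the approach just described, assuming the ggplot2 package is installed: the simulated data frame, the seed, and the three colors are illustrative choices, not part of the original example.

```r
library(ggplot2)

# Simulated data: three x/y pairs, one per regression model (illustrative only)
set.seed(1)
df <- data.frame(
  x1 = rpois(20, 1), y1 = rpois(20, 5),
  x2 = rpois(20, 2), y2 = rpois(20, 8),
  x3 = rpois(20, 2), y3 = rpois(20, 4)
)

# One jittered point layer and one linear-model smooth per x/y pair
p <- ggplot(df) +
  geom_jitter(aes(x1, y1), colour = "red") +
  geom_smooth(aes(x1, y1), method = "lm", formula = y ~ x, se = FALSE, colour = "red") +
  geom_jitter(aes(x2, y2), colour = "blue") +
  geom_smooth(aes(x2, y2), method = "lm", formula = y ~ x, se = FALSE, colour = "blue") +
  geom_jitter(aes(x3, y3), colour = "darkgreen") +
  geom_smooth(aes(x3, y3), method = "lm", formula = y ~ x, se = FALSE, colour = "darkgreen") +
  labs(x = "x", y = "y")

print(p)
```

Setting se = FALSE suppresses the confidence bands, which keeps three overlapping lines readable.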
Check out the example below to understand how it can be done.

Example

The following snippet creates a sample data frame:

    x1 <- rpois(20, 1)
    y1 <- rpois(20, 5)
    x2 <- rpois(20, 2)
    y2 <- rpois(20, 8)
    x3 <- rpois(20, 2)
    y3 <- rpois(20, 4)
    df <- data.frame(x1, y1, x2, y2, x3, y3)
    df

The following data frame is created:

       x1 y1 x2 y2 x3 y3
    1   2  2  0  6  1  6
    2   3  4  0  9  1  7
    3   2  4  3  7  2  3
    4   0 12  2 11  0  1
    5   0  2  0  6  1  1
    6   1  7  2  7  1  3
    7   0  4  0  4  1  5
    8   0  3  2  5  0  1
    9   1  4  3  3  0  9
    10  0  2  0  8  3  5
    11  0  7  4 11  2  4
    12  0  4  3  8  2  1
    13  0  6  0  6  2  4
    14  1  6  1  9  2  2
    15  2  3  1  9  6  2
    16  1  3  1 10  5  2
    17  0  5  1  8  2  6
    18  1  2  4  7  2  4
    19  0  5  2 11  0  7
    20  2  8  4  8  2  4

To create regression lines for multiple models in a single plot from this data frame, load the ggplot2 package and add one geom_jitter layer and one geom_smooth layer per x/y pair.

Today let’s re-create two variables and see how to plot them and include a regression line. We take height to be a variable that describes the heights (in cm) of ten people. Copy and paste the following code to the R command line to create this variable.

    height <- c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)

Now let’s take bodymass to be a variable that describes the masses (in kg) of the same ten people. Copy and paste the following code to the R command line to create the bodymass variable.

    bodymass <- c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

Both variables are now stored in the R workspace. To view them, enter:

    height
    [1] 176 154 138 196 132 176 181 169 150 175

    bodymass
    [1]  82  49  53 112  47  69  77  71  62  78

We can now create a simple plot of the two variables as follows:

    plot(bodymass, height)

We can enhance this plot using various arguments within the plot() command. Copy and paste the following code into the R workspace:

    plot(bodymass, height, pch = 16, cex = 1.3, col = "blue",
         main = "HEIGHT PLOTTED AGAINST BODY MASS",
         xlab = "BODY MASS (kg)", ylab = "HEIGHT (cm)")
In the above code, the syntax pch = 16 creates solid dots, while cex = 1.3 creates dots that are 1.3 times bigger than the default (where cex = 1). More about these commands later.

Now let’s perform a linear regression using lm() on the two variables by adding the following text at the command line:

    lm(height ~ bodymass)

    Call:
    lm(formula = height ~ bodymass)

    Coefficients:
    (Intercept)     bodymass
        98.0054       0.9528

We see that the intercept is 98.0054 and the slope is 0.9528. By the way, lm stands for “linear model”.

Finally, we can add a best fit line (regression line) to our plot by passing the fitted model to abline() at the command line:

    abline(lm(height ~ bodymass))

R provides comprehensive support for multiple linear regression. The topics below are provided in order of increasing complexity.

Fitting the Model
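A minimal sketch of fitting a multiple linear regression with lm(), assuming the built-in mtcars data set and an arbitrary choice of three predictors (both are illustrative, not taken from the original text):

```r
# Fit mpg on three predictors from the built-in mtcars data (illustrative choice)
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

summary(fit)                 # coefficient table, R-squared, F statistic
coefficients(fit)            # estimated model coefficients
confint(fit, level = 0.95)   # confidence intervals for the coefficients
head(fitted(fit))            # first few predicted values
head(residuals(fit))         # first few residuals
anova(fit)                   # sequential analysis-of-variance table
```

The same accessor functions work on any lm object, so the formula and data can be swapped for your own.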
Diagnostic Plots
Diagnostic plots provide checks for heteroscedasticity, normality, and influential observations.
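One way to produce such plots is to call plot() on a fitted lm object. The sketch below assumes the built-in mtcars data and an illustrative model:

```r
# Illustrative model on the built-in mtcars data
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

# Arrange the four default diagnostic plots in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous graphics settings
```

By default, plot() on an lm object shows residuals vs fitted values, a normal Q-Q plot, a scale-location plot, and residuals vs leverage.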
For a more comprehensive evaluation of model fit see regression diagnostics or the exercises in this interactive course on regression.

Comparing Models
You can compare nested models with the anova( ) function. The following code provides a simultaneous test that x3 and x4 add to linear prediction above and beyond x1 and x2.
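A sketch of such a comparison, assuming a simulated data frame named mydata with columns y and x1 to x4 (the data, the seed, and the true coefficients are all made up for illustration):

```r
# Simulated data: y depends on x1, x2 and x3, but not on x4 (assumption)
set.seed(123)
n <- 100
mydata <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
mydata$y <- 1 + 2 * mydata$x1 - mydata$x2 + 0.5 * mydata$x3 + rnorm(n)

# Nested models: the reduced model omits x3 and x4
reduced <- lm(y ~ x1 + x2, data = mydata)
full    <- lm(y ~ x1 + x2 + x3 + x4, data = mydata)

# F test of whether x3 and x4 jointly improve on the reduced model
cmp <- anova(reduced, full)
print(cmp)
```

A small p-value in the second row suggests that x3 and x4 together add predictive value beyond x1 and x2.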
Cross Validation
You can do K-Fold cross-validation using the cv.lm( ) function in the DAAG package.
Sum the squared prediction errors over all folds, divide by the number of observations, and take the square root to get the cross-validated standard error of estimate. You can also assess R2 shrinkage via K-fold cross-validation, for example with the crossval() function from the bootstrap package.
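The sketch below does K-fold cross-validation by hand in base R rather than with DAAG or bootstrap, so that every step is visible. The mtcars data, the model formula, the seed, and k = 5 are illustrative assumptions.

```r
set.seed(42)
k <- 5
n <- nrow(mtcars)

# Randomly assign each observation to one of k folds
folds <- sample(rep(seq_len(k), length.out = n))

# Out-of-fold predictions: each row is predicted by a model that never saw it
pred <- numeric(n)
for (i in seq_len(k)) {
  test  <- folds == i
  fit_i <- lm(mpg ~ wt + hp, data = mtcars[!test, ])
  pred[test] <- predict(fit_i, newdata = mtcars[test, ])
}

# Cross-validated standard error: root of the mean squared prediction error
cv_rmse <- sqrt(sum((mtcars$mpg - pred)^2) / n)

# R2 shrinkage: compare the in-sample R2 with the cross-validated R2
full_fit <- lm(mpg ~ wt + hp, data = mtcars)
r2_raw   <- summary(full_fit)$r.squared
r2_cv    <- cor(mtcars$mpg, pred)^2

cat("CV RMSE:", cv_rmse, "\n")
cat("Raw R2:", r2_raw, " Cross-validated R2:", r2_cv, "\n")
```

The gap between the raw and cross-validated R2 gives a rough sense of how optimistic the in-sample fit is.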
Variable Selection
Selecting a subset of predictor variables from a larger set (e.g., stepwise selection) is a controversial topic. You can perform stepwise selection (forward, backward, both) using the stepAIC( ) function from the MASS package. stepAIC( ) performs stepwise model selection by exact AIC.
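As a sketch of stepwise selection, the example below uses base R's step() function from the stats package instead of stepAIC( ), so it runs without loading MASS; step() also selects by AIC and accepts a direction argument. The mtcars data and the starting set of predictors are illustrative assumptions.

```r
# Start from a model with five candidate predictors (illustrative choice)
full <- lm(mpg ~ wt + hp + disp + drat + qsec, data = mtcars)

# Stepwise selection in both directions, judged by AIC; trace = 0 hides the log
selected <- step(full, direction = "both", trace = 0)

formula(selected)   # the formula of the selected model
selected$anova      # the sequence of terms dropped or added
```

Setting trace = 1 prints the AIC of every candidate at each step, which helps when checking why a term was dropped.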
Alternatively, you can perform all-subsets regression using the leaps( ) function from the leaps package. In the following code nbest indicates the number of subsets of each size to report. Here, the ten best models will be reported for each subset size (1 predictor, 2 predictors, etc.).
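To illustrate what all-subsets regression does, here is a brute-force sketch in base R that does not use the leaps package. It enumerates every subset with combn() and keeps the best nbest models of each size by adjusted R-squared. The mtcars data, the four candidate predictors, and nbest = 2 are assumptions made for the example.

```r
predictors <- c("wt", "hp", "disp", "drat")  # candidate predictors (illustrative)
nbest <- 2                                   # models to keep per subset size

results <- do.call(rbind, lapply(seq_along(predictors), function(size) {
  # Every combination of `size` predictors
  combos <- combn(predictors, size, simplify = FALSE)
  rows <- lapply(combos, function(vars) {
    fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
    data.frame(size  = size,
               model = paste(vars, collapse = " + "),
               adjr2 = summary(fit)$adj.r.squared)
  })
  out <- do.call(rbind, rows)
  out <- out[order(-out$adjr2), ]  # best models first
  head(out, nbest)
}))

print(results, row.names = FALSE)
```

Brute-force enumeration grows exponentially with the number of predictors, which is why dedicated packages use a more efficient search.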
Other options for plot( ) are bic, Cp, and adjr2.

Relative Importance
The relaimpo package provides measures of relative importance for each of the predictors in the model. See help(calc.relimp) for details on the four measures of relative importance provided.
Graphic Enhancements
The car package offers a wide variety of plots for regression, including added variable plots and enhanced diagnostic plots and scatterplots.

Going Further

Nonlinear Regression
Nonlinear regression is available through the nls( ) function, which ships with R in the stats package. See John Fox's Nonlinear Regression and Nonlinear Least Squares for an overview. Huet and colleagues' Statistical Tools for Nonlinear Regression: A Practical Guide with S-PLUS and R Examples is a valuable reference book.

Robust Regression
There are many functions in R to aid with robust regression. For example, you can perform robust regression with the rlm( ) function in the MASS package. John Fox's (who else?) Robust Regression provides a good starting overview. The UCLA Statistical Computing website has Robust Regression Examples. The robust package provides a comprehensive library of robust methods, including regression. The robustbase package also provides basic robust statistics including model selection methods. And David Olive has provided a detailed online review of Applied Robust Statistics with sample R code.

To Practice
This course in machine learning in R includes exercises in multiple regression and cross validation.

How to plot two different regression lines in R?
19.2 Two Regression Lines in Basic R
The regression line is drawn with the function abline( ), which takes the output of lm( ) (for linear model) as its argument. The syntax is: abline(lm(y-coordinate ~ x-coordinate)). We will use the same colors as those used in the scatterplot to differentiate the two regression lines.
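A minimal sketch of two regression lines in base R, using two simulated groups (the data, seed, and colors are assumptions chosen for illustration):

```r
# Two simulated groups with different trends (illustrative data)
set.seed(7)
x_a <- runif(30, 0, 10)
y_a <- 2 + 1.5 * x_a + rnorm(30)
x_b <- runif(30, 0, 10)
y_b <- 15 - 0.8 * x_b + rnorm(30)

# Scatterplot of both groups, one color per group
plot(x_a, y_a, col = "blue", pch = 16,
     xlim = c(0, 10), ylim = range(c(y_a, y_b)),
     xlab = "x", ylab = "y")
points(x_b, y_b, col = "red", pch = 17)

# One regression line per group, in the matching color
fit_a <- lm(y_a ~ x_a)
fit_b <- lm(y_b ~ x_b)
abline(fit_a, col = "blue")
abline(fit_b, col = "red")

legend("top", legend = c("Group A", "Group B"),
       col = c("blue", "red"), pch = c(16, 17), lty = 1)
```

Setting ylim from both groups before plotting keeps the second group's points and line inside the plotting region.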
How to make a multiple regression model in R?
Step 1 - Install the necessary libraries.
Step 2 - Read a csv file and do EDA (Exploratory Data Analysis).
Step 3 - Plot a scatter plot between x and y.
Step 4 - Train and Test data.
Step 5 - Create a linear regression model.
Step 6 - Add regression line to the plot.
Step 7 - Make predictions on the test dataset.

How to draw a regression line in R?
A regression line will be added on the plot using the function abline(), which takes the output of lm() as an argument. You can also add a smoothing line using the function loess().
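The sketch below combines both ideas on the height and bodymass data from earlier in this article: a straight regression line from abline() and a smoothing line from loess(). The colors, line types, and the default loess settings are illustrative choices.

```r
height   <- c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
bodymass <- c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

plot(bodymass, height, pch = 16, col = "blue",
     xlab = "BODY MASS (kg)", ylab = "HEIGHT (cm)")

# Straight regression line from the fitted linear model
fit <- lm(height ~ bodymass)
abline(fit, col = "red")

# Smoothing line: fit loess, then draw its fitted values in order of x
smooth <- loess(height ~ bodymass)
ord <- order(bodymass)
lines(bodymass[ord], fitted(smooth)[ord], col = "darkgreen", lty = 2)

legend("bottomright", legend = c("lm", "loess"),
       col = c("red", "darkgreen"), lty = c(1, 2))
```

The points must be sorted by x before calling lines(), otherwise the smoothing curve zigzags across the plot.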
Can we plot multiple linear regression?
At the center of the multiple linear regression analysis lies the task of fitting a single line through a scatter plot. More specifically, the multiple linear regression fits a line through a multi-dimensional cloud of data points. The simplest form has one dependent and two independent variables.
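With two independent variables the fitted model is a plane rather than a line, so it cannot be drawn directly on an ordinary scatterplot. One common workaround, sketched here under the assumption of the built-in mtcars data and an illustrative two-predictor model, is to plot observed values against fitted values:

```r
# Two-predictor model on the built-in mtcars data (illustrative choice)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Observed vs fitted values; points near the 45-degree line are well predicted
plot(fitted(fit), mtcars$mpg, pch = 16,
     xlab = "Fitted mpg", ylab = "Observed mpg")
abline(0, 1, col = "red")  # reference line with intercept 0 and slope 1
```

This view works for any number of predictors, since it collapses the model to a single fitted value per observation.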