Regression analysis in Excel - javatpoint

Regression analysis is a statistical modeling technique used to estimate the relationship between two or more variables. MS Excel supports several statistical analyses, including regression, which makes it a convenient choice because almost every computer user has access to Excel.

Excel has a built-in regression tool. It is part of the Data Analysis ToolPak, which appears at the far right of the Data tab once it is enabled. The ToolPak is an add-in, so you have to add it to Excel from Add-ins first.

Before going deeper, you need to know what regression analysis is, which types of variables it uses, and a few other basics. This chapter explains all of these terms, so read it through to the end.

Regression analysis shows the relationship between dependent and independent variables and produces an equation. The coefficients in this equation quantify how the dependent variable changes with the independent variables.

In simple linear regression, the value of one variable is used to describe the value of another. The variable being described is called the dependent variable, while the variable used to describe or predict it is called the independent variable.

Regression analysis involves two types of variables:

The dependent variable is the factor that we try to understand and predict. Its value varies according to the independent variable. In contrast, independent variables are the factors that affect the dependent variable and help predict its value.

Let's take a scenario to understand the variables of regression analysis.

For example, suppose we have 12 months of sales data stored in an Excel worksheet. The data covers umbrella sales from January to December. Each month's sales differ according to rainfall: umbrellas sell best in July and sell less in January.

The worksheet contains three columns: Month (Jan to Dec), Rainfall percentage, and Umbrellas sold (the total number of umbrellas sold in each month).

Here,

Dependent variable: Umbrellas sold

Independent variable: Rainfall percentage

So, umbrella sales are the dependent variable: they depend on each month's rainfall percentage, which is the independent variable. Umbrella sales rise when rainfall is high and fall when it is low.

Before proceeding, verify that the Data Analysis ToolPak is enabled. Go to the Data tab and look for the Data Analysis button at the far right of the ribbon.

If the Data Analysis button is not there, enable the ToolPak by following the steps below.

Step 1: In your active Excel workbook, click File in the Excel menu bar.

Step 2: Under More... in the left sidebar, click Options. This opens a panel containing various settings.

Step 3: In the Excel Options panel, click Add-ins in the left sidebar.

Step 4: Verify that "Excel Add-ins" is selected in the Manage dropdown list, then click Go next to the dropdown.

Step 5: In the Add-ins dialog box, check Analysis ToolPak and click OK.

Step 6: Close any dialogs that are still open and confirm that the Data Analysis button now appears in the Data tab.

Your Excel is now ready for regression analysis. The steps below run it, one step at a time, on the umbrella sales worksheet from the scenario above.

Step 1: In the Data tab, click the Data Analysis button that was added in the earlier steps.

Step 2: Scroll down, select Regression from the list, and click OK.

Step 3: Configure the settings in the Regression dialog box. The main ones are Input Y Range (the dependent variable, umbrellas sold), Input X Range (the independent variable, rainfall), the Labels checkbox if the ranges include headers, the output location, and the Residuals checkbox.

Step 4: Enter these details carefully and click OK.

With these settings, Excel generates a summary of the analysis in Sheet2.

Step 5: Review the output created by the regression analysis.

The summary output contains, most importantly, the REGRESSION STATISTICS, ANOVA, and RESIDUAL OUTPUT sections, all on the same sheet.

We have now performed regression analysis, and as you have seen, running it is easy. Nothing difficult is required of you because all the calculations happen automatically and the complete output is generated for you.

The calculation is easy, but the interpretation is harder, so it is now time to interpret the result. The output contains four major parts: regression statistics, ANOVA, coefficients, and residual output. Let's analyze them:
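For readers curious about what happens behind the automatic calculation, here is a minimal Python sketch of an ordinary least-squares fit with one independent variable. It is an illustration of the standard formulas, not Excel's actual implementation. The function name `fit_line` and the toy data are assumptions made up for this example and are not the umbrella worksheet.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data (NOT the article's worksheet): points lying exactly on y = 2x + 1.
toy_x = [1.0, 2.0, 3.0, 4.0]
toy_y = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(toy_x, toy_y)
print(slope, intercept)
```

Because the toy points lie exactly on a line, the fit recovers that line's slope and intercept. With real data such as monthly sales, the points scatter around the line and the fit is only approximate.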

Regression Statistics

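Regression statistics tell you how well the linear regression equation fits the data.

The regression statistics table reports Multiple R (the correlation coefficient), R Square (the share of the variation in the dependent variable that the model explains), Adjusted R Square, Standard Error, and Observations (the number of data points).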
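To make the idea of "fit" concrete, the sketch below computes R Square, the most commonly quoted regression statistic, as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the three toy points are assumptions for illustration only. The slope 0.5 and intercept 1.0 passed in are the least-squares fit for those toy points, not values from the umbrella data.

```python
def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys explained by the line y = slope * x + intercept."""
    mean_y = sum(ys) / len(ys)
    # Total variation of the observed values around their mean.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    # Variation left unexplained by the fitted line.
    ss_residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total


# Hypothetical toy data; 0.5 and 1.0 are its least-squares slope and intercept.
toy_x = [1, 2, 3]
toy_y = [1, 3, 2]
print(r_squared(toy_x, toy_y, 0.5, 1.0))
```

A value close to 1 means the line explains most of the variation, while a value close to 0 means it explains very little.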

ANOVA

The next part of the output is ANOVA, the analysis of variance. The coefficients section follows the ANOVA table.
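As a rough illustration of what an ANOVA table summarizes, the sketch below splits the variation in the dependent variable into a part explained by the regression and a residual part, then forms the F statistic from the two mean squares. It assumes a single independent variable, so the regression has 1 degree of freedom and the residual has n - 2. The function name `anova_table`, the dictionary keys, and the toy data are assumptions for this example. The Significance F value that Excel also reports needs the F distribution and is not computed here.

```python
def anova_table(xs, ys, slope, intercept):
    """Sums of squares, degrees of freedom and F for a one-variable linear regression."""
    n = len(ys)
    if n < 3:
        raise ValueError("need at least three observations")
    mean_y = sum(ys) / n
    predicted = [slope * x + intercept for x in xs]
    # Variation explained by the fitted line.
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)
    # Variation left over after the fit.
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    df_regression = 1
    df_residual = n - 2
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "df_regression": df_regression,
        "df_residual": df_residual,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "f": ms_regression / ms_residual,
    }


# Hypothetical toy data; 0.5 and 1.0 are its least-squares slope and intercept.
table = anova_table([1, 2, 3], [1, 3, 2], 0.5, 1.0)
print(table)
```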

The most essential component after the ANOVA table is the coefficients section. It lets you build the linear regression equation, i.e.,

y = bx + a

where b is the slope (the coefficient of x) and a is the intercept.

For our dataset of months, rainfall, and umbrellas sold, the formula will be:

Umbrellas sold = b * Rainfall + a

Put the values from the coefficients table into this formula:

y = 0.327*x - 15.417

Now substitute x = Rainfall (mm) for any month. For example, the January rainfall is 76, so:

y = 0.327*76 - 15.417

y = 9.435

This is the predicted number of umbrellas sold in January. Similarly, you can predict how many umbrellas will be sold in any month by plugging in its rainfall value.
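The same prediction can be sketched in a few lines of Python. The slope 0.327 and intercept -15.417 are the rounded coefficients used in the calculation above. The constant and function names are assumptions for illustration.

```python
# Rounded coefficients taken from the worked example above.
SLOPE = 0.327
INTERCEPT = -15.417


def predict_umbrellas(rainfall):
    """Predicted umbrellas sold for a given rainfall value, using y = SLOPE * x + INTERCEPT."""
    return SLOPE * rainfall + INTERCEPT


january = predict_umbrellas(76)
print(round(january, 3))  # 9.435
```

Calling `predict_umbrellas` with another month's rainfall value gives the prediction for that month.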

RESIDUAL OUTPUT

The last part is the residual output, which shows the difference between the actual and estimated values. If you compare the two values for the number of umbrellas sold each month, you will see a slight difference between them.

For example, compare the actual number of umbrellas sold in January with the predicted value:

Actual umbrellas sold in January: 12

Predicted umbrellas sold in January: 9.486 (this differs slightly from the 9.435 computed above, most likely because Excel uses the unrounded coefficients)

The difference between the actual and predicted values appears in the Residuals column:

12 - 9.486 = 2.514

You can match it against the RESIDUAL OUTPUT table.
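The residual calculation is simple enough to sketch directly. The example below uses the January figures quoted above (actual 12, predicted 9.486). The function name `residual` is an assumption for illustration.

```python
def residual(actual, predicted):
    """Residual = actual value minus predicted value."""
    return actual - predicted


# January figures quoted in the text above.
january_residual = residual(12, 9.486)
print(round(january_residual, 3))  # 2.514
```

A positive residual means the model predicted too few umbrellas for that month, and a negative one means it predicted too many.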

You can also plot the values on a chart to see the relationship between the two variables. To do so, draw a linear regression chart.

Step 1: Open Sheet1 in the same Excel workbook and select the independent and dependent variable columns, including their headers.

Step 2: Go to the Insert tab, open the Charts group, and choose Scatter (the first option in the list). In short: Insert > Charts group > Scatter.

Step 3: A scatter plot is inserted into the active worksheet.

Step 4: Now add a least-squares regression line to the chart. Right-click any point in the chart and choose Add Trendline... from the context menu.

Step 5: In the Format Trendline panel on the right, select the Linear trendline option.

Step 6: Scroll down the Format Trendline panel and, optionally, check Display Equation on chart to show the regression formula.

The regression equation now appears on the chart.

Step 7: Switch to the Fill & Line option to customize the line. You can change its color and type here, e.g., use a solid line instead of a dashed one.

The chart now shows the customized linear regression line.

You can improve the graph further, for example by adding horizontal and vertical axis titles.


© Copyright 2011-2021 www.javatpoint.com. All rights reserved. Developed by JavaTpoint.