REVERSE ENGINEERING USING: LINEAR REGRESSION ANALYSIS & Microsoft Excel 2000
Fred Finkelstein, PE, M.ScCS, A.S.Psy (LION)
Director of Process Engineering at Iron Designer LLC.
LINEAR REGRESSION
Analysis
by FRED FINKELSTEIN, P.E. MSCS, BSME BSCS EIT
Iron Designer LLC
Engineering Services Co.
Feb 11th 2021
ABSTRACT:
Linear Regression is a mathematical method that is used by engineers frequently to reverse engineer data from which comes from an unknown source. Such data collection from an engineering system, machine, function, prototype signal or vibrational mechanism for example requires that through engineering the data be analyzed for system level predictive analysis to within a tolerance band. An example is shown in Figure A of such a set of data that an engineer would collect and use linear regression to determine a predictive model within a tolerance band to assist the engineer in understanding and designing the system response elements, i.e. a damper, spring, sensing unit, communications module, etc. Though many other forms of regression exist, the introductory example is the linear example. This paper will provide the derivation of the equations and an example of how a spread sheet, such as Microsoft Excel can assist the engineer in developing the model equation (red line) to provide the engineer with the useful equation of the predictive model and compute the tolerance band (upper and lower) parallel band lines to determine the envelope of design for the engineer’s final result.??
Linear Regression Analysis Example Diagram:
Figure A: an example of data incorporating the model of linear regression as a predictor of trend, slope and central limit of “best” fit.
In linear regression, the relationships are modeled using?linear predictor functions ?whose unknown model?parameters ?are?estimated ?from the?data . Such models are called?linear models .[3] ?Most commonly, the?conditional mean ?of the response given the values of the explanatory variables (or predictors) is assumed to be an?affine function ?of those values; less commonly, the conditional?median ?or some other?quantile ?is used. Like all forms of?regression analysis , linear regression focuses on the?conditional probability distribution ?of the response given the values of the predictors, rather than on the?joint probability distribution ?of all of these variables.
Derivation of Linear Regression Equations:
The basic equation is the algebraic equation of a straight line:
The model of the regression line must lay with a slope and an intercept such that the residuals; the vertical (positive & negative) distances between the data points and the model line at any given point x that corresponds to a data point (x,y) is minimized and the sum of all residuals is = zero.
The residual is given by the equation:
Next the sum of the residuals must be computed by the following:
We must find the constants of regression of our model:???ao & a1
We must take the derivative of each summation, such that each derivative has a slope of “zero”, meaning that there is no “gain” or “loss” in the residual and the balance of the residual summation is equal.
Now, we have 2 equations and 2 unknowns;
So, this is solvable first using the “chain rule” in calculus and then using algebra to solve for the variables (ao & a1), we will close by using a matrix and Cramer’s rule to solve the final equations.
Cancel the –2 in both equations and move factors across the “=” to balance equations:
Next step is to correct equation (1). Since the summation is to “n”, then n * a0??is the summation of this portion, since a0?is a constant. And we can factor out?a1??since this too is a constant.
Using Matrix Algebra:
Using?Cramer’s Rule to solve the matrix:
are the average values, as denoted by the division of n in the equations, therefore a0 is the difference of the mean values, which derives the final equation of the trendline as follows:
y = a0 + a1x
the equation of the trendline.
-end derivation of the trendline from the derivatives of the residual sum of squares summations, i.e. the least squares function -
Simple Example:??“ORIGINAL COLLECTED DATA”
Figure 1: simple data table with 3 pairs of points
Figure 2: graph of the data set in Figure 1
领英推荐
Question is, how was the equation (y = 0.25x + 1.5) developed in the Microsoft Excel Trend Line (red line) and how did they develop the residual R2 in the equation. The residual gives the confidence of the Trend Line (red line) to the data, as an approximation of Least Squares to the Trend Line. This paper will explain the procedure to develop the Trend Line Equation for this set of data and any set of data points (pairs).
Create a table as shown, the headers will remain the same for any type of linear regression model.
The value in this example in the upper left corner 3.000, above the n is just the value for the number of pairs in this model, i.e. 3, but it also is called “k” in some text books, and in this example it is called “n”.
NOTE: the columns x3 and x4, i.e. log(x) & log(y) are not need for linear regression, but would be helpful for exponential regression, we will leave them here since we already have them, but note, they do not play a role in linear regression for tis type of modeling.
Also, NOTE: the column “copy-x was just used to more easily graph the data and is also not mathematically required, it is just there to simplify graphing.
Step 2:
NOTE: Several variable names are just randomly chosen and have no effect on the algorithm, the student can choose any variable names they choose.
NOTE: The student should understand that substitutions of names such as SUMX = sum(x) and SUMX2 = sum(x2) = sum(x^2), all mean the same thing, so be careful to NOTICE and make note of these substitutions **, as they will interchange often in this paper.
** i.e. k = n = K = N = 3 = 3.000, all mean the same thing in this example.
FORMULAS AND EQUATIONS FOR THE REMAINDER OF THE
LINEAR REGRESSION ANALYSIS
?
?
Step 3:?“TEST EXAMPLE”
?
?
?
Step 4:?“VERIFICATION of TRENDLINE”
NOTES:??
·??????In the more familiar: Y = mx + b, make note of the following to prevent confusion.
·??????b = A = ST6
·??????m = B = ST5
?
y = 1.5 +?(0.25)(x)
Equation: 3a – “equation of the TREND LINE”
Student’s note: the student should be able to distinguish between X & x and also Y and y, not that it makes any difference or distinction, other than symbols are usually “overloaded” and that x & y can mean different things in different contexts, the student must be careful to know which “x” is which and which “y” is which.
Step 5:??“TEST VALUES in a TABLE”, note: these values test the TREND LINE to verify that it “maps” and corresponds to the computed values in Step 3: and Step 4: ( a good student will ALWAYS “back check” all of their work, and leave nothing to chance or believe anything they are told, until they check it, 100%)
Step 6: Graphical Verification of the TREND LINE
Step 8:??CONCLUSIONS
Well, everything looks ok, right? After all the vertical deltas [(y2-y1) = 0], so if its zero, it must be right. Right? Not exactly. What’s wrong then ?
My Analysis:
-- end of paper --?
@copyright 2021
all rights reserved, Microsoft Excel 2000 was used to complete this analysis.
Iron Designer Engineering Services LLC
this report may not be copied (owner: Fred Finkelstein, P.E.)
*** if your company needs help with mathematical simulations, or business analysis, I have over 35 years of experience in Analytical design. Feel FREE to reach out to me to discuss a contract with IRONDesigner LLC @2021.
If your company is looking for an executive level board member to provide guidance, direction, business tactic, strategies, mergers, acquisitions, or to grow or dissolve, feel free to contact me for inclusion in your executive team
?
-------------
DISCLAIMER
Fair Use Notice: This blog may contain some copyrighted material whose use has not been authorized by the copyright owners. We believe that this not-for-profit, educational, and/or criticism or commentary use on the Web constitutes a fair use of the copyrighted material (as provided for in section 107 of the US Copyright Law. If you wish to use this copyrighted material for purposes that go beyond fair use, you must obtain permission from the copyright owner. Fair Use notwithstanding we will immediately comply with any copyright owner who wants their material removed or modified, wants us to link to their web site, or wants us to add their photo.