Saturday with Math (Jul 6th)

In some of your studies, you may have needed to project Revenue, CapEx, or OpEx over a period longer than the time series presented in the DFCF. When a single variable depends on another, such as revenue driven by 5G accesses, tools like Excel and its single-variable linear estimation formulas are widely used. However, when several drivers act together, such as FTTx CapEx depending on both Homes Passed and Homes Connected, separating the contribution of each driver becomes a challenge. The solution, as shown in the figure below, is the use of Least Squares. [1], [3], [4], [5]

Least Squares Application Example

This week, we present another exceptional set of tools with applications in a wide range of domains: Finance and Economics; Engineering; Physics and Chemistry; Biology and Medical Imaging; Machine Learning; Geodesy and Surveying; Quality Control; Computer Science and Data Science; Environmental Science and Geoscience; Social Sciences; and Telecommunications and Signal Processing. These tools include Least Squares and its variations.

The method of least squares is a parameter estimation technique in regression analysis that minimizes the sum of the squares of the residuals, which are the differences between observed values and the values predicted by a model. This technique is predominantly used in data fitting, where it is crucial for obtaining the best-fit line or curve that describes the data accurately.

Least squares problems are categorized into linear or ordinary least squares and nonlinear least squares, depending on whether the model functions are linear or not. Linear least squares problems have closed-form solutions, whereas nonlinear least squares problems typically require iterative refinement, approximating the system linearly at each iteration. [3], [4]

It is important to mention that although linear regression and least squares have similar purposes, there are essential differences between them. Linear regression is a statistical technique that models the relationship between variables, estimating how a scalar response depends on one or more explanatory variables. Simple linear regression involves one explanatory variable, while multiple linear regression involves more than one. Multivariate linear regression predicts multiple correlated dependent variables, and when explanatory variables are measured with error, errors-in-variables models are used.

Conversely, least squares regression is a method within linear regression used to estimate the parameters of the linear model by minimizing the sum of squared errors. Essentially, while linear regression defines the relationship between variables, the least squares method is used to achieve the best fit for this relationship.

Because least squares has such a wide range of applications, different variants are best suited to specific purposes: Ordinary Least Squares (OLS) and Nonlinear Least Squares (NLS) for basic applications; Weighted Least Squares (WLS) and Structured Least Squares (SLS) for weighted and structured approaches; Generalized Least Squares (GLS) and Total Least Squares (TLS) for advanced and generalized error models; Ridge Regression (Regularized Least Squares) and Lasso Regression (Least Absolute Shrinkage and Selection Operator) for regularization; Robust Regression and Iteratively Reweighted Least Squares (IRLS) for robust and iterative methods; and Partial Least Squares (PLS) and Bayesian Least Squares for specialized techniques. Each is described below.

Ordinary Least Squares (OLS): It is a method for estimating the unknown parameters in a linear regression model by minimizing the sum of the squared differences between the observed dependent variable (y) and the values predicted by the linear function of the independent variable(s). OLS assumes that the relationship between the dependent and independent variables is linear and provides a closed-form solution, allowing direct computation of parameter estimates using matrix algebra. The method seeks to minimize the sum of the squared residuals, where a residual is the difference between an observed value and the value predicted by the model. OLS is based on the assumptions that the errors (residuals) are normally distributed, have constant variance (homoscedasticity), and are uncorrelated. This method is widely used in fields such as economics, finance, engineering, and social sciences to model relationships between variables and make predictions.

Brief Explanation of OLS
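
As a minimal sketch of OLS in practice, the Python snippet below fits a straight line to synthetic data; the coefficients, noise level, and sample size are illustrative choices, not values from this article:

```python
import numpy as np

# Synthetic data: y = 2 + 3x + noise (illustrative values only)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0, 1.0, 50)

# Design matrix with an intercept column
X = np.column_stack([np.ones_like(x), x])

# OLS estimate beta = (X^T X)^{-1} X^T y; lstsq solves this stably via SVD
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("intercept, slope:", beta)
```

The closed form could also be computed directly from the normal equations, but `lstsq` avoids explicitly inverting X^T X, which is numerically safer.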

Nonlinear Least Squares (NLS): It is used for estimating the parameters of a nonlinear model, where the relationship between the dependent and independent variables is nonlinear. Unlike OLS, NLS involves iterative numerical methods, such as the Gauss-Newton algorithm or Levenberg-Marquardt algorithm, to minimize the sum of the squared residuals, as there is generally no closed-form solution. Initial guesses for the parameters are required, and the solution is refined through successive iterations. NLS is applied in various fields, including physics, chemistry, biology, and economics, where relationships between variables are inherently nonlinear, such as in growth models, reaction kinetics, and dose-response curves.
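
As a hedged illustration of NLS, the sketch below fits an assumed exponential growth model with SciPy's `curve_fit`, which refines an initial guess `p0` iteratively (Levenberg-Marquardt by default for unconstrained problems). The model form and all parameter values are synthetic:

```python
import numpy as np
from scipy.optimize import curve_fit

# Assumed nonlinear model: y = a * exp(b * t)
def model(t, a, b):
    return a * np.exp(b * t)

rng = np.random.default_rng(1)
t = np.linspace(0, 4, 40)
y = model(t, 2.5, 0.6) + rng.normal(0, 0.3, t.size)

# Iterative refinement from the initial guess p0; no closed form exists
params, cov = curve_fit(model, t, y, p0=[1.0, 0.1])
print("a, b:", params)
```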

Weighted Least Squares (WLS): It is a generalization of OLS that accounts for different variances in the observations by assigning weights to each observation. WLS minimizes the weighted sum of the squared residuals, giving more importance to observations with smaller variances. This method is particularly useful when the variance of the errors differs across observations (heteroscedasticity). WLS is commonly used in econometrics, experimental design, and quality control, especially when dealing with data that have non-constant variance or different levels of precision.
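
A minimal WLS sketch, assuming the weights are the inverse error variances (here, synthetic heteroscedastic noise whose standard deviation grows with x):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(1, 10, 60)
sigma = 0.2 * x                          # noise std grows with x
y = 1.0 + 0.5 * x + rng.normal(0, sigma)

X = np.column_stack([np.ones_like(x), x])
w = 1.0 / sigma**2                       # weights = inverse variances

# Weighted normal equations: beta = (X^T W X)^{-1} X^T W y
XtW = X.T * w                            # scales each observation by w
beta = np.linalg.solve(XtW @ X, XtW @ y)
print("intercept, slope:", beta)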

Structured Least Squares (SLS): It is a method used for estimating parameters in models that have a specific structure or constraints. By incorporating known constraints or structures in the model, such as sparsity, symmetry, or other relationships among parameters, SLS can provide more accurate and reliable parameter estimates than standard methods. The estimation process may involve specialized algorithms that leverage the model's structure to enhance computational efficiency. SLS is used in fields like signal processing, control systems, and machine learning, where models often have inherent structures that can be exploited for better estimation.
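
Structured least squares covers many problem-specific formulations; as one simple, hedged example, the sketch below imposes a nonnegativity structure on the coefficients (a mixing-model assumption, chosen here for illustration) using SciPy's NNLS solver:

```python
import numpy as np
from scipy.optimize import nnls

# Mixing problem: the observation is a nonnegative combination of sources,
# so the known structure (beta >= 0) is built into the estimator
rng = np.random.default_rng(3)
A = rng.random((30, 3))
true_coef = np.array([0.7, 0.0, 1.2])    # illustrative mixture weights
b = A @ true_coef + rng.normal(0, 0.01, 30)

coef, residual_norm = nnls(A, b)
print("nonnegative estimate:", coef)
```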

Generalized Least Squares (GLS): It extends the OLS method to handle cases where there are correlations among the residuals or the residual variances are not constant (heteroscedasticity). GLS aims to provide the best linear unbiased estimates (BLUE) of the coefficients by transforming the original model to account for these issues. The method involves weighting the observations and transforming the data so that the transformed residuals satisfy the standard OLS assumptions of uncorrelated and homoscedastic errors. GLS is particularly useful in econometrics and time series analysis, where the assumptions of OLS are often violated.
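
A minimal GLS sketch, assuming the error covariance matrix is known (here a synthetic AR(1)-style correlation). Whitening by the Cholesky factor of the covariance reduces the problem to OLS on transformed data:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
x = np.linspace(0, 5, n)
X = np.column_stack([np.ones(n), x])

# AR(1)-style error covariance: corr(e_i, e_j) = rho^|i-j|
rho = 0.8
Sigma = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
y = 1.0 + 2.0 * x + rng.multivariate_normal(np.zeros(n), Sigma)

# GLS: beta = (X^T Sigma^{-1} X)^{-1} X^T Sigma^{-1} y,
# implemented by whitening with the Cholesky factor of Sigma
L = np.linalg.cholesky(Sigma)
Xw = np.linalg.solve(L, X)               # L^{-1} X
yw = np.linalg.solve(L, y)               # L^{-1} y
beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
print("intercept, slope:", beta)
```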

Total Least Squares (TLS): It is a method used when there are errors in both the dependent and independent variables. Unlike OLS, which assumes that only the dependent variable has measurement errors, TLS accounts for errors in both sets of variables. This method minimizes the orthogonal distances from the data points to the fitted model, rather than the vertical distances minimized by OLS. TLS is often used in fields such as geodesy, computer vision, and system identification, where measurement errors can affect all variables.
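
A compact TLS sketch for a line fitted to centered data, using the classical SVD construction: the direction of least variance (last right-singular vector) is the normal to the best orthogonal-fit line. Data and noise levels are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x_true = np.linspace(0, 10, n)
# Measurement error in BOTH variables, as TLS assumes
x = x_true + rng.normal(0, 0.5, n)
y = 2.0 * x_true + rng.normal(0, 0.5, n)

C = np.column_stack([x, y])
C = C - C.mean(axis=0)                   # center to absorb the intercept
_, _, Vt = np.linalg.svd(C)
v = Vt[-1]                               # normal to the TLS line
slope_tls = -v[0] / v[1]
print("TLS slope:", slope_tls)
```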

Ridge Regression (Regularized Least Squares): Also known as Tikhonov regularization, it is a technique used to address multicollinearity in multiple regression models, where the independent variables are highly correlated. It modifies the least squares criterion by adding a penalty equal to the sum of the squares of the coefficients, scaled by a regularization parameter. This penalty shrinks the coefficients towards zero, reducing their variance and improving the model's stability. Ridge regression is widely used in machine learning, particularly in situations where the number of predictors exceeds the number of observations.
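
A minimal sketch of the ridge closed form on synthetic, nearly collinear predictors; the regularization strength `lam` is an arbitrary tuning choice:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 40, 5
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + rng.normal(0, 0.01, n)   # two nearly collinear columns
y = X @ np.array([1.0, 1.0, 0.5, 0.0, 0.0]) + rng.normal(0, 0.1, n)

lam = 1.0  # regularization strength (illustrative tuning value)

# Ridge closed form: beta = (X^T X + lam * I)^{-1} X^T y
beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
print("ridge coefficients:", beta)
```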

Lasso Regression (Least Absolute Shrinkage and Selection Operator): It is a type of linear regression that includes a penalty equal to the absolute value of the magnitude of the coefficients. This technique not only helps in regularizing the model to avoid overfitting but also performs variable selection by shrinking some coefficients exactly to zero. As a result, Lasso can produce simpler, more interpretable models that include only the most significant predictors. It is commonly used in statistics, machine learning, and bioinformatics for high-dimensional data analysis.
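
A short, hedged example with scikit-learn's `Lasso` on synthetic high-dimensional data, where only the first three predictors carry signal; `alpha` is an illustrative penalty value:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
n, p = 100, 20
X = rng.normal(size=(n, p))
# Only the first three predictors actually matter
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.8 * X[:, 2] + rng.normal(0, 0.1, n)

model = Lasso(alpha=0.1)                 # alpha scales the L1 penalty
model.fit(X, y)

# The L1 penalty drives irrelevant coefficients exactly to zero
print("nonzero predictors:", np.flatnonzero(model.coef_))
```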

Robust Regression: These methods are designed to be less sensitive to outliers than OLS. They aim to provide parameter estimates that are not unduly influenced by a small number of extreme values in the dataset. Techniques such as M-estimators, R-estimators, and Least Absolute Deviations (LAD) fall under this category. Robust regression is useful in practical data analysis where data may not meet the ideal conditions required for OLS.
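
As one hedged illustration of robust regression, the sketch below compares OLS with scikit-learn's Huber M-estimator on synthetic data contaminated by gross outliers:

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(8)
x = np.linspace(0, 10, 50)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, 50)
y[-5:] += 30.0                           # gross outliers at large x

X = x.reshape(-1, 1)
ols = LinearRegression().fit(X, y)
huber = HuberRegressor().fit(X, y)       # an M-estimator with Huber loss

print("OLS slope (pulled up by outliers):", ols.coef_[0])
print("Huber slope (less affected):      ", huber.coef_[0])
```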

Partial Least Squares (PLS): It is a method that combines features from principal component analysis and multiple regression. It is particularly useful when the predictors are highly collinear or when the number of predictors exceeds the number of observations. PLS projects the predictors into a new space and performs regression on these projections, capturing the most relevant information. PLS is widely used in chemometrics, genomics, and other fields involving high-dimensional data.
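
A minimal PLS sketch with scikit-learn on synthetic data where 100 predictors are driven by two latent factors, deliberately with more predictors than observations:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(9)
n, p = 30, 100                           # more predictors than observations
latent = rng.normal(size=(n, 2))         # two underlying factors
X = latent @ rng.normal(size=(2, p)) + rng.normal(0, 0.1, (n, p))
y = latent @ np.array([1.5, -2.0]) + rng.normal(0, 0.1, n)

# PLS projects X onto a few components that covary with y, then regresses
pls = PLSRegression(n_components=2)
pls.fit(X, y)
print("R^2 on training data:", pls.score(X, y))
```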

Iteratively Reweighted Least Squares (IRLS): It is an iterative method for solving least squares problems where the weights of the observations are updated in each iteration based on the residuals from the previous iteration. This technique is often used in robust regression to minimize the influence of outliers, by assigning lower weights to observations with larger residuals. IRLS is applicable in various fields, including statistics, econometrics, and machine learning, where robustness against outliers is crucial.
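
A hand-rolled IRLS sketch with Huber-style weights; the loop structure is the point, while `delta`, the iteration count, and the data are illustrative:

```python
import numpy as np

def irls(X, y, n_iter=20, delta=1.0):
    """IRLS with Huber-style weights: downweight large residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # start from OLS
    for _ in range(n_iter):
        r = y - X @ beta
        absr = np.maximum(np.abs(r), 1e-12)        # avoid division by zero
        w = np.minimum(1.0, delta / absr)          # Huber weights
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ y)   # weighted LS step
    return beta

rng = np.random.default_rng(10)
x = np.linspace(0, 10, 50)
y = 1.0 + 2.0 * x + rng.normal(0, 0.3, 50)
y[::7] += 15.0                                     # inject outliers
X = np.column_stack([np.ones_like(x), x])
print("IRLS intercept, slope:", irls(X, y))
```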

Bayesian Least Squares: It incorporates prior information about the parameters into the estimation process, combining it with the likelihood of the observed data to produce a posterior distribution of the parameters. This approach is particularly useful when prior knowledge or expert opinion is available and can be quantified. Bayesian least squares is used in various fields, such as economics, finance, and engineering, where incorporating prior information can improve the estimation process.
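
A minimal sketch of the conjugate Gaussian case, assuming a zero-mean Gaussian prior on the coefficients and a known noise variance; both `sigma2` and `tau2` are assumed illustrative values:

```python
import numpy as np

rng = np.random.default_rng(11)
x = np.linspace(0, 5, 15)                # deliberately few observations
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, 15)
X = np.column_stack([np.ones_like(x), x])

sigma2 = 0.25    # assumed noise variance
tau2 = 4.0       # assumed prior variance per coefficient (prior mean zero)

# Gaussian prior + Gaussian likelihood -> Gaussian posterior:
# mean = (X^T X / sigma2 + I / tau2)^{-1} X^T y / sigma2
A = X.T @ X / sigma2 + np.eye(2) / tau2
post_mean = np.linalg.solve(A, X.T @ y / sigma2)
post_cov = np.linalg.inv(A)
print("posterior mean:", post_mean)
```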


Least Squares Origin

The method of least squares, a cornerstone of modern statistical inference and data analysis, traces its origins back to the late 18th and early 19th centuries. Its development was spurred by the practical need to accurately predict celestial positions and measure the shape of the Earth in astronomy and geodesy. [2]

The method began to take shape with contributions from several key figures. Roger Cotes, in 1722, proposed the idea of combining different observations to minimize error, laying foundational concepts. Tobias Mayer, around 1750, applied averaging methods to lunar observations, which influenced subsequent developments. Pierre-Simon Laplace, in 1788, developed methods for combining observations under various conditions, including models that accounted for errors in variables.

A significant leap came with Adrien-Marie Legendre in 1805, who published the first concise exposition of the method and applied it to problems in geodesy. However, it was Carl Friedrich Gauss who, in 1809, connected least squares with probability theory, particularly the normal distribution. Gauss's work was pivotal; he solved the problem of orbit prediction for celestial bodies like Ceres using least squares, demonstrating its power in handling observational errors and providing optimal parameter estimates.

As the method of least squares gained acceptance in scientific circles, it became a standard tool in fields beyond astronomy and geodesy, including statistics and later, various branches of engineering and social sciences. Its robustness and applicability to a wide range of data analysis problems have ensured its enduring significance in scientific research and practical applications alike.


Applications

The Least Squares Technique is extensively applied across various domains due to its effectiveness in minimizing the sum of squared differences between observed data and predicted values.

In Finance, Economics, and Econometrics, it is used for predicting stock prices, estimating demand curves, analyzing economic trends, aiding in portfolio optimization to maximize returns and minimize risks, and estimating relationships between GDP and other economic indicators. It is also crucial for forecasting economic indicators such as inflation and unemployment rates, as well as estimating demand and supply functions, policy analysis, and market research. [4], [5]

In Engineering and Quality Control, the Least Squares Technique finds application in curve fitting for stress-strain analysis, optimizing manufacturing processes, designing controllers for aircraft, spacecraft, and vehicles, and enhancing signal detection and resolution in radar systems. It is also used for improving the quality and extracting features in engineering applications through image processing, optimizing design parameters for systems and structures, and monitoring and improving product quality in manufacturing processes.

In Physics and Chemistry, this technique is essential for modeling physical phenomena, analyzing experimental data, fitting reaction rate data to complex kinetic models, determining physical constants through linear regression, and using curve fitting to model the behavior of physical systems under non-linear forces. Additionally, it is used for data analysis to estimate parameters in complex physical systems and optimization to refine models of physical phenomena for better accuracy.

In Biology and Medicine, the Least Squares Technique helps in modeling biological processes such as population growth and drug kinetics, enhancing image quality and reconstructing images in CT and MRI scans, extracting clear signals from noisy medical data through signal processing, and optimizing neural networks for medical data analysis via machine learning. It also ensures consistency in pharmaceutical manufacturing.

In Computer Science and Data Science, it is utilized for fine-tuning models like linear regression, support vector machines, and neural networks, identifying outliers and unusual patterns in data through anomaly detection, enhancing features and reducing noise in computer vision tasks with image processing, and modeling and predicting trends in large datasets using time series analysis. Moreover, it improves algorithms for data analysis and processing through optimization.

In Environmental Science and Geoscience, the technique models relationships between environmental variables and pollution levels, predicts weather patterns and climate change through time series analysis, monitors tectonic movements and improves satellite positioning in geodesy and surveying, and aids in resource management and allocation based on environmental data through optimization.

In the Social Sciences, linear regression is employed to analyze survey data and identify trends and relationships, while nonlinear regression models complex social phenomena. Data analysis is used to estimate parameters in social science models, and time series analysis forecasts social indicators such as population growth. Econometrics is utilized to analyze the impact of policies and social programs.

In Telecommunications and Signal Processing, the Least Squares Technique filters noise from communication signals, enhances speech processing, optimizes algorithms for better signal quality and resolution, and examines and predicts signal behavior over time using time series analysis. [6], [7], [8], [9]

LS (Least Squares) plays a special role in Orthogonal Frequency Division Multiplexing (OFDM) within wireless communication systems due to its simplicity in channel estimation. OFDM, known for its high transmission rate and efficient frequency utilization, divides the channel into multiple narrow subchannels to mitigate inter-symbol interference (ISI). In OFDM systems, LS is widely used for its straightforward implementation, although it suffers from high mean square error (MSE). In contrast, MMSE (Minimum Mean Square Error) provides superior performance in low signal-to-noise ratio (SNR) environments but is computationally intensive. Modified LS methods, incorporating noise reduction procedures, and hybrid techniques like LMMSE (Linear Minimum Mean Square Error) aim to balance accuracy and complexity. These methods are crucial in MIMO-OFDM systems, such as Wi-Fi, 4G (LTE), and 5G (NR), enhancing data transmission by using multiple antennas to increase data rates and improve resilience against fading. [10], [11]
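
As a hedged illustration of the per-subcarrier LS estimator mentioned above, the sketch below estimates a toy frequency-selective channel from known pilot symbols; the channel taps, constellation, and noise level are all synthetic choices:

```python
import numpy as np

rng = np.random.default_rng(12)
n_sc = 64                                 # number of OFDM subcarriers

# Known pilot symbols (QPSK) transmitted on every subcarrier
X = (rng.choice([-1, 1], n_sc) + 1j * rng.choice([-1, 1], n_sc)) / np.sqrt(2)

# Toy frequency-selective channel (3 taps) plus additive noise
h_time = np.array([0.8, 0.4 + 0.3j, 0.2])
H_true = np.fft.fft(h_time, n_sc)
noise = rng.normal(0, 0.05, n_sc) + 1j * rng.normal(0, 0.05, n_sc)
Y = H_true * X + noise                    # received pilot observations

# LS channel estimate per subcarrier: H_hat = Y / X
H_ls = Y / X
print("mean squared error:", np.mean(np.abs(H_ls - H_true) ** 2))
```

This is why LS is so cheap in OFDM: with known pilots, the estimate reduces to one complex division per subcarrier, at the cost of passing the noise straight through (the high MSE noted above).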

These diverse applications highlight the versatility and importance of the Least Squares Technique in both theoretical research and practical implementations across various fields.

Equation in Focus

The equation in focus is the orthogonal projection onto the subspace spanned by the columns of X. Orthogonal projection is a type of linear transformation that maps a vector onto a subspace in such a way that the error, defined as the difference between the original vector and its projection, is minimized. This means that the orthogonal projection of a vector v onto a subspace W results in a vector p in W such that the difference v - p is orthogonal to every vector in W. In least squares models, orthogonal projection plays a crucial role in determining the estimation error.
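
Assuming a full-column-rank design matrix X, the standard closed-form expressions behind this projection are:

```latex
\hat{\beta} = (X^{\top}X)^{-1}X^{\top}y,
\qquad
\hat{y} = P\,y, \quad P = X(X^{\top}X)^{-1}X^{\top},
\qquad
e = y - \hat{y} = (I - P)\,y, \quad X^{\top}e = 0.
```

Here P is the projection (hat) matrix onto the column space of X, and the residual e is orthogonal to that subspace, which is exactly the minimization property described above.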

-x-x-x-x-

In the next issue, in honor of the 75th anniversary of the publication of Claude Shannon's paper "A Mathematical Theory of Communication," we will present the Shannon-Hartley equation and its implications for telecommunications.


References

[1] https://www.dhirubhai.net/pulse/nature-telecom-capex-dr-kim-kyllesbech-larsen-/

[2] https://www.amazon.com/History-Statistics-Measurement-Uncertainty-before/dp/067440341X

[3] Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

[4] https://www.amazon.com/Estimation-Inference-Econometrics-1st-First/dp/B0071MC3DW

[5] https://www.amazon.com/Econometrics-Economics-handbook-Gregory-Chow/dp/0070108471

[6] https://www.amazon.com/Array-Signal-Processing-Concepts-Techniques/dp/0130485136

[7] Evaluation of the Direction of Arrival Estimation Methods for the Wireless Communications Systems (Beamforming & Smart Antenna)

[8] Least Squares with Examples in Signal Processing

[9] https://www.amazon.com/Square-Estimation-Applications-Digital-Processing/dp/047187857X

[10] A Modified LS Channel Estimation Algorithm for OFDM System in Mountain Wireless Environment

[11] Analysis of LSE and MMSE Pilot Based Channel Estimation Techniques for MIMO-OFDM System


Keywords: #saturdaywithmath, #leastsquares, #parameterestimation, #econometrics, #signalprocessing, #4G, #5G
