Factor Investing in Brazil: A Deep Dive into Evidence and Performance
Over the past few months, I have devoted significant time to studying quantitative analysis and financial modeling for investments. It's exciting to combine the approach I learned at Insper (making decisions based on evidence and robust analysis) with practical applications that support informed decision-making. Special thanks go to Brenno, from Varos, who has been instrumental in teaching me the technical content and contributing to this modeling process.
I am thrilled to share my study and results on a Factor Investing model applied to the Brazilian market, particularly over the last decade. This is the first of many projects I intend to pursue, aimed at exploring and refining robust methodologies that deliver solid outcomes and, most importantly, are grounded in statistical evidence.
A significant challenge in conducting such studies in Brazil lies in the sample size and data quality. For my research, I utilized a database that includes fundamental company data from 2011 onward, allowing me to maximize the sample size and identify variables and factors capable of explaining returns effectively. However, it’s crucial to note that data maturity in Brazil lags far behind that of the United States, which boasts a much larger and higher-quality dataset.
Addressing Key Biases
For this model and study, I took special care to address several critical biases that must be considered in such analyses:
Analyzing Factors in Brazil
Before building the model and backtesting the strategy, it’s crucial to validate the factors and understand how they behaved in the Brazilian market. To this end, I conducted a backtest and a detailed analysis of each factor.
Methodology
Below is the result of my preliminary analysis to determine which indicator within the value factor consistently delivered the highest returns over time.
Following the same approach described above, I analyzed all factors, using as many relevant indicators as possible. This enabled me to select the "champions among champions"—the indicators that best represented each factor for subsequent regression analysis.
Graphing the Best Indicators and Analyzing Combined Factors
The chart above illustrates the returns of the first quartile for each selected factor. It reveals that the momentum/trend-following factor delivers the highest return within the first quartile. This suggests that investing in companies with stronger absolute performance may explain a substantial portion of the returns.
Risk Premium Analysis
Another critical analysis involves examining the risk premium of each factor. This was done by subtracting the fourth quartile's return from the first quartile's return. In other words, it quantifies the advantage of investing in the "best-selected" companies for each factor compared to the "worst-selected" ones. The results are shown in the chart below.
The chart highlights that there is indeed a premium associated with investing in first-quartile companies over those in the fourth quartile. The only factor where this premium was null or negative was market cap. For all other factors, a positive and consistent risk premium was observed. Once again, momentum stood out as the clear winner, with a risk premium exceeding 1,000%. This strongly indicates a significant advantage for investors focusing on companies with stronger absolute performance.
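The quartile spread described above can be sketched in a few lines. This is a minimal illustration, not the code behind the charts: the tickers, scores, and returns are made up, the same scores are reused in both periods for brevity (in practice assets would be re-ranked each period), and the premium is taken as the compounded Q1 return minus the compounded Q4 return.

```python
def quartile_returns(scores, returns, higher_is_better=True):
    """Mean return of each score quartile for one period (index 0 = Q1, best-ranked).

    Assumes at least four assets; bucket sizes may differ by one when the
    asset count is not a multiple of four.
    """
    ranked = sorted(scores, key=scores.get, reverse=higher_is_better)
    n = len(ranked)
    means = []
    for q in range(4):
        members = ranked[q * n // 4:(q + 1) * n // 4]
        means.append(sum(returns[t] for t in members) / len(members))
    return means


def compound(period_returns):
    """Cumulative return from a sequence of simple per-period returns."""
    total = 1.0
    for r in period_returns:
        total *= 1.0 + r
    return total - 1.0


# Hypothetical factor scores and per-period returns (illustrative only).
scores = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6,
          "E": 0.5, "F": 0.4, "G": 0.3, "H": 0.2}
period_returns = [
    {"A": 0.08, "B": 0.06, "C": 0.04, "D": 0.04,
     "E": 0.02, "F": 0.02, "G": -0.01, "H": -0.03},
    {"A": 0.10, "B": 0.00, "C": 0.03, "D": 0.01,
     "E": 0.00, "F": 0.02, "G": 0.01, "H": -0.01},
]

per_period = [quartile_returns(scores, r) for r in period_returns]
q1_total = compound([p[0] for p in per_period])
q4_total = compound([p[3] for p in per_period])
premium = q1_total - q4_total  # advantage of the best-ranked over the worst-ranked bucket
print(f"Q1 {q1_total:.2%}, Q4 {q4_total:.2%}, premium {premium:.2%}")
```

For factors where a lower value is better (a valuation multiple, for example), `higher_is_better=False` flips the ranking so that Q1 still holds the "best-selected" companies.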
Factor-Specific Descriptive Analysis
Finally, this report includes a dedicated descriptive analysis for each factor. Below is an example for the momentum factor.
The chart on this page provides a breakdown of how each factor performed over time. The bar chart is especially useful for assessing the consistency of each factor. Ideally, a "staircase" pattern is sought, where the first quartile delivers the highest return, followed by the second, and so on, as observed with the momentum factor. Additionally, it’s essential to analyze the performance of each quartile over time, particularly in rolling 1-year windows, to determine if the factor is consistently robust and valid for regression testing.
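Both checks mentioned above, the "staircase" ordering and the rolling 1-year comparison, are easy to express in code. The sketch below assumes monthly returns (so a 12-observation window is one year) and uses made-up numbers; the function names are illustrative.

```python
def is_staircase(quartile_totals):
    """True when returns fall strictly from Q1 to Q4."""
    return all(a > b for a, b in zip(quartile_totals, quartile_totals[1:]))


def rolling_compound(monthly_returns, window=12):
    """Compounded return of every consecutive `window`-month span."""
    out = []
    for end in range(window, len(monthly_returns) + 1):
        total = 1.0
        for r in monthly_returns[end - window:end]:
            total *= 1.0 + r
        out.append(total - 1.0)
    return out


# Hypothetical cumulative returns per quartile and monthly series for Q1 and Q4.
print(is_staircase([0.50, 0.30, 0.20, -0.10]))  # descending, so a staircase
print(is_staircase([0.50, 0.60, 0.20, 0.10]))   # Q2 beats Q1, so not a staircase

q1_monthly = [0.02] * 14
q4_monthly = [0.01] * 14
q1_roll = rolling_compound(q1_monthly)
q4_roll = rolling_compound(q4_monthly)
# Share of 1-year windows in which the top quartile beat the bottom one.
win_share = sum(a > b for a, b in zip(q1_roll, q4_roll)) / len(q1_roll)
print(f"Q1 beat Q4 in {win_share:.0%} of rolling windows")
```

A factor that passes the staircase test on the full sample but wins in only a small share of rolling windows would be a weaker candidate for the regression stage.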
Factor Correlation Analysis
An important step in selecting factors for regression analysis is to assess their correlation. In linear regression, it's critical that independent variables (factors) are not perfectly correlated, as this would violate MLR.3, the assumption of no perfect multicollinearity. High but imperfect correlation does not break that assumption, but it inflates the standard errors of the coefficients and makes individual factor effects hard to separate. Beyond statistical validity, selecting weakly correlated factors also tends to produce a more practical and efficient model, since factors that do not move together are more likely to hold up across different market cycles.
The correlation matrix for the factors is included in the descriptive analysis report referenced above and can be seen below.
The correlation matrix reveals an exceptionally high correlation between the value factor and both the momentum and quality factors. Ideally, factors this strongly correlated would not enter the same linear regression, to avoid redundancy and keep the coefficient estimates reliable.
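A simple way to screen factor pairs before a regression is to compute pairwise Pearson correlations and flag those above a cutoff. The sketch below is illustrative only: the return series are invented and the 0.8 threshold is an arbitrary assumption, not a value used in this study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def high_correlation_pairs(factor_returns, threshold=0.8):
    """Factor pairs whose absolute correlation reaches the threshold."""
    flagged = []
    for a, b in combinations(factor_returns, 2):
        rho = pearson(factor_returns[a], factor_returns[b])
        if abs(rho) >= threshold:
            flagged.append((a, b, round(rho, 3)))
    return flagged


# Hypothetical factor return series (quality is built to track value closely).
factor_returns = {
    "value":   [0.01, 0.03, -0.02, 0.04, 0.00],
    "quality": [0.02, 0.06, -0.04, 0.08, 0.00],
    "size":    [0.02, -0.01, 0.01, -0.02, 0.03],
}
flagged = high_correlation_pairs(factor_returns)
print(flagged)
```

Pairs that show up in `flagged` are candidates for keeping only one of the two factors, or for testing them in separate regressions.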
Regression and Statistical Analysis
To determine whether the factors robustly and consistently explain returns, it is essential to go beyond descriptive analysis and employ linear regression, as proposed by Fama and French. This allows for the examination of key statistical metrics related to both the model and the independent variables (factors).
The Fama-French regression follows the equation below:
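For reference, the standard Fama-French three-factor specification, which is presumably the form meant here, can be written as follows. The symbols follow the usual textbook convention and the notation (Rmt - Rft) used in this article; the factors actually used in this study differ from SMB and HML.

```latex
R_{it} - R_{ft} = \alpha_i + \beta_i \,(R_{mt} - R_{ft}) + s_i \, SMB_t + h_i \, HML_t + \varepsilon_{it}
```

Here R_it is the return of portfolio i in period t, R_ft the risk-free rate, R_mt the market return, SMB_t the size factor (small minus big), HML_t the value factor (high minus low book-to-market), and ε_it the error term.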
Although this model was originally developed for the U.S. market, it can be adapted to the Brazilian market with some modifications.
An important variable to calculate is the market premium (Rmt - Rft). Ideally, a positive market premium is expected, meaning that, in the long run, the market return should exceed the return of the risk-free rate. In the United States, this value is typically positive, as the S&P 500, for example, often outperforms the risk-free rate. However, in Brazil, this market premium tends to be zero or even negative, depending on the time window analyzed. This presents a challenge for linear regression, as the Beta becomes distorted compared to the original models proposed by Fama-French.
As can be seen in the graph below, over a sample of approximately 13 years, the market premium in Brazil came to -53%.
Preparing the Dataset
To perform the regression, I calculated the difference between the average universe return and the CDI rate (Rmt - Rft). But what exactly is the "average universe return"?
Since each factor contains a different number of companies in the dataset due to data maturity in Brazil, I first calculated the average return of each factor and then averaged those returns across all factors. This creates a proxy for the market's overall performance while ensuring consistency across the dataset.
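The "mean of factor means" described above can be sketched as follows. The numbers are invented and the function name is illustrative; the point is that each factor gets equal weight regardless of how many companies it covers, which differs from pooling all observations together.

```python
def universe_excess_return(factor_returns, cdi_return):
    """Average the per-factor mean returns, then subtract the CDI return."""
    factor_means = [sum(r) / len(r) for r in factor_returns.values()]
    universe = sum(factor_means) / len(factor_means)
    return universe - cdi_return


# Hypothetical one-period company returns; factors cover different numbers of companies.
factor_returns = {
    "value":    [0.02, 0.04],
    "momentum": [0.05, 0.01, 0.03],
    "size":     [0.00],
}
cdi_return = 0.01  # hypothetical CDI return for the same period

excess = universe_excess_return(factor_returns, cdi_return)
pooled = sum(sum(r) for r in factor_returns.values()) / sum(len(r) for r in factor_returns.values())
print(f"mean of factor means minus CDI: {excess:.4f}")
print(f"pooled mean, for comparison:    {pooled:.4f}")
```

In this toy example the mean of factor means is 2.0% while the pooled mean is 2.5%, which shows how much the weighting choice can matter when factor coverage is uneven.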
Additionally, I applied a liquidity filter, considering only companies with an average daily trading volume greater than R$1M. This ensures the analysis excludes illiquid stocks, which may not be practical for real-world investment strategies.
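A minimal sketch of such a filter, assuming daily traded volume in reais is available per ticker. The tickers and volumes are hypothetical, and the averaging window is whatever history is passed in.

```python
MIN_ADV_BRL = 1_000_000  # R$1M average daily volume, the threshold stated above


def liquid_tickers(daily_volume_brl, min_adv=MIN_ADV_BRL):
    """Tickers whose average daily traded volume is strictly above min_adv."""
    return sorted(
        ticker
        for ticker, volumes in daily_volume_brl.items()
        if volumes and sum(volumes) / len(volumes) > min_adv
    )


# Hypothetical tickers and daily volumes in R$.
daily_volume_brl = {
    "AAAA3": [2_000_000, 1_500_000],
    "BBBB3": [400_000, 900_000],
    "CCCC3": [1_000_000, 1_000_000],  # exactly at the threshold, so excluded
}
eligible = liquid_tickers(daily_volume_brl)
print(eligible)
```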
The independent variables in the regression are the selected factors from the descriptive analysis phase, and the dependent variable is the adjusted market return (Rmt - Rft).
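To make the setup concrete, here is a small ordinary least squares sketch using only the Python standard library. It solves the normal equations directly and returns the coefficients and R². The data are synthetic (the dependent series is built from known coefficients so the fit can be checked), and the series names are placeholders. A statistics library such as statsmodels would additionally report t-statistics and the F-statistic.

```python
def ols(y, columns):
    """OLS with an intercept. Returns (coefficients, r_squared).

    coefficients[0] is the intercept; the rest follow the order of `columns`.
    Raises ZeroDivisionError if the regressors are perfectly collinear.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented k x (k+1) matrix.
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    beta = [A[i][k] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((a - f) ** 2 for a, f in zip(y, fitted))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic factor return series (placeholders, not real data).
momentum = [0.01, 0.02, -0.01, 0.03, 0.00, -0.02]
value = [0.02, -0.01, 0.01, 0.00, 0.03, -0.01]
# Dependent variable built from known coefficients: 0.01 + 2*momentum - 1*value.
excess_return = [0.01 + 2 * m - v for m, v in zip(momentum, value)]

beta, r2 = ols(excess_return, [momentum, value])
print([round(b, 4) for b in beta], round(r2, 4))
```

Because the dependent series here is an exact linear combination of the regressors, the fit recovers the coefficients and R² is essentially 1. With real data the residuals would not vanish, and the significance of each coefficient would need to be tested separately.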
Statistical Considerations
Linear regression minimizes the error terms using Ordinary Least Squares (OLS) to estimate the coefficients. Key considerations for interpreting the results include:
First Model: Excluding the Momentum Factor
In the initial regression, I included the Leverage, Value, Quality, and Size factors, excluding Momentum. This allowed me to analyze the model's performance without Momentum and observe its impact when added later.
The results show moderate explanatory power, with an acceptable R² and an F-statistic that indicates the model’s overall significance. However, individual factors displayed only mild statistical significance, highlighting room for improvement.
Second Model: Including the Momentum Factor
In the second regression, I included the Momentum factor while retaining the original factors from the first model.
This adjustment significantly enhanced the model's performance:
Insights from the Regression Analysis
The results demonstrate the critical role of the Momentum factor in explaining returns in the Brazilian equity market. Its inclusion not only enhances the model's statistical robustness but also highlights its dominance over other factors, as seen in the descriptive analysis and backtests.
While the Value factor's insignificance might initially seem concerning, it could indicate cyclical limitations or overlap with other variables, warranting further investigation.
In conclusion, the regression confirms that a multi-factor model, especially one emphasizing Momentum, can provide valuable insights into return drivers in Brazil. This analysis lays the groundwork for refining factor-based strategies tailored to the local market.
Backtest of the Model
With the linear regressions run and the model's statistical robustness assessed, the next step is to see how the defined model, with the chosen factors and indicators, performs in practice, taking into account transaction costs and the practical challenges the market presents.
To implement my proprietary factor model, several assumptions are essential. The first is the liquidity filter, mentioned earlier. The second is the portfolio rebalancing frequency. How often will we reanalyze and adjust the investments? This is crucial both for the backtest and the model. Lastly, how many assets will we hold in our portfolio? A key point to highlight is that the model is 100% long-only, meaning it will always be fully invested in equities at all times and under all market conditions.
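The selection step at each rebalancing date can be sketched as below. Equal weighting is an assumption made for illustration, as are the tickers and scores; the article does not state how positions are weighted.

```python
def target_weights(scores, liquid, n_assets):
    """Pick the n_assets best-scored liquid tickers and weight them equally.

    Weights are all positive and sum to 1, i.e. long-only and fully invested.
    """
    eligible = [t for t in scores if t in liquid]
    chosen = sorted(eligible, key=scores.get, reverse=True)[:n_assets]
    return {t: 1.0 / len(chosen) for t in chosen}


# Hypothetical combined factor scores and the set of tickers passing the liquidity filter.
scores = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6}
liquid = {"A", "C", "D"}

weights = target_weights(scores, liquid, n_assets=2)
print(weights)
```

Running this function once per rebalancing date, with scores recomputed from the data available at that date, gives the sequence of target portfolios that a backtest would then trade into.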
An important concept for the practical model is its capacity, i.e., how much capital the model can handle without impacting the prices or liquidity of any specific asset. Several variables need to be considered: the number of assets in the portfolio, the liquidity filter, how many days we take to buy our stocks (purchases can be spread over one, two, or three days), and the percentage of the traded volume that we set as a limit.
Over these months, I have developed two proprietary models whose results I find relevant, and they have different capacities. The first model, which I call "aggressive," has a smaller capacity because it includes small caps, which tend to have lower liquidity. The second model, which I consider "moderate," has a significantly higher capacity. For the aggressive model, the minimum capacity would be around 10 million reais, which could increase if we adjust some variables (which would also impact the results).
This specific model was designed with a retail investor in mind. From an institutional perspective, additional variables would need to be considered to account for the capacity of a fund, where the assets under management are much larger.
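One simple way to turn those variables into a number is sketched below. It assumes an equally weighted portfolio, so the least liquid holding is the binding constraint: each position can be at most its average daily volume times the participation limit times the number of trading days. The formula, the 10% participation limit, and the volumes are illustrative assumptions, not the calculation behind the figures above.

```python
def portfolio_capacity(adv_brl, participation=0.10, days=3):
    """Largest equal-weighted portfolio (in R$) that respects the volume limit.

    adv_brl: average daily traded volume in R$ per holding.
    participation: maximum share of daily volume we are willing to trade.
    days: number of days over which purchases are spread.
    """
    per_asset_limit = min(adv_brl.values()) * participation * days
    return len(adv_brl) * per_asset_limit


# Hypothetical holdings and their average daily volumes in R$.
adv_brl = {"A": 1_500_000, "B": 4_000_000, "C": 20_000_000}

capacity = portfolio_capacity(adv_brl)
print(f"capacity: R$ {capacity:,.0f}")
```

Raising the liquidity filter, holding more assets, or spreading purchases over more days all increase the result, which is why changing these variables also changes the backtest.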
Aggressive Model
I am very pleased with the result achieved after these months of modeling the factors and variables. This first model would be fully applicable in practice, and in fact, I am currently implementing it in my own portfolio, meaning I have skin in the game with the model I created. As a young investor, I believe the aggressive model could provide a better risk-return profile throughout my journey.
Regarding the results, I do not intend to delve into every detail to avoid being overly technical and repetitive. However, we observed very relevant statistics, such as an annual return of approximately 37%, with an annual volatility of 23%. Clearly, it is a highly aggressive model, with high volatility, but as shown in the trade statistics, mathematically, it is a winning model. Furthermore, over longer time frames, as seen on the third page, it was able to generate alpha in the market throughout the entire period.
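For readers who want to reproduce this kind of statistic, the sketch below shows one common way to annualise return and volatility. It assumes monthly returns, geometric compounding for the return, and the square-root-of-time rule for volatility; the input series is made up and is not the model's track record.

```python
from math import sqrt
from statistics import stdev


def annualised_stats(monthly_returns):
    """Annualised geometric return and volatility from monthly simple returns."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    years = len(monthly_returns) / 12
    annual_return = growth ** (1 / years) - 1
    annual_vol = stdev(monthly_returns) * sqrt(12)  # sample standard deviation
    return annual_return, annual_vol


# Hypothetical two-year series alternating +2% and -1% months.
monthly_returns = [0.02, -0.01] * 12

annual_return, annual_vol = annualised_stats(monthly_returns)
print(f"annual return {annual_return:.2%}, annual volatility {annual_vol:.2%}")
```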
Moderate Model
The moderate model emerged due to the need to create a strategy for individuals with a more conservative/moderate risk profile. While the aggressive model yielded good results, it’s understood that not everyone has the risk appetite for such a model. Additionally, the moderate model accommodates a much higher capacity than the aggressive one.
This model also pleasantly surprised me. Its results are quite interesting: a consistent annual return of 23%, annual volatility of 14% (well below the aggressive model's 23%), and a more controlled drawdown. What impressed me was that, even during sideways market periods like the recent one, it managed to deliver positive results.
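Drawdown, mentioned above, is usually measured as the largest peak-to-trough decline of the cumulative return curve. A minimal sketch, using an invented return series:

```python
def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded wealth curve (a value <= 0)."""
    peak = wealth = 1.0
    worst = 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


# Hypothetical period returns: a gain, a sharp loss, then a recovery.
returns = [0.10, -0.20, 0.05, 0.30]

mdd = max_drawdown(returns)
print(f"maximum drawdown: {mdd:.2%}")
```

Comparing this figure across the two models, alongside annualised return and volatility, gives a fuller picture of the risk taken to earn each return.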
The report provides additional details regarding risk (such as specific events, e.g., the truckers’ strike, 2008 crisis, COVID, etc.) and more charts. However, I felt that it would be too much content to post here, so I selected what I considered most relevant.
Conclusion
The main takeaway I offer in this article is the importance of studying new topics and deeply exploring subjects we deem relevant, regardless of the field. When I began diving into quantitative finance to build factor investing models, I started connecting a lot of what I had learned in college, from statistics and econometrics to finance and even behavioral economics. I believe that’s the power of education: being able to combine various tools to create meaningful studies in your field, at the forefront of knowledge. I know I’m infinitely far from mastering this area, but I also know that today I know much more than I did a year ago, and that’s what matters—the desire and insatiable curiosity to learn about topics that fascinate us.
Furthermore, in factor investing, I think one key insight is that it is possible to create a portfolio/investment model systematically and automatically, free from the cognitive biases that, unfortunately, are present in all of us. Of course, like any analysis, it is subject to flaws and can always be improved. However, the current results are promising for the Brazilian market. That is, the factors outlined above can indeed explain market returns. In other words, by exposing ourselves to the right factors, we can achieve consistent and efficient returns throughout our investments.
Beyond factor investing, I’ve also developed trading models that use technical analysis indicators, such as Hi-Lo, Bollinger Bands, Moving Averages, OBV, and so on. I’ve achieved promising results in some of them, while others have been disappointing. Currently, the trading model uses daily data, and I plan to move to shorter timeframes.
Additionally, with the factor model, I’m eager to explore macroeconomic factors and triggers that might shift the model between CDI and equities, thus enhancing its risk-return profile.
Feel free to connect with me on LinkedIn for any questions or suggestions. I’m happy to discuss the topic further.
Thank you!