Efficient Frontier Construction in RStudio
Introduction
The Efficient Frontier is central to modern portfolio theory, which Harry Markowitz introduced in the mid-20th century. It represents the set of optimal portfolios that offer the highest expected return for a given level of risk. This report outlines the application of the Efficient Frontier in the context of FIN 439 at Oregon State University, using R for practical implementation. I have used this course and code to create a practical guide that any interested investor can follow to build an Efficient Frontier and an efficient portfolio commensurate with their perceptions of risk.
Course Objectives:
The course objectives emphasize hands-on experience in managing investment portfolios, and this guide illustrates fundamental concepts in portfolio management and portfolio analysis accordingly. Because one aim of the course is to develop an efficient and comprehensive portfolio, this code is intended to help investors and students alike.
The Efficient Frontier
Concept:
The efficient frontier is a visualization of optimal portfolios that offer the maximum possible return for a given level of risk, where risk is defined as the standard deviation of the returns of an asset or portfolio. It is part of Modern Portfolio Theory (MPT), developed by Harry Markowitz, which emphasizes diversification to reduce risk. According to MPT, when assets whose returns are not perfectly correlated are combined, the risk of the portfolio is lower than the weighted average of the individual assets' risks, so an investor can seek the greatest expected return for each level of risk. This framework helps investors make informed decisions by understanding the trade-off between risk and return, and it can guide the construction of portfolios that align with an investor's risk tolerance and return expectations.
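The diversification effect can be checked with a few lines of arithmetic. The sketch below applies the standard two-asset variance formula to made-up inputs: two assets with the same weekly standard deviation and a correlation of 0.3. None of these figures come from the report's data; they are assumptions chosen only to illustrate the idea.

```r
# Illustrative inputs (hypothetical, not estimated from the report's data)
sd_a <- 0.04   # weekly standard deviation of asset A
sd_b <- 0.04   # weekly standard deviation of asset B
rho  <- 0.3    # correlation between the two assets' returns
w_a  <- 0.5    # weight in asset A
w_b  <- 1 - w_a

# Two-asset portfolio variance:
# w_a^2 * sd_a^2 + w_b^2 * sd_b^2 + 2 * w_a * w_b * rho * sd_a * sd_b
port_var <- w_a^2 * sd_a^2 + w_b^2 * sd_b^2 + 2 * w_a * w_b * rho * sd_a * sd_b
port_sd  <- sqrt(port_var)

# Weighted average of the individual standard deviations, for comparison
avg_sd <- w_a * sd_a + w_b * sd_b

print(c(portfolio_sd = port_sd, average_sd = avg_sd))
```

With any correlation below 1, the portfolio standard deviation comes out below the weighted average, which is the risk reduction that diversification provides.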
To construct an Efficient Frontier, an investor needs estimates of the expected returns, variances, and covariances of asset returns. Mathematical optimization methods are then used to find, for each level of risk, the portfolio with the greatest expected return. Once the Frontier is built, the tangency portfolio is identified: the portfolio with the greatest excess return per unit of risk (the highest Sharpe ratio). A line drawn from the risk-free rate through the tangency portfolio touches the Frontier at that point; this line is called the Capital Market Line.
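For readers curious about what an optimization-based answer looks like, here is a minimal sketch of the textbook closed-form tangency weights, which are proportional to the inverse covariance matrix times the excess returns. It assumes short sales are allowed and uses hypothetical expected returns, volatilities, and correlations for three unnamed assets. It is not the method used in this report's code, which relies on simulation.

```r
# Hypothetical weekly inputs for three assets (illustrative only)
mu   <- c(0.0030, 0.0020, 0.0010)          # expected weekly returns
sds  <- c(0.04, 0.03, 0.02)                # weekly standard deviations
corr <- matrix(0.3, nrow = 3, ncol = 3)    # common pairwise correlation
diag(corr) <- 1
sigma <- diag(sds) %*% corr %*% diag(sds)  # variance-covariance matrix
rf <- 0.02 / 52                            # weekly risk-free rate, same convention as the report

# Tangency weights are proportional to solve(sigma) %*% (mu - rf)
raw_weights <- solve(sigma, mu - rf)
tangency_weights <- as.numeric(raw_weights / sum(raw_weights))

# Sharpe ratio of any weight vector under these inputs
sharpe <- function(w) {
  (sum(w * mu) - rf) / sqrt(as.numeric(t(w) %*% sigma %*% w))
}

print(round(tangency_weights, 3))
print(c(tangency = sharpe(tangency_weights),
        equal_weight = sharpe(rep(1 / 3, 3))))
```

Under these assumptions the tangency portfolio's Sharpe ratio is at least as high as that of an equal-weight portfolio, which is what makes it the point where the Capital Market Line touches the Frontier.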
For reproducibility and ease of use, RStudio was used to create the Efficient Frontier (EF). Rather than solving the optimization directly, the code simulates portfolios: the dots in the graph represent 10,000 simulated portfolios, each with different weights for the assets included. The model uses five stocks: 'AAPL', 'MSFT', 'GOOG', 'AMZN', and 'JNJ'. These five well-known stocks offer a simple and relatable Frontier for understanding the EF in this report.
The following is a step-by-step guide for recreating the portfolio found in this report. The code can be copied and pasted into RStudio and has been tested to run.
Practical Application Using R
R Code Overview: The following R code provides a practical example of constructing the efficient frontier using weekly stock data for five companies: AAPL, MSFT, GOOG, AMZN, and JNJ. Any symbols can replace these stocks in the code.
## (1) Define the packages that will be needed
packages <- c('quantmod', 'ggplot2', 'dplyr')
## (2) Install them if not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
install.packages(packages[!installed_packages])
}
## (3) Load the packages into R session
invisible(lapply(packages, library, character.only = TRUE))
## Create a character vector that has the stock codes we need
portfolio <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ')
## Load the stocks needed into R
portfolio <- lapply(portfolio, function(x) {getSymbols(
x, periodicity='weekly', auto.assign=FALSE)})
## Get adjusted prices of all stocks
portfolio_adjusted <- lapply(portfolio, Ad)
## Transform into xts
portfolio_adjusted <- do.call(merge, portfolio_adjusted)
## View the first few rows
head(portfolio_adjusted)
## Make a list that contains log weekly returns of each stock
portfolio_adjusted <- lapply(portfolio_adjusted, weeklyReturn, type='log')
## Transform into an xts object
portfolio_adjusted <- do.call(merge, portfolio_adjusted)
## Adjust the column names
colnames(portfolio_adjusted) <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ')
## Remove the first row, which does not hold a valid return
portfolio_adjusted <- portfolio_adjusted[-1]
## View the first few rows of the log returns
head(portfolio_adjusted)
## Get variance-covariance matrix
var_covar <- var(portfolio_adjusted)
## Print results
print(var_covar)
## Set seed for reproducibility
set.seed(123)
## Get 50,000 random uniform numbers
random_numbers <- runif(50000)
## Transform random numbers into matrix to distribute across all symbols
all_weights <- matrix(random_numbers, nrow=10000, ncol=5)
## Add sixth column with just NAs
all_weights <- cbind(all_weights, rep(NA, 10000))
## Add names
colnames(all_weights) <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ', 'total')
## Loop to convert into actual weights
for (i in 1:10000) {
## Get sum of random numbers in each row
all_weights[i, 6] <- sum(all_weights[i, 1:5])
## Get the actual weights of the random numbers
all_weights[i, 1:5] <- all_weights[i, 1:5] / all_weights[i, 6]
}
## Delete total column
all_weights <- all_weights[, -6]
## Create column placeholders
portfolio_risk <- rep(NA, 10000)
portfolio_returns <- rep(NA, 10000)
sharpe_ratios <- rep(NA, 10000)
## Define risk-free rate
risk_free_rate <- 0.02 / 52 ## annualized risk-free rate converted to weekly
## loop to calculate risk and return per weights
for (i in 1:10000) {
weights <- all_weights[i, ]
portfolio_risk[i] <- sqrt(sum((weights %*% var_covar) * weights))
portfolio_returns[i] <- sum(weights * colMeans(portfolio_adjusted))
sharpe_ratios[i] <- (portfolio_returns[i] - risk_free_rate) / portfolio_risk[i]
}
## Identify the portfolio with the highest Sharpe ratio
max_sharpe_index <- which.max(sharpe_ratios)
max_sharpe_portfolio <- all_weights[max_sharpe_index, ]
tangency_portfolio_risk <- portfolio_risk[max_sharpe_index]
tangency_portfolio_return <- portfolio_returns[max_sharpe_index]
## Make a data frame to be used for ggplot2
portfolio_df <- data.frame(portfolio_risk, portfolio_returns)
## Plot the efficient frontier with the tangency line
portfolio_df %>%
ggplot(aes(x=portfolio_risk, y=portfolio_returns)) +
geom_point(alpha=0.2) +
theme_minimal() +
geom_abline(intercept = risk_free_rate,
slope = (tangency_portfolio_return - risk_free_rate) / tangency_portfolio_risk,
color = 'blue',
linetype = 'dashed') +
geom_point(aes(x=tangency_portfolio_risk, y=tangency_portfolio_return),
color='red', size=3) +
labs(
title='Efficient Frontier graph of 5 assets with Tangency Line',
subtitle='AAPL, MSFT, GOOG, AMZN, JNJ',
x = 'Portfolio Risk (Standard Deviation)',
y = 'Portfolio Return'
)
Variance-Covariance Matrix:
The variance-covariance matrix is calculated so that investors can understand the relationships between the returns of the different stocks in the portfolio. Its diagonal holds each stock's variance, and each off-diagonal entry holds the covariance between a pair of stocks. A larger positive covariance implies that the two returns tend to move together, a negative covariance implies that they tend to move in opposite directions, and a value close to zero implies little linear relationship. For these reasons, assets are often selected so that their covariances are close to zero, indicating that they neither rise and fall together nor consistently move in opposite directions. This matrix is crucial for portfolio optimization because it is used to calculate the risk (standard deviation) of different portfolio combinations.
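Raw covariances depend on the scale of each stock's returns, so they can be hard to compare across pairs. One way to read the matrix, sketched below, is to convert it to correlations with base R's cov2cor(). The returns here are simulated stand-ins labelled A, B, and C (an assumption for illustration); with the report's data the same call could be applied to var_covar.

```r
# Simulated weekly returns for three hypothetical assets (not real data)
set.seed(1)
returns <- cbind(A = rnorm(200, mean = 0.002, sd = 0.03),
                 B = rnorm(200, mean = 0.001, sd = 0.02))
# Asset C is built to move with asset A, plus its own noise
returns <- cbind(returns, C = 0.5 * returns[, "A"] + rnorm(200, mean = 0, sd = 0.01))

# Variance-covariance matrix: variances on the diagonal, covariances elsewhere
vc <- var(returns)

# Standard deviation of each asset is the square root of its variance
asset_sd <- sqrt(diag(vc))

# Rescale covariances to correlations, which always lie between -1 and 1
cor_mat <- cov2cor(vc)

print(round(asset_sd, 4))
print(round(cor_mat, 2))
```

In the correlation matrix, the A and C pair should show a clearly positive value by construction, while A and B, which were drawn independently, should sit near zero.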
Random Portfolio Simulation:
The code generates 10,000 random portfolios by assigning random weights to each stock; it ensures the weights sum to 1 for each portfolio. This simulation is important to explore the range of possible portfolios and identify those on the efficient frontier. For each simulated portfolio, the code calculates the portfolio's risk and expected return, using the aforementioned weights.
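The same weight normalization and risk/return calculation can also be written without loops. The sketch below is one possible vectorized version, not a replacement for the report's code. The mean returns and the diagonal covariance matrix are hypothetical placeholders standing in for colMeans(portfolio_adjusted) and var_covar, so that the example runs without downloading data.

```r
set.seed(123)
n_portfolios <- 10000
n_assets <- 5

# Draw uniform random numbers and rescale each row so its weights sum to 1
raw <- matrix(runif(n_portfolios * n_assets), nrow = n_portfolios, ncol = n_assets)
weights <- raw / rowSums(raw)

# Hypothetical inputs standing in for colMeans(portfolio_adjusted) and var_covar
mean_returns <- c(0.004, 0.003, 0.003, 0.004, 0.001)
sigma <- diag(c(0.04, 0.035, 0.04, 0.045, 0.02)^2)  # uncorrelated assets, for simplicity

# Expected return of every portfolio at once: weights times mean returns
sim_returns <- as.numeric(weights %*% mean_returns)

# Risk of every portfolio at once: square root of w' * sigma * w, row by row
sim_risk <- sqrt(rowSums((weights %*% sigma) * weights))

print(summary(rowSums(weights)))
print(head(cbind(sim_risk, sim_returns)))
```

Dividing the matrix by rowSums(raw) works because R recycles the vector down each column, so every entry is divided by its own row total. Note that rescaled uniform draws tend to cluster around equal weights, so portfolios concentrated in a single stock appear only rarely in the simulation.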
Finally, the code uses ggplot2 to create a scatter plot of the portfolios, with risk on the x-axis and return on the y-axis. The upper-left edge of the resulting cloud of points approximates the efficient frontier and highlights the trade-off between risk and return. The red point marks the simulated portfolio with the highest Sharpe ratio, and the dashed blue line is the tangency line through it. The exact location an individual investor prefers on the Frontier is idiosyncratic, though professional investors may tend to prefer similarly risky portfolios. Portfolios lying on the efficient frontier are optimal, while those below the curve are sub-optimal because they offer lower returns for the same level of risk.
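To pick out which simulated portfolios sit on the upper edge of the cloud, one simple approach is to sort by risk and keep each portfolio whose return is at least as high as every lower-risk portfolio. The sketch below does this on six hand-made points (hypothetical values, chosen only to make the result easy to follow); with the report's objects the same steps could be applied to portfolio_risk and portfolio_returns.

```r
# Six hypothetical portfolios (weekly risk and return; illustrative only)
risk <- c(0.020, 0.022, 0.025, 0.027, 0.030, 0.033)
ret  <- c(0.0010, 0.0016, 0.0014, 0.0021, 0.0019, 0.0024)

# Sort portfolios from lowest to highest risk
ord <- order(risk)
ret_sorted <- ret[ord]

# A portfolio is kept if its return matches the running maximum,
# i.e. no lower-risk portfolio offers a higher return
keep_sorted <- ret_sorted >= cummax(ret_sorted)

# Map the result back to the original ordering
efficient <- logical(length(risk))
efficient[ord] <- keep_sorted

print(data.frame(risk, ret, efficient))
```

Portfolios flagged FALSE are dominated: some other portfolio offers a higher return with less risk.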
Using the Frontier
Investors can use the efficient frontier to select portfolios based on their risk tolerance; for risk-averse investors, portfolios on the left side of the frontier (lower risk) are preferable. For risk-tolerant investors, portfolios on the right side (higher risk) may be more suitable, as they offer higher potential returns. The efficient frontier highlights the importance of diversification in reducing risk without detracting from expected returns; it also demonstrates that increasing returns generally involves taking on higher levels of risk.
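As a rough illustration of this selection step, the sketch below picks, from a set of simulated portfolios, the one with the highest return whose risk does not exceed an investor's limit. The data frame and the helper pick_portfolio() are hypothetical names introduced here for illustration, and the risk limits are arbitrary examples.

```r
# Hypothetical simulated portfolios (weekly risk and return; illustrative only)
sim <- data.frame(
  id   = 1:6,
  risk = c(0.020, 0.022, 0.025, 0.027, 0.030, 0.033),
  ret  = c(0.0010, 0.0016, 0.0014, 0.0021, 0.0019, 0.0024)
)

# Hypothetical helper: highest-return portfolio with risk at or below max_risk
pick_portfolio <- function(portfolios, max_risk) {
  eligible <- portfolios[portfolios$risk <= max_risk, ]
  eligible[which.max(eligible$ret), ]
}

# A risk-averse investor accepts less weekly volatility than a risk-tolerant one
cautious   <- pick_portfolio(sim, max_risk = 0.023)
aggressive <- pick_portfolio(sim, max_risk = 0.035)

print(rbind(cautious, aggressive))
```

The cautious investor ends up toward the left of the Frontier and the aggressive investor toward the right, with the higher-risk choice carrying the higher expected return.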
Practical Implications:
By working through the R code, I have gained practical experience in data manipulation, financial analysis, and visualization. I have learned to apply theoretical concepts to real-world data, enhancing my understanding of portfolio management. I have developed essential skills in using R for financial analysis, which are valuable in the finance industry. The practical application of constructing the efficient frontier aligns with the course objectives of developing an efficient portfolio and understanding security valuation models. I have also solidified knowledge gained in Applied Portfolio Management I (FIN 437).
Conclusion
The efficient frontier is a powerful tool in modern portfolio theory, enabling investors to make informed decisions about their portfolios. By understanding and applying the concept, students can optimize their investment strategies to achieve the best possible returns for their desired level of risk. The R code serves as a practical example of constructing the efficient frontier, illustrating the trade-offs between risk and return.
This report provides a comprehensive overview of the efficient frontier, its practical application using R, and the importance of portfolio optimization.
Full Code:
## (1) Define the packages that will be needed
packages <- c('quantmod', 'ggplot2', 'dplyr')
## (2) Install them if not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
install.packages(packages[!installed_packages])
}
## (3) Load the packages into R session
invisible(lapply(packages, library, character.only = TRUE))
## Create a character vector that has the stock codes we need
portfolio <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ')
## Load the stocks needed into R
portfolio <- lapply(portfolio, function(x) {getSymbols(
x, periodicity='weekly', auto.assign=FALSE)})
## Get adjusted prices of all stocks
portfolio_adjusted <- lapply(portfolio, Ad)
## Transform into xts
portfolio_adjusted <- do.call(merge, portfolio_adjusted)
## View the first few rows
head(portfolio_adjusted)
## Make a list that contains log weekly returns of each stock
portfolio_adjusted <- lapply(portfolio_adjusted, weeklyReturn, type='log')
## Transform into an xts object
portfolio_adjusted <- do.call(merge, portfolio_adjusted)
## Adjust the column names
colnames(portfolio_adjusted) <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ')
## Remove the first row, which does not hold a valid return
portfolio_adjusted <- portfolio_adjusted[-1]
## View the first few rows of the log returns
head(portfolio_adjusted)
## Get variance-covariance matrix
var_covar <- var(portfolio_adjusted)
## Print results
print(var_covar)
## Set seed for reproducibility
set.seed(123)
## Get 50,000 random uniform numbers
random_numbers <- runif(50000)
## Transform random numbers into matrix to distribute across all symbols
all_weights <- matrix(random_numbers, nrow=10000, ncol=5)
## Add sixth column with just NAs
all_weights <- cbind(all_weights, rep(NA, 10000))
## Add names
colnames(all_weights) <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ', 'total')
## Loop to convert into actual weights
for (i in 1:10000) {
## Get sum of random numbers in each row
all_weights[i, 6] <- sum(all_weights[i, 1:5])
## Get the actual weights of the random numbers
all_weights[i, 1:5] <- all_weights[i, 1:5] / all_weights[i, 6]
}
## Delete total column
all_weights <- all_weights[, -6]
## Create column placeholders
portfolio_risk <- rep(NA, 10000)
portfolio_returns <- rep(NA, 10000)
sharpe_ratios <- rep(NA, 10000)
## Define risk-free rate
risk_free_rate <- 0.02 / 52 ## annualized risk-free rate converted to weekly
## loop to calculate risk and return per weights
for (i in 1:10000) {
weights <- all_weights[i, ]
portfolio_risk[i] <- sqrt(sum((weights %*% var_covar) * weights))
portfolio_returns[i] <- sum(weights * colMeans(portfolio_adjusted))
sharpe_ratios[i] <- (portfolio_returns[i] - risk_free_rate) / portfolio_risk[i]
}
## Identify the portfolio with the highest Sharpe ratio
max_sharpe_index <- which.max(sharpe_ratios)
max_sharpe_portfolio <- all_weights[max_sharpe_index, ]
tangency_portfolio_risk <- portfolio_risk[max_sharpe_index]
tangency_portfolio_return <- portfolio_returns[max_sharpe_index]
## Make a data frame to be used for ggplot2
portfolio_df <- data.frame(portfolio_risk, portfolio_returns)
## Plot the efficient frontier with the tangency line
portfolio_df %>%
ggplot(aes(x=portfolio_risk, y=portfolio_returns)) +
geom_point(alpha=0.2) +
theme_minimal() +
geom_abline(intercept = risk_free_rate,
slope = (tangency_portfolio_return - risk_free_rate) / tangency_portfolio_risk,
color = 'blue',
linetype = 'dashed') +
geom_point(aes(x=tangency_portfolio_risk, y=tangency_portfolio_return),
color='red', size=3) +
labs(
title='Efficient Frontier graph of 5 assets with Tangency Line',
subtitle='AAPL, MSFT, GOOG, AMZN, JNJ',
x = 'Portfolio Risk (Standard Deviation)',
y = 'Portfolio Return'
)