LOESS regression method on Stock Market Data
In this post, we will dive into the world of local regression, specifically the LOESS (locally estimated scatterplot smoothing) method. The data I used for this analysis is five years' worth of daily opening stock prices, from 3/9/2015 to 3/9/2020, for 20 different Fortune 500 companies. Each of these companies was led by the same CEO for the entirety of that timeframe. The goal of this analysis is to compare performance and see whether companies with female CEOs perform better than those with male CEOs. McKinsey's 2018 report "Delivering Through Diversity" gave the following insight:
"Companies in the top-quartile for gender diversity on executive teams were 21% more likely to outperform on profitability and 27% more likely to have superior value creation. The highest-performing companies on both profitability and diversity had more women in line (i.e., typically revenue-generating) roles than in staff roles on their executive teams. "
View full report here
If this statistic holds en masse, perhaps a small sample of top-performing companies will provide additional supporting evidence. For the sake of simplicity, we will use stock price data as a proxy for value creation.
Here are the female-led and male-led companies we are using for the analysis:
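After ingesting, cleaning, reformatting, and compiling the data, I created one table for the male-CEO companies and one for the female-CEO companies, each with dates and all corresponding company prices. As a proof of concept, I wanted to compare the results of a linear regression to the LOESS smoothing algorithm for one randomly selected company. Below you can see the fit of a linear regression on the Duke Energy stock prices.
We will use mean squared error (MSE) as our goodness-of-fit metric. In this example, MSE is calculated as the average of the squared vertical distances from each point to the regression line. For the linear regression on Duke Energy, the MSE is 31585.09.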
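To make the metric concrete, here is a minimal sketch of an ordinary least squares line and its MSE in plain Python. The function names and the six prices are made up for illustration; this is not the Duke Energy series or the code behind the plots.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(y, fitted):
    """Average of the squared vertical distances between data and fit."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y)


# Made-up prices for illustration only (not real stock data).
days = [0, 1, 2, 3, 4, 5]
prices = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]

intercept, slope = fit_line(days, prices)
fitted = [intercept + slope * d for d in days]
linear_mse = mean_squared_error(prices, fitted)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, MSE={linear_mse:.3f}")
```

Because the errors are squared, a few large misses raise the MSE much more than many small ones.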
If we use the LOESS method to plot a smoothing line with the span parameter set to 0.25, we get the visual output below, with an MSE of 16770, a reduction of nearly 50%.
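The "nearly 50%" figure can be checked directly from the two MSE values quoted so far. The variable names are just for illustration:

```python
linear_mse = 31585.09  # linear regression MSE for Duke Energy, from the text
loess_mse = 16770.0    # LOESS MSE at span 0.25, from the text

reduction_pct = (linear_mse - loess_mse) / linear_mse * 100
print(f"MSE reduction: {reduction_pct:.1f}%")  # prints "MSE reduction: 46.9%"
```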
The smoothing algorithm works by using a certain "span" of the data around each point when assigning its fitted value. If the span is close to zero, the curve can fit the data almost perfectly, while a span near 1 will closely resemble a linear regression. This idea can be explained intuitively with the graph below, which fits the same data at different LOESS "span" levels.
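For readers who want to see the mechanics, below is a rough sketch of the idea in plain Python. It is a simplified version, not the library routine behind the plots. I am assuming local straight-line fits with tricube weights and no robustness iterations, while library implementations differ in their details and defaults. The series is synthetic: a trend plus a cycle.

```python
import math


def loess(x, y, span):
    """Local linear smoother: one tricube-weighted line fit per point.

    Assumes the x values are distinct (for example, day numbers).
    """
    n = len(x)
    k = max(2, min(n, math.ceil(span * n)))  # points in each local window
    fitted = []
    for x0 in x:
        # The distance to the k-th nearest point sets the window half-width.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        # Tricube weights: 1 at x0, falling to 0 at the window edge.
        w = [(1 - (abs(xi - x0) / h) ** 3) ** 3 if abs(xi - x0) < h else 0.0
             for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 1e-12 else 0.0  # fall back to weighted mean
        fitted.append(my + slope * (x0 - mx))
    return fitted


def mean_squared_error(y, fitted):
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y)


# Synthetic series for illustration only: upward trend plus a slow cycle.
days = list(range(100))
prices = [50 + 0.1 * d + 5 * math.sin(d / 8) for d in days]

mse_by_span = {}
for span in (0.1, 0.25, 0.9):
    mse_by_span[span] = mean_squared_error(prices, loess(days, prices, span))
    print(f"span={span}: MSE={mse_by_span[span]:.4f}")
```

On this series the small span follows the cycle closely, while the large span smooths it away and leaves a much larger error.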
The actual data points have been removed to make the patterns easier to see. As the span increases, the line becomes flatter and flatter. This raises the question: what is the appropriate level of "smoothness"? In an attempt to answer it, we will fit each company's line to the same R-squared level of 0.8. R-squared is the coefficient of determination, which essentially tells us how well the model fits the data. If we assume that 20% of stock price fluctuation is noise, the resulting curve can be used to estimate a company's underlying change in value.
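One way to land on a given R-squared is to search over the span. The sketch below bisects on span, assuming R-squared generally falls as the span grows. This is my own illustration of the idea and not necessarily how the curves in this post were tuned. The smoother is the same simplified local linear fit with tricube weights, and the series is synthetic with seeded noise.

```python
import math
import random


def loess(x, y, span):
    """Simplified local linear smoother with tricube weights (distinct x)."""
    n = len(x)
    k = max(2, min(n, math.ceil(span * n)))
    fitted = []
    for x0 in x:
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        w = [(1 - (abs(xi - x0) / h) ** 3) ** 3 if abs(xi - x0) < h else 0.0
             for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def r_squared(y, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def span_for_r_squared(x, y, target, lo=0.05, hi=1.0, tol=1e-3):
    """Bisect on span, assuming R-squared at `lo` is above the target
    and R-squared at `hi` is below it."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if r_squared(y, loess(x, y, mid)) >= target:
            lo = mid  # fit is still at least as tight as the target
        else:
            hi = mid  # too smooth, so shrink the span
    return lo


# Synthetic series for illustration only: trend + cycle + seeded noise.
rng = random.Random(0)
days = list(range(100))
prices = [50 + 0.1 * d + 5 * math.sin(d / 8) + rng.uniform(-2, 2)
          for d in days]

chosen_span = span_for_r_squared(days, prices, target=0.8)
achieved = r_squared(prices, loess(days, prices, chosen_span))
print(f"span={chosen_span:.3f}, R-squared={achieved:.3f}")
```

Since the window size changes in whole points, the achieved R-squared lands close to 0.8 rather than exactly on it.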
For some of these companies, the fit improves drastically with the smoothing algorithm. Others are described just as well by a linear regression. The chart below shows the percent improvement in goodness of fit (reduction in MSE) for the female-CEO companies when moving from a linear regression to the smoothing algorithm with R-squared set to 0.8.
Lockheed saw the smallest change in MSE, nearly 0. The results are shown below, with the LOESS best-fit line in green and the linear model fit in blue. The stock's price changes closely resemble a straight line.
On the other hand, the fit for Synchrony's stock improves vastly with a LOESS regression. The data appears to have a cyclical rise and fall that cannot be explained or depicted with a straight line.
Moving forward with these smoothing lines, below is the plot of all 10 female-CEO companies' performance.
And again, below is the plot for the 10 companies with male CEOs. These lines appear quite straight because of the large disparity in prices. With Amazon and Berkshire Hathaway (B stock) being such large outliers, we cannot see the true changes in such a small plot.
Now we can average the performance of the 10 female-CEO companies and of the 10 male-CEO companies to compare them. Below is the male average (blue) against the female average (pink).
The two averaged performances are incredibly similar. However, as I noted above, the male results were heavily skewed, especially by Amazon's staggering growth. If we remove this outlier and replace it with Kroger, the next-largest company on the Fortune 500 list, the relationship between the male-CEO and female-CEO averages changes quite significantly.
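To see why a single high-priced stock can dominate a plain average, here is a small sketch of date-by-date averaging with and without an outlier. The company names and numbers are entirely made up for illustration.

```python
def average_series(series_by_company):
    """Average several equal-length price series date by date."""
    columns = zip(*series_by_company.values())
    return [sum(prices) / len(prices) for prices in columns]


# Hypothetical smoothed prices at three dates (not real data).
group = {
    "Company A": [50.0, 55.0, 60.0],
    "Company B": [30.0, 31.0, 33.0],
    "High-priced outlier": [400.0, 900.0, 1900.0],
}

with_outlier = average_series(group)
without_outlier = average_series(
    {name: s for name, s in group.items() if name != "High-priced outlier"}
)
print(with_outlier)
print(without_outlier)  # prints [40.0, 43.0, 46.5]
```

With the outlier included, the group average mostly tracks that one stock. Without it, the average reflects the other companies.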
Obviously, there is more to inclusion and diversity than just the gender of the CEO position. However, this may be an indicator for the company's overall I&D efforts. If there is a true relationship between a company's growth and diversity efforts, this small experiment may provide supporting evidence. Thanks for reading!
"Companies in the top-quartile for gender diversity on executive teams were 21% more likely to outperform on profitability and 27% more likely to have superior value creation. The highest-performing companies on both profitability and diversity had more women in line (i.e., typically revenue-generating) roles than in staff roles on their executive teams. "
View full report here
If this statistic is true en masse, perhaps a small sample of top performing companies will provide additional evidence to support. For sake of simplicity, we will be using stock price data as placeholder for value creation.
Here are the female-led and male-led companies we are using for the analysis: