Speaker 1: Hey everyone, welcome back. Today we're going to look at how to make money using the ARMA model, ARMA being autoregressive moving average. If we fit various types of ARMA models to real historical stock data, can we realistically make any money, and can we do better than some baseline random methods? By the way, all the heavy-lifting code is stuffed into a function called run simulation. I've put as many comments in it as possible to tell you how it works. Just know that pretty much everything is in this function, and I'll explain what is going on as we look at the charts.

We'll be looking at Apple stock data from January 1st, 2021 until April 1st, 2021. That price data is shown in this chart here: the price of Apple stock every day in that time period.

What we'll be forecasting in this video is the returns, because if we want to use an autoregressive moving average model, we need to make sure that the time series we're trying to forecast is stationary. I have a whole video on stationarity, but the loose definition is that the mean should be constant and the volatility should be constant. That's clearly not true for the price data; the mean part by itself fails. So typically we forecast returns. Returns are usually centered around zero, and the volatility is kind of up in the air, but it looks roughly constant here. We should do a more robust check for stationarity, but the idea is that returns are usually more stationary than the series they came from, and we can still use the returns to decide whether or not to buy the stock.

By the way, return just means the percent change in stock price between one day and the next. Sometimes it's less than zero. Those are days where the stock price is going down.
And some days it's greater than zero, where the stock price is going up.

We're going to do the typical thing: build ACF and PACF plots to inform what order of ARMA model we should build. Here's the autocorrelation function plot. We're looking for spikes that fall outside of these blue windows, and the main one is at five. So at a lag of five days, we have a strong signal in the ACF. The ACF plot informs the order of the MA part of our model, so we might want to pick an MA5 process. Now let's look at the partial autocorrelation function, the PACF, which informs the order of the AR part of the model. We see the exact same story. There's a spike at five. There's another spike much later on, but let's just focus on the spike at five. Before even looking at these models and their results, why might there be spikes at five days? Well, the stock market is closed on weekends, so it's open five days a week, Monday through Friday. That may very well be the reason we're seeing these patterns at lags of five.

Let's start with the baseline model. We're going to build some ARMA models and actually use them to buy and sell stocks to see what kind of money we could have made in this period, but we don't know if they're doing well unless we compare them to some really dumb models. This baseline model is about as dumb as you can get: it's just random buying. Here's a graphic of what it does. Every day it randomly chooses whether or not to buy the stock, and on the subsequent day it sells the stock no matter what, whether the price is high or low. In this graphic, the red windows are places where it bought the stock at the left edge of the window and the price had gone down by the time it sold on the following day.
The green windows are where it bought the stock and the price actually went up. You can see there's a lot of red in this particular instance. If we start with $100, at the end of this entire period of random buying and selling we end up with $95, so we've lost about 5%. Of course, since this is random, this is just one possible outcome. We should run many of these to see how it does on average, and that's exactly what we do here. How many runs are there? I think there's a thousand, so we get a pretty good picture. This is a histogram of the total amount of money you have at the end of a random buying scheme. On average you have $95, with a standard deviation of about $6. The dashed line here is how much you started with, and the histogram shows how much you end with. So unsurprisingly, on average you're losing money with this random buying technique.

Let's do one more baseline method that's a little bit smarter and see what we get. This method says: if the last return was positive, buy the stock, then sell it the day after. We can visualize that here. Every one of these windows was chosen because right before that window the stock price was going up. For example, take a look at this little green window here. Right before it, the stock price is going up, so you say, oh, it's going up, let me buy it, and then you sell one day later. The green windows are again where you made a good decision and the price continued to go up. The red windows are where you made a bad decision: although the price was going up the previous day, it went down the day after you bought. The total amount you end up with here is $94, so we're actually doing worse than random. I think that might just be because of the particular stock we chose.
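The two baseline strategies described above can be sketched in a few lines of Python. This is a minimal sketch, not the run simulation function from the notebook: the function names, the made-up price list, the 50/50 coin flip, and the assumption that all cash goes into each trade with no transaction costs are illustrative choices that the video does not specify.

```python
import random

def daily_returns(prices):
    """Percent change between consecutive daily prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def random_buy(prices, cash=100.0, rng=None):
    """Each day flip a coin (assumed 50/50); on heads, buy today and sell tomorrow."""
    rng = rng or random.Random()
    for r in daily_returns(prices):
        if rng.random() < 0.5:
            cash *= 1 + r  # assumption: all cash is invested in each trade
    return cash

def momentum_buy(prices, cash=100.0):
    """Buy only when the previous day's return was positive; sell one day later."""
    rets = daily_returns(prices)
    for prev, nxt in zip(rets, rets[1:]):
        if prev > 0:
            cash *= 1 + nxt
    return cash

# Made-up prices for illustration only; this is not the Apple data from the video.
prices = [100.0, 102.0, 101.0, 103.0, 104.0, 102.0]

# Repeat the random strategy many times, as in the histogram, and average the outcomes.
final = [random_buy(prices, rng=random.Random(seed)) for seed in range(1000)]
print("random buying, mean final cash:", sum(final) / len(final))
print("momentum baseline, final cash:", momentum_buy(prices))
```

Swapping in real closing prices for the `prices` list would give numbers comparable in spirit to the ones quoted above, though they would match the video only if the trade timing and position sizing match the notebook's.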
But either way, this is still an ill-informed model. So let's get to the real point of this video, which is building our autoregressive moving average models to see if we can beat these baselines. We're still losing money, and we would like to be making money.

Let's start with an AR1 model to establish a baseline, and then see if we can do better by taking into account those ACF and PACF plots. I have run several different AR1 strategies, and they differ in the threshold. Let me explain exactly how this works. Every day, we forecast what the stock return will be on the following day using an AR1 model. If that predicted return is bigger than some threshold, we buy, and the threshold is what we're varying. The plot you're looking at here is threshold zero, which means that if you predict the return on the next day to be anything positive, you buy the stock and then sell it on the following day. With this model and a threshold of zero, we have $97 at the end of this period, so we're still losing money. The good news is that we're not losing as much as with the baseline strategies above. If we pick a threshold of .001, which says I'm only going to buy if the predicted return is above .001, we end up with $95, which is actually a little bit worse. If we pick .005, we have $97. So we're doing a little bit better than the baseline methods. We're still losing money, but maybe we're getting somewhere.

Now let's use an AR5 model, the five being the lag in the PACF that we found was important. If we build AR5 models with the same thresholds, we start getting positive results. The total amount using a threshold of zero with an AR5 model is $104.
So hey, we've made about a 4% return in this period using an AR5 model. If we pick .001, it's the exact same thing. If we pick .005, we end up with $110, a 10% return in this time period.

Notice a pattern as the threshold increases. In this chart, for example, the threshold was .001, and you see there's lots of buying and selling going on. When we increase the threshold, there are fewer periods of buying and selling, because it's less likely that the AR5 model's predicted return is above .005. So the higher you make the threshold, the more sure you have to be about the predicted return to buy, but that also means you buy less often. It's a trade-off. We see that the trade-off is worth it in this case, because we made a 10% return in a roughly three-month period. So not bad. We're making lots of good decisions and only a couple of bad decisions along the way.

So maybe we think that if we try an ARMA(5,5), remember that was just an AR5, and include the MA5 component, we're going to do even better. That's also what I thought might be true. But when we actually build these models with a threshold of zero, we get negative 6%. With a threshold of .001, we get the same thing. And then here we get negative 5%. It seems weird on some level, so let's brainstorm why this might be happening. What might be happening is that we are overfitting. Overfitting generally means that you're building a model that's a little too complicated for the data you have, and therefore it doesn't perform well out in the wild. That might be what we're seeing. The AR5 model might have been sufficient, and that's why we're seeing up to 10% returns.
Maybe by making the model more complicated, even though it was informed by those PACF and ACF plots, the model is just a little too complicated for the data, and therefore we are incorporating too much noise into it, leading to poor returns in the long run.

Either way, I encourage you to play around with this notebook. Change the stock from Apple to Microsoft to Tesla and see what happens. Mess with the code all you want; I'll make it available to you. This was a video on whether we can use the ARMA framework, which is very popular in time series, to actually make money off of a stock. And the answer is yes. Furthermore, we can make more money than with these baseline methods of random buying or buying based on the previous day. Now, I will say that there's a lot more work to be done. For example, note that in all of these models we made the very simple assumption that when you buy, you sell on the next day, which obviously is not how everyone trades stocks. You can modify the code to change that however you want.

Hopefully you thought this video was interesting. I thought it was interesting, but I'm the one who made it. So if you like this video, please like and subscribe for more videos just like this, and I'll see you next time.
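The threshold rule described earlier (forecast the next day's return, buy only if the forecast exceeds the threshold, sell one day later) can be sketched as follows. This is a rough sketch under stated assumptions, not the notebook's code: it uses a hand-rolled least-squares AR1 fit instead of a library ARMA model, refits on all past returns each day, invests all cash on each trade, and runs on synthetic returns. The names `fit_ar1`, `threshold_strategy`, and `warmup` are hypothetical.

```python
import random

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        return mean_y, 0.0  # degenerate case: no variation to regress on
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = cov_xy / var_x
    return mean_y - phi * mean_x, phi

def threshold_strategy(returns, threshold=0.0, warmup=10, cash=100.0):
    """Walk forward: refit on past returns, buy if the forecast beats the threshold."""
    trades = 0
    for t in range(warmup, len(returns)):
        c, phi = fit_ar1(returns[:t])          # only data before day t is used
        forecast = c + phi * returns[t - 1]    # one-step-ahead predicted return
        if forecast > threshold:
            cash *= 1 + returns[t]             # buy, then sell one day later
            trades += 1
    return cash, trades

# Synthetic returns for illustration only; this is not the Apple data from the video.
rng = random.Random(0)
returns = [0.0]
for _ in range(59):
    returns.append(0.3 * returns[-1] + rng.gauss(0, 0.01))

results = {th: threshold_strategy(returns, th) for th in (0.0, 0.001, 0.005)}
for th, (final_cash, trades) in results.items():
    print(f"threshold={th}: final cash={final_cash:.2f}, trades={trades}")
```

Because the forecast on each day does not depend on the threshold, raising the threshold can only remove trades, which is the trade-off mentioned in the video. Changing the sell-on-the-next-day assumption would mean replacing the single `cash *= 1 + returns[t]` step with a longer holding rule.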