0:00
hey there Leo here and I am smiling and I hope you are too you might notice
0:05
there is a new background behind me that is because I've moved and the house you know the reason why there was no content
0:11
for a while is the house we were moving from ended up getting flooded and it just created a nightmare so anyways long
0:17
story short I'm back to create some new content So speaking of content what are we going to learn today well today
0:23
you're going to learn how to test any indicator and see if it has predictive power we're going to use a simple linear
0:29
regression machine learning to figure out whether or not more specifically
0:35
anchored VWAP and the anchored SMA and SMAs actually have some predictive power
0:41
I know that anchored VWAP is you know thought of very fondly among a lot of
0:47
traders and I'm not here to knock it I'm just going to show you my results and you know without going into too much
0:54
detail they weren't that great right I'm not going to be using the anchored VWAP but again um you can check out the rest
1:01
of this video or if you're not interested in the code I have another channel where in my next video I'm just going to discuss these
1:10
results and how it relates to trading so without dragging this on any further I hope you enjoy this one let me know how
1:16
it goes I'm just resetting up all the equipment and all of that stuff so I'd love to hear any constructive criticism
1:22
or how I could make these videos better alright thanks so the first thing that we're going to want to do is get our required Imports
1:29
and in this instance I'm going to use my own database that I've populated with my
1:34
own price data I teach you how to do that on this channel if you don't have your own database feel free to get the
1:41
data freely from yfinance which I also show you how to do so I'll link the yfinance tutorial video in the
1:47
description below that way you can follow along with free data if you don't have your own let's get the required
1:53
imports all right imports
2:00
now I'm going to import sys because I need to add my local path so I can
2:05
access my data import sys sys.path.append
2:11
home yo Smiggle development Alpha I
2:18
got so this will allow me to access you
2:23
know some of my data that's essentially what I'm doing it I'll point it out exactly when I'm doing it too so we're going to import the standard we'll
2:29
import pandas numpy statsmodels for the linear regression and p-values and
2:35
pyplot for plotting so import pandas as pd import numpy as np import statsmodels
2:44
as sm import matplotlib.pyplot as plt and then
2:51
we'll do a start date and get a lot of data so we'll do 2003-01-01 and an end date
2:57
equals 2022-12-31
3:04
perfect so that will give us our initial Imports now the next thing we need to do
3:11
is get our S&P list now this is going to get us the S&P 500 list
3:18
quickly and easily create the S&P 500 list now keep in mind the reason why we're
3:25
doing this is that we want a quick answer right we're not looking for Perfection here because there are some
3:31
issues with just scraping the S&P 500 list like this with um you know various
3:36
biases with look ahead bias and survivorship bias and all that other stuff but for right now we don't really
3:42
care about that we're just trying to identify does this even make sense so we'll just scrape the S P 500 list and
3:48
it should give us you know the answer that we're looking for and if we want to pursue it further we can have a more
3:54
rigorous backtest so anyways we'll import requests so that way we can make a
4:01
request to Wikipedia and then we'll use Beautiful Soup to
4:06
go ahead and manipulate that response so from bs4 import
4:11
BeautifulSoup and we'll create a function to scrape the S&P 500 so def
4:17
scrape_sp500_list and now what we want to do is pass in the URL
4:25
I have this already punched in and now what we
4:30
want to do is we want to take the response from requests.get url so we make that
4:37
request using this requests module and then we pass in the URL and
4:43
we get a response pretty simple right now we use BeautifulSoup to parse it so soup equals
4:49
BeautifulSoup response.text and we'll do html.parser
4:58
and now what we want to do is we want to grab the table so table equals soup find table
5:06
and pass it to class so that way you know it knows exactly what we're trying to Target so we
5:21
and then rows equals table find_all tr for table row
5:27
now symbols now that we have the data
5:33
the rows now we just create a list of the symbols an empty list for row in rows right
5:41
we'll skip the first row because that's a header row symbol equals row find_all
5:47
td or table data the first one text strip so symbols append
5:54
symbol then return symbols
6:02
sp500_list equals sorted scrape_sp500_list
6:10
and then print sp500_list and we'll just print the first five
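The scraping steps dictated above can be sketched as follows. The URL and the table class are never spoken in the video, so `WIKI_URL` and `TABLE_CLASS` below are my assumptions about the Wikipedia page being scraped, and the split into a parsing helper plus a download function is my own arrangement so the parsing can be tried without network access.

```python
import requests
from bs4 import BeautifulSoup

# Assumptions: the video does not state the URL or the table class it targets.
WIKI_URL = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
TABLE_CLASS = "wikitable"


def parse_sp500_symbols(html, table_class=TABLE_CLASS):
    """Return the first-cell text of every data row in the first matching table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": table_class})
    rows = table.find_all("tr")
    symbols = []
    for row in rows[1:]:  # skip the header row
        cells = row.find_all("td")
        if cells:
            symbols.append(cells[0].text.strip())
    return symbols


def scrape_sp500_list(url=WIKI_URL):
    """Download the page and return the ticker symbols it lists."""
    # Some sites reject requests without a User-Agent; this value is a placeholder.
    headers = {"User-Agent": "sp500-tutorial-sketch"}
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()
    return parse_sp500_symbols(response.text)


# Usage (needs network access):
# sp500_list = sorted(scrape_sp500_list())
# print(sp500_list[:5])
```

As the video says, a list scraped today carries survivorship and look-ahead bias, so this is only good enough for a quick first pass.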
6:16
hopefully I didn't make any typos and of course I did all right so where did I mess up couldn't find
6:22
a tree builder with the features you requested for the HTML parser string
6:28
well makes sense p-a-r-s-e-r
6:34
there we go so now we see that we have you know the S P 500 tickers they're
6:39
fantastic now the next step is to get the prices
6:45
getting the price list is pretty easy again you can use yfinance or if you've already built
6:51
your own database of daily prices use that function again so these are
6:59
uh you know this is my data right from SRC which you notice of course for my
7:06
Civic Alpha AI project where I imported it up here if you're wondering where that is you don't have that unless
7:13
you've named it the same but the point is uh either import using yfinance or use your own stuff here so funnels
7:19
API import get_daily_prices
7:24
df equals get_daily_prices sp500_list
7:29
start_date equals start_date
7:35
end_date equals end_date that should give me the
7:42
prices and this can take some time because again we've got however many tickers from when we say 2003 now that
7:49
we have our data let's go ahead and create our VWAP and anchor points VWAP and anchor
7:57
if you're not sure what VWAP is I have an article on both AVWAP and
8:03
VWAP that teaches you how to calculate it and all of the details about that I'm not going to cover that here because we've got 10 steps to go through and I
8:10
don't want to spend the time discussing what VWAP is if you're here you know the whole point is to learn how to code
8:16
this stuff but if you need the additional assistance just check out the website
8:21
all right so now first we're going to create our two periods and then we need
8:28
a anchor threshold what I mean by that is we're not going to decide hey this
8:33
news should be anchored or this news shouldn't all we're going to do is look for a volume Spike that is three times
8:38
larger than the 20-day moving average and we'll anchor to that point so
8:44
vwap_period equals 20 anchor_period equals 20
8:52
anchor_volume_multiplier equals 3 so three times the anchor
9:00
period average volume now what we want to do is we want to
9:05
calculate the typical price some VWAP calculations only use close price but
9:10
the proper pedantic or maybe academic way to do it or what the most
9:16
authoritative sites say is to use the typical price which is simply an average of the
9:21
low high and close let's calculate that now df typical price equals df low plus df
9:40
high plus df close and we just divide by three and now
9:48
we need to weight that price by the volume for that period so df
9:56
typical price times volume equals df
10:02
typical price times df volume right so this price will
10:08
now be weighted by that volume so assuming that all worked which it did now we have all of the requisite stuff
10:15
to be able to create our VWAP calculation so VWAP equals now we're going to group by
10:24
ticker so group by ticker now why are we doing that well
10:30
what we don't want to accidentally do is have Apple's volume on uh some other
10:36
ticker's price right so we want to make sure that VWAP is only calculated within each one of these ticker bounds because
10:43
we're you know rolling uh doing like rolling average and we don't want to actually roll our average to the wrong
10:49
ticker so we'll do df.groupby
10:54
ticker makes sense because we're grouping by the ticker then we'll take this typical price times volume
11:06
rolling over the VWAP period so what are we doing there we're
11:11
grouping by the ticker and taking the typical price times
11:17
volume and then we're taking the last 20 periods or rows and
11:25
summing it okay so that's the first part then what we want to do is we want to
11:30
divide that by the total volume right
11:56
now what we want to do is now that we have that typical price times volume we want to divide by that total volume to
12:04
get that weighted price we'll do df.groupby ticker
12:12
volume rolling same period
12:17
the VWAP period and sum that up now what we'll do is we'll
12:23
reset the index level zero and drop equals True
12:29
uh so that's just so we can add this and let's go ahead and
12:35
run it and of course I messed something up let me fix that
12:40
DataFrame object has no attribute group by look at this getting out the initial
12:46
YouTube jitters there we go so now we can see that we have this VWAP and this makes sense to
12:52
have these NaNs here because right we need those 20 values to be able to
12:58
calculate that because it's a rolling 20 period I'll go ahead and delete this and
13:03
now that we have our VWAP calculated what we'll do is we'll calculate the simple moving average so we can compare
13:09
them right so df sma equals df.groupby
13:16
ticker again since we're using that rolling we
13:21
want to make sure that we don't roll across ticker boundaries then rolling vwap_period
13:28
mean right so all we're doing for a simple moving average is we take uh you know the sum of the prices and then we
13:35
just divide by uh you know however many rows there are or take the mean
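The typical price, rolling VWAP and SMA steps described above can be sketched on a toy frame like this. It is a sketch under assumptions: the column names (`typical_price`, `tp_x_volume`, `vwap`, `sma`), the use of `close` for the SMA, and the tiny made-up prices are mine, and the period is shortened to 3 so the numbers are easy to check by hand (the video uses 20).

```python
import numpy as np
import pandas as pd

vwap_period = 3  # the video uses 20; shortened for the toy data

# Toy (ticker, date) MultiIndex frame with made-up prices and volumes.
dates = pd.date_range("2022-01-03", periods=5, freq="B")
index = pd.MultiIndex.from_product([["AAA", "BBB"], dates], names=["ticker", "date"])
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 20.0, 21.0, 22.0, 23.0, 24.0]
df = pd.DataFrame(
    {
        "high": prices,
        "low": prices,
        "close": prices,
        "volume": [100, 200, 300, 400, 500, 100, 100, 100, 100, 100],
    },
    index=index,
)

# Typical price and its volume-weighted counterpart.
df["typical_price"] = (df["high"] + df["low"] + df["close"]) / 3
df["tp_x_volume"] = df["typical_price"] * df["volume"]

# Group by ticker so the rolling window never crosses from one ticker into the next.
grouped = df.groupby(level="ticker")
rolling_tpv = grouped["tp_x_volume"].rolling(vwap_period).sum()
rolling_vol = grouped["volume"].rolling(vwap_period).sum()

# groupby + rolling prepends the group key as an extra index level; drop it
# so the result lines up with df's (ticker, date) index again.
df["vwap"] = (rolling_tpv / rolling_vol).reset_index(level=0, drop=True)
df["sma"] = grouped["close"].rolling(vwap_period).mean().reset_index(level=0, drop=True)
```

The first `vwap_period - 1` rows of each ticker come out as NaN, including the first rows of `BBB`, which is the sign that the window did not roll across the ticker boundary.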
13:41
and I'm going to reset the index again level zero
13:53
okay and now I run it perfect so that calculated both the VWAP and the SMA now
14:00
we've got a few more things to do we want to calculate the fields where there's that anchor Spike
14:06
and then we want to check them so let's do that now so df anchor_volume_spike
14:13
equals df.groupby ticker
14:24
and we'll do rolling vwap_period
14:30
actually I should do anchor_period even though it's the same so anchor_period mean times the
14:46
anchor_volume_multiplier then reset_index level zero drop equals True
14:51
that's just so we can set that value all right so let's see if
14:57
that's the case so what we do is we group by the ticker on volume we roll that over
15:04
20 and take the average so we got the average 20-day volume and times the
15:11
multiplier so what this will do is it'll take um you know this is the anchor volume
15:18
requirement right so three times the average rolling volume now all we need
15:23
to do is create a true or false field for if there is an anchor so df anchor
15:34
equals df volume is greater than df anchor_volume_spike
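The volume-spike anchor rule described above can be sketched like this. The column names are assumptions, the volumes are made up, and the period is shortened to 5 for the toy data (the video uses 20). Note that the rolling mean includes the current day, as in the video's description, so the spike has to clear three times an average that already contains it.

```python
import pandas as pd

anchor_period = 5             # the video uses 20; shortened for the toy data
anchor_volume_multiplier = 3  # a spike must exceed 3x the rolling average volume

dates = pd.date_range("2022-01-03", periods=8, freq="B")
index = pd.MultiIndex.from_product([["AAA", "BBB"], dates], names=["ticker", "date"])
volume = [100] * 6 + [1000, 100] + [200] * 8  # one made-up spike in AAA
df = pd.DataFrame({"volume": volume}, index=index)

# Threshold: rolling average volume per ticker times the multiplier.
df["anchor_volume_spike"] = (
    df.groupby(level="ticker")["volume"]
    .rolling(anchor_period)
    .mean()
    .reset_index(level=0, drop=True)
    * anchor_volume_multiplier
)

# True where the day's volume exceeds the threshold; NaN thresholds compare as False.
df["anchor"] = df["volume"] > df["anchor_volume_spike"]

# Quick check of which rows were flagged.
anchors = df[df["anchor"]]
print(anchors)
```

With these numbers only the 1000-share day in `AAA` is flagged: its window averages 280, so the threshold is 840.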
15:41
makes sense then run that no problems and then let's test it what
15:47
we'll do is df where df anchor
15:55
equals True and then what I'll do is I'll just pull up
16:01
a few of them so then we can check to see on these dates whether or not there was indeed a volume spike now that
16:10
we have the data we can analyze the VWAP versus SMA returns or simple moving average returns and then we'll do
16:16
the anchored VWAP right because it's a little bit uh a step above in the complexity level so analyze the VWAP
16:24
versus SMA returns and sorry about this sound I did notice it was chopping
16:29
hopefully this will be a little bit better but anyways what we're going to do here I'm just going to copy and paste
16:35
some code and explain it because I think that would be easier so what we're going to do first because
16:41
we have this multi-index is we're going to reset the index right so that means this multi-index ticker and date they
16:48
will now be columns this then allows us to sort the data frame which it
16:55
shouldn't really unsort it but just to be safe and I put that in there and then
17:01
once we have it sorted we're going to calculate the the daily return and
17:07
that's pretty easy to do all we do is we Group by again because we want to make sure that the percentage change
17:13
is limited by you know that ticker bound
17:19
so we're not calculating like uh Apple's price change with Tesla's or whatever right so we create this daily return by
17:26
doing the percent change grouping by the ticker and then how we're going to analyze
17:32
whether or not these indicators meaning the simple moving average or VWAP has any predictive power is we're going
17:40
to calculate the distance away from price right so we can see here we have
17:46
the simple moving average let's just say it's 101 and the close is you know 100 so that's 101 minus 100
17:53
divided by 100 so then we can see you know it's essentially one percent away
17:58
right so the idea is the stronger or further away that the SMA or the VWAP
18:06
is from price the stronger that signal should potentially be that's why I did it that way and then we'll do the same
18:13
thing with VWAP we take the VWAP price minus close and divide by close which just gets the percentage away from the close
18:20
price positive or negative and then we calculate the return right
18:27
so we can just cumulative sum now you might take issue with this but it's a lot easier to sum some of these things
18:34
and just look for predictive power than to have exponential returns and analyze
18:39
um so again as I said before we're prototyping here we want simple clean and easy we don't necessarily want
18:46
Precision right because we want to say yes this meets our gate criteria we're
18:52
going to dig further into this and then you start digging further into that we'll drop the NaN values because there
18:58
will be some NaN values because we're doing the percent change right so the
19:04
first one there is no percent change before it right so it'd be NaN and then we set the index because again we reset
19:11
that index earlier so that puts the multi-index back in place and print it out so I'll run this
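The return and distance calculations walked through above can be sketched as follows. The pasted code is not shown in the video transcript, so the column names and the toy numbers are assumptions; the second `AAA` row reproduces the spoken example of an SMA of 101 against a close of 100.

```python
import pandas as pd

dates = pd.date_range("2022-01-03", periods=3, freq="B")
index = pd.MultiIndex.from_product([["AAA", "BBB"], dates], names=["ticker", "date"])
df = pd.DataFrame(
    {
        "close": [99.0, 100.0, 102.0, 50.0, 55.0, 55.0],
        "sma": [100.0, 101.0, 101.0, 52.0, 52.0, 53.0],
        "vwap": [100.0, 100.5, 101.0, 51.0, 53.0, 54.0],
    },
    index=index,
)

# Turn the index levels into columns and sort, just to be safe.
df = df.reset_index().sort_values(["ticker", "date"])

# Daily return per ticker, so one ticker's price is never compared with another's.
# fill_method=None avoids silently forward-filling missing prices.
df["daily_return"] = df.groupby("ticker")["close"].pct_change(fill_method=None)

# Signed percentage distance of each indicator from the close.
df["sma_distance_pct"] = (df["sma"] - df["close"]) / df["close"]
df["vwap_distance_pct"] = (df["vwap"] - df["close"]) / df["close"]

# Simple (non-compounded) cumulative return per ticker.
df["cumulative_return"] = df.groupby("ticker")["daily_return"].cumsum()

# The first row of each ticker has no prior close, so its return is NaN.
df = df.dropna().set_index(["ticker", "date"])
print(df)
```

Summing returns instead of compounding them is the shortcut the video chooses on purpose for prototyping; it understates or overstates long-run growth but keeps outliers tamer.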
19:18
and then we can see here we have our multi-index data frame once again OHLCV
19:24
our typical price remember that's just the average of the low high and close our typical price times volume so that's
19:32
our volume weighted price and then we get the VWAP by dividing the typical price times volume
19:38
by uh the volume for that rolling period we also have the SMA and the
19:45
anchor um so I can't move over there for whatever reason
19:51
uh whatever I'm recording so anyways um so that gives us all of that information
20:05
and then we'll plot the returns let me go ahead and create a new section I
20:11
don't remember what oh fix uh plot returns
20:17
go and now what we're going to do is we're going to get a list of the tickers and
20:23
simply plot all of the returns but we're going to truncate the legend because we don't want a
20:29
bajillion tickers just scrolling down the side of our page right so we already have numpy so we'll say tickers equal DF
20:36
index get level values we're going to want to get the first index which is the ticker
20:42
right and then that would give us a bunch of tickers right so what we want to do is we want to get the unique values and
20:49
then turn it into a list and the reason why we get a bunch is because for every row we get ZTS ZTS ZTS so we don't
20:55
want that we just want the unique values so now that we have all the tickers let's create a figure and an axis so
21:03
uh create figure and axis I'm going to do a
21:08
better job with comments so that way if you're looking at this in the future and trying to work through it you'll have
21:13
some more information but anyways fig ax equals so that's the figure which is what the plot is on and then the axes
21:22
um we'll do plt.subplots figsize and six
21:27
we're not going to have any subplots we're just going to have a bunch of larger plots
21:35
then we'll loop through each ticker and plot the
21:42
log daily return cumulative sum
21:47
maybe we'll do yeah um probably doesn't even really matter but anyways we may do log so
21:54
for ticker in tickers ticker_data equals
22:05
daily return cumulative sum right so what we're doing here is we're
22:11
just taking these daily returns for each ticker right so
22:17
um and we're turning it into this ticker data and then what we do is we want to plot
22:25
for each ticker this data so ax.plot ticker_data
22:31
index ticker_data label equals ticker and this
22:38
should give us Right add labels Legend
22:44
this should give us one plot with a bunch of tickers plotted on there and hopefully we don't have
22:50
many outliers but we probably will and that's why I did it with sum versus
22:55
exponential returns but we'll probably still run into the problem but anyways ax set X label
23:02
I want to say this is the date ax that y label
23:07
it says cumulative daily return
23:15
right so this should actually be daily return I'm not going to do the log I'm going to keep it simple instead
23:21
um and then handles labels equals ax dot
23:26
get_legend_handles_labels and then ax.legend handles
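The plotting loop dictated above, with the legend cut down to a handful of entries, can be sketched like this. The random returns are made up, the figure width of 12 and the limit of 10 legend entries are my assumptions (the video only mentions a height of six and a truncated legend).

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up daily returns for 25 hypothetical tickers.
rng = np.random.default_rng(1)
dates = pd.date_range("2022-01-03", periods=60, freq="B")
names = [f"T{i:02d}" for i in range(25)]
index = pd.MultiIndex.from_product([names, dates], names=["ticker", "date"])
df = pd.DataFrame({"daily_return": rng.normal(0, 0.01, len(index))}, index=index)

max_legend_entries = 10  # assumption: keep the legend short

# Unique tickers from the first index level.
tickers = df.index.get_level_values(0).unique().tolist()

# One figure, one set of axes, one line per ticker.
fig, ax = plt.subplots(figsize=(12, 6))
for ticker in tickers:
    ticker_data = df.loc[ticker, "daily_return"].cumsum()
    ax.plot(ticker_data.index, ticker_data, label=ticker)

ax.set_xlabel("Date")
ax.set_ylabel("Cumulative daily return")

# Truncate the legend so hundreds of tickers do not run down the page.
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[:max_legend_entries], labels[:max_legend_entries])
# In a notebook the figure renders on its own; elsewhere call plt.show().
```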
23:37
okay now let's analyze results I'll paste this in here and go line by line uh one thing that I think I messed up
23:44
let's scroll back up here uh yeah we'll do
23:51
import statsmodels.api as sm so I need to fix that
23:56
okay so now what we do is I first create a blank data frame right this is going
24:02
to hold the alpha of the V web and SMA signals Alpha's Ascent also The
24:08
Intercept but basically it just means the return above and beyond what would be expected if you're curious about what
24:16
that means you can check out my website analyzing Alpha to find out exactly what Alpha really is we've got a p-value and
24:23
for both B web and SMA to determine whether or not the results are statistically significant
24:29
now let's do some dropping we drop any nand fields from the data frame and then
24:35
also any ticker that has less than 50 data points I just remove it so that's
24:40
pretty easy right so then the next thing that we're going to do is I'm going to go ahead and loop through every ticker
24:47
right so for every ticker in tickers we get the data remember we have a multi
24:55
you know multi-index so instead of getting this whole thing we'll just get the data for A dropping the A level right
25:01
so that just gives us the data so then what we do is we then define our
25:07
independent and dependent variables right after the data cleaning so this is
25:13
that model preparation so what we do is we have the dependent variable or the thing that we're predicting will be the
25:19
daily return technically speaking it's the next day daily return right that's why we see one we start at one here we
25:27
wouldn't want to start at zero because that doesn't really make any sense we want to take the VWAP and the
25:33
SMA use that data and again to predict the next day and that's why I remove the
25:40
last element from here and start you know essentially remove the first element here so they're equal length but
25:47
essentially what we're doing is we're just saying all right let's take VWAP and SMA distances and see if we can
25:52
predict the next return to some degree and then we fit the model right this
25:58
just does all the calculations for us we create the VWAP model the SMA model we get the results after fitting them
26:05
and then we simply plot it and plot the linear regression line and all of that
26:11
other fun stuff so if you're not sure how to do this linear regression thing
26:16
all right I've got a I've got a pretty in-depth tutorial that I thought was really good it teaches you all about the
26:22
different ways to do linear regression and take you through but pretty cool project didn't get a lot of views I
26:29
think it's really good if you're interested check that out that'll teach you all about this and then after the titles what we do is I create
26:36
for every row we append the data right of the alphas that you know the stuff up
26:43
here right so we create a new data frame called alpha_df with all that information and I'll run it and you'll
26:50
see here that it starts printing out all of these graphs here's the VWAP equation and the SMA equation
26:57
essentially we're just looking for the y-intercept you know which is the intercept or the alpha
27:03
and then this is going to take a little bit but really what we're interested in here after all of this plotting is that
27:09
we want to understand what is the average or mean Alpha for all of this
27:17
right so we have all of these tickers and now what we want to see is okay does this really give us some results so what
27:25
I'll do now is I'll create a new row here and then run this
27:30
and what this is doing is we're taking essentially this mean row right and
27:36
creating this so we take all of the tickers we just call this mean and then the VWAP alpha we take the mean of that
27:44
the SMA Alpha and the reason we're not doing sum or anything like that is that we we want to kind of get rid of some of
27:50
the outliers I mean we could have done median here mean's fine I mean basically again remember we're trying to get good
27:58
right good is good enough it doesn't have to be perfect so let's see that'll take a minute to
28:15
all right so now we have the returns and we can see that here's the tickers now
28:21
there's 500 of them and then we have the VWAP alpha SMA
28:27
alpha VWAP p-values and SMA p-values we see here that the VWAP has a little bit
28:32
higher alpha than the SMA but we also see that the p-values are above 0.05
28:39
which means that this is not statistically significant uh so take
28:45
that with what you will all right so now what we'll do actually before I do that I'm going to create another heading
28:52
analyze AVWAP versus ASMA results
28:58
and really what we're testing here is whether or not volume weighting makes sense because AVWAP is essentially just
29:06
a volume-weighted moving average that is anchored and then I'll paste this in here and we'll
29:12
cover it line by line I felt like that was more effective so the first thing we do is we get all of the tickers that have a volume Spike
29:19
okay and then what we do is we make sure that the ticker has multiple anchors because
29:26
of just how I use the logic down here and there's other ways to do this but again remember we're just trying to get
29:33
quick results so that way we can prototype this put this on our list of things to test further
29:38
we just create some containers for the results and then for every ticker and
29:44
the anchors index we then get the data we use the anchor indices which I
29:51
thought is pretty slick here what we do is for start end in zip of the anchor indices so what we're doing is we're taking this
29:57
data and then we're looking at between these anchor spikes right or volume
30:04
spikes or 50 days if it ends within 50 days right so basically if the data ends
30:12
before 50 periods or there's another volume Spike you know that that ends but anyways
30:18
um so then we just create the AVWAP and the ASMA using this new data we've seen that before because we have to do it
30:24
between the anchor volume spikes and then essentially you know just
30:30
concatenating all the results together and then here we have this you know
30:35
renaming it and finally merging all the data so when I do this this will give us a data frame of uh you know that
30:43
anchored VWAP and SMA data so now what we want to do is let's just take a quick
30:49
peek and see what does this actually look like on our chart
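The anchored calculation described above can be sketched for one ticker like this. The pasted code is not visible in the transcript, so this is my reading of it: segments run from one anchor to the next (or 50 rows, whichever comes first), the last anchor starts no segment because the loop zips consecutive anchors, and the AVWAP and ASMA use the standard cumulative definitions from the anchor. The function name, column names and toy data are assumptions.

```python
import numpy as np
import pandas as pd

max_anchor_days = 50  # a segment ends after 50 rows or at the next anchor


def anchored_averages(data, max_days=max_anchor_days):
    """Compute AVWAP and ASMA between consecutive anchors for one ticker."""
    positions = np.flatnonzero(data["anchor"].to_numpy())
    segments = []
    # Pair each anchor with the next one; needs at least two anchors.
    for start, next_anchor in zip(positions[:-1], positions[1:]):
        end = min(next_anchor, start + max_days)
        seg = data.iloc[start:end].copy()
        # Cumulative from the anchor: volume-weighted price and plain average.
        seg["avwap"] = seg["tp_x_volume"].cumsum() / seg["volume"].cumsum()
        seg["asma"] = seg["close"].expanding().mean()
        segments.append(seg[["avwap", "asma"]])
    if not segments:
        return pd.DataFrame(columns=["avwap", "asma"], dtype=float)
    return pd.concat(segments)


# Toy data: close doubles as the typical price, anchors at rows 1, 3 and 9.
dates = pd.date_range("2022-01-03", periods=10, freq="B")
data = pd.DataFrame(
    {
        "close": np.arange(10.0, 20.0),
        "volume": [100.0, 300.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0],
        "anchor": [False, True, False, True, False, False, False, False, False, True],
    },
    index=dates,
)
data["tp_x_volume"] = data["close"] * data["volume"]

result = anchored_averages(data, max_days=3)  # short cap so the limit is visible
print(result)
```

With the cap set to 3, the first segment covers rows 1 and 2 (cut off by the next anchor) and the second covers rows 3 to 5 (cut off by the cap).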
30:55
right did we actually do the VWAP correctly or the anchoring correctly so
31:00
we just create a plot right and then we Loop through you know each one of the tickers for the index
31:07
essentially and what we should see is a graph with the actual chart of a stock
31:14
and then highlighted where the anchoring points occur
31:21
and since there's a lot of data this will take some time so I will see you as soon as it's done processing
31:27
and now we can see here that after this plot we have the price for American Airlines in blue and then we see the I
31:36
guess the volume spike anchors you know VWAP and SMA charted here in orange so it does
31:44
appear you know that looks like that might be an earnings announcement well they're
31:49
probably mostly if not all earnings announcements but it does appear that it works and the maximum is
31:55
50 days unless another one occurs so looks good to me from a visual inspection so now let's go ahead and see
32:02
if the anchored variants have any Alpha so the first thing we need to do is
32:07
create the distances right we've already done this before so I'll just copy and
32:12
paste this over I feel like this is an easier method at least for me let me know if you
32:18
like it being more interactive and watching all my typos or me copying and pasting and just explaining stuff so
32:24
anyways we create this AVWAP distance percentage which is just how far away the AVWAP is from close from a
32:31
percentage perspective positive or negative same thing with ASMA and then we go ahead and drop any NaNs and then we
32:38
have our data frame with the AVWAP and anchored SMA distances right so now what
32:46
we need to do is we need to go ahead and get tickers with AVWAP right so in the
32:54
next section we will we'll need all of these tickers because we need to Loop through them just like we did you know
33:00
above in this this code way up here right so go ahead and run this
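The anchored distance columns and the list of tickers that actually have an AVWAP can be sketched like this. The column names and toy values are assumptions; `BBB` has no anchored values at all, so it drops out.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2022-01-03", periods=3, freq="B")
index = pd.MultiIndex.from_product([["AAA", "BBB"], dates], names=["ticker", "date"])
df = pd.DataFrame(
    {
        "close": [100.0, 100.0, 100.0, 50.0, 50.0, 50.0],
        "avwap": [102.0, 99.0, np.nan, np.nan, np.nan, np.nan],
        "asma": [101.0, 100.0, np.nan, np.nan, np.nan, np.nan],
    },
    index=index,
)

# Signed percentage distance of each anchored indicator from the close.
df["avwap_distance_pct"] = (df["avwap"] - df["close"]) / df["close"]
df["asma_distance_pct"] = (df["asma"] - df["close"]) / df["close"]

# Keep only rows where the anchored values exist.
df = df.dropna(subset=["avwap_distance_pct", "asma_distance_pct"])

# Tickers that still have rows, for looping over in the regression step.
avwap_tickers = df.index.get_level_values("ticker").unique().tolist()
print(avwap_tickers)
```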
33:07
you can see here are our tickers and now I'll create a new row here and we're
33:14
going to go ahead and paste this in we can see we likely don't need this we already
33:21
have numpy we already have matplotlib because we've already plotted this stuff
33:26
so now what I'm going to do is just run this well actually before I run it I'll explain it right we did touch on this
33:33
pretty in depth but this one the second one obviously we have two new columns we have the SMA alpha and the SMA p-value
33:41
and we're doing the exact same thing you know we're trying to predict this independent I'm sorry the dependent
33:47
variable and then from a bunch of independent variables right in this case
33:53
we have AVWAP ASMA and I've added just the simple moving average to see you
33:58
know does the anchoring matter at all or should we just be using a 20-day SMA so
34:03
then we fit the models plot it you know we've seen all this before draw straight lines
34:08
uh you know the best fit line then we create this mean row where we take all of the AVWAP alpha ASMA alpha SMA alpha
34:16
and the p-values and we just take the mean of that and then run this and you
34:22
can see it'll start graphing quite a few things and then this will take a
34:27
minute or two and then once that is done we will go ahead and determine does the
34:33
anchoring actually help us generate Alpha which would be nice for you know
34:38
price earnings gaps or EPS things like those types of setups so with that done processing let's go ahead and analyze
34:46
the results new section here
34:52
and for context purposes let's get the prior results so alpha_df mean if
34:57
we recall they were very similar but they did not have significance because
35:02
you know they're above that 0.05 threshold so let's go ahead and see if the anchored variants are any better so
35:09
alpha_df2 mean and here's the big reveal so it looks
35:15
like the AVWAP performs a little bit worse than the anchored SMA and
35:23
then the simple moving average performs the best right so it seems like the
35:29
volume weighting or the anchoring actually hurts the indicator's potential profitability and then we have you
35:38
know no significance here so what does this mean right so when you're looking at data you have to figure out okay why
35:43
is it the way it is right and try to hypothesize that well my thought process in regards to anchoring on
35:50
volume is I don't think it's that great of an idea because if you really think about price right price is a discounting
35:57
mechanism where essentially you know the market is very
36:02
good at getting the latest information so a lot of times when you anchor on a
36:07
lower price let's just say in a post earnings announcement drift scenario you're actually using and heavily weighting
36:15
volume that occurred earlier on in the period where price hasn't gone through
36:21
that price Discovery yet so my initial thought process you know when thinking
36:26
through this before doing this was that you know what anchoring is probably not that great of an idea because you're
36:31
overweighting historical prices relative to newer prices which in my opinion
36:39
and the market's opinion you can't argue with it right I mean they've incorporated you know that further
36:44
information so I think this makes logical sense I'm not going to use anchoring in my strategies how about
36:52
you what did you think did you like this video and if you did let me know if you like me typing stuff out or if
36:59
you like me you know just explaining the code and hopefully I'll see you in the next video thanks bye