yfinance Python Tutorial: Get Free Market Data
532 views
Dec 11, 2024
Learn how to get free financial data from Yahoo Finance using #python and #yfinance. This includes - Price data at minute granularity - Fundamentals data - Options data - Institutional holdings - And more... 👍 Subscribe for more: https://bit.ly/3lLybeP 👉 Follow along: https://analyzingalpha.com/yfinance-python #python #yfinance #pandas 00:00 Introduction 01:28 Install yfinance Using Pip 01:48 Import yfinance and pandas 02:00 How to Download Historical Price Data Using yfinance 02:32 Download One Ticker Using yfinance
View Video Transcript
0:00
if you're interested in grabbing freely
0:01
available yahoo data for your algorithms
0:03
you've come to the right place today
0:05
we're going to cover the yfinance
0:07
library and i'm going to show you how to
0:08
get all of the yahoo finance data which
0:10
is freely available but i must caveat this
0:13
by saying i would not recommend
0:15
using yahoo finance in a live trading
0:18
algorithm it's just more for research
0:20
purposes or if you're not ready to get a
0:22
paid service like polygon or quandl but
0:24
anyways i digress let's go ahead and see
0:26
what we're going to cover today we're
0:28
first going to install yfinance if you
0:30
don't have it already then we're going
0:31
to import the yfinance and pandas
0:33
libraries then we'll
0:35
go ahead and download historical price
0:38
data using one ticker and then multiple
0:40
tickers and then after that we'll
0:42
download some fundamental data again
0:44
with one ticker and then multiple
0:46
tickers then we'll learn how to get
0:48
options data and institutional holdings
0:50
and then finally we'll wrap it up with a
0:52
clear example of why you probably
0:54
shouldn't use yahoo finance for live
0:56
trading
0:57
the blog article is right here it's
0:59
already live and essentially covers uh
1:01
the same thing but i do detail
1:04
a lot of the methods are you know easier
1:07
to to read here so you may want to
1:10
browse over there if you have questions
1:12
regarding
1:13
uh you know the different signatures and
1:15
things like that so anyways let's go
1:17
ahead and get started before i digress
1:19
too much installing yfinance is super
1:21
easy i'm going to use pip
1:23
type bang pip install yfinance i'm
1:26
not going to hit enter here
1:28
but if you are in jupyter notebook
1:30
you'll have to put the bang so that way
1:33
a jupyter notebook knows to run this
1:34
as a command if you're just at your
1:36
shell with your virtual environment
1:37
activated you just type pip install
1:39
yfinance and that will give you
1:41
or install the library now the next
1:43
thing we'll do is import yfinance
1:46
and pandas so we'll type import
1:49
pandas as pd import yfinance as yf
1:54
simple enough hit enter so they're
1:55
available to us and now before we
1:58
um you know go on with downloading one
2:01
ticker you know that another way you can
2:03
find out what's available to you is just
2:05
type dir(yf)
2:07
or go to
2:08
the documentation but we can see here
2:11
you know we're going to be
2:12
using
2:13
this stuff here but we're probably going
2:14
to be concerned with ticker
2:17
and also
2:19
the
2:20
download there so so even though it
2:22
might feel overwhelming it's it's pretty
2:25
easy for the most part so speaking of
2:27
pretty easy let's download the data for
2:29
one ticker
2:30
we'll just say let's see tdg for
2:33
transdigm it's an airplane parts manufacturer
2:36
ticker
2:37
we'll do tdg
2:39
so what we're doing is we're using the
2:42
yfinance ticker
2:44
to download the tdg symbol into tdg
2:48
and now just hit enter here and you can
2:50
see that it is indeed a yfinance ticker
2:53
object okay
2:55
so pretty straightforward there and if
2:58
you're curious on all of the
3:00
uh you know functionality available to
3:02
you
3:03
you can just
3:05
check out
3:07
this right here so there's clearly a lot
3:10
you can do with that ticker object
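To make the dir tip a bit more concrete, here is a minimal sketch. `public_names` and `make_ticker` are hypothetical helper names, not part of yfinance; the only calls assumed are `yf.Ticker(symbol)` and Python's built-in `dir`.

```python
def public_names(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return sorted(name for name in dir(obj) if not name.startswith("_"))


def make_ticker(symbol):
    """Create a yfinance Ticker object for one symbol."""
    # Third-party import kept inside the function so public_names() still
    # works where yfinance is not installed.
    import yfinance as yf

    return yf.Ticker(symbol)
```

With yfinance installed, `public_names(make_ticker("TDG"))` should list names such as `history` and `info`; the exact list depends on the installed version.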
3:13
okay so now
3:15
let's go ahead and get
3:17
the history
3:18
for transdigm so we'll type data equals
3:23
tdg history
3:26
and then i'll just do data head not to
3:29
take up the screen with all of the data
3:30
but you can see here that it gives us
3:33
daily data open high low close volume
3:35
dividends and stock splits
3:38
pretty straightforward
3:40
but it might be
3:41
more interesting to
3:43
select a certain date so we'll do data
3:48
tdg history
3:50
we'll do and change the interval right
3:52
interval
3:54
equals one minute so now with one minute
3:56
data i believe you can only get one week
3:58
of data
3:59
so start
4:00
2022-01-03
4:04
and we'll do end 2022-01-10
4:09
data this should give us minute data
4:12
and it indeed does you can see 9 30 it
4:15
even has the time zone
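The minute-data call described above might look roughly like this. It is a sketch under assumptions: the one-week cap is taken from the narration rather than checked against Yahoo, and `minute_window` and `fetch_minute_bars` are hypothetical helper names. The yfinance call assumed is `Ticker.history` with its `interval`, `start` and `end` keywords.

```python
from datetime import date, timedelta

# Assumption from the narration: one-minute bars cover about one week per request.
MAX_MINUTE_SPAN_DAYS = 7


def minute_window(start, days=MAX_MINUTE_SPAN_DAYS):
    """Return (start, end) as YYYY-MM-DD strings spanning at most one week."""
    if not 1 <= days <= MAX_MINUTE_SPAN_DAYS:
        raise ValueError(f"days must be between 1 and {MAX_MINUTE_SPAN_DAYS}")
    end = start + timedelta(days=days)
    return start.isoformat(), end.isoformat()


def fetch_minute_bars(symbol, start, days=MAX_MINUTE_SPAN_DAYS):
    """Download one-minute bars for symbol (needs yfinance and network access)."""
    import yfinance as yf  # third-party, imported lazily

    start_str, end_str = minute_window(start, days)
    return yf.Ticker(symbol).history(interval="1m", start=start_str, end=end_str)
```

`minute_window(date(2022, 1, 3))` reproduces the start and end used in the video. Yahoo tends to serve minute bars only for recent dates, so an old window like that one may come back empty today.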
4:17
so it looks like it does skip a minute
4:20
here which
4:21
is common with minute data if no trades
4:24
were transacted during that time frame
4:26
it would just skip a bar and that's
4:28
pretty much how it works so there's no
4:29
reason to send data to us if nothing
4:32
happened okay
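If you want to see which minutes were skipped, a small standard-library sketch like the one below can help. `missing_minutes` is a hypothetical helper, and the sample timestamps are made up to mirror the skipped bar described above.

```python
from datetime import datetime, timedelta

ONE_MINUTE = timedelta(minutes=1)


def missing_minutes(timestamps):
    """List the minutes that have no bar between consecutive timestamps.

    Meant for a single trading session; overnight gaps would be reported too.
    """
    missing = []
    for previous, current in zip(timestamps, timestamps[1:]):
        expected = previous + ONE_MINUTE
        while expected < current:
            missing.append(expected)
            expected += ONE_MINUTE
    return missing


# Made-up bar times with 9:32 absent, as if no trade happened in that minute.
bars = [
    datetime(2022, 1, 3, 9, 30),
    datetime(2022, 1, 3, 9, 31),
    datetime(2022, 1, 3, 9, 33),
]
skipped = missing_minutes(bars)
```

With a real yfinance frame, passing `list(data.index)` should work the same way, assuming the index holds timestamps that support this arithmetic.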
4:33
perfect so you know how long did that
4:35
take uh
4:37
less than four minutes from start to
4:39
finish to learn how to download minute
4:40
data okay now let's go ahead and
4:43
download multiple tickers let's
4:46
go ahead we'll say data equals yf
4:48
download so this time we're using
4:51
download and not the ticker class
4:53
and then we'll select two
4:55
tickers we'll do goog
4:57
and meta
4:59
and the period will equal one month
5:03
and then data.head
5:05
and now we can see
5:07
it has that progress bar now we have
5:10
google's and meta's
5:11
data now if you want to change
5:15
you know how the headings are laid out
5:17
so you know obviously the columns are up
5:19
here and the tickers are down here
5:21
you can do that quite easily
5:25
data equals yf download
5:27
pass it the same tickers
5:30
meta
5:31
and we'll also do a start and end again
5:34
it's
5:35
pretty similar to
5:37
the single ticker right
5:39
then end 2021-12-30
5:43
and then group by
5:45
and then what we do here is
5:48
essentially decide what to group by i
5:50
believe column is the default so this time
5:52
we want ticker
5:54
and then we'll say data head
5:57
and now you can see
5:59
now we have it
6:01
grouping by the ticker first and then
6:03
the ohlcv
6:07
values okay so pretty straightforward so
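The two column layouts can be modelled without touching the network. In this sketch `column_labels` and `download_grouped` are hypothetical helpers, and `FIELDS` is an illustrative list rather than the exact set yfinance returns; the only library call assumed is `yf.download` with its `group_by` keyword.

```python
from itertools import product

# Illustrative field names only; the real set and order depend on yfinance settings.
FIELDS = ["Open", "High", "Low", "Close", "Volume"]


def column_labels(tickers, group_by="column"):
    """Model the two-level column labels for each group_by choice."""
    if group_by == "ticker":
        # Ticker on the top level, field underneath.
        return list(product(tickers, FIELDS))
    # Field on the top level, ticker underneath.
    return list(product(FIELDS, tickers))


def download_grouped(tickers, **kwargs):
    """Download several tickers with the ticker as the top column level."""
    import yfinance as yf  # third-party, imported lazily

    return yf.download(tickers, group_by="ticker", **kwargs)
```

Something like `download_grouped(["GOOG", "META"], period="1mo")` should then let you select one ticker's block with `data["GOOG"]`.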
6:11
now five minutes in let's see if we can
6:13
keep this moving all right so how to
6:15
download fundamental data well when
6:17
we're downloading fundamental data we're
6:19
just going to use the ticker object so
6:21
this time we will download danaher which
6:24
is another industrial
6:26
company
6:26
ticker
6:28
dhr you can type it lowercase or
6:31
uppercase it doesn't matter
6:33
let me just verify that for you
6:36
not to make a liar out of me there we go
6:39
okay good so dhr
6:41
we'll do info equals dhr info this
6:45
just gives us
6:47
essentially a lot of the
6:49
often requested information
6:51
right so
6:53
you know how many employees what sector
6:55
it's in
6:57
let's see dict object has no keys
7:02
info
7:05
get that let's see here
7:13
okay there it is
7:15
okay so here we have a
7:18
dictionary object
7:20
there's a description
7:22
then we have the keys there so
7:27
try it again
7:29
because that was definitely a dictionary
7:30
object huh see so this is one of the
7:32
reasons why uh
7:34
you know you've gotta be careful because
7:36
sometimes you know you just have issues
7:37
with yahoo finance so
7:40
anyways here's all the dictionary keys
7:41
so you can see
7:42
zip sector full-time employees long
7:44
business summary city etc so that's all
7:47
of this stuff available to you if you
7:49
want to grab that general information
7:53
and then to access this it's pretty easy
7:55
let's just say we want this sector
7:57
that's in healthcare so pretty pretty
7:59
simple
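Since the info lookup misbehaved a moment ago, reading keys defensively seems sensible. This is a sketch, not the approach used in the video: `info_field` and `fetch_info` are hypothetical helpers, and the sample dict only contains the sector value mentioned above.

```python
def info_field(info, key, default=None):
    """Read one key from a Ticker.info-style dict without raising."""
    if not isinstance(info, dict):
        # For example, the request failed and nothing usable came back.
        return default
    return info.get(key, default)


def fetch_info(symbol):
    """Return the info dict for symbol (needs yfinance and network access)."""
    import yfinance as yf  # third-party, imported lazily

    return yf.Ticker(symbol).info


sample = {"sector": "Healthcare"}  # stand-in for a real info dict
sector = info_field(sample, "sector")
employees = info_field(sample, "fullTimeEmployees", default="n/a")
```

With live data, `info_field(fetch_info("DHR"), "sector")` would play the role of the plain dictionary lookup shown on screen.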
8:00
and if we also want
8:02
the earnings
8:05
you know
8:07
we can grab that so so pretty
8:09
straightforward but let's go ahead and
8:12
get some more
8:14
uh
8:14
interesting information we can do dhr
8:17
get financials
8:18
i'd rather it said p l
8:22
financials
8:30
so that'll get us the profit loss i
8:32
consider financials all of the
8:33
financials not just the profit loss but
8:36
that's okay
8:38
you know it's
8:39
comes from
8:41
lots of accounting so anyways let's grab
8:43
all of the financial statements we'll do
8:44
dhr financials
8:48
it's a pnl grab the balance sheet
8:50
balance sheet
8:52
and cash flow
8:55
cash flow
8:56
and we'll say financial statements
8:58
equals pd concat
9:01
p l
9:03
balance sheet cash flow and then we'll
9:06
output those financial statements so you
9:07
can see now we have aggregated all the
9:10
financial statements together so you can
9:12
come up with whatever metrics or analyze
9:15
you know uh you know what percentage of
9:18
capex to revenue or whatever you're
9:20
interested in okay and now the next
9:24
thing we want to do is
9:26
go ahead and transpose this because it
9:29
actually is a little bit easier to see
9:30
if you do it this way right so now we
9:32
just have our rows by date
9:34
and then here's all of our columns up
9:36
above so what's next well let's see how
9:39
we can do this for multiple
9:41
uh tickers right
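Before moving on, the single-ticker steps can be sketched as follows. `combine_statements` and `capex_to_revenue` are hypothetical helper names; the sketch assumes the Ticker attributes `financials`, `balance_sheet` and `cashflow`, and that capital expenditure is reported as a negative cash flow, which may not hold for every data source.

```python
def combine_statements(ticker):
    """Stack P&L, balance sheet and cash flow, then put the dates on the rows."""
    import pandas as pd  # third-party, imported lazily

    statements = pd.concat([ticker.financials, ticker.balance_sheet, ticker.cashflow])
    # Transpose so each row is a reporting date and each column a line item.
    return statements.T


def capex_to_revenue(capex, revenue):
    """Capital expenditure as a percentage of revenue."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    # abs() because capex is assumed to arrive as a negative number.
    return abs(capex) * 100 / revenue
```

For example, `capex_to_revenue(-1200, 30000)` gives 4.0, meaning capex is 4% of revenue for those made-up figures.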
9:43
so this assumes that you're going to
9:45
know that you know pandas i mean almost
9:47
all of these
9:48
uh libraries assume that
9:53
so we'll have our tickers right here
9:56
got them all and now what we want to do
9:58
is create ticker objects for all of them
10:01
we'll just replace tickers we'll just do
10:04
some list comprehension
10:05
ticker
10:07
ticker
10:08
for ticker in tickers so what are we
10:10
doing so for every ticker right in these
10:14
tickers
10:15
return it here and then
10:17
pass it into y f ticker so that's list
10:20
comprehension pretty common
10:24
in python
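Here is the list comprehension idea as a small sketch. `build_tickers` and `build_yf_tickers` are hypothetical helpers; passing the factory in as an argument is a choice made here so the pattern can be tried without yfinance.

```python
def build_tickers(symbols, factory):
    """Apply factory to every symbol with a list comprehension."""
    return [factory(symbol) for symbol in symbols]


def build_yf_tickers(symbols):
    """Turn a list of symbols into a list of yfinance Ticker objects."""
    import yfinance as yf  # third-party, imported lazily

    return build_tickers(symbols, yf.Ticker)


# Any callable works as the factory; str.lower just makes the effect visible.
lowered = build_tickers(["FB", "GOOG"], str.lower)
```

The comprehension reads as: for each symbol in symbols, call the factory on it and collect the results in a new list.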
10:25
so now we have a list of
10:28
ticker objects so now we can iterate
10:30
through all of these
10:32
ticker objects to get what we want so
10:35
what we'll do is i'll first create a
10:36
list of data frames
10:40
and then for ticker in tickers
10:44
and now let's say we'll get financial
10:46
statements
10:48
for each
10:49
so do p and l
10:50
again this is what we did above
10:52
ticker financials
10:54
we'll do balance sheet
10:56
ticker balance sheet
10:59
and then cash flow equals ticker
11:01
cash flow so now we're just going to
11:04
aggregate all of these together right so
11:06
concat
11:07
into
11:11
one data frame so do
11:14
financial statements equals pd concat
11:17
pnl balance sheet cash flow
11:21
okay and now we're just going to format
11:23
this to make it a little nicer right so
11:26
just to make data
11:29
frame format i don't know nicer perfect
11:32
so as i showed before uh we transposed
11:36
that to make the
11:37
rows
11:39
you know on the left hand i'm sorry the
11:41
dates on the left-hand side the columns
11:43
for the actual field so let's uh get it
11:45
in that format so do fs transpose
11:48
swap dates and columns
11:51
okay
11:52
data equals data dot reset index
11:56
because what we want to do is we want to
11:58
reset
12:01
index or date into the column right
12:05
so right here that's the index we want
12:07
that into a column and we can do that by
12:09
resetting that index but now we need to
12:13
rename that so we'll do
12:15
data
12:16
columns equals
12:19
and bear with me here
12:21
data columns
12:23
skip the first one
12:25
so what we're doing is we're skipping
12:26
we're keeping all the columns the same
12:29
but we're renaming the first one
12:31
okay
12:32
and
12:33
we use this this operator here to kind
12:36
of explode the co you know the list
12:39
already so essentially it's date
12:41
skipping the first column which is that
12:43
index and then we pass in a list so
12:45
we're just basically passing in a list
12:47
of columns
12:48
um
12:50
to uh essentially get the columns that
12:53
we want in the order we want now let's
12:54
add the ticker
12:57
equals ticker dot ticker
13:00
say add
13:02
ticker to data frame
13:05
okay notice
13:07
rename
13:08
old index to date
13:11
and then we'll append this to our list
13:13
dfs append
13:15
data so again all we're really doing is
13:18
for each ticker we're taking the profit
13:19
loss balance sheet cash flow smooshing
13:21
it all together using
13:24
concat then we're just transposing it a
13:27
little bit just to make it into
13:29
this format to make it a little easier
13:31
to read
13:32
we append it to a list and then we're
13:34
going to just
13:35
you know smoosh them all together
13:38
uh once we so we have all you know again
13:41
we have this and then we're going to put
13:43
four
13:44
of them right next to each other because
13:46
we have four tickers so you'll see what
13:48
i mean here in fact
13:50
i'll
13:51
hit press data here just to give you
13:54
an understanding so this might take a
13:56
second to run depending if yahoo finance
13:59
behaves what did i do here
14:02
expected axis has 63 elements new
14:05
values have two elements
14:10
14
14:12
okay data columns
14:14
equals date
14:17
data dot columns oops
14:20
that would be why i didn't mean to
14:22
put the one here
14:24
i wanted to start at the first element
14:26
not end at the first element
14:29
right so this is just slicing right so
14:31
this just means skip the first element
14:33
and continue to the end
14:34
okay
14:36
so that's what i did there
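The slicing slip is easy to reproduce with plain lists. In this sketch `rename_first` is a hypothetical helper and the column labels are made up.

```python
def rename_first(columns, new_name="Date"):
    """Return the column labels with only the first one replaced."""
    # columns[1:] skips element 0 and keeps the rest; * unpacks it into the new list.
    return [new_name, *columns[1:]]


columns = ["index", "Total Revenue", "Net Income"]  # made-up labels

fixed = rename_first(columns)

# The slip from the video: [:1] keeps ONLY the first element, so the new list
# has two labels no matter how many columns the frame really has.
buggy = ["Date", *columns[:1]]
```

Assigning a two-item list like `buggy` to the columns of a 63-column frame is what produces the length mismatch error read out above.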
14:39
the joys of
14:41
coding live on youtube okay well not
14:44
live but you know what i mean all right
14:46
so now this is what
14:48
our
14:49
uh data looks like right so this would
14:52
be for the last ticker
14:54
google because it looped through so this
14:56
should be google
14:59
right and then we just have a list of
15:02
this data frame
15:04
in here all four of them so now let's
15:07
get them in a format
15:10
where
15:11
you know we can
15:13
see all of them together
15:15
so this might seem a little bit strange
15:17
but bear with me here we're going to
15:18
create a new parser
15:20
parsers dot base parser
15:24
parser base
15:27
usecols
15:29
none
15:31
you're gonna be like leo what in the world
15:32
is this well we've got uh we need to
15:35
make sure that we don't have uh
15:36
duplicate columns whenever we're
15:38
concatenating so what we do is we've got
15:42
for
15:43
df in dfs so that's data frames we'll do
15:46
df columns equals parser
15:50
maybe dedup
15:52
names df columns so we're just
15:54
going to dedupe the names okay
15:57
uh simple as that and then we'll create
15:58
a new data frame pd concat dfs
16:02
ignore index equal true
16:05
and then we'll do
16:07
df
16:08
equal df set index
16:12
ticker
16:13
date
16:14
and df
16:16
now we can see we have our tickers on
16:19
the left we have a multi-index right our
16:21
level zero is the ticker our level one
16:24
is the date and now we have all of the
16:26
data neatly organized so you can
16:29
manipulate that however you want to
16:32
perfect
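The parser trick above leans on a pandas internal, and private helpers like that can change or disappear between pandas versions. As an alternative sketch, not the method used in the video, the de-duplication can be written by hand. `dedupe_names` and `stack_frames` are hypothetical helpers, and the `"Ticker"` and `"Date"` column names are assumed to match the ones added in the loop.

```python
from collections import Counter


def dedupe_names(names):
    """Make repeated names unique by appending .1, .2, ... to later repeats.

    Does not guard against a clash with an existing name such as "A.1".
    """
    seen = Counter()
    unique = []
    for name in names:
        unique.append(f"{name}.{seen[name]}" if seen[name] else name)
        seen[name] += 1
    return unique


def stack_frames(frames):
    """De-duplicate each frame's columns, stack them, and index by ticker and date."""
    import pandas as pd  # third-party, imported lazily

    cleaned = []
    for frame in frames:
        frame = frame.copy()
        frame.columns = dedupe_names(list(frame.columns))
        cleaned.append(frame)
    combined = pd.concat(cleaned, ignore_index=True)
    return combined.set_index(["Ticker", "Date"])
```

After `df = stack_frames(dfs)`, something like `df.loc["GOOG"]` should pull out one ticker's rows from the multi-index.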
16:33
so that's probably the hardest thing
16:35
that we're going to do today but maybe
16:37
the most useful
16:39
but now let's go ahead and look at how
16:41
we can get the institutional holders
16:46
wait did i miss something
16:48
oh yes i did miss something hold on
16:51
i'm not going to do institutional
16:52
holders yet we'll do
16:56
let me check here
16:59
options data
17:01
i'm going with something
17:03
how to get options data
17:08
okay so perfect so now what we'll do is
17:11
we can get options data pretty easily
17:12
let's just use apple yf ticker
17:15
and we use the ticker object for almost
17:17
everything
17:19
okay so we've got that apple object now
17:22
the options chain is super easy to grab
17:24
so we can do options equal
17:27
apple
17:28
options chain
17:31
options and that will give us
17:33
apple
17:36
options
17:38
which is option chain yep
17:41
okay so
17:44
here is the option chain you can
17:46
pass
17:49
an expiry if you want to so we can do
17:51
apple
17:54
options
17:55
and then you can pass that date into the
17:57
options chain if you only want
17:59
that information but
18:01
that's okay so now you might say
18:04
leo make sure
18:05
how do i get the you know calls and puts
18:08
pretty easy so do
18:10
calls equal options calls
18:15
and then
18:17
you can do the same thing for puts puts
18:19
equal
18:20
puts equal options puts
18:27
okay and that's pretty much it from an
18:30
option perspective right it's super easy
18:33
to get
18:34
and they are their own type obviously
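Putting the options steps together, a sketch might look like this. `next_expiry` and `fetch_calls_and_puts` are hypothetical helpers; the sketch assumes `Ticker.options` yields expiry dates as YYYY-MM-DD strings and that `Ticker.option_chain` returns an object with `calls` and `puts`.

```python
from datetime import date


def next_expiry(expirations, on_or_after):
    """Pick the earliest YYYY-MM-DD expiry that is not before on_or_after."""
    upcoming = sorted(e for e in expirations if date.fromisoformat(e) >= on_or_after)
    return upcoming[0] if upcoming else None


def fetch_calls_and_puts(symbol, expiry=None):
    """Return (calls, puts) for one expiry (needs yfinance and network access)."""
    import yfinance as yf  # third-party, imported lazily

    # With expiry=None the library is assumed to pick the nearest expiry itself.
    chain = yf.Ticker(symbol).option_chain(expiry)
    return chain.calls, chain.puts


# Made-up expiry list to show the selection logic.
chosen = next_expiry(("2022-01-14", "2022-01-21", "2022-02-18"), date(2022, 1, 15))
```

In practice the first argument would be the Ticker's `options` attribute, and the chosen string would be handed to `fetch_calls_and_puts`.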
18:37
but let's go ahead now and get the
18:41
institutional holders which is again
18:43
super easy like i said like getting the
18:45
fundamentals multiple fundamentals and
18:47
smooshing them together that's the hardest thing
18:48
we're going to do today so we'll just do
18:50
apple
18:52
dot institutional
18:54
holders if i can spell that right
18:57
i n s t i t u t i o n a l holders
19:03
okay
19:06
and that's pretty simple you can see the
19:08
top 10 institutional holders and now
19:10
let's talk about why you shouldn't use
19:13
yahoo finance you did see above
19:15
there when i was trying to get
19:18
the dictionary information earlier i'm
19:20
not going to scroll up but it just came
19:22
up with an error same thing
19:24
for uh you'll just encounter other stuff
19:27
like this so ticker
19:29
fb
19:31
okay we'll do meta equal yf ticker meta
19:34
so we know facebook changed
19:37
its name to meta okay
19:39
uh these should be this we should get
19:41
the same data right so let's see what
19:43
happens from a cash flow perspective
19:45
oops
19:46
get
19:48
cash flow
19:50
right and it should give us some
19:52
information
19:55
and indeed it does
19:57
now what happens
19:59
if we say
20:00
meta
20:02
get
20:03
cash flow
20:09
and there's nothing so we can see here
20:11
even though these are the same same
20:13
stocks
20:15
you know they're giving us different
20:17
information so again yahoo finance is
20:20
probably the best free source online for
20:23
uh free stock data but again it's free
20:26
and i definitely wouldn't use it for
20:28
live trading simply because you get lots
20:30
of connection errors and also
20:33
data errors like this so hopefully you
20:36
enjoyed that video and now you're able
20:37
to download that freely available yahoo
20:40
data so you can follow along with all of
20:42
the other algo trading videos that i
20:44
create on this channel so hope you have
20:46
a wonderful day and i'll see you in the
20:47
next one bye
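Given the connection errors and the empty cash flow result mentioned above, research code may want a thin safety net around each request. This is a sketch only: `looks_empty` and `fetch_with_retry` are hypothetical helpers, and retrying will not fix data that Yahoo simply does not have.

```python
import time


def looks_empty(result):
    """True for None, empty containers, and objects whose .empty flag is set."""
    if result is None:
        return True
    empty_flag = getattr(result, "empty", None)  # pandas objects expose .empty
    if isinstance(empty_flag, bool):
        return empty_flag
    try:
        return len(result) == 0
    except TypeError:
        return False


def fetch_with_retry(fetch, attempts=3, delay=1.0):
    """Call fetch() until it returns something non-empty or attempts run out."""
    last_error = None
    for attempt in range(attempts):
        try:
            result = fetch()
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_error = exc
        else:
            if not looks_empty(result):
                return result
        if attempt < attempts - 1:
            time.sleep(delay)
    raise RuntimeError(f"no usable data after {attempts} attempts") from last_error
```

A call such as `fetch_with_retry(lambda: meta.get_cashflow())` would raise instead of silently handing back an empty frame, which is easier to notice in a notebook.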
#Investing