/ TECHNOLOGY

Query Price Data Using SQLAlchemy, Python & Pandas

Learn how to query price data from a PostgreSQL database using SQLAlchemy, Python, and Pandas.

Steps to Query Price Data Using SQLAlchemy, Python & Pandas

Below is the Python code demonstrated in the video and Jupyter notebook.

1. Imports

We’ll start by getting our imports. We’ll need the standard stuff including datetime and pandas, but we’ll also need the models we created in a previous video and also the postgres connection code we created a few videos back.

import datetime as dt
import pandas as pd
from models import Symbol, MinuteBar
from psql import db, session

2. Create Get Symbols Function

Getting the symbols is easy. We use the SQLAlchemy session to query our model, which in this case is Symbol. We then pass the sql query statement to pandas to retrieve the list of symbols.

def get_symbols(market=None):
    query = session.query(Symbol).statement
    symbols = pd.read_sql(query, db, index_col='id')
    return symbols
s = get_symbols()
s
ticker name market active
id
1 adausd adausd Market.crypto True
2 algusd algusd Market.crypto True
3 ampusd ampusd Market.crypto True
4 antusd antusd Market.crypto True
5 astusd astusd Market.crypto False
... ... ... ... ...
128 zbtusd zbtusd Market.crypto True
129 zcnusd zcnusd Market.crypto True
130 zecusd zecusd Market.crypto True
131 zilusd zilusd Market.crypto True
132 zrxusd zrxusd Market.crypto True

132 rows × 4 columns

3. Create Get Bars Function

The query for getting the bars is a little more complex than retrieving the symbols. We need to join Symbol to MinuteBar and add a few filters to get the data we want. SQLAlchemy understands our relationships and automatically returns a one-to-many relationship and the proper statement. We than pass that statement to pandas and get our minute bars.

def get_bars(tickers, start=None, end=None):
    start = start if start else dt.date.today() - dt.timedelta(weeks=52)
    end = end if end else dt.date.today()
    
    query = session.query(Symbol.ticker,
                          MinuteBar.date,
                          MinuteBar.open,
                          MinuteBar.high,
                          MinuteBar.low,
                          MinuteBar.close,
                          MinuteBar.volume) \
                         .join(MinuteBar) \
                         .filter(MinuteBar.date.between(start, end)) \
                         .filter(Symbol.ticker.in_((tickers))) \
                         .order_by(Symbol.ticker, MinuteBar.date.asc()) \
                         .statement
    
    bars = pd.read_sql(query, db, parse_dates=['date'])
    return bars
minute_bars = get_bars(s['ticker'][:5].to_list())
minute_bars
ticker date open high low close volume
0 adausd 2020-10-14 00:00:00 0.109510 0.109610 0.109510 0.109610 9095.490723
1 adausd 2020-10-14 00:01:00 0.109470 0.109470 0.109360 0.109360 58877.984725
2 adausd 2020-10-14 00:02:00 0.109340 0.109340 0.109160 0.109160 187338.589320
3 adausd 2020-10-14 00:03:00 0.109160 0.109160 0.109090 0.109090 88277.334285
4 adausd 2020-10-14 00:04:00 0.109150 0.109150 0.109130 0.109130 11769.100000
... ... ... ... ... ... ... ...
558905 astusd 2020-12-29 04:21:00 0.080150 0.080150 0.080150 0.080150 0.000000
558906 astusd 2020-12-29 04:22:00 0.080150 0.080150 0.080150 0.080150 0.000000
558907 astusd 2020-12-29 04:23:00 0.080150 0.080150 0.080150 0.080150 0.000000
558908 astusd 2020-12-29 04:24:00 0.080150 0.080150 0.080150 0.080150 0.000000
558909 astusd 2020-12-29 04:25:00 0.079949 0.079949 0.079949 0.079949 1762.900000

558910 rows × 7 columns

leo

Leo Smigel

Based in Pittsburgh, Analyzing Alpha is a blog by Leo Smigel exploring what works in the markets.