Query Price Data Using SQLAlchemy, Python & Pandas

Learn how to query price data from a PostgreSQL database using SQLAlchemy, Python, and Pandas.

You can download the code at the Analyzing Alpha GitHub Repo.

Steps to Query Price Data Using SQLAlchemy, Python & Pandas

Below is the Python code demonstrated in the video and Jupyter notebook.

1. Get Imports

We’ll start by getting our imports. We’ll need the standard stuff including datetime and pandas, but we’ll also need the models we created in a previous video and also the Postgres connection code we created a few videos back.

import datetime as dt
import pandas as pd
from models import Symbol, MinuteBar
from psql import db, session

2. Create Get Symbols Function

Getting the symbols is easy. We use the SQLAlchemy session to query our model, which in this case is Symbol. We then pass the SQL query statement to pandas to retrieve the list of symbols.

def get_symbols(market=None):
    query = session.query(Symbol).statement
    symbols = pd.read_sql(query, db, index_col='id')
    return symbols
s = get_symbols()
s

3. Create Get Bars Function

The query for getting the bars is a little more complex than retrieving the symbols. We need to join Symbol to MinuteBar and add a few filters to get the data we want. SQLAlchemy understands our relationships and automatically returns a one-to-many relationship and the proper statement. We then pass that statement to pandas and get our minute bars.

def get_bars(tickers, start=None, end=None):
    start = start if start else dt.date.today() - dt.timedelta(weeks=52)
    end = end if end else dt.date.today()

    query = session.query(Symbol.ticker,
                          MinuteBar.date,
                          MinuteBar.open,
                          MinuteBar.high,
                          MinuteBar.low,
                          MinuteBar.close,
                          MinuteBar.volume) \
                         .join(MinuteBar) \
                         .filter(MinuteBar.date.between(start, end)) \
                         .filter(Symbol.ticker.in_((tickers))) \
                         .order_by(Symbol.ticker, MinuteBar.date.asc()) \
                         .statement

    bars = pd.read_sql(query, db, parse_dates=['date'])
    return bars
minute_bars = get_bars(s['ticker'][:5].to_list())
minute_bars

Leave a Comment