Learning FastAPI (2): "Stock Toolkit" :: Brain Dump

As I’ve been trying to update the tools I lean on for quick web things, I am really enjoying using FastAPI with Svelte. These are notes about the development setup that I currently like. Fair warning: this is not battle hardened yet. I’ve used it to make a couple of toys and like how it’s shaping up, though. I give an overview of the stack here.

I’m still mainly talking about a development setup in this post. While I have some ideas about how I want to host it, those are only half-developed at this point. I’m writing these details now half for future reference and half so I’ve got something I can point to and ask for commentary.

To get my arms around these tools, I’m building a small tool for fetching, displaying and filtering information about stocks. It’s very heavily inspired by this one, though FastAPI is really all it’s got in common with that tutorial anymore.

Initial FastAPI Setup

FastAPI is remarkably low-fuss to set up. While I do plan to make a cookiecutter template for the things I find myself doing repeatedly, the only thing I’d put in it so far is my requirements.txt.

The repository I’m building as I write these posts is available here.

I’m starting with an empty directory containing a .gitignore file that keeps JetBrains noise, node noise, sqlite files and related items out of my repository. I’ve also got a requirements.txt file listing the packages I installed as I was kicking the tires on this a few days ago:

fastapi~=0.61.1
aiofiles~=0.5.0
pydantic~=1.6.1
uvicorn~=0.12.1
yfinance~=0.1.54
tortoise-orm~=0.16.16
aerich~=0.2.5
asyncpg~=0.21.0

fastapi is the star of the show. aiofiles is required for serving static files from a fastapi application. pydantic almost certainly doesn’t need to be explicitly included; I’m 90% certain fastapi pulls it in. It provides input and output validation for python data classes and has proven a really nice way to preserve sanity while using JSON. uvicorn is the server I’m using for local development. tortoise-orm is an async-oriented python ORM, and is the piece I’m least certain about in this setup. I should probably just hand-roll SQL for most things I do, but I’ve used sqlalchemy for a while and am especially attached to the low friction it brings to movement between sqlite for local development and postgres or mariadb when I move to a server. Tortoise ORM feels really thin and similar in that regard, so I’m sticking with it for now. aerich is a utility that’s similar to alembic for tortoise-orm. I’m even less certain that I’ll stick with it, but I’d like to kick the tires.

yfinance scrapes Yahoo! finance for data about stocks.

Update 10/15/2020: Add asyncpg to requirements.txt.

To get started, in my directory with .gitignore and requirements.txt I run:

$ python3 -mvenv venv
$ source venv/bin/activate
$ pip install -U pip
$ pip install -r requirements.txt

From there, I start PyCharm and create a new empty python project in that directory. I tell pycharm about the new project interpreter found in the just-populated venv subdirectory, and add a new python package called stocktoolkit at the root level.

In the stocktoolkit package, I create two new python files: main.py and schemas.py. In the schemas module, I declare a status message format using pydantic:

from pydantic import BaseModel

class StatusMessage(BaseModel):
    code: str
    message: str

I’m putting my JSON message definitions into the schemas module even though Pydantic calls them “models”. I am mainly doing that because it helps me reason about them separately from the ORM’s “models”. Some of the projects I’ve been looking at lump the two together, some keep the JSON-related pieces with the associated routes, and some separate them out like I did here. I don’t see an overwhelming advantage to any of these yet, so I’m sticking to the one that seems easiest for me to think about for now.

And I’m using that to deliver a “Hello, World” message in the main module:

import uvicorn

from fastapi import FastAPI

from .schemas import StatusMessage


app = FastAPI()


@app.get('/')
def root():
    return StatusMessage(code='success', message='Hello, FastAPI')


if __name__ == "__main__":
    uvicorn.run("stocktoolkit.main:app", host="127.0.0.1", port=8000, reload=True)

This is all very much in-line with the FastAPI sample code used throughout the manual except for the fact that I’m running the server from the main module instead of using uvicorn directly on the command line. Doing it this way makes it easy to establish an interactive debugging session with PyCharm. It’s necessary to specify the ASGI app as a string instead of passing the app object here in order for the hot reloading (specified by reload=True) to work.
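As I understand it, the string form matters because a reloading server has to import the application afresh after a code change, and a "module:attribute" string tells it where to look in a way that an already-created app object cannot. The sketch below is not uvicorn’s code; load_app is a hypothetical helper that resolves the same style of string using only the standard library, with json:dumps standing in for stocktoolkit.main:app.

```python
import importlib


def load_app(import_string: str):
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attribute = import_string.partition(':')
    if not module_name or not attribute:
        raise ValueError(f'Expected "module:attribute", got {import_string!r}')
    # Importing by name is what makes it possible to load the code again later.
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Standard-library stand-in for "stocktoolkit.main:app"
dumps = load_app('json:dumps')
print(dumps({'code': 'success'}))
```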

The last part of the setup takes place in the IDE, where I create a run configuration:

[Screenshot: PyCharm run configuration]

The main noteworthy thing about that is that I’m using a module name to select the script that is run. Also, if “store as project file” is checked, the run configuration will land in source control. It will need to be opened, have the interpreter updated, and saved in order to work on a different platform than the one it was created on. (I was trying it out on Windows when I took the screenshot.) My current preference is to keep the one from Linux checked in and, on those occasions where I’m working from Windows, update it locally without committing those settings. If I worked on Windows more often, I think I’d add a second run configuration for Windows and commit both to git.

Add a Database

Tortoise ORM is very similar to the SQLAlchemy ORM or to the Django ORM. While FastAPI works well with SQLAlchemy, most projects I’ve found that use FastAPI seem to favor Tortoise. Since I’m starting fresh, I’m going to try Tortoise.

First, I create a new file called database.py and use Tortoise to create a data model:

from tortoise.models import Model
from tortoise import fields


# This should match the sqlalchemy definition found here:
# https://github.com/hackingthemarkets/stockscreener/blob/master/models.py
class Stock(Model):
    id = fields.IntField(pk=True, index=True)
    symbol = fields.CharField(max_length=255, unique=True, index=True)
    price = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    forward_pe = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    forward_eps = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    dividend_yield = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    ma50 = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    ma200 = fields.DecimalField(max_digits=10, decimal_places=2, null=True)

    class Meta:
        table = 'stocks'

Specifying the table name is not really necessary, but it better matches the convention I’m used to and does not hurt anything. The main noteworthy thing about this definition is that all of the fields other than the auto-incremented primary key and the ticker symbol need to be nullable. They will never be populated initially; a background task will look up their values using the yahoo finance API.

With the schema defined, Tortoise needs to be told about where the database can be found. The documentation offers several options for this, but only the configuration dictionary appears to be compatible with the aerich management and migration tool. Having no legacy concerns and some inclination to at least try that tool, that’s what I will use for now. I’m storing the database info in a module called config that I expect to vary per-environment because that’s the convention I’ve used before. I don’t see any clear consensus for how people manage those in the wild with this stack yet.

TORTOISE_ORM_SETTINGS = {
    'connections': {
        'default': 'sqlite://stocktoolkit_dev.db',
    },
    'apps': {
        'models': {
            'models': ['stocktoolkit.database', ],
        }
    }
}
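Since I expect this module to vary per environment, one option is to let an environment variable override the connection string. This is only a sketch of that idea: the STOCKTOOLKIT_DB_URL variable name and the build_tortoise_settings helper are hypothetical, and the module-level TORTOISE_ORM_SETTINGS name is kept the same as above.

```python
import os


def build_tortoise_settings(environ=None) -> dict:
    """Build the Tortoise config dict, letting the environment override the DB URL."""
    env = os.environ if environ is None else environ
    # STOCKTOOLKIT_DB_URL is a made-up variable name for this sketch.
    db_url = env.get('STOCKTOOLKIT_DB_URL', 'sqlite://stocktoolkit_dev.db')
    return {
        'connections': {'default': db_url},
        'apps': {'models': {'models': ['stocktoolkit.database']}},
    }


# Same module-level name as the hand-written dictionary.
TORTOISE_ORM_SETTINGS = build_tortoise_settings()
```

With nothing set in the environment this produces the same dictionary as the hand-written version, so local development is unchanged.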

With that in place, I add a call to register_tortoise to main.py and start the service to confirm that a database with the expected schema gets created.

Update 10/15/2020: Tortoise ORM’s integration with sqlite is not a great fit for this application. In particular, things like this:

dividend_yield = fields.DecimalField(max_digits=10, decimal_places=2, null=True)

in the schema are problematic. Tortoise uses a string field in sqlite for that, and then queries with string values. So a query with WHERE dividend_yield >= '42' generated by the ORM will return a stock with a dividend yield of 5.25. As a consequence, the most recent version of the project uses postgres. Thus far, that does not make me want to go back to sqlalchemy.

import uvicorn

from fastapi import FastAPI
from tortoise.contrib.fastapi import register_tortoise

from .schemas import StatusMessage
from .config import TORTOISE_ORM_SETTINGS


app = FastAPI()
register_tortoise(app, config=TORTOISE_ORM_SETTINGS, generate_schemas=True)


@app.get('/')
def root():
    return StatusMessage(code='success', message='Hello, FastAPI')


if __name__ == "__main__":
    uvicorn.run("stocktoolkit.main:app", host="127.0.0.1", port=8000, reload=True)
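The sqlite comparison problem described in the update above can be reproduced with nothing but Python’s built-in sqlite3 module. This is a minimal sketch, not what Tortoise actually executes: the table layout and the VARCHAR(40) column type are assumptions chosen to mimic a decimal stored as text, and stocks_yielding_at_least is a hypothetical helper.

```python
import sqlite3


def stocks_yielding_at_least(threshold, column_type):
    """Return symbols whose dividend_yield >= threshold for a given column type."""
    conn = sqlite3.connect(':memory:')
    # column_type is interpolated only because this is a throwaway demo.
    conn.execute(f'CREATE TABLE stocks (symbol TEXT, dividend_yield {column_type})')
    conn.execute('INSERT INTO stocks VALUES (?, ?)', ('AAPL', '5.25'))
    rows = conn.execute(
        'SELECT symbol FROM stocks WHERE dividend_yield >= ?', (threshold,)
    ).fetchall()
    conn.close()
    return [symbol for (symbol,) in rows]


# Text column, text parameter: compared character by character, and '5' sorts after '4'.
print(stocks_yielding_at_least('42', 'VARCHAR(40)'))  # ['AAPL']
# Numeric column, numeric parameter: 5.25 is not >= 42.
print(stocks_yielding_at_least(42, 'REAL'))  # []
```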

Specify API Messages

With the database defined, now I need to specify messages for retrieving all the stocks from the database and for adding a stock by ticker symbol. The API as I’m currently imagining it doesn’t need an update message and the delete operation will just be a DELETE verb to /stock/id where id is the primary key from the database.

Pydantic and Tortoise’s pydantic helpers make both of these message formats very terse to define. I modify schemas.py:

from pydantic import BaseModel
from tortoise.contrib.pydantic import pydantic_queryset_creator

from .database import Stock


class StatusMessage(BaseModel):
    code: str
    message: str


class SymbolAddRequest(BaseModel):
    ticker_symbol: str


StockList = pydantic_queryset_creator(Stock)

While this brevity is excellent, the bit that I really like is the validation that comes for free when you use type hints.
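To see that validation without going through HTTP at all, the model can be exercised directly. The sketch below repeats the SymbolAddRequest model from schemas.py so it stands alone; invalid_fields is a hypothetical helper that just reports which fields pydantic complained about.

```python
from pydantic import BaseModel, ValidationError


class SymbolAddRequest(BaseModel):
    ticker_symbol: str


def invalid_fields(payload: dict) -> list:
    """Return the locations pydantic rejected, or an empty list if the payload is valid."""
    try:
        SymbolAddRequest(**payload)
    except ValidationError as exc:
        return [error['loc'] for error in exc.errors()]
    return []


# A misspelled key means the required field is missing.
print(invalid_fields({'ticker_symbol2': 'AAPL'}))
print(invalid_fields({'ticker_symbol': 'AAPL'}))
```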

Stub out API endpoints

With the messages defined, now I need to get preliminary endpoints in place to add stocks to the database, retrieve them, and delete them. For now, I’m putting all of these routes in main.py. The FastAPI manual details a good approach to splitting them up, but even with all of them there main.py will remain small and readable for this application and I won’t learn anything interesting by breaking them up into groups. I currently think the natural seam where a split makes sense is when I have endpoints that act on different kinds of things, like, say, both users and stocks.

import uvicorn

from fastapi import FastAPI
from tortoise.contrib.fastapi import register_tortoise

from .schemas import StatusMessage, SymbolAddRequest, StockList
from .config import TORTOISE_ORM_SETTINGS
from .database import Stock


app = FastAPI()
register_tortoise(app, config=TORTOISE_ORM_SETTINGS, generate_schemas=True)


@app.get('/')
def root():
    return StatusMessage(code='success', message='Hello, FastAPI')


@app.get('/stocks')
async def get_stocks():
    stocks = Stock.all()
    return await StockList.from_queryset(stocks)


@app.post('/stock')
async def create_stock(request: SymbolAddRequest):
    stock = await Stock.create(symbol=request.ticker_symbol)
    return StatusMessage(code='success', message=f'{request.ticker_symbol} added to database with id {stock.id}')

@app.delete('/stock/{stock_id}')
async def delete_stock(stock_id: int):
    stock = await Stock.filter(id=stock_id).first()
    ticker = stock.symbol
    stock_id = stock.id
    await stock.delete()
    return StatusMessage(code='success', message=f'{ticker} ({stock_id}) removed from database')


if __name__ == "__main__":
    uvicorn.run("stocktoolkit.main:app", host="127.0.0.1", port=8000, reload=True)

Ad Hoc Testing

Any HTTP client can test these simple endpoints easily enough. My current tool of choice is insomnia. The core client is FOSS; they are trying to sell some collaboration and design features to monetize it. And it’s very pleasant to use as long as you’re not allergic to electron. A shell script and curl would certainly do the job, though.
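For anyone who would rather script these checks in Python than in a shell, the standard library is enough. This is a rough sketch that assumes the dev server from main.py is listening on localhost:8000; build_add_request and send are hypothetical helper names, and nothing is sent until send is called.

```python
import json
from urllib import request as urlrequest

BASE_URL = 'http://localhost:8000'  # the dev server address used in main.py


def build_add_request(ticker_symbol: str, base_url: str = BASE_URL) -> urlrequest.Request:
    """Build (but do not send) the POST that adds a stock by ticker symbol."""
    body = json.dumps({'ticker_symbol': ticker_symbol}).encode('utf-8')
    return urlrequest.Request(
        f'{base_url}/stock',
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


def send(req: urlrequest.Request):
    """Send a prepared request and decode the JSON response body."""
    with urlrequest.urlopen(req) as response:
        return json.loads(response.read().decode('utf-8'))
```

With the server running, send(build_add_request('AAPL')) should behave like the curl call for adding a stock shown in this section.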

Check Empty Database

$ curl --request GET \
  --url http://localhost:8000/stocks
[]

Add a Stock

$ curl --request POST \
  --url http://localhost:8000/stock \
  --header 'content-type: application/json' \
  --data '{
	"ticker_symbol": "AAPL"
}'
{"code":"success","message":"AAPL added to database with id 2"}

Confirm Success

$ curl --request GET \
  --url http://localhost:8000/stocks
[{"id":2,"symbol":"AAPL","price":null,"forward_pe":null,"forward_eps":null,"dividend_yield":null,"ma50":null,"ma200":null}]

Delete the Stock

$ curl --request DELETE \
  --url http://localhost:8000/stock/2
{"code":"success","message":"AAPL (2) removed from database"}

Confirm Success

$ curl --request GET \
  --url http://localhost:8000/stocks
[]

Check API Input Validation

Although no explicit error handling has been added yet, a malformed add request shows what FastAPI and Pydantic give us for free:

$ curl --request POST \
  --url http://localhost:8000/stock \
  --header 'content-type: application/json' \
  --data '{
	"ticker_symbol2": "AAPL"
}'
{"detail":[{"loc":["body","ticker_symbol"],"msg":"field required","type":"value_error.missing"}]}

Generated API Documentation and Test Interface

Swagger documentation of the API is available live at /docs using a web browser.

Rounding out the API

This leaves us with a roughed-in API that has a few rough edges and missing features:

Attempting to delete a nonexistent resource results in an internal server error.
Attempting to add a duplicate ticker symbol results in an internal server error.
Filters for the stock list are not yet implemented.
No data other than the ticker symbol ever gets loaded into the database.

I’m certain there is plenty more we could do, but these basics should help kick the tires on the framework a little more.

Sensible error messages

FastAPI makes it easy to pass errors on to the HTTP client in the form of HTTP status messages with more details in the body. For example, if no stock is found to delete, this:

    if stock is None:
        raise HTTPException(status_code=404, detail=f'Unable to retrieve record for stock with id {stock_id}')

is all that is necessary to induce most clients to report the error appropriately. Similarly, in the case of a constraint violation, catching the exception from the ORM and reporting it as an HTTPException does the job:

@app.post('/stock')
async def create_stock(request: SymbolAddRequest):
    try:
        stock = await Stock.create(symbol=request.ticker_symbol)
    except OperationalError as e:
        raise HTTPException(status_code=409, detail=f'Unable to add {request.ticker_symbol}: {e}')
    return StatusMessage(code='success', message=f'{request.ticker_symbol} added to database with id {stock.id}')

Filtering the Stock List

FastAPI passes query parameters through to API endpoints exactly as you might naively expect it to based on other observed patterns: you declare them as function parameters with the same name as the query parameters and provide appropriate type annotations. Parameters with default values are optional. Those without default values are required.
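The required-versus-optional rule is visible in the function signature itself. The sketch below is not how FastAPI is implemented; it only uses the standard inspect module to show that the presence of a default value is enough to tell the two apart. Both search_stocks and classify_parameters are hypothetical names made up for the illustration.

```python
import inspect


def search_stocks(symbol: str, limit: int = 10):
    """Hypothetical handler: symbol has no default, limit does."""


def classify_parameters(func):
    """Split a function's parameters into (required, optional) by default value."""
    required, optional = [], []
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            optional.append(name)
    return required, optional


print(classify_parameters(search_stocks))  # (['symbol'], ['limit'])
```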

The harder part for me to get my head around was actually applying the filter parameters to the query. I’ve used SQLAlchemy plenty and written my share of plain ol' SQL, but I’ve never used Django’s ORM for much. Tortoise borrows much of its interface from Django’s ORM:

# F must be imported for the moving average filters: from tortoise.expressions import F
@app.get('/stocks')
async def get_stocks(forward_pe: float=None, dividend_yield: float=None, ma50: bool=None, ma200: bool=None):
    stocks = Stock.all()
    if forward_pe:
        stocks = stocks.filter(forward_pe__lte=forward_pe)
    if dividend_yield:
        stocks = stocks.filter(dividend_yield__gte=dividend_yield)
    if ma50:
        stocks = stocks.filter(price__gte=F('ma50'))
    if ma200:
        stocks = stocks.filter(price__gte=F('ma200'))
    return await StockList.from_queryset(stocks)

Now that I’m acclimated, I like the extra input validation this provides over hand rolling SQL. I know some prefer to avoid ORMs like the plague, and I wonder if this will continue to hold as my datastores grow more complex.
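One wrinkle with truthiness checks such as "if forward_pe:" is that a value of 0 is treated the same as a missing parameter, so a request asking for forward_pe=0 would silently skip the filter. A sketch of a stricter variant, assuming the same Django-style lookup names used above; build_filters is a hypothetical helper, not part of Tortoise:

```python
from typing import Dict, Optional


def build_filters(forward_pe: Optional[float] = None,
                  dividend_yield: Optional[float] = None) -> Dict[str, float]:
    """Collect filter keyword arguments, skipping only parameters that were not supplied."""
    filters: Dict[str, float] = {}
    if forward_pe is not None:
        filters['forward_pe__lte'] = forward_pe
    if dividend_yield is not None:
        filters['dividend_yield__gte'] = dividend_yield
    return filters


print(build_filters(forward_pe=0.0))  # {'forward_pe__lte': 0.0}
print(build_filters())  # {}
```

In the handler, the resulting dictionary could then be passed along as keyword arguments, for example stocks.filter(**build_filters(forward_pe, dividend_yield)).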

Populate Data in Background

The last major thing I want to do in the backend is populate information for newly added stocks without blocking the client. I have a mixed reaction to the built-in background task interface offered by FastAPI/Starlette. On one hand, the happy path feels very, very straightforward and is easy to understand. On the other, it does not seem especially robust for the kinds of things I’m likely to want to use it for.

I suspect that, some time soon, it will become obvious that I need to revisit this with Celery or something similar. For now, the background task interface is easy to adopt for a prototype and it doesn’t feel as if it will be hard to replace later.

For now, I added a new stock_utils module. It currently contains one function that fetches data from yahoo finance and updates the local database based on the results:

from .database import Stock
import yfinance


async def fetch_stock_data(stock_id: int):
    stock = await Stock.filter(id=stock_id).first()
    if stock is None:
        # The stock may have been deleted before this task got to run.
        return

    # Read the info dictionary once instead of once per field.
    # Note that yfinance does blocking network I/O here.
    info = yfinance.Ticker(stock.symbol).info

    stock.ma200 = info['twoHundredDayAverage']
    stock.ma50 = info['fiftyDayAverage']
    stock.price = info['previousClose']
    stock.forward_pe = info['forwardPE']
    stock.forward_eps = info['forwardEps']
    if info['dividendYield'] is not None:
        # Scale by 100 so the stored value is a percentage.
        stock.dividend_yield = info['dividendYield'] * 100
    await stock.save()
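One thing to keep in mind with this function is that yfinance does ordinary blocking network I/O, and an async def background task runs on the event loop, so other requests wait while the lookup is in flight. A common way around that is to push the blocking call onto a worker thread. The sketch below shows the pattern with the standard library only; slow_lookup is a hypothetical stand-in for the yfinance call, and the returned dictionary is made-up data.

```python
import asyncio
import time
from functools import partial


def slow_lookup(symbol: str) -> dict:
    """Stand-in for a blocking network call (made-up data, not real quotes)."""
    time.sleep(0.1)
    return {'symbol': symbol, 'previousClose': 100.0}


async def fetch_without_blocking(symbol: str) -> dict:
    """Run the blocking lookup in the default thread pool and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, partial(slow_lookup, symbol))


async def main():
    # The three lookups overlap instead of running back to back.
    return await asyncio.gather(
        *(fetch_without_blocking(symbol) for symbol in ('AAPL', 'MSFT', 'GOOG'))
    )


if __name__ == '__main__':
    print(asyncio.run(main()))
```

In fetch_stock_data, the same idea would mean wrapping only the yfinance lookup in run_in_executor and leaving the ORM calls as they are.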

With that, the back end feels finished enough for now. Here are all the endpoints:

import uvicorn

from fastapi import FastAPI, HTTPException, BackgroundTasks
from tortoise.contrib.fastapi import register_tortoise
from tortoise.exceptions import OperationalError
from tortoise.expressions import F

from .schemas import StatusMessage, SymbolAddRequest, StockList
from .config import TORTOISE_ORM_SETTINGS
from .database import Stock
from .stock_utils import fetch_stock_data


app = FastAPI()
register_tortoise(app, config=TORTOISE_ORM_SETTINGS, generate_schemas=True)


@app.get('/')
def root():
    return StatusMessage(code='success', message='Hello, FastAPI')


@app.get('/stocks')
async def get_stocks(forward_pe: float=None, dividend_yield: float=None, ma50: bool=None, ma200: bool=None):
    stocks = Stock.all()
    if forward_pe:
        stocks = stocks.filter(forward_pe__lte=forward_pe)
    if dividend_yield:
        stocks = stocks.filter(dividend_yield__gte=dividend_yield)
    if ma50:
        stocks = stocks.filter(price__gte=F('ma50'))
    if ma200:
        stocks = stocks.filter(price__gte=F('ma200'))
    return await StockList.from_queryset(stocks)


@app.post('/stock')
async def create_stock(request: SymbolAddRequest, background_tasks: BackgroundTasks):
    try:
        stock = await Stock.create(symbol=request.ticker_symbol)
    except OperationalError as e:
        raise HTTPException(status_code=409, detail=f'Unable to add {request.ticker_symbol}: {e}')
    background_tasks.add_task(fetch_stock_data, stock.id)
    return StatusMessage(code='success', message=f'{request.ticker_symbol} added to database with id {stock.id}')


@app.delete('/stock/{stock_id}')
async def delete_stock(stock_id: int):
    stock = await Stock.filter(id=stock_id).first()
    if stock is None:
        raise HTTPException(status_code=404, detail=f'Unable to retrieve record for stock with id {stock_id}')
    ticker = stock.symbol
    stock_id = stock.id
    await stock.delete()
    return StatusMessage(code='success', message=f'{ticker} ({stock_id}) removed from database')


if __name__ == "__main__":
    uvicorn.run("stocktoolkit.main:app", host="127.0.0.1", port=8000, reload=True)

Even with the limitations I’ve noticed already, I’m really happy with the approach this enables. It’s a better fit for how I like to approach problems than turbogears (which I used to use heavily) or flask (which I’d been planning to adopt when I ran across FastAPI).

This is very long. I’ll cover the front-end in another post. The full backend repo, tagged at this stopping point, is available here in case seeing the end result is more appealing than following along.
