Learning FastAPI (2): “Stock Toolkit”
As I’ve been trying to update the tools I lean on for quick web things, I am really enjoying using FastAPI with Svelte. These are notes about the development setup that I currently like. Fair warning: this is not battle hardened yet. I’ve used it to make a couple of toys and like how it’s shaping up, though. I give an overview of the stack here.
I’m still mainly talking about a development setup in this post. While I have some ideas about how I want to host it, those are only half-developed at this point. I’m writing these details now half for future reference and half so I’ve got something I can point to and ask for commentary.
To get my arms around these tools, I’m building a small tool for fetching, displaying and filtering information about stocks. It’s very heavily inspired by this one, though FastAPI is really all it’s got in common with that tutorial anymore.
Initial FastAPI Setup
FastAPI is remarkably low-fuss to set up. While I do plan to make a cookiecutter template for the things I find myself doing repeatedly, the only thing I’d put in it so far is my requirements.txt.
The repository I’m building as I write these posts is available here.
I’m starting with an empty directory containing a .gitignore file that keeps JetBrains and Node noise out of my repository, in addition to ignoring sqlite files and related items. I’ve also got a requirements.txt file listing packages I installed as I was kicking the tires on this a few days ago:
fastapi~=0.61.1
aiofiles~=0.5.0
pydantic~=1.6.1
uvicorn~=0.12.1
yfinance~=0.1.54
tortoise-orm~=0.16.16
aerich~=0.2.5
asyncpg~=0.21.0
fastapi is the star of the show. aiofiles is required for serving static files from a fastapi application. pydantic almost certainly doesn’t need to be explicitly included; I’m 90% certain fastapi pulls it in. It provides input and output validation for Python data classes and has proven a really nice way to preserve sanity while using JSON. uvicorn is the server I’m using for local development. tortoise-orm is an async-oriented Python ORM, and is the piece I’m least certain about in this setup. I should probably just hand-roll SQL for most things I do, but I’ve used sqlalchemy for a while and am especially attached to the low friction it brings to movement between sqlite for local development and postgres or mariadb when I move to a server. Tortoise ORM feels really thin and similar in that regard, so I’m sticking with it for now. aerich is a utility that’s similar to alembic for tortoise-orm. I’m even less certain that I’ll stick with it, but I’d like to kick the tires.
yfinance scrapes Yahoo! Finance for data about stocks.
Update 10/15/2020: Add asyncpg to requirements.txt.
To get started, in my directory with .gitignore and requirements.txt I run:
$ python3 -mvenv venv
$ source venv/bin/activate
$ pip install -U pip
$ pip install -r requirements.txt
From there, I start PyCharm and create a new empty Python project in that directory. I tell PyCharm about the new project interpreter found in the just-populated venv subdirectory, and add a new Python package called stocktoolkit at the root level.
Next, in the stocktoolkit package, I create two new Python files: main.py and schemas.py. In the schemas module, I declare a status message format using pydantic:
from pydantic import BaseModel


class StatusMessage(BaseModel):
    code: str
    message: str
I’m putting my JSON message definitions into the schemas module even though Pydantic calls them “models”. I am mainly doing that because it helps me reason about them separately from the ORM’s “models”. Some of the projects I’ve been looking at bunk the two together, some keep the JSON-related pieces with the associated routes, and some separate them out like I did here. I don’t see an overwhelming advantage to any of these yet, so I’m sticking to the one that seems easiest for me to think about for now.
And I’m using that to deliver a “Hello, World” message in the main module:
import uvicorn
from fastapi import FastAPI

from .schemas import StatusMessage

app = FastAPI()


@app.get('/')
def root():
    return StatusMessage(code='success', message='Hello, FastAPI')


if __name__ == "__main__":
    uvicorn.run("stocktoolkit.main:app", host="127.0.0.1", port=8000, reload=True)
This is all very much in line with the FastAPI sample code used throughout the manual, except that I’m running the server from the main module instead of using uvicorn directly on the command line. Doing it this way makes it easy to establish an interactive debugging session with PyCharm. For the hot reloading (specified by reload=True) to work, the ASGI app has to be specified as a string instead of passing the app object.
The last part of the setup takes place in the IDE, where I create a run configuration:
The main noteworthy thing about that is that I’m using a module name to select the script that is run. Also, if “store as project file” is checked, the run configuration will land in source control. It will need to be opened, have the interpreter updated, and saved in order to work on a different platform than it was created on. (I was trying it out on Windows when I took the screenshot.) My current preference is to keep the one from Linux checked in and just update it from Windows without committing those settings on those occasions where I’m working from Windows. If I worked on Windows more often, I think I’d add a second run configuration for Windows and commit both to git.
Add a Database
Tortoise ORM is very similar to the SQLAlchemy ORM or to the Django ORM. While FastAPI works well with SQLAlchemy, most projects I’ve found that use FastAPI seem to favor Tortoise. Since I’m starting fresh, I’m going to try Tortoise.
First, I create a new file called database.py and use Tortoise to create a data model:
from tortoise.models import Model
from tortoise import fields


# This should match the sqlalchemy definition found here:
# https://github.com/hackingthemarkets/stockscreener/blob/master/models.py
class Stock(Model):
    id = fields.IntField(pk=True, index=True)
    symbol = fields.CharField(max_length=255, unique=True, index=True)
    price = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    forward_pe = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    forward_eps = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    dividend_yield = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    ma50 = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
    ma200 = fields.DecimalField(max_digits=10, decimal_places=2, null=True)

    class Meta:
        table = 'stocks'
Specifying the table name is not really necessary, but it better matches the convention I’m used to and does not hurt anything. The main noteworthy thing about this definition is that all of the fields other than the auto-incremented primary key and the ticker symbol need to be nullable. They will never be populated initially; a background task will look up their values using the Yahoo Finance API.
With the schema defined, Tortoise needs to be told about where the database can be found. The documentation offers several options for this, but only the configuration dictionary appears to be compatible with the aerich management and migration tool. Having no legacy concerns and some inclination to at least try that tool, that’s what I will use for now. I’m storing the database info in a module called config that I expect to vary per-environment because that’s the convention I’ve used before. I don’t see any clear consensus for how people manage those in the wild with this stack yet.
TORTOISE_ORM_SETTINGS = {
    'connections': {
        'default': 'sqlite://stocktoolkit_dev.db',
    },
    'apps': {
        'models': {
            'models': ['stocktoolkit.database', ],
        }
    }
}
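As an aside, one possible way to let the config module vary per environment is to read the connection string from an environment variable and fall back to the development database. This is only a sketch, not what the repository does: the variable name STOCKTOOLKIT_DB_URL and the helper build_tortoise_config are hypothetical names invented for illustration.

```python
import os


def build_tortoise_config(environ=os.environ):
    """Build a Tortoise config dict, taking the DB URL from the environment.

    STOCKTOOLKIT_DB_URL is a hypothetical variable name; when it is unset,
    the local sqlite development database from the post is used.
    """
    db_url = environ.get("STOCKTOOLKIT_DB_URL", "sqlite://stocktoolkit_dev.db")
    return {
        "connections": {"default": db_url},
        "apps": {"models": {"models": ["stocktoolkit.database"]}},
    }


# Same shape as the hand-written dictionary, but environment-aware.
TORTOISE_ORM_SETTINGS = build_tortoise_config()
```

Passing the mapping in as a parameter keeps the function easy to exercise without touching the real process environment.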
With that in place, I add a call to register_tortoise to main.py and start the service to confirm that a database with the expected schema gets created.
Update 10/15/2020: Tortoise ORM’s integration with sqlite is not a great fit for this application. In particular, things like this:
dividend_yield = fields.DecimalField(max_digits=10, decimal_places=2, null=True)
in the schema are problematic. Tortoise uses a string field in sqlite for that, and then queries with string values. So a query with WHERE dividend_yield >= '42' generated by the ORM will return a stock with a dividend yield of 5.25, because the two strings are compared character by character rather than numerically. As a consequence, the most recent version of the project uses postgres. Thus far, that does not make me want to go back to sqlalchemy.
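The string comparison problem is easy to reproduce with nothing but Python’s built-in sqlite3 module. The sketch below is not the SQL that Tortoise emits; the table names and the VARCHAR(40) column type are assumptions chosen to mimic a decimal stored as text, with a REAL column alongside for contrast.

```python
import sqlite3


def compare_affinities():
    """Run the same >= '42' filter against a text column and a real column."""
    conn = sqlite3.connect(":memory:")
    # Assumed stand-ins: one table stores the yield as text, one as a number.
    conn.execute("CREATE TABLE as_text (symbol TEXT, dividend_yield VARCHAR(40))")
    conn.execute("CREATE TABLE as_real (symbol TEXT, dividend_yield REAL)")
    for table in ("as_text", "as_real"):
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", ("LOW", "5.25"))
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", ("HIGH", "50.00"))

    query = "SELECT symbol FROM {} WHERE dividend_yield >= ? ORDER BY symbol"
    # The filter value is bound as a string, as described above.
    text_hits = [row[0] for row in conn.execute(query.format("as_text"), ("42",))]
    real_hits = [row[0] for row in conn.execute(query.format("as_real"), ("42",))]
    conn.close()
    return text_hits, real_hits


text_hits, real_hits = compare_affinities()
print(text_hits)  # ['HIGH', 'LOW']: '5.25' sorts after '42' as text
print(real_hits)  # ['HIGH']: compared numerically, 5.25 is below 42
```

With the text column, "5.25" passes the filter because the character "5" sorts after "4". With the numeric column, sqlite converts the bound string to a number before comparing, so only the 50.00 row matches.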
import uvicorn
from fastapi import FastAPI
from tortoise.contrib.fastapi import register_tortoise

from .schemas import StatusMessage
from .config import TORTOISE_ORM_SETTINGS

app = FastAPI()
register_tortoise(app, config=TORTOISE_ORM_SETTINGS, generate_schemas=True)


@app.get('/')
def root():
    return StatusMessage(code='success', message='Hello, FastAPI')


if __name__ == "__main__":
    uvicorn.run("stocktoolkit.main:app", host="127.0.0.1", port=8000, reload=True)
Specify API Messages
With the database defined, now I need to specify messages for retrieving all the stocks from the database and for adding a stock by ticker symbol. The API as I’m currently imagining it doesn’t need an update message and the delete operation will just be a DELETE verb to /stock/id where id is the primary key from the database.
Between Pydantic and Tortoise’s Pydantic helpers, both of these message formats can be declared very tersely in schemas.py:
from pydantic import BaseModel
from tortoise.contrib.pydantic import pydantic_queryset_creator

from .database import Stock


class StatusMessage(BaseModel):
    code: str
    message: str


class SymbolAddRequest(BaseModel):
    ticker_symbol: str


StockList = pydantic_queryset_creator(Stock)
While this brevity is excellent, the bit that I really like is the validation that comes for free when you use type hints.
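To make the “free validation” concrete, here is a small sketch that exercises a request model directly, outside of any web framework. The model mirrors SymbolAddRequest from schemas.py; the helper invalid_fields is a hypothetical name used only for this illustration.

```python
from pydantic import BaseModel, ValidationError


class SymbolAddRequest(BaseModel):
    ticker_symbol: str


def invalid_fields(payload: dict) -> list:
    """Return the names of the fields pydantic rejects for this payload."""
    try:
        SymbolAddRequest(**payload)
    except ValidationError as exc:
        # Each error carries a "loc" tuple naming the offending field.
        return [error["loc"][0] for error in exc.errors()]
    return []


print(invalid_fields({"ticker_symbol": "AAPL"}))   # []
print(invalid_fields({"ticker_symbol2": "AAPL"}))  # ['ticker_symbol']
```

The misspelled key is ignored by default, so the only complaint is that the required ticker_symbol field is missing. FastAPI turns the same kind of error into an HTTP error response without any extra code in the endpoint.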
Stub out API endpoints
With the messages defined, now I need to get preliminary endpoints in place to add stocks to the database, retrieve them, and delete them. For now, I’m putting all of these routes in main.py. The FastAPI manual details a good approach to splitting them up, but even with all of them there main.py will remain small and readable for this application and I won’t learn anything interesting by breaking them up into groups. I currently think the natural seam where a split makes sense is when I have endpoints that act on different kinds of things, like, say, both users and stocks.
import uvicorn
from fastapi import FastAPI
from tortoise.contrib.fastapi import register_tortoise

from .schemas import StatusMessage, SymbolAddRequest, StockList
from .config import TORTOISE_ORM_SETTINGS
from .database import Stock

app = FastAPI()
register_tortoise(app, config=TORTOISE_ORM_SETTINGS, generate_schemas=True)


@app.get('/')
def root():
    return StatusMessage(code='success', message='Hello, FastAPI')


@app.get('/stocks')
async def get_stocks():
    stocks = Stock.all()
    return await StockList.from_queryset(stocks)


@app.post('/stock')
async def create_stock(request: SymbolAddRequest):
    stock = await Stock.create(symbol=request.ticker_symbol)
    return StatusMessage(code='success', message=f'{request.ticker_symbol} added to database with id {stock.id}')


@app.delete('/stock/{stock_id}')
async def delete_stock(stock_id: int):
    stock = await Stock.filter(id=stock_id).first()
    ticker = stock.symbol
    stock_id = stock.id
    await stock.delete()
    return StatusMessage(code='success', message=f'{ticker} ({stock_id}) removed from database')


if __name__ == "__main__":
    uvicorn.run("stocktoolkit.main:app", host="127.0.0.1", port=8000, reload=True)
Ad Hoc Testing
Any HTTP client can test these simple endpoints easily enough. My current tool of choice is Insomnia. The core client is FOSS; they are trying to sell some collaboration and design features to monetize it. And it’s very pleasant to use as long as you’re not allergic to Electron. A shell script and curl would certainly do the job, though.
Check Empty Database
$ curl --request GET \
--url http://localhost:8000/stocks
[]
Add a Stock
$ curl --request POST \
--url http://localhost:8000/stock \
--header 'content-type: application/json' \
--data '{
"ticker_symbol": "AAPL"
}'
{"code":"success","message":"AAPL added to database with id 2"}
Confirm Success
$ curl --request GET \
--url http://localhost:8000/stocks
[{"id":2,"symbol":"AAPL","price":null,"forward_pe":null,"forward_eps":null,"dividend_yield":null,"ma50":null,"ma200":null}]
Delete the Stock
$ curl --request DELETE \
--url http://localhost:8000/stock/2
{"code":"success","message":"AAPL (2) removed from database"}
Confirm Success
$ curl --request GET \
--url http://localhost:8000/stocks
[]
Check API Input Validation
Although no explicit error handling has yet been added, a malformed stock add request can show what FastAPI and Pydantic are giving us for free in terms of error handling with the following request:
$ curl --request POST \
--url http://localhost:8000/stock \
--header 'content-type: application/json' \
--data '{
"ticker_symbol2": "AAPL"
}'
{"detail":[{"loc":["body","ticker_symbol"],"msg":"field required","type":"value_error.missing"}]}
Generated API Documentation and Test Interface
Swagger documentation of the API is available live at /docs using a web browser.
Rounding out the API
This leaves us with a roughed-in API that has a few rough edges and missing features:
- Attempting to delete a nonexistent resource results in an internal server error.
- Attempting to add a duplicate ticker symbol results in an internal server error.
- Filters for the stock list are not yet implemented.
- No data other than the ticker symbol ever gets loaded into the database.
I’m certain there is plenty more we could do, but these basics should help kick the tires on the framework a little more.
Sensible error messages
FastAPI makes it easy to pass errors on to the HTTP client in the form of HTTP status messages with more details in the body. For example, if no stock is found to delete, this:
if stock is None:
    raise HTTPException(status_code=404, detail=f'Unable to retrieve record for stock with id {stock_id}')
is all that is necessary to induce most clients to report the error appropriately. Similarly, in the case of a constraint violation, catching the exception from the ORM and reporting it as an HTTPException does the job:
@app.post('/stock')
async def create_stock(request: SymbolAddRequest):
    try:
        stock = await Stock.create(symbol=request.ticker_symbol)
    except OperationalError as e:
        raise HTTPException(status_code=409, detail=f'Unable to add {request.ticker_symbol}: {e}')
    return StatusMessage(code='success', message=f'{request.ticker_symbol} added to database with id {stock.id}')
Filtering the Stock List
FastAPI passes query parameters through to API endpoints exactly as you might naively expect based on patterns observed elsewhere in the framework: you declare them as function parameters with the same name as the query parameters and provide appropriate type annotations. Parameters with default values are optional. Those without default values are required.
The harder part for me to get my head around was actually applying the filter parameters to the query. I’ve used SQLAlchemy plenty and written my share of plain ol' SQL, but I’ve never used Django’s ORM for much. Tortoise borrows much of its interface from Django’s ORM:
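from typing import Optional

from tortoise.expressions import F  # F lets a filter compare one column to another


@app.get('/stocks')
async def get_stocks(forward_pe: Optional[float] = None, dividend_yield: Optional[float] = None,
                     ma50: Optional[bool] = None, ma200: Optional[bool] = None):
    stocks = Stock.all()
    # Compare against None so that an explicit 0 still applies the filter.
    if forward_pe is not None:
        stocks = stocks.filter(forward_pe__lte=forward_pe)
    if dividend_yield is not None:
        stocks = stocks.filter(dividend_yield__gte=dividend_yield)
    # The boolean flags only filter when they are true.
    if ma50:
        stocks = stocks.filter(price__gte=F('ma50'))
    if ma200:
        stocks = stocks.filter(price__gte=F('ma200'))
    return await StockList.from_queryset(stocks)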
Now that I’m acclimated, I like the extra input validation this provides over hand rolling SQL. I know some prefer to avoid ORMs like the plague, and I wonder if this will continue to hold as my datastores grow more complex.
Populate Data in Background
The last major thing I want to do in the backend is populate information for newly added stocks without blocking the client. I have a mixed reaction to the built-in background task interface offered by FastAPI/Starlette. On one hand, the happy path feels very, very straightforward and is easy to understand. On the other, it does not seem especially robust for the kinds of things I’m likely to want to use it for.
I suspect that, some time soon, it will become obvious that I need to revisit this with Celery or something similar. For now, the background task interface is easy to adopt for a prototype and it doesn’t feel as if it will be hard to replace later.
To that end, I added a new stock_utils module. It currently contains one function that fetches data from Yahoo Finance and updates the local database based on the results:
import yfinance

from .database import Stock


async def fetch_stock_data(stock_id: int):
    stock = await Stock.filter(id=stock_id).first()
    if stock is None:
        # The stock may have been deleted before the background task ran.
        return
    # Look the info dictionary up once and reuse it for every field.
    info = yfinance.Ticker(stock.symbol).info
    # .get() leaves a column NULL when a key is absent for this ticker.
    stock.ma200 = info.get('twoHundredDayAverage')
    stock.ma50 = info.get('fiftyDayAverage')
    stock.price = info.get('previousClose')
    stock.forward_pe = info.get('forwardPE')
    stock.forward_eps = info.get('forwardEps')
    if info.get('dividendYield') is not None:
        # Convert the fractional yield to a percentage.
        stock.dividend_yield = info['dividendYield'] * 100
    await stock.save()
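One caveat with this function: yfinance makes ordinary blocking HTTP requests, and a blocking call inside an async function holds up the event loop until it returns. A common way around that with only the standard library is to push the blocking call onto a thread with run_in_executor. The sketch below shows the pattern in isolation; slow_lookup is a hypothetical stand-in for the yfinance call, and none of these names come from the repository.

```python
import asyncio
import time


def slow_lookup(symbol: str) -> dict:
    """Hypothetical stand-in for a blocking network call such as yfinance."""
    time.sleep(0.2)
    return {"symbol": symbol, "previousClose": 100.0}


async def fetch_without_blocking(symbol: str) -> dict:
    """Run the blocking lookup in the default thread pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, slow_lookup, symbol)


async def demo():
    """Show that other coroutines keep running while the lookup is in flight."""
    ticks = 0

    async def ticker():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    result = await fetch_without_blocking("AAPL")
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return result, ticks


result, ticks = asyncio.run(demo())
print(result["symbol"], ticks > 0)
```

The ticker coroutine stands in for other requests being served. It keeps advancing during the lookup, which would not happen if slow_lookup were called directly from the coroutine.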
With that, the back end feels finished enough for now. Here are all the endpoints:
from typing import Optional

import uvicorn
from fastapi import FastAPI, HTTPException, BackgroundTasks
from tortoise.contrib.fastapi import register_tortoise
from tortoise.exceptions import OperationalError
from tortoise.expressions import F  # needed for the column-to-column filters below

from .schemas import StatusMessage, SymbolAddRequest, StockList
from .config import TORTOISE_ORM_SETTINGS
from .database import Stock
from .stock_utils import fetch_stock_data

app = FastAPI()
register_tortoise(app, config=TORTOISE_ORM_SETTINGS, generate_schemas=True)


@app.get('/')
def root():
    return StatusMessage(code='success', message='Hello, FastAPI')


@app.get('/stocks')
async def get_stocks(forward_pe: Optional[float] = None, dividend_yield: Optional[float] = None,
                     ma50: Optional[bool] = None, ma200: Optional[bool] = None):
    stocks = Stock.all()
    # Compare against None so that an explicit 0 still applies the filter.
    if forward_pe is not None:
        stocks = stocks.filter(forward_pe__lte=forward_pe)
    if dividend_yield is not None:
        stocks = stocks.filter(dividend_yield__gte=dividend_yield)
    # The boolean flags only filter when they are true.
    if ma50:
        stocks = stocks.filter(price__gte=F('ma50'))
    if ma200:
        stocks = stocks.filter(price__gte=F('ma200'))
    return await StockList.from_queryset(stocks)


@app.post('/stock')
async def create_stock(request: SymbolAddRequest, background_tasks: BackgroundTasks):
    try:
        stock = await Stock.create(symbol=request.ticker_symbol)
    except OperationalError as e:
        raise HTTPException(status_code=409, detail=f'Unable to add {request.ticker_symbol}: {e}')
    # Populate the remaining columns after the response has been sent.
    background_tasks.add_task(fetch_stock_data, stock.id)
    return StatusMessage(code='success', message=f'{request.ticker_symbol} added to database with id {stock.id}')


@app.delete('/stock/{stock_id}')
async def delete_stock(stock_id: int):
    stock = await Stock.filter(id=stock_id).first()
    if stock is None:
        raise HTTPException(status_code=404, detail=f'Unable to retrieve record for stock with id {stock_id}')
    ticker = stock.symbol
    stock_id = stock.id
    await stock.delete()
    return StatusMessage(code='success', message=f'{ticker} ({stock_id}) removed from database')


if __name__ == "__main__":
    uvicorn.run("stocktoolkit.main:app", host="127.0.0.1", port=8000, reload=True)
Even with the limitations I’ve noticed already, I’m really happy with the approach this enables. It’s a better fit for how I like to approach problems than TurboGears (which I used to use heavily) or Flask (which I’d been planning to adopt when I ran across FastAPI).
This is very long. I’ll cover the front-end in another post. The full backend repo, tagged at this stopping point, is available here in case seeing the end result is more appealing than following along.