For the first time in quite a while, I caught the urge to publish a package to pypi today. The happy path for publishing a pure python package felt less obvious than I think it should’ve. Here’s what I did.

The package in question is published here, and the source code is available on sr.ht.

It’s the beginning of a youth baseball season. One of the things I do is help our commissioners get all the information our families need to the right places where families will find it. For a long time, I tried sending them templates and generating CSVs. Only a few of them used my templates, and the CSVs tended to have weird errors. That intermediate step confused and annoyed them, and it wasn’t really saving me any work. So starting last year, when we migrated to a new back-end system, I decided to stop asking them for specific formats and just consume the spreadsheets that they used organically to form teams and develop schedules.

It’s worked shockingly well. I’ve got a small zoo of jupyter notebooks that I can use to quickly wrangle their sheets into the formats our system needs. But as I was building those notebooks, I kept wishing for one feature that openpyxl doesn’t include: I wanted to read all or part of a worksheet into a list of dict objects, much like DictReader in python’s standard csv library does. I looked around to see if anyone had already made that thing available to the public, and found this one version, but my commissioners don’t have the decency to put the things I need in the upper-left corner of a sheet, all nicely defined.
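To make that wish concrete, here is a minimal sketch of the idea, not the published library's actual interface. It assumes rows arrive as tuples of cell values, which is what openpyxl's `ws.iter_rows(values_only=True)` yields. The name `rows_to_dicts` and its `first_header` parameter are made up for illustration. It hunts for the header row wherever it happens to sit, ignores anything to the left of it, and stops at the first blank row.

```python
from typing import Any, Iterable, Iterator, Optional, Sequence


def rows_to_dicts(
    rows: Iterable[Sequence[Any]],
    first_header: str,
) -> Iterator[dict]:
    """Yield one dict per data row, in the spirit of csv.DictReader.

    The header row is the first row containing a cell equal to
    ``first_header``. Hypothetical helper for illustration only.
    """
    headers: Optional[list] = None
    start = 0
    for row in rows:
        if headers is None:
            if first_header in row:
                # The table starts at the column holding first_header.
                start = list(row).index(first_header)
                headers = []
                for cell in row[start:]:
                    if cell is None:  # an empty cell ends the header row
                        break
                    headers.append(str(cell))
            continue  # skip everything up to and including the header row
        values = list(row[start:start + len(headers)])
        if all(v is None for v in values):
            break  # a blank row marks the end of the table
        # Pad short rows so every dict has every header as a key.
        values += [None] * (len(headers) - len(values))
        yield dict(zip(headers, values))


# Made-up rows: a title, a blank row, then a table that starts in column B.
sample = [
    (None, "Spring schedule", None, None),
    (None, None, None, None),
    (None, "Team", "Coach", "Field"),
    (None, "Cubs", "Pat", 3),
    (None, "Mets", "Sam", 1),
    (None, None, None, None),
    (None, "notes below the table", None, None),
]
print(list(rows_to_dicts(sample, "Team")))
```

With a real worksheet, the call would look something like `rows_to_dicts(ws.iter_rows(values_only=True), "Team")`.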

So the only “sane” thing to do was make my own. Thus far, it doesn’t suck. So partly because I like sharing, and partly because it’s a lot easier for me if I can just poetry add my library from pypi in the future, I wanted to publish my library to the cheese shop.

It’s not obvious to me how best to build and maintain my project so that I can easily do that. There’s been a lot of movement in the python building/packaging/distribution space over the past several years, and the advice floating around online is conflicting and unclear.

Here’s where I landed. I’m hoping a more experienced voice can tell me where I’ve gone wrong and how to fix it.

  1. I’m using poetry to build my library from pyproject.toml:
[tool.poetry]
name = "xlsx-dict-reader"
version = "0.2.0"
description = "An interface similar to csv.DictReader for openpyxl WorkSheet objects"
authors = ["Geoff Beier <geoff@tuxpup.com>"]
readme = "README.md"
license = "MIT"
repository = "https://git.sr.ht/~tuxpup/xlsx-dict-reader"

[tool.poetry.dependencies]
python = "^3.10"
openpyxl = "^3.1.2"


[tool.poetry.group.dev.dependencies]
pytest = "^8.1.1"
pre-commit = "^3.6.2"
black = "^24.3.0"
flake8 = "^7.0.0"
tox = "^4.14.2"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"


[tool.black]
line-length = 79
target-version = ['py310']

[tool.isort]
profile = "black"
multi_line_output = 3
py_version = 310
  1. I’m using pre-commit to enforce coding standards before I commit, and I’m not checking them at all in CI.
repos:
  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/isort
    rev: 5.13.2
    hooks:
      - id: isort
  - repo: https://github.com/PyCQA/flake8
    rev: 7.0.0
    hooks:
      - id: flake8

.flake8:

[flake8]
ignore = E211, E999, F401, F821, W503
max-doc-length = 72
  1. I’m using pytest to test things:
[pytest]
# Under python 3.12, openpyxl raises a DeprecationWarning for datetime.datetime.utcnow()
filterwarnings =
    ignore::DeprecationWarning:openpyxl.packaging.core
  1. My tests require excel files. I’m keeping those checked into my tests module in a data directory.

https://git.sr.ht/~tuxpup/xlsx-dict-reader/tree/main/item/tests

  1. I’m using tox to test with python versions from 3.10 to 3.12.
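For the Excel files checked into the tests data directory, one common pattern is a small helper that resolves fixture paths relative to the tests directory rather than the current working directory, so the tests behave the same under pytest and tox. This is only a sketch: the helper name `data_file` and the file name `sample.xlsx` are hypothetical and not taken from the repository.

```python
from pathlib import Path


def data_file(name: str, tests_dir: Path) -> Path:
    """Return the path to a fixture workbook under <tests_dir>/data.

    Fails loudly if the file is missing, which beats a confusing
    error from deeper inside the workbook loader.
    """
    path = Path(tests_dir) / "data" / name
    if not path.is_file():
        raise FileNotFoundError(f"missing test fixture: {path}")
    return path
```

Inside a test module, that might be called as `data_file("sample.xlsx", Path(__file__).parent)` and the result handed to `openpyxl.load_workbook`.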

For now, I bump the version in pyproject.toml and commit that, tag it after tests pass, and run poetry build followed by poetry publish to do a release.
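Those release steps could be scripted roughly like this. It's a sketch under a few assumptions: the version has already been bumped and committed, tags are named `v<version>` (the post doesn't say what the tags look like), and `tox`, `git` and `poetry` are on the PATH. The function names are invented for illustration.

```python
import re
import subprocess


def read_version(pyproject_text: str) -> str:
    """Pull the version string out of the [tool.poetry] table."""
    in_poetry = False
    for line in pyproject_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Only look at keys directly under [tool.poetry].
            in_poetry = stripped == "[tool.poetry]"
        elif in_poetry:
            match = re.match(r'version\s*=\s*"([^"]+)"', stripped)
            if match:
                return match.group(1)
    raise ValueError("no version found under [tool.poetry]")


def release_commands(version: str) -> list:
    """List the commands for one release, in the order described above."""
    return [
        ["tox"],                        # run the tests first
        ["git", "tag", f"v{version}"],  # assumed tag naming scheme
        ["poetry", "build"],
        ["poetry", "publish"],
    ]


def run_release(pyproject_path: str = "pyproject.toml") -> None:
    """Run each step, stopping at the first one that fails."""
    with open(pyproject_path, encoding="utf-8") as fh:
        version = read_version(fh.read())
    for cmd in release_commands(version):
        subprocess.run(cmd, check=True)
```

Because `check=True` raises on a non-zero exit, a failing tox run stops the script before anything is tagged or published.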

I don’t think I’m quite aligned with what people who do this all the time think is the easiest path. What should I do differently?


I’m trying on Kev Quirk’s “100 Days To Offload” idea. You can see details and join yourself by visiting 100daystooffload.com.

This is day #3?