Publishing datasette to Google Cloud Compute with GitHub Actions

Simon Willison has a fascinating data-publishing and data-management project named datasette. A few months ago, he put together a plugin named datasette-ripgrep that uses ripgrep (you use ripgrep, right?) to search folders of files and display the results using datasette’s machinery.

I thought of creating a datasette-ripgrep instance to search all the packages from the Enthought Tool Suite. Using GitHub to search across this cohesive set of tools, and only this set of tools, doesn’t really work.

Setting datasette-ripgrep up locally turned out to be pretty easy. But publishing it to Google Cloud Platform (GCP) using GitHub Actions, so I could automate the daily update of the indexed repositories’ content, turned out to be a multi-month effort.

I started working off the demo deploy action, which took me most of the way there. But I kept running into GCP authentication issues. It complained that “No credentials provided, skipping authentication”. That is, until I realized, after 2 months of on-and-off attempts, that I was putting GitHub secrets in Settings > Environment > Secrets, and not in Settings > Secrets. *slaps forehead* I’m sure actions can see secrets in the Environment section somehow, but I don’t know how. Another thing I learned is that when the GCP docs ask you to put the service account key in a GitHub secret, you can just paste the whole JSON as-is.
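
A small sanity check at the start of a workflow can catch this kind of mistake early. The sketch below assumes the secret is exposed to the job as an environment variable, here given the hypothetical name GCP_SA_KEY, and only checks that the value parses as JSON and looks like a service account key. The list of expected fields is an assumption based on what a downloaded key file usually contains, not a complete schema.

```python
import json
import os

# Fields expected in a service account key file. Treat this list as an
# assumption rather than a complete schema.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw):
    """Return a list of problems found in the raw secret value (empty if none)."""
    if not raw:
        return ["secret is empty or not visible to this job"]
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"secret is not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["secret is JSON but not an object"]
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in key]
    if key.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems


if __name__ == "__main__":
    # GCP_SA_KEY is a hypothetical variable name; use whatever your workflow sets.
    for problem in check_service_account_key(os.environ.get("GCP_SA_KEY", "")):
        print(problem)
```

An empty value is reported separately from malformed JSON because an empty value is what a job would likely see when the secret lives somewhere the job cannot read.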

The next hurdle was that the datasette publish cloudrun command would fail with the error “You do not appear to have access to project […]”. I tried many things related to IAM, roles, service accounts and the like, but without success. The aha! moment came when I realized/remembered that datasette.publish.cloudrun actually talks to GCP using the gcloud command line tool. I identified that it calls the builds and deploy subcommands. Using that information, I could search for which permissions were required to execute those commands. The one I was missing was Cloud Build Editor (and maybe Viewer).
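
To make that concrete, here is a rough sketch of the two kinds of gcloud invocations involved, written as Python argument lists. The flags are a plausible minimal set and are not copied from datasette’s source; the image tag format, project ID, and service name are placeholders made up for illustration.

```python
import shlex


def gcloud_commands(project, service):
    """Build illustrative `gcloud builds submit` and `gcloud run deploy` commands.

    The flags are a plausible minimal set, not a copy of what datasette runs.
    """
    image = f"gcr.io/{project}/{service}"
    build = ["gcloud", "builds", "submit", "--tag", image, "--project", project]
    deploy = [
        "gcloud", "run", "deploy", service,
        "--image", image,
        "--platform", "managed",
        "--allow-unauthenticated",
        "--project", project,
    ]
    return build, deploy


if __name__ == "__main__":
    # "my-project" and "my-service" are hypothetical names.
    for command in gcloud_commands("my-project", "my-service"):
        print(shlex.join(command))
```

Running commands like these by hand, authenticated as the service account, should surface the same permission errors without waiting for a full workflow run.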

In the end, the Service Account has the following roles (I’m not 100% sure they’re all necessary):

  • Cloud Build Editor
  • Compute Engine Service Agent
  • Service Account User
  • Cloud Run Admin
  • Storage Admin
  • Viewer
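
For anyone who prefers the command line to the IAM console, the list above could be turned into gcloud projects add-iam-policy-binding commands along these lines. This is only a sketch: the role IDs are my best mapping of the display names and should be double-checked, and the project ID and service account email are hypothetical.

```python
import shlex

# Display name -> role ID. The IDs are a best-guess mapping of the names in
# the list above; verify them in the IAM console before relying on them.
ROLES = {
    "Cloud Build Editor": "roles/cloudbuild.builds.editor",
    "Compute Engine Service Agent": "roles/compute.serviceAgent",
    "Service Account User": "roles/iam.serviceAccountUser",
    "Cloud Run Admin": "roles/run.admin",
    "Storage Admin": "roles/storage.admin",
    "Viewer": "roles/viewer",
}


def binding_commands(project, service_account_email):
    """Return one add-iam-policy-binding command (as an argument list) per role."""
    member = f"serviceAccount:{service_account_email}"
    return [
        ["gcloud", "projects", "add-iam-policy-binding", project,
         f"--member={member}", f"--role={role}"]
        for role in ROLES.values()
    ]


if __name__ == "__main__":
    # Both arguments below are hypothetical placeholders.
    for command in binding_commands("my-project", "deployer@my-project.iam.gserviceaccount.com"):
        print(shlex.join(command))
```

Printing the commands instead of running them makes it easy to drop roles one at a time and find out which ones are actually needed.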

After 100 failed deploys and much reading of mediocre Medium articles and of Google’s (seemingly) incomplete and incorrect READMEs, the 101st deploy succeeded! You can now search the ETS repos at the very unglamorous URL of https://datasette-ripgrep-ets-alicuzwd4a-uc.a.run.app and see the source on GitHub.