typenil

data|coding|exploding

Git Hooks: Autoformat before commit

Autoformatters can be great, keeping diffs small and a code base readable across teams and engineers. Black formatting has been enforced in Lobit builds since day one and I recently added it to the repos at Downstream as well.

[Image: The typical code base]

But what isn’t great is forgetting to run the formatter, having the CI build fail, and having to add a “Formatting” commit. That’s where Git hooks come in.

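Git hooks are scripts that Git runs at specific points in its workflow. They live in the .git/hooks folder of any given Git project, which usually comes pre-populated with sample scripts (the files ending in .sample). For our purposes, pre-commit is the hook of interest: it runs just before a commit is created.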

Let’s focus on the Black formatter for Python as an example. It needs to be installed locally (pip install black). Naively, we can add formatting on pre-commit by adding a file called pre-commit to .git/hooks with these contents:

#!/bin/sh
black .

(For JavaScript, you might have something like prettier --write "**/*.js", with the glob quoted so that Prettier expands it rather than your shell.)

Don’t forget to make the file executable (chmod +x .git/hooks/pre-commit). Git skips hooks that aren’t executable.
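If you set this up in more than one repo, the two steps above (write the file, mark it executable) can be wrapped in a small helper. This is only a sketch: the function name install_hook is made up for illustration, and it assumes the repo keeps its hooks in the default .git/hooks folder.

```shell
#!/bin/sh
# install_hook REPO_DIR (hypothetical helper, not part of Git):
# writes the naive one-line Black hook into REPO_DIR and marks it executable.
install_hook() {
    hooks_dir="$1/.git/hooks"

    # Assumption: hooks live in the default location inside the repo.
    if [ ! -d "$hooks_dir" ]; then
        echo "no hooks folder at $hooks_dir" >&2
        return 1
    fi

    # Same contents as the naive hook: a shebang plus "black ."
    printf '#!/bin/sh\nblack .\n' > "$hooks_dir/pre-commit"

    # Git ignores hook files that are not executable.
    chmod +x "$hooks_dir/pre-commit"
}
```

Running install_hook . from the root of a repo would drop the hook in place. Note that it overwrites any existing pre-commit file.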

It’s nice and simple, but there are a couple of issues with this approach. First, Black runs over all the source files rather than just the ones that changed, which can be quite slow depending on the size of your project. Second, the reformatted files aren’t added to the commit: the commit goes through with the unformatted versions, and you have to commit the reformatted files separately.

Here comes the convoluted piping. To get the names and statuses of files that are staged for commit, we can use git diff --cached --name-status to yield output like this:

> $ git diff --cached --name-status

M       someproject/src/whatevs.py
A       someproject/tests/whatevs_test.py
D       someproject/tests/lasers.py
M       someproject/README.md

We want to ignore deleted files, since we can’t format them, so let’s filter them out:

> $ git diff --cached --name-status | grep -v '^D'

M       someproject/src/whatevs.py
A       someproject/tests/whatevs_test.py
M       someproject/README.md

We also want to ignore anything that isn’t a Python file (or JavaScript or Go or whatever you’re using), so let’s filter on file extension:

> $ git diff --cached --name-status | grep -v '^D' | grep '\.py$'

M       someproject/src/whatevs.py
A       someproject/tests/whatevs_test.py
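One thing to watch with the extension filter: in a grep pattern, an unescaped dot matches any character, so an escaped dot anchored to the end of the line is the stricter choice. A quick way to see the difference, using two made-up paths that aren’t from any real project:

```shell
#!/bin/sh
# Two hypothetical paths: only the first is a Python file.
paths='src/app.py
docs/copy_notes.md'

# Unescaped dot: "any character followed by py", so "opy" in "copy" matches too.
loose=$(printf '%s\n' "$paths" | grep -c '.py')

# Escaped dot plus end-of-line anchor: only a real .py suffix matches.
strict=$(printf '%s\n' "$paths" | grep -c '\.py$')

echo "loose=$loose strict=$strict"
# prints: loose=2 strict=1
```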

Now that we have the files we want, we only need their names, so let’s strip the status column:

> $ git diff --cached --name-status | grep -v '^D' | grep '\.py$' | sed 's/^[A-Z][[:space:]]*//'

someproject/src/whatevs.py
someproject/tests/whatevs_test.py
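To poke at the filtering steps so far without touching a real repo, they can be wrapped in a function that reads name-status output from stdin. A minimal sketch: the name staged_python_paths is invented here, and the sample input simply mirrors the listing from earlier.

```shell
#!/bin/sh
# staged_python_paths (hypothetical helper): reads `git diff --cached --name-status`
# style lines on stdin and prints the paths of non-deleted .py files, one per line.
staged_python_paths() {
    grep -v '^D' |                    # drop deleted files
        grep '\.py$' |                # keep only paths ending in .py
        sed 's/^[A-Z][[:space:]]*//'  # strip the status letter and the whitespace after it
}

# Sample input: status letter, a tab, then the path.
printf 'M\tsomeproject/src/whatevs.py\nA\tsomeproject/tests/whatevs_test.py\nD\tsomeproject/tests/lasers.py\nM\tsomeproject/README.md\n' |
    staged_python_paths
# prints:
# someproject/src/whatevs.py
# someproject/tests/whatevs_test.py
```

Git can also do part of this itself: git diff has --name-only and --diff-filter options that could stand in for the sed and grep -v steps, if you’d rather lean on Git than on text munging.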

And we can pass that right into Black (or whatever formatter you’re using):

> $ git diff --cached --name-status | grep -v '^D' | grep '\.py$' | sed 's/^[A-Z][[:space:]]*//' | xargs black

reformatted someproject/src/whatevs.py
reformatted someproject/tests/whatevs_test.py
All done! ✨ 🍰 ✨
2 files reformatted.

So that’s all great, but the changes still aren’t staged. Let’s fix that by reducing Black’s output (which goes to stderr, hence the 2>&1) to just the files that were reformatted:

> $ git diff --cached --name-status | grep -v '^D' | grep '\.py$' | sed 's/^[A-Z][[:space:]]*//' | xargs black 2>&1 | grep '^reformatted' | sed 's/^reformatted[[:space:]]//'

someproject/src/whatevs.py
someproject/tests/whatevs_test.py
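The output-parsing step can be tried on its own by feeding it text shaped like Black’s messages. A small sketch, assuming the "reformatted <path>" line format shown in the output above (the function name reformatted_paths is made up):

```shell
#!/bin/sh
# reformatted_paths (hypothetical helper): reads formatter output on stdin
# and prints only the paths from lines that start with "reformatted".
reformatted_paths() {
    grep '^reformatted' | sed 's/^reformatted[[:space:]]//'
}

# Sample text modelled on the Black output shown earlier.
printf 'reformatted someproject/src/whatevs.py\nreformatted someproject/tests/whatevs_test.py\nAll done!\n2 files reformatted.\n' |
    reformatted_paths
# prints:
# someproject/src/whatevs.py
# someproject/tests/whatevs_test.py
```

The summary line ("2 files reformatted.") is dropped because it doesn’t start with the word "reformatted".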

And now we can finally stage the formatted files:

> $ git diff --cached --name-status | grep -v '^D' | grep '\.py$' | sed 's/^[A-Z][[:space:]]*//' | xargs black 2>&1 | grep '^reformatted' | sed 's/^reformatted[[:space:]]//' | xargs git add

So the final .git/hooks/pre-commit file should look something like this:

#!/bin/sh
git diff --cached --name-status | grep -v '^D' | grep '\.py$' | sed 's/^[A-Z][[:space:]]*//' | xargs black 2>&1 | grep '^reformatted' | sed 's/^reformatted[[:space:]]//' | xargs git add
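If the one-liner is hard to read or debug, the same idea can be laid out as a loop. To be clear, this is a variation rather than the hook above: it runs Black once per staged file and re-stages each one whether or not Black changed it, and the function name format_staged is invented for the sketch. It also makes some assumptions: black is on the PATH, and a formatter failure should abort the commit.

```shell
#!/bin/sh
# format_staged (hypothetical): format each staged, non-deleted Python file
# with Black, then stage it again so the commit picks up the formatted version.
format_staged() {
    git diff --cached --name-status |
        grep -v '^D' |
        grep '\.py$' |
        sed 's/^[A-Z][[:space:]]*//' |
        while IFS= read -r path; do
            # Reading line by line keeps paths containing spaces intact.
            # A non-zero exit from Black stops the loop with a failing status.
            black --quiet "$path" || exit 1
            git add -- "$path"
        done
}
```

Calling format_staged as the last line of .git/hooks/pre-commit would run it on every commit. It trades speed for readability, since Black starts once per file. As with the one-liner, git add stages the whole file, so any changes you had deliberately left unstaged in that file get pulled into the commit too.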

Set it and forget it. Happy coding.

