2 Use Cases of Python Pre-commit Hooks to Tidy Up Your Git Repositories | by Eldad Uzman | Apr, 2022

photo by Kelly Sikkema on unsplash

A well-organized repo leads to well-organized code and a better developer experience.

In this article, I’ll show you 2 use cases of pre-commit that help maintain a tidier repository.

Pre-commit is a Python-based tool that lets you ‘hook’ into your Git repository and run simple checks whenever changes are committed.

It’s written and maintained by the awesome Python developer and YouTuber Anthony Sottile.

With pre-commit, developers can take advantage of Git’s distributed architecture and run short, simple code checks locally before submitting to code review.

This saves a lot of redundant, counter-productive back-and-forth between team members and lets the code reviewer focus on the actual implementation rather than on surface-level concerns.

By using pre-commit and code review before running CI/CD builds, we can shorten time to delivery and save some of the costs involved.

To get started with pre-commit, you only need 3 steps:

1. Install pre-commit with pip:

pip install pre-commit

2. Create the pre-commit configuration file, .pre-commit-config.yaml, in your repository (more about that shortly)

3. Install the hook into your Git repo:

pre-commit install

From that moment on, the hooks run according to your configuration file every time you commit new code.

Let’s take the example of Jupyter notebooks.

A Jupyter notebook stores both the source code and its outputs in a single JSON file.

pip install pandas plotly jupyter

If you commit the notebook as is, its outputs get committed too, which can take up a lot of space and clutter the repository.

So let’s create a hook to clean the repository on commit.
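To make concrete what such a hook has to do, here is a minimal sketch that clears the outputs of a notebook using only the standard library. The notebook dict is a trimmed-down, made-up example and clear_outputs is a hypothetical helper; the hook used in this article relies on nbconvert rather than on code like this.

```python
import json


def clear_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with all code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # simple deep copy via JSON
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # drop the run counter as well
    return cleaned


# Made-up, trimmed-down notebook: one markdown cell and one executed code cell.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["42\n"]}
            ],
            "source": ["print(6 * 7)"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

cleaned = clear_outputs(notebook)

# Compare the serialized size before and after clearing.
print(len(json.dumps(notebook)), "->", len(json.dumps(cleaned)))
```

The source of every cell is untouched; only the parts that change from run to run are removed, which is what keeps the committed file small.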

1. Let’s initialize our Git repo:

git init

2. Now let’s add a pre-commit configuration file:
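A minimal sketch of such a configuration, assuming a local hook that calls nbconvert through the Python interpreter of the active environment. The hook id matches the output in the steps that follow; the files pattern and the exact entry line are assumptions.

```yaml
# .pre-commit-config.yaml (sketch; the files pattern and entry are assumptions)
repos:
  - repo: local
    hooks:
      - id: jupyter-nb-clear-output
        name: jupyter-nb-clear-output
        files: \.ipynb$
        language: system
        entry: python -m nbconvert --ClearOutputPreprocessor.enabled=True --inplace
```

With language set to system, pre-commit does not build an isolated environment for the hook, so nbconvert has to be installed in the environment you commit from.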

3. Install the hook:

pre-commit install
>>>
pre-commit installed at .git\hooks\pre-commit

4. Run git status

git status
>>>
On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
.gitignore
.pre-commit-config.yaml
notebooks/

I’ve placed the notebook under the notebooks directory.

5. Track all the files

git add --all
>>>
warning: LF will be replaced by CRLF in .gitignore.
The file will have its original line endings in your working directory
warning: LF will be replaced by CRLF in notebooks/notebook.ipynb.
The file will have its original line endings in your working directory

6. Commit the changes

git commit -m "good commit"
>>>
jupyter-nb-clear-output..................................................Failed
- hook id: jupyter-nb-clear-output
- exit code: 1
C:\workspace-vscode\pre-commit-demo\venv\Scripts\python.EXE: No module named nbconvert

Our commit failed because the hook couldn’t find the nbconvert module.
We need to install it:

pip install nbconvert

And commit again:

git commit -m "good commit"
>>>
jupyter-nb-clear-output..................................................Failed
- hook id: jupyter-nb-clear-output
- files were modified by this hook
[NbConvertApp] Converting notebook notebooks/notebook.ipynb to notebook
[NbConvertApp] Writing 878 bytes to notebooks\notebook.ipynb

This time pre-commit reports that the hook modified the file: it cleared the outputs from the notebook.
The hook is still marked as Failed, and the commit is aborted, because the cleaned file has not been staged yet.
So we need to add the changes:

git add --all

And commit:

git commit -m "good commit"
>>>
jupyter-nb-clear-output..................................................Passed
[main (root-commit) 8560ba3] good commit
4 files changed, 379 insertions(+)
create mode 100644 .gitignore
create mode 100644 .pre-commit-config.yaml
create mode 100644 notebooks/notebook.ipynb
create mode 100644 requirements.txt

The second use case is keeping Python code quality in check. This can be divided into 2 domains:

Linting

Let’s create a new directory named scripts and add a file named somescript.py.

In this file, we will write some useless code 🙂
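As a stand-in, here is a hypothetical two-statement script; the name TOTAL and the computation are made up for illustration. Note that it has no module docstring, which pylint's default checks flag as C0114.

```python
# scripts/somescript.py (hypothetical contents, for illustration only)
TOTAL = sum(range(10))  # add up the integers 0 through 9
print(TOTAL)
```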

I’ve already written about the benefits of linting elsewhere; now it’s time to add linting to our workflow.

1. Run git status:

git status
>>>
On branch main
Untracked files:
(use "git add <file>..." to include in what will be committed)
scripts

2. Add a new hook to pre-commit config file:
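A sketch of what the added entry might look like, assuming pylint is run as a local hook from the active environment; everything apart from the hook id is an assumption.

```yaml
# .pre-commit-config.yaml (sketch; only the hook id is taken from the article)
repos:
  - repo: local
    hooks:
      # ... the jupyter-nb-clear-output hook stays here ...
      - id: pylint
        name: pylint
        entry: pylint
        language: system
        types: [python]
```

The types filter restricts the hook to Python files, so commits that touch no Python code skip it.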

We’ve added a new hook with the id pylint to our pre-commit config.

3. Install the new hook configuration:

pre-commit install
>>>
pre-commit installed at .git\hooks\pre-commit

4. Track changes:

git add --all

5. Commit

git commit -m "commit #2"
>>>
jupyter-nb-clear-output..............................(no files to check)Skipped
pylint...................................................................Failed
- hook id: pylint
- exit code: 16
PYLINTHOME is now 'C:\Users\user\AppData\Local\pylint\pylint\Cache' but obsolescent 'C:\Users\user\.pylint.d' is found; you can safely remove the latter
************* Module somescript
scripts\somescript.py:1:0: C0114: Missing module docstring (missing-module-docstring)
-----------------------------------
Your code has been rated at 5.00/10

The commit failed because pylint reported a missing module docstring.
Now let’s add a module docstring.
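As a sketch, assuming the script holds just two simple statements (the name TOTAL and the computation are hypothetical), the fix is a one-line docstring at the very top of the file:

```python
"""Print the sum of the integers 0 through 9."""

TOTAL = sum(range(10))  # add up the integers 0 through 9
print(TOTAL)
```

The docstring has to be the first statement in the module; a comment in the same position does not satisfy the check.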

Add the changes

git add --all

And commit:

git commit -m "commit #2"
jupyter-nb-clear-output..............................(no files to check)Skipped
pylint...................................................................Passed
[master b2d3105] commit #2
2 files changed, 8 insertions(+), 2 deletions(-)

Commit passed 🙂

Notice that the hook for Jupyter notebooks was skipped because the commit contained no notebook files.
This makes pre-commit very efficient: each hook runs only on the staged files it applies to.
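The skipping logic can be pictured with a simplified model: each hook receives only the staged files whose paths match its pattern, and a hook with no matching files is skipped. The sketch below is not pre-commit's actual implementation; files_for_hook, the staged list and both patterns are assumptions made for illustration.

```python
import re


def files_for_hook(staged_files, files_pattern):
    """Return the staged paths that a hook with this pattern would receive."""
    return [path for path in staged_files if re.search(files_pattern, path)]


# Hypothetical staging area for the second commit: no notebook was touched.
staged = ["scripts/somescript.py", ".pre-commit-config.yaml"]

notebook_files = files_for_hook(staged, r"\.ipynb$")
python_files = files_for_hook(staged, r"\.py$")

# A hook with nothing to check is skipped instead of being run.
for hook_id, files in [
    ("jupyter-nb-clear-output", notebook_files),
    ("pylint", python_files),
]:
    status = "run on " + ", ".join(files) if files else "skipped (no files to check)"
    print(hook_id + ": " + status)
```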

Formatting

Just like linting, we can trigger auto-formatting packages such as black.

We added the function foo to our original script, but foo is badly formatted.
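For illustration, a badly formatted foo could look like the sketch below. The body and argument names are hypothetical; only the function name comes from the article. The code is valid Python, just unpleasant to read.

```python
def foo(first,second,
   third = 3):
    """Return the sum of the three arguments."""
    return first+second  +third
```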

1. Add the hook configuration:
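A sketch of the added entry, assuming black is also run as a local hook from the active environment; everything apart from the hook id is an assumption.

```yaml
# .pre-commit-config.yaml (sketch; only the hook id is taken from the article)
repos:
  - repo: local
    hooks:
      # ... the jupyter-nb-clear-output and pylint hooks stay here ...
      - id: black
        name: black
        entry: black
        language: system
        types: [python]
```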

2. Install the new hook

pre-commit install
>>>
pre-commit installed at .git\hooks\pre-commit

3. Track the changes:

git add --all

4. Commit

git commit -m "add function"
>>>
jupyter-nb-clear-output..............................(no files to check)Skipped
pylint...................................................................Passed
black....................................................................Failed
- hook id: black
- exit code: 1
Executable `black` not found

Our commit failed because black was not found.
Let’s install it:

pip install black

And commit again:

git commit -m "add function"
>>>
jupyter-nb-clear-output..............................(no files to check)Skipped
pylint...................................................................Passed
black....................................................................Failed
- hook id: black
- files were modified by this hook
reformatted scripts\somescript.py
All done! ✨ 🍰 ✨
1 file reformatted.

File has been formatted!
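Assuming foo is a small function that adds three arguments (a hypothetical body; only the name comes from the article), the result in black's default style would look roughly like this:

```python
def foo(first, second, third=3):
    """Return the sum of the three arguments."""
    return first + second + third
```

Black changes only the layout: spacing around operators and commas is normalized and the signature is joined onto one line, while the behavior of the function stays the same.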

Much better!

Now let’s add the changes:

git add --all

And commit:

git commit -m "add function"
jupyter-nb-clear-output..............................(no files to check)Skipped
pylint...................................................................Passed
black....................................................................Passed
[master 4bd3217] add function
2 files changed, 12 insertions(+)

Pre-commit can help you quickly identify problems in your local clone of the code without going all the way to a CI/CD build or code review.

It allows you to effectively enforce coding standards and conventions across the organization and have a better-organized codebase.
