Centralizing Project Management Tasks with doit
At Fenris, we like to focus on fast and clean deployments. To achieve this, we use a plethora of tools to cover our needs for styling, testing, building, and publishing our code. For years, we’ve been using doit, a task management and automation tool for Python, to organize our use of these tools.
We had a problem, though: practically the same doit tasks were used in all of our projects, resulting in duplicated, and sometimes inconsistent, task definition code across dozens of repositories.
Last week, we came up with a solution. We decided to centralize our project management tasks and use that shared task library across our Python projects.
Overview
doit is a tool that allows us to very simply define and execute all the tasks we want to run upon pushing new code. As mentioned above, this includes linting, testing, security checks, packaging, publishing, and more.
Using doit gives us the benefits of optimizing processing by skipping already completed tasks, simplifying complicated command-line calls, and, most importantly, performing all of these tasks identically in our CI/CD and locally. We were already experiencing the benefits of automation, but wanted the benefits of task standardization that come from using a centralized library.
Context
There are a few things to know about the doit system before we jump into the code:

- By default, the list of tasks available within a given project is stored in a `dodo.py` file.
- The tasks in the `dodo.py` file are simple, often just containing an action, file/task dependencies, a task name, and targets.
- The `DOIT_CONFIG` constant specifies the default tasks to be run when `$ doit` is run from the command line, so each individual project can specify its own `DOIT_CONFIG` based on its task needs.
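To make these pieces concrete, here is a minimal, self-contained `dodo.py` sketch. The task and file names here are illustrative, not from our shared library:

```python
# dodo.py -- a minimal illustrative example
DOIT_CONFIG = {"default_tasks": ["hello"]}


def task_hello() -> dict:
    """Write a greeting to a target file."""
    return {
        "actions": ["echo hello > hello.txt"],
        "targets": ["hello.txt"],
    }
```

With this file in place, running `$ doit` executes the `hello` task, and doit can skip it on later runs once `hello.txt` exists.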
The Solutions
Problem 1: Duplicated Code
The first problem that we needed to tackle was the fact that all of our doit tasks were copy-pasted across projects with some common elements changed, leading to a lot of duplication of code. To solve this issue, we refactored the shared tasks and moved them to a common package.
We defined each of our tasks within our shared `fenris_doit` repository, in a `tasks` folder. Here’s what an example task looks like:
```python
def task_black(dodo: str, repo: str, tests: Optional[str] = None) -> Generator:
    """Check standardized code formatting."""
    for location in remove_none([dodo, repo, tests]):
        yield {
            "name": location,
            "actions": [f"black -l100 --check {location}"],
            "file_dep": list_files(location),
            "task_dep": ["python_dependencies"],
        }
```
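The `remove_none` and `list_files` helpers used by this task aren’t shown in the post; a plausible sketch of their behavior (our assumption, not the actual `fenris_doit` implementation) might look like:

```python
from pathlib import Path
from typing import List


def remove_none(items: list) -> list:
    """Drop None entries, e.g. when a project has no tests directory."""
    return [x for x in items if x is not None]


def list_files(location: str) -> List[str]:
    """Collect file dependencies: the file itself, or all .py files under a directory."""
    path = Path(location)
    if path.is_file():
        return [str(path)]
    return [str(p) for p in path.rglob("*.py")]
```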
Problem 2: Project-Specific Task Configuration
The second problem on our list was that there are a variety of things that each project might need to customize for its list of tasks. We’ve broken down the modifications that need to be accounted for below. After outlining these issues, we share the code used to solve them.
1. Setting Task Configuration Parameters
Each of the task functions needs a few arguments supplied, either by default values or by CLI arguments passed in. We didn’t want a lot of repeated code across repos, so applying `partial` to every function in every project wasn’t an option.
For example, this black task takes a few arguments: the name of the file with the doit specifications (`dodo`), and the names of the repository and the tests directory to run the `black` command on (`repo` and `tests`). We needed to enable each project to import specified tasks while also autofilling those parameters so that each task runs with the proper configuration for a given project.
Continuing with this example: if I’m working in a project called `my_library`, I want to make sure that when I run `$ doit black`, black operates on the `my_library` folder, as well as my `tests` folder and my `dodo.py` file.
So, we developed a solution allowing for the import of various tasks, autofilling parameters with those specified in the `import_tasks` function call.
2. Defining Which Tasks to Import / Use
Another challenge we tackled was that each project might want to import only a specific subset of the available shared tasks from our tasks repository. We only want to import the tasks specified in the project’s `dodo.py` file.
3. Overriding Existing Tasks
Additionally, we sought to support projects that want to override the logic of a task already defined within the shared tasks repository. For example, the shared `black` task might enforce a line length of 100, whereas a given project might need to be stricter with 88. Thus, we wanted to ensure that any of our shared tasks could be overridden within a given project.
4. Defining Custom Tasks
Finally, we wanted to ensure that custom tasks could be defined within a project’s `dodo.py` file. Projects may have unique needs that are best tackled by a custom task defined only within that project’s scope.
The Code:
First, here is the `import_tasks` logic that we defined to support the above needs:
```python
"""Tools for importing shared doit tasks."""
import importlib
import inspect
from functools import partial
from typing import Any, Iterable, List, Optional

DOIT_MODULE_NAME = "fenris_doit.tasks"


def import_tasks(
    globals_: dict,
    tasks: List[str],
    repo: str,
    tests: Optional[str] = "tests",
    dodo: Optional[str] = "dodo.py",
    requirements_file: Optional[str] = "requirements-dev.txt",
    internal_deps: Iterable[str] = tuple(),
    **kwargs: Any,
) -> None:
    """Import doit tasks and update task args."""
    _import_and_apply_params(
        globals_=globals_,
        tasks=tasks,
        repo=repo,
        tests=tests,
        dodo=dodo,
        requirements_file=requirements_file,
        internal_deps=internal_deps,
        **kwargs,
    )


def _import_and_apply_params(globals_: dict, tasks: List[str], **kwargs: Any) -> None:
    """Import doit tasks and apply replacement params."""
    module = importlib.import_module(DOIT_MODULE_NAME)
    imported_task_names = [x for x in list(module.__dict__) if x.replace("task_", "") in tasks]
    # additional tasks defined in dodo.py file
    custom_task_names = [x for x in globals_ if x.replace("task_", "") in tasks]
    globals_.update(
        {k: _update_if_callable(getattr(module, k), **kwargs) for k in imported_task_names}
    )
    globals_.update({k: _update_if_callable(globals_[k], **kwargs) for k in custom_task_names})


def _update_if_callable(maybe_func: Any, **kwargs: Any) -> Any:
    if callable(maybe_func) and hasattr(maybe_func, "__name__"):
        all_args = inspect.getfullargspec(maybe_func)[0]
        to_apply = {k: v for k, v in kwargs.items() if k in all_args}

        # Cannot return partial directly, as it doesn't have a __name__ that doit picks up
        def new_func(*args: Any, **new_func_kwargs: Any) -> Any:
            try:
                return partial(maybe_func, **to_apply)(*args, **new_func_kwargs)
            except TypeError as err:
                print(f"Doit config error: {err}")
                exit(1)

        # have to keep the __name__ the same as the original task name for doit
        new_func.__name__ = maybe_func.__name__
        return new_func
    else:
        return maybe_func
```
A couple of key notes:

- The wrapper function `import_tasks` exists to supply default values for parameters such as the test directory name and the dodo file name.
- In `_import_and_apply_params`, `custom_task_names` includes any tasks that may have been defined in a project’s dodo file. This allows a project not only to import from our shared repository of tasks, but also to define any custom tasks relevant to the project in question. An example of this is shown in the `dodo.py` code below.
- The two `globals_.update` calls are responsible for replacing the task parameters with the values provided in the `import_tasks` call.
- We utilize the `_update_if_callable` helper function to make sure that we’re only changing the signature of functions (in our case, tasks are defined as functions).
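The argument filtering inside `_update_if_callable` is easy to demonstrate in isolation: `inspect.getfullargspec` tells us which of the supplied kwargs a task actually accepts. The stand-in task and parameter values below are illustrative:

```python
import inspect


def task_black(dodo: str, repo: str, tests: str = None) -> None:
    """Stand-in task with the same signature shape as our shared tasks."""


# kwargs a project might pass through import_tasks
params = {"repo": "my_library", "dodo": "dodo.py", "requirements_file": "reqs.txt"}

# keep only the kwargs that appear in the task's signature
all_args = inspect.getfullargspec(task_black)[0]
to_apply = {k: v for k, v in params.items() if k in all_args}
print(to_apply)  # {'repo': 'my_library', 'dodo': 'dodo.py'}
```

Note that `requirements_file` is silently dropped, since `task_black` has no such parameter; this is what lets one `import_tasks` call configure many tasks with different signatures.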
Next, we’re sharing an example of a `dodo.py` file found in one of our projects:
```python
"""Doit logic."""
from typing import Generator, Optional

from fenris_doit import import_tasks

DOIT_CONFIG = {
    "default_tasks": [
        "black",
        "pytest",
        "custom_job",
        ...
    ],
    "cleanforget": True,
    "verbosity": 0,
}


def task_black(dodo: str, repo: str, tests: Optional[str] = None) -> Generator:
    """Check standardized code formatting, overriding existing task and using -l88."""
    for location in remove_none([dodo, repo, tests]):
        yield {
            "name": location,
            "actions": [f"black -l88 --check {location}"],
            "file_dep": list_files(location),
            "task_dep": ["python_dependencies"],
        }


def task_custom_job(repo: str) -> Generator:
    """Run a project-specific custom task."""
    yield {
        "name": f"custom_job: {repo}",
        "actions": ['echo "custom task commencing..."'],
        "file_dep": list_files(repo),
    }


import_tasks(
    globals_=globals(),
    tasks=DOIT_CONFIG["default_tasks"],
    repo="<< project_name >>",
)
```
A few more notes:

- The `task_black` function defines a `black` task that overrides the one from the shared repository. In this case, we’re using a different line length setting within this project.
- You’ll notice that we define a `custom_job`, a task that’s not defined in our centralized tasks repository. Because of the custom task support we implemented in the `import_tasks` logic, this task will also be configured with the provided `repo` argument.
Finally, applying the correct values upon import of a given function is a bit challenging, so we’ll touch on a few of the specifics here.
- In the `import_tasks` logic seen above, we use `inspect.getfullargspec` to figure out which arguments for a given function need to be specified.
- We use `partial` to create a new function (called `task_black`, for example) with arguments that match what doit requires. Using `partial` once, centrally, prevents us from scattering tons of `partial` calls within each project’s `dodo.py` file.
- In the example `dodo.py` file, we pass the `globals()` dictionary into `import_tasks` as an argument so that it can be updated with the new tasks created via `partial`. Thus, these new tasks, with the desired arguments applied / specified, can be used by an individual project.
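The `__name__` caveat mentioned in the code comments is easy to demonstrate: a bare `functools.partial` object carries no `__name__` attribute, so doit cannot derive a task name from it. That is why `_update_if_callable` wraps the partial in a real function and copies the name over. A quick check (with an illustrative stand-in task):

```python
from functools import partial


def task_black(dodo: str, repo: str) -> None:
    """Stand-in task function."""


filled = partial(task_black, dodo="dodo.py", repo="my_library")
print(hasattr(filled, "__name__"))  # False -- the partial has no name for doit to use
```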
You can see here that we’ve met our various needs. We’re able to specify project-specific task configuration parameters, add our own custom tasks, limit which tasks we are importing from the shared repository, and override existing tasks while still using other tasks from the shared repository. =)
A Sample Run
Then, when we run `doit` from the terminal within said project, we see the following:
```
(<< project_name >>) $ doit
.  python_dependencies:requirements-dev.txt
-- flake8:dodo.py
.  flake8:<< project_name >>
-- flake8:tests
-- black:dodo.py
.  black:<< project_name >>
-- black:tests
-- pytest:pytest
```
Tasks marked with `--` have already been executed (and thus were skipped), whereas tasks marked with `.` were executed this go-around.
Final Words
Our centralizing of these project management tasks has helped us clean up our Python projects and standardize our deployments even further. Hopefully, the above guidelines can help you and your team do the same!
Additional Resources
- doit tasks documentation