Python Package Layout for Internal Automation Modules
Most internal automation repositories fail the same way: they begin as scripts, become shared infrastructure, and keep the filesystem shape of a weekend utility long after production systems depend on them.
Situation
Internal automation usually starts close to the work. A release engineer writes a Python script to tag builds. A platform team adds a helper to rotate service credentials. A data infrastructure team creates a backfill runner. The first version lives in scripts/, imports a few local files, and gets called from a laptop or a CI job.
That is reasonable at the beginning. The problem is that internal automation does not stay small if it works. The useful script becomes a module. The module becomes a library. The library gets imported by deployment jobs, migration tooling, incident runbooks, scheduled workflows, and other teams’ glue code.
At that point, package layout stops being an aesthetic preference. It becomes an operational control.
A good layout answers basic questions before production asks them under pressure: what is importable, what is executable, what is test-only, what owns configuration, and what is safe for another repository to depend on?
The Problem
The common failure mode is a flat repository where everything can import everything.
    repo/
        deploy.py
        rotate_keys.py
        aws.py
        slack.py
        utils.py
        test_deploy.py
This works until the repository has multiple entry points, multiple owners, and multiple execution environments. Then import behavior starts depending on the current working directory. CI can pass while the packaged artifact fails. A helper named logging.py shadows the standard library. Tests import source files that would not exist in the installed package. One workflow mutates global configuration and another workflow inherits it accidentally.
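The shadowing case comes down to search order: Python resolves an import against its path entries in sequence, and running a script puts the script's directory first. A minimal sketch, using a temporary directory as a stand-in for the flat repository (the `resolve_origin` helper is hypothetical, written only for this illustration):

```python
import sys
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def resolve_origin(module_name, search_path):
    """Return the file an import of module_name would load for this search path."""
    spec = PathFinder.find_spec(module_name, list(search_path))
    return None if spec is None else spec.origin


with tempfile.TemporaryDirectory() as repo_root:
    # A helper file that happens to share its name with a stdlib module.
    (Path(repo_root) / "logging.py").write_text("MARKER = 'local helper'\n")

    # Running a script from the flat repo puts the repo root first on the path.
    flat_origin = resolve_origin("logging", [repo_root, *sys.path])
    # Without the repo root in front, the standard library copy is found.
    stdlib_origin = resolve_origin("logging", sys.path)

    # True when the local helper won the lookup.
    shadowed = Path(flat_origin).parent.samefile(repo_root)
```

The same lookup with the repository root removed from the front of the path resolves to the standard library, which is roughly the difference between running from the source tree and running an installed package.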
The real complication is that automation code usually runs with elevated permissions. A package layout mistake is not just a developer inconvenience. It can turn into a bad deploy, a partial rollback, an over-broad cloud permission, or a broken incident tool.
The question is not “where should the files go?”
The question is: how do we make internal automation importable, testable, executable, and boring across laptops, CI, and production runners?
The Answer Is a Package Boundary
Use a src layout, expose explicit command entry points, keep workflow orchestration thin, and treat provider clients as replaceable adapters.
    repo/
        pyproject.toml
        README.md
        src/
            internal_automation/
                __init__.py
                cli.py
                config.py
                workflows/
                    deploy.py
                    rotate_credentials.py
                providers/
                    cloud.py
                    git.py
                    chat.py
                domain/
                    releases.py
                    credentials.py
        tests/
            unit/
            integration/
The package name should be boring and specific. Avoid utils, common, or scripts as the primary namespace. Internal users should be able to understand the import boundary from the first line:
from internal_automation.workflows.deploy import run_deploy
The src layout matters because it forces tests and local commands to behave more like installed code. Without it, Python can accidentally import directly from the repository root, masking packaging errors until the code runs somewhere else. The Python Packaging User Guide documents the src layout as a way to avoid accidental imports from the working tree and make installed behavior more representative.
The package should separate four concerns.
First, cli.py owns argument parsing and exit codes. It should not contain cloud logic, deployment rules, or business policy.
Second, workflows/ owns orchestration. These modules answer “what steps happen in what order?” They compose domain logic and provider adapters, but should stay readable enough for an incident review.
Third, domain/ owns decisions. Release eligibility, credential rotation rules, environment promotion policy, and validation logic belong here. This code should be easy to unit test without cloud credentials.
Fourth, providers/ owns side effects. Cloud APIs, Git hosts, ticketing systems, chat systems, secret managers, and artifact stores should sit behind small interfaces. These modules are allowed to know SDK details. The rest of the package should not.
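The split between decisions, orchestration, and side effects can be sketched in one file. This is an illustration under stated assumptions, not a prescribed implementation: the `Release` fields, the eligibility rule, the `Deployer` interface, and the `run_deploy` signature are all hypothetical, and the comments mark which module each piece would live in.

```python
from dataclasses import dataclass
from typing import Protocol


# domain/releases.py: pure decisions, no SDK imports
@dataclass(frozen=True)
class Release:
    version: str
    checks_passed: bool
    target_env: str


def is_eligible(release: Release) -> bool:
    """Hypothetical rule: production only accepts releases with passing checks."""
    return release.checks_passed or release.target_env != "production"


# providers/cloud.py: a narrow capability, named after what it does
class Deployer(Protocol):
    def deploy(self, version: str, environment: str) -> None: ...


# workflows/deploy.py: the order of steps and nothing else
def run_deploy(release: Release, deployer: Deployer) -> bool:
    if not is_eligible(release):
        return False
    deployer.deploy(release.version, release.target_env)
    return True


# tests/unit: an in-memory fake stands in for the real adapter
class RecordingDeployer:
    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def deploy(self, version: str, environment: str) -> None:
        self.calls.append((version, environment))


fake = RecordingDeployer()
ok = run_deploy(Release("1.4.2", checks_passed=True, target_env="production"), fake)
blocked = run_deploy(Release("1.4.3", checks_passed=False, target_env="production"), fake)
```

Because the workflow only depends on the `Deployer` shape, the unit test needs no cloud credentials, and a real adapter can replace the fake without touching the domain rule.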
flowchart TD
A[ci job — invokes command] --> B[cli — parse arguments]
B --> C[workflow — coordinate steps]
C --> D[domain — make decisions]
C --> E[providers — external systems]
D --> F[tests — fast unit coverage]
E --> G[integration tests — real contracts]
C --> H[logs — operational trace]
The key is that direction matters. The CLI calls workflows. Workflows call domain logic and providers. Domain logic should not import the CLI. Providers should not reach back into workflow state. Tests should be able to exercise the domain without constructing a fake CI environment.
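Import direction can be checked mechanically rather than by review alone. The sketch below is one possible approach, assuming a hypothetical `forbidden_imports` helper that a unit test could run over each file in `domain/`; it only inspects absolute imports and ignores relative ones.

```python
import ast


def forbidden_imports(source: str, forbidden_prefixes: tuple[str, ...]) -> list[str]:
    """Return absolute imports in source that fall under a forbidden package prefix."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue  # not an import, or a relative import this sketch skips
        found.extend(
            name
            for name in names
            if any(name == p or name.startswith(p + ".") for p in forbidden_prefixes)
        )
    return found


# Example input: a domain module that reaches back into the CLI layer.
domain_source = "from internal_automation.cli import deploy\nimport json\n"
violations = forbidden_imports(
    domain_source,
    ("internal_automation.cli", "internal_automation.workflows"),
)
```

A check like this fails the build when a domain module starts depending on the CLI or a workflow, which keeps the dependency arrows pointing one way.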
In Practice
Context: In standard Python packaging, pyproject.toml declares build metadata, dependencies, and console scripts. Tools such as pip, build, and modern Python build backends use this metadata to install the project as a package rather than treating the repository as an arbitrary folder.
Action: Define console scripts in pyproject.toml instead of asking CI to run python scripts/deploy.py.
[project.scripts]
internal-deploy = "internal_automation.cli:deploy"
rotate-credentials = "internal_automation.cli:rotate_credentials"
Result: The command that runs in CI is the command that an engineer can run locally after installation. Import errors are found at package boundaries rather than hidden by the repository root.
Learning: Internal automation should be installed before it is trusted. A CI job that runs from the source tree alone is not exercising the same contract as a packaged command.
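A function referenced from [project.scripts] is called with no arguments, and its return value becomes the process exit code. The following is a minimal sketch of what `internal_automation.cli:deploy` might look like; the flags and the stand-in `run_deploy` are assumptions for illustration, and in the real package the workflow would be imported from `workflows/` rather than defined here.

```python
import argparse
from typing import Optional, Sequence


def run_deploy(environment: str, version: str) -> bool:
    """Stand-in for the workflow function; the signature is assumed."""
    return bool(environment and version)


def deploy(argv: Optional[Sequence[str]] = None) -> int:
    """Console-script target: parse arguments, call the workflow, return an exit code."""
    parser = argparse.ArgumentParser(prog="internal-deploy")
    parser.add_argument("--environment", required=True, choices=["staging", "production"])
    parser.add_argument("--version", required=True)
    args = parser.parse_args(argv)  # argv=None makes argparse read sys.argv[1:]

    succeeded = run_deploy(args.environment, args.version)
    return 0 if succeeded else 1
```

Accepting an optional `argv` keeps the function callable from tests with an explicit argument list, while the installed command still reads the real command line.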
Context: pytest commonly discovers tests from a separate tests/ tree. With a src layout, the package is not importable from the repository root by accident, so tests run against the installed package instead of silently importing adjacent source files.
Action: Configure test execution to install the package in editable mode during development and as a normal package in CI build verification.
Result: Tests catch missing package data, incorrect dependencies, and import paths that only work because the developer happened to run from the project root.
Learning: A passing test suite is more meaningful when it tests the artifact shape, not just the file tree.
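One way to test the artifact shape is to ask the environment which console scripts are actually registered. This is a sketch, assuming a hypothetical `missing_console_scripts` helper; it relies on `importlib.metadata` from the standard library and handles both the newer selectable return value and the older mapping-style one.

```python
from importlib.metadata import entry_points


def missing_console_scripts(expected):
    """Return the expected command names that are not registered in this environment."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10 and later
        scripts = eps.select(group="console_scripts")
    else:  # older mapping-style return value
        scripts = eps.get("console_scripts", [])
    installed = {ep.name for ep in scripts}
    return set(expected) - installed
```

After the package is installed in CI, a test could assert that `missing_console_scripts({"internal-deploy", "rotate-credentials"})` is empty, which fails if the entry points were dropped from pyproject.toml or the install step was skipped.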
Context: GitHub Actions, GitLab CI, Buildkite, and similar CI systems all execute automation from checked-out repositories, but their working directories, environment variables, secret injection models, and shell behavior differ.
Action: Put CI-specific environment parsing at the edge of the package. Convert environment variables into a typed configuration object in config.py, then pass that object into workflows.
Result: The workflow code can be tested with explicit inputs. CI migration becomes less invasive because the provider-specific details are isolated.
Learning: Environment variables are an integration format, not an internal architecture.
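A typed configuration object for config.py might look like the sketch below. The variable names (`DEPLOY_ENVIRONMENT`, `DEPLOY_DRY_RUN`, `DEPLOY_TIMEOUT_SECONDS`), the fields, and the defaults are hypothetical; the point is that parsing happens once, at the edge, and everything downstream receives plain typed values.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DeployConfig:
    environment: str
    dry_run: bool
    timeout_seconds: int

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "DeployConfig":
        """Parse once at the edge; the variable names here are hypothetical."""
        try:
            environment = env["DEPLOY_ENVIRONMENT"]
        except KeyError:
            raise ValueError("DEPLOY_ENVIRONMENT is required") from None
        return cls(
            environment=environment,
            dry_run=env.get("DEPLOY_DRY_RUN", "false").lower() in {"1", "true", "yes"},
            timeout_seconds=int(env.get("DEPLOY_TIMEOUT_SECONDS", "300")),
        )


# Tests pass an explicit mapping instead of mutating the process environment.
config = DeployConfig.from_env({"DEPLOY_ENVIRONMENT": "staging", "DEPLOY_DRY_RUN": "1"})
```

Taking the mapping as a parameter means a workflow test never has to patch `os.environ`, and a move between CI systems only changes how that mapping is populated.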
Where It Breaks
| Failure mode | Why it happens | Mitigation |
|---|---|---|
| src layout feels heavy for one script | The repository has not yet crossed the reuse threshold | Keep a single module, but still package it once CI depends on it |
| Too many tiny modules | Engineers split files by noun before behavior is stable | Start with cli, config, workflows, domain, and providers; split later |
| Provider adapters become dumping grounds | External SDK calls mix with workflow policy | Keep provider methods narrow and named after capabilities |
| Tests mock everything | The package boundary is clean, but real API contracts drift | Add focused integration tests for provider behavior |
| CLI becomes the application | Argument parsing accumulates business rules | Move decisions into domain and orchestration into workflows |
| Shared automation becomes a platform dependency | Other teams import internals directly | Document supported imports and treat everything else as private |
The layout is not a substitute for ownership. If five teams depend on an internal automation package, the package needs release notes, versioning discipline, and a deprecation path. A clean directory tree will not save an unstable API.
But layout does change the default behavior. It makes the correct path easier than the accidental path.
What to Do Next
- Problem: Your automation repository is still shaped like a script folder even though CI, deploys, or incident workflows depend on it.
- Solution: Move to a src package layout with explicit console scripts, thin CLI modules, workflow orchestration, domain logic, and provider adapters.
- Proof: Verify by installing the package in CI, running commands through entry points, executing unit tests against domain logic, and reserving integration tests for external system contracts.
- Action: Pick one production automation command, package it end to end, and make the CI job call the installed console script instead of a path inside the repository.