Generator
A Generator is a generic plugin that queries data and creates new nodes and relationships based on the result.
- Within your schema you could create an abstract service object that, through a Generator, creates other nodes.
- Want to read how Generators can be used to create a service catalog? See our blog post on How to Turn Your Source of Truth into a Service Factory.
High level design
Generators are defined as a Generator definition within an .infrahub.yml file. A Generator definition consists of a number of related objects.
- Group of targets - Objects that the Generator will act upon
- Generator class - Python code that defines the generation logic
- GraphQL Query - Data collection specification
Running a Generator definition will create new nodes as defined by the Generator, or remove old ones that are no longer required. The removal of obsolete objects is handled by the SDK tracking feature.
The targets point to a group consisting of the objects that the Generator acts upon. The members of this group can be any type of object within your schema: service objects, devices, contracts, or anything else you want the Generator to act upon. Generator groups (CoreGeneratorGroup) serve as target collections that define which objects trigger Generator execution, while the actual tracking of generated objects is handled by individual Generator instances.
The GraphQL query defines the data that will be collected when running the Generator. Any object identified in this step is added as a member to a GraphQL query group (CoreGraphQLQueryGroup). The membership in these groups is then used to determine which Generators need to be executed as part of a proposed change during the pipeline run.
The Generator itself is a Python class based on the InfrahubGenerator class from the SDK. Just like Transformations and Checks, Generators are user defined.
Generators can be executed in several ways, depending on your workflow and where you are in the lifecycle (local development vs. in Infrahub):
- **During development with infrahubctl**: use the `infrahubctl generator` command to iterate locally while building and testing your Generator.
- **Manually from the UI**: from the Infrahub UI, open the Generator Definition detail page (Actions > Generator Definitions) and click Run to trigger the Generator on demand.
- **Automatically via Proposed Changes**: when you open a Proposed Change that affects the Generator's targets, the Generator runs as part of Infrahub's CI checks. Review the results in the Checks and Data tabs of the Proposed Change. This behavior can also be disabled per Generator in the repository configuration file.
- **Automatically via Events and Actions**: you can configure Infrahub event rules and actions to trigger Generators automatically based on changes in your data. This enables fully automated execution aligned with your workflows.
Per-target execution model
Infrahub does not run a Generator once for the entire target group. Instead, it creates one independent run per member of the target group.
When you trigger a Generator definition, Infrahub:
- Fetches the target group and enumerates its members.
- For each member, extracts scoped variables from the target object using the `parameters` mapping.
- Creates an independent Generator run for that member, passing the scoped variables to the GraphQL query.
A Generator definition targeting a group with 10 members produces 10 separate runs. Each run sees only the data relevant to its specific target object.
```
Generator Definition
        │
        ▼
   Target Group
     ├── Member A → Run A (variables from A)
     ├── Member B → Run B (variables from B)
     └── Member C → Run C (variables from C)
```
Each run is fully independent — it has its own query variables, its own query results, and its own Generator instance. Runs do not share state.
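The fan-out described above can be sketched in plain Python. This is an illustrative model only, not the SDK's API; `extract`, `build_runs`, and the dictionary shape of a group member are assumptions for the example:

```python
def extract(obj: dict, path: str):
    """Follow a double-underscore path (e.g. 'name__value') through nested node data."""
    for segment in path.split("__"):
        obj = obj[segment]
    return obj

def build_runs(members: list[dict], parameters: dict[str, str]) -> list[dict]:
    """Create one independent run per target group member,
    each with its own scoped query variables."""
    return [
        {
            "target": member["id"],
            "variables": {var: extract(member, path) for var, path in parameters.items()},
        }
        for member in members
    ]

members = [
    {"id": "A", "name": {"value": "widget1"}},
    {"id": "B", "name": {"value": "widget2"}},
]
runs = build_runs(members, {"name": "name__value"})
# Two members produce two runs, each scoped to its own target's values.
```

Each entry in `runs` carries only the variables extracted from its own target, which is what keeps the runs independent.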
Query parameter mapping
The parameters field in .infrahub.yml controls how Infrahub extracts variables from each target object and passes them to the GraphQL query. This is the mechanism that scopes each run to its target.
How it works
Given this Generator definition:
```yaml
generator_definitions:
  - name: widget_generator
    file_path: "generators/widget_generator.py"
    targets: widgets
    query: widget_query
    class_name: WidgetGenerator
    parameters:
      name: "name__value"
```
And this GraphQL query:
```graphql
query Widgets($name: String!) {
  TestWidget(name__value: $name) {
    edges {
      node {
        name { value }
        count { value }
      }
    }
  }
}
```
For each member of the widgets group, Infrahub:
- Reads the parameter mapping: `name` → `"name__value"`
- Extracts the value from the target object using the defined path
- Passes it as a query variable
For example:
| Target object | Extraction path | Extracted value | Query variable |
|---|---|---|---|
| widget1 | widget1.name.value | "widget1" | $name = "widget1" |
| widget2 | widget2.name.value | "widget2" | $name = "widget2" |
Each run's GraphQL query only returns data for its specific target, keeping runs independent.
Double-underscore notation
The double-underscore (`__`) in parameter values traverses the object hierarchy:
- `name__value`: attribute `name`, property `value`
- `location__name__value`: relationship `location` (cardinality-one), then attribute `name`, property `value`
The first segment is checked against the object's schema. If it matches an attribute, the remaining segments traverse the attribute's properties. If it matches a cardinality-one relationship, Infrahub fetches the related node and continues the traversal recursively.
Only cardinality-one relationships are supported in parameter paths. Cardinality-many relationships cannot be traversed this way.
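As a simplified model of this traversal (the real implementation validates each segment against the schema and fetches related nodes; here, already-resolved relationships are represented as nested dicts), the path walk could look like:

```python
def traverse(node: dict, path: str):
    """Walk a double-underscore parameter path through nested node data.

    Cardinality-one relationships appear as nested dicts; a list marks a
    cardinality-many relationship, which cannot be traversed.
    """
    current = node
    for segment in path.split("__"):
        if isinstance(current, list):
            raise ValueError(f"cannot traverse cardinality-many relationship at {segment!r}")
        current = current[segment]
    return current

device = {
    "name": {"value": "leaf1"},
    "location": {"name": {"value": "dc1"}},       # cardinality-one relationship
    "interfaces": [{"name": {"value": "eth0"}}],  # cardinality-many relationship
}
# traverse(device, "name__value") resolves the attribute path;
# traverse(device, "location__name__value") crosses the cardinality-one
# relationship; a path through "interfaces" raises, mirroring the restriction.
```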
Parallel execution
Because each run is independent — scoped to one target object with no shared state — Infrahub dispatches all runs for a Generator definition concurrently.
This means:
- All members of a target group are processed in parallel, not sequentially.
- Performance scales with available workers, not with target count. A group with 100 members doesn't take 100x longer than a group with 1 member.
- Different Generator definitions can also run concurrently when triggered independently.
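Conceptually, the concurrent dispatch resembles scheduling one coroutine per target and awaiting them together. This is a sketch of the scheduling pattern, not Infrahub's internal code; `run_generator_for_target` and `dispatch` are illustrative names:

```python
import asyncio

async def run_generator_for_target(target: str) -> str:
    # Stand-in for one independent per-target Generator run
    await asyncio.sleep(0)
    return f"generated objects for {target}"

async def dispatch(targets: list[str]) -> list[str]:
    # All per-target runs are scheduled at once and awaited together,
    # so total time tracks the slowest run rather than the sum of all runs
    return await asyncio.gather(*(run_generator_for_target(t) for t in targets))

results = asyncio.run(dispatch(["rack-1", "rack-2", "rack-3"]))
```

`asyncio.gather` preserves the input order in its results while letting the runs overlap in time, which matches the model of independent, order-free runs described above.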
What this means for Generator design
Because runs are concurrent:
- Your Generator code should not depend on side effects from other runs of the same Generator.
- Each run should be self-contained — it reads its scoped data, creates its objects, and finishes.
- If you need ordering (layer A must complete before layer B starts), use separate Generator definitions with a trigger mechanism rather than relying on execution order within a single definition. See modular Generators for this pattern.
Generator instances
Each per-target run creates or updates a CoreGeneratorInstance — a tracking object that links three things together:
- The Generator definition that was run
- The target object (the specific group member)
- The status of that run (`pending`, `ready`, or `error`)
Generator instances enable:
- Per-target status tracking: you can see which targets succeeded and which failed, without needing to inspect logs.
- Selective re-runs: you can re-run the Generator for a single target object without affecting others. Only the instance for that target gets updated.
- Object lifecycle management: the instance links the Generator to the objects it created, enabling cleanup when a target is removed.
You can view Generator instances in the Infrahub UI under the Generator Definition detail page.
Designing groups for parallelism
Since group structure determines execution structure, how you organize your target groups directly affects parallelism and operational flexibility.
The principle
Group at the level where you want independent execution. If racks should generate independently, make racks the target — not pods. If entire sites should generate as a unit, make sites the target.
More members in the target group means more parallel runs and better utilization of available workers.
Example: modular parallelism
In a modular Generator setup (see modular Generators), parallelism increases at each layer:
| Layer | Target group | Members | Parallel runs |
|---|---|---|---|
| Fabric | dc_fabrics | 1 fabric | 1 |
| Pod | dc_pods | 4-8 pods | 4-8 |
| Rack | dc_racks | 32+ racks | 32+ |
The fabric Generator runs once (1 target). It creates pod objects. The pod Generator runs 4-8 times concurrently. Each pod Generator creates rack objects. The rack Generator runs 32+ times concurrently. Total parallelism is multiplicative across layers.
Guideline: prefer more, smaller targets
A single Generator targeting a group with one member that creates 100 objects runs as one sequential operation. The same work split across 10 targets with 10 objects each runs as 10 concurrent operations — significantly faster on a system with available capacity.
The relationship between member_of_groups and Generator targeting
Objects become Generator targets by being members of the group specified in the Generator definition's targets field.
How objects become targets
You add objects to a target group through the member_of_groups relationship, which can be set:
- In the UI: when creating or editing an object (see organizing objects with groups)
- In object files: define group membership in your YAML object definitions
- Programmatically: via the SDK or GraphQL mutations
Dynamic targeting
Adding or removing group members changes what gets targeted on the next Generator run:
- Add an object to the group: it becomes a target and gets its own Generator run next time the definition executes
- Remove an object from the group: it is no longer targeted (existing generated objects are not automatically cleaned up — see known limitation #3289)
Standard groups vs. Generator groups
These two group types serve different purposes and are often confused:
| Group type | Purpose | You manage it |
|---|---|---|
| `CoreStandardGroup` | Defines which objects are targeted by a Generator. Listed in the `targets` field of the Generator definition. | Yes — you create it and control membership |
| `CoreGeneratorGroup` | Tracks which objects were created by a Generator instance. Managed automatically by the SDK tracking feature. | No — Infrahub manages this automatically |
The target group (CoreStandardGroup) is an input to the Generator — "run against these objects." The Generator group (CoreGeneratorGroup) is an output — "these objects were created by this Generator."
Query response modes
The convert_query_response flag in .infrahub.yml controls how the GraphQL query results are delivered to your generate() method. This affects how you access data and what SDK features are available.
Raw dict mode (default)
When convert_query_response is false (the default), the data parameter passed to generate() is the raw GraphQL response dictionary:
```python
class DeviceGenerator(InfrahubGenerator):
    async def generate(self, data: dict) -> None:
        for edge in data["TestDevice"]["edges"]:
            device_name = edge["node"]["name"]["value"]
            device_role = edge["node"]["role"]["value"]
            # ... create objects using self.client
```
This is the more lightweight mode — no conversion overhead, and you work with plain Python dictionaries. You can also use Pydantic models to parse the response into typed objects for better IDE support and validation.
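For example, a lightweight typed layer over the raw response could look like the following. This sketch uses stdlib dataclasses rather than Pydantic to stay dependency-free; the `Device` shape is hypothetical and mirrors the query structure shown above:

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    role: str

def parse_devices(data: dict) -> list[Device]:
    """Parse the raw GraphQL response dict into typed objects."""
    return [
        Device(
            name=edge["node"]["name"]["value"],
            role=edge["node"]["role"]["value"],
        )
        for edge in data["TestDevice"]["edges"]
    ]

sample = {
    "TestDevice": {
        "edges": [
            {"node": {"name": {"value": "leaf1"}, "role": {"value": "leaf"}}},
        ]
    }
}
devices = parse_devices(sample)
```

Once parsed, the rest of the `generate()` method can work with `devices[0].name` instead of nested dictionary lookups, with IDE completion and type checking.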
SDK object mode
When convert_query_response is true, Infrahub converts the GraphQL response into InfrahubNode SDK objects. These are available via self.nodes and self.store. You do not use the data parameter in this mode:
```python
class DeviceGenerator(InfrahubGenerator):
    async def generate(self, data: dict) -> None:
        for device in self.nodes:
            device_name = device.name.value
            device_role = device.role.value
            # SDK features available: relationships, .save(), .delete()
```
When to use each
| Use raw dict mode (`false`) when | Use SDK object mode (`true`) when |
|---|---|
| Dict access is sufficient for your use case | You need SDK features like .save(), .delete(), or relationship traversal on the queried objects |
| You want minimal overhead | You prefer cleaner attribute access (node.name.value vs node["name"]["value"]) |
| The query returns a flat structure | You want to use self.store to look up nodes by ID |
| You want to use Pydantic models for type-safe parsing | You want the SDK to handle response parsing automatically |
Execution lifecycle
The execute_in_proposed_change and execute_after_merge flags in .infrahub.yml control when a Generator runs in relation to the branch lifecycle.
execute_in_proposed_change (default: true)
When true, the Generator runs during proposed change CI.
When false, the Generator is skipped entirely in proposed changes. Use this for Generators that are triggered by events (checksum triggers) or when you want to run the Generator manually.
execute_after_merge (default: true)
When true, the Generator runs again after the branch has been merged.
When false, the Generator will not run after the branch has been merged.
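Both flags sit alongside the other fields of a Generator definition in .infrahub.yml. An illustrative fragment, reusing the widget_generator example from above, for a Generator that should only run after merge:

```yaml
generator_definitions:
  - name: widget_generator
    file_path: "generators/widget_generator.py"
    targets: widgets
    query: widget_query
    class_name: WidgetGenerator
    parameters:
      name: "name__value"
    execute_in_proposed_change: false  # skip during proposed-change CI
    execute_after_merge: true          # still run once the branch is merged
```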
Video guides
In this video series we're diving into the concept of Generators and services, exploring their significance, structure, and how they can streamline processes for teams. Whether you're a developer or just curious about automation in IT, this guide will provide you with a comprehensive understanding of Generators and their applications.
The first video will highlight what Generators are and how they can be used to deliver services.
In the second video we will look at how to plan a Generator from coming up with a use case and then finally what the workflow may look like in pseudocode.
In the third video we will look at how a Generator can be created and run in Infrahub, covering the .infrahub.yml file, the GraphQL query the Generator will run against, and finally the logic that runs against Infrahub to create objects and bring the service to life.
Known limitations
- #3289: deleting a Generator target object should also delete the objects created for that target