Infrastructure pipelines and platform workflows are two of the most conflated concepts in modern DevOps and platform engineering. Both involve automation, both use YAML or DSL definitions, and both aim to reduce manual toil. Yet they serve fundamentally different purposes and require distinct design philosophies. Confusing them leads to brittle systems, frustrated teams, and wasted investment. This article provides a clear, actionable framework for distinguishing—and intentionally designing—each.
As of May 2026, the industry has matured enough that many organizations run both pipelines and workflows, often on the same platform (e.g., GitLab CI, GitHub Actions, Argo Workflows, or custom orchestrators). The key is knowing which tool to reach for and when. We'll pull apart the conceptual threads, examine trade-offs, and give you a decision framework you can apply immediately.
Why This Distinction Matters
When teams conflate pipelines and workflows, they often end up with one of two problems: either they force a pipeline tool to handle long-running, stateful, human-intervention-heavy processes (resulting in complex error handling and poor visibility), or they use a workflow engine for simple CI/CD and pay unnecessary overhead. Both outcomes erode trust in automation and slow delivery.
The Hidden Cost of Conflation
Consider a typical scenario: a team builds a deployment pipeline in Jenkins that also triggers a manual approval step, waits for a separate configuration management tool, and then sends a Slack notification. This works—until the manual approval times out, or the configuration step fails halfway, and the pipeline has no built-in compensation logic. The team then adds custom scripts, state files, and retry loops, turning a simple pipeline into a fragile monolith. Meanwhile, a proper workflow engine would have handled state persistence, human tasks, and error recovery natively.
Another common pain point is when platform teams build 'golden pipelines' that attempt to cover every possible scenario—deployments, data migrations, incident response, and compliance checks—all in one monolithic YAML. These become impossible to maintain, test, or extend. By separating concerns, you can build simpler, more reliable components that compose well.
Defining the Boundary
At its core, an infrastructure pipeline is a deterministic, linear (or directed acyclic graph) automation for building, testing, and deploying artifacts. It typically runs in a CI/CD system, is triggered by a code change, and produces a known output. A platform workflow, on the other hand, is a stateful orchestration of multiple steps—some automated, some manual—that may span days, involve multiple teams, and require compensation logic for failures. Workflows often manage infrastructure provisioning, change approvals, incident runbooks, or compliance gates.
The boundary is fuzzy, but the litmus test is: does the process require human judgment, external state, or long-lived coordination? If yes, it's likely a workflow. If it's a pure code-to-production path, it's a pipeline.
Core Conceptual Frameworks
To build a shared understanding, we need to examine the underlying models of pipelines and workflows. These models dictate tool choice, error handling, and operational practices.
Pipeline Model: DAG with Deterministic Execution
A pipeline is modeled as a directed acyclic graph (DAG) where each node is a step (e.g., compile, test, deploy). The graph is static: the steps and their order are known at definition time. Execution is deterministic—given the same input, the same output is expected. Failures are handled by stopping the pipeline or retrying the step, but there is no concept of 'waiting for a human' or 'pausing for a week.' Pipelines are ephemeral: they run in isolated environments (containers, VMs) and leave no persistent state beyond artifacts and logs.
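To make the DAG model concrete, here is a minimal sketch in Python using the standard library's `graphlib`. The step names and the `run_step` callback are illustrative assumptions, not part of any CI/CD tool's API; a real pipeline would run each step in an isolated container or VM.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
PIPELINE = {
    "lint": set(),
    "unit_test": {"lint"},
    "build": {"unit_test"},
    "integration_test": {"build"},
    "deploy_dev": {"integration_test"},
}


def is_acyclic(dag):
    """Return True if the step graph can be ordered, i.e. it has no cycles."""
    try:
        TopologicalSorter(dag).prepare()
        return True
    except CycleError:
        return False


def run_pipeline(dag, run_step):
    """Run steps in dependency order and stop at the first failure.

    run_step(name) is an assumed callback returning True on success.
    Returns (completed_steps, failed_step_or_None).
    """
    completed = []
    for step in TopologicalSorter(dag).static_order():
        if not run_step(step):
            return completed, step
        completed.append(step)
    return completed, None


def always_ok(step):
    return True


def fail_on_build(step):
    return step != "build"
```

Because the graph is fixed at definition time, the same input always yields the same step order, and a failure points at exactly one step. Note what is absent: there is no place to park a run while it waits for a person.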
This model is ideal for CI/CD because it ensures reproducibility. Every commit triggers the same sequence, and if a step fails, the developer knows exactly where and why. However, it breaks down when you need to wait for an external event (e.g., a change request approval, a database migration to complete, or a third-party API to respond). Pipelines can poll or use webhooks, but they lack built-in compensation logic for partial failures.
Workflow Model: State Machine with Human-in-the-Loop
A workflow is modeled as a state machine or a process graph. Steps can be automated tasks, manual approvals, or sub-workflows. The workflow maintains persistent state (e.g., in a database or object store) so it can survive crashes, restarts, or long delays. Transitions between steps can be conditional, based on user input or external events. Workflows often include compensation steps (rollback, notify, escalate) for when things go wrong.
This model is essential for platform workflows that coordinate multi-team processes. For example, a 'new environment provisioning' workflow might include: (1) automated infrastructure provisioning via Terraform, (2) manual approval from the security team, (3) automated configuration via Ansible, (4) manual validation by the QA team, and (5) notification to the requester. If step 2 is rejected, the workflow can automatically tear down the partially created environment—something a pipeline would struggle with.
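The rejection path in that example can be sketched as a small compensation loop. This is an in-memory illustration only: the `Step` and `WorkflowRun` types are invented for this sketch, the lambdas stand in for real Terraform or approval calls, and a real engine would persist the state and history instead of holding them in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # returns False on failure or rejection
    compensate: Optional[Callable[[], None]] = None  # undo for this step


@dataclass
class WorkflowRun:
    state: str = "pending"
    history: List[str] = field(default_factory=list)


def execute(steps, run):
    """Run steps in order; on failure, compensate completed steps in reverse."""
    done = []
    for step in steps:
        run.state = f"running:{step.name}"
        if step.action():
            run.history.append(f"completed:{step.name}")
            done.append(step)
            continue
        run.history.append(f"failed:{step.name}")
        for prev in reversed(done):
            if prev.compensate is not None:
                prev.compensate()
                run.history.append(f"compensated:{prev.name}")
        run.state = "failed"
        return run
    run.state = "complete"
    return run


# Simulated run: provisioning succeeds, then the security team rejects.
torn_down = []
steps = [
    Step("provision_infra", lambda: True, lambda: torn_down.append("infra")),
    Step("security_approval", lambda: False),
    Step("configure", lambda: True),
]
result = execute(steps, WorkflowRun())
```

After this run, `result.state` is `"failed"` and the history records that `provision_infra` was compensated, which is the teardown behaviour described above.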
Comparison of Approaches
| Aspect | Pipeline (e.g., Jenkins, GitLab CI, GitHub Actions) | Workflow Engine (e.g., Argo Workflows, Temporal, Camunda) | Hybrid (e.g., Airflow, Prefect for data workflows) |
|---|---|---|---|
| State Management | Ephemeral (logs, artifacts) | Persistent (database) | Persistent (database) |
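| Human Tasks | Weak (approval gates are bolted on) | Supported, but depth varies by engine (task lists and escalation in Camunda, signals in Temporal, suspend steps in Argo) | Limited (usually sensors waiting on an external signal) |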
| Error Recovery | Retry or fail | Retry, compensate, escalate | Retry, skip, or alert |
| Execution Duration | Minutes to hours | Hours to weeks | Minutes to days |
| Typical Use Case | Build, test, deploy | Provisioning, change management, runbooks | Data pipelines, ETL, ML training |
Designing for the Right Model
Once you understand the conceptual difference, the next step is to design your automation accordingly. This section provides a repeatable process for deciding whether a given process should be a pipeline or a workflow, and how to implement each.
Decision Framework: Pipeline or Workflow?
Ask these three questions:
- Is the process deterministic and short-lived (under a few hours)? If yes, it's likely a pipeline. If it can run for days or requires waiting for external events, it's a workflow.
- Does the process require human judgment or approval? If yes, you need a workflow engine with native human task support. Pipelines can add manual gates, but they become brittle.
- If a step fails midway, can you simply retry or abort? If you need to roll back partially completed work, notify a team, or wait for a fix, you need a workflow with compensation logic.
For example, a simple 'deploy to staging' process is a pipeline. A 'deploy to production with change approval and rollback' process is a workflow. Many organizations start with a pipeline and then add workflow features ad hoc, and this is where technical debt accumulates.
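The three questions can be written down as a small classifier, which is handy when auditing a long list of processes. This is a sketch: the `ProcessProfile` fields and the four-hour cut-off for "a few hours" are assumptions made for illustration, not fixed rules.

```python
from dataclasses import dataclass

SHORT_LIVED_HOURS = 4.0  # assumed threshold for "a few hours"


@dataclass(frozen=True)
class ProcessProfile:
    deterministic: bool
    expected_hours: float
    needs_human_decision: bool
    needs_compensation: bool


def classify(profile):
    """Return ("pipeline" | "workflow", reasons) for a process profile."""
    reasons = []
    if not profile.deterministic or profile.expected_hours > SHORT_LIVED_HOURS:
        reasons.append("long-lived or non-deterministic")
    if profile.needs_human_decision:
        reasons.append("needs human judgment or approval")
    if profile.needs_compensation:
        reasons.append("needs compensation on partial failure")
    return ("workflow" if reasons else "pipeline", reasons)


deploy_staging = ProcessProfile(True, 0.5, False, False)
deploy_production = ProcessProfile(True, 48.0, True, True)
```

Here `classify(deploy_staging)` returns `("pipeline", [])`, while `classify(deploy_production)` returns `"workflow"` with all three reasons listed. Any single "yes" is enough to tip a process toward a workflow.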
Step-by-Step: Building a Pipeline
1. Define the artifact: What is being built? (e.g., a Docker image, a compiled binary, a Terraform plan).
2. Map the DAG: List steps in order (lint, unit test, build, integration test, deploy to dev). Ensure no cycles.
3. Choose a CI/CD tool: GitLab CI, GitHub Actions, Jenkins, or CircleCI—pick one that integrates with your VCS.
4. Implement steps as scripts or actions: Each step should be idempotent and stateless.
5. Add caching and parallelism: Speed up by caching dependencies and running independent steps concurrently.
6. Set up notifications: On success/failure, notify the team via Slack, email, or PagerDuty.
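Step 5 is where most of the speed-up comes from. The following sketch shows one way to run independent steps concurrently while still respecting the DAG, using `graphlib` and a thread pool from the Python standard library. The step names and `run_step` callback are hypothetical; CI/CD tools implement the same idea with their own job-dependency syntax.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_parallel(dag, run_step, max_workers=4):
    """Run each step as soon as all of its dependencies have finished.

    dag maps a step name to the set of steps it depends on.
    Returns step names in the order they finished.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {}  # future -> step name
        while sorter.is_active():
            for step in sorter.get_ready():
                pending[pool.submit(run_step, step)] = step
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                step = pending.pop(future)
                future.result()  # re-raise any exception from the step
                sorter.done(step)  # unblocks steps that depended on it
                finished.append(step)
    return finished


# Hypothetical DAG: lint and unit_test are independent of each other.
DAG = {
    "lint": set(),
    "unit_test": set(),
    "build": {"lint", "unit_test"},
    "deploy_dev": {"build"},
}
order = run_parallel(DAG, lambda step: step)
```

In this run `lint` and `unit_test` may finish in either order, but `build` always waits for both, and `deploy_dev` always comes last.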
Step-by-Step: Building a Platform Workflow
1. Model the process as a state machine: Identify states (e.g., 'pending approval', 'provisioning', 'validating', 'complete', 'failed').
2. Choose a workflow engine: Argo Workflows (Kubernetes-native), Temporal (code-first, with SDKs for several languages), or Camunda (BPMN standard).
3. Define steps and transitions: Each step is an activity (automated or manual). Use signals or timers for human tasks.
4. Implement compensation logic: For each step, define what to do on failure (e.g., rollback Terraform, send escalation).
5. Add observability: Expose workflow status via API, dashboard, or audit log. Track business metrics (e.g., time to provision).
6. Test with long-running scenarios: Simulate approvals, timeouts, and failures. Ensure the workflow can recover after a crash.
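The crash-recovery requirement in step 6 comes down to one idea: record progress durably after every step, and skip recorded steps on restart. The sketch below illustrates that idea with a JSON file. The file layout and function names are assumptions for illustration; a workflow engine does this for you with a database and much stronger guarantees.

```python
import json
import os
import tempfile


def _save(path, completed):
    """Write the checkpoint to a temp file, then swap it in atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump({"completed": completed}, fh)
    os.replace(tmp, path)


def run_with_checkpoint(steps, state_path):
    """Run (name, action) pairs, skipping any step already checkpointed."""
    completed = []
    if os.path.exists(state_path):
        with open(state_path) as fh:
            completed = json.load(fh)["completed"]
    for name, action in steps:
        if name in completed:
            continue
        action()  # may raise, which stands in for a crash
        completed.append(name)
        _save(state_path, completed)
    return completed


# Simulated crash during 'configure', followed by a restart.
calls = []
crash_once = {"armed": True}


def provision():
    calls.append("provision")


def configure():
    if crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("simulated crash")
    calls.append("configure")


with tempfile.TemporaryDirectory() as workdir:
    state_file = os.path.join(workdir, "run.json")
    steps = [("provision", provision), ("configure", configure)]
    try:
        run_with_checkpoint(steps, state_file)
    except RuntimeError:
        pass  # the process "died" here
    final = run_with_checkpoint(steps, state_file)  # restart resumes
```

On restart, `provision` is not executed a second time because it was checkpointed before the crash. This only works safely if each step is idempotent or is checkpointed immediately after it completes.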
Tooling, Stack, and Operational Realities
Choosing the right tool for pipelines vs. workflows is critical. Many teams try to use a single tool for both, which can work at small scale but breaks down as complexity grows.
Pipeline Tools: Strengths and Limits
GitHub Actions and GitLab CI are excellent for pipelines. They integrate deeply with the VCS, support matrix builds, and have a rich ecosystem of reusable actions and templates. However, they are not designed for long-running workflows. Their approval gates are basic (a job waits until someone approves it), and they keep no durable state between jobs beyond artifacts and logs. Jobs are also subject to timeouts (GitLab CI defaults to 60 minutes per job, and GitHub-hosted runners cap a job at 6 hours), so a deployment that runs long, or that waits days for a manual approval, risks being cancelled or losing the context the next person needs to resume it.
Jenkins, while older, offers more flexibility via plugins, but it still treats everything as a build job. Pipelines in Jenkins are Groovy scripts that can become hard to maintain. For workflow-style processes, Jenkins is often paired with a separate workflow engine (e.g., Rundeck for operations).
Workflow Engines: When to Adopt
Argo Workflows is a popular choice for Kubernetes-native environments. It defines workflows as YAML CRDs, supports DAGs and steps, and integrates with Kubernetes resources. It has built-in retry, timeout, and artifact passing. Its support for human tasks is minimal: a suspend step can pause a workflow until someone resumes it, but there is no task assignment, escalation, or task list, so you would need to pair it with a separate approval system (e.g., a Slack bot or custom UI).
Temporal is a more general-purpose workflow engine that abstracts away state management. It supports long-running workflows, human tasks, and complex error handling. The downside is a steeper learning curve and the need to run a Temporal server. It is ideal for platform teams building internal developer platforms.
Camunda is a BPMN-based engine that excels at human-centric workflows (e.g., change management, incident response). It provides a modeling tool, a REST API, and a tasklist UI. It is overkill for simple pipelines but excellent for compliance-heavy processes.
Operational Considerations
Running a workflow engine adds operational overhead: you need to manage the server, handle scaling, and ensure high availability. For many organizations, a hybrid approach works: use a CI/CD pipeline for the build-test-deploy cycle, and a separate workflow engine for provisioning, change management, and runbooks. The two systems can be connected via webhooks or API calls.
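The hand-off between the two systems can be as small as one HTTP call at the end of the pipeline. The sketch below only builds the request and does not send it. The endpoint URL, the payload shape, and the `Idempotency-Key` header are all hypothetical and would need to match whatever API your workflow engine or platform exposes.

```python
import json
import urllib.request

# Hypothetical endpoint of an internal workflow API (not a real service).
WORKFLOW_API = "https://workflows.example.internal/api/v1/runs"


def build_trigger_request(workflow, artifact, commit_sha):
    """Build the POST request a pipeline's last step would send."""
    payload = {
        "workflow": workflow,
        "inputs": {"artifact": artifact, "commit": commit_sha},
    }
    return urllib.request.Request(
        WORKFLOW_API,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed convention: lets the receiver ignore duplicate triggers
            # when the pipeline step is retried.
            "Idempotency-Key": f"{workflow}:{commit_sha}",
        },
        method="POST",
    )


request = build_trigger_request(
    "deploy-production", "registry.example.internal/app:1.4.2", "abc123"
)
# A pipeline step would then call urllib.request.urlopen(request) and exit.
```

The design choice worth noting is that the pipeline fires the request and finishes. It does not wait for the workflow, so the long-lived coordination stays in the system built to hold it.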
Cost is also a factor. Hosted pipeline tools are often free for public repositories but meter build minutes for private ones. Open-source workflow engines carry no license fee but require infrastructure and people to run them. Evaluate total cost of ownership, including maintenance and training.
Growth Mechanics: Scaling Pipelines and Workflows
As your organization grows, both pipelines and workflows need to evolve. This section covers patterns for scaling without introducing chaos.
Pipeline Growth: From Monolith to Composition
Early-stage teams often have one monolithic pipeline that deploys everything. As the number of services grows, this becomes a bottleneck. The solution is to break pipelines into smaller, composable units: one pipeline per service, with shared steps extracted into reusable actions or templates. Use pipeline triggers to chain deployments (e.g., service A deploys, then triggers service B's pipeline).
Another growth pattern is the 'pipeline of pipelines'—a meta-pipeline that orchestrates multiple service pipelines. This is essentially a workflow, and you should consider using a workflow engine for it. For example, a 'release train' workflow that deploys 10 microservices in a specific order, with manual gates between groups, is better handled by a workflow engine than a CI/CD pipeline.
Workflow Growth: From Ad Hoc to Standard Library
Workflows often start as one-off scripts or runbooks. As the number of workflows grows, you need a standard library of reusable workflow activities (e.g., 'provision VM', 'send Slack message', 'run Terraform plan'). This reduces duplication and ensures consistency. Use versioning for workflows and activities, and provide a self-service catalog for developers to trigger common workflows (e.g., 'request a new database', 'rotate credentials').
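A standard library of versioned activities can start as a simple registry. The sketch below is one possible shape, with an invented `activity` decorator and lookup function; the two `send_slack_message` versions only format a string and do not call Slack. Workflow engines typically ship their own registration and versioning mechanisms, which you should prefer where they exist.

```python
from typing import Callable, Dict, Tuple

_REGISTRY: Dict[Tuple[str, int], Callable] = {}


def activity(name, version):
    """Register a function as a versioned, reusable workflow activity."""
    def decorator(fn):
        key = (name, version)
        if key in _REGISTRY:
            raise ValueError(f"{name} v{version} is already registered")
        _REGISTRY[key] = fn
        return fn
    return decorator


def get_activity(name, version=None):
    """Look up an activity; default to the highest registered version."""
    versions = [v for (n, v) in _REGISTRY if n == name]
    if not versions:
        raise KeyError(name)
    chosen = max(versions) if version is None else version
    return _REGISTRY[(name, chosen)]


@activity("send_slack_message", 1)
def send_slack_message_v1(channel, text):
    return f"[{channel}] {text}"


@activity("send_slack_message", 2)
def send_slack_message_v2(channel, text, urgent=False):
    prefix = "URGENT " if urgent else ""
    return f"[{channel}] {prefix}{text}"
```

Existing workflows can pin `get_activity("send_slack_message", 1)` and keep their behaviour, while new workflows pick up version 2 by default. Registering the same name and version twice fails loudly instead of silently replacing an activity that running workflows depend on.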
Governance becomes important: who can trigger which workflow? What audit trail is needed? Workflow engines often provide RBAC and audit logs natively. Define SLAs for workflow completion and monitor them.
Common Anti-Patterns
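One anti-pattern is using a pipeline as the orchestrator of workflows (e.g., a Jenkins job that starts several Argo workflows, polls them, and decides what runs next). This puts long-lived coordination back into a tool with no durable state and couples the two systems. A one-way hand-off, where the pipeline triggers a workflow and then exits, does not have this problem. Another is using a workflow engine for simple CI/CD (e.g., using Temporal to run a build step). This adds overhead without benefit. A third is not having a clear boundary: teams end up with 'pipeline-workflow hybrids' that are hard to debug and maintain.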
Risks, Pitfalls, and Mitigations
Even with a clear conceptual understanding, teams make mistakes. Here are common pitfalls and how to avoid them.
Pitfall 1: Over-Engineering the Pipeline
Teams often add workflow-like features to their CI/CD pipeline (e.g., manual approvals, state files, retry queues) because they don't want to adopt a separate tool. This leads to brittle, hard-to-maintain pipelines. Mitigation: If your pipeline needs persistent state or human tasks, accept that you need a workflow engine. Start small: move one complex process (e.g., production deployment with approval) to a workflow and see if the complexity drops.
Pitfall 2: Under-Engineering the Workflow
Conversely, teams may adopt a workflow engine but use it as a simple pipeline (e.g., a linear sequence without error handling). This wastes the engine's capabilities and can lead to silent failures. Mitigation: Invest time in modeling the workflow properly. Define compensation steps for every activity. Use the engine's testing framework to simulate failures.
Pitfall 3: Tool Sprawl
Different teams adopt different tools (e.g., Team A uses Argo, Team B uses Temporal, Team C uses Jenkins). This creates silos and makes it hard to share workflows. Mitigation: Establish a platform team that selects and supports a limited set of tools. Provide templates and training. Encourage reuse.
Pitfall 4: Ignoring Security and Compliance
Workflows often involve sensitive operations (e.g., provisioning databases, changing firewall rules). If the workflow engine is not properly secured, it becomes an attack vector. Mitigation: Use RBAC, audit logs, and secrets management. Integrate with your identity provider. Review workflow definitions for security vulnerabilities.
Pitfall 5: Not Testing Workflows
Workflows are harder to test than pipelines because they involve external systems and human steps. Teams often skip testing, leading to production failures. Mitigation: Use the workflow engine's testing capabilities (e.g., Temporal's test environment, or running Argo workflows against a disposable local cluster). Create integration tests that simulate each step. Have a dry-run mode for manual steps.
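One way to make manual steps testable is to inject the clock and the decision source, so a test can simulate a week-long wait in microseconds. The sketch below assumes a polling-style approval step; `await_approval` and `FakeClock` are names invented for this example and do not correspond to any engine's API.

```python
def await_approval(poll, timeout_s, now, sleep, interval_s=60):
    """Wait for a human decision, giving up after timeout_s seconds.

    poll() returns True (approved), False (rejected), or None (no decision).
    now() and sleep() are injected so tests can control time.
    """
    deadline = now() + timeout_s
    while now() < deadline:
        decision = poll()
        if decision is not None:
            return "approved" if decision else "rejected"
        sleep(interval_s)
    return "timed_out"


class FakeClock:
    """Test double: sleeping advances time instantly."""

    def __init__(self):
        self.t = 0.0

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


# Scenario 1: nobody ever responds within the hour.
clock = FakeClock()
outcome_timeout = await_approval(lambda: None, 3600, clock.now, clock.sleep)

# Scenario 2: the approver responds on the third poll.
answers = iter([None, None, True])
clock2 = FakeClock()
outcome_approved = await_approval(
    lambda: next(answers), 3600, clock2.now, clock2.sleep
)
```

In production the same function would receive `time.monotonic` and `time.sleep`. In tests, the fake clock lets you assert on the timeout path without waiting for it.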
Decision Checklist and Mini-FAQ
To help you apply these concepts immediately, here is a checklist and answers to common questions.
Decision Checklist: Pipeline or Workflow?
- ☐ Process is deterministic (same input → same output)? → Pipeline
- ☐ Process completes in minutes to a few hours? → Pipeline
- ☐ Requires manual approval or human judgment? → Workflow
- ☐ Needs persistent state across steps? → Workflow
- ☐ Failure requires compensation (rollback, notify)? → Workflow
- ☐ Process involves multiple teams or external systems? → Workflow
Mini-FAQ
Q: Can I use GitHub Actions for workflows? A: GitHub Actions calls every definition a 'workflow', but in this article's sense it is a pipeline tool. For simple cases (e.g., deploy with a manual approval), yes, but it's limited. For complex workflows with state, human tasks, and compensation, use a dedicated workflow engine.
Q: Should I have one pipeline per service or one monorepo pipeline? A: One pipeline per service is more scalable, but you need a way to manage dependencies. Use a workflow to orchestrate multi-service deployments.
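Q: How do I handle secrets in workflows? A: Use the secrets mechanism of the platform your engine runs on (e.g., Kubernetes Secrets for Argo Workflows) or an external secrets manager, and have activities fetch credentials at run time. Avoid hardcoding secrets in workflow definitions or passing them as workflow inputs, since an engine that persists state will also persist those inputs.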
Q: What if my process is a hybrid (e.g., a pipeline that triggers a workflow)? A: That's fine. For example, a CI pipeline builds an artifact and then triggers a deployment workflow. The key is to keep the pipeline simple and let the workflow handle the complex orchestration.
Q: How do I convince my team to adopt a workflow engine? A: Start with a pain point—a process that is currently fragile or manual. Build a quick prototype showing how the workflow engine simplifies it. Measure the reduction in toil and errors.
Synthesis and Next Actions
Distinguishing between infrastructure pipelines and platform workflows is not an academic exercise—it directly impacts your team's productivity, system reliability, and operational overhead. By applying the frameworks and decision criteria in this guide, you can design automation that is fit for purpose, avoid common pitfalls, and scale your platform engineering efforts effectively.
Immediate Next Steps
- Audit your current automation: List all automated processes. Classify each as pipeline or workflow using the decision checklist. Identify mismatches (e.g., a pipeline with manual approval that times out).
- Choose one problematic process: Pick a process that is currently a brittle pipeline but should be a workflow. Implement it using a workflow engine (start with a simple version).
- Standardize tooling: Agree on one CI/CD tool and one workflow engine across your organization. Provide templates and documentation.
- Train your team: Hold a workshop on the conceptual differences. Share this article as a reference. Encourage questions and discussion.
- Measure and iterate: Track metrics like deployment frequency, time to provision, and failure recovery time. Use data to refine your approach.
Remember, the goal is not to eliminate pipelines or workflows, but to use each where it excels. A well-designed pipeline accelerates development; a well-designed workflow ensures reliable, auditable operations. By pulling them apart conceptually, you can build a platform that serves both.