
Building ML Pipelines with n8n: From Orchestration to Implementation

How I discovered n8n's potential for machine learning workflows and built two different approaches for production ML pipelines

Oussama Boussaid
September 2025
15 min read

The Spark Behind This Experiment

Lately, I've seen a lot of buzz around n8n in the automation space, so I decided to test it on a real ML use case. Instead of gluing Python scripts together with cron jobs, I wanted something visual, lightweight, and flexible.

The idea came from a frustrating experience maintaining a complex ML pipeline with scattered scripts, broken dependencies, and manual deployment steps. I thought: "What if I could see my entire ML workflow visually and manage it from one place?"

That's when n8n caught my attention - a workflow automation tool that promised visual programming with powerful integrations. But could it handle the complexity of machine learning pipelines?

n8n ML Pipeline Architecture: Two Distinct Approaches

After diving deep into n8n's capabilities, I discovered two fundamentally different ways to implement ML pipelines:

🎛️ Approach 1: n8n as Orchestrator

  • n8n coordinates existing ML infrastructure
  • External services handle the heavy lifting (MLflow, Kubernetes, etc.)
  • Production-ready and scalable

🔧 Approach 2: All-in-n8n Implementation

  • Everything implemented within n8n nodes
  • JavaScript-based ML algorithms
  • Self-contained but limited

Let me walk you through both approaches and when to use each one.

Approach 1: n8n as ML Pipeline Orchestrator

The Architecture

In this approach, n8n acts as the conductor of an ML orchestra - coordinating specialized services rather than doing the work itself. Think of it as the brain that tells each component when to start, monitors progress, and handles the flow between services.

🎛️ n8n as ML Pipeline Orchestrator

[Workflow diagram: cron (daily @ 2 AM) and manual webhook triggers → HTTP Requests to data-service:8080, preprocessing:8081, and feature-service:8082 → MLflow tracking and training service → validation gate → model registry → Kubernetes production deploy → Prometheus metrics and Grafana dashboard → Slack success notification]

n8n coordinates external ML services and infrastructure for production-ready pipelines

Key Components

🚀 n8n Workflow Nodes:

  • HTTP Request nodes → Call external ML services
  • Cron triggers → Automated scheduling
  • IF gates → Conditional logic and validation
  • Wait nodes → Coordination and timing
  • Slack nodes → Notifications and alerts

🏗️ External ML Infrastructure:

  • data-service:8080 → Real data ingestion
  • mlflow-server:5000 → Experiment tracking
  • kubernetes-api → Production deployment
  • prometheus:9090 → Monitoring
  • grafana:3000 → Dashboards

The Workflow Flow

  • Trigger → n8n starts the pipeline (cron or webhook)
  • Coordinate → n8n calls data ingestion service via HTTP
  • Monitor → n8n waits and checks job status
  • Validate → n8n triggers ML training service
  • Deploy → n8n orchestrates Kubernetes deployment
  • Alert → n8n sends notifications via Slack
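The monitor step above boils down to a wait-and-check loop. In n8n it is built from Wait and IF nodes; the sketch below expresses the same logic as a plain async function (the status values and timing parameters are illustrative assumptions, not values from the workflow):

```javascript
// Wait-and-check pattern behind the "Monitor" step (sketch).
// checkStatus is any async function returning the job's status,
// e.g. an HTTP call to the training service's job endpoint.
async function waitForJob(checkStatus, { intervalMs = 5000, maxTries = 60 } = {}) {
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const status = await checkStatus();
    if (status === 'succeeded') return true;
    if (status === 'failed') throw new Error('training job failed');
    // Job still running: pause before the next poll
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error('timed out waiting for job');
}
```

In the actual workflow, n8n's Wait node plays the role of the `setTimeout` and an IF node plays the role of the status check, so none of this has to be hand-written.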

Sample n8n Node Configuration

{
  "parameters": {
    "url": "http://training-service:8083/api/v1/train",
    "sendHeaders": true,
    "headerParameters": {
      "parameters": [
        {
          "name": "Authorization",
          "value": "Bearer {{ $env.TRAINING_SERVICE_TOKEN }}"
        }
      ]
    },
    "sendBody": true,
    "bodyParameters": {
      "parameters": [
        {
          "name": "feature_job_id",
          "value": "={{ $node['Trigger Feature Engineering'].json.job_id }}"
        },
        {
          "name": "mlflow_run_id",
          "value": "={{ $node['Create MLflow Run'].json.run.info.run_id }}"
        }
      ]
    }
  },
  "name": "🧠 Trigger Model Training",
  "type": "n8n-nodes-base.httpRequest"
}
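The validation gate that follows training is just a threshold check on the metrics the training service returns. A minimal sketch, where the metric names and both thresholds are illustrative assumptions:

```javascript
// Sketch of the "Validation Gate" logic. In n8n this lives in an IF
// node or a small Code node reading $input; written here as a plain
// function so it runs anywhere. Field names and thresholds are assumptions.
function shouldDeploy(metrics) {
  const MIN_R2 = 0.8;    // minimum acceptable R² on the validation set
  const MAX_RMSE = 10.0; // maximum acceptable validation error
  return metrics.validation_r2 >= MIN_R2 && metrics.rmse <= MAX_RMSE;
}

console.log(shouldDeploy({ validation_r2: 0.91, rmse: 4.2 })); // true
console.log(shouldDeploy({ validation_r2: 0.62, rmse: 4.2 })); // false
```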

When to Use This Approach

✅ Perfect for:

  • Production ML pipelines
  • Teams with existing ML infrastructure
  • Complex models requiring specialized frameworks
  • High-scale deployments
  • Enterprise environments

⚠️ Consider the setup cost:

  • Requires existing ML services
  • More infrastructure to maintain
  • Higher initial complexity

Approach 2: All-in-n8n Implementation

The Philosophy

What if you could build an entire ML pipeline using only n8n nodes? This approach implements everything - from data processing to model training - directly within n8n using JavaScript Code nodes.

The Complete Implementation

Data Pipeline

  • PostgreSQL nodes for data ingestion
  • Code nodes with data cleaning logic
  • Code nodes for feature engineering
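The cleaning Code node is mostly filter-and-clip logic over the incoming items. A minimal sketch, where the field names (`price`, `quantity`) and the clipping range are illustrative assumptions:

```javascript
// Sketch of a data-cleaning Code node: drop rows with missing values
// and clip an illustrative numeric field to a plausible range.
function cleanRows(rows) {
  return rows
    .filter(r => r.price != null && r.quantity != null)
    .map(r => ({
      ...r,
      price: Math.min(Math.max(r.price, 0), 10000), // clip to [0, 10000]
    }));
}

console.log(cleanRows([
  { price: -5, quantity: 2 },      // negative price gets clipped to 0
  { price: 99999, quantity: 1 },   // outlier clipped to 10000
  { price: 20, quantity: null },   // missing value, dropped
]));
```

Inside n8n, the same function would wrap the items from `$input.all()` and return the cleaned rows to the next node.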

ML Pipeline

  • Code node with custom Linear Regression implementation
  • Code node for model validation
  • IF nodes for deployment gates

Deployment & Monitoring

  • Code nodes simulating API endpoints
  • Slack nodes for notifications
  • Cron triggers for monitoring

📋 All-in-n8n ML Pipeline Structure

[Workflow diagram: daily cron and manual webhook triggers → PostgreSQL raw data → Code nodes for data prep and feature engineering → Code node ML training → IF-gate validation → PostgreSQL model store → Code node API deploy → Slack success; a monitoring Code node with a retrain IF-gate and Slack alert closes the loop; 16 nodes total, everything inside one n8n workflow]

Complete ML pipeline implemented within n8n nodes using JavaScript and built-in integrations

JavaScript ML Implementation Sample

Here's how model training looks entirely within an n8n Code node:

// Complete ML Model Training Pipeline in n8n
const items = $input.all();
const rows = items.slice(1); // Skip the summary item produced upstream

// Extract feature matrix and target vector
// (assumes each item exposes json.features and json.target)
const X = rows.map(item => item.json.features);
const y = rows.map(item => item.json.target);

// Simple 80/20 train/validation split
const splitIdx = Math.floor(X.length * 0.8);
const trainX = X.slice(0, splitIdx);
const trainY = y.slice(0, splitIdx);
const valX = X.slice(splitIdx);
const valY = y.slice(splitIdx);

// Simple Linear Regression Implementation
class SimpleLinearRegression {
  constructor() {
    this.weights = null;
    this.bias = 0;
    this.learningRate = 0.01;
    this.epochs = 1000;
  }
  
  fit(X, y) {
    const numFeatures = X[0].length;
    this.weights = Array(numFeatures).fill(0).map(() => Math.random() * 0.01);
    
    for (let epoch = 0; epoch < this.epochs; epoch++) {
      for (let i = 0; i < X.length; i++) {
        const prediction = this.predict_single(X[i]);
        const error = prediction - y[i];
        
        // Update weights and bias using gradient descent
        for (let j = 0; j < numFeatures; j++) {
          this.weights[j] -= this.learningRate * error * X[i][j];
        }
        this.bias -= this.learningRate * error;
      }
    }
  }
  
  predict_single(x) {
    let prediction = this.bias;
    for (let i = 0; i < x.length; i++) {
      prediction += this.weights[i] * x[i];
    }
    return prediction;
  }
}

// R² (coefficient of determination) for validation
function calculateR2(actual, predicted) {
  const mean = actual.reduce((a, b) => a + b, 0) / actual.length;
  const ssTot = actual.reduce((s, yi) => s + (yi - mean) ** 2, 0);
  const ssRes = actual.reduce((s, yi, i) => s + (yi - predicted[i]) ** 2, 0);
  return 1 - ssRes / ssTot;
}

// Train and validate model
const model = new SimpleLinearRegression();
model.fit(trainX, trainY);
const valPredictions = valX.map(x => model.predict_single(x));

// Return results for next node
return [{ 
  json: {
    model_id: `model_${Date.now()}`, // e.g. model_1758755784697
    weights: model.weights,
    bias: model.bias,
    validation_r2: calculateR2(valY, valPredictions),
    trained_at: new Date().toISOString()
  }
}];

The Complete Workflow Architecture

The workflow contains 16 nodes organized into clear stages:

  • Triggers (2 nodes) → Cron + Webhook
  • Data Pipeline (3 nodes) → Fetch + Clean + Engineer
  • ML Pipeline (2 nodes) → Train + Validate
  • Deployment (4 nodes) → Store + Deploy + Notify
  • Monitoring (3 nodes) → Check + Alert + Retrain
  • Feedback Loop → Drift detection triggers retraining
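The drift detection in that feedback loop can be as simple as comparing recent prediction statistics against the training baseline. A sketch, where the 10% threshold is an assumption rather than a value from the workflow:

```javascript
// Sketch of the drift check in the monitoring Code node.
// Compares the mean of recent predictions against the mean recorded
// at training time; the 10% default threshold is an assumption.
function needsRetraining(baselineMean, recentMean, threshold = 0.1) {
  const drift = Math.abs(recentMean - baselineMean) / Math.abs(baselineMean);
  return drift > threshold;
}

console.log(needsRetraining(100, 104)); // false: 4% drift, within tolerance
console.log(needsRetraining(100, 120)); // true: 20% drift triggers retraining
```

The IF gate downstream simply routes a `true` result back to the training branch of the workflow.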

When to Use This Approach

✅ Perfect for:

  • Learning ML pipeline concepts
  • Rapid prototyping
  • Simple models (linear regression, basic classification)
  • Teams without ML infrastructure
  • Quick demos and POCs

⚠️ Limitations:

  • Basic ML algorithms only
  • No real scalability
  • Simplified monitoring
  • Not suitable for complex models

Architecture Comparison

🤔 Architecture Comparison

🎛️ Approach 1: n8n as Orchestrator

n8n coordinates separate ML components

[Diagram: n8n orchestrator connected to PostgreSQL (database), MLflow (tracking), FastAPI (model API), Kubernetes (deploy), Prometheus (monitoring), Grafana (dashboards), and Slack (alerts)]

✅ Pros

  • Production-ready
  • Scalable architecture
  • Tool specialization
  • Easy debugging

⚠️ Cons

  • More infrastructure
  • Multiple tools
  • Higher complexity
  • More deployment overhead

🔧 Approach 2: All-in-n8n

Everything implemented within n8n nodes

[Diagram: Code nodes for data prep, feature engineering, ML training, validation, and monitoring; an HTTP node for API deploy; Slack node for alerts; DB node for storage; File node for model save; everything inside one n8n workflow]

✅ Pros

  • Single tool
  • Rapid prototyping
  • Visual workflow
  • Simple deployment

⚠️ Cons

  • Limited ML capabilities
  • Not production-scale
  • Performance issues
  • Hard to maintain

Side-by-side comparison of both approaches showing their strengths and limitations

Real-World Implementation: What I Learned

The Good Surprises

🎯 Visual Debugging is Amazing

Seeing data flow through nodes in real-time made debugging so much easier than parsing log files.

⚡ Rapid Iteration

Making changes to the workflow is incredibly fast. No code deployments, no container rebuilds.

🔧 Built-in Error Handling

n8n's retry logic and error paths saved me from writing boilerplate error handling code.

The Challenges

📈 Limited ML Libraries

Code nodes only support basic JavaScript, so complex ML operations require external services.

🗄️ State Management

n8n workflows are stateless by design, which can be limiting for ML pipelines.

📊 Monitoring Granularity

While n8n provides execution logs, you'll want external monitoring for production ML metrics.

Production Considerations

Security

  • Use n8n's credential system for API keys
  • Implement proper authentication for webhook triggers
  • Consider network isolation for sensitive data

Scalability

  • Approach 1: Scales with your underlying infrastructure
  • Approach 2: Limited by n8n instance resources

Maintenance

  • Version control your workflows using n8n's export feature
  • Implement proper testing for critical Code nodes
  • Set up monitoring and alerting for production pipelines

Conclusion: n8n's Sweet Spot in the ML Ecosystem

After building both approaches, I'm genuinely impressed with n8n's potential for ML workflows. It's not going to replace specialized ML platforms for every use case, but it fills an important gap.

The real magic happens when you combine n8n's visual workflow management with purpose-built ML tools. Using n8n as an orchestrator (Approach 1) gives you the best of both worlds: visual pipeline management with production-grade ML infrastructure.

For teams just starting their ML journey, the all-in-n8n approach (Approach 2) provides an excellent learning environment where you can understand every component of the pipeline without the complexity of multiple services.

Whether you choose to orchestrate existing services or implement everything within n8n, you'll gain a powerful tool for managing ML workflows that's both accessible and capable.

Final Thoughts: The Right Tool for the Right Scale

Of course, tools like Airflow or Kubeflow remain better for massive projects with complex DAGs, enterprise governance, and advanced scheduling needs. But for small to medium ML pipelines, n8n really hits the sweet spot: fast to set up, easy to maintain, and surprisingly production-ready.

The visual nature of n8n makes it particularly valuable for:

  • Cross-functional teams where non-developers need to understand the pipeline
  • Rapid experimentation where you're iterating on workflow logic
  • Teaching environments where visual representation aids understanding
  • Hybrid teams mixing business logic with technical implementation



Have you tried using n8n for ML pipelines? I'd love to hear about your experiences and what approaches worked best for your use cases. Feel free to share your thoughts and questions in the comments on my GitHub Repo.

Resources

This article is based on hands-on experimentation with n8n for ML use cases. Your mileage may vary depending on your specific requirements and infrastructure setup.