IMPLEMENTATION GUIDE

Document Intelligence Copilot:
Complete Implementation Guide

Production-grade setup with Application Gateway for Containers, cost analysis, migration strategies, and operational excellence

January 18, 2026 | 25 min read | Azure AI, AKS, AGC
Master Class Series
Part 1: Architecture (RAG Pattern & Azure Doc Intelligence)
Part 2: Implementation (Step-by-Step Production Guide)
Part 3: Go-Live (Well-Architected Playbook)
The Kubernetes Ingress landscape is failing.

The community ingress-nginx controller is being retired. The AKS Application Routing Add-on has an expiration date. If you are building for 2026 on legacy ingress controllers, you are building technical debt. It is time to standardize on Application Gateway for Containers (AGC).

Prerequisites: This guide assumes you've read the architecture overview. This is the deep-dive implementation guide with complete setup steps, cost breakdowns, and migration strategies.

Table of Contents

  1. Architecture Evolution: Why Application Gateway for Containers
  2. Complete AGC Setup Guide
  3. Cost Analysis: $1K to $19K/Month Scenarios
  4. 7-Week Migration Guide (NGINX → AGC)
  5. Monitoring & Observability
  6. Troubleshooting Runbooks
  7. Testing Strategies
  8. Complete CI/CD Pipeline
  9. YouTube Videos & Microsoft Learn References

1. Architecture Evolution: Why Application Gateway for Containers

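In November 2025, the Kubernetes project announced that the community ingress-nginx controller will be retired in March 2026. After that date it receives no further releases, bug fixes, or security patches. This affects a large share of production Kubernetes deployments worldwide.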

Timeline

2024 ────────── 2025 ────────── 2026 ────────── 2027+
  │                │                │                │
  ├─ AGC GA        ├─ NGINX         ├─ NGINX        ├─ Full
  │  (Available)   │  best-effort   │  Retirement   │  Migration
  │                │  maintenance   │  (March)      │  Complete
  │                │                ├─ Gateway API  │
  │                │                │  (Istio)      │
  │                │                │  Available    │

Microsoft's Strategic Direction

Microsoft is investing in two future-proof solutions:

1. Application Gateway for Containers (AGC) - Available Now

2. Gateway API with Istio - Coming H1 2026

This guide uses Application Gateway for Containers (AGC) as the long-term, production-ready solution.


2. Complete AGC Setup Guide

Follow these steps to deploy Application Gateway for Containers for your RAG Copilot.

Prerequisites

# Azure CLI version 2.50.0 or later
az version

# AKS cluster with managed identity
az aks show -g myResourceGroup -n myAKSCluster --query identity

# Required permissions
az role assignment list --assignee <your-identity> --scope <aks-resource-id>

Required Roles:

Step 1: Enable the AGC Add-on

# Register the feature (if not already registered)
az feature register \
  --namespace Microsoft.ContainerService \
  --name AKS-ExtensionManager

# Wait for registration (check status)
az feature show \
  --namespace Microsoft.ContainerService \
  --name AKS-ExtensionManager

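# Enable the add-on (add-on names vary by CLI version; confirm the exact
# name with `az aks enable-addons --help` before running)
az aks enable-addons \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --addons azure-application-gateway-for-containers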

# Verify installation
kubectl get pods -n azure-alb-system

Expected Output:

NAME                                  READY   STATUS    RESTARTS   AGE
alb-controller-7d4b8c9f5d-x7k2m      1/1     Running   0          2m
alb-controller-bootstrap-xyz123      0/1     Completed 0          3m

Step 2: Deploy the Traffic Controller (Bicep)

Create the AGC infrastructure using Bicep:

@description('Name of the Application Gateway for Containers')
param agcName string

@description('Location for all resources')
param location string = resourceGroup().location

@description('Subnet ID for AGC')
param subnetId string

// Application Gateway for Containers (Traffic Controller)
resource trafficController 'Microsoft.ServiceNetworking/trafficControllers@2023-11-01' = {
  name: agcName
  location: location
  properties: {}
}

// Subnet association. This is a child resource; the parent's
// associations list is read-only and cannot be set inline.
resource association 'Microsoft.ServiceNetworking/trafficControllers/associations@2023-11-01' = {
  parent: trafficController
  name: '${agcName}-association'
  location: location
  properties: {
    associationType: 'subnets'
    subnet: {
      id: subnetId
    }
  }
}

// Frontend configuration. Azure generates the frontend FQDN (it is read-only),
// so point api.mycompany.com at it with a DNS CNAME record.
resource frontend 'Microsoft.ServiceNetworking/trafficControllers/frontends@2023-11-01' = {
  parent: trafficController
  name: '${agcName}-frontend'
  location: location
  properties: {}
}

output trafficControllerId string = trafficController.id
output frontendFqdn string = frontend.properties.fqdn

Deploy:

az deployment group create \
  --resource-group myResourceGroup \
  --template-file infra/modules/agc.bicep \
  --parameters agcName=agc-rag-copilot \
               subnetId=/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Network/virtualNetworks/{vnet}/subnets/agc-subnet

Step 3: Configure Gateway API Resources

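apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: rag-gateway
  namespace: ai-workloads
  annotations:
    # Resource ID of the traffic controller (trafficControllerId output from Step 2)
    alb.networking.azure.io/alb-id: <traffic-controller-resource-id>
spec:
  gatewayClassName: azure-alb-external
  listeners:
  - name: https-listener
    protocol: HTTPS
    port: 443
    hostname: "api.mycompany.com"
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: tls-cert-secret
  addresses:
  # Binds this Gateway to the frontend created in Step 2
  - type: alb.networking.azure.io/alb-frontend
    value: agc-rag-copilot-frontend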

Step 4: Configure HTTPRoute for Path-Based Routing

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: chat-route
  namespace: ai-workloads
spec:
  parentRefs:
  - name: rag-gateway
  hostnames:
  - "api.mycompany.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1/chat
      method: POST
    backendRefs:
    - name: rag-orchestrator-service
      port: 8000

📋 The Index Schema (Minimum Viable)

You cannot implement security trimming without the right fields. Here is the JSON definition for your Azure AI Search index:

{
  "name": "rag-index-v1",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "content", "type": "Edm.String", "searchable": true },
    { "name": "embedding", "type": "Collection(Edm.Single)", "searchable": true, "dimensions": 1536, "vectorSearchProfile": "my-profile" },
    { "name": "metadata_storage_path", "type": "Edm.String" },
    { "name": "page_number", "type": "Edm.Int32" },
    { "name": "group_ids", "type": "Collection(Edm.String)", "filterable": true },
    { "name": "classification", "type": "Edm.String", "filterable": true }
  ]
}
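To make the schema concrete, here is a minimal sketch of one document shaped to match it. The helper name `build_index_document` and the hash-based ID scheme are assumptions for illustration, not part of the original design; only the field names come from the schema above.

```python
import hashlib

EMBEDDING_DIMENSIONS = 1536  # must match "dimensions" in the index schema


def build_index_document(content, embedding, source_path, page_number,
                         group_ids, classification):
    """Return a dict whose keys match the fields of rag-index-v1."""
    if len(embedding) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(embedding)}"
        )
    if not group_ids:
        # A chunk without groups could never match the security filter.
        raise ValueError("group_ids must not be empty")
    # Hypothetical ID scheme: a hash of path, page, and content, so that
    # re-ingesting the same chunk overwrites it instead of duplicating it.
    raw_key = f"{source_path}|{page_number}|{content}".encode("utf-8")
    return {
        "id": hashlib.sha256(raw_key).hexdigest(),
        "content": content,
        "embedding": list(embedding),
        "metadata_storage_path": source_path,
        "page_number": page_number,
        "group_ids": list(group_ids),
        "classification": classification,
    }
```

Rejecting documents with an empty `group_ids` at build time is a deliberate choice in this sketch: such a chunk would be indexed but invisible to every filtered query.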

🔒 Step 4b: The Critical Security Filter

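This single line is the difference between a prototype and production. Your RAG orchestrator MUST inject this filter into every AI Search query, replacing the placeholder with the caller's group IDs as a comma-separated list (resolved server-side from the user's identity, never taken from client input):

// The filter every query MUST have
{
  "search": "user question",
  "filter": "group_ids/any(g: search.in(g, 'user_department_id'))"
}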
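The filter string can be assembled in the orchestrator along these lines. This is a sketch, not the guide's actual orchestrator code: the function name `build_group_filter` is hypothetical, and it passes an explicit `','` delimiter as the third argument to `search.in` so that group IDs containing spaces are not split.

```python
def build_group_filter(group_ids):
    """Build the OData security-trimming filter for the caller's groups."""
    cleaned = [g.strip() for g in group_ids if g and g.strip()]
    if not cleaned:
        # Fail closed: never run an unfiltered query.
        raise ValueError("refusing to query without at least one group ID")
    for group_id in cleaned:
        # A comma would split the ID; a quote would break out of the literal.
        if "," in group_id or "'" in group_id:
            raise ValueError(f"unsupported character in group ID: {group_id!r}")
    joined = ",".join(cleaned)
    return f"group_ids/any(g: search.in(g, '{joined}', ','))"


def build_search_request(question, group_ids):
    """Attach the mandatory filter to every search payload."""
    return {"search": question, "filter": build_group_filter(group_ids)}
```

Raising on an empty group list, instead of sending the query without a filter, keeps the failure mode safe: a user with no resolved groups sees an error, not every document.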

Step 5: The Ingestion Pipeline (The Missing Link)

Most guides skip this, but it is half the work. Networking is plumbing; data is gold. Here is the robust path from PDF to index:

  1. Extract (Layout Model): Do not use simple OCR. Use the prebuilt-layout model in Document Intelligence to identify paragraphs, tables, and headings. Keep the page_number metadata for every span so citations can point back to the source page.
  2. Chunk Strategy: Split on semantic paragraph boundaries and never break a sentence. Overlap adjacent chunks by about 50 tokens.
  3. Embed: Pass chunks to text-embedding-ada-002 to get the 1536-dimensional vector.
  4. Index: Push the JSON payload (Content + Vector + group_ids) to Azure AI Search.

Dev Tip: Run this pipeline efficiently using an Azure Function with an Event Grid trigger on the ADLS `landing/` container.
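The chunking step above can be sketched as follows. Several things here are assumptions for illustration: tokens are approximated by whitespace-separated words (a real pipeline would count with the embedding model's tokenizer), sentence splitting uses a simple regex, and the names `split_sentences` and `chunk_paragraphs` and the 400-token default are hypothetical.

```python
import re


def split_sentences(paragraph):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def _token_count(items):
    # Approximation: one whitespace-separated word counts as one token.
    return sum(len(sentence.split()) for _, sentence in items)


def chunk_paragraphs(paragraphs, max_tokens=400, overlap_tokens=50):
    """Group whole sentences into chunks with sentence-level overlap.

    paragraphs: list of (page_number, text) tuples from the layout model.
    Returns a list of {"page_number", "content"} dicts; page_number is the
    page of the first sentence in the chunk.
    """
    sentences = [(page, s) for page, text in paragraphs
                 for s in split_sentences(text)]
    chunks, current = [], []
    for item in sentences:
        if current and _token_count(current + [item]) > max_tokens:
            chunks.append(current)
            # Carry over trailing whole sentences, up to overlap_tokens.
            overlap = []
            for previous in reversed(current):
                if _token_count([previous] + overlap) > overlap_tokens:
                    break
                overlap.insert(0, previous)
            current = overlap
        current.append(item)
    if current:
        chunks.append(current)
    return [{"page_number": chunk[0][0],
             "content": " ".join(s for _, s in chunk)} for chunk in chunks]
```

Because sentences are never split, a single sentence longer than `max_tokens` produces an oversized chunk, and the overlap is "up to" 50 tokens, not exactly 50.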

Step 6: Enable WAF Protection

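apiVersion: alb.networking.azure.io/v1
kind: WebApplicationFirewallPolicy
metadata:
  name: waf-policy
  namespace: ai-workloads
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: rag-gateway
    namespace: ai-workloads
  webApplicationFirewall:
    # Resource ID of an Azure WAF policy. Prevention mode and the managed
    # rule set are configured on that policy, not in this manifest.
    id: /subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Network/applicationGatewayWebApplicationFirewallPolicies/{waf-policy-name}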

3. Cost Analysis: What Will This Actually Cost?

*Estimates based on East US pricing (Jan 2026). Costs may vary by region.

Let's break down the total cost of ownership with real numbers across three deployment scenarios.

Scenario 1: Small Deployment (100 users, 10k queries/month)

Component            Cost/Month
AGC                  $92
AKS (2 nodes)        $385
APIM Basic           $150
OpenAI (GPT-3.5)     $150
AI Search Basic      $75
Doc Intelligence     $15
Networking           $100
Storage & Logs       $100
TOTAL                $1,067/month

Scenario 2: Medium Deployment (1,000 users, 100k queries/month)

Component            Cost/Month
AGC                  $150
AKS (5 nodes)        $770
APIM Standard        $750
OpenAI (GPT-4)       $1,400
AI Search S1         $250
Doc Intelligence     $150
Networking           $900
Storage & Logs       $350
TOTAL                $4,720/month

Scenario 3: Enterprise Deployment (10,000 users, 1M queries/month)

Component            Cost/Month
AGC                  $500
AKS (15 nodes)       $2,310
APIM Premium         $3,000
OpenAI (GPT-4 + PTU) $8,000
AI Search S2         $1,000
Doc Intelligence     $1,500
Networking           $1,500
Storage & Logs       $1,000
TOTAL                $18,810/month
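To compare the three scenarios on equal terms, it helps to derive cost per query from the figures above. The short script below is an illustrative sketch: the dictionary layout and function names are assumptions, while the dollar amounts and query volumes are copied from the tables.

```python
SCENARIOS = {
    "small": {
        "queries_per_month": 10_000,
        "costs": {"AGC": 92, "AKS": 385, "APIM": 150, "OpenAI": 150,
                  "AI Search": 75, "Doc Intelligence": 15,
                  "Networking": 100, "Storage & Logs": 100},
    },
    "medium": {
        "queries_per_month": 100_000,
        "costs": {"AGC": 150, "AKS": 770, "APIM": 750, "OpenAI": 1400,
                  "AI Search": 250, "Doc Intelligence": 150,
                  "Networking": 900, "Storage & Logs": 350},
    },
    "enterprise": {
        "queries_per_month": 1_000_000,
        "costs": {"AGC": 500, "AKS": 2310, "APIM": 3000, "OpenAI": 8000,
                  "AI Search": 1000, "Doc Intelligence": 1500,
                  "Networking": 1500, "Storage & Logs": 1000},
    },
}


def monthly_total(scenario):
    """Sum all component costs for one scenario, in USD per month."""
    return sum(SCENARIOS[scenario]["costs"].values())


def cost_per_query(scenario):
    """Monthly total divided by the scenario's monthly query volume."""
    return monthly_total(scenario) / SCENARIOS[scenario]["queries_per_month"]


if __name__ == "__main__":
    for name in SCENARIOS:
        print(f"{name}: ${monthly_total(name):,}/month, "
              f"${cost_per_query(name):.4f}/query")
```

The totals reproduce the tables ($1,067, $4,720, and $18,810), and the per-query cost falls from roughly $0.107 to $0.047 to $0.019 as volume grows, since fixed infrastructure is spread over more queries.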

💡 Cost Optimization Strategies


4. 7-Week Migration Guide (NGINX → AGC)

This parallel deployment strategy ensures zero downtime during migration.

Week 1: Preparation

Week 2: Deploy AGC

Week 3: Canary Deployment

Weeks 4-6: Gradual Migration

Week 7: Decommission NGINX

🔧 Rollback Procedure

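If issues occur, immediately switch DNS back to NGINX, then monitor the traffic shift. Allow 5-15 minutes for DNS propagation; the actual delay depends on the record's TTL, so lower the TTL before cutover to keep rollback fast.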


5. Monitoring & Observability

Key Metrics to Monitor

// AGC Capacity Utilization
AzureMetrics
| where ResourceProvider == "MICROSOFT.SERVICENETWORKING"
| where MetricName == "CapacityUnits"
| summarize avg(Average), max(Maximum) by bin(TimeGenerated, 5m)

// OpenAI Token Usage
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| extend tokens = toint(parse_json(properties_s).usage.total_tokens)
| summarize sum(tokens) by bin(TimeGenerated, 1h)

Critical Alerts


6. Troubleshooting Runbooks

Runbook 1: 403 Forbidden Errors (Private DNS Issue)

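Root Cause: Private DNS zones are not linked to the AKS VNet, so the service hostname resolves to its public endpoint, which rejects the request when public access is disabled.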

Solution:

az network private-dns link vnet create \
  --resource-group myResourceGroup \
  --zone-name privatelink.openai.azure.com \
  --name aks-vnet-link \
  --registration-enabled false \
  --virtual-network /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Network/virtualNetworks/aks-vnet

Runbook 2: 429 Too Many Requests (Quota Exhaustion)

Immediate Fix: Increase TPM quota if available

Long-term Solution: Implement APIM retry policy and semantic caching
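Alongside the APIM retry policy, the orchestrator itself can back off on 429 responses. The following is a minimal client-side sketch, not the APIM policy: `RateLimitError` and `call_with_backoff` are hypothetical names, and the caller is assumed to translate an HTTP 429 (and its Retry-After header, if present) into that exception.

```python
import random
import time


class RateLimitError(Exception):
    """Raised by the caller's HTTP layer when the service returns 429."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after  # seconds, from Retry-After if present


def call_with_backoff(send_request, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call send_request(), retrying on RateLimitError.

    Honors the server's Retry-After hint when given; otherwise waits a
    random time up to an exponentially growing cap (full jitter).
    """
    for attempt in range(max_attempts):
        try:
            return send_request()
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                cap = min(max_delay, base_delay * 2 ** attempt)
                delay = random.uniform(0, cap)
            sleep(delay)
```

The `sleep` parameter is injectable so the retry logic can be unit-tested without real waiting. Jitter spreads retries from many pods over time so they do not all hit the quota again at the same instant.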

Runbook 3: Slow Query Performance

Solution: Reduce AI Search top results from 50 to 10, implement parallel processing
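One reading of "parallel processing" is to issue independent retrieval calls concurrently instead of one after another. The sketch below assumes the orchestrator has a blocking function `search_fn(query, top)` that returns a list of hits; that function and the name `retrieve_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_parallel(sub_queries, search_fn, top=10, max_workers=4):
    """Run one search per sub-query concurrently.

    Returns a dict mapping each sub-query to its list of hits. Threads
    suit this workload because each call spends its time waiting on I/O.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with queries.
        results = list(pool.map(lambda q: search_fn(q, top), sub_queries))
    return dict(zip(sub_queries, results))
```

With `top=10` as the default, this also applies the smaller result count recommended above, so total latency approaches that of the slowest single search.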


7. Testing Strategies

Unit Testing

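# `orchestrator` is a pytest fixture that provides the RAG orchestrator under test
def test_citation_extraction(orchestrator):
    # Minimal chat-completion response containing one inline citation
    mock_response = {
        "choices": [{
            "message": {
                "content": "The refund policy is... [doc1.pdf, page 5]"
            }
        }]
    }
    citations = orchestrator.extract_citations(mock_response)
    # Exactly one citation should be parsed, pointing at the cited document
    assert len(citations) == 1
    assert citations[0].document == "doc1.pdf"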
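For context, here is one way the function under test could work. It is a sketch under assumptions: the real orchestrator is not shown in this guide, so the `Citation` type, the regex, and the standalone `extract_citations` function are illustrative, inferred from the `[doc1.pdf, page 5]` format in the mock response.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    document: str
    page: int


# Matches inline citations of the form "[<document>, page <number>]".
CITATION_PATTERN = re.compile(r"\[([^\[\],]+?),\s*page\s+(\d+)\]", re.IGNORECASE)


def extract_citations(response):
    """Parse citations out of the first choice of a chat-completion response."""
    content = response["choices"][0]["message"]["content"]
    return [
        Citation(document=match.group(1).strip(), page=int(match.group(2)))
        for match in CITATION_PATTERN.finditer(content)
    ]
```

Carrying the page number in the result ties back to the `page_number` field kept during ingestion, which lets the UI link each answer to its source page.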

Load Testing

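# Test with 100 concurrent users (--headless runs without the web UI,
# which --run-time needs in order to start and stop automatically)
locust -f tests/load/locustfile.py \
  --headless \
  --host https://api.mycompany.com \
  --users 100 \
  --spawn-rate 10 \
  --run-time 10m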

8. Complete CI/CD Pipeline

7-stage GitHub Actions workflow for production deployment:

  1. Validate Infrastructure: Bicep validation and what-if analysis
  2. Security Scanning: Trivy, Checkov, APIM policy validation
  3. Build and Test: Unit tests, Docker build, image scanning
  4. Deploy Infrastructure: Bicep deployment to Azure
  5. Deploy Application: Push to ACR, deploy to AKS
  6. Smoke Tests: Health checks, chat endpoint validation
  7. Rollback: Automatic rollback on failure

9. YouTube Videos & Microsoft Learn References

Essential YouTube Videos

Microsoft Learn References

Azure Architecture Center

Application Gateway for Containers

Azure AI Services


Summary

This implementation guide provides everything you need to deploy a production-grade Document Intelligence Copilot using Application Gateway for Containers. You now have:
