Building a Unified AI Gateway | Enterprise AI Harness Part 2

This is Part 2 of our 3-part deep dive into AI Harness Engineering. Read Part 1 here.

In Part 1, we established that a centralised AI Harness is critical for enterprise security, cost control, and preventing vendor lock-in. Now, it is time to build it.

In this guide, we will provide the foundational Python code for an “out-of-the-box” AI Harness that achieves two things:

Model Routing: It exposes a single generate_text function that abstracts away the differences between Anthropic’s Claude and Google’s Gemini SDKs.
Pre-processing Guardrails: It automatically scrubs PII (Personally Identifiable Information) before the prompt ever touches an external network.

The Model Agnostic Router

Different models excel at different tasks. You might want to use Claude Opus 4.6 for intricate, multi-step reasoning, and Gemini 3.5 Flash for rapid, high-volume classification.

To do this seamlessly, we define a common interface. First, ensure you have the required libraries installed:

pip install anthropic google-generativeai presidio-analyzer presidio-anonymizer

Next, we build our wrapper:

import os
import anthropic
import google.generativeai as genai

class AIHarnessRouter:
    def __init__(self):
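        # Initialise the Anthropic client; a missing key raises KeyError at startup
        # instead of failing later on the first request
        self.anthropic_client = anthropic.Anthropic(
            api_key=os.environ["ANTHROPIC_API_KEY"]
        )

        # Initialise the Google client (this is module-level configuration)
        genai.configure(api_key=os.environ["GEMINI_API_KEY"])
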
    def generate_text(self, prompt: str, model_id: str) -> str:
        """
        A unified interface to generate text across multiple model providers.
        """
        if model_id.startswith("claude"):
            return self._call_claude(prompt, model_id)
        elif model_id.startswith("gemini"):
            return self._call_gemini(prompt, model_id)
        else:
            raise ValueError(f"Unsupported model: {model_id}")

    def _call_claude(self, prompt: str, model_id: str) -> str:
        response = self.anthropic_client.messages.create(
            model=model_id, # e.g., "claude-4-6-opus-20260224"
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.content[0].text

    def _call_gemini(self, prompt: str, model_id: str) -> str:
        model = genai.GenerativeModel(model_id) # e.g., "gemini-3.5-flash"
        response = model.generate_content(prompt)
        return response.text

With this router, your application code no longer needs to import vendor-specific SDKs. Changing from Claude to Gemini is simply a matter of updating a configuration string.
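
One way to keep that configuration string out of application code is a small task-to-model lookup with an optional environment override. The sketch below is illustrative only: `DEFAULT_MODEL_BY_TASK`, `resolve_model`, and the `AI_HARNESS_MODEL_<TASK>` variable names are hypothetical, and the model IDs are simply the examples used in this article.

```python
import os

# Hypothetical task-to-model mapping; the IDs are the examples from this article.
DEFAULT_MODEL_BY_TASK = {
    "reasoning": "claude-4-6-opus-20260224",
    "classification": "gemini-3.5-flash",
}


def resolve_model(task: str, env=None) -> str:
    """Return the model ID for a task, allowing an environment override.

    The override variable name (AI_HARNESS_MODEL_<TASK>) is an assumption
    made for this sketch, not an established convention.
    """
    env = os.environ if env is None else env
    if task not in DEFAULT_MODEL_BY_TASK:
        raise KeyError(f"No model configured for task: {task}")
    override = env.get(f"AI_HARNESS_MODEL_{task.upper()}")
    return override or DEFAULT_MODEL_BY_TASK[task]
```

Application code could then call `router.generate_text(prompt, resolve_model("classification"))`, so switching providers for a task becomes a deployment setting rather than a code change.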

Implementing Pre-Processing Guardrails

A secure AI Harness must protect enterprise data. We cannot rely on LLM providers to scrub our data for us.

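We will use Microsoft’s Presidio, an open-source library for PII detection, to build a GuardrailMiddleware. Presidio’s default analyzer relies on a spaCy English language model (en_core_web_lg by default), so make sure that model is available in your environment before the first run.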

from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

class GuardrailMiddleware:
    def __init__(self):
        self.analyzer = AnalyzerEngine()
        self.anonymizer = AnonymizerEngine()

    def scrub_prompt(self, text: str) -> str:
        """
        Detects and masks sensitive entities like email addresses, 
        phone numbers, and credit card details.
        """
        # Analyse text for PII
        results = self.analyzer.analyze(
            text=text,
            entities=["EMAIL_ADDRESS", "PHONE_NUMBER", "CREDIT_CARD"],
            language='en'
        )
        
        # Anonymise the findings
        anonymized_result = self.anonymizer.anonymize(
            text=text, 
            analyzer_results=results
        )
        
        return anonymized_result.text
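
Automated PII detection can miss entities, so it may be worth adding a cheap second check after scrubbing. The sketch below uses only the standard library and is not part of Presidio: `assert_no_residual_email`, `ResidualPIIError`, and the deliberately simple regular expression are all hypothetical names and choices made for illustration.

```python
import re

# Deliberately simple email pattern; an assumption for this sketch, not a
# complete or standards-compliant email matcher.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ResidualPIIError(ValueError):
    """Raised when scrubbed text still appears to contain an email address."""


def assert_no_residual_email(text: str) -> str:
    """Return the text unchanged, or raise if an email-like string remains."""
    if EMAIL_PATTERN.search(text):
        raise ResidualPIIError("Scrubbed prompt still contains an email-like string")
    return text
```

A harness could call `assert_no_residual_email(scrubbed_text)` on the output of `scrub_prompt` and refuse to send the request if the check fails.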

The Complete Enterprise Harness

Now, we combine the Guardrail Middleware with the Router. The application calls the Harness, the Harness sanitises the prompt, and then routes it to the requested model.

class EnterpriseAIHarness:
    def __init__(self):
        self.router = AIHarnessRouter()
        self.guardrails = GuardrailMiddleware()

    def execute(self, prompt: str, model_id: str = "gemini-3.5-flash") -> str:
        # Do not log the raw prompt: it may contain the PII we are about to scrub.
        
        # 1. Apply Guardrails
        safe_prompt = self.guardrails.scrub_prompt(prompt)
        print(f"Sanitised Prompt: {safe_prompt}")
        
        # 2. Route to Model
        response = self.router.generate_text(safe_prompt, model_id)
        
        return response

# Usage Example:
harness = EnterpriseAIHarness()
unsafe_request = "Summarise the account activity for john.doe@example.com."
response = harness.execute(unsafe_request, model_id="claude-4-6-opus-20260224")

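When this runs, john.doe@example.com is replaced with <EMAIL_ADDRESS> before the request leaves your network, so the model provider never receives the customer’s email address. Only the entity types listed in scrub_prompt (email addresses, phone numbers, and credit card numbers) are masked; extend the entities list to cover other kinds of PII.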
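
The scrub-then-route order can be checked without network access or API keys by injecting the two steps as plain callables. This is a minimal sketch under stated assumptions: `InjectableHarness`, `fake_scrub`, and `fake_generate` are hypothetical names introduced here, and the stubs stand in for `GuardrailMiddleware.scrub_prompt` and `AIHarnessRouter.generate_text`.

```python
from typing import Callable, List, Tuple


class InjectableHarness:
    """Hypothetical variant of the harness that accepts its two steps as callables."""

    def __init__(
        self,
        scrub: Callable[[str], str],
        generate: Callable[[str, str], str],
    ):
        self._scrub = scrub
        self._generate = generate

    def execute(self, prompt: str, model_id: str) -> str:
        # 1. Apply guardrails first, 2. route only the sanitised prompt.
        safe_prompt = self._scrub(prompt)
        return self._generate(safe_prompt, model_id)


# Stubs for testing: record what the "model" receives instead of calling an API.
seen: List[Tuple[str, str]] = []


def fake_scrub(text: str) -> str:
    return text.replace("john.doe@example.com", "<EMAIL_ADDRESS>")


def fake_generate(prompt: str, model_id: str) -> str:
    seen.append((prompt, model_id))
    return "ok"


harness = InjectableHarness(fake_scrub, fake_generate)
result = harness.execute(
    "Summarise the account activity for john.doe@example.com.",
    "gemini-3.5-flash",
)
```

In production the same class could be constructed with `GuardrailMiddleware().scrub_prompt` and `AIHarnessRouter().generate_text`, while tests keep using the stubs.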

Next Steps

In Part 3, we will build the final pillar of our harness: Continuous Evaluation. We will demonstrate how to use Gemini 3.1 Pro as an automated “LLM-as-a-Judge” to score the quality of your prompt outputs over time.

Why Alps Agility?

At Alps Agility, we combine deep AI expertise with advanced engineering to help you implement autonomous agents that cut costs and improve operational efficiency.

Contact us today to start transforming your enterprise with Agentic AI.

