How we built a production-grade AI Agent that handles sensitive financial data without ever letting it leave the private network, using Azure OpenAI, Vector Search, and Semantic Kernel.
The Challenge: The Paradox of AI & Privacy
A global financial institution wanted to empower 5,000+ relationship managers with a GenAI "Knowledge Assistant." The goal was to query thousands of complex policy documents and internal reports.
The Blocker: Data Residency and Zero-Trust requirements. Public LLM endpoints were a non-starter.
Critical Requirements
- **Zero Data Leakage:** All traffic via Private Link.
- **RBAC Enforcement:** LLM can only "see" documents the user has access to.
- **Low Latency:** < 2s responses for real-time client meetings.
The Solution: The "Secure RAG" Architecture
We designed a Retrieval-Augmented Generation (RAG) pattern anchored on **Azure OpenAI Service** and **Azure AI Search**.
[OK] Storage: Private Endpoint Active
[OK] Search Index: Encrypted with CMK
1. The Data Ingestion Pipeline
Using Azure Data Factory, we orchestrated an automated pipeline that ingests PDFs from SharePoint, chunks them into semantic fragments, and generates embeddings using `text-embedding-3-large`.
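To make the chunking step concrete, here is a minimal sketch in Python. The article does not say how the semantic fragments were produced, so the paragraph-packing strategy, the `chunk_paragraphs` name, and the `max_chars` limit are all assumptions for illustration, not the pipeline's actual logic. Each returned chunk would then be sent to the embedding model.

```python
from typing import List


def chunk_paragraphs(text: str, max_chars: int = 1000) -> List[str]:
    """Pack whole paragraphs into chunks of at most max_chars characters.

    Hypothetical helper: paragraphs are kept intact where possible so each
    chunk stays topically coherent before it is embedded.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue  # skip blank paragraphs left over from extraction
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single oversized paragraph is split on hard character boundaries.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks
```

A production pipeline would more likely split on token counts and document structure (headings, tables) than on raw character length, but the packing idea is the same.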
2. Identity-First Retrieval
The core innovation was the **Security Filter Integration**. During the vector search phase, we pass the user's OIDC token. The search index applies a filter based on the document's ACL (Access Control List), ensuring the LLM never sees unauthorized context.
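One common way to implement this kind of security trimming in Azure AI Search is to store the allowed group IDs on each indexed chunk and attach an OData filter to every query. The sketch below builds such a filter from the group IDs found in an already validated token. The field name `group_ids`, the helper name `build_acl_filter`, and the assumption that group IDs are GUIDs are illustrative and not taken from the project described here.

```python
import uuid
from typing import Iterable


def build_acl_filter(group_ids: Iterable[str], field: str = "group_ids") -> str:
    """Build an OData filter that matches documents shared with any given group.

    Assumes each indexed chunk carries a collection field (here `group_ids`)
    listing the groups allowed to read it, and that IDs are GUIDs.
    """
    cleaned = set()
    for raw in group_ids:
        # uuid.UUID raises ValueError on anything that is not a GUID, which
        # also blocks quotes or commas from being injected into the filter.
        cleaned.add(str(uuid.UUID(raw.strip())))
    if not cleaned:
        # Fail closed: a user with no groups must not run an unfiltered query.
        raise PermissionError("user has no group memberships to filter on")
    joined = ",".join(sorted(cleaned))
    return f"{field}/any(g: search.in(g, '{joined}', ','))"
```

The resulting string would be passed as the `filter` argument of the search request alongside the vector query, so the index drops unauthorized chunks before anything reaches the prompt. Validating the IDs and failing closed on an empty group list keeps a malformed or empty claim from silently widening access.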
Impact & Business Value
Reduction in Document Search Time
Data Residency Compliance
Prompt Injection Incidents