Sovrn Guide

AI-powered customer support platform with multi-agent orchestration and RAG architecture

Info

Client: Sovrn
Year: 2026

Tech Stack

Python

FastAPI

AWS Bedrock

Redis

Kubernetes

Docker

Sovrn Guide

A production-ready AI-powered customer support service built on multi-agent orchestration and Retrieval Augmented Generation (RAG). The platform uses an “agents as tools” pattern where a central orchestrator delegates to specialized agents—each focused on distinct capabilities like knowledge retrieval or account operations—delivering accurate, context-aware responses to customer queries.

Role: Lead Architect and Primary Developer

The Challenge

Customer support at scale requires balancing response quality with operational costs. Traditional approaches—either expensive fine-tuned models or brittle keyword-based systems—couldn’t meet Sovrn’s needs for accurate, up-to-date responses that could evolve with product changes.

The project required navigating technologies new to the organization: Python backend development, AWS Bedrock Knowledge Bases, the Strands Agents framework, and the Model Context Protocol (MCP). Each decision needed to balance capability against long-term maintainability.

Architecture

The system follows a layered architecture separating API concerns, agent orchestration, external service clients, and tool implementations:

┌─────────────────────────────────────────────────────────────┐
│                    API Layer (FastAPI)                      │
│  REST endpoints • JWT authentication • Request validation   │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                  Agent Layer (Strands)                      │
│  Orchestrator • Specialized agents • Tool routing           │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                     Client Layer                            │
│  Bedrock KB • Redis • S3/Athena Analytics                   │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                      Tool Layer                             │
│  RAG queries • URL encoding • MCP tools (extensible)        │
└─────────────────────────────────────────────────────────────┘

Technical Decisions

RAG Over Fine-Tuning

RAG enables incorporating up-to-date knowledge without expensive model retraining. When documentation changes, the knowledge base updates automatically—no model deployment required.
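As a rough illustration of the retrieve-then-generate flow, the sketch below uses a toy in-memory keyword retriever in place of a Bedrock Knowledge Base. The names Document, retrieve, and build_prompt are hypothetical, and word-overlap scoring is only a stand-in for real vector search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


def retrieve(query: str, docs: list[Document], top_k: int = 2) -> list[Document]:
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(doc.text.lower().split())), doc) for doc in docs
    ]
    # Keep only documents that share at least one word, best match first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]


def build_prompt(query: str, context: list[Document]) -> str:
    """Ground the model's answer in the retrieved passages."""
    context_block = "\n".join(f"[{d.doc_id}] {d.text}" for d in context)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {query}"


docs = [
    Document("billing", "Invoices are issued on the first day of each month"),
    Document("api", "API keys can be rotated from the account settings page"),
]
hits = retrieve("how do I rotate API keys", docs)
prompt = build_prompt("how do I rotate API keys", hits)
```

Adding or editing an entry in docs changes what the next query retrieves, with no change to the model itself, which is the property this section relies on.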

Multi-Model Strategy

I designed a cost-optimized approach using two models for different purposes:

Model              Purpose                                  Rationale
Claude 3.5 Sonnet  Agent reasoning and response generation  Superior reasoning for complex multi-turn conversations
Amazon Nova Pro    Knowledge base retrieval                 Balanced capability and cost for high-volume RAG queries
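One way to keep this two-model split explicit in code is a small lookup keyed by purpose. This is a minimal sketch: Purpose, ModelChoice, and model_for are hypothetical names, and the model_id strings are placeholders rather than real Bedrock identifiers.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    AGENT_REASONING = "agent_reasoning"
    KB_RETRIEVAL = "kb_retrieval"


@dataclass(frozen=True)
class ModelChoice:
    name: str
    model_id: str  # placeholder value; substitute the real Bedrock model ID


MODEL_BY_PURPOSE = {
    Purpose.AGENT_REASONING: ModelChoice("Claude 3.5 Sonnet", "<claude-3-5-sonnet-model-id>"),
    Purpose.KB_RETRIEVAL: ModelChoice("Amazon Nova Pro", "<nova-pro-model-id>"),
}


def model_for(purpose: Purpose) -> ModelChoice:
    """Return the model configured for a given purpose."""
    return MODEL_BY_PURPOSE[purpose]
```

Centralizing the mapping means a model swap for one purpose is a one-line change that does not touch the calling code.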

Multi-Agent Orchestration

The system implements an “agents as tools” pattern where a central orchestrator delegates to specialized agents based on user intent. Each agent is scoped to a specific domain—knowledge retrieval, account operations, or other capabilities—allowing the orchestrator to compose complex workflows while keeping individual agents focused and maintainable.
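The shape of the pattern can be sketched in plain Python. This is not the Strands Agents API: AgentTool, Orchestrator, and both agent functions are hypothetical, and simple keyword matching stands in for the LLM-driven tool selection a real orchestrator would perform.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentTool:
    """A specialized agent exposed to the orchestrator as a callable tool."""
    name: str
    keywords: frozenset[str]
    run: Callable[[str], str]


def knowledge_agent(query: str) -> str:
    return f"[knowledge] looked up docs for: {query}"


def account_agent(query: str) -> str:
    return f"[account] checked account for: {query}"


class Orchestrator:
    def __init__(self, tools: list[AgentTool], fallback: AgentTool) -> None:
        self._tools = tools
        self._fallback = fallback

    def select(self, query: str) -> AgentTool:
        # Keyword match is a stand-in; the real decision is made by the model.
        words = set(query.lower().split())
        for tool in self._tools:
            if words & tool.keywords:
                return tool
        return self._fallback

    def handle(self, query: str) -> str:
        return self.select(query).run(query)


KNOWLEDGE = AgentTool("knowledge", frozenset({"how", "docs"}), knowledge_agent)
ACCOUNT = AgentTool("account", frozenset({"account", "invoice", "payment"}), account_agent)
orchestrator = Orchestrator([ACCOUNT, KNOWLEDGE], fallback=KNOWLEDGE)
```

Because each agent is just a tool with a narrow contract, adding a new domain means registering one more AgentTool rather than changing the orchestrator.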

MCP Integration

The system integrates with a separate MCP server for extensible tool capabilities. New tools can be added without modifying the core service—agents dynamically discover and use available tools at runtime.
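Runtime discovery can be illustrated with an in-memory stand-in for the MCP server. The sketch assumes only that the server can list its tools and call one by name; ToolServer, InMemoryToolServer, and run_tool are hypothetical, and a real client would talk to the server over the protocol instead of holding it in process. The url_encode tool echoes the URL-encoding capability named in the Tool Layer.

```python
from typing import Callable, Protocol
from urllib.parse import quote


class ToolServer(Protocol):
    def list_tools(self) -> list[str]: ...
    def call_tool(self, name: str, arguments: dict) -> str: ...


class InMemoryToolServer:
    """Stand-in for a remote tool server (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def list_tools(self) -> list[str]:
        return sorted(self._tools)

    def call_tool(self, name: str, arguments: dict) -> str:
        return self._tools[name](**arguments)


def run_tool(server: ToolServer, name: str, arguments: dict) -> str:
    """Look tools up at call time, so newly added tools need no client change."""
    if name not in server.list_tools():
        return f"tool '{name}' is not available"
    return server.call_tool(name, arguments)


server = InMemoryToolServer()
before = run_tool(server, "url_encode", {"value": "a b&c"})
# Adding a tool on the server side is enough; run_tool itself is unchanged.
server.register("url_encode", lambda value: quote(value, safe=""))
after = run_tool(server, "url_encode", {"value": "a b&c"})
```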

Functional Error Handling

I implemented a Rust-inspired Result[T, E] type that makes error handling explicit throughout the codebase, eliminating silent failures and making error paths visible in the type system.
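A minimal sketch of what such a type can look like in Python follows. The project's actual definition is not shown in this write-up, so Ok, Err, parse_port, and describe are illustrative names only.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


# A Result is either a success carrying a value or a failure carrying an error.
Result = Union[Ok[T], Err[E]]


def parse_port(raw: str) -> Result[int, str]:
    """Return the failure as a value instead of raising."""
    try:
        port = int(raw)
    except ValueError:
        return Err(f"not a number: {raw!r}")
    if not 1 <= port <= 65535:
        return Err(f"out of range: {port}")
    return Ok(port)


def describe(result: Result[int, str]) -> str:
    # The caller has to branch on the variant; the error path cannot be skipped.
    if isinstance(result, Ok):
        return f"listening on {result.value}"
    return f"config error: {result.error}"
```

Because the return annotation names both outcomes, a type checker can flag callers that read the value without first handling the Err case.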

Tech Stack

Category        Technologies
Backend         Python 3.12, FastAPI, Hypercorn (ASGI)
AI/ML           AWS Bedrock (Knowledge Bases, Guardrails), Strands Agents, MCP
LLMs            Anthropic Claude 3.5 Sonnet, Amazon Nova Pro
State           Redis (indefinite persistence, 20-message sliding window)
Analytics       AWS S3 (Parquet), Athena
Auth            JWT with JWKS validation
Infrastructure  Kubernetes, Helm, Docker
CI/CD           GitHub Actions, Semantic Release
SDK             Auto-generated TypeScript client from OpenAPI
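The 20-message sliding window listed for conversation state can be sketched with a bounded deque. This is an in-memory illustration only: ConversationWindow is a hypothetical name, and the Redis comment describes one common approach, not necessarily the one used in the service.

```python
from collections import deque

WINDOW_SIZE = 20  # window length taken from the State entry


class ConversationWindow:
    """Keeps only the most recent messages of a conversation."""

    def __init__(self, max_messages: int = WINDOW_SIZE) -> None:
        # A deque with maxlen drops the oldest item on overflow. With a Redis
        # list, RPUSH followed by LTRIM key -20 -1 would have a similar effect
        # (an assumption about the implementation, not confirmed by the source).
        self._messages: deque[dict[str, str]] = deque(maxlen=max_messages)

    def append(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def history(self) -> list[dict[str, str]]:
        return list(self._messages)


window = ConversationWindow()
for i in range(25):
    window.append("user", f"message {i}")
```

After 25 appends the window holds messages 5 through 24, so the context sent to the model stays bounded however long the conversation runs.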

Observability

The service integrates OpenTelemetry with DataDog APM to trace the entire request flow—from API ingestion through agent reasoning to knowledge base retrieval and response generation. Every span is correlated, making it straightforward to diagnose latency issues or debug unexpected agent behavior in production.
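To show what span correlation means in practice, here is a stdlib-only sketch of nested spans sharing one trace ID. It is not the OpenTelemetry API: Span, span, RECORDED, and the span names are hypothetical, and a real service would use the OpenTelemetry SDK's tracer instead.

```python
import contextvars
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Span:
    name: str
    trace_id: str
    parent: Optional[str]


_current: contextvars.ContextVar[Optional[Span]] = contextvars.ContextVar(
    "current_span", default=None
)
RECORDED: list[Span] = []


@contextmanager
def span(name: str) -> Iterator[Span]:
    parent = _current.get()
    # Child spans inherit the trace ID, which is what correlates them.
    trace_id = parent.trace_id if parent else uuid.uuid4().hex
    current = Span(name, trace_id, parent.name if parent else None)
    token = _current.set(current)
    try:
        yield current
    finally:
        _current.reset(token)
        RECORDED.append(current)


def handle_request() -> None:
    with span("api.request"):
        with span("agent.reasoning"):
            with span("kb.retrieval"):
                pass
        with span("response.generation"):
            pass


handle_request()
```

Filtering recorded spans by trace ID reconstructs the full path of a single request, which is the basis for the latency and behavior debugging described here.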

Security and Compliance

Mandatory guardrails prevent inappropriate content in public-facing responses, and fail-fast validation ensures that they cannot be accidentally disabled. All queries and responses are logged to S3/Athena for audit and analysis.
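Fail-fast validation of this kind can be sketched as a configuration object that refuses to construct without guardrail settings. The class names and the GUARDRAIL_ID and GUARDRAIL_VERSION keys are assumptions for illustration, not the service's actual configuration.

```python
from dataclasses import dataclass


class ConfigError(ValueError):
    """Raised when required configuration is missing."""


@dataclass(frozen=True)
class GuardrailConfig:
    guardrail_id: str
    guardrail_version: str

    def __post_init__(self) -> None:
        # Validate on construction so a misconfigured service never starts.
        if not self.guardrail_id.strip():
            raise ConfigError("guardrail_id is required; refusing to start without guardrails")
        if not self.guardrail_version.strip():
            raise ConfigError("guardrail_version is required; refusing to start without guardrails")


def load_guardrail_config(env: dict[str, str]) -> GuardrailConfig:
    """Build the config from an environment-like mapping (key names are hypothetical)."""
    return GuardrailConfig(
        guardrail_id=env.get("GUARDRAIL_ID", ""),
        guardrail_version=env.get("GUARDRAIL_VERSION", ""),
    )


config = load_guardrail_config({"GUARDRAIL_ID": "gr-example", "GUARDRAIL_VERSION": "1"})

try:
    load_guardrail_config({})
    startup_error = ""
except ConfigError as exc:
    startup_error = str(exc)
```

Failing at construction time turns a silently unguarded deployment into a startup error that shows up immediately.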

Role and Impact

I owned every aspect of this project from architecture design through production deployment—100% of the codebase, DevOps configuration, and documentation.

Outcomes:

  • Zero critical incidents since launch
  • 70%+ test coverage enforced via CI
  • TypeScript SDK consumed by frontend applications
  • Patterns (Result type, client abstraction) adopted by other teams
  • Server-agnostic design enabling other teams to build support solutions using the same infrastructure
