BNP Paribas Virtual Assistant Chatbot
Building an intelligent banking assistant with RAG, LLMs, and production deployment on Azure
Introduction
The BNP Paribas Virtual Assistant Chatbot is a production-ready RAG (Retrieval-Augmented Generation) system that answers customer questions about BNP Paribas banking products and services. This project demonstrates a complete end-to-end implementation of a modern LLM-based application, from data collection to production deployment.
The system combines cutting-edge technologies including OpenAI GPT-4 for language understanding, FastAPI for backend infrastructure, Next.js for a responsive frontend, and Azure for scalable cloud deployment. What makes this project particularly interesting is the intelligent combination of web scraping, vector databases, and LLM orchestration to create a system that understands banking context and provides accurate, sourced answers.
Motivation & Problem Statement
When visiting the BNP Paribas website, customers are often overwhelmed with information. Finding answers to specific questions about banking products, services, or procedures requires navigating multiple pages or contacting customer support. This creates friction in the customer journey and puts unnecessary load on support teams.
The Challenge
- Information Overload: Too much data scattered across website pages
- Support Burden: Repetitive questions consuming support team resources
- Multilingual Need: Supporting both French and English queries
- Accuracy Requirement: Financial information must be reliable and sourced
The solution: A conversational AI assistant that understands BNP Paribas's entire knowledge base, retrieves relevant information instantly, and provides accurate, sourced answers in the customer's preferred language.
System Architecture
The system architecture follows a three-tier design: frontend user interface, backend API with RAG logic, and cloud infrastructure for scalability and reliability.

Memory Builder
- Extract & split documents
- Convert to embeddings
- Store in vector database
- Index for fast retrieval
RAG ChatBot
- Conversational QA
- Context retrieval
- Chat history tracking
- Answer generation
User Interface
- Real-time chat interaction
- Source attribution
- Multilingual support
- Responsive design
Technology Stack
Frontend
- Next.js 16 - React framework with App Router
- React 19 - Latest React with server components
- TypeScript - Type-safe development
- Tailwind CSS - Utility-first styling
- Framer Motion - Smooth animations
Backend & AI
- FastAPI - High-performance Python API
- OpenAI GPT-4 - Language model
- LangChain - RAG orchestration
- ChromaDB - Vector database
- HuggingFace Embeddings - Multilingual embeddings
Infrastructure
- Azure Container Instances - Backend hosting
- Azure Container Registry - Docker image management
- Azure File Share - Persistent storage
- Vercel - Frontend deployment
DevOps
- GitHub Actions - CI/CD pipeline
- Docker - Containerization
- Git - Version control
- Azure CLI - Cloud management
Implementation Details
1. Data Collection & Preprocessing
The system extracts publicly available information from BNP Paribas web pages through responsible web scraping. The collected documents are then processed and chunked into manageable pieces for embedding.
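As a rough illustration of this step, the sketch below fetches a page, strips navigation noise, and stores one record per URL. The URL list, record field names, and one-second crawl delay are illustrative assumptions, not the project's actual configuration.

```python
# Hypothetical collection script: fetch public pages, keep the readable
# text, and write one record per page to rag_documents.json.
import json
import time

import requests
from bs4 import BeautifulSoup

PAGES = [
    # Illustrative URL; the real crawl list is project-specific.
    "https://mabanque.bnpparibas/fr/notre-offre/comptes-cartes",
]

documents = []
for url in PAGES:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()  # drop navigation and script noise
    documents.append({"source": url, "text": soup.get_text(" ", strip=True)})
    time.sleep(1)  # polite crawl delay for responsible scraping

with open("rag_documents.json", "w", encoding="utf-8") as f:
    json.dump(documents, f, ensure_ascii=False)
```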
2. Vector Database Construction
# Vector Database Creation Process
1. Document Loading
   └─ Load from rag_documents.json
2. Text Splitting
   ├─ Chunk size: 1000 characters
   └─ Overlap: 200 characters
3. Embedding Generation
   ├─ Model: paraphrase-multilingual-MiniLM-L12-v2
   └─ Supports French & English
4. Vector Storage
   ├─ ChromaDB storage
   ├─ Semantic indexing
   └─ Fast retrieval enabled
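In code, this build step might look like the minimal sketch below, assuming rag_documents.json holds `text` and `source` fields and LangChain orchestrates the pipeline (exact import paths vary across LangChain versions):

```python
# Sketch of the vector database build: load, split, embed, persist.
import json

from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

with open("rag_documents.json", encoding="utf-8") as f:
    records = json.load(f)

docs = [
    Document(page_content=r["text"], metadata={"source": r["source"]})
    for r in records
]

# 1000-character chunks with 200-character overlap, as described above.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# Multilingual model so French and English queries share one vector space.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)
vectordb = Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")
```

3. RAG Pipeline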
When a user asks a question, the system (see the sketch after this list):
- Converts the question to embeddings
- Retrieves top-k most relevant document chunks
- Constructs a context window with retrieved documents
- Sends context + question to OpenAI GPT-4
- Returns AI-generated answer with source attribution
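One plausible wiring of these steps, reloading the ChromaDB store persisted in the previous step and retrieving the top 4 chunks per query; `return_source_documents=True` is what makes source attribution possible. This is a sketch under those assumptions, not the project's exact code:

```python
# Sketch of the query path: embed the question, retrieve context,
# and let GPT-4 generate a sourced answer.
from langchain.chains import ConversationalRetrievalChain
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)
vectordb = Chroma(persist_directory="chroma_db", embedding_function=embeddings)
retriever = vectordb.as_retriever(search_kwargs={"k": 4})  # top-k retrieval

chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model="gpt-4", temperature=0),
    retriever=retriever,
    return_source_documents=True,  # keep sources for attribution
)

result = chain.invoke({"question": "Comment ouvrir un compte ?", "chat_history": []})
print(result["answer"])
print({doc.metadata["source"] for doc in result["source_documents"]})
```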
4. Testing & Validation

Testing with Postman - Query to `/query` endpoint
The system was thoroughly tested with example queries in both French and English, demonstrating accurate information retrieval and answer generation.
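For context, this is roughly the shape of a `/query` endpoint that such Postman tests would exercise; the request and response fields here are assumptions rather than the project's actual schema:

```python
# Hypothetical FastAPI endpoint wrapping the RAG chain.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    question: str
    chat_history: list[tuple[str, str]] = []

@app.post("/query")
def query(req: QueryRequest) -> dict:
    # `chain` is the ConversationalRetrievalChain from the sketch above,
    # built once at application startup.
    result = chain.invoke(
        {"question": req.question, "chat_history": req.chat_history}
    )
    return {
        "answer": result["answer"],
        "sources": [doc.metadata["source"] for doc in result["source_documents"]],
    }
```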
Production Deployment
The application is deployed using a comprehensive CI/CD pipeline with automatic testing, building, and deployment to production.
Backend Deployment (Azure)
Infrastructure
- Service: Container Instances
- Region: East US
- Specs: 2 vCPU, 4GB RAM
- Registry: Azure Container Registry
- Status: Always available
CI/CD Pipeline
- Trigger: Push to main branch
- Build: Docker containerization
- Push: Azure Container Registry
- Deploy: Automatic container update
- Health: Automated checks
Frontend Deployment (Vercel)
- Platform: Vercel (Next.js optimized)
- URL: https://bnpparibasassistant.vercel.app/
- CDN: Global edge network
- Performance: Lighthouse 95+
GitHub Secrets & Environment
Sensitive information is securely managed through GitHub Secrets:
- AZURE_CREDENTIALS - Service Principal JSON
- OPENAI_API_KEY - GPT-4 API access
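At runtime these secrets reach the backend as environment variables injected into the container, so the code never touches credential files; a minimal sketch:

```python
# Read the injected secret at startup; fails fast if it is missing.
import os

openai_api_key = os.environ["OPENAI_API_KEY"]
```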
Results & Demonstrations
Live Chatbot Demo

Production chatbot interface ready for user queries
Example Query Response

Real API response showing source attribution and detailed answer
Performance Metrics
System Performance
- Average Response Time: < 2 seconds
- Vector DB Size: 56 chunks indexed
- Uptime: 99.9%
- Concurrent Users: scales horizontally with container instances
Frontend Metrics
- Lighthouse Performance: 95+
- First Contentful Paint: < 1.5s
- Mobile Friendly: Fully responsive
- Accessibility: WCAG compliant
Challenges & Solutions
🔴 Challenge 1: Information Quality
Ensuring the chatbot provides accurate financial information is critical.
Solution: Implemented source attribution - every answer includes references to official BNP Paribas documentation.
🔴 Challenge 2: Multilingual Support
Customers speak multiple languages and may ask questions in mixed languages.
Solution: Used multilingual embeddings (paraphrase-multilingual-MiniLM-L12-v2) that understand both French and English seamlessly.
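A quick way to verify this behavior, using `sentence-transformers` directly: the same question phrased in French and in English should land close together in the model's embedding space.

```python
# Sanity check: cross-lingual similarity with the multilingual model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
fr = model.encode("Comment ouvrir un compte bancaire ?")
en = model.encode("How do I open a bank account?")
print(util.cos_sim(fr, en))  # high score despite different languages
```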
🔴 Challenge 3: Context Window Management
RAG requires balancing relevant context against token limits.
Solution: Optimized chunk size (1000 chars) and implemented intelligent top-k retrieval (4 chunks).
🔴 Challenge 4: CI/CD Complexity
Managing secrets, builds, and deployments securely across multiple platforms.
Solution: Implemented comprehensive GitHub Actions pipeline with Azure integration and automated health checks.
Conclusion
The BNP Paribas Virtual Assistant Chatbot demonstrates a complete, production-ready implementation of modern LLM-based applications. By combining RAG techniques with cloud-native infrastructure and comprehensive CI/CD automation, we've created a system that's both powerful and maintainable.
This project showcases how to bridge the gap between research-grade AI models and real-world production systems, proving that sophisticated AI applications can be deployed efficiently with the right combination of tools and practices.
Key takeaways for anyone building similar systems: invest in robust data preprocessing, implement proper source attribution for trustworthiness, use multilingual embeddings for international reach, and automate your entire deployment pipeline for reliability and speed.