Unified Data Source Priority System

🎯 Updated Priority Order

The system now uses a cost-optimized fallback chain, checking cheaper/faster sources first:

Priority Chain (First to Last)

1. FAQ Database
   ├─ Source: Database
   ├─ Cost: FREE (just DB query)
   ├─ Speed: FASTEST (~1ms)
   └─ Use: Exact/similar FAQ matches

2. Website Scraped Content
   ├─ Source: Database
   ├─ Cost: FREE (just DB query + text search)
   ├─ Speed: FAST (~5ms)
   └─ Use: Business-specific content

3. Industry Knowledge Base
   ├─ Source: MedQuAD, CourseQ, STACKED datasets
   ├─ Cost: FREE (local data)
   ├─ Speed: FAST (~10ms)
   └─ Use: Healthcare/Education queries

4. Unanswered History
   ├─ Source: Database (learning from past)
   ├─ Cost: FREE
   ├─ Speed: FAST (~5ms)
   └─ Use: Similar previously asked questions

5. Small LLM (Your existing system)
   ├─ Source: OpenAI/local model
   ├─ Cost: LOW ($)
   ├─ Speed: MODERATE (~1-2s)
   └─ Use: General Q&A fallback

6. Gemini AI (LAST RESORT)
   ├─ Source: Google Gemini
   ├─ Cost: HIGH ($$$$$)
   ├─ Speed: MODERATE (~2-3s)
   ├─ Limit: 50 queries/user/day
   └─ Use: ONLY when nothing else works

💰 Cost Optimization Strategy

Free Sources (Check First)

FAQ → Scraped Content → Industry KB → History
   ↓         ↓              ↓           ↓
  0¢        0¢             0¢          0¢

Paid Sources (Check Last)

Small LLM → Gemini AI
    ↓           ↓
   $0.002     $0.01
  (cheap)   (expensive)

🔄 Example Query Flow

User Query: "mujhe bukhar hai" (I have a fever)

Step 1: FAQ Database
  Query: "fever"
  Result: ❌ Not found
  → Continue

Step 2: Website Scraped Content
  Query: "fever"
  Result: ❌ No scraped content
  → Continue

Step 3: Industry Knowledge Base (Healthcare)
  Query: "fever"
  Result: ✅ FOUND!
  Source: SymCAT + MedQuAD
  Answer: "Symptoms analyzed: fever. Possible: Flu, COVID-19..."
  → STOP (answer found, no LLM needed!)

Cost: $0 ✅

Another Query: "Write me a poem about health"

Step 1: FAQ Database → ❌ Not found
Step 2: Scraped Content → ❌ Not found
Step 3: Industry KB → ❌ Not a medical Q&A
Step 4: History → ❌ Not asked before
Step 5: Small LLM → ✅ FOUND!
  Answer: Generated poem

Cost: $0.002 ✅ (avoided expensive Gemini)

Edge Case: Gemini Usage

Step 1-4: All checked → ❌ Not found
Step 5: Small LLM → ❌ Failed/unavailable
Step 6: Gemini AI → ✅ Used (last resort)
  Query count: 45/50 (user still has 5 left today)
  Answer: Generated response

Cost: $0.01 ⚠️ (expensive, but necessary)

🛡️ Query Limit Protection

Gemini usage is capped per user per day to keep costs under control:

# Per user per day
DEFAULT_LIMIT = 50 queries/day

# Tracking
- By session_id (anonymous users)
- By visitor_email (logged-in users)

# Behavior
Query 1-49: ✅ Normal Gemini response
Query 50:   ✅ Last allowed
Query 51+:  ⚠️ "Daily limit reached" message
            → Falls back to generic response
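
The limit behavior described above could be implemented along these lines. This is a minimal in-memory sketch, not the project's actual tracker: the class `DailyQueryLimiter` and its methods are hypothetical, and a real deployment would presumably persist the counts in the database.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


class DailyQueryLimiter:
    """In-memory per-user daily counter (hypothetical; not the project's class)."""

    def __init__(self, limit: int = 50) -> None:
        self.limit = limit
        # Counts are keyed by (user key, calendar day), so they reset each day.
        self._counts: dict[tuple[str, date], int] = defaultdict(int)

    @staticmethod
    def user_key(session_id: str, visitor_email: Optional[str] = None) -> str:
        # Prefer the email when it is known, otherwise fall back to the session.
        return f"email:{visitor_email}" if visitor_email else f"session:{session_id}"

    def try_consume(self, key: str, today: Optional[date] = None) -> bool:
        """Count one query and return True, or return False once the limit is hit."""
        today = today or date.today()
        if self._counts[(key, today)] >= self.limit:
            return False
        self._counts[(key, today)] += 1
        return True
```

When `try_consume` returns `False`, the caller would show the "Daily limit reached" message and fall back to the generic response instead of calling Gemini.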

📊 Data Source Comparison

| Source | Speed | Cost | Accuracy | Availability |
|---|---|---|---|---|
| FAQ | ⚡⚡⚡⚡⚡ | FREE | ⭐⭐⭐⭐⭐ | If data exists |
| Scraped | ⚡⚡⚡⚡ | FREE | ⭐⭐⭐⭐ | If data exists |
| Industry KB | ⚡⚡⚡⚡ | FREE | ⭐⭐⭐⭐⭐ | Always |
| History | ⚡⚡⚡⚡ | FREE | ⭐⭐⭐ | If data exists |
| Small LLM | ⚡⚡⚡ | $ | ⭐⭐⭐⭐ | Always |
| Gemini | ⚡⚡ | $$$$$ | ⭐⭐⭐⭐⭐ | If enabled + under limit |

🎯 Selection Logic

Intent-Based Priority Weights

FAQ Intent:

FAQ: 100% weight
Scraped: 80%
Industry: 60%
Small LLM: 50%
Gemini: 50%

Industry Knowledge Intent:

Industry KB: 100% weight
FAQ: 70%
Small LLM: 65%
Scraped: 50%
Gemini: 50%

Business-Specific Intent:

Scraped: 100% weight
FAQ: 80%
Small LLM: 65%
Industry: 40%
Gemini: 50%
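
The document does not say how these weights are applied. One plausible reading, sketched below as an assumption, is that each candidate answer's raw match score is multiplied by the weight of its source for the detected intent, and the highest product wins. The names `INTENT_WEIGHTS` and `best_candidate` and the intent keys are hypothetical.

```python
# Weights copied from the lists above, expressed as fractions of 1.0.
INTENT_WEIGHTS: dict[str, dict[str, float]] = {
    "faq": {"faq": 1.00, "scraped": 0.80, "industry_kb": 0.60,
            "small_llm": 0.50, "gemini": 0.50},
    "industry": {"industry_kb": 1.00, "faq": 0.70, "small_llm": 0.65,
                 "scraped": 0.50, "gemini": 0.50},
    "business": {"scraped": 1.00, "faq": 0.80, "small_llm": 0.65,
                 "gemini": 0.50, "industry_kb": 0.40},
}


def best_candidate(intent: str, candidates: dict[str, float]) -> str:
    """Return the source whose raw score times its intent weight is highest.

    `candidates` maps a source name to a raw match score in [0, 1].
    Sources without a weight for this intent are scored as zero.
    """
    weights = INTENT_WEIGHTS[intent]
    return max(candidates, key=lambda source: candidates[source] * weights.get(source, 0.0))
```

Under this reading, an FAQ match scoring 0.9 loses to an Industry KB match scoring 0.7 when the intent is industry knowledge (0.63 versus 0.70), but wins when the intent is FAQ.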

✅ Benefits

  1. Cost Savings: Free sources are checked before any paid call is made
  2. Speed: Local data answers in milliseconds, while LLM calls take seconds
  3. Accuracy: Industry datasets are more accurate than a generic LLM for domain questions
  4. Scalability: Per-user limits on Gemini prevent runaway costs
  5. Reliability: Multiple fallbacks mean a query still gets an answer when one source misses or fails

🔧 Configuration

Environment Variables

# Small LLM (existing)
OPENAI_API_KEY=your_openai_key

# Gemini AI (expensive fallback)
GEMINI_API_KEY=your_gemini_key
GEMINI_DAILY_LIMIT_PER_USER=50  # Adjust based on budget

Disable Gemini (Save Money)

# Just don't set GEMINI_API_KEY
# System will use Small LLM as final fallback
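
A minimal sketch of how these variables might be read at startup, assuming Gemini is treated as disabled whenever `GEMINI_API_KEY` is unset or empty. The function `load_llm_config` and the keys of the returned dictionary are hypothetical; only the environment variable names come from this document.

```python
import os
from typing import Mapping


def load_llm_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read the variables above; Gemini is enabled only if its key is set."""
    gemini_key = env.get("GEMINI_API_KEY", "").strip()
    return {
        "openai_api_key": env.get("OPENAI_API_KEY", ""),
        "gemini_enabled": bool(gemini_key),
        # Falls back to the documented default of 50 queries per user per day.
        "gemini_daily_limit": int(env.get("GEMINI_DAILY_LIMIT_PER_USER", "50")),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.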

📈 Expected Cost Impact

Before (if using Gemini for everything):

1000 queries/day × $0.01 = $10/day = $300/month ❌

After (with smart fallbacks):

700 FAQ/KB (free) = $0
200 Small LLM × $0.002 = $0.40
100 Gemini × $0.01 = $1.00
─────────────────────────────
Total: $1.40/day = $42/month ✅

Savings: 86% ($258/month) 🎉
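
The arithmetic above can be reproduced with a few lines of Python, which also makes it easy to try a different traffic split. The per-query prices and the 700/200/100 split are the figures quoted in this section; the 30-day month and the helper name `monthly_cost` are assumptions made for the sketch.

```python
# Per-query prices in dollars, as quoted above.
GEMINI_COST = 0.01
SMALL_LLM_COST = 0.002
DAYS_PER_MONTH = 30  # assumption matching the $10/day = $300/month figure


def monthly_cost(free: int, small_llm: int, gemini: int) -> float:
    """Monthly cost in dollars for a given daily split of queries."""
    daily = free * 0.0 + small_llm * SMALL_LLM_COST + gemini * GEMINI_COST
    return round(daily * DAYS_PER_MONTH, 2)


before = monthly_cost(free=0, small_llm=0, gemini=1000)
after = monthly_cost(free=700, small_llm=200, gemini=100)
savings_pct = round(100 * (before - after) / before)

print(f"before=${before:.2f} after=${after:.2f} savings={savings_pct}%")
```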

🚀 Summary

Priority: Free sources → Small LLM → Gemini (last)
Protection: Daily query limits per user
Savings: ~86% cost reduction
Speed: Faster with local data first
Accuracy: Better with industry datasets

Gemini is now truly the LAST resort! 🎯