Unified Data Source Priority System
🎯 Updated Priority Order
The system now uses a cost-optimized fallback chain, checking cheaper/faster sources first:
Priority Chain (first checked to last)
1. FAQ Database
   ├─ Source: Database
   ├─ Cost: FREE (just a DB query)
   ├─ Speed: FASTEST (~1ms)
   └─ Use: Exact/similar FAQ matches
2. Website Scraped Content
   ├─ Source: Database
   ├─ Cost: FREE (DB query + text search)
   ├─ Speed: FAST (~5ms)
   └─ Use: Business-specific content
3. Industry Knowledge Base
   ├─ Source: MedQuAD, CourseQ, STACKED datasets
   ├─ Cost: FREE (local data)
   ├─ Speed: FAST (~10ms)
   └─ Use: Healthcare/Education queries
4. Unanswered History
   ├─ Source: Database (learning from past queries)
   ├─ Cost: FREE
   ├─ Speed: FAST (~5ms)
   └─ Use: Similar previously asked questions
5. Small LLM (your existing system)
   ├─ Source: OpenAI/local model
   ├─ Cost: LOW ($)
   ├─ Speed: MODERATE (~1-2s)
   └─ Use: General Q&A fallback
6. Gemini AI (LAST RESORT)
   ├─ Source: Google Gemini
   ├─ Cost: HIGH ($$$$$)
   ├─ Speed: MODERATE (~2-3s)
   ├─ Limit: 50 queries/user/day
   └─ Use: ONLY when nothing else works
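The six-step chain above can be sketched as a simple cascade. The checker names below (`check_faq`, `check_scraped`, etc.) are hypothetical stand-ins for the real lookups, stubbed here so the control flow runs end-to-end:

```python
# Minimal sketch of the fallback chain. All source checkers are illustrative
# stubs, not the project's actual functions.

def check_faq(q):         return None  # 1. FAQ database (stub: always misses)
def check_scraped(q):     return None  # 2. Website scraped content (stub)
def check_industry_kb(q):              # 3. Industry KB (stub: matches "fever")
    return "Possible: Flu" if "fever" in q else None
def check_history(q):     return None  # 4. Unanswered history (stub)
def small_llm(q):         return f"LLM answer to: {q}"     # 5. Cheap paid model
def gemini(q):            return f"Gemini answer to: {q}"  # 6. Expensive model

def under_daily_limit(user_id, used=0, limit=50):
    return used < limit  # stand-in for the per-user daily counter

def answer_query(query, user_id="anon"):
    # Free sources first (cost: $0), in priority order; stop at the first hit.
    for source in (check_faq, check_scraped, check_industry_kb, check_history):
        answer = source(query)
        if answer is not None:
            return answer, "free"
    # Paid fallbacks, cheapest first.
    answer = small_llm(query)
    if answer is not None:
        return answer, "small_llm"
    # Gemini only as a last resort, and only while under the daily limit.
    if under_daily_limit(user_id):
        return gemini(query), "gemini"
    return "Daily limit reached", "blocked"
```

A free-source hit short-circuits the chain, so the paid models are never called for questions the local data can answer.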
💰 Cost Optimization Strategy
Free Sources (Check First)
FAQ → Scraped Content → Industry KB → History
 ↓            ↓              ↓           ↓
 0¢           0¢             0¢          0¢
Paid Sources (Check Last)
Small LLM → Gemini AI
    ↓           ↓
 $0.002       $0.01
 (cheap)   (expensive)
🔄 Example Query Flow
User Query: "mujhe bukhar hai" (Hindi: "I have a fever")

Step 1: FAQ Database
  Query: "fever"
  Result: ❌ Not found
  → Continue

Step 2: Website Scraped Content
  Query: "fever"
  Result: ❌ No scraped content
  → Continue

Step 3: Industry Knowledge Base (Healthcare)
  Query: "fever"
  Result: ✅ FOUND!
  Source: SymCAT + MedQuAD
  Answer: "Symptoms analyzed: fever. Possible: Flu, COVID-19..."
  → STOP (answer found, no LLM needed!)

Cost: $0 ✅
Another Query: "Write me a poem about health"

Step 1: FAQ Database → ❌ Not found
Step 2: Scraped Content → ❌ Not found
Step 3: Industry KB → ❌ Not a medical Q&A
Step 4: History → ❌ Not asked before
Step 5: Small LLM → ✅ FOUND!
  Answer: Generated poem

Cost: $0.002 ✅ (avoided expensive Gemini)
Edge Case: Gemini Usage

Steps 1-4: All checked → ❌ Not found
Step 5: Small LLM → ❌ Failed/unavailable
Step 6: Gemini AI → ✅ Used (last resort)
  Query count: 45/50 (user still has 5 left today)
  Answer: Generated response

Cost: $0.01 ⚠️ (expensive, but necessary)
🛡️ Query Limit Protection
Gemini has daily limits to protect costs:

# Per user per day
DEFAULT_LIMIT = 50 queries/day

# Tracking
- By session_id (anonymous users)
- By visitor_email (logged-in users)

# Behavior
Query 1-49:  ✅ Normal Gemini response
Query 50:    ✅ Last allowed
Query 51+:   ⚠️ "Daily limit reached" message
             → Falls back to generic response
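A sketch of that limit check, assuming the two tracking keys described above. A real implementation would persist counts in the database; this in-memory version only illustrates the logic:

```python
# Hedged sketch of the per-user daily Gemini limit. The function name and
# in-memory store are illustrative, not the project's actual code.
from collections import defaultdict
from datetime import date

DEFAULT_LIMIT = 50
_usage = defaultdict(int)  # key: (user_key, ISO date) -> queries used today

def allow_gemini(session_id=None, visitor_email=None, limit=DEFAULT_LIMIT):
    # Logged-in users are tracked by email, anonymous users by session_id.
    user_key = visitor_email or session_id or "anonymous"
    key = (user_key, date.today().isoformat())
    if _usage[key] >= limit:
        return False  # query 51+: caller shows "Daily limit reached"
    _usage[key] += 1  # count this query against today's quota
    return True
```

Because the counter key includes the date, quotas reset naturally at midnight without a cleanup job.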
📊 Data Source Comparison
| Source | Speed | Cost | Accuracy | Availability |
|---|---|---|---|---|
| FAQ | ⚡⚡⚡⚡⚡ | FREE | ⭐⭐⭐⭐⭐ | If data exists |
| Scraped | ⚡⚡⚡⚡ | FREE | ⭐⭐⭐⭐ | If data exists |
| Industry KB | ⚡⚡⚡⚡ | FREE | ⭐⭐⭐⭐⭐ | Always |
| History | ⚡⚡⚡⚡ | FREE | ⭐⭐⭐ | If data exists |
| Small LLM | ⚡⚡⚡ | $ | ⭐⭐⭐⭐ | Always |
| Gemini | ⚡⚡ | $$$$$ | ⭐⭐⭐⭐⭐ | If enabled + under limit |
🎯 Selection Logic
Intent-Based Priority Weights
FAQ Intent:
FAQ: 100% weight
Scraped: 80%
Industry: 60%
Small LLM: 50%
Gemini: 50%
Industry Knowledge Intent:
Industry KB: 100% weight
FAQ: 70%
Small LLM: 65%
Scraped: 50%
Gemini: 50%
Business-Specific Intent:
Scraped: 100% weight
FAQ: 80%
Small LLM: 65%
Industry: 40%
Gemini: 50%
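One way to apply the weights above is to score each candidate source by weight × match confidence and pick the highest. The scoring formula and function names here are assumptions; only the weight tables come from the doc:

```python
# Intent-based source ranking sketch. INTENT_WEIGHTS mirrors the tables
# above; rank_sources and its scoring rule are illustrative assumptions.

INTENT_WEIGHTS = {
    "faq":      {"faq": 1.0, "scraped": 0.8, "industry": 0.6,
                 "small_llm": 0.5, "gemini": 0.5},
    "industry": {"industry": 1.0, "faq": 0.7, "small_llm": 0.65,
                 "scraped": 0.5, "gemini": 0.5},
    "business": {"scraped": 1.0, "faq": 0.8, "small_llm": 0.65,
                 "industry": 0.4, "gemini": 0.5},
}

def rank_sources(intent, candidates):
    """candidates: {source_name: match confidence in [0, 1]}.
    Returns (source, score) pairs, best first."""
    weights = INTENT_WEIGHTS[intent]
    scored = {s: weights.get(s, 0.0) * conf for s, conf in candidates.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
```

With equal match confidence, the intent's 100%-weighted source always wins, which is exactly the behavior the tables describe.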
✅ Benefits
- Cost Savings: Check free sources first
- Speed: Faster responses from local data
- Accuracy: Industry datasets more accurate than generic LLM
- Scalability: Limits on expensive Gemini prevent runaway costs
- Reliability: Multiple fallbacks ensure an answer is always provided
🔧 Configuration
Environment Variables
# Small LLM (existing)
OPENAI_API_KEY=your_openai_key
# Gemini AI (expensive fallback)
GEMINI_API_KEY=your_gemini_key
GEMINI_DAILY_LIMIT_PER_USER=50 # Adjust based on budget
Disable Gemini (Save Money)
# Just don't set GEMINI_API_KEY
# System will use Small LLM as final fallback
📉 Expected Cost Impact
Before (if using Gemini for everything):
1000 queries/day × $0.01 = $10/day = $300/month ❌

After (with smart fallbacks):
700 FAQ/KB (free)        = $0
200 Small LLM × $0.002   = $0.40
100 Gemini × $0.01       = $1.00
─────────────────────────────
Total: $1.40/day = $42/month ✅

Savings: 86% ($258/month) 🎉
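The arithmetic above checks out (the 1000-query/day split is the doc's own assumption):

```python
# Reproducing the cost estimate: 1000 queries/day, 30-day month.
before = 1000 * 0.01                         # $10.00/day, everything via Gemini
after = 700 * 0 + 200 * 0.002 + 100 * 0.01   # $1.40/day with smart fallbacks
monthly_savings = (before - after) * 30      # $258/month
savings_pct = round((before - after) / before * 100)  # 86%
```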
📋 Summary
Priority: Free sources → Small LLM → Gemini (last)
Protection: Daily query limits per user
Savings: ~86% cost reduction
Speed: Faster with local data first
Accuracy: Better with industry datasets

Gemini is now truly the LAST resort! 🎯