Commit a421bc4 · Parent(s): bfa87c3
docs: correct readme table
- README-HF.md +5 -5
- README.md +16 -29
README-HF.md CHANGED
@@ -46,11 +46,11 @@ No signup required. Your documents are processed locally and auto-deleted after

 ## Features

--
--
--
--
--
+- **Multi-format**: PDF, DOCX, TXT
+- **Citations**: Every answer references source documents
+- **Domain demos**: Legal, Research, FinOps pre-loaded
+- **Privacy-first**: Local processing, auto-delete after 7 days
+- **Fast**: 1-3 second response time

 ---

README.md CHANGED
@@ -87,37 +87,24 @@ python app/main.py

 ---

-## Production Checklist
+## Production Features Checklist

 > 10 criteria for enterprise-grade RAG. Each is satisfied by this platform.

-| # | Criterion | Status | Details |
-|---|-----------|--------|---------|
-| 1 | **Multi-format ingestion** | ✅ | PDF, DOCX, TXT with intelligent parsing |
-| 2 | **Semantic chunking** | ✅ | 1000-char chunks, 200-char overlap |
-| 3 | **Production embeddings** | ✅ | bge-small-en-v1.5 (MTEB optimized) |
-| 4 | **Persistent storage** | ✅ | ChromaDB survives restarts |
-| 5 | **Citation tracking** | ✅ | Every answer links to source chunks |
-| 6 | **Rate limiting** | ✅ | 10 queries/hour (configurable) |
-| 7 | **Privacy controls** | ✅ | Auto-delete after 7 days |
-| 8 | **Domain demos** | ✅ | Legal, Research, FinOps samples |
-| 9 | **Docker deployment** | ✅ | One-command production deploy |
-| 10 | **Monitoring hooks** | ✅ | Health checks, error logging |
-
-**[Design Decisions →](docs/DESIGN_DECISIONS.md)** - Deep dive into architectural choices.
-
----
-
-## Features
-
 | Feature | Description |
-|---------|----------
-|
-|
-|
-|
-|
-|
+|----------|----------|
+| **Multi-format ingestion** | PDF, DOCX, TXT with intelligent parsing |
+| **Semantic chunking** | 1000-char chunks, 200-char overlap |
+| **Production embeddings** | bge-small-en-v1.5 (MTEB optimized) |
+| **Persistent storage** | ChromaDB survives restarts |
+| **Citation tracking** | Every answer links to source chunks |
+| **Rate limiting** | 10 queries/hour (configurable) |
+| **Privacy controls** | Auto-delete after 7 days |
+| **Monitoring hooks** | Health checks, error logging |
+| **Fast** | 1-3 second end-to-end response time |
+| **Portable** | Docker-ready, one-command deploy |
+
+**[Design Decisions →](docs/DESIGN_DECISIONS.md)** - Deep dive into architectural choices.

 ---

@@ -125,8 +112,8 @@ python app/main.py

 | Metric | Value |
 |--------|-------|
-| **End-to-end latency** |
-| **100-page contract** |
+| **End-to-end latency** | 1-3 seconds |
+| **100-page contract** | 5-6s process, 1.5s query |
 | **Hallucination rate** | ~4-7% (vs 18% baseline) |
 | **Throughput** | ~12 docs/min |

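The "1000-char chunks, 200-char overlap" row in the README table can be illustrated with a small sketch. This is not taken from the project's code: the function name `chunk_text` and its parameters are hypothetical, and it uses plain fixed-size character windows, which may differ from whatever "semantic" splitting the platform actually performs.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the defaults mirror the README figures
    (1000-char chunks, 200-char overlap), nothing more.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2500))
    for i, chunk in enumerate(chunk_text(sample)):
        print(i, len(chunk))
```

With these defaults each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.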