mohsin-devs committed on
Commit
3342f91
·
1 Parent(s): a5d8052

fix YAML placement correctly

Files changed (1)
  1. README.md +1 -609
README.md CHANGED
@@ -6,612 +6,4 @@ colorTo: purple
6
  sdk: static
7
  app_file: index.html
8
  pinned: false
9
- ---
10
-
11
- # DocVault - Offline-First Document Storage System
12
-
13
- Complete offline-first document storage system built with **Python Flask** and local filesystem storage. No cloud dependencies, fully self-contained, and ready for future Hugging Face integration.
14
-
15
- ## 🎯 Features
16
-
17
- ### Core Features
18
- - ✅ **Create Files and Folders** - Including nested directory structures
19
- - ✅ **Delete Items** - Individual files/folders or bulk deletion
20
- - ✅ **Upload Files** - Support for 50+ file types
21
- - ✅ **List Contents** - Browse file/folder hierarchy with metadata
22
- - ✅ **Rename Items** - Rename files and folders
23
- - ✅ **Security** - Path traversal prevention, input validation
24
- - ✅ **Logging** - Comprehensive logging with rotation
25
- - ✅ **File Metadata** - Size, creation time, modification time
26
- - ✅ **Multi-User** - Support for multiple users via user IDs
27
-
28
- ### Storage
29
- - Local filesystem storage in `data/{user_id}/` structure
30
- - Automatic marker files (`.gitkeep`) for HF integration compatibility
31
- - Prevents duplicate filenames with auto-numbering
32
- - Maintains clean directory structure
33
-
34
- ## 📁 Project Structure
35
-
36
- ```
37
- .
38
- ├── server/
39
- │   ├── app.py              # Flask application
40
- │   ├── config.py           # Configuration settings
41
- │   ├── requirements.txt    # Python dependencies
42
- │   ├── routes/
43
- │   │   └── api.py          # API endpoints
44
- │   ├── storage/
45
- │   │   └── manager.py      # Storage operations
46
- │   └── utils/
47
- │       ├── logger.py       # Logging setup
48
- │       └── validators.py   # Path validation & security
49
- ├── data/                   # Storage directory (auto-created)
50
- ├── logs/                   # Log files (auto-created)
51
- ├── tests/
52
- │   ├── test_docvault.py    # Unit tests
53
- │   └── test_api.sh         # API test script
54
- └── README.md               # This file
55
- ```
56
-
57
- ## 🚀 Getting Started
58
-
59
- ### Prerequisites
60
- - Python 3.8+
61
- - Flask 2.3+
62
- - pip (Python package manager)
63
-
64
- ### Installation
65
-
66
- 1. **Clone or download the project**
67
- ```bash
68
- cd path/to/DocVault
69
- ```
70
-
71
- 2. **Create virtual environment** (recommended)
72
- ```bash
73
- python -m venv venv
74
-
75
- # Activate it:
76
- # On Windows:
77
- venv\Scripts\activate
78
- # On Linux/Mac:
79
- source venv/bin/activate
80
- ```
81
-
82
- 3. **Install dependencies**
83
- ```bash
84
- pip install -r server/requirements.txt
85
- ```
86
-
87
- ### Running the Server
88
-
89
- ```bash
90
- python server/app.py
91
- ```
92
-
93
- Server will start at `http://localhost:5000`
94
-
95
- View API docs: `http://localhost:5000/docs`
96
-
97
- ## 📚 API Endpoints
98
-
99
- ### 1. Health Check
100
- ```
101
- GET /api/health
102
- ```
103
- Check if server is running.
104
-
105
- **Response:**
106
- ```json
107
- {
108
- "status": "healthy",
109
- "service": "DocVault"
110
- }
111
- ```
112
-
113
- ---
114
-
115
- ### 2. Create Folder
116
- ```
117
- POST /api/create-folder
118
- ```
119
- Create a new folder (including nested folders).
120
-
121
- **Request:**
122
- ```bash
123
- curl -X POST http://localhost:5000/api/create-folder \
124
- -H "Content-Type: application/json" \
125
- -H "X-User-ID: user123" \
126
- -d '{
127
- "folder_path": "Documents/Projects/MyProject"
128
- }'
129
- ```
130
-
131
- **Response (Success):**
132
- ```json
133
- {
134
- "success": true,
135
- "message": "Folder created: Documents/Projects/MyProject",
136
- "folder": {
137
- "name": "MyProject",
138
- "path": "Documents/Projects/MyProject",
139
- "created_at": "2026-04-09T10:30:00.000000",
140
- "type": "folder"
141
- }
142
- }
143
- ```
144
-
145
- ---
146
-
147
- ### 3. Delete Folder
148
- ```
149
- POST /api/delete-folder
150
- ```
151
- Delete a folder. Use `force: true` to delete non-empty folders.
152
-
153
- **Request:**
154
- ```bash
155
- curl -X POST http://localhost:5000/api/delete-folder \
156
- -H "Content-Type: application/json" \
157
- -H "X-User-ID: user123" \
158
- -d '{
159
- "folder_path": "Documents/Projects/MyProject",
160
- "force": true
161
- }'
162
- ```
163
-
164
- **Response:**
165
- ```json
166
- {
167
- "success": true,
168
- "message": "Folder deleted: Documents/Projects/MyProject"
169
- }
170
- ```
171
-
172
- ---
173
-
174
- ### 4. Upload File
175
- ```
176
- POST /api/upload-file
177
- ```
178
- Upload a file to a specific folder.
179
-
180
- **Request:**
181
- ```bash
182
- curl -X POST http://localhost:5000/api/upload-file \
183
- -H "X-User-ID: user123" \
184
- -F "folder_path=Documents" \
185
- -F "file=@/path/to/report.pdf"
186
- ```
187
-
188
- **Response:**
189
- ```json
190
- {
191
- "success": true,
192
- "message": "File uploaded: report.pdf",
193
- "file": {
194
- "name": "report.pdf",
195
- "path": "Documents/report.pdf",
196
- "size": 102400,
197
- "size_formatted": "100.00 KB",
198
- "uploaded_at": "2026-04-09T10:35:00.000000",
199
- "type": "file"
200
- }
201
- }
202
- ```
203
-
204
- ---
205
-
206
- ### 5. List Contents
207
- ```
208
- GET /api/list
209
- ```
210
- List all files and folders in a directory.
211
-
212
- **Request:**
213
- ```bash
214
- # List root
215
- curl -X GET "http://localhost:5000/api/list" \
216
- -H "X-User-ID: user123"
217
-
218
- # List specific folder
219
- curl -X GET "http://localhost:5000/api/list?folder_path=Documents" \
220
- -H "X-User-ID: user123"
221
- ```
222
-
223
- **Response:**
224
- ```json
225
- {
226
- "success": true,
227
- "path": "Documents",
228
- "folders": [
229
- {
230
- "name": "Projects",
231
- "type": "folder",
232
- "path": "Documents/Projects",
233
- "created_at": "2026-04-09T10:30:00.000000",
234
- "modified_at": "2026-04-09T10:30:00.000000"
235
- }
236
- ],
237
- "files": [
238
- {
239
- "name": "notes.txt",
240
- "type": "file",
241
- "path": "Documents/notes.txt",
242
- "size": 1024,
243
- "size_formatted": "1.00 KB",
244
- "created_at": "2026-04-09T10:35:00.000000",
245
- "modified_at": "2026-04-09T10:35:00.000000"
246
- }
247
- ],
248
- "summary": {
249
- "total_folders": 1,
250
- "total_files": 1
251
- }
252
- }
253
- ```
254
-
255
- ---
256
-
257
- ### 6. Rename File/Folder
258
- ```
259
- POST /api/rename
260
- ```
261
- Rename a file or folder.
262
-
263
- **Request:**
264
- ```bash
265
- curl -X POST http://localhost:5000/api/rename \
266
- -H "Content-Type: application/json" \
267
- -H "X-User-ID: user123" \
268
- -d '{
269
- "item_path": "Documents/OldName",
270
- "new_name": "NewName"
271
- }'
272
- ```
273
-
274
- **Response:**
275
- ```json
276
- {
277
- "success": true,
278
- "message": "Folder renamed to: NewName",
279
- "item": {
280
- "name": "NewName",
281
- "type": "folder",
282
- "path": "Documents/NewName"
283
- }
284
- }
285
- ```
286
-
287
- ---
288
-
289
- ### 7. Storage Statistics
290
- ```
291
- GET /api/storage-stats
292
- ```
293
- Get storage usage statistics.
294
-
295
- **Request:**
296
- ```bash
297
- curl -X GET "http://localhost:5000/api/storage-stats" \
298
- -H "X-User-ID: user123"
299
- ```
300
-
301
- **Response:**
302
- ```json
303
- {
304
- "success": true,
305
- "total_size": 5242880,
306
- "total_size_formatted": "5.00 MB",
307
- "total_files": 42,
308
- "total_folders": 8
309
- }
310
- ```
311
-
312
- ---
313
-
314
- ### 8. Download File
315
- ```
316
- GET /api/download/<file_path>
317
- ```
318
- Download a file.
319
-
320
- **Request:**
321
- ```bash
322
- curl -X GET "http://localhost:5000/api/download/Documents/report.pdf" \
323
- -H "X-User-ID: user123" \
324
- -o report.pdf
325
- ```
326
-
327
- ---
328
-
329
- ## 🔐 Security Features
330
-
331
- ### Path Traversal Prevention
332
- - Validates all paths are within user's directory
333
- - Prevents `../` and similar attacks
334
- - Normalizes paths before operations
335
-
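The path checks listed above could be sketched roughly as follows. This is a minimal illustration, not the code in `utils/validators.py`: the function name `resolve_user_path` and its parameters are hypothetical, and it assumes the `data/{user_id}/` layout described earlier in the README.

```python
from pathlib import Path

# Assumed storage root, mirroring the README's data/{user_id}/ layout.
DATA_DIR = Path("data")


def resolve_user_path(user_id: str, relative_path: str, data_dir: Path = DATA_DIR) -> Path:
    """Return an absolute path inside the user's directory, or raise ValueError.

    Hypothetical helper: the name and signature are illustrative only.
    """
    base = data_dir.resolve()
    user_root = (base / user_id).resolve()
    # The user directory must be a direct child of the storage root.
    if user_root.parent != base:
        raise ValueError(f"invalid user id: {user_id!r}")

    # resolve() normalizes the path and collapses any ".." segments.
    candidate = (user_root / relative_path).resolve()

    # Reject anything that ends up outside the user's directory.
    if candidate != user_root and user_root not in candidate.parents:
        raise ValueError(f"path escapes user directory: {relative_path!r}")
    return candidate
```

Checking the resolved path, rather than searching the input string for `../`, also covers absolute paths and symlinks that point outside the user's directory.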
336
- ### Input Validation
337
- - Filename restrictions: alphanumeric, hyphens, underscores, dots
338
- - Maximum filename length: 255 characters
339
- - Blocks Windows reserved names (CON, PRN, AUX, etc.)
340
-
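As a rough illustration of the filename rules above, a validator might look like the sketch below. The function name `is_valid_filename` is hypothetical, and the reserved-name set is an assumption: the README names only CON, PRN, and AUX, so the remaining entries are filled in from the standard Windows reserved device names.

```python
import re

# Assumed reserved set: the README lists "CON, PRN, AUX, etc."; the rest are
# the standard Windows reserved device names.
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)

# Alphanumerics, hyphens, underscores, and dots only.
_ALLOWED = re.compile(r"[A-Za-z0-9._-]+")

MAX_FILENAME_LENGTH = 255


def is_valid_filename(name: str) -> bool:
    """Return True if `name` passes the restrictions described in the README."""
    if not name or len(name) > MAX_FILENAME_LENGTH:
        return False
    # "." and ".." match the character rule but are directory references.
    if name in {".", ".."}:
        return False
    if not _ALLOWED.fullmatch(name):
        return False
    # Windows treats "CON.txt" the same as "CON", so compare the part before the first dot.
    stem = name.split(".", 1)[0].upper()
    return stem not in RESERVED_NAMES
```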
341
- ### File Type Restrictions
342
- Allowed extensions: `txt`, `pdf`, `png`, `jpg`, `jpeg`, `gif`, `doc`, `docx`, `xls`, `xlsx`, `ppt`, `pptx`, `zip`, `rar`, `json`, `xml`, `csv`, `md`, `py`, `js`, `html`, `css`, `yml`, `yaml`
343
-
344
- Maximum file size: 50 MB (configurable)
345
-
346
- ---
347
-
348
- ## 🧪 Testing
349
-
350
- ### Unit Tests
351
- ```bash
352
- python -m pytest tests/test_docvault.py -v
353
- ```
354
-
355
- Or using unittest:
356
- ```bash
357
- python -m unittest tests.test_docvault -v
358
- ```
359
-
360
- ### Manual API Testing
361
-
362
- #### Using curl (Linux/Mac/WSL)
363
- ```bash
364
- bash tests/test_api.sh
365
- ```
366
-
367
- #### Using Postman
368
- 1. Import the endpoints from the documentation above
369
- 2. Set header: `X-User-ID: test_user`
370
- 3. Test each endpoint
371
-
372
- #### Using PowerShell (Windows)
373
- ```powershell
374
- # Create folder
375
- $headers = @{"X-User-ID" = "test_user"; "Content-Type" = "application/json"}
376
- $body = '{"folder_path": "Documents"}'
377
- Invoke-RestMethod -Uri "http://localhost:5000/api/create-folder" `
378
- -Method POST -Headers $headers -Body $body
379
-
380
- # Upload file
381
- $headers = @{"X-User-ID" = "test_user"}
382
- $form = @{"folder_path" = "Documents"; "file" = Get-Item "path/to/file.txt"}
383
- Invoke-RestMethod -Uri "http://localhost:5000/api/upload-file" `
384
- -Method POST -Headers $headers -Form $form
385
-
386
- # List contents
387
- $headers = @{"X-User-ID" = "test_user"}
388
- Invoke-RestMethod -Uri "http://localhost:5000/api/list" `
389
- -Method GET -Headers $headers
390
- ```
391
-
392
- ---
393
-
394
- ## 📝 Configuration
395
-
396
- Edit `server/config.py` to customize:
397
-
398
- ```python
399
- # Storage location
400
- DATA_DIR = "data"
401
-
402
- # Maximum file size (bytes)
403
- MAX_CONTENT_LENGTH = 50 * 1024 * 1024 # 50MB
404
-
405
- # Allowed file extensions
406
- ALLOWED_EXTENSIONS = {'txt', 'pdf', 'png', ...}
407
-
408
- # Debug mode
409
- DEBUG = True
410
-
411
- # Logging level
412
- LOG_LEVEL = "INFO"
413
- ```
414
-
415
- ---
416
-
417
- ## 🗂️ Storage Structure
418
-
419
- Files are organized by user ID:
420
-
421
- ```
422
- data/
423
- ├── default_user/
424
- │   ├── Documents/
425
- │   │   ├── report.pdf
426
- │   │   ├── notes.txt
427
- │   │   ├── Projects/
428
- │   │   │   ├── ProjectA/
429
- │   │   │   │   ├── .gitkeep
430
- │   │   │   │   └── code.py
431
- │   │   │   └── .gitkeep
432
- │   │   └── .gitkeep
433
- │   ├── Images/
434
- │   └── .gitkeep
435
- ├── user123/
436
- └── user456/
437
- ```
438
-
439
- The `.gitkeep` marker file:
440
- - Identifies folders in HF integration
441
- - Allows tracking empty directories in git
442
- - Automatically created with new folders
443
-
444
- ---
445
-
446
- ## 🔌 API Response Format
447
-
448
- ### Success Response
449
- ```json
450
- {
451
- "success": true,
452
- "message": "Operation successful",
453
- "data": {...}
454
- }
455
- ```
456
-
457
- ### Error Response
458
- ```json
459
- {
460
- "success": false,
461
- "error": "Description of error",
462
- "code": "ERROR_CODE"
463
- }
464
- ```
465
-
466
- ### Common Status Codes
467
- - `200`: OK
468
- - `201`: Created
469
- - `400`: Bad Request
470
- - `404`: Not Found
471
- - `413`: Payload Too Large
472
- - `500`: Internal Server Error
473
-
474
- ---
475
-
476
- ## 🔄 Future Integration: Hugging Face
477
-
478
- The system is designed for easy HF integration:
479
-
480
- ### Mapping to HF Structure
481
- ```
482
- Local: data/user/folder/file.txt
483
- ↓
484
- HF Git: repo/user/folder/file.txt
485
- ```
486
-
487
- ### When integrating with HF:
488
- 1. Replace `StorageManager` with `HFStorageManager`
489
- 2. Use git operations instead of filesystem
490
- 3. Maintain same API interface
491
- 4. Folder marker files (`.gitkeep`) enable empty folder tracking
492
-
493
- ### Integration Points
494
- - Folder creation → git mkdir + .gitkeep commit
495
- - File upload → git commit with file
496
- - Deletion → git remove file/folder
497
- - Listing → git tree navigation
498
- - Renaming → git move + commit
499
-
500
- ---
501
-
502
- ## 📊 Logging
503
-
504
- Logs are automatically saved and rotated:
505
-
506
- ```
507
- logs/
508
- ├── __main__.log
509
- ├── routes.api.log
510
- ├── storage.manager.log
511
- └── utils.logger.log
512
- ```
513
-
514
- - Max log file size: 10 MB
515
- - Backup count: 5 files
516
- - Format: `timestamp - logger - level - message`
517
-
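The rotation settings above map directly onto the standard library's `RotatingFileHandler`. The sketch below shows one way to wire them up. The helper name `get_logger` and the one-file-per-logger naming are assumptions inferred from the file listing, not the contents of `utils/logger.py`.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

MAX_LOG_BYTES = 10 * 1024 * 1024  # 10 MB per file, as stated in the README
BACKUP_COUNT = 5                  # keep five rotated backups
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def get_logger(name: str, log_dir: str = "logs") -> logging.Logger:
    """Return a logger writing to <log_dir>/<name>.log with size-based rotation.

    Hypothetical helper: the name and file layout are illustrative only.
    """
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    # Attach the handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = RotatingFileHandler(
            Path(log_dir) / f"{name}.log",
            maxBytes=MAX_LOG_BYTES,
            backupCount=BACKUP_COUNT,
        )
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

Calling `get_logger(__name__)` in each module would produce files such as `storage.manager.log`, matching the listing above.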
518
- ---
519
-
520
- ## 🛠️ Troubleshooting
521
-
522
- ### Port Already in Use
523
- ```bash
524
- # Change port in app.py or set environment variable
525
- export FLASK_PORT=5001
526
- python server/app.py
527
- ```
528
-
529
- ### Permission Denied Creating Files
530
- - Ensure write permission to `data/` directory
531
- - On Linux/Mac: `chmod 755 data/`
532
-
533
- ### CORS Issues
534
- - CORS is enabled by default for local development
535
- - Modify `server/app.py` for production settings
536
-
537
- ### 404 on API Endpoints
538
- - Check your base URL is `http://localhost:5000/api`
539
- - Verify endpoint path matches exactly
540
-
541
- ### Duplicate Files
542
- - Files are automatically renamed with `_1`, `_2`, etc.
543
- - Check `/api/list` to see actual filenames
544
-
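The auto-numbering described above could work along these lines. This is a sketch only: the helper name `unique_path` is hypothetical, and placing the `_1`, `_2` suffix before the file extension is an assumption, since the README does not say exactly where the suffix goes.

```python
from pathlib import Path


def unique_path(folder: Path, filename: str) -> Path:
    """Return a path in `folder` that does not collide with an existing file.

    Hypothetical helper: appends _1, _2, ... before the extension until the
    name is free (suffix placement is an assumption).
    """
    candidate = folder / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = folder / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

With this scheme, uploading `report.pdf` twice to the same folder would leave `report.pdf` and `report_1.pdf` on disk, which is why the listing endpoint is the place to confirm the stored name.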
545
- ---
546
-
547
- ## 📈 Performance
548
-
549
- - Average folder creation: < 10ms
550
- - File upload: Limited by disk I/O
551
- - Large file handling: Optimized with streaming
552
- - Concurrent requests: Thread-safe with Flask
553
-
554
- For high-volume operations, consider:
555
- - Database indexing (future upgrade)
556
- - Caching layer (Redis)
557
- - Background tasks (Celery)
558
-
559
- ---
560
-
561
- ## 📄 License
562
-
563
- This project is provided as-is for educational and commercial use.
564
-
565
- ---
566
-
567
- ## 🤝 Contributing
568
-
569
- Contributions welcome! Areas for enhancement:
570
- - Database backend integration
571
- - Advanced search functionality
572
- - File versioning
573
- - Collaborative features
574
- - Mobile app support
575
-
576
- ---
577
-
578
- ## 📞 Support
579
-
580
- For issues or questions:
581
- 1. Check the troubleshooting section
582
- 2. Review log files in `logs/`
583
- 3. Test with sample curl commands
584
- 4. Check configuration in `config.py`
585
-
586
- ---
587
-
588
- ## 🎓 Example Workflow
589
-
590
- ```bash
591
- # 1. Start server
592
- python server/app.py
593
-
594
- # 2. Create workspace
595
- curl -X POST http://localhost:5000/api/create-folder \
596
- -H "X-User-ID: user1" \
597
- -H "Content-Type: application/json" \
598
- -d '{"folder_path": "MyProject"}'
599
-
600
- # 3. Upload files
601
- curl -X POST http://localhost:5000/api/upload-file \
602
- -H "X-User-ID: user1" \
603
- -F "folder_path=MyProject" \
604
- -F "file=@document.pdf"
605
-
606
- # 4. List contents
607
- curl -X GET "http://localhost:5000/api/list?folder_path=MyProject" \
608
- -H "X-User-ID: user1"
609
-
610
- # 5. Check storage
611
- curl -X GET http://localhost:5000/api/storage-stats \
612
- -H "X-User-ID: user1"
613
- ```
614
-
615
- ---
616
-
617
- **DocVault v1.0** - Your offline-first document storage solution ✨
 
6
  sdk: static
7
  app_file: index.html
8
  pinned: false
9
+ ---