Commit 246cb17 · committed by GraziePrego · verified · 1 Parent(s): 19741f2

Add HTML API documentation

Files changed (1)
docs.html +349 -0
docs.html ADDED
@@ -0,0 +1,349 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <title>CyberScraper 2077 API Documentation</title>
+     <style>
+         * {
+             margin: 0;
+             padding: 0;
+             box-sizing: border-box;
+         }
+         body {
+             font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
+             line-height: 1.6;
+             color: #333;
+             background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+             min-height: 100vh;
+             padding: 20px;
+         }
+         .container {
+             max-width: 1200px;
+             margin: 0 auto;
+             background: white;
+             border-radius: 12px;
+             box-shadow: 0 10px 40px rgba(0,0,0,0.2);
+             overflow: hidden;
+         }
+         header {
+             background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+             color: white;
+             padding: 40px;
+             text-align: center;
+         }
+         header h1 {
+             font-size: 2.5em;
+             margin-bottom: 10px;
+         }
+         header p {
+             font-size: 1.2em;
+             opacity: 0.9;
+         }
+         .version {
+             display: inline-block;
+             background: rgba(255,255,255,0.2);
+             padding: 5px 15px;
+             border-radius: 20px;
+             margin-top: 15px;
+             font-size: 0.9em;
+         }
+         .content {
+             padding: 40px;
+         }
+         .section {
+             margin-bottom: 40px;
+         }
+         .section h2 {
+             color: #667eea;
+             border-bottom: 3px solid #667eea;
+             padding-bottom: 10px;
+             margin-bottom: 20px;
+             font-size: 1.8em;
+         }
+         .endpoint {
+             background: #f8f9fa;
+             border-left: 4px solid #667eea;
+             padding: 20px;
+             margin-bottom: 20px;
+             border-radius: 0 8px 8px 0;
+         }
+         .endpoint-header {
+             display: flex;
+             align-items: center;
+             margin-bottom: 15px;
+             flex-wrap: wrap;
+             gap: 10px;
+         }
+         .method {
+             display: inline-block;
+             padding: 5px 12px;
+             border-radius: 5px;
+             font-weight: bold;
+             font-size: 0.9em;
+             min-width: 80px;
+             text-align: center;
+         }
+         .method.get { background: #61affe; color: white; }
+         .method.post { background: #49cc90; color: white; }
+         .method.delete { background: #f93e3e; color: white; }
+         .path {
+             font-family: 'Courier New', monospace;
+             font-size: 1.1em;
+             color: #333;
+             font-weight: 600;
+         }
+         .description {
+             color: #666;
+             margin-bottom: 15px;
+         }
+         .parameters, .response {
+             background: white;
+             padding: 15px;
+             border-radius: 5px;
+             margin: 10px 0;
+         }
+         .parameters h4, .response h4 {
+             color: #667eea;
+             margin-bottom: 10px;
+         }
+         .code-block {
+             background: #2d2d2d;
+             color: #f8f8f2;
+             padding: 15px;
+             border-radius: 5px;
+             overflow-x: auto;
+             font-family: 'Courier New', monospace;
+             font-size: 0.9em;
+             margin: 10px 0;
+         }
+         .code-block pre {
+             margin: 0;
+         }
+         .quick-start {
+             background: #e8f4f8;
+             padding: 20px;
+             border-radius: 8px;
+             margin-bottom: 20px;
+         }
+         .quick-start ol {
+             margin-left: 20px;
+         }
+         .quick-start li {
+             margin: 8px 0;
+         }
+         .example-box {
+             background: #fff3cd;
+             border: 1px solid #ffc107;
+             padding: 15px;
+             border-radius: 5px;
+             margin: 10px 0;
+         }
+         .example-box h4 {
+             color: #856404;
+             margin-bottom: 10px;
+         }
+         footer {
+             background: #f8f9fa;
+             padding: 20px;
+             text-align: center;
+             color: #666;
+             border-top: 1px solid #ddd;
+         }
+         @media (max-width: 768px) {
+             header h1 {
+                 font-size: 1.8em;
+             }
+             .content {
+                 padding: 20px;
+             }
+             .endpoint-header {
+                 flex-direction: column;
+                 align-items: flex-start;
+             }
+         }
+     </style>
+ </head>
+ <body>
+     <div class="container">
+         <header>
+             <h1>🕷️ CyberScraper 2077 API</h1>
+             <p>Advanced Web Scraping API with AI-Powered Content Extraction</p>
+             <span class="version">Version 1.0.0</span>
+         </header>
+
+         <div class="content">
+             <div class="section">
+                 <h2>🚀 Quick Start</h2>
+                 <div class="quick-start">
+                     <ol>
+                         <li>Make a simple scrape request to <code>/api/scrape</code></li>
+                         <li>For multiple requests, create a session first using <code>/api/session</code></li>
+                         <li>Use the session ID for subsequent requests to <code>/api/session/{session_id}/scrape</code></li>
+                         <li>Always close sessions when done using <code>DELETE /api/session/{session_id}</code></li>
+                     </ol>
+                 </div>
+             </div>
+
+             <div class="section">
+                 <h2>📡 API Endpoints</h2>
+
+                 <div class="endpoint">
+                     <div class="endpoint-header">
+                         <span class="method get">GET</span>
+                         <span class="path">/health</span>
+                     </div>
+                     <p class="description">Check if the API is running</p>
+                     <div class="example-box">
+                         <h4>Example:</h4>
+                         <div class="code-block"><pre>curl https://grazieprego-scrapling.hf.space/health</pre></div>
+                     </div>
+                     <div class="response">
+                         <h4>Response:</h4>
+                         <div class="code-block"><pre>{
+   "status": "ok",
+   "message": "CyberScraper 2077 API is running"
+ }</pre></div>
+                     </div>
+                 </div>
+
+                 <div class="endpoint">
+                     <div class="endpoint-header">
+                         <span class="method post">POST</span>
+                         <span class="path">/api/scrape</span>
+                     </div>
+                     <p class="description">Stateless scrape request - creates a new extractor for each request</p>
+                     <div class="parameters">
+                         <h4>Request Body:</h4>
+                         <ul>
+                             <li><strong>url</strong> (string) - The URL to scrape</li>
+                             <li><strong>query</strong> (string) - The extraction query/instruction</li>
+                             <li><strong>model_name</strong> (string, optional) - AI model to use (default: 'alias-fast')</li>
+                         </ul>
+                     </div>
+                     <div class="example-box">
+                         <h4>Example (cURL):</h4>
+                         <div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/scrape \
+   -H "Content-Type: application/json" \
+   -d '{
+     "url": "https://example.com",
+     "query": "Extract all product prices"
+   }'</pre></div>
+                         <h4>Example (Python):</h4>
+                         <div class="code-block"><pre>import requests
+
+ # Stateless scrape: no session to create or close
+ response = requests.post(
+     'https://grazieprego-scrapling.hf.space/api/scrape',
+     json={'url': 'https://example.com', 'query': 'Extract prices'},
+     timeout=60,  # avoid hanging on slow-loading pages
+ )
+ # Print the parsed JSON response body
+ print(response.json())</pre></div>
+                     </div>
+                 </div>
+
+                 <div class="endpoint">
+                     <div class="endpoint-header">
+                         <span class="method post">POST</span>
+                         <span class="path">/api/session</span>
+                     </div>
+                     <p class="description">Create a persistent scraping session for multiple requests</p>
+                     <div class="parameters">
+                         <h4>Request Body:</h4>
+                         <ul>
+                             <li><strong>model_name</strong> (string, optional) - AI model to use (default: 'alias-fast')</li>
+                         </ul>
+                     </div>
+                     <div class="example-box">
+                         <h4>Example:</h4>
+                         <div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/session \
+   -H "Content-Type: application/json" \
+   -d '{"model_name": "alias-fast"}'</pre></div>
+                     </div>
+                 </div>
+
+                 <div class="endpoint">
+                     <div class="endpoint-header">
+                         <span class="method post">POST</span>
+                         <span class="path">/api/session/{session_id}/scrape</span>
+                     </div>
+                     <p class="description">Scrape using an existing session context (more efficient for multiple requests)</p>
+                     <div class="parameters">
+                         <h4>Path Parameters:</h4>
+                         <ul>
+                             <li><strong>session_id</strong> (string) - UUID of the session</li>
+                         </ul>
+                         <h4>Request Body:</h4>
+                         <ul>
+                             <li><strong>url</strong> (string) - The URL to scrape</li>
+                             <li><strong>query</strong> (string) - The extraction query</li>
+                             <li><strong>model_name</strong> (string, optional)</li>
+                         </ul>
+                     </div>
+                     <div class="example-box">
+                         <h4>Example:</h4>
+                         <div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/session/uuid-here/scrape \
+   -H "Content-Type: application/json" \
+   -d '{
+     "url": "https://example.com/page1",
+     "query": "Extract titles"
+   }'</pre></div>
+                     </div>
+                 </div>
+
+                 <div class="endpoint">
+                     <div class="endpoint-header">
+                         <span class="method delete">DELETE</span>
+                         <span class="path">/api/session/{session_id}</span>
+                     </div>
+                     <p class="description">Close a session and release resources</p>
+                     <div class="parameters">
+                         <h4>Path Parameters:</h4>
+                         <ul>
+                             <li><strong>session_id</strong> (string) - UUID of the session to close</li>
+                         </ul>
+                     </div>
+                     <div class="example-box">
+                         <h4>Example:</h4>
+                         <div class="code-block"><pre>curl -X DELETE https://grazieprego-scrapling.hf.space/api/session/uuid-here</pre></div>
+                     </div>
+                 </div>
+             </div>
+
+             <div class="section">
+                 <h2>💡 Best Practices</h2>
+                 <ul>
+                     <li>Use stateless <code>/api/scrape</code> for one-off requests</li>
+                     <li>Use sessions for batch processing multiple URLs</li>
+                     <li>Always close sessions when finished to free resources</li>
+                     <li>Handle errors gracefully (500 errors may occur on complex sites)</li>
+                     <li>Set appropriate timeouts for slow-loading pages</li>
+                     <li>Implement retry logic for production use</li>
+                 </ul>
+             </div>
+
+             <div class="section">
+                 <h2>⚠️ Error Handling</h2>
+                 <div class="parameters">
+                     <ul>
+                         <li><strong>404</strong> - Session not found (for session endpoints)</li>
+                         <li><strong>500</strong> - Internal server error - check the detail message</li>
+                     </ul>
+                     <p><strong>Common Issues:</strong></p>
+                     <ul>
+                         <li>URL unreachable or timeout</li>
+                         <li>JavaScript-heavy sites may require different approaches</li>
+                         <li>Bot protection may block requests</li>
+                     </ul>
+                 </div>
+             </div>
+         </div>
+
+         <footer>
+             <p>CyberScraper 2077 API - Powered by Scrapling & AI</p>
+             <p>Base URL: <a href="https://grazieprego-scrapling.hf.space">https://grazieprego-scrapling.hf.space</a></p>
+         </footer>
+     </div>
+ </body>
+ </html>
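
Note on the session workflow documented in docs.html: the Quick Start and Best Practices lists describe creating a session, scraping several URLs through it, always closing it, and adding timeouts and retries. The sketch below shows one way a client might combine those steps. It is only an illustration under stated assumptions: docs.html does not show the response of `POST /api/session`, so the `session_id` field name is a guess, and the helper names (`call_api`, `with_retries`, `scrape_batch`), the retry count, and the backoff delays are hypothetical choices rather than part of the API.

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "https://grazieprego-scrapling.hf.space"  # base URL given in docs.html


def call_api(method, path, body=None, timeout=60):
    """Send one JSON request and return the decoded JSON body (hypothetical helper)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    # The timeout keeps a slow-loading page from blocking the client forever.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        return json.loads(raw.decode("utf-8")) if raw else None


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); retry 5xx and network errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except urllib.error.HTTPError as exc:
            # A 4xx (for example 404, session not found) will not improve on retry.
            if exc.code < 500 or attempt == attempts - 1:
                raise
        except urllib.error.URLError:
            if attempt == attempts - 1:
                raise
        sleep(base_delay * 2 ** attempt)


def scrape_batch(urls, query, api=call_api, retry=with_retries):
    """Create a session, scrape each URL through it, and always close it."""
    created = api("POST", "/api/session", {"model_name": "alias-fast"})
    session_id = created["session_id"]  # ASSUMPTION: field name is not shown in docs.html
    results = {}
    try:
        for url in urls:
            results[url] = retry(
                lambda: api(
                    "POST",
                    f"/api/session/{session_id}/scrape",
                    {"url": url, "query": query},
                )
            )
    finally:
        # Close the session even if a scrape failed, as the docs recommend.
        api("DELETE", f"/api/session/{session_id}")
    return results
```

A call such as `scrape_batch(["https://example.com/page1"], "Extract titles")` would then run the whole create, scrape, close cycle. Passing `api` and `retry` as parameters is a design choice that lets the flow be exercised with a stub instead of the live service.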