Saad4web committed on
Commit
f061186
·
verified ·
1 Parent(s): f685cb6

Here’s a ready-to-use “meta-prompt” you can feed into your AI agent to kick off the local build of your Flashscore scraper:

You are a Senior JavaScript Automation Engineer. Your task is to scaffold and implement, step by step, a local Flashscore data-scraping tool in Node.js, using Playwright (or Puppeteer) and Cheerio. Follow these requirements exactly:

1. **Project Initialization**
   - Create a new npm project (`npm init -y`).
   - Install dependencies:
     ```bash
     npm install playwright cheerio axios dotenv fs-extra node-cron
     ```
2. **File Structure**
   Build this directory tree:

       flashscore-scraper/
       ├── src/
       │   ├── scrapers/
       │   │   ├── base-scraper.js      # launches browser, handles sessions, stealth
       │   │   ├── match-summary.js     # extracts match info & events
       │   │   └── lineups.js           # extracts formations & lineups
       │   ├── utils/
       │   │   ├── browser-manager.js   # singleton browser/context manager
       │   │   ├── data-processor.js    # cleans & normalizes scraped data
       │   │   └── proxy-manager.js     # rotates proxies & delays
       │   ├── models/
       │   │   ├── match-data.js        # JS class/schema for match summary
       │   │   └── team-data.js         # JS class/schema for lineup data
       │   └── index.js                 # CLI entrypoint & cron scheduler
       ├── config/
       │   └── settings.js              # base URL, selectors, proxy list, cron schedule
       ├── data/
       │   ├── matches/                 # JSON output files
       │   └── cache/                   # temporary HTML snapshots
       └── package.json

3. **Stealth & Throttling**
   In `base-scraper.js`, implement:
   - Realistic `User-Agent`, random delays (2–8 s) between actions.
   - Puppeteer extra stealth plugin or Playwright stealth options.
   - Proxy rotation every 50 requests.
   - Block images & ads via request interception.
4. **Scraper Modules**
   - **match-summary.js**: Navigate to a match URL, wait for `.match-summary` selector, scrape:
     - Teams, final score, date & time, half-time score.
     - Events array: goals (scorer/time/assist), cards, substitutions, injuries.
   - **lineups.js**: Navigate to `/lineups`, wait for lineup container, scrape:
     - Starting XI, substitutes, coaching staff, formation map.
5. **Data Models & Processing**
   - Define `MatchData` and `TeamData` classes with clear fields.
   - In `data-processor.js`, normalize time stamps, convert date strings to ISO, validate numeric scores.
6. **Scheduling & CLI**
   - In `index.js`, read a match URL from CLI or `.env`.
   - Schedule daily runs via `node-cron` (configurable cron expression).
   - Save JSON to `data/matches/<matchId>.json`.
7. **Error Handling & Logging**
   - Retry up to 3 times on network or selector errors with exponential backoff.
   - Log successes and failures to a rotating log file in `data/logs/`.
8. **Next Steps (after MVP)**
   - Add an Express API wrapper (`/api/match/:id`).
   - Build a simple dashboard to visualize scraped stats.
   - Integrate a caching layer (Redis or file-based) for repeated queries.

Please generate all boilerplate code accordingly, with comments explaining each major section. Start by creating `src/utils/browser-manager.js` and `src/scrapers/base-scraper.js`. Proceed one module at a time, and after each file, run a quick example invocation to verify connectivity to Flashscore.com.

- Initial Deployment
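Requirement 7 of the prompt asks for up to 3 retries with exponential backoff, but the committed files do not include such a helper. The following is only a minimal sketch of one way to do it. The name `retryWithBackoff`, its options (`retries`, `baseDelayMs`, `wait`), and the reading of "retry up to 3 times" as one initial attempt plus three retries are all assumptions, not part of the commit.

```javascript
// Hypothetical helper (not in the commit): retry an async task with exponential backoff.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff(task, { retries = 3, baseDelayMs = 1000, wait = sleep } = {}) {
  let lastError;
  // Attempt 0 is the initial call; attempts 1..retries are the retries.
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break; // no retries left, give up
      // The delay doubles after each failure: base, 2 * base, 4 * base, ...
      await wait(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

A scraper could then wrap its entry point, for example `await retryWithBackoff(() => scraper.scrape())`, where `scraper` stands for an instance such as `MatchSummaryScraper`. The `wait` option exists only so the delay can be replaced in tests.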

Files changed (2)
  1. README.md +7 -5
  2. index.html +643 -19
README.md CHANGED
@@ -1,10 +1,12 @@
1
  ---
2
- title: Fs
3
- emoji: 🌍
4
- colorFrom: green
5
- colorTo: red
6
  sdk: static
7
  pinned: false
 
 
8
  ---
9
 
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
1
  ---
2
+ title: fs
3
+ emoji: 🐳
4
+ colorFrom: yellow
5
+ colorTo: pink
6
  sdk: static
7
  pinned: false
8
+ tags:
9
+ - deepsite
10
  ---
11
 
12
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
index.html CHANGED
@@ -1,19 +1,643 @@
1
- <!doctype html>
2
- <html>
3
- <head>
4
- <meta charset="utf-8" />
5
- <meta name="viewport" content="width=device-width" />
6
- <title>My static Space</title>
7
- <link rel="stylesheet" href="style.css" />
8
- </head>
9
- <body>
10
- <div class="card">
11
- <h1>Welcome to your static Space!</h1>
12
- <p>You can modify this app directly by editing <i>index.html</i> in the Files and versions tab.</p>
13
- <p>
14
- Also don't forget to check the
15
- <a href="https://huggingface.co/docs/hub/spaces" target="_blank">Spaces documentation</a>.
16
- </p>
17
- </div>
18
- </body>
19
- </html>
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>Flashscore Scraper Project | Documentation</title>
7
+ <script src="https://cdn.tailwindcss.com"></script>
8
+ <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
9
+ <script>
10
+ tailwind.config = {
11
+ theme: {
12
+ extend: {
13
+ colors: {
14
+ primary: '#0d9488',
15
+ secondary: '#134e4a',
16
+ accent: '#14b8a6'
17
+ }
18
+ }
19
+ }
20
+ }
21
+ </script>
22
+ <style>
23
+ .file-tree li {
24
+ position: relative;
25
+ padding-left: 1.5rem;
26
+ }
27
+
28
+ .file-tree li:before {
29
+ content: '';
30
+ position: absolute;
31
+ left: 0;
32
+ top: 0;
33
+ bottom: 0;
34
+ width: 1px;
35
+ background-color: #cbd5e1;
36
+ }
37
+
38
+ .file-tree li:after {
39
+ content: '';
40
+ position: absolute;
41
+ left: 0;
42
+ top: 12px;
43
+ height: 1px;
44
+ width: 10px;
45
+ background-color: #cbd5e1;
46
+ }
47
+
48
+ .file-tree .folder:before {
49
+ font-family: "Font Awesome 6 Free";
50
+ content: "\f07b";
51
+ position: absolute;
52
+ left: -1.5rem;
53
+ top: 0;
54
+ font-weight: 900;
55
+ color: #0d9488;
56
+ }
57
+
58
+ .file-tree .file:before {
59
+ font-family: "Font Awesome 6 Free";
60
+ content: "\f15b";
61
+ position: absolute;
62
+ left: -1.5rem;
63
+ top: 0;
64
+ font-weight: 400;
65
+ color: #94a3b8;
66
+ }
67
+
68
+ .code-header {
69
+ font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace;
70
+ }
71
+
72
+ pre {
73
+ font-size: 0.875rem;
74
+ line-height: 1.5rem;
75
+ overflow-x: auto;
76
+ padding: 0;
77
+ background-color: #1e293b;
78
+ border-radius: 0.5rem;
79
+ margin: 0;
80
+ }
81
+
82
+ .code-container:not(.active) {
83
+ display: none;
84
+ }
85
+
86
+ .active-file {
87
+ background-color: #f8fafc;
88
+ }
89
+
90
+ @media (max-width: 768px) {
91
+ .sidebar-container {
92
+ max-height: 300px;
93
+ overflow-y: auto;
94
+ }
95
+ }
96
+ </style>
97
+ </head>
98
+ <body class="bg-slate-50 text-slate-800">
99
+ <!-- Navigation -->
100
+ <nav class="bg-gradient-to-r from-secondary to-primary text-white p-4 shadow-lg">
101
+ <div class="container mx-auto flex justify-between items-center">
102
+ <div class="flex items-center">
103
+ <i class="fas fa-database text-2xl mr-3"></i>
104
+ <h1 class="text-2xl font-bold">Flashscore Scraper</h1>
105
+ </div>
106
+ <div>
107
+ <span class="bg-white text-primary px-3 py-1 rounded-full text-sm font-bold">Node.js v18</span>
108
+ </div>
109
+ </div>
110
+ </nav>
111
+
112
+ <!-- Header -->
113
+ <header class="bg-gradient-to-r from-secondary/5 to-primary/5 py-12">
114
+ <div class="container mx-auto px-4">
115
+ <div class="max-w-4xl mx-auto text-center">
116
+ <h2 class="text-4xl font-bold text-slate-800 mb-4">Sports Data Extraction Tool</h2>
117
+ <p class="text-lg text-slate-600 mb-6">
118
+ Comprehensive Flashscore.com scraper built with Playwright and Cheerio.
119
+ Collects match data, lineups, statistics and schedules.
120
+ </p>
121
+ <div class="flex flex-wrap justify-center gap-3">
122
+ <span class="bg-white px-3 py-1 rounded-full text-sm font-medium border border-primary/30">
123
+ <i class="fas fa-play mr-1"></i> Playwright
124
+ </span>
125
+ <span class="bg-white px-3 py-1 rounded-full text-sm font-medium border border-primary/30">
126
+ <i class="fas fa-filter mr-1"></i> Cheerio
127
+ </span>
128
+ <span class="bg-white px-3 py-1 rounded-full text-sm font-medium border border-primary/30">
129
+ <i class="fas fa-clock mr-1"></i> CRON Scheduling
130
+ </span>
131
+ <span class="bg-white px-3 py-1 rounded-full text-sm font-medium border border-primary/30">
132
+ <i class="fas fa-robot mr-1"></i> Stealth Mode
133
+ </span>
134
+ </div>
135
+ </div>
136
+ </div>
137
+ </header>
138
+
139
+ <div class="container mx-auto px-4 py-8">
140
+ <!-- Project Structure -->
141
+ <section class="bg-white rounded-xl shadow-lg overflow-hidden mb-8">
142
+ <div class="border-b border-slate-200 py-4 px-6 flex items-center">
143
+ <i class="fas fa-sitemap text-primary mr-3"></i>
144
+ <h3 class="text-xl font-bold text-slate-800">Project Structure</h3>
145
+ </div>
146
+ <div class="p-6">
147
+ <div class="grid grid-cols-1 md:grid-cols-12 gap-6">
148
+ <div class="md:col-span-4 sidebar-container">
149
+ <div class="bg-slate-50 p-4 rounded-lg">
150
+ <h4 class="font-bold text-primary mb-3">Project Files</h4>
151
+ <ul class="file-tree text-sm space-y-1">
152
+ <li class="folder">flashscore-scraper
153
+ <ul class="pl-4 space-y-1">
154
+ <li class="folder">config
155
+ <ul class="pl-4 space-y-1">
156
+ <li class="file" data-file="settings.js">settings.js</li>
157
+ </ul>
158
+ </li>
159
+ <li class="folder">data
160
+ <ul class="pl-4 space-y-1">
161
+ <li class="folder">matches</li>
162
+ <li class="folder">cache</li>
163
+ </ul>
164
+ </li>
165
+ <li class="folder">src
166
+ <ul class="pl-4 space-y-1">
167
+ <li class="folder">models
168
+ <ul class="pl-4 space-y-1">
169
+ <li class="file" data-file="match-data.js">match-data.js</li>
170
+ <li class="file" data-file="team-data.js">team-data.js</li>
171
+ </ul>
172
+ </li>
173
+ <li class="folder">scrapers
174
+ <ul class="pl-4 space-y-1">
175
+ <li class="file" data-file="base-scraper.js">base-scraper.js</li>
176
+ <li class="file" data-file="match-summary.js">match-summary.js</li>
177
+ <li class="file" data-file="lineups.js">lineups.js</li>
178
+ </ul>
179
+ </li>
180
+ <li class="folder">utils
181
+ <ul class="pl-4 space-y-1">
182
+ <li class="file" data-file="browser-manager.js">browser-manager.js</li>
183
+ <li class="file" data-file="data-processor.js">data-processor.js</li>
184
+ <li class="file" data-file="proxy-manager.js">proxy-manager.js</li>
185
+ </ul>
186
+ </li>
187
+ <li class="file" data-file="index.js">index.js</li>
188
+ </ul>
189
+ </li>
190
+ <li class="file" data-file="package.json">package.json</li>
191
+ </ul>
192
+ </li>
193
+ </ul>
194
+ </div>
195
+ </div>
196
+
197
+ <div class="md:col-span-8">
198
+ <!-- Code View Tabs -->
199
+ <div class="code-container active" id="base-scraper.js">
200
+ <div class="code-header bg-slate-800 text-slate-200 px-4 py-2 rounded-t-lg flex justify-between">
201
+ <div>
202
+ <i class="far fa-file-code mr-2"></i>
203
+ <span class="font-mono">src/scrapers/base-scraper.js</span>
204
+ </div>
205
+ <div>
206
+ <span class="text-green-400">•</span>
207
+ <span class="text-xs ml-1">JavaScript</span>
208
+ </div>
209
+ </div>
210
+ <pre class="rounded-b-lg"><code class="language-javascript">const { chromium } = require('playwright-extra'); // Playwright wrapper that accepts puppeteer-extra plugins (npm install playwright-extra)
211
+ const StealthPlugin = require('puppeteer-extra-plugin-stealth')();
212
+ const UserAgent = require('user-agents');
213
+ const ProxyManager = require('../utils/proxy-manager');
214
+ chromium.use(StealthPlugin); // register the stealth evasions once, before any browser is launched
215
+
216
+ class BaseScraper {
217
+ constructor() {
218
+ this.proxyManager = new ProxyManager();
219
+ this.userAgent = new UserAgent();
220
+ }
221
+
222
+ async launchBrowser() {
223
+ const proxy = this.proxyManager.getNextProxy();
224
+ this.browser = await chromium.launch({
225
+ headless: true,
226
+ proxy: {
227
+ server: proxy,
228
+ },
229
+ args: [
230
+ '--disable-blink-features=AutomationControlled',
231
+ '--no-sandbox'
232
+ ],
233
+ });
234
+
235
+ this.context = await this.browser.newContext({
236
+ userAgent: this.userAgent.toString(),
237
+ viewport: { width: 1920, height: 1080 },
238
+ });
239
+
240
+ this.page = await this.context.newPage();
241
+
242
+ // Block unnecessary resources
243
+ await this.page.route(/\.(jpg|jpeg|png|gif|css|ads|adservice|googleadservices|doubleclick)/, route => route.abort());
244
+
245
+ // Stealth evasions come from the plugin registered via chromium.use() above;
246
+ // calling the plugin's onPageCreated hook directly on a Playwright page has no effect.
247
+ }
248
+
249
+ async navigateTo(url) {
250
+ await this.page.goto(url, { waitUntil: 'networkidle', timeout: 60000 });
251
+ await this.randomDelay(2000, 8000);
252
+ }
253
+
254
+ async randomDelay(min, max) {
255
+ const delay = Math.floor(Math.random() * (max - min + 1)) + min;
256
+ await this.page.waitForTimeout(delay);
257
+ }
258
+
259
+ async closeBrowser() {
260
+ if (this.browser) await this.browser.close(); // launch may have failed before a browser existed
261
+ }
262
+ }
263
+
264
+ module.exports = BaseScraper;</code></pre>
265
+ </div>
266
+
267
+ <div class="code-container" id="browser-manager.js">
268
+ <div class="code-header bg-slate-800 text-slate-200 px-4 py-2 rounded-t-lg flex justify-between">
269
+ <div>
270
+ <i class="far fa-file-code mr-2"></i>
271
+ <span class="font-mono">src/utils/browser-manager.js</span>
272
+ </div>
273
+ <div>
274
+ <span class="text-green-400">•</span>
275
+ <span class="text-xs ml-1">JavaScript</span>
276
+ </div>
277
+ </div>
278
+ <pre class="rounded-b-lg"><code class="language-javascript">const { chromium } = require('playwright');
279
+ const singleton = Symbol();
280
+ const singletonEnforcer = Symbol();
281
+
282
+ class BrowserManager {
283
+ constructor(enforcer) {
284
+ if (enforcer !== singletonEnforcer) {
285
+ throw new Error('Cannot construct singleton');
286
+ }
287
+ this.browser = null;
288
+ }
289
+
290
+ static get instance() {
291
+ if (!this[singleton]) {
292
+ this[singleton] = new BrowserManager(singletonEnforcer);
293
+ }
294
+ return this[singleton];
295
+ }
296
+
297
+ async launch() {
298
+ if (!this.browser || !this.browser.isConnected()) {
299
+ this.browser = await chromium.launch({
300
+ headless: true,
301
+ args: [
302
+ '--disable-blink-features=AutomationControlled',
303
+ '--no-sandbox'
304
+ ],
305
+ });
306
+ }
307
+ return this.browser;
308
+ }
309
+
310
+ async newContext() {
311
+ const browser = await this.launch();
312
+ return browser.newContext({
313
+ viewport: { width: 1920, height: 1080 },
314
+ userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36'
315
+ });
316
+ }
317
+
318
+ async close() {
319
+ if (this.browser) {
320
+ await this.browser.close();
321
+ this.browser = null;
322
+ }
323
+ }
324
+ }
325
+
326
+ module.exports = BrowserManager;</code></pre>
327
+ </div>
328
+
329
+ <div class="code-container" id="match-summary.js">
330
+ <div class="code-header bg-slate-800 text-slate-200 px-4 py-2 rounded-t-lg flex justify-between">
331
+ <div>
332
+ <i class="far fa-file-code mr-2"></i>
333
+ <span class="font-mono">src/scrapers/match-summary.js</span>
334
+ </div>
335
+ <div>
336
+ <span class="text-green-400">•</span>
337
+ <span class="text-xs ml-1">JavaScript</span>
338
+ </div>
339
+ </div>
340
+ <pre class="rounded-b-lg"><code class="language-javascript">const BaseScraper = require('./base-scraper');
341
+ const cheerio = require('cheerio');
342
+ const MatchData = require('../models/match-data');
343
+ const DataProcessor = require('../utils/data-processor');
344
+
345
+ class MatchSummaryScraper extends BaseScraper {
346
+ constructor(matchUrl) {
347
+ super();
348
+ this.matchUrl = matchUrl;
349
+ this.matchData = new MatchData();
350
+ }
351
+
352
+ async scrape() {
353
+ try {
354
+ await this.launchBrowser();
355
+ await this.navigateTo(this.matchUrl);
356
+ await this.page.waitForSelector('.matchSummary', { timeout: 10000 });
357
+
358
+ const html = await this.page.content();
359
+ this.matchData = this.parseMatchSummary(html);
360
+ await this.closeBrowser();
361
+ return this.matchData;
362
+ } catch (error) {
363
+ console.error('Error scraping match summary:', error);
364
+ await this.closeBrowser();
365
+ throw error;
366
+ }
367
+ }
368
+
369
+ parseMatchSummary(html) {
370
+ const $ = cheerio.load(html);
371
+ const match = new MatchData();
372
+
373
+ // Parse teams and scores
374
+ match.homeTeam = $('.home-team-name').text().trim();
375
+ match.awayTeam = $('.away-team-name').text().trim();
376
+ match.score = $('.score').text().trim();
377
+ match.halfTimeScore = $('.half-time-score').text().trim();
378
+
379
+ // Parse date and time
380
+ match.date = $('.match-date').attr('data-date');
381
+ match.time = $('.match-time').attr('data-time');
382
+
383
+ // Parse match events
384
+ $('.event-row').each((i, element) => {
385
+ const event = {
386
+ type: $(element).find('.event-type').text().trim(),
387
+ time: $(element).find('.event-time').text().trim(),
388
+ player: $(element).find('.event-player').text().trim(),
389
+ team: ($(element).attr('class') || '').includes('home') ? 'home' : 'away' // attr() is undefined when no class is set
390
+ };
391
+
392
+ match.events.push(event);
393
+ });
394
+
395
+ // Data normalization
396
+ match.date = DataProcessor.normalizeDate(match.date);
397
+ match.events = DataProcessor.normalizeEvents(match.events);
398
+
399
+ return match;
400
+ }
401
+ }
402
+
403
+ module.exports = MatchSummaryScraper;</code></pre>
404
+ </div>
405
+
406
+ <div class="code-container" id="match-data.js">
407
+ <div class="code-header bg-slate-800 text-slate-200 px-4 py-2 rounded-t-lg flex justify-between">
408
+ <div>
409
+ <i class="far fa-file-code mr-2"></i>
410
+ <span class="font-mono">src/models/match-data.js</span>
411
+ </div>
412
+ <div>
413
+ <span class="text-green-400">•</span>
414
+ <span class="text-xs ml-1">JavaScript</span>
415
+ </div>
416
+ </div>
417
+ <pre class="rounded-b-lg"><code class="language-javascript">class MatchData {
418
+ constructor() {
419
+ this.id = '';
420
+ this.homeTeam = '';
421
+ this.awayTeam = '';
422
+ this.competition = '';
423
+ this.status = '';
424
+ this.date = '';
425
+ this.time = '';
426
+ this.score = '';
427
+ this.halfTimeScore = '';
428
+ this.venue = '';
429
+ this.attendance = 0;
430
+ this.referee = '';
431
+ this.events = [];
432
+ this.statistics = {};
433
+ this.lastUpdated = new Date();
434
+ }
435
+
436
+ addEvent(event) {
437
+ this.events.push(event);
438
+ }
439
+
440
+ addStatistic(type, value) {
441
+ this.statistics[type] = value;
442
+ }
443
+
444
+ toJSON() {
445
+ return {
446
+ id: this.id,
447
+ homeTeam: this.homeTeam,
448
+ awayTeam: this.awayTeam,
449
+ competition: this.competition,
450
+ status: this.status,
451
+ date: this.date,
452
+ time: this.time,
453
+ score: this.score,
454
+ halfTimeScore: this.halfTimeScore,
455
+ venue: this.venue,
456
+ attendance: this.attendance,
457
+ referee: this.referee,
458
+ events: this.events,
459
+ statistics: this.statistics,
460
+ lastUpdated: this.lastUpdated.toISOString()
461
+ };
462
+ }
463
+ }
464
+
465
+ module.exports = MatchData;</code></pre>
466
+ </div>
467
+ </div>
468
+ </div>
469
+ </div>
470
+ </section>
471
+
472
+ <!-- Features -->
473
+ <section class="mb-12">
474
+ <h3 class="text-2xl font-bold text-center mb-8 text-slate-800">Project Features</h3>
475
+ <div class="grid grid-cols-1 md:grid-cols-3 gap-6">
476
+ <!-- Feature 1 -->
477
+ <div class="bg-white rounded-xl shadow-lg p-6 transition-all hover:shadow-xl">
478
+ <div class="w-16 h-16 bg-primary/10 rounded-full flex items-center justify-center mb-4">
479
+ <i class="fas fa-user-secret text-primary text-2xl"></i>
480
+ </div>
481
+ <h4 class="text-xl font-bold mb-2 text-slate-800">Stealth Mode</h4>
482
+ <p class="text-slate-600">Avoid detection with advanced techniques like randomized user agents, request delays, and proxy rotation.</p>
483
+ </div>
484
+
485
+ <!-- Feature 2 -->
486
+ <div class="bg-white rounded-xl shadow-lg p-6 transition-all hover:shadow-xl">
487
+ <div class="w-16 h-16 bg-primary/10 rounded-full flex items-center justify-center mb-4">
488
+ <i class="fas fa-history text-primary text-2xl"></i>
489
+ </div>
490
+ <h4 class="text-xl font-bold mb-2 text-slate-800">Scheduled Scraping</h4>
491
+ <p class="text-slate-600">Regularly collect data using cron scheduling and automated retries with exponential backoff.</p>
492
+ </div>
493
+
494
+ <!-- Feature 3 -->
495
+ <div class="bg-white rounded-xl shadow-lg p-6 transition-all hover:shadow-xl">
496
+ <div class="w-16 h-16 bg-primary/10 rounded-full flex items-center justify-center mb-4">
497
+ <i class="fas fa-th-large text-primary text-2xl"></i>
498
+ </div>
499
+ <h4 class="text-xl font-bold mb-2 text-slate-800">Modular Architecture</h4>
500
+ <p class="text-slate-600">Clean separation of concerns with independent modules for scraping, data processing, and utilities.</p>
501
+ </div>
502
+ </div>
503
+ </section>
504
+
505
+ <!-- Installation -->
506
+ <section class="bg-gradient-to-r from-primary/10 to-secondary/10 rounded-xl p-8 mb-12">
507
+ <div class="max-w-4xl mx-auto">
508
+ <h3 class="text-2xl font-bold text-center mb-6 text-slate-800">Installation & Usage</h3>
509
+
510
+ <div class="bg-white rounded-xl shadow-lg p-6 mb-6">
511
+ <div class="flex items-start">
512
+ <div class="w-10 h-10 rounded-full bg-primary text-white flex items-center justify-center mr-4 flex-shrink-0">
513
+ 1
514
+ </div>
515
+ <div>
516
+ <h4 class="text-xl font-bold mb-2 text-slate-800">Initialize Project</h4>
517
+ <pre class="bg-slate-800 text-green-400 rounded-lg p-4"><code class="language-bash"># Create project directory
518
+ mkdir flashscore-scraper
519
+ cd flashscore-scraper
520
+
521
+ # Initialize npm project
522
+ npm init -y
523
+
524
+ # Install dependencies
525
+ npm install playwright cheerio axios dotenv fs-extra node-cron user-agents puppeteer-extra-plugin-stealth</code></pre>
526
+ </div>
527
+ </div>
528
+ </div>
529
+
530
+ <div class="bg-white rounded-xl shadow-lg p-6 mb-6">
531
+ <div class="flex items-start">
532
+ <div class="w-10 h-10 rounded-full bg-primary text-white flex items-center justify-center mr-4 flex-shrink-0">
533
+ 2
534
+ </div>
535
+ <div>
536
+ <h4 class="text-xl font-bold mb-2 text-slate-800">Configure Environment</h4>
537
+ <p class="text-slate-600 mb-4">Create a <code class="bg-slate-100 px-2 py-1 rounded">.env</code> file with your configuration:</p>
538
+ <pre class="bg-slate-800 text-yellow-300 rounded-lg p-4"><code class="language-bash"># Proxy configuration
539
+ PROXY_SERVERS="http://user:pass@proxy1.com:8080,http://user:pass@proxy2.com:8080"
540
+
541
+ # Flashscore base URL
542
+ BASE_URL="https://www.flashscore.com"
543
+
544
+ # Schedule - every day at midnight
545
+ CRON_SCHEDULE="0 0 * * *"
546
+
547
+ # Output directory
548
+ DATA_DIR="./data"</code></pre>
549
+ </div>
550
+ </div>
551
+ </div>
552
+
553
+ <div class="bg-white rounded-xl shadow-lg p-6">
554
+ <div class="flex items-start">
555
+ <div class="w-10 h-10 rounded-full bg-primary text-white flex items-center justify-center mr-4 flex-shrink-0">
556
+ 3
557
+ </div>
558
+ <div>
559
+ <h4 class="text-xl font-bold mb-2 text-slate-800">Run Scraper</h4>
560
+ <p class="text-slate-600 mb-4">Execute the main script directly or set up a cron job:</p>
561
+ <pre class="bg-slate-800 text-amber-200 rounded-lg p-4"><code class="language-bash"># Run once for a specific match
562
+ node src/index.js --matchId "123456"
563
+
564
+ # Or run on a schedule according to settings.js
565
+ node src/index.js --cron</code></pre>
566
+ </div>
567
+ </div>
568
+ </div>
569
+ </div>
570
+ </section>
571
+
572
+ <!-- Footer -->
573
+ <footer class="bg-slate-900 text-white rounded-xl p-8">
574
+ <div class="max-w-6xl mx-auto">
575
+ <div class="flex flex-col md:flex-row justify-between items-center">
576
+ <div class="mb-6 md:mb-0">
577
+ <h3 class="text-2xl font-bold mb-4">Flashscore Scraper</h3>
578
+ <p class="text-slate-400">
579
+ Robust and scalable data extraction solution<br>
580
+ Built for developers by developers
581
+ </p>
582
+ </div>
583
+
584
+ <div class="flex space-x-6">
585
+ <div>
586
+ <h4 class="font-bold text-primary mb-2">Technology Stack</h4>
587
+ <ul class="text-slate-400 text-sm space-y-1">
588
+ <li><i class="fas fa-play mr-2"></i> Node.js</li>
589
+ <li><i class="fas fa-window-restore mr-2"></i> Playwright</li>
590
+ <li><i class="fas fa-filter mr-2"></i> Cheerio</li>
591
+ </ul>
592
+ </div>
593
+
594
+ <div>
595
+ <h4 class="font-bold text-primary mb-2">Documentation</h4>
596
+ <ul class="text-slate-400 text-sm space-y-1">
597
+ <li><i class="fas fa-book mr-2"></i> GitHub</li>
598
+ <li><i class="fas fa-code mr-2"></i> API Reference</li>
599
+ <li><i class="fas fa-exclamation-circle mr-2"></i> FAQ</li>
600
+ </ul>
601
+ </div>
602
+ </div>
603
+ </div>
604
+
605
+ <div class="border-t border-slate-800 mt-8 pt-8 text-center text-slate-500">
606
+ <p>&copy; 2023 Flashscore Scraper Project. All rights reserved.</p>
607
+ </div>
608
+ </div>
609
+ </footer>
610
+ </div>
611
+
612
+ <script>
613
+ document.addEventListener('DOMContentLoaded', function() {
614
+ // File selection logic
615
+ const fileItems = document.querySelectorAll('.file');
616
+ const codeContainers = document.querySelectorAll('.code-container');
617
+
618
+ fileItems.forEach(file => {
619
+ file.addEventListener('click', function() {
620
+ const fileId = this.getAttribute('data-file');
621
+
622
+ // Update active file styling
623
+ fileItems.forEach(item => item.classList.remove('active-file'));
624
+ this.classList.add('active-file');
625
+
626
+ // Show selected code container
627
+ codeContainers.forEach(container => {
628
+ container.classList.remove('active');
629
+ if (container.id === fileId) {
630
+ container.classList.add('active');
631
+ }
632
+ });
633
+ });
634
+ });
635
+
636
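+ // Activate base-scraper.js by default (the first tree entry, settings.js, has no code panel)
637
+ if (fileItems.length > 0) {
638
+ (document.querySelector('.file[data-file="base-scraper.js"]') || fileItems[0]).click();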
639
+ }
640
+ });
641
+ </script>
642
+ <p style="border-radius: 8px; text-align: center; font-size: 12px; color: #fff; margin-top: 16px;position: fixed; left: 8px; bottom: 8px; z-index: 10; background: rgba(0, 0, 0, 0.8); padding: 4px 8px;">Made with <img src="https://enzostvs-deepsite.hf.space/logo.svg" alt="DeepSite Logo" style="width: 16px; height: 16px; vertical-align: middle;display:inline-block;margin-right:3px;filter:brightness(0) invert(1);"><a href="https://enzostvs-deepsite.hf.space" style="color: #fff;text-decoration: underline;" target="_blank" >DeepSite</a> - 🧬 <a href="https://enzostvs-deepsite.hf.space?remix=Saad4web/fs" style="color: #fff;text-decoration: underline;" target="_blank" >Remix</a></p></body>
643
+ </html>
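`base-scraper.js` in this commit calls `this.proxyManager.getNextProxy()`, but `src/utils/proxy-manager.js` itself is not among the changed files. The following is a minimal sketch of what such a module might look like. It assumes the comma-separated `PROXY_SERVERS` variable from the `.env` example and the "rotate every 50 requests" rule from the prompt. The constructor parameters and internal field names are hypothetical.

```javascript
// Hypothetical sketch of src/utils/proxy-manager.js (not part of the commit).
class ProxyManager {
  constructor(
    // Assumption: proxies come from the comma-separated PROXY_SERVERS variable.
    proxies = (process.env.PROXY_SERVERS || '')
      .split(',')
      .map((entry) => entry.trim())
      .filter(Boolean),
    rotateEvery = 50
  ) {
    this.proxies = proxies;
    this.rotateEvery = rotateEvery;
    this.requestCount = 0; // number of proxies handed out so far
    this.index = 0;        // position of the proxy currently in use
  }

  // Returns the proxy for the next request and moves on to the next proxy
  // in the list after every `rotateEvery` calls. Returns undefined if no
  // proxies are configured.
  getNextProxy() {
    if (this.proxies.length === 0) return undefined;
    if (this.requestCount > 0 && this.requestCount % this.rotateEvery === 0) {
      this.index = (this.index + 1) % this.proxies.length;
    }
    this.requestCount += 1;
    return this.proxies[this.index];
  }
}
```

Because `BaseScraper.launchBrowser()` asks for a proxy once per launch, this sketch counts browser launches, not individual HTTP requests. With an empty list it returns `undefined`, so a caller would need to leave out the `proxy` launch option in that case.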