Karim Shoair committed

Commit 87facc2
Parent(s): 983da92

docs: update the titles for all files
- docs/development/adaptive_storage_system.md +2 -0
- docs/development/scrapling_custom_types.md +2 -0
- docs/fetching/choosing.md +2 -0
- docs/fetching/dynamic.md +1 -1
- docs/fetching/static.md +1 -1
- docs/fetching/stealthy.md +1 -1
- docs/parsing/adaptive.md +1 -1
- docs/parsing/main_classes.md +1 -1
- docs/parsing/selection.md +1 -1
- docs/spiders/architecture.md +1 -1
- docs/spiders/getting-started.md +2 -0
- docs/spiders/proxy-blocking.md +1 -1
- docs/spiders/requests-responses.md +1 -1
- docs/spiders/sessions.md +1 -1
docs/development/adaptive_storage_system.md CHANGED

@@ -1,3 +1,5 @@
+# Writing your retrieval system
+
 Scrapling uses SQLite by default, but this tutorial shows how to write your own storage system to store element properties for the `adaptive` feature.
 
 You might want to use Firebase, for example, and share the database between multiple spiders on different machines. It's a great idea to use an online database like that because spiders can share adaptive data with each other.
docs/development/scrapling_custom_types.md CHANGED

@@ -1,3 +1,5 @@
+# Using Scrapling's custom types
+
 > You can take advantage of the custom-made types for Scrapling and use them outside the library if you want. It's better than copying their code, after all :)
 
 ### All current types can be imported alone, like below
docs/fetching/choosing.md CHANGED

@@ -1,3 +1,5 @@
+# Fetchers basics
+
 ## Introduction
 Fetchers are classes that can do requests or fetch pages for you easily in a single-line fashion with many features and then return a [Response](#response-object) object. Starting with v0.3, all fetchers have separate classes to keep the session running, so for example, a fetcher that uses a browser will keep the browser open till you finish all your requests through it instead of opening multiple browsers. So it depends on your use case.
 
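The paragraph in this hunk contrasts one-shot fetchers with session classes that keep a browser open across requests. The design can be sketched generically like this, with a stub `Browser` standing in for a real headless browser (all names here are illustrative, not Scrapling's API):

```python
# Illustrative sketch of the one-browser-per-session idea described above.
# The Browser class is a stub, not real Scrapling or Playwright code.


class Browser:
    launches = 0  # count how many browsers get opened

    def __init__(self) -> None:
        Browser.launches += 1

    def fetch(self, url: str) -> str:
        return f"<html>{url}</html>"

    def close(self) -> None:
        pass


def one_shot_fetch(url: str) -> str:
    """Opens (and closes) a fresh browser for every single request."""
    browser = Browser()
    try:
        return browser.fetch(url)
    finally:
        browser.close()


class FetcherSession:
    """Keeps one browser alive for all requests made through the session."""

    def __enter__(self) -> "FetcherSession":
        self._browser = Browser()
        return self

    def fetch(self, url: str) -> str:
        return self._browser.fetch(url)

    def __exit__(self, *exc) -> None:
        self._browser.close()


with FetcherSession() as session:
    for url in ("https://a.test", "https://b.test", "https://c.test"):
        session.fetch(url)  # all three requests share a single browser

print(Browser.launches)  # → 1
```

The session form is what you want when making many browser-backed requests; the one-shot form is simpler when you only need a page or two.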
docs/fetching/dynamic.md CHANGED

@@ -1,4 +1,4 @@
-#
+# Fetching dynamic websites
 
 Here, we will discuss the `DynamicFetcher` class (formerly `PlayWrightFetcher`). This class provides flexible browser automation with multiple configuration options and little under-the-hood stealth improvements.
 
docs/fetching/static.md CHANGED

@@ -1,4 +1,4 @@
-#
+# HTTP requests
 
 The `Fetcher` class provides rapid and lightweight HTTP requests using the high-performance `curl_cffi` library with a lot of stealth capabilities.
 
docs/fetching/stealthy.md CHANGED

@@ -1,4 +1,4 @@
-#
+# Fetching dynamic websites with hard protections
 
 Here, we will discuss the `StealthyFetcher` class. This class is very similar to the [DynamicFetcher](dynamic.md#introduction) class, including the browsers, the automation, and the use of [Playwright's API](https://playwright.dev/python/docs/intro). The main difference is that this class provides advanced anti-bot protection bypass capabilities; most of them are handled automatically under the hood, and the rest is up to you to enable.
 
docs/parsing/adaptive.md CHANGED

@@ -1,4 +1,4 @@
-#
+# Adaptive scraping
 
 !!! success "Prerequisites"
 
docs/parsing/main_classes.md CHANGED

@@ -1,4 +1,4 @@
-#
+# Parsing main classes
 
 !!! success "Prerequisites"
 
docs/parsing/selection.md CHANGED

@@ -1,4 +1,4 @@
-#
+# Querying elements
 Scrapling currently supports parsing HTML pages exclusively, so it doesn't support XML feeds. This decision was made because the adaptive feature won't work with XML, but that might change soon, so stay tuned :)
 
 In Scrapling, there are five main ways to find elements:
docs/spiders/architecture.md CHANGED

@@ -1,4 +1,4 @@
-#
+# Spiders architecture
 
 !!! success "Prerequisites"
 
docs/spiders/getting-started.md CHANGED

@@ -1,3 +1,5 @@
+# Getting started
+
 ## Introduction
 
 !!! success "Prerequisites"
docs/spiders/proxy-blocking.md CHANGED

@@ -203,7 +203,7 @@ class MySpider(Spider):
         yield {"title": response.css("title::text").get("")}
 ```
 
-What happened above is that I left the blocking detection logic unchanged and
+What happened above is that I left the blocking detection logic unchanged and had the spider mainly use requests until it got blocked, then switch to the stealthy browser.
 
 
 Putting it all together:
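The sentence completed in this hunk describes a fallback strategy: use cheap HTTP requests until the site blocks you, then switch to a stealthy browser for the rest of the crawl. Sketched generically with stub fetch functions (none of these names are Scrapling's real API; swap in real HTTP and browser clients as needed):

```python
# Generic sketch of the "fast requests first, stealthy browser after a block"
# strategy. Both fetchers here are stubs for illustration only.
from typing import Callable

BLOCKED_STATUSES = {403, 429}  # status codes treated as "we got blocked"


def fetch_with_fallback(
    url: str,
    fast_fetch: Callable[[str], tuple[int, str]],
    stealthy_fetch: Callable[[str], tuple[int, str]],
    state: dict,
) -> tuple[int, str]:
    """Use the cheap fetcher until it gets blocked once, then stay stealthy."""
    if not state.get("blocked"):
        status, body = fast_fetch(url)
        if status not in BLOCKED_STATUSES:
            return status, body
        state["blocked"] = True  # remember the block for all later requests
    return stealthy_fetch(url)


# Stub fetchers: the fast one gets blocked on its second request.
calls = {"fast": 0}

def fast(url: str) -> tuple[int, str]:
    calls["fast"] += 1
    return (200, "ok") if calls["fast"] < 2 else (403, "blocked")

def stealthy(url: str) -> tuple[int, str]:
    return (200, "stealthy ok")

state: dict = {}
results = [
    fetch_with_fallback(f"https://site.test/{i}", fast, stealthy, state)
    for i in range(4)
]
print(results)
```

After the first 403, every subsequent request goes straight to the stealthy fetcher without wasting a doomed fast request first.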
docs/spiders/requests-responses.md CHANGED

@@ -1,4 +1,4 @@
-#
+# Requests & Responses
 
 !!! success "Prerequisites"
 
docs/spiders/sessions.md CHANGED

@@ -1,4 +1,4 @@
-#
+# Spiders sessions
 
 !!! success "Prerequisites"
 