Merge branch 'main' into upgrade-github-actions-node24-general
.github/workflows/code-quality.yml
CHANGED
```diff
@@ -37,12 +37,12 @@ jobs:
 
     steps:
       - name: Checkout code
-        uses: actions/checkout@
+        uses: actions/checkout@v6
         with:
           fetch-depth: 0 # Full history for better analysis
 
       - name: Set up Python
-        uses: actions/setup-python@
+        uses: actions/setup-python@v6
         with:
           python-version: '3.10'
           cache: 'pip'
@@ -177,7 +177,7 @@ jobs:
 
       - name: Upload Bandit report
         if: always() && steps.bandit.outcome != 'skipped'
-        uses: actions/upload-artifact@
+        uses: actions/upload-artifact@v6
         with:
           name: bandit-security-report
           path: bandit-report.json
```
|
.github/workflows/docker-build.yml
CHANGED
```diff
@@ -25,7 +25,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@
+        uses: actions/checkout@v6
 
       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v3
```
.github/workflows/release-and-publish.yml
CHANGED
```diff
@@ -18,7 +18,7 @@ jobs:
       contents: write
       id-token: write
     steps:
-      - uses: actions/checkout@
+      - uses: actions/checkout@v6
         with:
           fetch-depth: 0
 
@@ -27,7 +27,7 @@ jobs:
         run: echo "title=${{ github.event.pull_request.title }}" >> $GITHUB_OUTPUT
 
       - name: Save PR body to file
-        uses: actions/github-script@
+        uses: actions/github-script@v8
         with:
           script: |
             const fs = require('fs');
@@ -57,7 +57,7 @@ jobs:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
 
       - name: Set up Python
-        uses: actions/setup-python@
+        uses: actions/setup-python@v6
         with:
           python-version: 3.12
 
```
.github/workflows/tests.yml
CHANGED
```diff
@@ -44,10 +44,10 @@ jobs:
       TOXENV: py313
 
     steps:
-      - uses: actions/checkout@
+      - uses: actions/checkout@v6
 
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@
+        uses: actions/setup-python@v6
         with:
           python-version: ${{ matrix.python-version }}
           cache: 'pip'
@@ -69,7 +69,7 @@ jobs:
 
       - name: Retrieve Playwright browsers from cache if any
         id: playwright-cache
-        uses: actions/cache@
+        uses: actions/cache@v5
         with:
           path: |
             ~/.cache/ms-playwright
@@ -92,7 +92,7 @@ jobs:
 
       # Cache tox environments
       - name: Cache tox environments
-        uses: actions/cache@
+        uses: actions/cache@v5
         with:
           path: .tox
           # Include python version and os in the cache key
```
docs/tutorials/external.md
DELETED
````diff
@@ -1,34 +0,0 @@
-
-If you have issues with the browser installation, such as resource management, we recommend you try the Cloud Browser from [Scrapeless](https://www.scrapeless.com/en/product/scraping-browser?utm_source=official&utm_term=scrapling) for free!
-
-The usage is straightforward: create an account and [get your API key](https://docs.scrapeless.com/en/scraping-browser/quickstart/getting-started/?utm_source=official&utm_term=scrapling), then pass it to the `DynamicSession` like this:
-
-```python
-from urllib.parse import urlencode
-
-from scrapling.fetchers import DynamicSession
-
-# Configure your browser session
-config = {
-    "token": "YOUR_API_KEY",
-    "sessionName": "scrapling-session",
-    "sessionTTL": "300",  # 5 minutes
-    "proxyCountry": "ANY",
-    "sessionRecording": "false",
-}
-
-# Build WebSocket URL
-ws_endpoint = f"wss://browser.scrapeless.com/api/v2/browser?{urlencode(config)}"
-print('Connecting to Scrapeless...')
-
-with DynamicSession(cdp_url=ws_endpoint, disable_resources=True) as s:
-    print("Connected!")
-    page = s.fetch("https://httpbin.org/headers", network_idle=True)
-    print(f"Page loaded, content length: {len(page.body)}")
-    print(page.json())
-```
-The `DynamicSession` class instance will work as usual, so no further explanation is needed.
-
-However, the Scrapeless Cloud Browser can be configured with proxy options, like the proxy country in the config above, [custom fingerprint](https://docs.scrapeless.com/en/scraping-browser/features/advanced-privacy-anti-detection/custom-fingerprint/?utm_source=official&utm_term=scrapling) configuration, [captcha solving](https://docs.scrapeless.com/en/scraping-browser/features/advanced-privacy-anti-detection/supported-captchas/?utm_source=official&utm_term=scrapling), and more.
-
-Check out the [Scrapeless's browser documentation](https://docs.scrapeless.com/en/scraping-browser/quickstart/introduction/?utm_source=official&utm_term=scrapling) for more details.
````
zensical.toml
CHANGED
```diff
@@ -50,8 +50,7 @@ nav = [
     ]},
     {Tutorials = [
         {"A Free Alternative to AI for Robust Web Scraping" = "tutorials/replacing_ai.md"},
-        {"Migrating from BeautifulSoup" = "tutorials/migrating_from_beautifulsoup.md"},
-        {"Using Scrapeless browser" = "tutorials/external.md"}
+        {"Migrating from BeautifulSoup" = "tutorials/migrating_from_beautifulsoup.md"}
     ]},
     {Development = [
         {"API Reference" = [
```