Karim shoair committed on
Commit 35e5856 · 1 Parent(s): 5ec834e

docs: Updating the contribution guide

.github/PULL_REQUEST_TEMPLATE.md CHANGED
@@ -5,10 +5,8 @@
 
  ## Proposed change
  <!--
- Describe the big picture of your changes here to communicate to the
- maintainers why we should accept this pull request. If it fixes a bug
- or resolves a feature request, be sure to link to that issue in the
- additional information section.
+ Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request.
+ If it fixes a bug or resolves a feature request, be sure to link to that issue in the additional information section.
  -->
 
 
@@ -34,12 +32,12 @@
 
  ### Additional information
  <!--
- Details are important, and help maintainers processing your PR.
+ Details are important and help maintainers processing your PR.
  Please be sure to fill out additional details, if applicable.
  -->
 
- - This PR fixes or closes issue: fixes #
- - This PR is related to issue:
+ - This PR fixes or closes an issue: fixes #
+ - This PR is related to an issue: #
  - Link to documentation pull request: **
 
  ### Checklist:
CONTRIBUTING.md CHANGED
@@ -1,39 +1,106 @@
  # Contributing to Scrapling
- Everybody is invited and welcome to contribute to Scrapling. Smaller changes have a better chance to get included in a timely manner. Adding unit tests for new features or test cases for bugs you've fixed help us to ensure that the Pull Request (PR) is fine.
 
- There is a lot to do...
- - If you are not a developer perhaps you would like to help with the [documentation](https://github.com/D4Vinci/Scrapling/tree/docs)?
- - If you are a developer, most of the features I'm planning to add in the future are moved to [roadmap file](https://github.com/D4Vinci/Scrapling/blob/main/ROADMAP.md) so consider reading it.
 
- Scrapling includes a comprehensive test suite which can be executed with pytest:
- ```bash
- $ pytest
- =============================== test session starts ===============================
- platform darwin -- Python 3.12.7, pytest-8.3.3, pluggy-1.5.0
- rootdir: /<some_where>/Scrapling
- configfile: pytest.ini
- plugins: cov-5.0.0, anyio-4.6.0
- collected 16 items
 
- tests/test_parser_functions.py ................ [100%]
 
- =============================== 16 passed in 0.22s ================================
- ```
- Also, consider setting the scrapling logging level to `debug` so it's easier to know what's happening in the background.
- ```python
- >>> import logging
- >>> logging.getLogger("scrapling").setLevel(logging.DEBUG)
- ```
 
- ### The process is straight-forward.
 
- - Read [How to get faster PR reviews](https://github.com/kubernetes/community/blob/master/contributors/guide/pull-requests.md#best-practices-for-faster-reviews) by Kubernetes (but skip step 0 and 1)
- - Fork Scrapling [git repository](https://github.com/D4Vinci/Scrapling).
- - Make your changes.
- - Ensure tests work.
- - Create a Pull Request against the [**dev**](https://github.com/D4Vinci/Scrapling/tree/dev) branch of Scrapling.
 
- ### Installing the latest changes from the dev branch
  ```commandline
  pip3 install git+https://github.com/D4Vinci/Scrapling.git@dev
  ```
 
  # Contributing to Scrapling
 
+ Thank you for your interest in contributing to Scrapling!
 
+ Everybody is invited and welcome to contribute to Scrapling.
 
+ Minor changes have a better chance of being included promptly. Adding unit tests for new features or test cases for bugs you've fixed helps us ensure that the Pull Request (PR) is acceptable.
+
+ There are many ways to contribute to Scrapling. Here are some of them:
+
+ - Report bugs and request features using [GitHub issues](https://github.com/D4Vinci/Scrapling/issues). Please follow the issue template to help us resolve your issue quickly.
+ - Blog about Scrapling. Tell the world how you’re using Scrapling. This will help newcomers with more examples and increase the Scrapling project's visibility.
+ - Join the [Discord community](https://discord.gg/EMgGbDceNQ) and share your ideas on how to improve Scrapling. We’re always open to suggestions.
+ - If you are not a developer, perhaps you would like to help with translating the [documentation](https://github.com/D4Vinci/Scrapling/tree/docs)?
 
 
+ ## Finding work
 
+ If you have decided to make a contribution to Scrapling, but you do not know what to contribute, here are some ways to find pending work:
 
+ - Check out the [contribution](https://github.com/D4Vinci/Scrapling/contribute) GitHub page, which lists open issues tagged as `good first issue`. These issues provide a good starting point.
+ - There are also the [help wanted](https://github.com/D4Vinci/Scrapling/issues?q=is%3Aissue%20label%3A%22help%20wanted%22%20state%3Aopen) issues, but know that some may require familiarity with the Scrapling code base first. You can also target any other issue, provided it is not tagged as `invalid`, `wontfix`, or a similar tag.
+ - If you enjoy writing automated tests, you can work on increasing our test coverage. Currently, the test coverage is around 90–92%.
+ - Join the [Discord community](https://discord.gg/EMgGbDceNQ) and ask questions in the `#help` channel.
+
+ ## Coding style
+ Please follow the same coding conventions we use when writing code for Scrapling:
+ - We use [pre-commit](https://pre-commit.com/) to automatically address simple code issues before every commit, so please install it and run `pre-commit install` to set it up. This will install hooks to run [ruff](https://docs.astral.sh/ruff/), [bandit](https://github.com/PyCQA/bandit), and [vermin](https://github.com/netromdk/vermin) on every commit. We are currently using a workflow to automatically run these tools on every PR, so if your code doesn't pass these checks, the PR will be rejected.
+ - We use type hints for better code clarity and [pyright](https://github.com/microsoft/pyright) for static type checking, which depends on the type hints, of course.
+ - We follow the conventional commit message format described [here](https://gist.github.com/qoomon/5dfcdf8eec66a051ecd85625518cfd13#types). For example, we use the following prefixes for commit messages:
+
+ | Prefix      | When to use it           |
+ |-------------|--------------------------|
+ | `feat:`     | New feature added        |
+ | `fix:`      | Bug fix                  |
+ | `docs:`     | Documentation change/add |
+ | `test:`     | Tests                    |
+ | `refactor:` | Code refactoring         |
+ | `chore:`    | Maintenance tasks        |
+
+ Then include the details of the change in the body/description of the commit message.
+
+ Example:
+ ```
+ feat: add `adaptive` for similar elements
+
+ - Added find_similar() method
+ - Implemented pattern matching
+ - Added tests and documentation
+ ```
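As an illustration of the commit message convention above, here is a minimal sketch of a checker for the subject line. It is not part of Scrapling or its pre-commit hooks; the function name `check_commit_message` and the blank-line rule are assumptions made for this example, and only the prefixes come from the table.

```python
import re

# Prefixes taken from the table above.
ALLOWED_PREFIXES = ("feat", "fix", "docs", "test", "refactor", "chore")

# A subject line is "<prefix>: <summary>" with a non-empty summary.
SUBJECT_PATTERN = re.compile(r"^(?P<prefix>[a-z]+): (?P<summary>\S.*)$")


def check_commit_message(message: str) -> list[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.splitlines()
    if not lines:
        return ["commit message is empty"]

    problems = []
    match = SUBJECT_PATTERN.match(lines[0])
    if match is None:
        problems.append("subject line should look like '<prefix>: <summary>'")
    elif match.group("prefix") not in ALLOWED_PREFIXES:
        problems.append(f"unknown prefix: {match.group('prefix')!r}")

    # Assumed rule for this sketch: a blank line separates subject and body.
    if len(lines) > 1 and lines[1].strip():
        problems.append("leave a blank line between the subject and the body")
    return problems
```

Calling `check_commit_message` on the example message above returns an empty list, while a subject such as `feature: add thing` is reported as having an unknown prefix.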
+
+ > Please don’t put your name in the code you contribute; git provides enough metadata to identify the author of the code.
+
+ ## Development
+ Setting the scrapling logging level to `debug` makes it easier to know what's happening in the background.
+ ```python
+ import logging
+ logging.getLogger("scrapling").setLevel(logging.DEBUG)
+ ```
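If no debug output appears after raising the level, the logger may have no handler attached in your environment. The sketch below uses only the standard `logging` module and the `scrapling` logger name from the snippet above; whether Scrapling already installs its own handler is not stated here, so the handler setup is an assumption and may be unnecessary.

```python
import logging

# "scrapling" is the logger name used in the snippet above.
logger = logging.getLogger("scrapling")
logger.setLevel(logging.DEBUG)

# Attach a handler only if none exists, so output is not duplicated.
if not logger.handlers:
    handler = logging.StreamHandler()  # writes to stderr by default
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)

logger.debug("debug logging is enabled")
```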
+ Bonus: You can install the beta of the upcoming update from the dev branch as follows:
  ```commandline
  pip3 install git+https://github.com/D4Vinci/Scrapling.git@dev
  ```
+
+ ## Building Documentation
+ Documentation is built using [MkDocs](https://www.mkdocs.org/). You can build it locally using the following commands:
+ ```bash
+ pip install mkdocs-material
+ mkdocs serve # Local preview
+ mkdocs build # Build the static site
+ ```
+
+ ## Tests
+ Scrapling includes a comprehensive test suite that can be executed with pytest. First, install all the libraries and pytest plugins listed in `tests/requirements.txt`. Running the tests then produces output like this:
+ ```bash
+ $ pytest tests -n auto
+ =============================== test session starts ===============================
+ platform darwin -- Python 3.13.8, pytest-8.4.2, pluggy-1.6.0 -- /Users/<redacted>/.venv/bin/python3.13
+ cachedir: .pytest_cache
+ rootdir: /Users/<redacted>/scrapling
+ configfile: pytest.ini
+ plugins: asyncio-1.2.0, anyio-4.11.0, xdist-3.8.0, httpbin-2.1.0, cov-7.0.0
+ asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=function, asyncio_default_test_loop_scope=function
+ 10 workers [271 items]
+ scheduling tests via LoadScheduling
+
+ ...<shortened>...
+
+ =============================== 271 passed in 52.68s ==============================
+ ```
+ We used `-n auto` in the command above to run the tests in parallel across worker processes, which speeds up the run.
+
+ Bonus: You can also check the test coverage with the `pytest-cov` plugin:
+ ```bash
+ pytest --cov=scrapling tests/
+ ```
+
+ ## Making a Pull Request
+ To ensure that your PR gets accepted, please make sure that your PR is based on the latest changes from the dev branch and that it satisfies the following requirements:
+
+ - The PR should be made against the [**dev**](https://github.com/D4Vinci/Scrapling/tree/dev) branch of Scrapling. Any PR made against the main branch will be rejected.
+ - The code should pass all available tests. We are using tox with GitHub's CI to run the current tests on all supported Python versions with every commit.
+ - The code should pass all the code quality checks mentioned above. We are using GitHub's CI to enforce the code style checks performed by pre-commit. If you were using the pre-commit hooks discussed above, you should not see any issues when committing your changes.
+ - Keep the code clean, explain any part that might be unclear, and remember to create a separate virtual environment for this project.
+ - If you are adding a new feature, please add tests for it.
+ - If you are fixing a bug, please include code in the PR that reproduces the bug.
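For the last two requirements, a reproduction usually takes the form of a small regression test. The sketch below shows the general shape only: `collapse_whitespace` is a hypothetical helper defined here for illustration, not a Scrapling function, and the bug it describes is invented. pytest collects functions whose names start with `test_`.

```python
def collapse_whitespace(text: str) -> str:
    """Hypothetical helper standing in for the code under test."""
    # The imagined buggy version used text.split(" "), which left tabs and
    # newlines untouched; str.split() with no argument splits on any whitespace.
    return " ".join(text.split())


def test_collapse_whitespace_handles_tabs_and_newlines():
    # Reproduction case: this input failed before the (imagined) fix.
    assert collapse_whitespace("a\t\tb\n c") == "a b c"


def test_collapse_whitespace_keeps_existing_behaviour():
    # Guard against the fix changing inputs that already worked.
    assert collapse_whitespace("  a  b ") == "a b"
```

Keeping the reproduction as a test means the same input is checked again on every later run, so the bug cannot return unnoticed.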
docs/contributing.md DELETED
@@ -1,102 +0,0 @@
- Thank you for your interest in contributing to Scrapling!
-
- Everybody is invited and welcome to contribute to Scrapling.
-
- Smaller changes have a better chance of getting included in a timely manner. Adding unit tests for new features or test cases for bugs you've fixed helps us to ensure that the Pull Request (PR) is acceptable.
-
- There is a lot to do...
-
- - If you are not a developer, you can help us improve the documentation.
- - If you are a developer, most of the features I'm planning to add in the future are moved to [roadmap file](https://github.com/D4Vinci/Scrapling/blob/main/ROADMAP.md), so consider reading it.
-
- ## Running tests
- Scrapling includes a comprehensive test suite that can be executed with pytest, but first, you need to install all libraries and `pytest-plugins` inside `tests/requirements.txt`. Then, running the tests will result in an output like this:
- ```bash
- $ pytest tests
- =============================== test session starts ===============================
- platform darwin -- Python 3.12.8, pytest-8.3.3, pluggy-1.5.0 -- /Users/<redacted>/.venv/bin/python3.12
- cachedir: .pytest_cache
- rootdir: /Users/<redacted>/scrapling
- configfile: pytest.ini
- plugins: cov-5.0.0, asyncio-0.25.0, base-url-2.1.0, httpbin-2.1.0, playwright-0.5.2, anyio-4.6.2.post1, xdist-3.6.1, typeguard-4.3.0
- asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=function
- collected 83 items
-
- ...<shortened>...
-
- =============================== 83 passed in 157.52s (0:02:37) =====================
- ```
- Hence, you can add `-n auto` to the command above to run tests in threads to increase speed.
-
- Bonus: You can also see the test coverage with the pytest plugin below
- ```bash
- pytest --cov=scrapling tests/
- ```
-
- ## Installing the latest unstable version from the dev branch
- ```bash
- pip3 install git+https://github.com/D4Vinci/Scrapling.git@dev
- ```
-
- ## Development
- Setting the scrapling logging level to `debug` makes it easier to know what's happening in the background.
- ```python
- >>> import logging
- >>> logging.getLogger("scrapling").setLevel(logging.DEBUG)
- ```
- ### Code Style
-
- We use:
-
- 1. Type hints for better code clarity
- 2. Flake8, bandit, isort, and other hooks through `pre-commit`. <br/>Please install the hooks before committing with:
- ```bash
- pip install pre-commit
- pre-commit install
- ```
- It will run automatically on the code you push with each commit.
- 3. Conventional commit messages format. We use the below format for commit messages
-
- | Prefix      | When to use it           |
- |-------------|--------------------------|
- | `feat:`     | New feature added        |
- | `fix:`      | Bug fix                  |
- | `docs:`     | Documentation change/add |
- | `test:`     | Tests                    |
- | `refactor:` | Code refactoring         |
- | `chore:`    | Maintenance tasks        |
-
- Example:
- ```
- feat: add `adaptive` for similar elements
-
- - Added find_similar() method
- - Implemented pattern matching
- - Added tests and documentation
- ```
-
- ### Push changes to the library
-
- Then, the process is straightforward.
-
- - Read [How to get faster PR reviews](https://github.com/kubernetes/community/blob/master/contributors/guide/pull-requests.md#best-practices-for-faster-reviews) by Kubernetes (but skip step 0 and 1)
- - Fork Scrapling [Git repository](https://github.com/D4Vinci/Scrapling.git).
- - Make your changes, and don't forget to create a separate virtual environment for this project.
- - Ensure all tests are passing.
- - Create a Pull Request against the [**dev**](https://github.com/D4Vinci/Scrapling/tree/dev) branch of Scrapling.
-
- A bonus: if you have more than one version of Python installed, you can use tox to run tests on each version with:
- ```bash
- pip install tox
- tox
- ```
-
- > Note: All tests are automatically run with each push on Github on all supported Python versions using tox, so ensure all tests pass, or your PR will not be accepted.
-
-
- ## Building Documentation
- ```bash
- pip install mkdocs-material
- mkdocs serve # Local preview
- mkdocs build # Build the static site
- ```
mkdocs.yml CHANGED
@@ -79,7 +79,7 @@ nav:
  - Writing your retrieval system: development/adaptive_storage_system.md
  - Using Scrapling's custom types: development/scrapling_custom_types.md
  - Support and Advertisement: donate.md
- - Contributing: contributing.md
+ - Contributing: 'https://github.com/D4Vinci/Scrapling/blob/main/CONTRIBUTING.md'
  - Changelog: 'https://github.com/D4Vinci/Scrapling/releases'
 
  markdown_extensions: