-
Notifications
You must be signed in to change notification settings - Fork 187
Enforce ruff/flake8-bugbear rules (B) #1703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
e0bcbf2 to
c2e277c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for the PR.
Overall, the missing asserts are useful (really wondering why they're missing in the first place). Other changes do not look like they improve the code in a meaningful way (B007, B004, B010); we can stil implement the changes, however.
What I am really concerned with is actually enforcing the bugbear rules. Looking at the docs, a lot of them look fairly esoteric and definitely not something most people (especially new contributors) would be familiar with. Enforcing those rules would make contributing far more complicated than it already is.
As a result, those rules should not be added to the default linting. I'll add them to my own ruleset, but it should not be in the default pyproject.
|
|
||
| __all__ = ["ApplyToCols", "SingleColumnTransformer", "RejectColumn"] | ||
|
|
||
| _SELECTORS = selectors.all() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
selctors.all() isn't selecting all columns, it's selecting the all() selector, which selects all columns.
| _SELECTORS = selectors.all() | |
| # By default, select all columns | |
| _SELECT_ALL_COLUMNS = selectors.all() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same change in all the other files
|
|
||
| __all__ = ["ApplyToCols", "SingleColumnTransformer", "RejectColumn"] | ||
|
|
||
| _SELECTORS = selectors.all() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same change in all the other files
| from ._subsampling import SubsamplePreviews, env_with_subsampling | ||
| from ._utils import KFOLD_5, NULL, attribute_error | ||
|
|
||
| _SELECTORS = selectors.all() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| _SELECTORS = selectors.all() | |
| _SELECT_ALL_COLUMNS = selectors.all() |
|
As a more general comment for this and all the other PRs that you opened, formatting rules should be chosen by the maintainers of the project, rather than added in a dozen of separate PRs that need to be reviewed one by one. In this PR, a large part of the diff is "whatever", the The suggested rules are way too restrictive for new contributors, so that part of the PR is also not going to be merged, and should definitely have been discussed in an issue rather than added here. And this is only one of 20 very similar PRs that have been opened, and that I have to spend time on reviewing, making sure that none of the small changes that have been made across a bunch of files can lead to issues of some kind. In the future, please consider the time and effort that goes into reviewing all of this. I will still try to review the PRs that have already been opened, and track the suggested changes in an issue for discussion and possible implementation. The proper way of contributing to the repo would be opening a meta-issue that collects a series of sub-issues, so that it is possible to decide whether specific changes are warranted. This would take less time than reviewing PRs, and would not risk introducing bugs in the code. If it does not look like future PRs take this comment into consideration, I don't exclude closing any new PR like this without even looking at what's inside. |
|
ca75b1e to
b22ed93
Compare
I have kept the changes in this PR, but these rules will be ignored in the future. |
|
Still need to fix this new error in a test: Columns should be sorted in the same order as in the initial skrub/skrub/tests/test_multi_agg_joiner.py Lines 9 to 19 in 20598ed
|
|
Note that Ruff can automatically apply some rules, and pre-commit will enforce that without any additional work from contributors. Therefore, rules with automatic fixes are not a barrier, they result in consistent and readable code which can only help maintenance and new contributions. Ruff provides automatic fixes for many B rules, including B004, B007, B010. |
5bae622 to
4a2a41f
Compare
B004 Using `hasattr(x, "__call__")` to test if x is callable is unreliable.
Use `callable(x)` for consistent results.
B005 Using `.strip()` with multi-character strings is misleading
B006 Do not use mutable data structures for argument defaults
B007 Loop control variable not used within loop body
B008 Do not perform function call `selectors.all` in argument defaults;
instead, perform the call within the function, or read the default
from a module-level singleton variable
B010 Do not call `setattr` with a constant attribute value.
It is not any safer than normal property access.
B015 Pointless comparison.
a2bb309 to
d307610
Compare
Apply Repo-Review suggestion RF101:
https://learn.scientific-python.org/development/guides/repo-review/?repo=skrub-data%2Fskrub&ref=HEAD