[pylint] Implement consider-using-assignment-expr (R6103) #13196

Open
wants to merge 11 commits into
base: main

vincevannoort
Contributor

@vincevannoort vincevannoort commented Sep 1, 2024

Summary

This pull request implements the R6103 pylint rule: pylint documentation.

It flags assignments that are directly followed by an if statement whose condition uses the assigned variable:

test1 = "example"
if test1:
    print(test1)

It then suggests using the := operator instead:

if test1 := "example":
    print(test1)
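
As a slightly larger illustration, the two forms can be compared in a runnable sketch. The function names `describe_plain` and `describe_walrus` are hypothetical and not part of this pull request; the sketch only shows that both forms behave the same when the assignment sits directly before the `if`.

```python
def describe_plain(mapping: dict, key: str) -> str:
    # Assignment directly followed by an `if` that tests the assigned name:
    # this is the pattern the rule is meant to report.
    value = mapping.get(key)
    if value:
        return f"found {value}"
    return "missing"


def describe_walrus(mapping: dict, key: str) -> str:
    # Same logic with the assignment folded into the condition.
    if value := mapping.get(key):
        return f"found {value}"
    return "missing"
```

Both functions return the same result for any input; the walrus form only removes the separate assignment line.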

Test Plan

I have added test cases, and checked some of the ruff ecosystem results.


codspeed-hq bot commented Sep 1, 2024

CodSpeed Performance Report

Merging #13196 will not alter performance

Comparing vincevannoort:consider-using-assignment-expr (c3a8ae3) with main (2ca7872)

Summary

✅ 32 untouched benchmarks

@vincevannoort vincevannoort changed the title from "[WIP] Pylint R6103" to "[pylint][WIP] Implement R6103" on Sep 1, 2024
@vincevannoort vincevannoort changed the title from "[pylint][WIP] Implement R6103" to "[WIP] [pylint] Implement R6103" on Sep 1, 2024
Contributor

github-actions bot commented Sep 1, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+5110 -0 violations, +0 -0 fixes in 15 projects; 39 projects unchanged)

aiven/aiven-client (+27 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ aiven/client/argx.py:158:16: PLR6103 Use walrus operator `(arg_list := getattr(func, ARG_LIST_PROP, None))`.
+ aiven/client/argx.py:251:16: PLR6103 Use walrus operator `(cat := tuple(cats[:level + 1]))`.
+ aiven/client/cli.py:1451:12: PLR6103 Use walrus operator `(route := self.args.route)`.
+ aiven/client/cli.py:1468:12: PLR6103 Use walrus operator `(privatelink_connection_id := self.args.privatelink_connection_id)`.
+ aiven/client/cli.py:1590:12: PLR6103 Use walrus operator `(service_type := match and match.group(2))`.
+ aiven/client/cli.py:1800:16: PLR6103 Use walrus operator `(value := arg_vars[key])`.
+ aiven/client/cli.py:1821:12: PLR6103 Use walrus operator `(access_control := self._parse_access_control())`.
+ aiven/client/cli.py:2000:16: PLR6103 Use walrus operator `(cert := user.get("access_cert"))`.
+ aiven/client/cli.py:2008:16: PLR6103 Use walrus operator `(key := user.get("access_key"))`.
+ aiven/client/cli.py:2684:12: PLR6103 Use walrus operator `(is_tiered := "remote_storage_enable" in topic["config"] and topic["config"]["remote_storage_enable"]["value"])`.
... 17 additional changes omitted for project

PlasmaPy/PlasmaPy (+51 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ docs/notebooks/langmuir_samples/_generate_noisy.ipynb:cell 6:2:4: PLR6103 Use walrus operator `(save := False)`.
+ noxfile.py:468:18: PLR6103 Use walrus operator `(extraneous_files := source_directory.glob("changelog/*[0-9]*.*.rst?*"))`.
+ src/plasmapy/analysis/nullpoint.py:1453:24: PLR6103 Use walrus operator `(loc := _locate_null_point(vspace, [i, j, k], maxiter, err))`.
+ src/plasmapy/analysis/nullpoint.py:1456:28: PLR6103 Use walrus operator `(p := NullPoint(loc, null_type))`.
+ src/plasmapy/analysis/nullpoint.py:706:40: PLR6103 Use walrus operator `(z_close := np.isclose(root[2], r[2], atol=_EQUALITY_ATOL))`.
+ src/plasmapy/analysis/swept_langmuir/floating_potential.py:280:12: PLR6103 Use walrus operator `(isl_window := np.abs(np.r_[rtn_extras["islands"][-1]][-1] - np.r_[rtn_extras["islands"][0]][0]) + 1)`.
+ src/plasmapy/analysis/swept_langmuir/floating_potential.py:299:12: PLR6103 Use walrus operator `(iadd := istop - istart + 1 - min_points)`.
+ src/plasmapy/diagnostics/charged_particle_radiography/synthetic_radiography.py:1215:8: PLR6103 Use walrus operator `(percentage := np.sum(intensity) / d["nparticles"])`.
+ src/plasmapy/diagnostics/charged_particle_radiography/synthetic_radiography.py:745:12: PLR6103 Use walrus operator `(n_wrong_way := np.sum(np.where(self.theta > np.pi / 2, 1, 0)))`.
+ src/plasmapy/diagnostics/thomson.py:889:16: PLR6103 Use walrus operator `(key := f"{p}_{num!s}")`.
... 41 additional changes omitted for project

apache/airflow (+1406 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ airflow/api/auth/backend/kerberos_auth.py:155:12: PLR6103 Use walrus operator `(header := request.headers.get("Authorization"))`.
+ airflow/api/client/__init__.py:32:12: PLR6103 Use walrus operator `(session_factory := getattr(backend, "create_client_session", None))`.
+ airflow/api/client/local_client.py:49:12: PLR6103 Use walrus operator `(dag_run := trigger_dag.trigger_dag(dag_id=dag_id, triggered_by=DagRunTriggeredByType.CLI, run_id=run_id, conf=conf, execution_date=execution_date, replace_microseconds=replace_microseconds))`.
+ airflow/api/common/airflow_health.py:42:12: PLR6103 Use walrus operator `(latest_scheduler_job := SchedulerJobRunner.most_recent_job())`.
+ airflow/api/common/airflow_health.py:52:12: PLR6103 Use walrus operator `(latest_triggerer_job := TriggererJobRunner.most_recent_job())`.
+ airflow/api/common/airflow_health.py:64:12: PLR6103 Use walrus operator `(latest_dag_processor_job := DagProcessorJobRunner.most_recent_job())`.
+ airflow/api/common/delete_dag.py:61:8: PLR6103 Use walrus operator `(running_tis := session.scalar(select(models.TaskInstance.state).where(models.TaskInstance.dag_id == dag_id).where(models.TaskInstance.state == TaskInstanceState.RUNNING).limit(1)))`.
+ airflow/api/common/delete_dag.py:64:8: PLR6103 Use walrus operator `(dag := session.scalar(select(DagModel).where(DagModel.dag_id == dag_id).limit(1)))`.
+ airflow/api/common/mark_tasks.py:134:8: PLR6103 Use walrus operator `(dag := next(iter(task_dags)))`.
+ airflow/api/common/mark_tasks.py:208:8: PLR6103 Use walrus operator `(latest_execution_date := dag.get_latest_execution_date(session=session))`.
+ airflow/api/common/trigger_dag.py:135:8: PLR6103 Use walrus operator `(dag_model := DagModel.get_current(dag_id))`.
+ airflow/api/common/trigger_dag.py:76:12: PLR6103 Use walrus operator `(min_dag_start_date := dag.default_args["start_date"])`.
+ airflow/api/common/trigger_dag.py:89:8: PLR6103 Use walrus operator `(dag_run := DagRun.find_duplicate(dag_id=dag_id, execution_date=execution_date, run_id=run_id))`.
... 1393 additional changes omitted for project

apache/superset (+337 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ RELEASING/changelog.py:112:12: PLR6103 Use walrus operator `(github_login := self._github_login_cache.get(author_name))`.
+ RELEASING/changelog.py:116:16: PLR6103 Use walrus operator `(pr_info := self._fetch_github_pr(git_log.pr_number))`.
+ RELEASING/changelog.py:132:12: PLR6103 Use walrus operator `(pr_number := git_log.pr_number)`.
+ RELEASING/changelog.py:134:16: PLR6103 Use walrus operator `(detail := self._pr_logs_with_details.get(pr_number))`.
+ RELEASING/changelog.py:141:12: PLR6103 Use walrus operator `(pr_type := re.match(SUPERSET_PULL_REQUEST_TYPES, title))`.
+ RELEASING/changelog.py:163:16: PLR6103 Use walrus operator `(risk_label := re.match(SUPERSET_RISKY_LABELS, label.name))`.
+ RELEASING/changelog.py:284:12: PLR6103 Use walrus operator `(current_head := self._git_get_current_head())`.
+ RELEASING/changelog.py:307:12: PLR6103 Use walrus operator `(match := re.match(".*\\(\\#(\\d*)\\)", split_log_item[4]))`.
+ scripts/benchmark_migration.py:115:20: PLR6103 Use walrus operator `(table := foreign_key.column.table.name)`.
+ scripts/benchmark_migration.py:194:16: PLR6103 Use walrus operator `(missing := min_entities - model_rows[model])`.
... 327 additional changes omitted for project

aws/aws-sam-cli (+235 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ samcli/cli/cli_config_file.py:148:20: PLR6103 Use walrus operator `(allow_multiple := options_map[config_name].multiple)`.
+ samcli/cli/cli_config_file.py:319:12: PLR6103 Use walrus operator `(param_value := ctx.params.get(param_name, None))`.
+ samcli/cli/command.py:111:16: PLR6103 Use walrus operator `(row := param.get_help_record(ctx))`.
+ samcli/cli/context.py:145:12: PLR6103 Use walrus operator `(click_core_ctx := click.get_current_context())`.
+ samcli/cli/context.py:197:12: PLR6103 Use walrus operator `(click_core_ctx := click.get_current_context())`.
+ samcli/cli/types.py:267:12: PLR6103 Use walrus operator `(equals_count := tag_value.count("="))`.
+ samcli/cli/types.py:360:12: PLR6103 Use walrus operator `(equals_count := signing_profile.count(":"))`.
+ samcli/commands/_utils/click_mutex.py:76:20: PLR6103 Use walrus operator `(has_all_required_params := False not in [required_param in opts for required_param in required_params])`.
+ samcli/commands/_utils/command_exception_handler.py:75:20: PLR6103 Use walrus operator `(exception_handler := (additional_mapping or {}).get(exception_type))`.
+ samcli/commands/_utils/command_exception_handler.py:85:24: PLR6103 Use walrus operator `(handler := exception_handler.get_handler(exception_type))`.
... 225 additional changes omitted for project

bokeh/bokeh (+185 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ examples/basic/data/ajax_source.py:41:12: PLR6103 Use walrus operator `(requested_headers := request.headers.get('Access-Control-Request-Headers'))`.
+ examples/basic/data/server_sent_events_source.py:43:12: PLR6103 Use walrus operator `(requested_headers := request.headers.get('Access-Control-Request-Headers'))`.
+ examples/basic/scatters/markertypes.py:35:12: PLR6103 Use walrus operator `(name := f"{base}_{kind}" if kind else base)`.
+ examples/output/jupyter/push_notebook/Numba Image Example.ipynb:cell 13:12:8: PLR6103 Use walrus operator `(ksum := np.sum(kernel))`.
+ examples/output/jupyter/push_notebook/Numba Image Example.ipynb:cell 18:8:8: PLR6103 Use walrus operator `(kernel := kernels.get(kernel_name, None))`.
+ examples/reference/models/dropdown_menu_server.py:24:8: PLR6103 Use walrus operator `(active_dropdown := dropdown.value)`.
+ examples/reference/models/radio_button_group_server.py:28:8: PLR6103 Use walrus operator `(active_radio := radio_button_group.active)`.
+ examples/reference/models/radio_group_server.py:28:8: PLR6103 Use walrus operator `(active_radio := radio_group.active)`.
+ examples/reference/models/select_server.py:28:8: PLR6103 Use walrus operator `(active_select := select.value)`.
+ examples/server/app/gapminder/main.py:60:8: PLR6103 Use walrus operator `(year := slider.value + 1)`.
... 175 additional changes omitted for project

freedomofpress/securedrop (+74 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ admin/bootstrap.py:63:8: PLR6103 Use walrus operator `(return_code := popen.wait())`.
+ admin/securedrop_admin/__init__.py:1240:12: PLR6103 Use walrus operator `(return_code := args.func(args))`.
+ admin/securedrop_admin/__init__.py:204:16: PLR6103 Use walrus operator `(text := document.text.replace(" ", ""))`.
+ admin/securedrop_admin/__init__.py:254:16: PLR6103 Use walrus operator `(text := document.text)`.
+ admin/securedrop_admin/__init__.py:268:16: PLR6103 Use walrus operator `(text := document.text)`.
+ admin/securedrop_admin/__init__.py:278:16: PLR6103 Use walrus operator `(text := document.text)`.
+ install_files/ansible-base/roles/restore/files/compare_torrc.py:24:16: PLR6103 Use walrus operator `(m := service_re.match(line))`.
+ journalist_gui/journalist_gui/SecureDropUpdater.py:23:8: PLR6103 Use walrus operator `(pwd_flag := subprocess.check_output(["passwd", "--status"]).decode("utf-8").split()[1])`.
+ journalist_gui/journalist_gui/resources_rc.py:1015:4: PLR6103 Use walrus operator `(qt_version := QtCore.qVersion().split("."))`.
+ journalist_gui/test_gui.py:70:12: PLR6103 Use walrus operator `(qApp := QApplication.instance())`.
... 64 additional changes omitted for project

fronzbot/blinkpy (+21 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ blinkpy/blinkpy.py:188:20: PLR6103 Use walrus operator `(network_id := str(owl["network_id"]))`.
+ blinkpy/blinkpy.py:212:20: PLR6103 Use walrus operator `(network_id := str(lotus["network_id"]))`.
+ blinkpy/blinkpy.py:242:20: PLR6103 Use walrus operator `(camera_network := str(network["network_id"]))`.
+ blinkpy/blinkpy.py:306:12: PLR6103 Use walrus operator `(last_refresh := self.last_refresh)`.
+ blinkpy/camera.py:140:12: PLR6103 Use walrus operator `(res := await api.request_get_config(self.sync.blink, self.network_id, self.camera_id, product_type=self.product_type))`.
+ blinkpy/camera.py:169:12: PLR6103 Use walrus operator `(res := await api.request_update_config(self.sync.blink, self.network_id, self.camera_id, product_type=self.product_type, data=data))`.
+ blinkpy/camera.py:221:12: PLR6103 Use walrus operator `(response := await self.get_media())`.
+ blinkpy/camera.py:342:28: PLR6103 Use walrus operator `(recent := {"time": self.last_record, "clip": self.clip})`.
+ blinkpy/camera.py:374:16: PLR6103 Use walrus operator `(response := await self.get_media())`.
+ blinkpy/camera.py:379:16: PLR6103 Use walrus operator `(response := await self.get_media(media_type="video"))`.
... 11 additional changes omitted for project

... Truncated remaining completed project reports due to GitHub comment length restrictions

Changes by rule (1 rules affected)

code     total  + violation  - violation  + fix  - fix
PLR6103  5110   5110         0            0      0

Comment on lines +1134 to +1136
if checker.enabled(Rule::UnnecessaryAssignment) {
    pylint::rules::unnecessary_assignment(checker, if_);
}
Contributor Author

I applied this rule by checking whether the statement preceding an IfStmt is an AssignStmt. However, I am wondering whether it would be better to check instead whether the statement following an AssignStmt is an IfStmt.

Member

What benefits do you see in testing the next statement after an AssignStmt?

My intuition here is that there are probably more assignments than if statements. Therefore, running the rule on if nodes might be faster overall?

Contributor Author

I agree that assignments are far more common than if statements.

I think the answer depends on the costs for checking previous_statement and next_statement.

My thinking is that retrieving the previous statement with the newly added previous_statement might be more expensive, because it has to iterate over the preceding statements via previous_statements (if there is a better way, please let me know).

By contrast, when checking an assignment, looking up next_statement might be a cheap operation.

Do you have any idea? If they have equal cost I think the current implementation is fine. 😄
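
To make the adjacency check concrete, here is a minimal sketch using Python's standard `ast` module. It is not ruff's Rust implementation, and `find_assign_then_if` is a hypothetical helper. It walks each statement list in sibling pairs, so "previous statement of an if" and "next statement of an assignment" collapse into the same comparison. For brevity it only inspects `body` lists (not `orelse` or `finalbody`) and only bare-name conditions.

```python
import ast


def find_assign_then_if(source: str) -> list:
    """Return (line, name) for each `name = ...` directly followed by an
    `if` whose test is that bare name."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        body = getattr(node, "body", None)
        if not isinstance(body, list):
            # Expression nodes such as lambdas have a non-list `body`.
            continue
        # Pair every statement with its successor in the same block.
        for prev, nxt in zip(body, body[1:]):
            if (
                isinstance(prev, ast.Assign)
                and len(prev.targets) == 1
                and isinstance(prev.targets[0], ast.Name)
                and isinstance(nxt, ast.If)
                and isinstance(nxt.test, ast.Name)
                and nxt.test.id == prev.targets[0].id
            ):
                hits.append((nxt.lineno, nxt.test.id))
    return hits
```

With sibling lists available, either direction is a constant-time neighbour lookup; the cost question raised above only arises when the previous sibling has to be searched for.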

@vincevannoort vincevannoort changed the title from "[WIP] [pylint] Implement R6103" to "[WIP] [pylint] Implement consider-using-assignment-expr (R6103)" on Sep 7, 2024
@vincevannoort vincevannoort changed the title from "[WIP] [pylint] Implement consider-using-assignment-expr (R6103)" to "[pylint] Implement consider-using-assignment-expr (R6103)" on Sep 7, 2024
Comment on lines +1250 to +1252
pub fn previous_statement(&self, stmt: &'a Stmt) -> Option<&Stmt> {
    self.previous_statements(stmt)?.next()
}
Contributor Author

I needed something like this function, not sure if there is an existing better way?

Member

That makes sense. I think we can make it more efficient with #13895, which avoids an extra find and collect.

@vincevannoort vincevannoort marked this pull request as ready for review September 7, 2024 12:12
@MichaReiser MichaReiser added the "rule" (Implementing or modifying a lint rule) and "preview" (Related to preview mode features) labels on Sep 8, 2024
Member

@MichaReiser MichaReiser left a comment

Thanks for this contribution. My plane is about to land, so I'll have to finish the review at a later time.

I only had enough time to quickly glance over the Rust code. We should look into removing the many node.clone() calls because cloning nodes is fairly expensive and probably unnecessary. Let me know if you need some guidance on how to remove the clone calls (it probably requires adding some lifetimes).

Regarding the rule's naming.

  • I did a quick search to see how we referred to := in other rules. There are not many usages but named expression (walrus operator) is the most common form.
  • The rule name seems too generic to me and its name is very similar to unnecessary-assign (which we should rename to unnecessary-assignment). Reading through the examples, the rule is mainly about assigning a value that is then only used in an if condition. I need to think a bit more about what a good rule name could be. Maybe you have an idea?

Comment on lines +19 to +30
bad5 = (
    'example',
    'example',
    'example',
    'example',
    'example',
    'example',
    'example',
    'example',
    'example',
    'example',
)
Member

I like the example because it shows a potentially controversial use case. I would probably prefer the assignment to keep the if smaller.

@vincevannoort
Contributor Author

> Thanks for this contribution. My plane is about to land. I've to finish the review at a later time.

Thanks for the review, much appreciated! 👍

> I only had enough time to quickly glance over the Rust code. We should look into removing the many node.clone() calls because it is fairly expensive to clone nodes and probably unnecessary. Let me know if you need some guidance on how to remove the clone calls (It probably requires adding some lifetimes)

I have tried and was able to remove almost all clone calls, except a few which I think are needed for returning the diagnostic. Could you take a look at the remaining ones and see if any of the 3 can still be removed?

> Regarding the rule's naming.
>
>   • I did a quick search to see how we referred to := in other rules. There are not many usages but named expression (walrus operator) is the most common form.
>   • The rule name seems too generic to me and its name is very similar to unnecessary-assign (which we should rename to unnecessary-assignment). Reading through the examples the rule mainly is about assigning a value that is then only used in an if condition. I need to think a bit more about what a good rule name could be. Maybe you have an idea?

I agree, here are some possible options:

  1. unnecessary_assignment_before_if_stmt
  2. redundant_assignment_before_if_stmt
  3. standalone_assignment_before_if_stmt

These seem close to what the lint is trying to prevent, do you have other ideas in mind?

@vincevannoort
Contributor Author

Hey @MichaReiser, once you have the time, would you mind giving this pull request another look? 😄

@MichaReiser MichaReiser self-assigned this Sep 20, 2024
@vincevannoort
Contributor Author

Hey @MichaReiser, could you or someone else from the team have a look? 😄

Member

@MichaReiser MichaReiser left a comment

Thanks for the ping and sorry for the late review. I pushed a few smaller refactors to avoid unnecessary collects.

I created a PR that should allow us to implement a more efficient previous_statement here.

I think the rule has to become cleverer, at least when we want to support handling elif cases because today's implementation can result in changes that fail at runtime.

};
}

// case - elif else clauses (`elif test1:`)
Member

Is there a reason why compare expressions aren't handled inside elif_else clauses?

Comment on lines 111 to 119
errors.extend(
    stmt.elif_else_clauses
        .iter()
        .filter(|elif_else_clause| elif_else_clause.test.is_some())
        .filter_map(|elif_else_clause| {
            let elif_check = elif_else_clause.test.as_ref().unwrap();
            find_assignment_before_if_stmt(semantic, elif_check, elif_check)
        })
        .collect::<Vec<AssignmentBeforeIfStmt>>(),
Member

Changing assignments to walrus operators in elif branches is semantically incorrect if the variable is used afterwards, because the variable is never bound when an earlier branch is taken:

>>> if True: ...
... elif x :=10: ...
... 
Ellipsis
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
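
The same failure can be reproduced inside functions, which is closer to how the flagged code usually looks. This is a minimal sketch; `original` and `rewritten` are hypothetical names. Inside a function the error surfaces as UnboundLocalError, which is a subclass of NameError.

```python
def original(flag: bool) -> int:
    # `x` is bound unconditionally before the branches.
    x = 10
    if flag:
        pass
    elif x:
        pass
    return x


def rewritten(flag: bool) -> int:
    # After moving the assignment into the `elif` test, `x` is only
    # bound when the first branch is NOT taken.
    if flag:
        pass
    elif x := 10:
        pass
    return x  # raises UnboundLocalError when flag is True
```

Calling `original(True)` returns 10, while `rewritten(True)` raises, so the rewrite is only safe when the name is not read after a path that skips the `elif` test.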

    error: AssignmentBeforeIfStmt,
) -> Diagnostic {
    let (origin, expr_name, assignment) = error;
    let assignment_expr = generator.expr(&assignment.value);
Member

What's the motivation for calling into the generator here? We try to avoid using the generator because it removes comments. Could we instead take the assignment value as it is in the source (using locator)? Note: We have to be careful about parenthesized expressions.
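
The difference between regenerating an expression and slicing it from the source can be sketched with Python's standard `ast` module as a stand-in. This is an analogy only: ruff's generator and locator are Rust components and their APIs are not shown here. `ast.unparse` plays the role of the generator and `ast.get_source_segment` the role of a source-based lookup.

```python
import ast

SOURCE = (
    "value = compute(\n"
    "    a,  # keep me\n"
    "    b,\n"
    ")\n"
    "if value:\n"
    "    pass\n"
)

assign = ast.parse(SOURCE).body[0]

# Regenerating the expression from the AST drops comments and layout.
regenerated = ast.unparse(assign.value)

# Slicing the original text by the node's position keeps them.
verbatim = ast.get_source_segment(SOURCE, assign.value)

# Parentheses around an expression are not part of the node's range,
# so a source slice of the value silently loses them.
PAREN_SOURCE = "total = (a + b)\n"
inner = ast.get_source_segment(PAREN_SOURCE, ast.parse(PAREN_SOURCE).body[0].value)
```

Here `regenerated` no longer contains the `# keep me` comment, `verbatim` does, and `inner` is the bare `a + b` without its parentheses, which is the case the note about parenthesized expressions warns about.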

Comment on lines +38 to +42
bad7 = 'example'
if bad7 == 'something':  # [consider-using-assignment-expr]
    pass
elif bad7 == 'something else':
    pass
Member

This is an interesting example and possibly controversial. I would prefer the existing solution because the assignment in the if is very subtle and I can see how it can be confusing when trying to figure out what the value of bad7 is in the elif branch

@vincevannoort
Contributor Author

Thanks for the review, @MichaReiser 👍. Just a heads-up: I will be travelling without my laptop for the coming 5 weeks, so I will only get back to this pull request after that.

Labels
preview Related to preview mode features rule Implementing or modifying a lint rule
3 participants