New linter for complex conditional expressions #2676

Open
wants to merge 14 commits into
base: main

Conversation

IndrajeetPatil
Collaborator

@IndrajeetPatil IndrajeetPatil commented Oct 26, 2024

closes #1830

Draft PR for early feedback.


Examples

library(lintr)

lint(
  text = "if (a && b) NULL",
  linters = complex_conditional_linter()
)
#> ℹ No lints found.

lint(
  text = "if (a && b || c) NULL",
  linters = complex_conditional_linter()
)
#> ℹ No lints found.

lint(
  text = "if (a && b || c) NULL",
  linters = complex_conditional_linter(1L)
)
#> <text>:1:5: warning: [complex_conditional_linter] Complex conditional with more than 1 logical operator(s). Consider extracting into a boolean function or variable for readability and reusability.
#> if (a && b || c) NULL
#>     ^~~~~~~~~~~

Created on 2024-11-01 with reprex v2.1.1

#'
#' @param threshold Integer. The maximum number of logical operators (`&&` or `||`)
#' allowed in a conditional expression. The default is `1L`, meaning any conditional expression
#' with more than one logical operator will be flagged.
Collaborator Author

@MichaelChirico, @AshesITR WDYT?

I can't make up my mind whether the threshold should represent the number of operators or the number of operands.
I have chosen the former because I feel it's an easier, more user-friendly way to gauge the complexity of a conditional expression, but I am not wedded to the idea.
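
To make the operator-based reading of `threshold` concrete, here is a rough base R sketch of how such a count could be computed from a parsed condition. It is not taken from this PR's code: `count_logical_ops()` is a hypothetical helper written only for illustration, and it makes no attempt to handle calls with empty arguments such as `x[, 1]`.

```r
# Hypothetical helper (not part of lintr): count `&&` and `||` calls
# anywhere inside a parsed R expression.
count_logical_ops <- function(expr) {
  # Names and constants contain no operators.
  if (!is.call(expr)) {
    return(0L)
  }
  head <- expr[[1L]]
  is_op <- is.symbol(head) && as.character(head) %in% c("&&", "||")
  # Recurse into every element of the call, including parenthesised groups.
  nested <- vapply(as.list(expr), count_logical_ops, integer(1L))
  as.integer(is_op) + sum(nested)
}

count_logical_ops(quote(a && b))        # 1 operator, 2 operands
count_logical_ops(quote(a && b || c))   # 2 operators, 3 operands
count_logical_ops(quote((a && b) || c)) # parentheses do not change the count
```

For a chain of binary `&&`/`||` operators, the operand count is always the operator count plus one, so the two readings of the threshold differ only by an offset.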

@MichaelChirico
Collaborator

I don't think `a && b || c` is complex at all TBH, besides the mixing of precedences (but presumably `(a && b) || c` would also throw a lint here?), which I think is a separate issue.

I'm more comfortable calling `nontrivial_expr(a) && nontrivial_expr(b) || nontrivial_expr(c)` complex, but just plain names don't seem to add complexity IMO.

Defining the metric for "complexity" here is going to be quite hard -- are there other languages that have implemented something similar, or is there some CS literature we can turn to for some more generic definitions?
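
One way to picture the alternative raised above, where plain names do not count toward complexity, is to count only the operands that are themselves calls. The sketch below is an assumption-laden illustration, not a proposal from this thread: `logical_operands()` and `count_nontrivial_operands()` are hypothetical helpers, and treating every call (including `!a`) as "non-trivial" is an arbitrary choice.

```r
# Hypothetical helper: collect the leaf operands of a `&&` / `||` chain,
# looking through parentheses.
logical_operands <- function(expr) {
  if (is.call(expr) && is.symbol(expr[[1L]])) {
    fn <- as.character(expr[[1L]])
    if (fn %in% c("&&", "||")) {
      return(c(logical_operands(expr[[2L]]), logical_operands(expr[[3L]])))
    }
    if (fn == "(") {
      return(logical_operands(expr[[2L]]))
    }
  }
  list(expr)
}

# Hypothetical metric: an operand is "trivial" when it is a bare name or a
# constant, and "non-trivial" when it is any kind of call.
count_nontrivial_operands <- function(expr) {
  operands <- logical_operands(expr)
  sum(vapply(operands, is.call, logical(1L)))
}

count_nontrivial_operands(quote(a && b || c))          # plain names only
count_nontrivial_operands(quote(f(a) && g(b) || h(c))) # three calls
```

Under this metric `a && b || c` scores 0 and the call-based variant scores 3, which matches the intuition that only the second one reads as complex.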

@IndrajeetPatil
Collaborator Author

> Defining the metric for "complexity" here is going to be quite hard -- are there other languages that have implemented something similar, or is there some CS literature we can turn to for some more generic definitions?

You are correct that this is a difficult question, but I don't think we need a consensus definition of what counts as a "complex" conditional, because that's an inherently subjective notion. The only thing we need to figure out is a good default (the same way we did for the cyclocomp linter). And, since this is a configurable linter, even if we adopt a threshold that users feel is too restrictive, they can easily change it in their config.

I think the current default of threshold = 1L is too aggressive, but we can easily change it to threshold = 2L (or higher):

lint(
  text = "if (a && b || c) NULL",
  linters = complex_conditional_linter(2)
)
#> ℹ No lints found.

So the question for us to resolve is: what should the default threshold be?
How did we decide on 15L as the default threshold for the McCabe complexity linter?


I would vote for threshold = 2L; any conditional expression with more than 2 logical operators (and therefore more than 3 operands) can benefit from simplification, IMO.

Here is an example from our codebase:

if (inherits(e, "lint") && (is.na(e$line) || !nzchar(e$line) || e$message == "unexpected end of input")) {
  ...
}

which lints with this new linter and could be refactored along these lines:

is_expression_valid <- (is.na(e$line) || !nzchar(e$line) || e$message == "unexpected end of input")
if (inherits(e, "lint") && is_expression_valid) {
  ...
}
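
One caveat with the variable form is that the extracted sub-condition is evaluated before `inherits()` is checked, whereas the original one-liner only reaches it when `e` is a lint. Extracting into a boolean function instead keeps the short-circuit behaviour. A minimal sketch, assuming hypothetical helper names (`is_incomplete_lint()` and `should_handle()` are not in the codebase) and hand-built stand-in objects rather than real lints:

```r
# Hypothetical helper wrapping the parenthesised part of the original condition.
is_incomplete_lint <- function(e) {
  is.na(e$line) || !nzchar(e$line) || e$message == "unexpected end of input"
}

# Hypothetical wrapper: `is_incomplete_lint(e)` is only called when
# `inherits(e, "lint")` is TRUE, as in the original single condition.
should_handle <- function(e) {
  inherits(e, "lint") && is_incomplete_lint(e)
}

# Stand-in objects for illustration only; real lints carry more fields.
missing_line <- structure(list(line = NA_character_, message = "x"), class = "lint")
normal_lint <- structure(list(line = "x <- 1", message = "some message"), class = "lint")

should_handle(missing_line) # TRUE
should_handle(normal_lint)  # FALSE
should_handle("not a lint") # FALSE, and `e$line` is never evaluated
```

With the variable form, the last call would instead fail, because `$` cannot be applied to an atomic vector.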

P.S. As for linters in other programming languages, I can't think of any (neither ruff nor eslint has anything similar; the closest rules are about complex conditionals in type stubs or mixing operators).

@IndrajeetPatil IndrajeetPatil marked this pull request as ready for review November 5, 2024 22:09
@IndrajeetPatil IndrajeetPatil changed the title Draft: New linter for complex conditional expressions New linter for complex conditional expressions Nov 5, 2024
Development

Successfully merging this pull request may close these issues.

New linter to suggest moving complex conditional expressions to boolean function/variable?
2 participants