Update rule metadata (#1601)
joke1196 authored Oct 10, 2023
1 parent b29355f commit 416eeeb
Showing 2 changed files with 8 additions and 7 deletions.
@@ -1,12 +1,13 @@
-<p>This rule raises an issue when 5 or more commands are applied on a data frame.</p>
+<p>This rule raises an issue when 7 or more commands are applied on a data frame.</p>
<h2>Why is this an issue?</h2>
<p>The pandas library provides many ways to filter, select, reshape and modify a data frame. Pandas also supports method chaining: many
<code>DataFrame</code> methods return a modified <code>DataFrame</code>, so the user can chain multiple operations together, making it
effortless to perform several of them in one line of code:</p>
<pre>
import pandas as pd

-joe = pd.read_csv("data.csv", dtype={'user_id':'str', 'name':'str'}).set_index("name").filter(like='jo', axis=0).head()
+schema = {'name':str, 'domain': str, 'revenue': 'Int64'}
+joe = pd.read_csv("data.csv", dtype=schema).set_index('name').filter(like='joe', axis=0).groupby('domain').mean().round().sample()
</pre>
<p>While this code is correct and concise, it can be challenging to follow its logic and flow, making it harder to debug or modify in the future.</p>
<p>To improve code readability, debugging, and maintainability, it is recommended to break down long chains of pandas instructions into smaller, more
@@ -21,20 +22,20 @@ <h4>Noncompliant code example</h4>
import pandas as pd

def foo(df: pd.DataFrame):
-    return df.set_index("name").filter(like='joe', axis=0).groupby("team")["salary"].mean().head() # Noncompliant: too many operations happen on this data frame.
+    return df.set_index('name').filter(like='joe', axis=0).groupby('team').mean().round().sort_values('salary').take([0]) # Noncompliant: too many operations happen on this data frame.
</pre>
<h4>Compliant solution</h4>
<pre data-diff-id="1" data-diff-type="compliant">
import pandas as pd

def select_joes(df):
-    return df.set_index("name").filter(like='joe', axis=0)
+    return df.set_index('name').filter(like='joe', axis=0)

def compute_mean_salary_per_team(df):
-    return df.groupby("team")["salary"].mean()
+    return df.groupby('team').mean().round()

def foo(df: pd.DataFrame):
-    return df.pipe(select_joes).pipe(compute_mean_salary_per_team).head() # Compliant
+    return df.pipe(select_joes).pipe(compute_mean_salary_per_team).sort_values('salary').take([0]) # Compliant
</pre>
<h2>Resources</h2>
<h3>Documentation</h3>
2 changes: 1 addition & 1 deletion sonarpedia.json
@@ -3,7 +3,7 @@
"languages": [
"PY"
],
-"latest-update": "2023-10-06T11:02:04.798788Z",
+"latest-update": "2023-10-09T13:33:42.838821515Z",
"options": {
"no-language-in-filenames": true,
"preserve-filenames": true
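For readers who want to try the compliant pattern from the rule description in this commit, here is a minimal runnable sketch. It reuses the helper names `select_joes` and `compute_mean_salary_per_team` from the diff. The `lowest_paid_team` wrapper and the `employees` sample data are hypothetical, invented only for illustration. It assumes pandas is installed and that `salary` is the only column left after grouping by `team`.

```python
import pandas as pd


def select_joes(df: pd.DataFrame) -> pd.DataFrame:
    # Index the frame by name, then keep rows whose index label contains 'joe'.
    return df.set_index('name').filter(like='joe', axis=0)


def compute_mean_salary_per_team(df: pd.DataFrame) -> pd.DataFrame:
    # Average the remaining columns per team and round the result.
    return df.groupby('team').mean().round()


def lowest_paid_team(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical wrapper: four chained calls instead of one long chain.
    return (
        df.pipe(select_joes)
        .pipe(compute_mean_salary_per_team)
        .sort_values('salary')
        .take([0])
    )


# Hypothetical sample data, not taken from the commit.
employees = pd.DataFrame({
    'name': ['joe smith', 'joey lee', 'ann joe', 'bob ray'],
    'team': ['red', 'red', 'blue', 'blue'],
    'salary': [100.0, 200.0, 50.0, 999.0],
})

result = lowest_paid_team(employees)
print(result)
```

Each helper can be called and inspected on its own, which is the debugging benefit the rule description points to; `DataFrame.pipe` simply passes the frame to the given function and returns its result.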