Skip to content

Commit

Permalink
mm: memcontrol: fix GFP_NOFS recursion in memory.high enforcement
Browse files Browse the repository at this point in the history
commit 9ea9cb0 upstream.

Breno and Josef report a deadlock scenario from cgroup reclaim
re-entering the filesystem:

[  361.546690] ======================================================
[  361.559210] WARNING: possible circular locking dependency detected
[  361.571703] 6.5.0-0_fbk700_debug_rc0_kbuilder_13159_gbf787a128001 #1 Tainted: G S          E
[  361.589704] ------------------------------------------------------
[  361.602277] find/9315 is trying to acquire lock:
[  361.611625] ffff88837ba140c0 (&delayed_node->mutex){+.+.}-{4:4}, at: __btrfs_release_delayed_node+0x68/0x4f0
[  361.631437]
[  361.631437] but task is already holding lock:
[  361.643243] ffff8881765b8678 (btrfs-tree-01){++++}-{4:4}, at: btrfs_tree_read_lock+0x1e/0x40

[  362.904457]  mutex_lock_nested+0x1c/0x30
[  362.912414]  __btrfs_release_delayed_node+0x68/0x4f0
[  362.922460]  btrfs_evict_inode+0x301/0x770
[  362.982726]  evict+0x17c/0x380
[  362.988944]  prune_icache_sb+0x100/0x1d0
[  363.005559]  super_cache_scan+0x1f8/0x260
[  363.013695]  do_shrink_slab+0x2a2/0x540
[  363.021489]  shrink_slab_memcg+0x237/0x3d0
[  363.050606]  shrink_slab+0xa7/0x240
[  363.083382]  shrink_node_memcgs+0x262/0x3b0
[  363.091870]  shrink_node+0x1a4/0x720
[  363.099150]  shrink_zones+0x1f6/0x5d0
[  363.148798]  do_try_to_free_pages+0x19b/0x5e0
[  363.157633]  try_to_free_mem_cgroup_pages+0x266/0x370
[  363.190575]  reclaim_high+0x16f/0x1f0
[  363.208409]  mem_cgroup_handle_over_high+0x10b/0x270
[  363.246678]  try_charge_memcg+0xaf2/0xc70
[  363.304151]  charge_memcg+0xf0/0x350
[  363.320070]  __mem_cgroup_charge+0x28/0x40
[  363.328371]  __filemap_add_folio+0x870/0xd50
[  363.371303]  filemap_add_folio+0xdd/0x310
[  363.399696]  __filemap_get_folio+0x2fc/0x7d0
[  363.419086]  pagecache_get_page+0xe/0x30
[  363.427048]  alloc_extent_buffer+0x1cd/0x6a0
[  363.435704]  read_tree_block+0x43/0xc0
[  363.443316]  read_block_for_search+0x361/0x510
[  363.466690]  btrfs_search_slot+0xc8c/0x1520

This is caused by the mem_cgroup_handle_over_high() not respecting the
gfp_mask of the allocation context.  We used to only call this function on
resume to userspace, where no locks were held.  But c9afe31 ("memcg:
synchronously enforce memory.high for large overcharges") added a call
from the allocation context without considering the gfp.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: c9afe31 ("memcg: synchronously enforce memory.high for large overcharges")
Signed-off-by: Johannes Weiner <[email protected]>
Reported-by: Breno Leitao <[email protected]>
Reported-by: Josef Bacik <[email protected]>
Acked-by: Shakeel Butt <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: <[email protected]>	[5.17+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
  • Loading branch information
hnaz authored and gregkh committed Oct 6, 2023
1 parent eaf409c commit 7bc7cbf
Show file tree
Hide file tree
Showing 3 changed files with 6 additions and 6 deletions.
4 changes: 2 additions & 2 deletions include/linux/memcontrol.h
Original file line number Diff line number Diff line change
Expand Up @@ -919,7 +919,7 @@ unsigned long mem_cgroup_get_zone_lru_size(struct lruvec *lruvec,
return READ_ONCE(mz->lru_zone_size[zone_idx][lru]);
}

void mem_cgroup_handle_over_high(void);
void mem_cgroup_handle_over_high(gfp_t gfp_mask);

unsigned long mem_cgroup_get_max(struct mem_cgroup *memcg);

Expand Down Expand Up @@ -1460,7 +1460,7 @@ static inline void mem_cgroup_unlock_pages(void)
rcu_read_unlock();
}

static inline void mem_cgroup_handle_over_high(void)
static inline void mem_cgroup_handle_over_high(gfp_t gfp_mask)
{
}

Expand Down
2 changes: 1 addition & 1 deletion include/linux/resume_user_mode.h
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ static inline void resume_user_mode_work(struct pt_regs *regs)
}
#endif

mem_cgroup_handle_over_high();
mem_cgroup_handle_over_high(GFP_KERNEL);
blkcg_maybe_throttle_current();

rseq_handle_notify_resume(NULL, regs);
Expand Down
6 changes: 3 additions & 3 deletions mm/memcontrol.c
Original file line number Diff line number Diff line change
Expand Up @@ -2559,7 +2559,7 @@ static unsigned long calculate_high_delay(struct mem_cgroup *memcg,
* Scheduled by try_charge() to be executed from the userland return path
* and reclaims memory over the high limit.
*/
void mem_cgroup_handle_over_high(void)
void mem_cgroup_handle_over_high(gfp_t gfp_mask)
{
unsigned long penalty_jiffies;
unsigned long pflags;
Expand Down Expand Up @@ -2587,7 +2587,7 @@ void mem_cgroup_handle_over_high(void)
*/
nr_reclaimed = reclaim_high(memcg,
in_retry ? SWAP_CLUSTER_MAX : nr_pages,
GFP_KERNEL);
gfp_mask);

/*
* memory.high is breached and reclaim is unable to keep up. Throttle
Expand Down Expand Up @@ -2823,7 +2823,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
if (current->memcg_nr_pages_over_high > MEMCG_CHARGE_BATCH &&
!(current->flags & PF_MEMALLOC) &&
gfpflags_allow_blocking(gfp_mask)) {
mem_cgroup_handle_over_high();
mem_cgroup_handle_over_high(gfp_mask);
}
return 0;
}
Expand Down

0 comments on commit 7bc7cbf

Please sign in to comment.