Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

{bp-15338} arm64: fix tpidr maybe null #15367

Merged
merged 1 commit into from
Dec 28, 2024

Conversation

jerpelea
Copy link
Contributor

Summary

Before the MPU initialization, the up_update_task(this_cpu()) function is called at a time when hardware cache coherency is not yet enabled. In certain critical scenarios, Core 1 reads a zero value for tcb from the global variable g_assignedtask and stores this zero value into the tpidr register. This results in subsequent interrupt handlers reading a zero tcb, causing an exception.

Impact

RELEASE

Testing

CI

Before the MPU initialization, the up_update_task(this_cpu()) function is called at a time when hardware cache coherency is not yet enabled.
In certain critical scenarios, Core 1 reads a zero value for tcb from the global variable g_assignedtask and stores this zero value into the tpidr
register. This results in subsequent interrupt handlers reading a zero tcb, causing an exception.

Signed-off-by: hujun5 <[email protected]>
@github-actions github-actions bot added Arch: arm64 Issues related to ARM64 (64-bit) architecture Size: S The size of the change in this PR is small labels Dec 27, 2024
@nuttxpr
Copy link

nuttxpr commented Dec 27, 2024

[Experimental Bot, please feedback here]

No, this PR description does not fully meet the NuttX requirements. While it provides a decent summary of the why, it lacks crucial details. Here's a breakdown:

Missing/Insufficient Information:

  • Summary:

    • What functional part is changed? Which specific file(s) and function(s) were modified? Mentioning up_update_task is a start, but more detail is needed.
    • How does the change work? Be precise. Does the PR move the up_update_task call? Does it add synchronization primitives? Does it initialize the g_assignedtask differently?
    • Issue References: Are there any related NuttX or NuttX Apps issues? Provide links.
  • Impact:

    • "RELEASE" is not a sufficient description. While it implies this is a bug fix, explicitly state "NO" to all impact categories unless they are affected. If this fixes a critical bug, then arguably there is an impact on the user (a positive one, fixing a crash). Be explicit. Example for User Impact: "YES, this fixes a potential crash scenario on multi-core systems where Core 1 could encounter an exception due to an uninitialized TCB."
    • Consider: This field is for other unforeseen consequences. Given this is related to MPU initialization and caching, are there any subtle performance implications (positive or negative)?
  • Testing:

    • "CI" is also insufficient. While CI testing is essential, you MUST provide specific local testing details.
    • Build Host(s): List the OS, CPU architecture, and compiler used for your local build.
    • Target(s): Specify the architecture (e.g., sim, qemu-rv32ima, stm32f4discovery).
    • Testing Logs: The logs must demonstrate the issue before the change and the correct behavior after the change. Just saying "CI" doesn't prove anything in the PR itself. Show evidence of the fix. Even a simplified test case would be helpful.

Example of improved description (fill in the specifics):

Summary

This PR fixes a race condition during MPU initialization on multi-core systems that could lead to a crash on Core 1. The issue arises because up_update_task(this_cpu()) is called before hardware cache coherency is enabled, potentially leading to Core 1 reading an uninitialized value (zero) for g_assignedtask and storing it into the tpidr register. This results in subsequent interrupt handlers on Core 1 using a null tcb pointer, causing an exception. The fix modifies the arch/arm/src/armv8-m/up_mpu.c file, moving the call to up_update_task(this_cpu()) to [explain where it's moved and why that location ensures correct initialization].

Fixes #[NuttX Issue Number (if applicable)]

Impact

  • Impact on user: YES. This fixes a potential crash on multi-core systems. Users running on affected hardware will benefit from increased stability.
  • Impact on build: NO.
  • Impact on hardware: YES. This specifically affects multi-core ARMv8-M architectures. [List specific boards or SoCs if known]
  • Impact on documentation: NO.
  • Impact on security: YES. This fixes a potential crash which could be exploited in certain contexts.
  • Impact on compatibility: NO.
  • Anything else to consider: [Mention any potential performance impact, even if minor]

Testing

I confirm that changes are verified on local setup and works as intended:

  • Build Host(s): Linux (Ubuntu 20.04), x86_64, GCC 9.4.0
  • Target(s): qemu-armv8m (Cortex-M7, dual-core config)

Testing logs before change:

[Core 1] up_update_task: g_assignedtask is 0x0
[Core 1] Exception in interrupt handler: Invalid tcb pointer

Testing logs after change:

[Core 1] up_update_task: g_assignedtask is 0x[Valid Address]
[Core 1] Interrupt handled successfully

By providing this level of detail, reviewers can quickly understand the change, its impact, and verify the testing results. This will significantly improve the chances of your PR being accepted.

@xiaoxiang781216 xiaoxiang781216 merged commit 6e80eaf into apache:releases/12.8 Dec 28, 2024
12 checks passed
@jerpelea jerpelea deleted the bp-15338 branch December 28, 2024 08:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arch: arm64 Issues related to ARM64 (64-bit) architecture Size: S The size of the change in this PR is small
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants