CPU metrics are incorrect on Windows machines with more than 64 cores #40926

Open
faec opened this issue Sep 20, 2024 · 4 comments

@faec
Contributor

faec commented Sep 20, 2024

On Windows, Metricbeat measures CPU use via the Windows API call GetSystemTimes. Each metrics interval, it fetches the CPU numbers, and compares them to the previous measurement to determine CPU load during that interval. On most systems this includes CPU time "including all threads in all processes, on all processors". However, on systems with more than 64 cores, it returns only the data for the current processor group of up to 64 cores.
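
The interval comparison described above can be sketched roughly as follows. This is a minimal illustration, not Metricbeat's actual code: `CpuTimes` and `cpu_usage_percent` are hypothetical names and the sample values are made up. It relies on the documented behaviour that the kernel time returned by GetSystemTimes includes idle time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuTimes:
    """Cumulative times as reported by GetSystemTimes (any consistent unit)."""
    idle: int
    kernel: int  # per the Windows docs, kernel time includes idle time
    user: int


def cpu_usage_percent(prev: CpuTimes, curr: CpuTimes) -> float:
    """Busy percentage over the interval between two cumulative samples."""
    idle = curr.idle - prev.idle
    total = (curr.kernel - prev.kernel) + (curr.user - prev.user)
    if total <= 0:
        # No elapsed CPU time (or counters went backwards): nothing to report.
        return 0.0
    return 100.0 * (total - idle) / total


# Made-up samples taken one metrics interval apart.
prev = CpuTimes(idle=1_000, kernel=1_500, user=500)
curr = CpuTimes(idle=1_600, kernel=2_300, user=700)
print(cpu_usage_percent(prev, curr))
```

The calculation is only meaningful if both samples cover the same set of processors, which is the assumption that breaks down on machines with more than 64 cores.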

This has two consequences on high-core machines:

  • Reported CPU metrics include only CPU cores in the same processor group as the Metricbeat process.
  • When the Windows scheduler moves Metricbeat from one processor group to another (which is unpredictable but happens regularly), GetSystemTimes returns data from a different set of cores. If the new processor group has a lower CPU total than the previous one, Metricbeat will report negative numbers for some CPU metrics.
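
The second consequence can be reproduced with a small simulation. This is a sketch under made-up assumptions: two processor groups with hypothetical cumulative busy-time counters that grow at different rates, and a collector that subtracts consecutive readings without knowing which group each reading came from. `GROUP_RATES`, `group_counter` and `naive_deltas` are illustrative names only.

```python
# Hypothetical busy ticks added per interval for each processor group.
# Group 0 is busier, so its cumulative counter is larger and grows faster.
GROUP_RATES = {0: 50, 1: 10}


def group_counter(group: int, tick: int) -> int:
    """Cumulative busy time of one group after `tick` intervals."""
    return GROUP_RATES[group] * tick


def naive_deltas(groups_by_tick: list[int]) -> list[int]:
    """Subtract consecutive readings, ignoring which group produced each one."""
    readings = [group_counter(g, t) for t, g in enumerate(groups_by_tick, start=1)]
    return [b - a for a, b in zip(readings, readings[1:])]


# The process starts in group 0, is moved to group 1 for two intervals,
# then is moved back to group 0.
schedule = [0, 0, 1, 1, 0, 0]
print(naive_deltas(schedule))
```

In this toy run the delta is negative at the move into the quieter group and inflated at the move back, while the intervals in between look normal.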

The most visible symptom is occasional negative numbers in CPU-related metrics, often appearing as pairs of adjacent data points.

This seems to apply to ~all Metricbeat versions, and all versions of Windows that support more than 64 CPU cores.

@faec added the bug and Team:Elastic-Agent-Data-Plane labels on Sep 20, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@nimarezainia
Contributor

fyi @flexitrev

@cmacknz
Member

cmacknz commented Oct 21, 2024

It is not immediately clear how to solve this, and the solution does not look simple, especially since we cannot use the Task Manager implementation as a reference.

I suspect we will need something like a thread or process per processor group with their processor group affinities set appropriately so that they get scheduled into different processor groups.

@gabriellandau

> However, on systems with more than 64 cores, it returns only the data for the current processor group of up to 64 cores.

Is this documented anywhere? I don't see it in the GetSystemTimes docs.

My reading of this MS doc suggests our threads can switch between processor groups. Here's Python pseudocode that does this, sampling each processor group in turn. I don't have a >64T system to test with.

# Module-level cache: processor group number -> last (idle, kernel, user) sample
last_system_times = {}


def get_cpu_time_deltas():
    idle_time_delta, kernel_time_delta, user_time_delta = 0, 0, 0
    
    # Backup original affinity
    GetThreadGroupAffinity(GetCurrentThread(), &saved_group_affinity)

    # Enumerate each NUMA node (node numbers run from 0 through the highest node number, inclusive)
    for node in range(GetNumaHighestNodeNumber() + 1):
        # Enumerate all the processor groups for this NUMA node
        for group_affinity in GetNumaNodeProcessorMask2(node):
        
            # Switch to this processor group
            SetThreadGroupAffinity(GetCurrentThread(), &group_affinity)
            
            # Retrieve metrics for this processor group
            GetSystemTimes(&idle, &kernel, &user)
            
            # Retrieve old values
            if group_affinity.group in last_system_times:
                last_idle, last_kernel, last_user = last_system_times[group_affinity.group]
                
                # Compute deltas
                idle_time_delta += (idle - last_idle)
                kernel_time_delta += (kernel - last_kernel)
                user_time_delta += (user - last_user)
            
            # Store latest values
            last_system_times[group_affinity.group] = (idle, kernel, user)
            
    # Restore original affinity
    SetThreadGroupAffinity(GetCurrentThread(), &saved_group_affinity)
    
    return idle_time_delta, kernel_time_delta, user_time_delta
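
The bookkeeping part of the pseudocode above can be exercised without Windows by injecting the sampling step. The following is a sketch, not a tested fix: `PerGroupCpuDeltas` and `sample_group` are hypothetical names, and on a real system `sample_group` would have to perform the SetThreadGroupAffinity plus GetSystemTimes step, which remains unverified on a machine with more than 64 logical processors. Unlike the pseudocode, it visits each group only once per collection, on the assumption that several NUMA nodes can share one processor group.

```python
from typing import Callable, Dict, Iterable, Tuple

Times = Tuple[int, int, int]  # cumulative (idle, kernel, user)


class PerGroupCpuDeltas:
    """Keeps the previous sample per processor group and sums the deltas."""

    def __init__(self, sample_group: Callable[[int], Times]) -> None:
        self._sample_group = sample_group  # injected, platform-specific step
        self._last: Dict[int, Times] = {}

    def collect(self, groups: Iterable[int]) -> Times:
        idle_d = kernel_d = user_d = 0
        for group in sorted(set(groups)):  # sample each group exactly once
            idle, kernel, user = self._sample_group(group)
            if group in self._last:
                last_idle, last_kernel, last_user = self._last[group]
                idle_d += idle - last_idle
                kernel_d += kernel - last_kernel
                user_d += user - last_user
            # A group seen for the first time contributes no delta yet.
            self._last[group] = (idle, kernel, user)
        return idle_d, kernel_d, user_d


# Fake counters standing in for two processor groups (made-up values).
fake = {0: (100, 150, 50), 1: (10, 20, 5)}
tracker = PerGroupCpuDeltas(lambda group: fake[group])

first = tracker.collect([0, 1])      # baseline: no previous samples
fake[0] = (160, 230, 70)
fake[1] = (15, 30, 10)
second = tracker.collect([0, 1, 1])  # duplicate group id is ignored
print(first, second)
```

Because each delta is computed against the previous sample from the same group, a move between groups can no longer produce a negative total in this model.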
