NT heap scales horrendously in some cases #106

Open
Donpedro13 opened this issue Jan 31, 2022 · 2 comments

@Donpedro13

Windows Build Number

Win32NT 10.0.22000.0 Microsoft Windows NT 10.0.22000.0

Processor Architecture

AMD64

Memory

64 GB

Storage Type, free / capacity

SSD 80/512 GB

Relevant apps installed

Traces collected via Feedback Hub

We collected profile traces with both Visual Studio and VTune. We can provide these via a private channel if needed.

Issue description

Recently, my current employer started measuring the multithreaded performance of a commercial application. We were interested in both raw performance numbers and scalability in terms of CPU core count. We were surprised to see that some operations scale terribly: the durations actually increase with the core count (contrary to the usual case). Profiling revealed that in most of the problematic cases the biggest bottleneck was the NT heap, due to its scalability problems. We measured with other heaps as well (Intel's TBB, and the Segment Heap to name a few), and none of them suffered from the same phenomenon.

Here's a chart plotting some of our measurements; lower is better (Y-axis: time to perform a certain operation, in seconds; X-axis: number of CPUs):

image

Here's a second chart that compares the NT heap and the Segment Heap; a value below 100% means that the Segment Heap performed better (Y-axis: Segment Heap time relative to NT heap time, as a percentage; X-axis: number of CPUs):

image

I'm aware that this is a bit too vague; I can provide the whole dataset through a private channel if required.

We opened a PSfD support case (can provide the case number, if needed), as we believed that we might be hitting some pathological path in the NT heap implementation that should be fixed on Microsoft's side. We were basically told that:

  • the "classic" NT heap is a legacy heap, and won't be further improved
  • we should migrate to the Segment Heap, as "that's the future"

That's all well and good; we wouldn't mind switching to the Segment Heap, per se. However, there are many cases where the Segment Heap has worse performance than the "classic" one. I've included every data point of every measurement we did on the chart below. It shows relative performance; a value above 100% means that the Segment Heap performed worse:

image

Trading in some performance (about 10% on average) in many cases for scalability in others does not seem like a very good deal.

Is this expected? We would prefer to stay on a heap that's part of the operating system (either the "classic" or Segment heap), but these are the kind of trade-offs that make it not worth it.

Steps to reproduce

No easy repro (the phenomenon in question was reproduced in a commercial application that requires a license and some setup/installation steps).
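In the absence of a shareable repro, a generic allocation stress test may be a reasonable starting point for anyone who wants to probe heap scalability on their own machine. The sketch below is not taken from our application and is not known to reproduce the phenomenon. It runs the same fixed allocate/free workload on N threads and reports the wall-clock time. The names `alloc_free_loop`, `run_alloc_benchmark` and `BenchResult` are hypothetical, and the size mix (16 to 1024 bytes) is an assumption. It relies on `malloc`/`free`, which with the MSVC runtime are typically served by the process default heap.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical worker: allocates a batch of small blocks, then frees them,
// repeated `iterations` times. Returns the number of successful allocations.
static std::size_t alloc_free_loop(std::size_t iterations, std::size_t batch)
{
    std::vector<void*> blocks(batch, nullptr);
    std::size_t done = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        for (std::size_t j = 0; j < batch; ++j) {
            // Assumed size mix: multiples of 16 bytes between 16 and 1024.
            blocks[j] = std::malloc(16 + ((i + j) % 64) * 16);
            if (blocks[j] != nullptr) {
                // Touch the block so the allocation is actually used.
                *static_cast<volatile char*>(blocks[j]) = 1;
                ++done;
            }
        }
        for (std::size_t j = 0; j < batch; ++j) {
            std::free(blocks[j]);  // free(nullptr) is a no-op
        }
    }
    return done;
}

struct BenchResult {
    double seconds;           // wall-clock time for all threads to finish
    std::size_t allocations;  // total successful allocations across threads
};

// Runs the same per-thread workload on `thread_count` threads. Because the
// work per thread is fixed, a heap that scales well should keep `seconds`
// roughly flat as `thread_count` grows (up to the number of cores).
BenchResult run_alloc_benchmark(unsigned thread_count,
                                std::size_t iterations,
                                std::size_t batch)
{
    std::vector<std::size_t> counts(thread_count, 0);
    std::vector<std::thread> threads;
    threads.reserve(thread_count);

    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < thread_count; ++t) {
        threads.emplace_back([&counts, t, iterations, batch] {
            counts[t] = alloc_free_loop(iterations, batch);
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    std::size_t total = 0;
    for (std::size_t c : counts) {
        total += c;
    }
    return {elapsed.count(), total};
}
```

Calling `run_alloc_benchmark(n, iterations, batch)` for n = 1, 2, 4, ... up to the core count, and plotting `seconds` against n once per heap configuration, would give a chart of the same kind as the ones above. A real workload has very different allocation patterns, so results from a loop like this should be treated as indicative at best.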

Expected Behavior

The NT heap scales at an acceptable level, or the Segment Heap performs at least as well as the "classic" NT heap in every case.

Actual Behavior

The NT heap scales horrendously in some cases. The Segment Heap scales well but has worse performance in many cases.

@ghost ghost added the Needs-Triage 🔍 label Jan 31, 2022
@AvriMSFT
Contributor

Hey! Thanks for reporting and giving such detailed descriptions of the issue 🙂. I'm working on routing this issue to the right team and will report back soon.

@Eli-Black-Work

Eli-Black-Work commented Feb 7, 2023

@AvriMSFT Were you able to route this issue to the right team?
