Replies: 3 comments 11 replies
-
Not speeding up over 24 cpu suggests hitting the limit due to serial time per Amdahl. I expect most of the memory is in the netlist and segs/vias not the local per thread routing structures. In any case the RSS can't grow beyond the available memory so any excess would be in the virtual size which isn't measured here. |
Beta Was this translation helpful? Give feedback.
-
OR could print vm size as well as rss size if that is useful. The data is available in /proc's statm (or stat or pid). |
Beta Was this translation helpful? Give feedback.
-
Just one more datapoint. After detailed route completed with NUM_CORES=48, this is reported by "top":
From above, with NUM_CORES=6, essentially no difference.
|
Beta Was this translation helpful? Give feedback.
-
Some rough numbers and takeaways:
First: virtual memory size is the total amount of memory allocated by the process. Resident memory size, measured by make elapsed, is the amount of physical memory in use. Measuring virtual memory size can not be done by the "time" command.
The current default policy of using all CPU cores, 48 in my case, vs. 6 below, doesn't have a material difference in memory usage (virtual memory size). I don't have accurate numbers from that, just what I recall from having observed detailed routing in the past on megaboom. In both cases, max resident set is near physical memory(90%+).
The speed doesn't improve, nor deteriorate above 24 cores(50%).
Since I don't care if I use virtual memory, I have to set it up anyway, then, based on the measurement below, it would seem that there is no improved policy that exists for megaboom, for the observable parameters I care about.
It is surprising to me that memory usage in detailed routing isn't a function of number of cores. I don't know why, but I always thought so, to the point where I fooled myself to believing I heard it somewhere. Am I missing something?
Numbers from "make elapsed", so seconds, then megabytes resident memory set.
NUM_CORES=48
5_3_route 17246 60673
NUM_CORES=24
5_3_route 17516 61012
NUM_CORES=12
5_3_route 28831 60868
NUM_CORES=6
Still it uses more than 64gByte of virtual memory, runs for at least 40000 seconds(didn't finish yet, but I finished my investigations here).
Beta Was this translation helpful? Give feedback.
All reactions