Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xen is booting very slowly after Dynamic Launch #16

Open
miczyg1 opened this issue May 29, 2024 · 10 comments
Open

Xen is booting very slowly after Dynamic Launch #16

miczyg1 opened this issue May 29, 2024 · 10 comments
Assignees

Comments

@miczyg1
Copy link

miczyg1 commented May 29, 2024

Right after the memory map is printed (quite early), Xen doesn't print anything on serial console for 1 minute or longer. I suspect it might be related to the MTRRs not yet being set up on BSP after Dynamic Launch.

@miczyg1
Copy link
Author

miczyg1 commented May 29, 2024

(XEN) Xen version 4.17.4 (user@[unknown]) (gcc (GCC) 12.3.1 20230508 (Red Hat 12.3.1-1)) debug=n Thu May 23 22:18:23 UTC 2024
(XEN) Latest ChangeSet: 
(XEN) Bootloader: GRUB 2.06
(XEN) Command line: placeholder console=none dom0_mem=min:1024M dom0_mem=max:4096M ucode=scan smt=off gnttab_max_frames=2048 gnttab_max_maptrack_frames=4096
(XEN) Xen image load base address: 0x98a00000
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 3 EDD information structures
(XEN) SLAUNCH: reserving event log (0x99411000 - 0x99419000)
(XEN) SLAUNCH: reserving TXT heap (0x99f10000 - 0x9a000000)
(XEN) SLAUNCH: reserving SINIT memory (0x99ec0000 - 0x99f10000)
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 000000000009fbff] (usable)
(XEN)  [000000000009fc00, 000000000009ffff] (reserved)
(XEN)  [00000000000f0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 0000000099410fff] (usable)
(XEN)  [0000000099411000, 0000000099418fff] (reserved)
(XEN)  [0000000099419000, 0000000099438fff] (usable)
(XEN)  [0000000099439000, 000000009f7fffff] (reserved)
(XEN)  [00000000e0000000, 00000000efffffff] (reserved)
(XEN)  [00000000fc000000, 00000000fc000fff] (reserved)
(XEN)  [00000000fe000000, 00000000fe00ffff] (reserved)
(XEN)  [00000000fed10000, 00000000fed17fff] (reserved)
(XEN)  [00000000fed20000, 00000000fed91fff] (reserved)
(XEN)  [00000000feda0000, 00000000feda1fff] (reserved)
(XEN)  [00000000ff000000, 0000000100000fff] (reserved)
(XEN)  [0000000100001000, 000000045e7fffff] (usable)
(this is where it stops for a longer time) <----------------------------
(XEN) ACPI: RSDP 000F6D50, 0024 (r2 COREv4)
(XEN) ACPI: XSDT 9946D0E0, 006C (r1 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: FACP 9946F5D0, 0114 (r6 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: DSDT 9946D280, 234C (r2 COREv4 COREBOOT 20110725 INTL 20230628)
(XEN) ACPI: FACS 9946D240, 0040
(XEN) ACPI: SSDT 9946F6F0, 2974 (r2 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: MCFG 99472070, 003C (r1 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: TPM2 994720B0, 004C (r4 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: LPIT 99472100, 0094 (r0 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: APIC 994721A0, 00B2 (r3 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: SPCR 99472260, 0058 (r4 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: DMAR 994722C0, 0088 (r1 COREv4 COREBOOT        0 CORE 20230628)
(XEN) ACPI: HPET 99472350, 0038 (r1 COREv4 COREBOOT        0 CORE 20230628)

@krystian-hebel
Copy link
Collaborator

Can you run with e820-verbose=true passed to Xen and paste the log? MTRRs are restored just before e820 map is printed, so unless wrong memory types are saved by GRUB, it should already be using cache at this point.

@pietrushnic
Copy link
Member

Does it make sense to call @andyhhp ?

@andyhhp
Copy link
Contributor

andyhhp commented May 29, 2024

Does it make sense to call @andyhhp ?

Lol, who you gonna call?

debug=n

Please always use debug builds of Xen. Also, boot with console_timestamps=boot to get some Linux-style numbers here, and cpuinfo mtrr.show to get some extra diagnostics that may be relevant.

Which platform is this? Something Intel, but once again I'm missing an expected printk() in there.

In that period of time is when we're physically relocating the multiboot modules, and in particular putting dom0/initrd at the very top of memory. If that's ending up being uncached, then yes it will proceed slowly. This ought to make it obvious, if it's this.

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index dd51e68dbe5b..4299641e8a71 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1457,6 +1457,7 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p)
                  (headroom ||
                   ((end - size) >> PAGE_SHIFT) > mod[j].mod_start) )
             {
+                printk("*** Relocating mod[%d]\n", j);
                 move_memory(end - size + headroom,
                             (uint64_t)mod[j].mod_start << PAGE_SHIFT,
                             mod[j].mod_end);

but there are various other things we're doing during this period, including building pagetables, setting up the physical memory manager, etc.

@rossphilipson
Copy link
Contributor

rossphilipson commented May 29, 2024

That does sound like MTRRs not being restored. The entire world is UC except for the ACM area on SENTER.

@krystian-hebel
Copy link
Collaborator

Isn't this the platform on which coreboot complains about lack of free MTRRs for ROM? If it gives up in the middle of writing those and doesn't set something sane for >4GB, perhaps wrong settings are properly restored.

@miczyg1
Copy link
Author

miczyg1 commented Jun 6, 2024

Isn't this the platform on which coreboot complains about lack of free MTRRs for ROM? If it gives up in the middle of writing those and doesn't set something sane for >4GB, perhaps wrong settings are properly restored.

I have commented out the part where a temporary MTRR is placed, so MTRRs should be alright.

Which platform is this? Something Intel, but once again I'm missing an expected printk() in there.

Intel Comet Lake U

In that period of time is when we're physically relocating the multiboot modules, and in particular putting dom0/initrd at the very top of memory. If that's ending up being uncached, then yes it will proceed slowly. This ought to make it obvious, if it's this.

Okay, but on the non-Dynamic Launch path, it is not taking that long (I get dom0 systemd messages/LUKS key prompt on the serial console in a matter of seconds compared to over 60 seconds on Dynamic Launch).

@miczyg1
Copy link
Author

miczyg1 commented Aug 17, 2024

I'm trying to tacke this issue. Found some small problems around MTRR code and fixed them:
TrenchBoot/grub#23
#17

However, I still observe serious delays in the boot process. MTRRs are now the same as when coreboot was setting them up. I no longer have any clue what is wrong and how to debug further. I have also dumped the registers state on BSP (used WARN() right after init_e820 where MTRRs are restored), there there are almost no differences in the state. Attaching some logs from slaunch boot and non-slaunch boot for reference (log from minicom with timestamps, one can see that hashing takes significantly long, and some RAM region walks). At this point I think only the page tables/mapping may be causing the ranges to be non-cacheable.

xen_non_tb_boot.log
xen_tb_boot.log

@andyhhp any pointers appreciated.

@krystian-hebel
Copy link
Collaborator

@miczyg1 are you invalidating the cache between GRUB and Xen to get apples to apples comparison? After restoring MTRRs the cache is invalid, so it must get read at least once. Ideally this should be done after relocator, but as long as all the big parts are loaded to their final addresses doing it in late C code should be enough.

Time difference between SHA1 and SHA256 for dom0 initramfs is significantly bigger than the rest, which may point to cache trashing. This is expected when size of measured data is bigger than cache size. You may try commenting out the hashing too see if it impacts execution times of next steps.

@miczyg1
Copy link
Author

miczyg1 commented Aug 18, 2024

@miczyg1 are you invalidating the cache between GRUB and Xen to get apples to apples comparison? After restoring MTRRs the cache is invalid, so it must get read at least once. Ideally this should be done after relocator, but as long as all the big parts are loaded to their final addresses doing it in late C code should be enough.

Not sure if there is any invalidation in between.

Time difference between SHA1 and SHA256 for dom0 initramfs is significantly bigger than the rest, which may point to cache trashing. This is expected when size of measured data is bigger than cache size. You may try commenting out the hashing too see if it impacts execution times of next steps.

I may try it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants