OPTEE freezes while starting up on an imx7d with optee 3.19.0 #7139

Closed
marcer1 opened this issue Nov 20, 2024 · 10 comments
@marcer1

marcer1 commented Nov 20, 2024

Hello,

I've got a small problem. I am trying to get OPTEE 3.19.0 running on a Colibri IMX7D with Yocto Linux 4.0 from Toradex. This combination is currently not supported by Toradex, but the imx7d from NXP is supported by OPTEE. I am evaluating it on a Colibri Eval Board V3.
So in theory it should work, and I even found an old issue here on GitHub from someone working at Toradex who managed to get it running, although he had a different issue than me. I got in touch with him, but it has been some time and he sadly cannot provide any documentation anymore.

My approach was to port OPTEE following the official Porting Guide from NXP; Sec. 5.4 states how it should be done. I made all the required changes, but every time I tried to boot into the kernel I got the error: TEEC_InitializeContext failed with code 0xffff0008.
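
For readers who want to decode the return value: the following is a minimal sketch that maps GlobalPlatform TEE Client API return codes to their names. The table is copied from the constants in the TEE Client API specification; the helper name teec_name is made up for this example. 0xffff0008 decodes to TEEC_ERROR_ITEM_NOT_FOUND, which would be consistent with the client library not finding a usable TEE device, although the thread does not confirm that this is the cause here.

```python
# Return codes defined by the GlobalPlatform TEE Client API.
TEEC_RETURN_CODES = {
    0x00000000: "TEEC_SUCCESS",
    0xFFFF0000: "TEEC_ERROR_GENERIC",
    0xFFFF0001: "TEEC_ERROR_ACCESS_DENIED",
    0xFFFF0002: "TEEC_ERROR_CANCEL",
    0xFFFF0003: "TEEC_ERROR_ACCESS_CONFLICT",
    0xFFFF0004: "TEEC_ERROR_EXCESS_DATA",
    0xFFFF0005: "TEEC_ERROR_BAD_FORMAT",
    0xFFFF0006: "TEEC_ERROR_BAD_PARAMETERS",
    0xFFFF0007: "TEEC_ERROR_BAD_STATE",
    0xFFFF0008: "TEEC_ERROR_ITEM_NOT_FOUND",
    0xFFFF0009: "TEEC_ERROR_NOT_IMPLEMENTED",
    0xFFFF000A: "TEEC_ERROR_NOT_SUPPORTED",
    0xFFFF000B: "TEEC_ERROR_NO_DATA",
    0xFFFF000C: "TEEC_ERROR_OUT_OF_MEMORY",
    0xFFFF000D: "TEEC_ERROR_BUSY",
    0xFFFF000E: "TEEC_ERROR_COMMUNICATION",
    0xFFFF000F: "TEEC_ERROR_SECURITY",
    0xFFFF0010: "TEEC_ERROR_SHORT_BUFFER",
}


def teec_name(code: int) -> str:
    """Return the symbolic name of a TEEC result code (hypothetical helper)."""
    code &= 0xFFFFFFFF
    return TEEC_RETURN_CODES.get(code, f"unknown (0x{code:08x})")


if __name__ == "__main__":
    print(teec_name(0xFFFF0008))
```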

I found out that the tee supplicant was not running, but I did not manage to start it. The TAs are there, but the OPTEE-OS did not seem to be running. I tried booting into OPTEE (uTee) manually in U-Boot. But that did not work either. I got the following message:

D/TC:0   get_aslr_seed:1498 Bad fdt: -9
D/TC:0   plat_get_aslr_seed:111 Warning: no ASLR seed
D/TC:0   add_phys_mem:635 ROUNDDOWN(IRAM_S_BASE, CORE_MMU_PGDIR_SIZE) type TEE_COHERENT 0x00100000 size 0x00100000
D/TC:0   add_phys_mem:635 ROUNDDOWN(IRAM_BASE, CORE_MMU_PGDIR_SIZE) type TEE_COHERENT 0x00900000 size 0x00100000
D/TC:0   add_phys_mem:635 ROUNDDOWN(0x30800000, CORE_MMU_PGDIR_SIZE) type IO_SEC 0x30800000 size 0x00400000
D/TC:0   add_phys_mem:635 ROUNDDOWN(0x30400000, CORE_MMU_PGDIR_SIZE) type IO_SEC 0x30400000 size 0x00400000
D/TC:0   add_phys_mem:635 ROUNDDOWN(0x30000000, CORE_MMU_PGDIR_SIZE) type IO_SEC 0x30000000 size 0x00400000
D/TC:0   add_phys_mem:635 ROUNDDOWN(0x30360000, CORE_MMU_PGDIR_SIZE) type IO_SEC 0x30300000 size 0x00200000
D/TC:0   add_phys_mem:649 Physical mem map overlaps 0x30300000
D/TC:0   add_phys_mem:635 ROUNDDOWN(0x31000000, CORE_MMU_PGDIR_SIZE) type IO_SEC 0x31000000 size 0x00100000
D/TC:0   add_phys_mem:635 ROUNDDOWN((0x30860000), CORE_MMU_PGDIR_SIZE) type IO_NSEC 0x30800000 size 0x00200000
D/TC:0   add_phys_mem:635 TZASC_BASE type IO_SEC 0x30780000 size 0x00010000
D/TC:0   add_phys_mem:649 Physical mem map overlaps 0x30780000
D/TC:0   add_phys_mem:635 ROUNDDOWN(0x30350000, CORE_MMU_PGDIR_SIZE) type IO_SEC 0x30300000 size 0x00200000
D/TC:0   add_phys_mem:649 Physical mem map overlaps 0x30300000
D/TC:0   add_phys_mem:635 SECMEM_BASE type IO_SEC 0x00100000 size 0x00008000
D/TC:0   add_phys_mem:635 TEE_SHMEM_START type NSEC_SHM 0xbfe00000 size 0x00200000
D/TC:0   add_phys_mem:635 TA_RAM_START type TA_RAM 0xbe100000 size 0x01d00000
D/TC:0   add_phys_mem:635 VCORE_UNPG_RW_PA type TEE_RAM_RW 0xbe065000 size 0x0009b000
D/TC:0   add_phys_mem:635 VCORE_UNPG_RX_PA type TEE_RAM_RX 0xbe000000 size 0x00065000
D/TC:0   add_va_space:675 type RES_VASPACE size 0x00a00000
D/TC:0   add_va_space:675 type SHM_VASPACE size 0x02000000
D/TC:0   dump_mmap_table:800 type NSEC_SHM     va 0xb7f00000..0xb80fffff pa 0xbfe00000..0xbfffffff size 0x00200000 (pgdir)
D/TC:0   dump_mmap_table:800 type TA_RAM       va 0xb8100000..0xb9dfffff pa 0xbe100000..0xbfdfffff size 0x01d00000 (pgdir)
D/TC:0   dump_mmap_table:800 type IO_SEC       va 0xb9f00000..0xb9ffffff pa 0x31000000..0x310fffff size 0x00100000 (pgdir)
D/TC:0   dump_mmap_table:800 type IO_SEC       va 0xba000000..0xba3fffff pa 0x30800000..0x30bfffff size 0x00400000 (pgdir)
D/TC:0   dump_mmap_table:800 type IO_NSEC      va 0xba500000..0xba6fffff pa 0x30800000..0x309fffff size 0x00200000 (pgdir)
D/TC:0   dump_mmap_table:800 type IO_SEC       va 0xba700000..0xbaafffff pa 0x30400000..0x307fffff size 0x00400000 (pgdir)
D/TC:0   dump_mmap_table:800 type IO_SEC       va 0xbac00000..0xbaffffff pa 0x30000000..0x303fffff size 0x00400000 (pgdir)
D/TC:0   dump_mmap_table:800 type TEE_COHERENT va 0xbb100000..0xbb1fffff pa 0x00900000..0x009fffff size 0x00100000 (pgdir)
D/TC:0   dump_mmap_table:800 type TEE_COHERENT va 0xbb200000..0xbb2fffff pa 0x00100000..0x001fffff size 0x00100000 (pgdir)
D/TC:0   dump_mmap_table:800 type RES_VASPACE  va 0xbb300000..0xbbcfffff pa 0x00000000..0x009fffff size 0x00a00000 (pgdir)
D/TC:0   dump_mmap_table:800 type SHM_VASPACE  va 0xbbe00000..0xbddfffff pa 0x00000000..0x01ffffff size 0x02000000 (pgdir)
D/TC:0   dump_mmap_table:800 type IO_SEC       va 0xbdff8000..0xbdffffff pa 0x00100000..0x00107fff size 0x00008000 (smallpg)
D/TC:0   dump_mmap_table:800 type TEE_RAM_RX   va 0xbe000000..0xbe064fff pa 0xbe000000..0xbe064fff size 0x00065000 (smallpg)
D/TC:0   dump_mmap_table:800 type TEE_RAM_RW   va 0xbe065000..0xbe0fffff pa 0xbe065000..0xbe0fffff size 0x0009b000 (smallpg)
D/TC:0   core_mmu_alloc_l2:316 L2 table used: 1/6
D/TC:0   core_mmu_alloc_l2:316 L2 table used: 2/6

OPTEE just freezes after the last line. I jumped to the conclusion that ASLR might be the problem (although the logger does not mark it as an error), so I disabled it, but the problem persisted. It still freezes after the last line.
I added some DMSG output to core_mmu_alloc_l2 before line 316, but the allocation does not fail. It succeeds, even for the second allocation. So allocation is not the problem either.
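
A log like this can also be checked mechanically. The sketch below parses dump_mmap_table lines and reports regions whose physical ranges overlap. The function names and the short SAMPLE excerpt are chosen for this example, and the assumption that RES_VASPACE and SHM_VASPACE carry no real physical range is mine. Whether any reported overlap is related to the freeze is not established in this thread.

```python
import re

# Matches the "type ... va A..B pa C..D" part of a dump_mmap_table line.
LINE_RE = re.compile(
    r"type\s+(\w+)\s+va\s+0x([0-9a-f]+)\.\.0x([0-9a-f]+)\s+"
    r"pa\s+0x([0-9a-f]+)\.\.0x([0-9a-f]+)"
)

# Assumption: these types only reserve virtual address space, so their
# "pa" column is not a real physical range and is skipped.
VA_ONLY_TYPES = {"RES_VASPACE", "SHM_VASPACE"}

SAMPLE = """
D/TC:0   dump_mmap_table:800 type NSEC_SHM     va 0xb7f00000..0xb80fffff pa 0xbfe00000..0xbfffffff size 0x00200000 (pgdir)
D/TC:0   dump_mmap_table:800 type IO_SEC       va 0xba000000..0xba3fffff pa 0x30800000..0x30bfffff size 0x00400000 (pgdir)
D/TC:0   dump_mmap_table:800 type IO_NSEC      va 0xba500000..0xba6fffff pa 0x30800000..0x309fffff size 0x00200000 (pgdir)
D/TC:0   dump_mmap_table:800 type RES_VASPACE  va 0xbb300000..0xbbcfffff pa 0x00000000..0x009fffff size 0x00a00000 (pgdir)
D/TC:0   dump_mmap_table:800 type TEE_RAM_RX   va 0xbe000000..0xbe064fff pa 0xbe000000..0xbe064fff size 0x00065000 (smallpg)
"""


def parse_mmap(log):
    """Return a list of (type, va_lo, va_hi, pa_lo, pa_hi) tuples."""
    regions = []
    for m in LINE_RE.finditer(log):
        va_lo, va_hi, pa_lo, pa_hi = (int(g, 16) for g in m.groups()[1:])
        regions.append((m.group(1), va_lo, va_hi, pa_lo, pa_hi))
    return regions


def pa_overlaps(regions):
    """Return pairs of regions whose inclusive physical ranges intersect."""
    real = [r for r in regions if r[0] not in VA_ONLY_TYPES]
    hits = []
    for i, a in enumerate(real):
        for b in real[i + 1:]:
            if a[3] <= b[4] and b[3] <= a[4]:
                hits.append((a, b))
    return hits


if __name__ == "__main__":
    for a, b in pa_overlaps(parse_mmap(SAMPLE)):
        print(f"{a[0]} and {b[0]} overlap at pa 0x{max(a[3], b[3]):08x}")
```

Pasting the complete log into SAMPLE runs the same check over every mapped region.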

I looked into the ASLR error. -9 means bad magic number: the DTB binary has to start with the magic number 0xd00dfeed. I looked at it with a hex editor and it is definitely there.
So I thought maybe the DTB is located at the wrong address. I disabled device tree relocation in U-Boot by setting the environment variable fdt_high to 0xFFFFFFFF. Same outcome. The device tree should definitely be at the address I provided to OPTEE through the compilation options.
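
The hex-editor check can be scripted. This is a minimal sketch, assuming only the standard flattened device tree header layout (big-endian magic followed by totalsize); the function names are invented for this example. The second helper converts a 32-bit word as a little-endian CPU would read it from memory, which may help when comparing against a memory dump taken at fdt_addr_r rather than against the file on disk.

```python
import struct

FDT_MAGIC = 0xD00DFEED
FDT_ERR_BADMAGIC = 9  # libfdt reports this condition as -9


def check_fdt_header(blob: bytes) -> dict:
    """Read magic and totalsize from the start of a device tree blob."""
    if len(blob) < 8:
        raise ValueError("blob too short for an FDT header")
    magic, totalsize = struct.unpack(">II", blob[:8])  # header is big-endian
    return {"magic": magic, "magic_ok": magic == FDT_MAGIC, "totalsize": totalsize}


def magic_from_le_word(word: int) -> int:
    """Convert a word read by a little-endian CPU back to big-endian order."""
    return int.from_bytes(word.to_bytes(4, "little"), "big")


if __name__ == "__main__":
    # Synthetic 8-byte header for demonstration; use open(path, "rb").read()
    # to check a real .dtb file instead.
    fake = struct.pack(">II", FDT_MAGIC, 0x8000)
    print(check_fdt_header(fake))
    print(hex(magic_from_le_word(0xEDFE0DD0)))
```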

I wrote a small u-boot-script to set all this manually:

setenv fdt_high 0xFFFFFFFF;
fatload mmc 0:1 ${ramdisk_addr_r} uTee-;
fatload mmc 0:1 ${fdt_addr_r} imx7d-colibri-emmc-eval-v3.dtb;
fatload mmc 0:1 ${kernel_addr_r} zImage;
bootm ${ramdisk_addr_r} - ${fdt_addr_r};

Addresses are:

Name            Value
kernel_addr_r   0x84200000
fdt_addr_r      0x88200000
ramdisk_addr_r  0x88400000

CFG_NS_ENTRY_ADDR is set to the value of kernel_addr_r, as I thought this should be the next stage after uTee has loaded. My understanding was that uTee would load, manipulate the FDT, start itself and then jump back into the kernel, booting it up (according to the Porting Guide by NXP).
CFG_DT_ADDR is set to the value of fdt_addr_r.
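
One thing that can be ruled out quickly is the loaded images overlapping each other in RAM. The sketch below uses the three load addresses from the table; the image sizes are hypothetical placeholders and must be replaced with the real file sizes, and the helper name is made up for this example.

```python
def overlapping(regions):
    """Return name pairs whose [addr, addr + size) ranges intersect."""
    items = sorted(regions.items(), key=lambda kv: kv[1][0])
    hits = []
    for i, (name_a, (addr_a, size_a)) in enumerate(items):
        for name_b, (addr_b, _size_b) in items[i + 1:]:
            # items are sorted by address, so addr_b >= addr_a here
            if addr_b < addr_a + size_a:
                hits.append((name_a, name_b))
    return hits


# Load addresses from the issue; the sizes are hypothetical placeholders.
LAYOUT = {
    "zImage": (0x84200000, 8 * 1024 * 1024),
    "dtb": (0x88200000, 64 * 1024),
    "uTee": (0x88400000, 1 * 1024 * 1024),
}

if __name__ == "__main__":
    print(overlapping(LAYOUT) or "no overlaps")
```

With these addresses there are 2 MiB between fdt_addr_r and ramdisk_addr_r, so a device tree blob would have to exceed that size to run into the uTee image.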

Does anybody have an idea why it could be freezing? What would've come next? Is there documentation on the bootflow of uTee with function-signatures?

Thank you,

marcer1

@jenswi-linaro
Contributor

From the log, it looks like OP-TEE freezes after the MMU has been enabled.

@marcer1
Author

marcer1 commented Nov 22, 2024

Hello,

yes, that should be about right. I need more information about the boot flow, as I don't know how OPTEE proceeds. I looked into the log files from Jeremias' issue, but the only thing I see after that point is stack canaries being loaded by init_canaries.

My understanding is that core/arch/arm/kernel/boot.c describes the complete boot flow, right? The last output (L2 table used) comes from a static function named core_mmu_alloc_l2() in core/arch/arm/mm/core_mmu_v7.c. I can't find any link from that function in core_mmu_v7.c to boot.c.

So as far as I understood, with my imx7d the boot flow in boot.c is as follows:

boot_init_primary_early -> init_primary. But I don't see any function calls associated with MMU initialization. I saw in Jeremias' log that there is an info-level message printing a raw newline character after the MMU init, and I don't get that character. It is printed in init_runtime(), which is called from init_primary in boot.c. So I am assuming that it freezes before init_runtime().

The functions called before that are: thread_init_core_local_stacks(), thread_set_exceptions(), primary_save_cntfrq() and init_vfp_se(). I looked into each of them, but I can't find any call from one of them to core_mmu_alloc_l2().

@jenswi-linaro
Contributor

The call to enable the MMU is done here:

ldr r1, =boot_mmu_config
bl core_init_mmu_map
#ifdef CFG_CORE_ASLR
/*
* Process relocation information for updating with the new offset.
* We're doing this now before MMU is enabled as some of the memory
* will become write protected.
*/
ldr r0, =boot_mmu_config
ldr r0, [r0, #CORE_MMU_CONFIG_LOAD_OFFSET]
/*
* Update cached_mem_end address with load offset since it was
* calculated before relocation.
*/
ldr r2, cached_mem_end
add r2, r2, r0
str r2, cached_mem_end
bl relocate
#endif
bl __get_core_pos
bl enable_mmu
#ifdef CFG_CORE_ASLR
/*
* Reinitialize console, since register_serial_console() has
* previously registered a PA and with ASLR the VA is different
* from the PA.
*/
bl console_init
#endif
#ifdef CFG_VIRTUALIZATION
/*
* Initialize partition tables for each partition to
* default_partition which has been relocated now to a different VA
*/
bl core_mmu_set_default_prtn_tbl
#endif
mov r0, r4 /* pageable part address */
mov r1, r5 /* ns-entry address */
bl boot_init_primary_early

bl core_init_mmu_map initializes the translation tables and prepares values for the relevant registers.
bl enable_mmu enables the MMU; this will fail if the function enable_mmu isn't mapped with an identical virtual address range (that is, with its virtual address equal to its physical address). Since you've disabled ASLR we don't need to worry about how that's handled in this issue.
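
The identity-mapping condition can be checked against the dump_mmap_table output from the first comment. This is a rough sketch with hypothetical helper names: the region values are copied from that log, and it assumes that enable_mmu, being executable core code, lives somewhere in TEE_RAM_RX. The exact address of enable_mmu is not given in this thread, so the address used in the example is only illustrative.

```python
# (type, va_start, va_end, pa_start), copied from the dump_mmap_table log.
REGIONS = [
    ("TA_RAM",     0xB8100000, 0xB9DFFFFF, 0xBE100000),
    ("TEE_RAM_RX", 0xBE000000, 0xBE064FFF, 0xBE000000),
    ("TEE_RAM_RW", 0xBE065000, 0xBE0FFFFF, 0xBE065000),
]


def va_for_pa(pa, regions=REGIONS):
    """Return (type, va) for the region that maps physical address pa."""
    for name, va_lo, va_hi, pa_lo in regions:
        size = va_hi - va_lo + 1
        if pa_lo <= pa < pa_lo + size:
            return name, va_lo + (pa - pa_lo)
    return None


def is_identity_mapped(pa, regions=REGIONS):
    """True if pa is mapped and its virtual address equals pa."""
    hit = va_for_pa(pa, regions)
    return hit is not None and hit[1] == pa


if __name__ == "__main__":
    # 0xbe001000 is an illustrative address inside TEE_RAM_RX, not the
    # real address of enable_mmu.
    print(is_identity_mapped(0xBE001000))
```

According to the log, TEE_RAM_RX and TEE_RAM_RW have equal virtual and physical ranges, so on paper the code region looks identity mapped; this does not prove the translation tables in memory match the dump.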

@marcer1
Author

marcer1 commented Nov 25, 2024

Hello,

so I looked into entry_a32.S. I added debug messages in and around core_init_mmu_map to see exactly where it freezes, and I got it down to two lines: 530 and 531. It fails either in __get_core_pos or in enable_mmu. I don't get to boot_init_primary_early.

I can't tell exactly, though, because neither is written in C, so I can't use the DMSG macro. I tried to implement a function in C and call it from the assembly, but now the log is completely quiet.

As __get_core_pos has been called multiple times before that flawlessly, I highly suspect enable_mmu.

@jenswi-linaro
Contributor

Is it perhaps a cache issue? You could try to comment out:

orr r0, r0, #SCTLR_I
orr r0, r0, #SCTLR_C

To rule out some WXN error you could also comment out:

orr r0, r0, #(SCTLR_WXN | SCTLR_UWXN)

@marcer1
Author

marcer1 commented Nov 26, 2024

Hey,

Thanks for the hint. I commented out the mentioned lines. I also had to comment out write_icialluf or else it wouldn't compile, but I did not get any further.

So I did some rudimentary debugging, printing a string after each line of the enable_mmu function. It failed to print anything after the isb that follows write_sctlr.

orr r0, r0, #SCTLR_M
#ifndef CFG_WITH_LPAE
/* Enable Access flag (simplified access permissions) and TEX remap */
orr r0, r0, #(SCTLR_AFE | SCTLR_TRE)
#endif
write_sctlr r0
isb
/* Update vbar */
read_vbar r1

I should mention that CFG_WITH_LPAE is disabled and not parsed.

@jenswi-linaro
Contributor

I also had to comment out write_icialluf or else it wouldn't compile but I did not get any further.

That doesn't make sense. It compiled before and suddenly it wouldn't accept write_iciallu?

@marcer1
Author

marcer1 commented Nov 27, 2024

That doesn't make sense. It compiled before and suddenly it wouldn't accept write_iciallu?

The Error is:

core/arch/arm/kernel/entry_a32.S:870: Error: bad instruction `write_icialluf'

I don't think that would be a problem either, as we disable the I-cache after `write_icialluf'. If we disabled it before, I'd understand it much more.


In the meantime I pinpointed the exact line at which it freezes:

I looked into the ARM reference manual and it says the M bit is the first bit (bit 0) of the SCTLR register, which enables the MMU. After that isb is called:

write_sctlr r0
isb

which means the processor waits until every preceding instruction has completed. Out of curiosity I also tried commenting out the isb, although it seemed crucial. But it freezes nonetheless.
Setting the M bit in the SCTLR, that is, enabling the MMU, seems to halt the system. Can you imagine why this could happen? I also tried a second IMX7D and it has the same problem. So I am certain that this is not a hardware issue with the processor.
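
To keep track of which SCTLR bits are in play across these experiments, a small decoder can help. This is a sketch using the ARMv7-A bit positions for the flags mentioned in this thread; the helper names are invented for the example, and the value it builds is only the set of bits named in the enable_mmu excerpt, not the full register content on the board.

```python
# ARMv7-A SCTLR bit positions for the flags discussed in this thread.
SCTLR_BITS = {
    "M": 0,      # MMU enable
    "A": 1,      # alignment check enable
    "C": 2,      # data cache enable
    "I": 12,     # instruction cache enable
    "WXN": 19,   # write permission implies execute-never
    "UWXN": 20,  # unprivileged write permission implies PL1 execute-never
    "TRE": 28,   # TEX remap enable
    "AFE": 29,   # access flag enable
}


def sctlr_value(*names):
    """Build a register value with the named bits set."""
    value = 0
    for name in names:
        value |= 1 << SCTLR_BITS[name]
    return value


def decode_sctlr(value):
    """List the known flag names that are set in value."""
    return [name for name, bit in SCTLR_BITS.items() if (value >> bit) & 1]


if __name__ == "__main__":
    # With the cache and WXN lines commented out and LPAE disabled, the
    # excerpt sets only M, AFE and TRE on top of the previous value.
    v = sctlr_value("M", "AFE", "TRE")
    print(hex(v), decode_sctlr(v))
```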

My consideration here is that this is all quite low level. Is there maybe a high-level option related to the MMU that I failed to enable? I also experimented with CFG_WITH_PAGER: I enabled it, tried again, and it still froze. I also tried enabling CFG_WITH_LPAE, but this is not supported on my system. When I try to enable it I get the error:

 core/arch/arm/sm/pm_a32.S:177:2: error: #error "Not supported -"

I don't understand why, though, because the docs state that L2 tables are only used with LPAE enabled; otherwise L1 is used. But if you remember, the last lines of the log I posted when opening the issue say:

D/TC:0 core_mmu_alloc_l2:316 L2 table used: 1/6
D/TC:0 core_mmu_alloc_l2:316 L2 table used: 2/6

But I don't have LPAE enabled, so why would L2 tables be used? Or am I misunderstanding the docs?
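
A possible reading, offered as a sketch rather than a confirmed answer from the maintainers: in the ARMv7 short-descriptor (non-LPAE) format, a first-level entry either maps a 1 MiB section directly or points to a second-level table of 256 small-page (4 KiB) entries. The regions marked (smallpg) in the log would therefore each need a second-level table for every 1 MiB section they touch. The code below counts those sections for the three smallpg regions from the log; the function name is made up for this example.

```python
SECTION_SIZE = 1 << 20  # one first-level entry / one L2 table covers 1 MiB

# Inclusive VA ranges of the regions marked (smallpg) in the log.
SMALLPG_REGIONS = [
    (0xBDFF8000, 0xBDFFFFFF),  # IO_SEC
    (0xBE000000, 0xBE064FFF),  # TEE_RAM_RX
    (0xBE065000, 0xBE0FFFFF),  # TEE_RAM_RW
]


def l2_tables_needed(ranges):
    """Count the distinct 1 MiB sections touched by the given VA ranges."""
    sections = set()
    for lo, hi in ranges:
        sections.update(range(lo // SECTION_SIZE, hi // SECTION_SIZE + 1))
    return len(sections)


if __name__ == "__main__":
    print(l2_tables_needed(SMALLPG_REGIONS))
```

The three regions fall into two 1 MiB sections (starting at 0xbdf00000 and 0xbe000000), which would match the two "L2 table used" lines in the log.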

@marcer1
Author

marcer1 commented Nov 27, 2024

OK, my bad: write_icialluf is really not a valid call, write_iciallu is. Got it from the source:

I guess that's on me. Trying again.


Update

Tried it out with write_iciallu and without orr r0, r0, #SCTLR_I , orr r0, r0, #SCTLR_C and orr r0, r0, #(SCTLR_WXN | SCTLR_UWXN)

But it still hangs at write_sctlr.


This issue has been marked as a stale issue because it has been open (more than) 30 days with no activity. Remove the stale label or add a comment, otherwise this issue will automatically be closed in 5 days. Note, that you can always re-open a closed issue at any time.
