Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial commit to enable amx #1618

Merged
merged 1 commit into from
Aug 3, 2024

Conversation

ziyizhang-1
Copy link
Contributor

@ziyizhang-1 ziyizhang-1 commented Jul 23, 2024

As mentioned in x86_amx_intrinsics

Completed AMX Intrinsics:
amx-tile

  • _tile_loadconfig
  • _tile_storeconfig
  • _tile_loadd
  • _tile_release
  • _tile_stored
  • _tile_stream_loadd
  • _tile_zero
    amx-int8
  • _tile_dpbssd
  • _tile_dpbsud
  • _tile_dpbusd
  • _tile_dpbuud
    amx-bf16
  • _tile_dpbf16ps
    amx-fp16
  • _tile_dpfp16ps
    amx-complex
  • _tile_cmmimfp16ps
  • _tile_cmmrlfp16ps

@rustbot
Copy link
Collaborator

rustbot commented Jul 23, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Amanieu (or someone else) some time within the next two weeks.

@sayantn
Copy link
Contributor

sayantn commented Jul 25, 2024

Can you please run cargo test -p stdarch-verify with the PRINT_MISSING_LISTS_MARKDOWN constant set to true? This would update the missing-x86.md file to reflect the new added intrinsics. Also, please do not edit the x86-intel.xml file manually. Download the latest from Intel's website (I think v3.6.9 currently) and replace this.

Also, there are very few amx intrinsics, so they can be implemented in a single amx.rs file instead of 5 files

crates/core_arch/src/x86_64/mod.rs Outdated Show resolved Hide resolved
crates/stdarch-test/src/lib.rs Outdated Show resolved Hide resolved
crates/core_arch/src/x86_64/amxtile.rs Outdated Show resolved Hide resolved
crates/core_arch/src/x86_64/amxtile.rs Outdated Show resolved Hide resolved
@sayantn
Copy link
Contributor

sayantn commented Jul 25, 2024

Even though windows doesn't need the syscall, you still need to have the function to compile. Maybe add an empty init_amx function with #[cfg(all(target_arch = "x86_64", windows))]

@ziyizhang-1
Copy link
Contributor Author

Even though windows doesn't need the syscall, you still need to have the function to compile. Maybe add an empty init_amx function with #[cfg(all(target_arch = "x86_64", windows))]

yep, that's a good workaround to prevent repeated condition check

@ziyizhang-1
Copy link
Contributor Author

Can you please run cargo test -p stdarch-verify with the PRINT_MISSING_LISTS_MARKDOWN constant set to true? This would update the missing-x86.md file to reflect the new added intrinsics. Also, please do not edit the x86-intel.xml file manually. Download the latest from Intel's website (I think v3.6.9 currently) and replace this.

Also, there are very few amx intrinsics, so they can be implemented in a single amx.rs file instead of 5 files

updated x86-intel.xml to v3.6.9, please help to review the change

crates/core_arch/src/x86_64/mod.rs Outdated Show resolved Hide resolved
crates/stdarch-verify/tests/x86-intel.rs Outdated Show resolved Hide resolved
crates/core_arch/src/x86/test.rs Outdated Show resolved Hide resolved
@sayantn
Copy link
Contributor

sayantn commented Jul 25, 2024

The x86_64-unknown-linux-gnu-emulated CI runs all the tests (using Intel SDE), so your implementation amx-complex and amx-fp16 has been tested.

Also, maybe squash all these small commits into one single commit, just for the sake of cleanliness

@ziyizhang-1 ziyizhang-1 force-pushed the core_arch_enable_amx branch 2 times, most recently from 4d11307 to 02d797c Compare July 26, 2024 05:21
crates/core_arch/src/x86_64/amx.rs Outdated Show resolved Hide resolved
crates/core_arch/src/x86_64/amx.rs Outdated Show resolved Hide resolved
@bors
Copy link
Contributor

bors commented Jul 26, 2024

☔ The latest upstream changes (presumably fb90dfa) made this pull request unmergeable. Please resolve the merge conflicts.

@ziyizhang-1 ziyizhang-1 force-pushed the core_arch_enable_amx branch 2 times, most recently from 226daeb to fd48ea1 Compare July 26, 2024 12:58
@sayantn
Copy link
Contributor

sayantn commented Jul 26, 2024

@Amanieu should we keep the __tilecfg type? It indeed makes for a nice abstraction (all C programs that use amx have to define a struct or use an array because C lacks this type)

@Amanieu
Copy link
Member

Amanieu commented Jul 26, 2024

I don't think we should be inventing our own API in this crate, we should strictly follow the vendor's API. What if Intel decides to add its own __tilecfg type in the future that is incompatible with ours? I think it is best to leave this to an external crate. That crate can also deal with other AMX-related issues such as the syscalls needed for initialization.

@ziyizhang-1
Copy link
Contributor Author

I understood the concern raised by @Amanieu; there is indeed some probability that Intel will change the definition of the tile configuration in the future, for example, increasing/decreasing the number of tiles, which can lead to a change of the config layout. Anyway, I just moved __tilecfg to the tests mod, and considering the current feature x86_amx_intrinsics is still unstable, any different points of view can be welcomed and discussed.

@Amanieu
Copy link
Member

Amanieu commented Aug 1, 2024

Everything else looks good, so this can be merged once the i32 issue is fixed.

AMX Intrinsics:

amx-tile:
  - _tile_loadconfig
  - _tile_storeconfig
  - _tile_loadd
  - _tile_release
  - _tile_stored
  - _tile_stream_loadd
  - _tile_zero
amx-int8:
  - _tile_dpbssd
  - _tile_dpbsud
  - _tile_dpbusd
  - _tile_dpbuud
amx-bf16:
  - _tile_dpbf16ps
amx-fp16
  - _tile_dpfp16ps
amx-complex
  - _tile_cmmimfp16ps
  - _tile_cmmrlfp16ps
@Amanieu Amanieu merged commit 5097cfb into rust-lang:master Aug 3, 2024
30 checks passed
@ziyizhang-1 ziyizhang-1 deleted the core_arch_enable_amx branch August 4, 2024 07:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants