Add SIMD support for ESP32S3 #56

Closed
modi12jin opened this issue Mar 18, 2023 · 17 comments

Comments

@modi12jin

modi12jin commented Mar 18, 2023

This passage was generated by chatGPT

ESP32-S3 is a high-performance, low-power microcontroller that supports SSE (SIMD) instruction set, which can complete multiple operations in one instruction, improving code efficiency and running speed. Here is a sample code using SIMD instructions on ESP32-S3:

#include "xtensa/hal.h"
#include "esp_simd.h"
#include <stdio.h>

void simd_example(void)
{
    SIMDDATA s1 = {1.0f, 2.0f, 3.0f, 4.0f};
    SIMDDATA s2 = {5.0f, 6.0f, 7.0f, 8.0f};
    SIMDDATA s3 = {0.0f, 0.0f, 0.0f, 0.0f};

    // Add the numbers in each element of s1 and s2, and store the result in s3
    s3 = SIMD_ADD(s1, s2);

    // output each element in s3
    printf("%f, %f, %f, %f\r\n", s3.f32[0], s3.f32[1], s3.f32[2], s3.f32[3]);

    // Multiply the numbers in each element of s1 and s2, and store the result in s3
    s3 = SIMDMUL(s1, s2);

    // output each element in s3
    printf("%f, %f, %f, %f\r\n", s3.f32[0], s3.f32[1], s3.f32[2], s3.f32[3]);
}

In this sample code, the SIMD_ADD() and SIMDMUL() functions are functions that use SIMD instructions to complete addition and multiplication operations, and the SIMDDATA type is a pointer, which is used to point to a vector array containing 4 floating-point number elements. Using these functions can greatly improve code efficiency and execution speed.

It should be noted that to use SIMD instructions on ESP32-S3, you need to include the <xtensa/hal.h> and <esp_simd.h> header files, and use the -msimd option to enable SIMD instruction set support when compiling.

@modi12jin
Author

modi12jin commented Mar 18, 2023

Maybe this passage will help you

espressif/idf-extra-components#106

@bitbank2
Owner

I'm quite familiar with SIMD coding and would be happy to optimize my code for the S3, but I can't find the include files you referenced above. Do you have a working Github link to them?

@modi12jin
Author

modi12jin commented Mar 19, 2023

@bitbank2 Thank you for your reply. Maybe the directory or file name has changed, so the address it gave is no longer valid.

https://github.com/espressif/esp-adf-libs/tree/master/esp_codec/include/codec

I can't find the header file esp_simd.h either. Maybe this issue helps:

espressif/esp-idf#7745

https://github.com/espressif/esp-dsp

@modi12jin
Author

modi12jin commented Mar 19, 2023

I saw on twitter that they have introduced SIMD instructions in the technical reference manual

https://mobile.twitter.com/eMbeddedHome/status/1570520252123062274

https://mobile.twitter.com/lovyan03/status/1622846385438720002

@bitbank2
Owner

I saw these references months ago, but no concrete examples. I thought you had new information. I'll keep searching for this info and when it actually becomes available, I'll implement it. For now, writing in ESP32 assembly language is not going to happen.

@modi12jin
Author

Many thanks! Looking forward to your work.

@modi12jin
Author

modi12jin commented Mar 20, 2023

@bitbank2 I contacted Espressif's official staff. They said that there seems to be no fully public version of the SIMD documentation.
There may be some clues hidden in esp-dsp.

@bitbank2
Owner

I would honestly like to work on this, but I have very little free time. It will need to be painless and well documented.

@modi12jin
Author

@bitbank2 I got some replies:

espressif/esp-bsp#154

@modi12jin
Author

@bitbank2 It should be possible to call the DSP like this from the Arduino.

espressif/esp-dsp#11

espressif/arduino-esp32#7710

#include <Arduino.h>
#include "dsps_biquad_gen.h"

void setup() {
  Serial.begin(115200);
  float coeffs[15] = {0}, f = 0.4, qFactor = 4;
  dsps_biquad_gen_lpf_f32(coeffs, f, qFactor);

  for (int i = 0; i < 15; i++) {
    Serial.printf("%f \n", coeffs[i]);
  }
}

void loop() {
}

@bitbank2
Owner

This DSP API library has been around for several years. It MAY be optimized for SIMD, but still doesn't really help any of my work.

@modi12jin
Author

modi12jin commented Sep 13, 2023

@bitbank2 Sorry to bother you again! I have news: this component supports SIMD.
Espressif staff told me that it only supports whole-frame decoding and cannot decode in blocks. If you use it, the buffer used for decoding apparently needs to be 16-byte aligned.

https://github.com/espressif/esp-dev-kits/tree/master/esp32-s3-lcd-ev-board%2Fexamples%2Fusb_camera_lcd%2Fcomponents%2Fesp_jpeg

Components ported from ESP_ADF
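
The 16-byte alignment requirement mentioned above can be met in several ways. The following is a minimal sketch in portable C11, not taken from the esp_jpeg component: the helper names `alloc_aligned16` and `is_aligned16` are hypothetical, and on ESP-IDF an allocator such as `heap_caps_aligned_alloc` would probably be the more natural choice.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a buffer whose start address is a
 * multiple of 16, or NULL on failure. Release it with free(). */
static void *alloc_aligned16(size_t bytes)
{
    /* C11 aligned_alloc expects the size to be a multiple of the
     * alignment, so round the request up to the next multiple of 16. */
    size_t rounded = (bytes + 15u) & ~(size_t)15u;
    return aligned_alloc(16, rounded);
}

/* Hypothetical helper: nonzero if the pointer is 16-byte aligned. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}
```

For an RGB565 output frame of 320 by 240 pixels, a caller would request `320 * 240 * 2` bytes and could check the result with `is_aligned16` before handing it to a decoder.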

@bitbank2
Owner

Unfortunately not helpful because they didn't release the source code.

@modi12jin
Author

@bitbank2 Sorry to bother you again, this may not be helpful, but I wanted to tell you the test results.

JPEG decoding with SIMD currently works on the whole frame only and cannot be partial. More will be supported in the future.

The only thing that needs attention is that the buffer must be 16-byte aligned. I tested a 320x240 image with a box, and decoding to RGB565 took 42 ms on average. The performance on Arduino is really not good; I remember that decoding 800x480 under IDF took less than 50 ms.

JPEGDEC seems to take 68 ms.

https://github.com/esp-arduino-libs/ESP32_JPEG/blob/master/examples/DecodeTest/DecodeTest.ino
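
Because the timings above are for different image sizes, they are easier to compare as throughput. The sketch below is only arithmetic on the quoted figures, assuming they refer to 320x240 and 800x480 frames and that the JPEGDEC time is for the same 320x240 image. The function name `pixels_per_ms` is hypothetical.

```c
/* Pixels decoded per millisecond for a width x height frame. */
static double pixels_per_ms(unsigned width, unsigned height, double ms)
{
    return ((double)width * (double)height) / ms;
}
```

With these assumptions, `pixels_per_ms(320, 240, 42.0)` is about 1829, `pixels_per_ms(320, 240, 68.0)` is about 1129, and `pixels_per_ms(800, 480, 50.0)` is 7680, which is a lower bound since that decode was reported as taking less than 50 ms.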

@bitbank2
Owner

I worked on this over the weekend and got some good results optimizing my JPEG decoder. I'll publish the code soon.
What I find strange is your first comment on this issue: you show instructions, include files, and things that don't actually exist in the ESP32-S3 instruction set. The SIMD instructions (according to Espressif's own documentation) only support integer operations and are somewhat limited. I'm continuing my search for more info, but so far the SIMD of the S3 is mostly disappointing.
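
To illustrate what integer-only SIMD means in contrast to the floating-point example in the first comment, here is a scalar C model of a lane-wise saturating add. It is a sketch, not ESP32-S3 code: the function names are hypothetical, no S3 instruction or intrinsic is used, and the layout of eight 16-bit lanes in one 128-bit vector is an assumption made for illustration.

```c
#include <stdint.h>

/* Assumption: one 128-bit vector viewed as eight signed 16-bit lanes. */
#define LANES 8

/* Add two 16-bit values, clamping to the int16_t range on overflow. */
static int16_t sat_add_s16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) {
        return INT16_MAX;
    }
    if (sum < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)sum;
}

/* Scalar model of a vector add: each lane is processed independently.
 * A real SIMD instruction would do all lanes in one operation. */
static void vec_add_sat_s16(const int16_t a[LANES],
                            const int16_t b[LANES],
                            int16_t out[LANES])
{
    for (int i = 0; i < LANES; i++) {
        out[i] = sat_add_s16(a[i], b[i]);
    }
}
```

Pixel work such as JPEG color conversion is usually done in fixed-point integers like these, which is why an integer-only instruction set can still speed up a decoder.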

@modi12jin
Author

@bitbank2 This is the SIMD JPEG decoder, which now supports partial (block) decoding. You can try it and see how it works.

7_20231124_jpeg_block_decoder_esp32s3.zip

@bitbank2
Owner

I'm not interested in someone else's closed source code.
