Adafruit graphics library optimization #69
Replies: 73 comments 7 replies
-
Try latest release 0.9.2. I tested using that lib myself, but forgot to |
-
Thanks, EEPROM now works fine! The Adafruit ILI9341 library gets a bit further, but gives errors here:
C:\Users\marcel\Documents\Arduino\libraries\Adafruit_BusIO\Adafruit_SPIDevice.cpp: In constructor 'Adafruit_SPIDevice::Adafruit_SPIDevice(int8_t, int8_t, int8_t, int8_t, uint32_t, BitOrder, uint8_t)':
Library versions: Adafruit BusIO 1.72
If I #undef BUSIO_USE_FAST_PINIO in Adafruit_SPIDevice_h then it compiles. I have not connected a real display yet.
Thanks!
Marcel
-
That’s expected. The direct pin register writes that the fast I/O code uses aren’t needed or supported on the Pico. We might want to see if it’s possible to add a wrapper to support them at some point. PRs always welcome. :)
-
Hi, |
-
A quick look at the Adafruit_BusIO library shows that there is an AdafruitSPI constructor which takes a HW SPI. Why not pass in the existing HW SPI device (the SPI object, which has been tested to work with SdFat to read/write FAT SD cards)?
-
Thanks, that works, and is 5 times faster, but it is still very slow. For comparison, hardware SPI on a Teensy 4.0 is 80 times faster (SPI clock set to 75 MHz), while the Teensy 4.0 CPU clock is only 5 times higher. Interestingly, your default SPI is on pins 0-3, while the Pico schematic shows the default on pins 16-19.
-
Did you specify the SPI clock as part of the SPIConfig? Are you equipped to check the SPI clock frequency? I might have an issue setting it somewhere.
Where did you find defaults? I just worked from the RP2040 datasheet and picked the first block of muxes it came out of, not for any real reason. There are also calls setXXX which can adjust the used pins (call before SPI::begin).
-
Just measured. The clock is 25 MHz, but only 1 word is transmitted per ~10 us, so it is slowing down elsewhere. The other speed was set with the Teensy-specific library's setclock. Software SPI with the Adafruit library on a fast Teensy is about 5 times faster than hardware SPI on the Pico.
In reply to Earle F. Philhower, III (Thu, 25 Mar 2021, 21:42).
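As a rough sanity check on the measurement above, the arithmetic can be sketched in a few lines. This assumes 16-bit words (the word size is not stated in the thread) and uses hypothetical helper names; it only models the numbers and does not touch any hardware.

```cpp
#include <cstdio>

// Time on the wire for one SPI word, in microseconds.
// clock_mhz: SPI clock in MHz; bits_per_word: assumed to be 16 (not stated in the thread).
constexpr double wordTimeUs(double clock_mhz, int bits_per_word) {
    return bits_per_word / clock_mhz;
}

// Fraction of the time the bus is actually shifting data,
// given the measured spacing between consecutive words.
constexpr double busUtilisation(double clock_mhz, int bits_per_word, double word_period_us) {
    return wordTimeUs(clock_mhz, bits_per_word) / word_period_us;
}

// Prints the figures for the measurement quoted above: 25 MHz clock, ~10 us per word.
void printMeasurement() {
    std::printf("word time: %.2f us, utilisation: %.1f%%\n",
                wordTimeUs(25.0, 16), 100.0 * busUtilisation(25.0, 16, 10.0));
}
```

Under that assumption a word occupies the wire for 0.64 us, so the bus is busy only about 6% of the time, which is consistent with the slowdown being somewhere other than the SPI clock.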
-
And default spi has different color on pico pinout diagram. Thanks
-
Odd. There is very little code in the SPI.cpp wrapper: https://github.com/earlephilhower/arduino-pico/blob/master/libraries/SPI/SPI.cpp Basically, the Pico SDK is doing all the work. It is possible the |
-
Thanks, just tried adding the cache; it makes little difference. I'll keep on looking. There is no floating point used either.
-
Just checking, @marcelvanherk, but did you make n static? Otherwise it'll always end up calling set-format each pass...
-
Yes. It was outside the routine.
In reply to Earle F. Philhower, III (Fri, 26 Mar 2021, 16:01).
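The point about making n static can be illustrated with a small host-side model. Everything here is hypothetical: setFormat stands in for an expensive SDK configuration call, and the counter exists only so the effect can be observed. It is a sketch of the caching idea, not the code that was actually tried.

```cpp
// Counts how often the (pretend) expensive configuration call runs.
int g_setFormatCalls = 0;

// Hypothetical stand-in for an expensive SDK call such as setting the SPI data format.
void setFormat(int bits) {
    (void)bits;
    ++g_setFormatCalls;
}

// Without a cache: the format is reprogrammed on every transfer.
void transferUncached(int bits) {
    setFormat(bits);
    // ... the transfer itself would go here ...
}

// With a cache: a static local keeps its value between calls,
// so the format is only reprogrammed when it actually changes.
void transferCached(int bits) {
    static int lastBits = -1;  // -1 means "nothing configured yet"
    if (bits != lastBits) {
        setFormat(bits);
        lastBits = bits;
    }
    // ... the transfer itself would go here ...
}
```

A file-scope variable, declared outside the routine as in the reply above, has the same effect as the static local: the cached value survives between calls.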
-
Hi again, could the speed difference be due to different compiler optimisation? My Teensy 3.5 at 120 MHz runs about 4 times faster than the 125 MHz Pico, both with hardware SPI, running exactly the same code. Marcel
-
Possibly. I've selected Edit platform.txt and swap |
-
I have put the new pixel code in the Adafruit_ILI9341 library and run some tests. Original, from Adafruit repository without RP2040 optimisations, 62.5MHz SPI clock:
With basic code optimisations for RP2040 (6 to 10x faster):
Note how slow the "Lines" and "Circles (outline)" speeds are, due to the SDK function overheads affecting the large number of single-pixel writes. Then with the new RP2040 writePixel function added to the Adafruit_ILI9341 library to override the GFX function:
The 3.8x faster "Lines" speed is solely due to the faster RP2040 pixel optimisation. I suspect this is still not as fast as the Teensy code you were running though... After more tests I will update my Github fork. |
-
Pixel optimisation added to the Adafruit_ILI9341 fork. Coordinate checks had to be added, so this slowed it a tiny bit compared to the results above. No change was needed to the RP2040 optimised fork of the Adafruit_GFX library.
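The coordinate checks mentioned above matter because a direct pixel write has nothing else to stop it from addressing memory or display RAM outside the screen. A minimal host-side sketch of the idea follows; the Canvas type and its members are hypothetical stand-ins for a display, not the fork's actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory stand-in for a display of width x height 16-bit pixels.
struct Canvas {
    int16_t width;
    int16_t height;
    std::vector<uint16_t> pixels;

    Canvas(int16_t w, int16_t h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}

    // Writes one pixel, ignoring coordinates outside the canvas.
    // Returns true if the pixel was written.
    bool writePixel(int16_t x, int16_t y, uint16_t color) {
        if (x < 0 || x >= width || y < 0 || y >= height) {
            return false;  // off-screen: skip the write entirely
        }
        pixels[static_cast<std::size_t>(y) * width + x] = color;
        return true;
    }
};
```

The check costs four comparisons per pixel, which fits the remark that it slows things down only a tiny bit.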
-
@Bodmer you should think about submitting a PR to Adafruit with your changes, if they're all bracketed with #ifdef ARDUINO_ARCH_RP2040 so the lib will still work elsewhere. Seems like massive improvements, and Adafruit does have a RP2040 board out already.
-
I totally agree. Speed is 5 times improved for my app. However, software SPI is broken. I have just located the issue: one #ifdef branch is missing an if for software mode. Around line 995 in Adafruit_SPITFT.cpp, it should read:
#elif defined(ARDUINO_ARCH_RP2040)
  // Use the fast SDK call only for hardware SPI
  if (connection == TFT_HARD_SPI) {
    spi_write16_blocking(spi0, (const uint16_t*)colors, len);
    return;
  }
And in Adafruit_ILI9341.cpp, this code should be added to writePixel:
  // Not hardware SPI: defer to the generic implementation
  if (connection != 0) {
    Adafruit_SPITFT::writePixel(x, y, color);
    return;
  }
Then SW SPI works as well!
Regards,
Marcel
In reply to Earle F. Philhower, III (Fri, 2 Apr 2021, 15:44).
-
I have also updated the setAddrWindow function to use the optimised SPI code for the RP2040, and this is also in the updated libraries here and here. This shows the performance improvement from the original to the RP2040 optimised version:
-
I have submitted pull requests, but I suspect Adafruit will not like the changes and may wish to wait for the "official" Arduino core... in case of incompatibility. However, the approach used may be of interest, and at least it shows what can be achieved with relatively simple changes.
-
Me too!
Marcel
On Sat, Apr 3, 2021 at 7:15 PM Bodmer wrote:
Yep, understandable. BTW thanks for creating the Arduino RP2040 support package and being so responsive. I am really pleased with it and am having great fun!
-
Further tests indicate a lot of time is being lost when beginTransaction(SPISettings settings) is called. This takes about 68 us, which is a very long time compared to some activities like drawing a pixel. One of my test sketches runs >15 times faster if I grab the bus by calling it once. With Adafruit_GFX this can be demonstrated by comparing a tft.drawPixel(...) loop with a single call to tft.startWrite() followed by a loop with writePixel(). For example at 62.5MHz:
uint32_t dt = millis();
for (int i = 0; i < 10000; i++) { tft.drawPixel(0,0, TFT_BLUE); tft.drawPixel(1,1, TFT_RED); }
Serial.print("drawPixel = "); Serial.println(millis() - dt);
tft.startWrite();
dt = millis();
for (int i = 0; i < 10000; i++) { tft.writePixel(0,0, TFT_BLUE); tft.writePixel(1,1, TFT_RED); }
Serial.print("writePixel = "); Serial.println(millis() - dt);
Prints:
drawPixel = 1515
writePixel = 80
So the repeated calls to beginTransaction lead to a ~19x performance degradation due to the 68 us overhead for every graphics operation. I have added a transaction lock to my library (TFT_eSPI) and one of my cellular automata (Game of Life) sketches then runs at a blistering rate, more commensurate with the processor's capability. I will look to see if this can easily be added to the Adafruit_GFX fork so that tft.startWrite() locks the transaction for all graphics functions until endWrite() is called. I will also look into why beginTransaction is so slow; maybe this is a required setup delay, but it seems daft to me.
-
Hi,
I found that as well, and updated SPI.cpp to achieve this effect. This makes my application more than twice as fast.
https://hackaday.io/project/178790-pico-data-general-nova-simulator
In reply to Bodmer (Sun, Apr 4, 2021 at 11:09 PM).
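The figures quoted above can be cross-checked with some quick arithmetic. Each loop issues 20,000 pixel calls (two per iteration), so the per-call cost and the extra cost per drawPixel follow directly. The names below are hypothetical; the sketch only reproduces that arithmetic.

```cpp
// Average cost of one call in microseconds, given a total time in milliseconds.
constexpr double perCallUs(double total_ms, int calls) {
    return total_ms * 1000.0 / calls;
}

constexpr int kCalls = 2 * 10000;  // two pixel calls per loop iteration

// Timings printed by the test sketch in the thread.
constexpr double kDrawPixelUs  = perCallUs(1515.0, kCalls);  // 75.75 us per drawPixel
constexpr double kWritePixelUs = perCallUs(80.0, kCalls);    // 4 us per writePixel

// Extra time drawPixel spends per call, and the overall slowdown factor.
constexpr double kOverheadUs = kDrawPixelUs - kWritePixelUs;  // 71.75 us
constexpr double kSlowdown   = kDrawPixelUs / kWritePixelUs;  // about 18.9x
```

The roughly 72 us of extra time per drawPixel call is of the same order as the 68 us measured for beginTransaction, which supports the conclusion that transaction setup dominates the cost.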
-
It also shows the SDK is not well optimised at all for one of the most
common use cases - simple graphics without a framebuffer.
-
I have updated the Adafruit_GFX library fork here. A new function has been added to lock access to the bus:
Results for the drawPixel loop above are 12x faster:
Text rendering when using the print stream is also ~2x faster. Locked code for the drawPixel test:
New results for the graphics test with a locked transaction just break the 1 second barrier, but that test is not very drawPixel intensive as it uses writePixel() internally. There is some improvement across functions, though.
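Since the locking function itself is not shown above, here is a minimal host-side sketch of how such a transaction lock could work. All names (Bus, Display, lockTransaction, unlockTransaction) are hypothetical, and the counter merely stands in for the expensive beginTransaction call; this is not the code from the fork.

```cpp
// Hypothetical bus that counts how many times the expensive setup runs.
struct Bus {
    int beginCalls = 0;
    void beginTransaction() { ++beginCalls; }  // stands in for the ~68 us setup
    void endTransaction() {}
};

// Hypothetical display driver with an optional transaction lock.
class Display {
public:
    explicit Display(Bus& bus) : bus_(bus) {}

    // Take the bus once and keep it until unlockTransaction() is called.
    void lockTransaction() {
        if (!locked_) {
            bus_.beginTransaction();
            locked_ = true;
        }
    }

    // Release the bus taken by lockTransaction().
    void unlockTransaction() {
        if (locked_) {
            locked_ = false;
            bus_.endTransaction();
        }
    }

    // Draws one pixel; only pays for a transaction when the bus is not locked.
    void drawPixel(int x, int y, unsigned color) {
        if (!locked_) bus_.beginTransaction();
        writePixel(x, y, color);
        if (!locked_) bus_.endTransaction();
    }

private:
    void writePixel(int, int, unsigned) { /* pixel transfer would go here */ }

    Bus& bus_;
    bool locked_ = false;
};
```

With the lock held, a loop of drawPixel calls pays the setup cost once instead of once per pixel. The trade-off is that nothing else can use the bus until the lock is released.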
-
I think the SDK has been pulled together from other sources, and in the SPI case it is rather bloated for the low-level write-only SPI functions, so the performance is poor unless streaming. I have now written my own functions for TFT_eSPI, and these allow processing to carry on while the SPI FIFO empties. I then only flush the read FIFO when I want to read from the screen or am ending the transaction. It looks like the whole SPI function gets reset anyway for the next bus user, but I have not tested with other functions on the bus yet. With this update the Pico beats the ESP8266 and ESP32 for simple graphics stuff. Using SPI FIFO transmit time:
-
Here are the results for TFT_eSPI with the locked transaction. This is the number of milliseconds to draw 20,000 pixels:
So 3.85 us per pixel.
-
Hi, my similar changes in SPI.cpp just keep SPI initialised at all times: in SPIClassRP2040::beginTransaction, replace spi_deinit(_spi) by return. Overall speed is then good, but the Teensy 3.5 with the ILI9341_t3 library is still 1.5 times faster.
-
I changed my implementation to use the original SPI.cpp and your latest release, with the following lines:
SPIClassRP2040 spi00 = SPIClassRP2040(spi0, TFT_MOSI, TFT_CS, TFT_SCK, TFT_MISO);
There is no speed difference, i.e., it works well.
Marcel
-
Hi,
I use Adafruit_ILI9341 to drive a 9341 LCD display in my Data General Nova simulator.
https://github.com/marcelvanherk/nova1200-restoration
(folder teensy_nova)
I have successfully compiled this for the pico with your platform (with a few changes) but:
#include "Adafruit_ILI9341.h"
(latest version as well as dependencies) reports:
C:\Users\marcel\Documents\Arduino\libraries\Adafruit_ILI9341\Adafruit_ILI9341.cpp:53:10: fatal error: wiring_private.h: No such file or directory
53 | #include "wiring_private.h"
| ^~~~~~~~~~~~~~~~~~
compilation terminated.
could this be made to work on the pico?
Marcel