Potential improvement of encoding/decoding times #2

Open
tobybear opened this issue Jan 12, 2024 · 2 comments


@tobybear

tobybear commented Jan 12, 2024

Thank you for this interesting project!
As mentioned in the readme, both the encoding and decoding process can be a bit slow. While this might not be that relevant on the encoding side, decoding speed can be problematic for longer files. On my system, encoding speed is around 0.25x while decoding is at around 0.5x. The culprit is the MDCT, which is recomputed a lot.
One simple approach I used to improve this is to precalculate the MDCT basis over all possible coefficients when calling the encode/decode function. This needs to be done separately for the short and long window sizes, with each lookup table holding (winsize * winsize / 2) floats.
At the expense of some temporary memory allocation, this yielded substantial improvements in my tests.
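For illustration, here is a minimal sketch of the idea (not the actual pulsejet code; the function names and the exact MDCT basis indexing are my own assumptions):

```cpp
#include <cmath>
#include <vector>

// Precompute the MDCT cosine basis for a window of `winSize` input samples
// (yielding winSize / 2 coefficients), so the transform loop becomes a pure
// multiply-accumulate over a table of winSize * winSize / 2 floats.
std::vector<float> BuildMdctTable(unsigned winSize)
{
    const unsigned numBins = winSize / 2;
    const double pi = 3.14159265358979323846;
    std::vector<float> table(static_cast<size_t>(winSize) * numBins);
    for (unsigned k = 0; k < numBins; k++)
        for (unsigned n = 0; n < winSize; n++)
            table[static_cast<size_t>(k) * winSize + n] = static_cast<float>(
                std::cos(pi / numBins * (n + 0.5 + numBins * 0.5) * (k + 0.5)));
    return table;
}

// Forward MDCT using the precomputed table instead of evaluating cos() per sample.
void MdctWithTable(const float *input, float *output,
                   unsigned winSize, const std::vector<float> &table)
{
    const unsigned numBins = winSize / 2;
    for (unsigned k = 0; k < numBins; k++)
    {
        float acc = 0.0f;
        for (unsigned n = 0; n < winSize; n++)
            acc += input[n] * table[static_cast<size_t>(k) * winSize + n];
        output[k] = acc;
    }
}
```

Two such tables (one per window size) are built once when encode/decode is called and reused for every frame, which is where the temporary memory cost comes from.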
Using a 28 second test file, without the lookup table I had the following times:

  • encoding: 122s
  • decoding: 51s

With the lookup table:

  • encoding: 45s
  • decoding: 2s (!)

Encoded and decoded files are of course identical between the two variants; only the processing is around 3 times faster for encoding and around 25 times faster for decoding.
Not sure if this is of interest to any of you, but I thought I would just mention it here in case anyone is playing around with the library like I did. :)

@yupferris
Member

Hi, thanks for the interest in the project, glad you had fun with it!

Bear in mind that the intended use case for the codec (particularly the decoder) is decoding short one-shot samples for 64k intros. As such, a really stupid brute-force MDCT impl is not at all a bad tradeoff when decode time is minimal and decoder code bytes are at a premium.

That said, even for tooling (or for just playing around with other use cases like you seem to have done) a faster MDCT would be desirable. I think a satisfying solution to this is to somehow decouple the MDCT impl from the rest of the code, assuming that can be done with zero (and I mean zero) additional bytes in the decoder when used in intros (the decoder is already a couple hundred bytes larger than I had hoped!).

I recall @going-digital doing some work to that effect on this branch (which he mentioned to me in a Twitter DM that I unfortunately forgot to reply to until now, sorry about that!). It uses the classic resonator approach instead of precalculating a whole table, which is nice (that said, I admittedly haven't looked at it in detail, as my time is fairly limited for these kinds of projects).
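(For context, the resonator trick generates successive cosine basis values with a two-term recurrence instead of either calling cos() per sample or storing a whole table. The sketch below is only an illustrative example of that idea, not the code from that branch:)

```cpp
#include <cmath>

// Accumulate input[n] * cos(phase + n * delta) for one MDCT bin using the
// recurrence cos((n + 1) * d) = 2 * cos(d) * cos(n * d) - cos((n - 1) * d),
// i.e. a two-pole resonator: two state variables and one multiply-add per step.
float MdctBinWithResonator(const float *input, unsigned winSize,
                           double phase, double delta)
{
    double prev = std::cos(phase - delta);
    double curr = std::cos(phase);
    const double coeff = 2.0 * std::cos(delta);

    double acc = 0.0;
    for (unsigned n = 0; n < winSize; n++)
    {
        acc += input[n] * curr;
        const double next = coeff * curr - prev;
        prev = curr;
        curr = next;
    }
    return static_cast<float>(acc);
}
```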

Anyways, for something like this to be merged I'd need to see that it doesn't increase code size and isn't too complicated. I think I like @going-digital's approach but haven't done the due diligence, so there's an opportunity there perhaps :)

@tobybear
Author

tobybear commented Feb 6, 2024

Thanks for your response. Just to clarify, I never intended this test of mine to be an actual replacement for the intended use case of short samples in 64k demos - I know that every byte matters there ;)
I am currently just experimenting with this codec as a base/idea for my own projects in a different scenario where actual encoding and decoding times matter, so the naive pre-calcs worked well for me. Funnily enough, last week I also experimented with simple oscillators and had similar results to the repo/branch you posted.

For example, here is one of my recent tests:

Encoding of a 13s file:

  • vanilla: 85s
  • pre-calc: 18s
  • osc(toby): 24s
  • osc(going-digital): 25s

Decoding of a 13s file:

  • vanilla: 63s
  • pre-calc: 2s
  • osc(toby): 7s
  • osc(going-digital): 6s

I will test the increase in code size tomorrow, i.e. how many bytes each of the variations adds to the base code.

Shadlock0133 added a commit to Shadlock0133/pulsejet that referenced this issue Feb 11, 2024