Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout not applying to encoding? #4257

Open
rexxars opened this issue Nov 7, 2024 · 8 comments
Open

Timeout not applying to encoding? #4257

rexxars opened this issue Nov 7, 2024 · 8 comments
Labels

Comments

@rexxars
Copy link
Contributor

rexxars commented Nov 7, 2024

Question about an existing feature

The .timeout() method seems to only apply to certain operations, not the actual encoding of the image - is that right? Is there a way one could abort an ongoing transform/encode operation? If so, would it be possible to make the timeout apply to this step as well?

What are you trying to achieve?

I'd like to "give up" on processing/encoding images if they take longer than a given threshold. Without having been able to reproduce this consistently, it seems processing sometimes gets "stuck" on an image for a very long time, and I'd love to be able to simply drop the operation entirely - eg prevent it from using any more resources, and explicitly fail.

While I can set a timeout in user-land code and proceed as if it timed out, I cannot abort the actual encoding process, so it will continue taking CPU time until it eventually finishes/errors.

Additionally, it would be nice to be able to abort transformations, eg:

import sharp from 'sharp'

const abortController = new AbortController()
setTimeout(() => abortController.abort(), 1000)

try {
  const image = await sharp('input.jpg')
    .resize(1000, 1000)
    .toFormat('avif', {
      signal: abortController.signal
    })
    .toBuffer()
} catch (err) {
  console.log(err instanceof AbortError) // true
}

When you searched for similar issues, what did you find that might be related?

Please provide a minimal, standalone code sample, without other dependencies, that demonstrates this question

Repository that illustrates the problem: https://github.com/rexxars/sharp-timeout-issue

Please provide sample image(s) that help explain this question

See repo above, but any sufficiently large image will do

@lovell
Copy link
Owner

lovell commented Nov 7, 2024

Almost all libvips operations support timeout, however the HEIF writer does not as the API of the underlying libheif library exposes a "one shot" encoder.

https://github.com/libvips/libvips/blob/b27b8e450e6508f3d7e840da8212ec59695b632d/libvips/foreign/heifsave.c#L321-L322

There is a brand new feature in the latest libheif v0.19.x that exposes HEIF's grid feature, which might allow for tile-based output, but support for this would need to be added to libvips (mapping its concept of an image region to HEIF's concept of a grid image). This would increase file size a little, but should reduce memory consumption and I would guess slightly improve performance as encoding can start sooner.

@kleisauke
Copy link
Contributor

FWIW, the comment at kleisauke/wasm-vips#78 (comment) provides more details on why this occurs with the HEIF saver (or any saver that doesn't support streaming/chunk-wise encoding).

@rexxars
Copy link
Contributor Author

rexxars commented Nov 7, 2024

For what it's worth, in the repo I linked I am seeing the same behavior as AVIF (heif) when encoding PNGs and GIFs, eg if I set the timeout to 2 seconds:

  [avif]
    ❌ Exceeded timeout by 4293 milliseconds without failing
  [webp]
    ✅ Processed image in less than 2 seconds
  [jpeg]
    ✅ Processed image in less than 2 seconds
  [png]
    ❌ Exceeded timeout by 665 milliseconds without failing
  [gif]
    ❌ Exceeded timeout by 1297 milliseconds without failing

I am completely unfamiliar with the internals of libvips and how it uses threads, but would it be possible to forcefully "kill"/abort an ongoing encoding operation?

@lovell
Copy link
Owner

lovell commented Nov 19, 2024

To improve things for PNG writing, I think we'd need to add a call to vips_image_eval then check vips_image_iskilled to see if we should exit within the loop that calls spng_encode_row here:

https://github.com/libvips/libvips/blob/e38953c5dfa302875c4578ad1222b114ee6eac69/libvips/foreign/spngsave.c#L353

(Could probably limit this check to every N iterations of the loop.)

@styfle
Copy link

styfle commented Jan 7, 2025

@lovell I just discovered this issue on the latest sharp (0.33.5).

I found that both avif() and gif() suffer from this issue (but png() and webp() work fine).

Here's the code to reproduce:

// index.mjs
import sharp from 'sharp'
const res = await fetch('https://image-optimization-one.vercel.app/large.png')
const input = await res.bytes()
const startTime = performance.now()
const output = await sharp(input).timeout({seconds: 1}).resize(5_000).gif().toBuffer()
console.log({
  totalTimeMs: performance.now() - startTime,
  inputBytes: input.byteLength,
  outputBytes: output.byteLength,
})

Note the timeout of 1 second yet it takes 5000ms on my machine and never throws an error.

Is there any workaround? This seems like a big problem.

@lovell
Copy link
Owner

lovell commented Jan 8, 2025

@styfle Thanks for following up, I carried out some investigation and have opened libvips/libvips#4344 to hopefully improve the situation for GIF output.

@styfle
Copy link

styfle commented Jan 13, 2025

@lovell Thanks for fixing the GIF output so fast! Looking forward to that landing in sharp! 🚀

Regarding the similar issue with AVIF - I see you mentioned upgrading libheif in a previous comment #4257 (comment) but that sounds like it might be a big effort.

I'm curious if the timeout issue could be solved more easily than that since libheif or libaom runs AVIF encoding in threads (I think) and could kill a thread if its taking longer than the expected timeout (I assume). Would that be easier to implement?

I'm hoping we can find a quick workaround for AVIF since this is becoming increasingly popular and its particularly compute heavy so timeout is basically a requirement for many use cases 😅

@jcupitt
Copy link
Contributor

jcupitt commented Jan 14, 2025

You can kill processes, but you can't really kill threads, since they don't automatically clean up. What state would they leave the rest of the program in? What if a thread was holding an important mutex when it was removed?

The best solution is for the controlling thread to set a "please exit when you can" flag, and for the worker to check this periodically at safe points. We could add this when heifsave gets chunked output.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants