Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] epoll-like PAL interface #1971

Open
mwkmwkmwk opened this issue Aug 13, 2024 · 4 comments
Open

[RFC] epoll-like PAL interface #1971

mwkmwkmwk opened this issue Aug 13, 2024 · 4 comments

Comments

@mwkmwkmwk
Copy link
Contributor

Description of the feature

I propose implementing a new set of PAL APIs for persistent host-side event collection. The new PAL APIs should, for the most part, directly follow epoll_create/epoll_wait/epoll_ctl design.

The current epoll implementation in LibOS should be rewritten on top of these APIs, and so should the async thread.

Arguably, the current PalStreamsWaitEvents API could be replaced with the new APIs, but it may still have the benefit of being more efficient for simple cases, so I am not proposing that.

Why Gramine should implement it?

The reasons here are twofold: correctness and performance.

Currently, the implementation of FIOASYNC aka O_ASYNC aka the async thread is just plain broken. Among other serious problems, it is written such that after running FIOASYNC, the application gets exactly one SIGIO when the underlying handle becomes ready, after which the handle is removed from the async thread's notification list. However, the correct semantics in this case require us to send another SIGIO every time the underlying handle becomes ready again.

I do not believe these semantics can be reasonably implemented with the current PalStreamsWaitEvents API. If we change the implementation of async thread to not remove the handle from the list, the semantics of poll underlying PalStreamsWaitEvents will just cause the callback to be triggered again and again in a loop. We would need some mechanism to re-add the stream to the async thread's list when it becomes un-ready again, but we don't have anything like that.

Having an epoll-based API in the PAL solves that — we can keep the set of all FIOASYNC open handles persistently as an epoll handle on the host side, using EPOLLET edge-triggered mode. Then we can just translate all incoming events directly into SIGIO. While I am not fully convinced the semantics are a perfect match (they go through distinct pathways in kernel source), I believe they are close enough for our purposes.

The second reason has to do with the performance of our epoll implementation in LibOS — currently it is effectively emulated via poll, inheriting all of its problems.

@dimakuv
Copy link

dimakuv commented Aug 13, 2024

I am not convinced by this proposal.

I think I understand the rationale, but here are my counter-arguments:

  1. Stateful objects in PAL are complex. We strive to keep PAL stateless (and when we don't, we have security vulnerabilities like [PAL/Linux-SGX] Fix Trusted Files degenerating to Allowed Files on fork #1796). This epoll PAL object will be non-trivial to checkpoint-and-restore on fork.
  2. One more attack surface. Because you don't just introduce a new PAL API, you also introduce a bunch of new OCALLs -- the ones that will perform epoll_create/epoll_wait/epoll_ctl on the host.
  3. Non-trivial implementation/porting effort for other, non-Linux-based PALs. You propose to use non-trivial features like EPOLLET. Not sure about e.g. Windows or FreeBSD -- do these OS kernels support edge-triggered events? Similarly, implementing this new PAL API will be a pain in smth like Gramine-TDX.
  4. Duplication of polling mechanisms -- you propose to keep the old poll-based API. Thus, we'll have poll and epoll PAL APIs simultaneously -- this goes against the philosophy of keeping the interface minimal.
  5. I don't buy the performance argument -- I don't recall seeing real bottlenecks due to poll inefficiencies. Of course there are such corner cases, but there was no real need to replace poll with epoll; also Gramine has many perf bottlenecks in other places which should be fixed with higher priority (e.g., our coarse-grained locking).

Also, regarding FIOASYNC -- I think it's an ancient method used by dinosaur applications (maybe Apache web server?). I wouldn't mind emulating it on a best-effort basis (i.e., with terrible performance). Adding epoll PAL API just for this is an overkill.

@mkow
Copy link
Member

mkow commented Aug 13, 2024

  1. Stateful objects in PAL are complex. We strive to keep PAL stateless (and when we don't, we have security vulnerabilities like [PAL/Linux-SGX] Fix Trusted Files degenerating to Allowed Files on fork #1796). This epoll PAL object will be non-trivial to checkpoint-and-restore on fork.

That bug was caused by a terrible design, where whether a file is trusted or allowed was stored implicitly, and obtained by looking at some other fields of the struct, and the default was "allowed".

@dimakuv
Copy link

dimakuv commented Aug 14, 2024

This issue was discussed in #1964, please see the comments there.

TLDR: It was agreed to try to implement the EPOLLET emulation similar to the already-existing one, without adding the epoll PAL API.

@DemiMarie
Copy link

I think it is important to mention that poll is asymptotically worse than epoll in high-concurrency environments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants