Threads stuck at READ-INITIAL-REQUEST-LINE #189
Don't think there's enough information to diagnose. |
Here's some more info I found out: notice the timeout variable is
So
|
Ok, the stream object's |
Does your acceptor have a timeout? |
Yeah, both read and write timeout are 20. |
Probably doesn't work with ssl. |
Ok, so the problem is in cl+ssl:
When they create the stream from the socket, they should probably derive the deadline attribute from the socket's timeout. |
Thanks! |
Well, hunchentoot sets timeouts afterwards, with set-timeouts, so maybe it's both. |
I don't know much about how all of this works, but I think it might be a different timeout. |
@casouri, the According to @stassats, hunchentoot sets the timeout of the SSL socket after it is created. |
After brief study of hunchentoot code, it seems hunchentoot does not set It simply sets the timeouts on the underlying socket. And later |
Though
Yes, because the stream is created by |
Do you have enough to make a pull request? |
Yes, more or less. I have patched it up like this:
It sets |
Actually that's only a patch, it doesn't use the |
Ok, here:
I don't know about the internals and overall structure of hunchentoot, so I'm not sure if it's the most appropriate place to set that |
@casouri , setting fixed deadline for connection stream when initializing it will lead to an error signaled when handling one of the next requests on this connection - not what we want. |
@stassats, @casouri, I propose the following approach (draft for now): avodonosov@3a1297e It's based on these two comments: cl-plus-ssl/cl-plus-ssl#69 (comment) and the reply CC @hanshuebner |
Thanks for taking the time! |
I am exploring this approach, calling The Several links on how and why Despite some of the comments in these links, the |
Looking at your patch @avodonosov, it seems like a pretty ugly solution. Wouldn't it be possible to patch hunchentoot so it sets the cl+ssl deadline whenever it sets the socket timeout? |
@fjl, I propose that approach because it does not depend on Lisp implementation behavior across several layers of Lisp libraries, unlike the socket timeouts / cl+ssl deadlines. Why do you think it's ugly? Speaking of the deadlines: first, they are not really supported by cl+ssl. Also note that deadline and timeout have different meanings, so a deadline cannot be set in the same place where the timeout is set. A deadline is an absolute time; it would need to be set for every request, maybe at several stages of request processing. Essentially, the function I introduced in the draft is a deadline setter (I even considered naming it accordingly) - after a certain absolute time the IO is aborted (implemented as a socket shutdown). It is used for 3 stages: stream setup (SSL handshake), request reading, and request processing with response writing. |
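To make the timeout/deadline distinction above concrete, here is a minimal Python sketch. It is not code from the draft commit or from cl+ssl; the names `Deadline` and `timeout_for_next_read` are hypothetical and exist only for illustration. A timeout restarts with every blocking call, whereas a deadline is fixed once per stage and every later read only gets whatever time is left.

```python
import time


class Deadline:
    """Absolute point in time after which I/O for one stage is aborted.

    Hypothetical helper for illustration only.
    """

    def __init__(self, seconds_from_now, clock=time.monotonic):
        self._clock = clock
        # Fixed once, when the stage (handshake, request read, ...) starts.
        self.expires_at = clock() + seconds_from_now

    def remaining(self):
        """Seconds left before the deadline, never negative."""
        return max(0.0, self.expires_at - self._clock())

    def expired(self):
        return self._clock() >= self.expires_at


def timeout_for_next_read(deadline):
    """Per-call timeout to use for the next blocking read under a deadline."""
    if deadline.expired():
        raise TimeoutError("deadline passed")
    return deadline.remaining()
```

With a plain 20-second timeout, ten reads that each take 19 seconds all succeed; under a 20-second deadline, the second read would only be allowed about one more second. This is why a deadline has to be re-created for each request (or stage) rather than set once when the stream is initialized.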
Hope you're not offended by the term. I just said ugly because the solution in your draft didn't match my intuition for the problem. Having looked into it more now, I can definitely see where you're coming from with this, but there are still a couple of issues to consider before settling on your approach. Specifically, I'm thinking of these two things:

(1) In HTTP servers, generally, there are three distinct timeouts to consider (you also have them in your draft!). First, the client must provide the headers / request line within a short time after opening the connection. A fixed timeout is OK for this, and a deadline would also work (it can just be current time + fixed timeout). Second, once the headers are in, things get more complicated because the data that follows is not of fixed size. Two solutions are possible here: put the application in charge of the timeout, or use some kind of 'idle timeout' that ensures at least some amount of data flows within the given time. Either of these is necessary to prevent the slowloris attack. A fixed timeout (or deadline) is not great because it will not work with large file uploads, for example. Finally, there should be a timeout on delivering the response data back to the client, because the client could otherwise tie up the connection. Here, a fixed timeout / deadline is also inappropriate because the response may be very large.

(2) Now, hunchentoot's timeout handling isn't all that great. It does set timeouts, but if / how these are handled is kind of up to the Lisp implementation. Also, since the client may get a wrapped stream from functions like RAW-POST-DATA, there is really no way to set the timeout on the application side. This is why I think the 'idle timeout' approach would work best for hunchentoot. It prevents the attack with a stuck socket, while also allowing the app not to care about timeouts at all while reading request body text.

IMHO this is a problem that should be solved by the usocket API, somehow. Ideally, usocket would provide methods on the stream returned by SOCKET-STREAM to set the read / write idle timeouts that apply to all future calls of READ-BYTE and READ-SEQUENCE on the stream. Working around the fact that there is no such API, hunchentoot has set-timeouts.lisp.

With CL+SSL, it gets even more complicated because it operates on a stream, not a socket. So not only is there no defined way for it to set the timeouts on the underlying socket, there is also no obvious way for library users to set the timeout on the SSL stream (except by calling custom CL+SSL methods). This, too, could be fixed by adding a stream timeout API to usocket. If such an API existed, CL+SSL could: have a stream class specific to usocket, implement those timeout methods for usocket-based SSL streams (just pass them through), and thus allow callers to set the timeout generically for both SSL and non-SSL streams.

Mostly, what I'm getting at here is: hunchentoot already tries to paper over a lot of library issues, and while creating another hack to make it work will probably fix the issues for hunchentoot, setting timeouts will still be a nightmare for any other application. We don't even need to look very far beyond hunchentoot for this, just think of hunchensocket. It also needs timeouts, and they wouldn't work over SSL with the hack. It would be much nicer to find a solution that improves things for all applications and then use that to improve / simplify hunchentoot. |
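As an illustration of the 'idle timeout' idea described above, here is a small Python sketch. It is not hunchentoot or usocket code; `read_body` is a hypothetical function, and Python's per-call socket timeout stands in for the stream-level idle timeout that the comment wishes usocket provided. The timer applies to each individual read, so a large upload succeeds as long as data keeps flowing, while a connection that goes silent is dropped.

```python
import socket


def read_body(sock, length, idle_timeout):
    """Read exactly `length` bytes from `sock`.

    The timeout applies to each recv() separately, so it bounds how long
    the connection may stay idle, not how long the whole body may take.
    """
    sock.settimeout(idle_timeout)
    chunks = []
    remaining = length
    while remaining > 0:
        # Raises socket.timeout if no data arrives within idle_timeout.
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Note the limitation of this approach: a client that trickles in a single byte just inside the idle window never triggers the timeout, so an idle timeout on its own bounds silence but not total request time.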
You should raise that over at cl+ssl/usocket; there is nothing hunchentoot can do. In the meantime, hunchentoot needs to fix this problem to be usable, regardless of how "ugly" you think it is. |
If you need a fix short-term, you should really consider running hunchentoot behind a proxy. I personally use caddy as the SSL frontend for hunchentoot. You can also set timeouts on the proxy and offload static file serving to it. |
A little update. The cl+ssl fix for cl-plus-ssl/cl-plus-ssl#133 (already in Quicklisp) restores Lisp BIO functionality, which means instead of giving OpenSSL a file descriptor of the lisp stream, we can wrap the lisp stream itself, thus inheriting all the timeout semantics. So the hunchentoot threads on inactive connections will be unblocked by IO timeout in the To utilize this functionality either globally Line 83 in 460a32c
Maybe in the future timeout support can be added to cl+ssl in the default mode, where we pass the file descriptor to OpenSSL. But I am not sure of the interface for this. Maybe it's better to drop the current timeouts / deadlines support, which tries to signal a Lisp implementation specific condition, and introduce an explicit timeout parameter and signal a new type of error condition.

Also, I still think it may be good to have a kind of health check for hunchentoot that detects stuck threads and logs them, or optionally tries to terminate them by shutting down the connection socket. It will be useful in case the timeout functionality degrades in the future, or for a slow request attack, where 1 byte is sent every 15 seconds, for example, so that the timeout does not happen and the thread remains occupied by this connection.

I have a branch where such a stuck thread monitor is implemented. It logs stuck threads, and can optionally shut down their sockets automatically in order to unblock them. https://github.com/edicl/hunchentoot/compare/master...avodonosov:ssl-thread-leak?expand=1

@stassats and other maintainers, do you think that's a useful feature for hunchentoot? The implementation in my branch tries to touch the hunchentoot core as little as possible - only introduces a generic function The mixin is implemented in a separate package |
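The stuck-thread monitor idea can be sketched in a few lines of Python. This is only an illustration of the general technique (record when each connection starts a request, periodically sweep, and shut down sockets that have been busy too long); it is not the code from the branch linked above, and `StuckConnectionMonitor` with its method names is hypothetical.

```python
import socket
import threading
import time


class StuckConnectionMonitor:
    """Tracks in-flight requests and shuts down sockets that exceed max_age.

    Hypothetical sketch; a real server would call sweep() from a timer thread.
    """

    def __init__(self, max_age, clock=time.monotonic):
        self.max_age = max_age
        self._clock = clock
        self._lock = threading.Lock()
        self._started = {}  # socket -> time the current request started

    def request_started(self, sock):
        with self._lock:
            self._started[sock] = self._clock()

    def request_finished(self, sock):
        with self._lock:
            self._started.pop(sock, None)

    def sweep(self, shutdown=True):
        """Return the sockets that are stuck; optionally shut them down."""
        now = self._clock()
        with self._lock:
            stuck = [s for s, t in self._started.items()
                     if now - t > self.max_age]
            for s in stuck:
                del self._started[s]
        if shutdown:
            for s in stuck:
                try:
                    # Unblocks a thread waiting in recv() on this socket.
                    s.shutdown(socket.SHUT_RDWR)
                except OSError:
                    pass  # socket already closed or not connected
        return stuck
```

Because the monitor measures total time per request rather than idle time, it also catches the one-byte-every-15-seconds case that a per-read timeout lets through.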
I think I'd prefer working timeouts. |
It would be good if hunchentoot at least exposed a I am packaging my solution as a separate module anyways: https://github.com/avodonosov/hunchentoot-stuck-connection-monitor But it took copy-pasting the whole |
I'm bitten by this problem, and it's the only thing that occasionally stops service from my otherwise solid hunchentoot server. I might have a look, but maybe somebody with hunchentoot/cl+ssl knowledge could offer some updated insight into this? |
This patch could help: cl-plus-ssl/cl-plus-ssl#157 |
Look where? You mean investigate the issue? This thread provides several solutions already. |
Yes.
I don't quite understand. Are you saying this is not a flaw in Hunchentoot? |
@frodef, it is a defect (in hunchentoot, or in cl+ssl, or in CL, or in the universe...). This thread mentions 4 solutions, from very easy (one or two lines) to a little less easy. Do you mean none of them work for you? Or is it difficult to find them in the discussion? |
I noticed that many threads hang around for a very long time, and eventually there are so many of them that the server can't accept new requests. All of them hang in HUNCHENTOOT::READ-INITIAL-REQUEST-LINE. The read and write timeout for the acceptor is the default, 20. I couldn't figure out what's going on; the functions in the call chain don't seem to have timeout parameters. Here is the stack frame of one of the threads: