Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OXT-1521: Replace UID with Quark #1109

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

crogers1
Copy link
Contributor

UID is an Ocaml based webserver in dom0 that supports the backend operations of the UI and UIVM. In the current state of OpenXT, much of the code is unused. It simply serves the javascript files to the browser in the UI and listens for POST messages for debug/logging.

We can replace UID with a simpler webserver, Quark, that has several benefits. It's more lightweight, written in C, and is actively maintained. This PR seeks to replace UID and update the refpolicy to define quark-specific types.

@crogers1
Copy link
Contributor Author

This should be considered for merge after #1101

@stacktrust
Copy link

Please create a JIRA ticket for this new component.

@stacktrust
Copy link

stacktrust commented Mar 21, 2019 via email

@rneilturner
Copy link
Contributor

Tested downstream and works as expected. +1

@crogers1
Copy link
Contributor Author

@jean-edouard @eric-ch @dozylynx @jandryuk thoughts?

@jean-edouard
Copy link
Member

Thoughts about replacing an obsolete and un-maintained ocaml solution with a current and upstream C one?
I think I'll go with +1 :)

-LDFLAGS = -s
+CPPFLAGS ?= -DVERSION=\"$(VERSION)\" -D_DEFAULT_SOURCE -D_XOPEN_SOURCE=700 -D_BSD_SOURCE
+CFLAGS ?= -std=c99 -pedantic -Wall -Wextra -O
+LDFLAGS ?=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weak assignment is no issue, but why change the defaults?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CFLAGS change was a leftover change from when I was debugging and can be reverted. The LDFLAGS change (I believe) was necessary to compile debug symbols for testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, -s is not desirable. Switching to weak assignment lets Bitbake override the variables with the toolchain configuration though.
It is unfortunately done in many projects :).

- die("getgrnam '%s': %s", group, errno ? strerror(errno) :
- "Entry not found");
- }
-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove this? This is not documented in the patch and should be split in its own patch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user/group validation and the uid operations in your below comment required quark to have some admin-like SELinux privileges. I wanted to avoid that, so I just removed the functionality entirely. We don't need these as far as I'm aware anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you left the options definitions in getopt() and the usage, so not entirely. It really belongs in a separate patch "Add POST and v4v support to quark." is not really related to this.

SELinux handles these sort of perms, plenty of other daemons drop root. It can be added to the policy file.

- }
- if (getgid() == 0) {
- die("Won't run as root group", argv0);
- }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before, also this is crippling options -u and -g which could be used should the default user/group hardcoded in there not be available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above

"Date: %s\r\n"
"Connection: close\r\n"
"Last-Modified: %s\r\n"
+ "Expires: Wed, 1 Jan 2030 00:00:00 GMT\r\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that necessary? Is there some clever caching being done on the UI side?[1].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason this is here is because UID sent a similar line in its response header (see https://github.com/OpenXT/uid/blob/85bdfe033454efc8016362c466599b0658d91c79/file_handler.ml#L82). It's supposed to make use of HTTP caching, but I'm not even really sure that the client is taking advantage of that. i can remove it if desired.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UID does it on images only, this is probably to not reload the entire UI and get things a little faster, the browser likely handles the caching.
While the images in the UI page won't change, could this trigger some fun on other resources?

Also quark already has helpers to handle timestamps, this could be now + 10years, or something like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch I didn't realize it was just for images. I can do something similar using the helpers.

+ FD_SET(fd, &writefds);
+ }
+
+ select(fd + 1, &readfds, &writefds, NULL, &tv);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

select(2) can be interrupted by signals(7).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concur. Also, since you have a timeout specified and the function returns void, you don't know which occurred when calling this function, either read access, timeout, or interrupt. The return of the select() can help make this determination clearer, or allow you to put in a bit of a spinlock if you want to keep this function returning void.

+ /* wait until our fd is available for the next operation */
+ wait_fd_ready(fd, true, true);
+ ret = resp_file(fd, RELPATH(realtarget), r, &st, mime, lower, upper);
+ wait_fd_ready(fd, true, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not fix anything. If it does, we have bigger problems.

  • resp_file() will only write to fd.
  • write(2) should only block if there is no space left to send the message.

With that in mind:

  • Waiting on read before writing would be equivalent to sleep(500ms).
  • Waiting on write before writing should not block.
  • Waiting on read after writing most likely sleep(500ms), but maybe the client will have time to reply something (not that anything is expected?) or close the connection.

So I am slightly curious: What was blocking exactly and are both wait_fd_ready necessary to no longer observe the issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our first iteration of the issue, we had sleep(100)'s where the wait_fd_ready() functions are now. The delays were all necessary for proper execution, and to honest I still don't know why. In an effort to make the fix less hacky, I replaced them with select() calls, but based on your comment I'm guessing this was the wrong choice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am asking to figure out if this reveals a problem in V4V or between the UI and Quark (or something else entirely...).

If this is something V4V does not handle, that problem will occur somewhere else eventually.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident this is a v4v issue, but unfortunately my knowledge of socket operations is not very strong at all. Would someone else who's more knowledgeable be able to take a look? I can stick a TODO in there for now so as to not hold up this merge.

@dpsmith
Copy link
Member

dpsmith commented Apr 23, 2019

I would be willing to +1 if the packagegroup explicit replace change is instead made to be a check against DISTRO_FEATURES for quark at which point it is included with the default being to use uid.

@jean-edouard
Copy link
Member

@dpsmith -1 to that. If it's optional and disabled by default, it might as well not be there.
Besides, I'm also against build time switches since each one doubles the OpenXT combinations to maintain. OE layers should be used for that kind of scenario.
@crogers1, if you were to push, somewhere on the internet, an OE layer that enables quark, I will add it to the layers used by default by the AIS builder.

@tsirakisn
Copy link
Contributor

@crogers1 not sure if this is on your radar but this should be merged with OpenXT/toolstack-data#51

@dpsmith
Copy link
Member

dpsmith commented Apr 23, 2019

@jed, okay then -1 from me.

Copy link
Contributor

@jandryuk jandryuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opposed to switching from UID to quark. Getting rid of a custom component is good. However this particular PR feels a little messy. It made quark work, but didn't go back and clean up the patch set.

The one patch, add-POST-and-v4v-support.patch, states 2 things, but really does 5.

  1. Add POST
    This is okay and needed
  2. Random wait_fd_ready calls
    As @eric-ch said, these seem wrong. We control v4v, so if something is broken there then we should fix that.
  3. Makefile changes
    Again, needed, but they should be explained.
  4. Strip out user/group changing
    I think these should be dropped since we can change our SELinux policy to accommodate.
  5. Hard-codes an Expires header
    I don't know enough to evaluate this.

These are all independent changes, so I would structure them in separate patches.

We want to align with upstream sources to reduce our maintenance burden going forward. Therefore, we should only patch what we absolutely must. If we can re-work other parts of OpenXT, then we should. This is where I'd rather fix v4v and SELinux than patch quark.

+ /* keep the first newline */
+ p += 2;
+
+ /* find second occurrence - if exists, POST request, else GET */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the request state "POST" or "GET" explicitly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this is reading the full HTTP message, POST or GET should be the absolute first characters seen, so it would be far more efficient to look in the absolute positions than a string match when handling GET requests. The first occurrence of double-CRLF is required, but the second search could probably be taken out completely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second search was less about figuring out the type of request and more about finding where to null terminate it and copying the correct number of bytes from the payload. That being said the second search can be done conditionally to avoid searching unnecessarily for a GET request.

+ if (!status) {
+ if (r.method == M_POST) {
+ /* handle POST */
+ syslog(LOG_INFO, "[quark] %s", r.payload);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A syslog message for each post seems excessive. Maybe drop it to LOG_DEBUG? Also, POST requests don't get a response?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, logging all requests at LOG_INFO seems excessive, especially if you are simply dumping the payload into the log message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell this is how UID handled POST messages as well. The only time POST is ever used is for debug messages (see https://github.com/OpenXT/uid/blob/85bdfe033454efc8016362c466599b0658d91c79/httpserver.ml#L205).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And see https://github.com/OpenXT/toolstack-data/blob/master/dist/script/lib/debug.js#L42 for where the POST calls originate from. AFAIK there are none besides these UI logging calls.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @tsirakisn . Yes, I forgot about the POST logging. I guess that means it should stay LOG_INFO.

I see uid sends an HTTP response to the POST messages. I think those should be maintained in quark. I don't know if it matters, but surf may be hanging onto resources for those unanswered requests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I'll mirror UID's response header in quark

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also add a lengthier comment above the syslog call explaining my above comments

echo "Stopping Quark server"
stop_quark
;;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nitpick. Inconsistent whitespacing, some have a blank space between branches and some do not.

@crogers1
Copy link
Contributor Author

@jandryuk Just some background, the direction I gave my team was to avoid modifying v4v or the interposer if at all possible. Replacing UID with Quark affects a relatively small number of components, whereas a v4v or interposer change potentially affects every component that uses v4v, which is a considerable number, including critical ones such as rpc-proxy. With the massive changes moving to 64-bit, that seemed like the correct approach at the time.

Now that x64 is merged if we want to reevaluate how we approach Quark, that's fine with me.

@dpsmith @jean-edouard, Putting this in a layer as an optional component may be the best path forward, seeing as there's a pretty even split among contributors. I can get behind that.

@dpsmith
Copy link
Member

dpsmith commented Apr 23, 2019

@crogers1 that would be fine and to be clear, my opposition is not in that I don't want to see uid replaced. It's just that uid works today with no major bugs, it has been exercised across many releases, and has been proven in production. While this commit introduces quark as a wholesale replacement that modifies its behavior from upstream thus this patched implemented has no track record of production usage to demonstrate stability, e.g. it is an experimental feature. As an experimental feature it should be an explicit action to be enabled for those that want to use and test. After a release or two with better testing, iterations of bugs, and demonstrated stability in production, then uid could be retired and quark could be made the default.

@crogers1 crogers1 force-pushed the quark-x64 branch 2 times, most recently from 2b9de62 to e0ef433 Compare December 24, 2020 14:58
Chris Rogers added 2 commits December 24, 2020 10:05
  Quark is a lightweight C-based webserver. This commit removes the old
  OCAML-based webserver, UID, and replaces it with Quark. We also
  add POST support to Quark to allow for the UI code to log debug
  messages to dom0

Signed-off-by: Chris Rogers <[email protected]>
  The removal of UID and addition of Quark requires updates
  to the refpolicy. This commit adds a new quark module and
  removes several uid_t rules in other .te files.

Signed-off-by: Nicholas Tsirakis <[email protected]>
Signed-off-by: Chris Rogers <[email protected]>
@crogers1
Copy link
Contributor Author

Rebased this PR. Compiled on latest master. @eric-ch if we could get a custom build from the build server with this in it, that'd be awesome!

Copy link
Contributor

@jandryuk jandryuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I lost track of reviewing this.

Previous feedback mentioned the single add-POST-and-argo-support.patch patch names 2 things, but actually does 4 things. I think that should be split into 4.

Has any effort been made to investigated the need for wait_fd_ready? Including as much detail of "server lockup" would be good.

Instead of patching out setuid/setgid, was running as non-root investigated? /dev/argo_stream would need a group assignment, but it would be nice. Though I have no idea what other changes may be needed to get it to actually work :)

type uid_tmp_t;
files_tmp_file(uid_tmp_t)
files_tmp_filetrans(uid_t, uid_tmp_t, file)
xc_files_rw_v4v_chr(quark_t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want: xc_files_rw_argo_chr

@@ -62,7 +62,6 @@ kernel_read_vm_overcommit_sysctl(updatemgr_t)
logging_send_syslog_msg(updatemgr_t)

dbd_dbus_chat(updatemgr_t)
uid_dbus_chat(updatemgr_t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have quark_dbus_chat - don't you need to replace these?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants