Expose more data over the API #143

Open
wants to merge 15 commits into
base: master

Conversation

sophiabits

Exposes additional API endpoints (e.g. GET /problems/:id.xml), adds additional data to some endpoints (e.g. <groups> node when retrieving a ProblemSet), and hides some sensitive info such as the judge log when retrieving submissions.

Serializers

Serialization logic has been implemented by overriding the to_xml method of the relevant models, which avoids repeating :only and :include at every call site. Ideally this logic would live in separate serializer classes, but I couldn't figure out an elegant way of doing that.

Some things I tried:

  • The active_model_serializers gem does not support XML, and while the authors are interested in adding support they're only looking into it for version 0.10 which requires Rails 6. Train runs on Rails 4 and I imagine bumping by 2 major versions will break things (see: Added XML support rails-api/active_model_serializers#448 (comment))
  • ActiveModel::Serializers::XML is, afaict, built-in to Rails 4 (it was split out in v5?) and so isn't very helpful
  • A custom *Serializer class doesn't seem to work the way I'd expect it to (I've tried both returning and yielding a Builder::XmlMarkup, and wound up with strange extraneous elements inserted)
  • The serializable_hash method feels a bit nicer to override than to_xml, but doesn't receive all of the builder options (:include is missing, for instance)
  • An attributes method results in pretty strange XML, see the total-weighting element here: https://i.imgur.com/w0zsVTj.png

If there's a better place for the serialization logic or a library you know of, I'm happy to rework the PR. Ruby isn't my area of expertise :)

The stock to_xml behavior adds type attributes to nodes, but if you manually add a tag through a builder you don't get that attribute. XmlUtil exists mainly to ergonomically add type attributes to custom nodes in order to maintain consistency with the rest of the XML doc.
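
To illustrate the idea, here is a minimal plain-Ruby sketch of a helper that attaches a type attribute based on the value's class. It is not the XmlUtil from this PR: the module name TypedXml, the method typed_tag, and the string-based output are all assumptions made for illustration, and only a small subset of the type names that to_xml emits is covered.

```ruby
# Hypothetical helper for illustration only; not the PR's XmlUtil.
module TypedXml
  # A subset of the type names the stock to_xml emits.
  # Strings carry no type attribute, so they are absent here.
  TYPE_NAMES = {
    Integer => 'integer',
    Float => 'float',
    TrueClass => 'boolean',
    FalseClass => 'boolean'
  }.freeze

  ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

  # Escape the characters that are unsafe in XML text content.
  def self.escape(text)
    text.to_s.gsub(/[&<>]/, ESCAPES)
  end

  # Build a single element, adding type="..." when the class is known
  # and nil="true" when there is no value at all.
  def self.typed_tag(name, value)
    return %(<#{name} nil="true"/>) if value.nil?

    type = TYPE_NAMES[value.class]
    attrs = type ? %( type="#{type}") : ''
    "<#{name}#{attrs}>#{escape(value)}</#{name}>"
  end
end

puts TypedXml.typed_tag('total-weighting', 300)
```

A real helper would more likely write into the builder passed to to_xml rather than return strings; the string form just keeps the sketch free of dependencies.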

Swapping over to JSON could be an idea, as it would let us use active_model_serializers. JSON also preserves whitespace exactly, whereas an XML parser can be configured to strip out extraneous whitespace -- which is a problem when working with submissions. This felt like a pretty major change though, so I didn't do it.
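
As a quick illustration of the whitespace point, the sketch below round-trips a whitespace-sensitive string through Ruby's standard json library. The sample source text and the 'source' key are made up for the example and are not part of the API.

```ruby
require 'json'

# A made-up submission body where tabs, double spaces and trailing
# newlines all matter.
source = "def f(x):\n\tif x:\n\t\treturn  1\n\n"

# Whitespace is escaped inside the JSON string literal on the way out...
payload = JSON.generate('source' => source)

# ...and restored byte-for-byte on the way back in.
restored = JSON.parse(payload)['source']

puts restored == source
```

Because the whitespace lives inside a string literal, a JSON parser has no setting that could silently trim it.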

Didn't do

A few things mentioned on Discord haven't been done:

  • Filtering out comments from problem statements; I'm not sure how to accomplish that using the built-in XML serializer.
  • Hiding updated-at: updated-at works great as a cache key and I can't see any issue with exposing it -- so it's still visible.
  • It was mentioned to hide name on User docs; I've set it up so that if you hit the API from a staff account you get the user's name (and email), but otherwise those fields don't exist. This mirrors the behavior of the HTML page.
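
The staff-only behaviour in the last point can be sketched roughly as follows. This is plain Ruby rather than the model code in the PR: UserRecord, the field lists, and visible_user_fields are hypothetical names, and the username field is an assumption about what a non-staff viewer sees.

```ruby
# Hypothetical stand-in for the User model.
UserRecord = Struct.new(:id, :username, :name, :email)

PUBLIC_FIELDS = %i[id username].freeze # assumed to be visible to everyone
STAFF_FIELDS = %i[name email].freeze   # only present for staff viewers

# Return only the fields the viewer may see. Hidden fields are omitted
# entirely instead of being emitted as empty values.
def visible_user_fields(user, viewer_is_staff:)
  fields = viewer_is_staff ? PUBLIC_FIELDS + STAFF_FIELDS : PUBLIC_FIELDS
  fields.map { |field| [field, user[field]] }.to_h
end

user = UserRecord.new(7, 'alice', 'Alice Example', 'alice@example.com')
p visible_user_fields(user, viewer_is_staff: false)
p visible_user_fields(user, viewer_is_staff: true)
```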

Future

  • None of these endpoints are paginated; I'm assuming this isn't an issue given Tom's comment re HTTP requests
  • Swapping to JSON might be a good idea as it seems like there's more tooling available in the Rails ecosystem

Here are some endpoints which I haven't included in this PR, but would like to have done eventually (eventually probably meaning never -- parsing the HTML feels easier than writing the serializers):

  • Including scoreboard data in GET /contests/:id.xml. There are a lot of moving parts to that code and I'm not entirely sure I understand all the acl rules.
  • It would also be nice to have the parsed judge data inlined inside GET /submissions/:id.xml, as most of the info is available through the web interface anyway. I didn't do that either, as I think there's room for discussion about the exact shape of the XML and it's fairly straightforward to parse the judge results out of the HTML. Here's where I got up to before deciding it was getting too big: https://i.imgur.com/ibclHnr.png

XML Samples

Here are some sample XML docs (note that the problems node in the contest sample is collapsed in the UI; the contest startcode is now censored, and points+max-points are also excluded from submissions now): https://imgur.com/a/UfJs203

Member

@Holmes98 Holmes98 left a comment

For stripping comments, could we just do something like this?

to_xml do |xml|
  xml.tag! 'statement', Loofah.fragment(statement).scrub!(Loofah::Scrubbers::NoComment.new)
end

Also, I think :owner_id should be hidden for problems, problem sets, contests, and groups (at least for non-admins).

XmlUtil.serialize_id_list xml, 'contestants', contestants
end

if policy.scoreboard? then
Member

Suggested change
- if policy.scoreboard? then
+ if policy.show_details? then

Problem IDs should be hidden to users unless the contest has ended or they are a current/past contestant (we may want to reuse problems). We should probably also restrict access to :problem_set_id in the same way.

@@ -166,4 +166,38 @@ def max_extra_time
(duration*3600).to_i
end

def to_xml(opts={})
opts[:exclude] ||= [:startcode]
Member

Suggested change
- opts[:exclude] ||= [:startcode]
+ opts[:except] ||= [:startcode]
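
One thing to keep in mind with ||= is that a caller passing its own :except list would bypass the :startcode default. A minimal sketch of merging the two lists instead, assuming a hypothetical helper named merge_except that is not part of this PR:

```ruby
# Hypothetical helper: always hide the given attributes while keeping
# whatever the caller already asked to exclude. Returns a new hash and
# leaves the caller's options untouched.
def merge_except(opts, *hidden)
  opts.merge(except: (Array(opts[:except]) + hidden).uniq)
end

p merge_except({}, :startcode)
p merge_except({ except: [:owner_id] }, :startcode)
```

Inside to_xml this could be used as super(merge_except(opts, :startcode)), though whether always hiding the attribute is the desired behaviour is a policy question for the maintainers.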

Comment on lines +44 to +45
XmlUtil.serialize_id_list xml, 'contests', contests
XmlUtil.serialize_id_list xml, 'groups', groups
Member

@Holmes98 Holmes98 Feb 7, 2022

Same as with Problem#to_xml, should we restrict these to admins?

Comment on lines +117 to +119
XmlUtil.serialize_id_list xml, 'contests', contests
XmlUtil.serialize_id_list xml, 'groups', groups
XmlUtil.serialize_id_list xml, 'problem-sets', problem_sets
Member

Should we restrict these to admins? This will list upcoming/current contests and groups/problem sets that the user doesn't have access to.

Comment on lines +48 to +53
# `expired_at` has the value Float::INFINITY when the request hasn't expired,
# and the XML formatter explodes when it encounters that value. Adding :expired_at
# to opts[:exclude] is the obvious solution, but for reasons unknown to me it does
# not appear to work as expected. Changing `expired_at` to some other value (which isn't
# a datetime) will result in an empty tag being emitted with a `nil="true"` attribute
# which seems like the best solution after just omitting the tag entirely.`
Member

opts[:except] should work =)
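
If the tag should stay in the output, the workaround described in the code comment (emitting an empty value instead of the sentinel) can be sketched as below. The helper name xml_safe_time is hypothetical and not part of this PR.

```ruby
# Hypothetical helper: map the "never expires" sentinel to nil so the
# XML formatter sees an empty value instead of an infinite float.
# Any other value passes through unchanged.
def xml_safe_time(value)
  value.is_a?(Float) && value.infinite? ? nil : value
end

p xml_safe_time(Float::INFINITY)
p xml_safe_time(Time.at(0).utc)
```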

@@ -1,3 +1,5 @@
require 'builder'
Member

Doesn't seem to be used?

Suggested change
- require 'builder'

2 participants