Use a manual parser for rust-timer queue #1994

Open
wants to merge 3 commits into
base: master

Contributor

@Kobzol Kobzol commented Oct 11, 2024

Instead of a regex. This allows the user to pass the arguments in an arbitrary order.
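
For readers skimming the thread, the idea can be sketched roughly as follows. This is not the code from the PR: `parse_queue_args`, the error strings, and the exact field types are assumptions made for illustration, loosely modelled on the `QueueCommand` shape that appears in the test later in this review. Each whitespace-separated token is split on `=` and stored in a map, so the order of the arguments stops mattering.

```rust
use std::collections::HashMap;

/// Parsed arguments of a hypothetical `queue` command (illustrative only).
#[derive(Debug, PartialEq)]
struct QueueCommand<'a> {
    include: Option<&'a str>,
    exclude: Option<&'a str>,
    runs: Option<u32>,
}

/// Parses the text that follows `queue`, e.g. `runs=3 exclude=c,a include=b`.
/// Arguments are `key=value` tokens separated by whitespace, in any order.
fn parse_queue_args(args: &str) -> Result<QueueCommand<'_>, String> {
    let mut map: HashMap<&str, &str> = HashMap::new();
    for arg in args.split_whitespace() {
        // `=` must directly follow the key, so `include = b` is rejected.
        let (key, value) = arg
            .split_once('=')
            .ok_or_else(|| format!("Invalid command argument `{arg}`"))?;
        if map.insert(key, value).is_some() {
            return Err(format!("Duplicate command argument `{key}`"));
        }
    }

    let runs = match map.remove("runs") {
        Some(v) => Some(
            v.parse::<u32>()
                .map_err(|e| format!("Cannot parse `{v}` as a number of runs: {e}"))?,
        ),
        None => None,
    };
    let cmd = QueueCommand {
        include: map.remove("include"),
        exclude: map.remove("exclude"),
        runs,
    };

    // Anything still left in the map is an argument this sketch does not know.
    if let Some(key) = map.keys().next() {
        return Err(format!("Unknown command argument `{key}`"));
    }
    Ok(cmd)
}
```

With this shape, `runs=3 exclude=c,a include=b` and `include=b exclude=c,a runs=3` produce the same `QueueCommand`, and adding another named parameter should only need one more `map.remove(...)` call.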

@Kobzol Kobzol requested a review from lqd October 11, 2024 22:48
Member

@lqd lqd left a comment

Left a few comments.

Here's a free test shuffling the arg order.

#[test]
fn queue_command_parameter_order() {
    insta::assert_compact_debug_snapshot!(parse_queue_command("@rust-timer queue runs=3 exclude=c,a include=b"),
        @r###"Some(Ok(QueueCommand { include: Some("b"), exclude: Some("c,a"), runs: Some(3) }))"###);
}

@@ -5,15 +5,13 @@ use crate::github::{
};
use crate::load::SiteCtxt;

use std::sync::Arc;

use hashbrown::HashMap;
Member

why hashbrown and not std?

Contributor Author

No real reason, it was auto-imported. But I don't see any disadvantage in using hashbrown: we already depend on it, so it doesn't really matter. We don't need DoS protection here xD

@@ -57,3 +57,6 @@ jemalloc-ctl = "0.5"
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
toml = "0.7"

[dev-dependencies]
insta = "1.40.0"
Member

Do we want/need insta? Managing the snapshot files can be cumbersome, and it's yet another tool to use and learn. It's fine if we want to make good use of it in the future, and not just as a fancy assert_eq!(format!("{obj:?}"), "Struct { field: 'bla'} ").

Contributor Author

Since I'm refactoring this, I want to do it properly. Note that I'm not using snapshot files, I'm using the inline snapshots, which makes this much easier to grok, I think.

I'm already anticipating future changes (like your PR to add the backends parameter). It's no fun updating tens of tests anytime you change the structure of the thing that you parse. Snapshot testing makes this much easier.

I will document in readme what to do to update the snapshots though.
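
To make the trade-off in this thread concrete, here is roughly what the hand-written alternative looks like. It is only a sketch: the two-field `QueueCommand` and the `debug_string` helper are made up for illustration and are not the types from the PR. The expected `Debug` output is a string literal that has to be edited by hand in every test whenever a field is added, which is the chore that reviewing inline snapshots is meant to remove.

```rust
/// Simplified stand-in for the parsed command (illustrative only).
#[derive(Debug)]
struct QueueCommand {
    include: Option<String>,
    runs: Option<u32>,
}

/// Renders a value through its `Debug` impl so a test can compare it
/// against a hand-written string literal.
fn debug_string<T: std::fmt::Debug>(value: &T) -> String {
    format!("{value:?}")
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn queue_command_debug_output() {
        let cmd = QueueCommand { include: Some("b".to_string()), runs: Some(3) };
        // Adding a field to `QueueCommand` means updating this literal manually.
        assert_eq!(
            debug_string(&cmd),
            r#"QueueCommand { include: Some("b"), runs: Some(3) }"#
        );
    }
}
```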

Member

Insta adds a snapshot file locally when a test fails: the pending snap file. I saw that when adding the test I shared. The file wasn't removed once the test passed.

Contributor Author

Ah, those. Right, I guess we can add them to .gitignore, but I think they should be removed once you run cargo insta review (the snapshots shouldn't really be modified manually).

site/src/request_handlers/github.rs
.map(|index| line[index + prefix.len()..].trim())
})?;

let args = bot_line.strip_prefix("queue").map(|l| l.trim())?;
Member

GH never reformats comments, right? That is, it will not introduce unexpected \ns that would separate the command from some of its arguments?

Contributor Author

Uhh, that would be quite surprising I think. GH actually exposes comments in a bunch of formats, like text, HTML, MD etc., but I think (hope) that here we work with the raw text. We were already requiring = to be right after include etc. before, so I think that this should be fine.

site/src/request_handlers/github.rs
format!("Error occurred while parsing comment: {error}")
}
};
main_client.post_comment(issue.number, msg).await;
Member

do we need to take special care (like sanitization) in posting the error message to GH as its text will be partially controlled by users' input commands?

Contributor Author

That's an interesting idea. I don't think so, it should have the same rules as when you create the comment manually. I guess that the user could make the bot print "Unknown command argument SEND MONEY TO THE FOLLOWING BITCOIN ADDRESS TO SPONSOR THE RUST FOUNDATION", but I don't think that they can XSS or something like that (that would be a really bad hole in GH's security).
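
If echoing user text verbatim ever did become a concern, one cheap mitigation would be to quote it as inline code before posting. The sketch below is an assumption about how that could look, not something this PR does: `quote_user_input` is a made-up helper, and it relies on Markdown rendering code spans literally (as far as I know, GitHub also does not turn `@name` inside a code span into a mention).

```rust
/// Wraps user-controlled text in a Markdown inline code span so that it is
/// rendered literally (no links, no emphasis, no headings).
fn quote_user_input(input: &str) -> String {
    let cleaned: String = input
        .chars()
        .map(|c| match c {
            // A backtick would terminate the code span early.
            '`' => '\'',
            // Keep the quoted text on a single line.
            '\n' | '\r' => ' ',
            other => other,
        })
        .collect();
    format!("`{cleaned}`")
}
```

The error message would then be built as something like `format!("Unknown command argument {}", quote_user_input(arg))`, where `arg` is whatever the user typed.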

Member

lqd commented Oct 12, 2024

When we do basically this for the build command it should also fix the bug I mentioned with the regex handling that ignores some of the commands in the comment. I believe I saw it happen during triage where people can want to build multiple shas to analyze a rollup, and "it didn't work".
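
A line-by-line scan along these lines would pick up every command in a comment rather than only some of them. This is a hedged sketch, not the repository's code: `bot_commands` and `build_shas` are hypothetical names, the `@rust-timer` prefix is taken from the test above, and the real `build` command may accept more than the single positional sha handled here.

```rust
/// Returns the trimmed text after `prefix` for every line of `body`
/// that contains it.
fn bot_commands<'a>(body: &'a str, prefix: &str) -> Vec<&'a str> {
    body.lines()
        .filter_map(|line| {
            line.find(prefix)
                .map(|index| line[index + prefix.len()..].trim())
        })
        .collect()
}

/// Extracts the positional sha of every `build` command in the comment.
/// Note: `strip_prefix("build")` is deliberately naive for this sketch.
fn build_shas(body: &str) -> Vec<&str> {
    bot_commands(body, "@rust-timer")
        .into_iter()
        .filter_map(|cmd| cmd.strip_prefix("build"))
        .filter_map(|rest| rest.split_whitespace().next())
        .collect()
}
```

A comment containing two `@rust-timer build <sha>` lines then yields both shas, in the order they appear.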

Contributor Author

Kobzol commented Oct 12, 2024

I wanted to also refactor the build command parsing, but it's a bit tricky, because it has a positional argument, and I need to move on to other stuff atm :( So let's merge this to make it slightly easier to add the backends parameter. We don't need to support backends in build I think, I haven't ever seen someone use include/exclude with build anyway. Well, I haven't ever seen anyone use it with queue either, but at least we have a use-case for that now 😆
