Add URScript language #7144
In addition to the inline comments, you're also going to need to add this to generic.yml, come up with a 100% precise matching regex, and add it to the heuristics. This is because .script is incredibly generic, so the only way we'll be able to correctly identify these files is via a heuristic.
Popularity is nowhere near ready for merging, so you might want to consider holding off on the suggested changes, closing this PR, and waiting for popularity to meet our requirements.
Co-authored-by: Colin Seymour <[email protected]>
Regarding popularity, this search:
Regarding the heuristic: that makes a lot of sense. I will look into that. Thank you for your feedback so far 🙏
Lines 76 to 80 in 5fad8d5
I just added some heuristics that I think should reduce the risk of false positives.
Please merge main into your branch. Your branch is still showing signs of #7145, which has been resolved now.
See inline comments.
lib/linguist/heuristics.yml
Outdated
    - extensions: ['.script']
      rules:
      - language: URScript
        pattern: '^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*\send($|\s)'
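As a rough sanity check of what this rule accepts, the pattern can be tried in plain Ruby against small inputs. This is only a sketch: the helper name `urscript_like?` and both sample strings are made up for illustration, and it exercises the regex on its own rather than going through Linguist's heuristics machinery.

```ruby
# Returns true when the source contains a `def name(...):` header that is
# later followed by a standalone `end`, as the proposed pattern requires.
# (Hypothetical helper; the regex is copied from the diff above.)
def urscript_like?(source)
  source.match?(/^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*\send($|\s)/)
end

# Hypothetical URScript-style input: header closed by `end`.
urscript_sample = <<~SCRIPT
  def move_home():
    movej([0, -1.57, 0, -1.57, 0, 0], a=1.4, v=1.05)
  end
SCRIPT

# Hypothetical Python-style input: same header, but no closing `end`.
python_like_sample = <<~SCRIPT
  def move_home():
      return 1
SCRIPT

puts urscript_like?(urscript_sample)    # => true
puts urscript_like?(python_like_sample) # => false
```

The second sample shows why the trailing `end` check matters: the `def ...:` header alone would also match Python-style files.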
This looks much better, thanks. I think there's still room for improvement:
Suggested change:
    - pattern: '^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*\send($|\s)'
    + pattern: '^\s*def\s+[a-zA-Z_]\w*\s*\([^)]*\)\s*:[\s\S]*\send\b'
- `^\s*def\s*` is replaced with `^\s*def\s+`, as it appears at least one space is required.
- `[a-zA-Z0-9_]*` is the same as `\w*`.
- `\send($|\s)` is simplified to `\send\b`, as `\b` matches a word boundary, which is equivalent to the end of the word `end`.
@Alhadis may have some further suggestions.
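The three points above can be checked in isolation with a few lines of Ruby. This is a sketch only: the helper names and test strings are invented here, and the header regexes are just the leading portion of the patterns in the suggestion.

```ruby
# Header portion of the pattern before the suggestion (`def\s*`).
def old_header?(line)
  line.match?(/^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\(/)
end

# Header portion after the suggestion (`def\s+` and `\w*`).
def new_header?(line)
  line.match?(/^\s*def\s+[a-zA-Z_]\w*\s*\(/)
end

# True when `[a-zA-Z0-9_]` and `\w` agree on every ASCII character.
def word_class_equivalent?
  (0..127).all? do |code|
    char = code.chr
    char.match?(/\A[a-zA-Z0-9_]\z/) == char.match?(/\A\w\z/)
  end
end

# Terminator before and after the suggestion.
def old_end?(text)
  text.match?(/\send($|\s)/)
end

def new_end?(text)
  text.match?(/\send\b/)
end

puts old_header?("define(x):")   # => true  (`def` + identifier `ine`)
puts new_header?("define(x):")   # => false (whitespace now required)
puts word_class_equivalent?      # => true
puts old_end?(" end\n") && new_end?(" end\n")              # => true
puts old_end?(" endpoint") || new_end?(" endpoint")        # => false
```

For these inputs the two terminators agree; they only cover `end` followed by whitespace, end of input, or another word character.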
- `\send\b` will match a (presumably illegal) construction like `end.`, `end-`, `end(`, and many, many others.
- For performance's sake, `[\s\S]*` should match conservatively (`[\s\S]*?`) so that anything past the first `\send\b` isn't searched and included as part of the match:

    #!/usr/bin/env ruby
    greedy  = /^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*\send($|\s)/
    thrifty = /^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*?\send($|\s)/
    input = File.read("samples/URScript/admittance_control.script")
    greedy  =~ input; $&.length # => 10019
    thrifty =~ input; $&.length # => 126
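Both points can be reproduced without the sample file. The sketch below uses a made-up two-function input in place of `admittance_control.script`, and all helper names are hypothetical; the regexes are the ones from the comment.

```ruby
def greedy_pattern
  /^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*\send($|\s)/
end

def thrifty_pattern
  /^\s*def\s*[a-zA-Z_][a-zA-Z0-9_]*\s*\([^)]*\)\s*:[\s\S]*?\send($|\s)/
end

# Hypothetical input with two function bodies, each closed by `end`.
def two_function_sample
  <<~SCRIPT
    def first():
      x = 1
    end

    def second():
      y = 2
    end
  SCRIPT
end

# Returns the matched substring, or nil when the pattern does not match.
def matched_text(pattern, input)
  match = pattern.match(input)
  match && match[0]
end

# `\b` accepts any non-word character after `end`; `($|\s)` does not.
def loose_end?(text)
  text.match?(/\send\b/)
end

def strict_end?(text)
  text.match?(/\send($|\s)/)
end

greedy  = matched_text(greedy_pattern, two_function_sample)
thrifty = matched_text(thrifty_pattern, two_function_sample)

puts greedy.include?("second")   # => true  (runs to the last `end`)
puts thrifty.include?("second")  # => false (stops at the first `end`)

[" end.", " end-", " end("].each do |text|
  puts "#{text.inspect}: loose=#{loose_end?(text)} strict=#{strict_end?(text)}"
end
```

The lazy quantifier stops at the first `end`, which is all a yes/no heuristic needs; the greedy one consumes the rest of the file before backtracking to the last `end`.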
lib/linguist/heuristics.yml
Outdated
    @@ -712,7 +712,7 @@ disambiguations:
     - extensions: ['.script']
       rules:
       - language: URScript
    -    pattern: '^\s*def\s+[a-zA-Z_]\w*\s*\([^)]*\)\s*:[\s\S]*?\send($|\s)'
    +    pattern: '(^\s*|\s+)def\s+[a-zA-Z_]\w*\s*\([^)]*\)\s*:[\s\S]*?\send($|\s)'
This is unnecessary (`\s*` is zero or more, which includes `\s+`, which is one or more) and introduces a ReDoS.
I see. I reverted the commit.
This reverts commit 6dbf49f.
URScript is the custom programming language, developed by Universal Robots, that controls the robot arm. Any interactions with PolyScope get converted into URScript commands, which are subsequently sent to the robot to be executed. This high-level language is easy to learn and allows you to program the robot without the PolyScope graphical interface. The URScript Manual contains an overview of the language structure, information on data types, and a complete reference of the standard functions.