-
Notifications
You must be signed in to change notification settings - Fork 918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OSS-Fuzz targets improvements #267
Conversation
WAT |
May I ask for a review from someone? |
Gentle ping |
@chqrlie Do you want these here or in NG? I might be able to help with either. |
@saghul Since only this original repository is registered into oss-fuzz (allowing to publicly keep track of the performance improvement), my first priority is to merge these changes here. But, since libFuzzer can be executed locally too, NG could also gain profit from these changes, so I'm not against merging there, too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments, please take a look! How is this supposed to work? Those are 3 files, none of them use the dictionary, and there are no instructions on how one can run the fuzzer. IMHO a README in the fuzz directory is necessary here.
@@ -0,0 +1,257 @@ | |||
"__loadScript" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this dict created? If automated, is it possible to check-in the script that does so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The names were collected with a JS script, I'll upload it in the next round.
fuzz/fuzz_compile.c
Outdated
JS_SetInterruptHandler(JS_GetRuntime(ctx), interrupt_handler, NULL); | ||
js_std_add_helpers(ctx, 0, NULL); | ||
|
||
uint8_t *NullTerminatedData = (uint8_t *)malloc(Size + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary cast
fuzz/fuzz_compile.c
Outdated
memcpy(NullTerminatedData, Data, Size); | ||
NullTerminatedData[Size] = 0; | ||
JSValue obj; | ||
obj = JS_Eval(ctx, (const char *)NullTerminatedData, Size, "<none>", JS_EVAL_FLAG_COMPILE_ONLY | JS_EVAL_TYPE_GLOBAL | JS_EVAL_TYPE_MODULE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing both JS_EVAL_TYPE_GLOBAL and JS_EVAL_TYPE_MODULE doesn't make sense. It's either a global eval or a module eval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a legacy code from oss-fuzz, but I think the current version is equivalent with JS_EVAL_FLAG_COMPILE_ONLY | JS_EVAL_TYPE_MODULE
, so I'll update it.
obj = JS_Eval(ctx, (const char *)NullTerminatedData, Size, "<none>", JS_EVAL_FLAG_COMPILE_ONLY | JS_EVAL_TYPE_GLOBAL | JS_EVAL_TYPE_MODULE); | ||
free(NullTerminatedData); | ||
//TODO target with JS_ParseJSON | ||
if (JS_IsException(obj)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is awaits in there the returned object will be a promise, you'll want to js_std_await it.
free(NullTerminatedData); | ||
//TODO target with JS_ParseJSON | ||
if (JS_IsException(obj)) { | ||
JS_FreeContext(ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JS_FreeValue(ctx, obj); is missing here, for completeness.
fuzz/fuzz_regexp.c
Outdated
|
||
#define CAPTURE_COUNT_MAX 255 | ||
|
||
FILE *outfile=NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can go inside the function.
fuzz/fuzz_regexp.c
Outdated
#include <stdint.h> | ||
#include <stdio.h> | ||
|
||
#define CAPTURE_COUNT_MAX 255 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used in a single place and multiplied by 2. Why not just write the number literal?
fuzz/fuzz_regexp.c
Outdated
return 0; | ||
} | ||
bc = lre_compile(&len, error_msg, sizeof(error_msg), (const char *) Data, | ||
Size1, 0, ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing the context as the opaque accomplishes nothing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should be passed as opaque
then? (Using NULL
instead of ctx
causes a segfault.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird, the main test in libregexp.c even passes it as NULL. What is the backtrace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--- a/fuzz/fuzz_regexp.c
+++ b/fuzz/fuzz_regexp.c
@@ -56,7 +56,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
return 0;
}
bc = lre_compile(&len, error_msg, sizeof(error_msg), (const char *) data,
- size1, 0, ctx);
+ size1, 0, NULL/*ctx*/);
if (!bc) {
return 0;
}
==1848896==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000018 (pc 0x555555672e50 bp 0x7fffffffc3c0 sp 0x7fffffffc380 T1848896)
==1848896==The signal is caused by a READ memory access.
==1848896==Hint: address points to the zero page.
#0 0x555555672e50 in lre_realloc quickjs/quickjs.c:43870:31
#1 0x55555577e5c5 in dbuf_realloc quickjs/cutils.c:114:19
#2 0x55555577e861 in dbuf_put quickjs/cutils.c:140:13
#3 0x55555577eb39 in dbuf_putc quickjs/cutils.c:161:12
#4 0x55555576ad83 in lre_compile quickjs/libregexp.c:1771:5
#5 0x55555562ec7d in LLVMFuzzerTestOneInput quickjs/fuzz/fuzz_regexp.c:58:10
#6 0x5555555ecc30 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (quickjs/fuzz_regexp+0x98c30) (BuildId: 383173e4c29297490e72ea7182a0eff6c735fb39)
#7 0x5555555ec3a5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (quickjs/fuzz_regexp+0x983a5) (BuildId: 383173e4c29297490e72ea7182a0eff6c735fb39)
#8 0x5555555edb85 in fuzzer::Fuzzer::MutateAndTestOne() (quickjs/fuzz_regexp+0x99b85) (BuildId: 383173e4c29297490e72ea7182a0eff6c735fb39)
#9 0x5555555ee795 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (quickjs/fuzz_regexp+0x9a795) (BuildId: 383173e4c29297490e72ea7182a0eff6c735fb39)
#10 0x5555555dc8fb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (quickjs/fuzz_regexp+0x888fb) (BuildId: 383173e4c29297490e72ea7182a0eff6c735fb39)
#11 0x555555605932 in main (quickjs/fuzz_regexp+0xb1932) (BuildId: 383173e4c29297490e72ea7182a0eff6c735fb39)
#12 0x7ffff7a52d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#13 0x7ffff7a52e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#14 0x5555555d1d64 in _start (quickjs/fuzz_regexp+0x7dd64) (BuildId: 383173e4c29297490e72ea7182a0eff6c735fb39)
UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV quickjs/quickjs.c:43870:31 in lre_realloc
==1848896==ABORTING
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. quickjs.c provides a lre_realloc
implementation which uses the context. Since the text uses just libregexp, it should compile in quickjs.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to provide these 2 in the fuzzer script:
/* must be provided by the user */
LRE_BOOL lre_check_stack_overflow(void *opaque, size_t alloca_size);
void *lre_realloc(void *opaque, void *ptr, size_t size);
And avoid compiling quickjs.c for that particular test.
fuzz/fuzz_regexp.c
Outdated
return 0; | ||
} | ||
input = Data+Size1+1; | ||
ret = lre_exec(capture, bc, input, 0, Size-(Size1+1), 0, ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto about ctx.
fuzz/fuzz_regexp.c
Outdated
JSContext *ctx; | ||
|
||
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) { | ||
if (outfile == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no way for outfile to not be null, what's the deal here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I only used and improved the eval and compile targets, while the regexp target was simply copied from oss-fuzz. But, I think the purpose of outfile is similar as of the initialized
variable in the other two targets. If I simplify libFuzzer, then it's nothing more than a fuzzing harness with an infinite loop which calls the LLVMFuzzerTestOneInput
function linked to it until it crashes. Since in a fuzzing session everything is about performance, the one-time initializations are preferred to be executed only once. So, I think the NULL-check of outfile
also serves this purpose: after the first test run, outfile
won't be NULL (most probably), hence the initialization won't run again either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand, I don't see the purpose of writing the result of lre_exec
into this outfile
, since it's never read back and file IO (especially unnecessary IO) and fuzzing are not the best friends (performance reasons). Could it be reasoned with some compiler hack to avoid optimizing out lre_*
calls if their result is not used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea TBH.
The use of JS_NewContext instead of JS_NewContextRaw spares to call JS_AddIntrinsic<XYZ> functions from the fuzz target, since the public JS_NewContext API does exactly the same.
fuzz_regexp doesn't need to be dependant on libquickjs since the runtime and the context - that were provided by libquickjs - were only created to call two simple functions implemented in libquickjs which could be mimicked by the fuzzer. The removal of runtime and context objects implicated further simplifications, like the omission of their one-time creation. Finally, writing the result of the regexp operations into a file is also superfluous, since it's not used by anybody.
…pile targets Before this patch, the test executions were not independent, since all the executed tests used the same JavaScript runtime and context, causing irreproducible failure reports.
Big numbers are used by the input corpus, but the targets were not able to interpret them since they were not compiled into them. This change improved the inital coverage of the fuzz_eval target with 21% and the coverage of the fuzz_compile target with 25% when using the official corpus.
Furthermore, added a JS script that collects all the builtin names from the executing engine.
49e4ebf
to
956038a
Compare
I hope that I addressed or answered all the requests and questions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as a first check-in! @chqrlie PTAL!
Very interesting feature. I am waiting for Fabrice's approval to merge this in. Good work! |
Thank you @renatahodovan for your contribution. Fuzzing is an enticing path to find software flaws. You seem to have an advanced understanding of the domain, do you participate in conferences such as FOSdem in Brussels where I followed several sessions on the subject? I would be interested to follow the progress and discuss research ideas on this. Best regards Chqrlie |
Hi @chqrlie ! |
36911f0 regexp: fix non greedy quantizers with zero length matches d86aaf0 updated test262.patch adec734 fixed test of test262 directory d378a9f Improve `js_os_exec` (bellard#295) 97be5a3 Add `js_resolve_proxy` (bellard#293) f3f2f42 Add `JS_StrictEq()`, `JS_SameValue()`, and `JS_SameValueZero()` (bellard#264) 6f9d05f Expose `JS_SetUncatchableError()` (bellard#262) d53aafe Add the missing fuzz_common.c (bellard#292) db9dbd0 Add `JS_HasException()` (bellard#265) 6c43013 Add `JS_NewTypedArray()` (bellard#272) 01454ca OSS-Fuzz targets improvements (bellard#267) 0c8feca Improve class parser (bellard#289) d9c699f fix class method with name get (bellard#258) 7a2c6f4 Improve libunicode and libregexp headers (bellard#288) 1402478 Improve unicode table handling (bellard#286) 3b45d15 Fix endianness handling in `js_dataview_getValue` / `js_dataview_setValue` 653b227 Improve error handling 203fe2d Improve `JSON.stringify` ce6b6dc Use more explicit magic values for array methods c0e67c4 Simplify redundant initializers for `JS_NewBool()` 0665131 Fix compilation with -DCONFIG_BIGNUM 65ecb0b Improve Date.parse, small fixes 6a89d7c Add CI targets, fix test_std.js (bellard#247) ebe7496 Fix build: use LRE_BOOL in libunicode.h (bellard#244) 1a5333b prevent 0 length allocation in `js_worker_postMessage` e17cb9f Add github CI tests 06c100c Prevent UB on memcpy and floating point conversions 3dd93eb fix microbench when microbench.txt is missing (bellard#246) 35b7b3c Improve Date.parse 8d64731 Improve Number.prototype.toString for radix other than 10 a78d2cb Improve repl regexp handling 8180d3d Improve microbench.js 78db49c Improve Date.parse 6428ce0 show readable representation of Date objects in repl 27928ce Fix Map hash bug b70e764 Rewrite `set_date_fields` to match the ECMA specification b91a2ae Add C API function JS_GetClassID() 12c91df Improve surrogate handling readability 8d932de Rename regex flag and field utf16 -> unicode 97ae6f3 Add benchmarks target c24a865 Improve run-test262 bbf36d5 Fix big endian serialization 530ba6a handle missing test262 gracefully 0a361b7 handle missing test262 gracefully 74bdb49 Improve tests 85fb2ca Fix UB signed integer overflow in js_math_imul 8df4327 Fix UB left shift of negative number 3bb2ca3 Remove unnecessary ssize_t posix-ism c06af87 Improve string concatenation hack 8e21b96 pass node-js command line arguments to microbench 95e0aa0 Reverse e140122 1fe0414 Fix test262 error ef4e7b2 Fix compiler warnings 92e339d Simplify and clarify URL quoting js_std_urlGet 636c946 FreeBSD QuickJS Patch (bellard#203) ae6fa8d Fix shell injection bug in std.urlGet (bellard#61) 693449e add gitignore for build objects (bellard#84) e140122 Fix sloppy mode arguments uninitialized value use 6dbf01b Remove unsafe sprintf() and strcat() calls 6535064 Fix undefined behavior (UBSAN) e53d622 Fix UB in js_dtoa1 fd6e039 Add UndefinedBehaviorSanitizer support 325ca19 Add MemorySanitizer support git-subtree-dir: vendor/quickjs git-subtree-split: 36911f0
I was experimenting with quickjs fuzzing through libFuzzer, when I recognized some inefficiencies. I've uploaded a patch set to OSS-Fuzz to fix them, where I was advised to ask a review from the quickjs maintainers and possibly move the implementation of the fuzz targets into the main codebase (here). So, I'm doing it in this PR.
The patches should be straightforward: the first one is a copy of the current version of the fuzz targets from oss-fuzz, while the others incrementally apply independent fixes/improvements.
If and when you find these commits acceptable, then I'll adapt the infra in oss-fuzz to use the new code. Plus, if requested, I can enable to mirror the found bugs to the GitHub issue tracker as well.