Commit fuzzer source code. #2

viniul · 2020-01-22T17:23:02Z

As discussed on the mailing list already, this PR adds the fuzzing harnesses written by @bshastry and me to the xerces upstream. An initial integration into already uncovered some bugs. Integrating the fuzzing harnesses into upstream provides a cleaner way to test the xerces code.

The purpose of this PR and the integration of xerces into oss-fuzz is to allow parts of xerces' code to be continuously fuzzed, which would probably result in the detection of security bugs early on in the development process.

CC @bshastry

This commit fixes an issue with the xml proto converter, see the following PR: google/oss-fuzz#3167

This commit contains the source code for fuzzing harnesses written by Bhargava Shastry and Vincent Ulitzsch. The fuzzing harnesses are used for an integration into oss-fuzz

…nds reads

cmake: Correct shared library name and symlink creation

viniul · 2020-04-04T13:44:14Z

Just following up - is there still interest in merging this PR? If so, is there anything on our side to help get this PR merged?

scantor · 2020-04-06T11:40:23Z

It appears this is code that isn't part of the actual build (no makefile changes), so what's the significance of this being in the distribution? Seems like maintaining it as a patch would be more agile since there are no committers with any familiarity or involvement in it.

That said, I don't have any personal objection, and it appears to be Apache-licensed, so I can check it in if there's value. I can't accept the PR (no access and I don't use GitHub for commits) but I can apply a patch via GitBox.

viniul · 2020-04-06T11:55:55Z

You are right, the current PR does not make any changes to the Makefile - that was a mistake on our part, thank you for pointing that out. We will add another commit to this PR that will introduce the necessary changes to the Makefile/configure file. I would suggest that the changes introduce a new flag --build-fuzzers to to the configure file, so that the fuzzing binaries will only be build once the flag is enabled. This would have the advantage that developers could immediately fuzz their code prior to commiting if they are interested.

scantor · 2020-04-06T11:58:03Z

That clarifies things, thank you. I'm not opposed, but you can save me some time by attaching the patch to a JIRA ticket (there may be one already for your original request? I don't recall) rather than using GitHub, which I cannot make use of.

viniul · 2020-04-06T11:59:04Z

Absolutely! Anything else we can do on our side to make this integration as easy as possible?

scantor · 2020-04-06T12:00:39Z

Just minimizing the impact on the configure script, so I can avoid having to do too much integration testing/review, but otherwise since all the code you're adding is separate I really don't have other concerns.

viniul · 2020-04-06T12:02:36Z

Got it, will keep that in mind. Thank you very much - I will work on the changes to the configure script and post a link to the jira ticket here as soon as I am done.

This commit fixes an issue with the xml proto converter, see the following PR: google/oss-fuzz#3167

…zers

scantor · 2020-04-07T12:56:15Z

This will probably have to miss the next patch because I need to get this release done but we can commit for the next.

…e of no fuzzing

viniul · 2020-05-08T10:46:53Z

Hey @scantor one last question: Would you also be fine with changes to the cmake file? I would prefer this as opposed to adapting the configure script, if it would be fine for you. From what I can see, adapting the cmake file can be done with very little impact, whereas I am not so sure I can say the same for the configure script

scantor · 2020-05-08T15:27:27Z

It depends what your goal is. The CMake script is not the supported means of building anywhere but Windows, but if the changes you're adding aren't formally part of the build process for a normal release on the other platforms then I don't care that much if they're included.

viniul · 2020-05-22T13:28:52Z

I opened a jira ticket here:
https://issues.apache.org/jira/browse/XERCESC-2199

viniul · 2020-06-20T00:33:51Z

@scantor Anything we can do to improve the patch?

scantor · 2020-06-20T01:11:32Z

I haven't looked at it. I can't spend time until/unless something arises to force me to. You'd be best off trying to get @rleigh-codelibre to include it in the discussions about the master branch and doing a 4.0 release.

Once you missed the last patch window, that was really it for me until something else comes up.

viniul · 2020-06-20T10:19:37Z

CC @rleigh-codelibre As described above, this PR adds some fuzzing harnesses for xerces, which we (@bshastry and me) think would be nice to have in the master branch, because it allows continuous fuzzing via oss-fuzz and also makes it very easy for developers to fuzz-test their code prior to the release. Should I post this to the mailing list?

rleigh-codelibre · 2020-06-20T10:44:20Z

@viniul The addition of a test looks fairly low-risk in terms of wider impact on the project.

Has the necessary Apache ICLA and/or company CLA been done?

On the integration side, it would be nice to have this added to the CMakeLists.txt as for the other tests, and likewise to Makefile.am. If the fuzzing is long-running, I would suggest not adding to CTest so that it isn't part of the regular integration testing.

Add fuzzers

viniul · 2020-06-20T11:50:49Z

@rleigh-codelibre I updated the PR to also includes the changes for CMake. Is this sufficient to get the PR merge dfor now? I f so, let me create a new clean PR based on the current master branch of xeres-c

rleigh-codelibre · 2020-06-20T12:10:50Z

.gitignore

@@ -105,3 +105,6 @@ tests/scripts/*.log
 tests/scripts/*.trs
 tests/test-suite.log
 Testing/
+
+# Vincent
+build/


This change should be kept private. Please could you remove it from the PR?

rleigh-codelibre · 2020-06-20T12:12:02Z

CMakeLists.txt

@@ -177,8 +177,14 @@ install(
 # Process subdirectories
 add_subdirectory(doc)
 add_subdirectory(src)
-add_subdirectory(tests)
-add_subdirectory(samples)
+if (NOT (DEFINED XERCES_BUILD_FUZZERS))


Why are these specifically excluded? It might be nicer to only use the fuzzer-specific conditions on the fuzzer code itself.

Fuzzing requires some compile-time instrumentation of the xerces-library code and certain libraries that need to be linked against the instrumented code later on. Since this instrumentation is not needed for the other tests, because I thought this way it would be cleaner. Do you still want to me to change?

I am afraid I misunderstand your question though: Do you want just the if condition to change?

I would prefer that the fuzzer keep its conditionals self-contained if possible. That is, rather than disabling other unrelated features, it should only enable or disable the fuzzing functionality. If you want to disable building of the tests and/or samples, I think they should have separate options to allow them to be enabled or disabled independently.

You could then do -Dtests=OFF -Dsamples-OFF -Dfuzzer=ON to achieve the same effect.

Also, the extra compile-time stuff using ExternalProject looks separate from building the fuzzer itself. Maybe always build the fuzzer but have that additional stuff gated on the fuzzer option. Or gate all of it. But I would put that conditional stuff inside the fuzzer CMakeLists.txt rather than having any conditionals around include statements.

rleigh-codelibre · 2020-06-20T12:13:06Z

CMakeLists.txt

+  add_subdirectory(samples)
+endif()
+
+if ((DEFINED XERCES_BUILD_FUZZERS) OR (DEFINED XERCES_BUILD_FOR_OSS_FUZZ))


Please always include all subdirectories, and place the above conditionals around the logic inside the file. This ensures the logic is evaluated whatever the cmake options, and helps to prevent regressions.

Additionally, rather than use XERCES_BUILD_FUZZERS, why not define an option fuzzers with option() defaulting to off, then you can just use -Dfuzzers=ON?

Will do that, thanks!

rleigh-codelibre · 2020-06-20T12:16:33Z

build_fuzzer.sh

@@ -0,0 +1,8 @@
+#!/bin/bash


I don't see this file being used anywhere. Does it really belong in the source tree?

If it does belong, could it go into the scripts subdirectory alongside the CI scripts?

Not necessarily needed, but I thought it might be helpful if someone does not want to figure out the build options by themselves. Should it go into the scripts sub-directory or be completely removed?

rleigh-codelibre · 2020-06-20T12:18:04Z

cmake/XercesIntTypes.cmake

@@ -58,14 +58,14 @@ set(HAVE_OFF_T ${SIZEOF_OFF_T})
 set(HAVE_SIZE_T ${SIZEOF_SIZE_T})
 set(HAVE_SSIZE_T ${SSIZEOF_SSIZE_T})
 set(HAVE_WCHAR_T ${WCHAROF_WCHAR_T})
-if(SIZEOF_SIZE_T)
+if(HAVE_SIZEOF_SIZE_T)


I'm not sure where these changes have come from. Is this a reversion of an earlier change?

If the history is messy, it might be easiest to rebase all your changes onto the current master and start a new PR.

rleigh-codelibre · 2020-06-20T12:19:30Z

fuzzers/CMakeLists.txt

+
+# Definitions required for building
+add_definitions(
+  -DHAVE_CONFIG_H=1


If this is necessary, I think target_compile_definitions would be more appropriate, rather than setting it globally.

rleigh-codelibre · 2020-06-20T12:23:06Z

src/xercesc/util/XMLChar.cpp

@@ -8837,7 +8837,7 @@ XMLByte XMLChar1_1::fgCharCharsTable1_1[0x10000] =

 #include <stdio.h>

-static XMLCh gTmpCharTable[0xFFFF];


I'm unsure about these any other code changes, above. Are they intentional?

If they are, please could you open separate PRs for them so we can keep code fixes separate from the fuzzer addition.

viniul · 2020-06-20T12:27:05Z

@rleigh-codelibre Thank you very much for reviewing this PR. I will implement the suggested changes and open a new PR based on the current master.

rleigh-codelibre · 2020-06-20T12:28:27Z

@viniul I have done a preliminary review, please see the above comments.

Please could you also answer the question regarding the licensing. I see Google copyrights on some of the changes. And presumably there are your own and/or your employer's copyrights in addition. The necessary corporate and individual copyright licensing agreements must be submitted before we are able to merge your work.

viniul · 2020-06-20T12:36:36Z

@rleigh-codelibre We have not yet submitted the Apache licensing agreement - this is individual work by me and @bshastry, so I think if we both submit the ICLA this should be sufficient, correct?

The two Google copyrights came because we started those two files based on a Google template. The final file that will be committed was written by us though (in fact this was written by @bshastry, could you please confirm?) -- So I think we can just replace the Google copyright.

rleigh-codelibre · 2020-06-20T18:09:21Z

If all the work is individual and there is no corporate ownership of any of it, then if you both submit the needed ICLA that should be sufficient. Thanks.

viniul and others added 9 commits December 23, 2019 17:35

commit Fuzzer source code

ef33bdd

Fix xml proto converter issue.

78bc635

This commit fixes an issue with the xml proto converter, see the following PR: google/oss-fuzz#3167

Commit fuzzer source code.

9de52c0

This commit contains the source code for fuzzing harnesses written by Bhargava Shastry and Vincent Ulitzsch. The fuzzing harnesses are used for an integration into oss-fuzz

XERCESC-2194 - Windows/cmake fails due to undefined ssize_t

aacdef3

XERCESC-2189 - XMLChar with NEED_TO_GEN_TABLE has 2 buffer out of bou…

a91b38c

…nds reads

XERCESC-2187 / others - DOM tests crashing on unsupported platforms

d00c6a5

XERCESC-1963 - Custom HTTP headers missing with CURL NetAccessor

d8881d6

cmake: Correct shared library name and symlink creation

84395e5

Merge pull request apache#9 from rleigh-codelibre/cmake-libname-fixes

03fee10

cmake: Correct shared library name and symlink creation

viniul added 3 commits April 6, 2020 23:53

commit Fuzzer source code

5750949

Fix xml proto converter issue.

888a08b

This commit fixes an issue with the xml proto converter, see the following PR: google/oss-fuzz#3167

Merge branch 'add_fuzzers' of github.com:viniul/xerces-c into add_fuz…

01c5daf

…zers

Vincent Ulitzsch added 4 commits April 24, 2020 13:10

CMakefiles edit

1f43abf

Now use toolchain protoc

24b361a

Add XERCES_BUILD_FOR_OSS_FUZZ

f4c7bd7

Update CMakeLists.txt to only create other standalone binaries in cas…

dd72220

…e of no fuzzing

Correct compiling options for fuzzer

b9bc503

viniul and others added 2 commits June 20, 2020 13:39

Merge pull request apache#1 from viniul/add_fuzzers

0f48672

Add fuzzers

Merge branch 'master' of github.com:viniul/xerces-c-1

07fca5a

rleigh-codelibre suggested changes Jun 20, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Commit fuzzer source code. #2

Commit fuzzer source code. #2

viniul commented Jan 22, 2020

viniul commented Apr 4, 2020

scantor commented Apr 6, 2020

viniul commented Apr 6, 2020

scantor commented Apr 6, 2020

viniul commented Apr 6, 2020

scantor commented Apr 6, 2020

viniul commented Apr 6, 2020

scantor commented Apr 7, 2020

viniul commented May 8, 2020 •

edited

Loading

scantor commented May 8, 2020

viniul commented May 22, 2020

viniul commented Jun 20, 2020

scantor commented Jun 20, 2020

viniul commented Jun 20, 2020

rleigh-codelibre commented Jun 20, 2020

viniul commented Jun 20, 2020 •

edited

Loading

rleigh-codelibre Jun 20, 2020

rleigh-codelibre Jun 20, 2020

viniul Jun 20, 2020

viniul Jun 20, 2020

rleigh-codelibre Jun 20, 2020

rleigh-codelibre Jun 20, 2020

rleigh-codelibre Jun 20, 2020

viniul Jun 20, 2020

rleigh-codelibre Jun 20, 2020

viniul Jun 20, 2020

rleigh-codelibre Jun 20, 2020

rleigh-codelibre Jun 20, 2020

rleigh-codelibre Jun 20, 2020

viniul commented Jun 20, 2020

rleigh-codelibre commented Jun 20, 2020

viniul commented Jun 20, 2020 •

edited

Loading

rleigh-codelibre commented Jun 20, 2020

		@@ -8837,7 +8837,7 @@ XMLByte XMLChar1_1::fgCharCharsTable1_1[0x10000] =

		#include <stdio.h>

		static XMLCh gTmpCharTable[0xFFFF];

Commit fuzzer source code. #2

Are you sure you want to change the base?

Commit fuzzer source code. #2

Conversation

viniul commented Jan 22, 2020

viniul commented Apr 4, 2020

scantor commented Apr 6, 2020

viniul commented Apr 6, 2020

scantor commented Apr 6, 2020

viniul commented Apr 6, 2020

scantor commented Apr 6, 2020

viniul commented Apr 6, 2020

scantor commented Apr 7, 2020

viniul commented May 8, 2020 • edited Loading

scantor commented May 8, 2020

viniul commented May 22, 2020

viniul commented Jun 20, 2020

scantor commented Jun 20, 2020

viniul commented Jun 20, 2020

rleigh-codelibre commented Jun 20, 2020

viniul commented Jun 20, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

viniul commented Jun 20, 2020

rleigh-codelibre commented Jun 20, 2020

viniul commented Jun 20, 2020 • edited Loading

rleigh-codelibre commented Jun 20, 2020

viniul commented May 8, 2020 •

edited

Loading

viniul commented Jun 20, 2020 •

edited

Loading

viniul commented Jun 20, 2020 •

edited

Loading