Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more stuff to JSON file #7757

Merged
merged 1 commit into from
Feb 2, 2018
Merged

Add more stuff to JSON file #7757

merged 1 commit into from
Feb 2, 2018

Conversation

marler8997
Copy link
Contributor

@marler8997 marler8997 commented Jan 20, 2018

This PR is based on recommendation from @WalterBright (#7746 (comment)). In order for rdmd to support single-invocation, we need a way to get the information it currently gets from "verbose output", but without losing error messages and other stdout messages. Walter recommended rdmd switch to using the JSON file that dmd generates. Currently the JSON file is missing most of the information that rdmd requires, so this PR is adding that information to it.

Most of the information needed by rdmd is "semantic data" whereas the JSON file seems to contain "syntax data". Because of this, I've added the "semantic data" to the end of the JSON file in it's own object called semantics. Here's a list of the new additions and the current status:

  • non-root imported modules (ADDED)
  • file content imports (i.e. import("somefile")) (ADDED)

Along with semantic data, there is some build information that rdmd uses. I've put this in the beginning of the JSON file in an object named "buildInfo". It contains the following:

  • target library file if -lib is used (ADDED)
  • the compiler config file (ADDED)
  • the compiler binary file itself (ADDED)

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @marler8997! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

  • smaller, focused PRs are easier to review than big ones

  • try not to mix up refactoring or style changes with bug fixes or feature enhancements

  • provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@marler8997
Copy link
Contributor Author

Hmmm, this is going to make testing more difficult. Now that the JSON file contains semantic data, it's contents are different for each platform, and also changes depending on the order that files were passed to the compiler.

@timotheecour
Copy link
Contributor

timotheecour commented Jan 20, 2018 via email

@marler8997
Copy link
Contributor Author

Based on the code, I don't anticipate json code generation having a big performance impact. However, if it does then filtering can always be added later as an after thought. Maybe:

-X (enable json code generation)
-Xf=<file> (set file to output json to)
-Xo=-functions  (An example interface to filter out function from json code generation)

@marler8997 marler8997 force-pushed the extendJson branch 6 times, most recently from 479722a to 9a297d6 Compare January 20, 2018 20:20
@CyberShadow
Copy link
Member

1 quick sanity check: seeing whether on a large project, adding -Xf=file slows down compilation (or consumes more memory ?) than without.

I checked on ae (dmd -o- -deps -X all.d where all.d is generated by makejson.sh), and saw no measurable impact on execution time and a negligible (200 KB) increase in memory usage, so using the JSON output looks pretty promising!

@CyberShadow
Copy link
Member

-Xo=-functions (An example interface to filter out function from json code generation)

BTW, if we ever do need to add that, it should probably be a list of things to include rather than a list of things to exclude. Tools that would make use of this option will know exactly what they want, so any information that's added later will not be useful to them.

@marler8997
Copy link
Contributor Author

yeah good point

@marler8997
Copy link
Contributor Author

@WalterBright We may want to try to give this PR some attention early on because if this PR along with the changes to rdmd don't get in before the next release, then we will have an extra use case to handle, namely, the case where you have a compiler that supports -i but doesn't generate the JSON output that is needed. I'd like to avoid that if possible.

{
objectStart();
if(m.md)
property("name", m.md.toChars());
Copy link
Contributor

@timotheecour timotheecour Jan 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about:
property("name", m.md ? m.md.toChars() : "");
to make each entry have each field? (and update https://github.com/dlang/tools/pull/292/files#diff-e6f94050037e7c9a64684829764b1f0bR832 accordingly)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following the convention established here (https://github.com/dlang/dmd/pull/7757/files/a90c983d6e436e46dde3b1ee07fdd280ea1396e7#diff-1d73c979f233c43e47fb88fac32c21b3L475)

I don't really see much difference whether or not there is an empty name or no name at all, but following the existing convention seems the better choice.

objectStart();
if(m.md)
property("name", m.md.toChars());
property("file", m.srcfile.toChars());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about absolutePath ( m.srcfile.toChars() ) so that if rdmd is called from a different directory, it'll still work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmmm, interesting thought. I'd have to think about that one. This is mimicking the current functionality from the verbose output, but maybe it can be improved. I'll come back to this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: "contentImports" appears to use absolutePath (i tried with -J. and import("foo.txt") ), so I would make sense to be consistent. I suggest using absolutePath for file as well (and wherever else it makes sense)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that would be changing behavior though...we could end up with a regression if we change the behavior.

Copy link
Contributor

@timotheecour timotheecour Jan 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, that info wasn't in json output before... so it's one of those gray area things.
one option is to allow configuring json output (as I suggested somewhere else), which is OK if we have good (easily extensible) way to customize, eg:

-Xf=file -Xopt=abspaths,imports,-functions
each option has a corresponding - flag to negate it, just like you did for -i=core,-std

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think about it, there might be a bug in the current design. If you build with rdmd, and then you change directories and build again, then all the relative module filenames will be incorrect. It might just force you to do a rebuild, but that's no good. We could fix this using absolute paths, or we could save off the original CWD that rdmd was using when it first performed the build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a test for that in the rdmd test suite, FWIW.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there may even be corner cases where it picks up wrong file eg:

ls
main.d
fun.d
temp/fun.d // slightly different version main.d

rdmd main.d
cd temp
rdmd ../main.d
=> can it get confused and get the wrong fun.d?

Here's a good compromise:

add a CWD field, and when rdmd reads a path (any path) call buildPath(json.CWD, json.path) (does the right thing w absolute paths)

and allow for:
rdmd main.d
cd temp
rdmd ../main.d => should reuse cache and not rebuild

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we should definitely make behavior consistent between contentImports and modules.file:
import("fun.txt") should not get auto-transformed in an absolute path either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that the JSON file saves the CWD in the new buildInfo object.

src/dmd/json.d Outdated
arrayStart();
foreach (file; m.contentImportedFiles)
{
value(file);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BUG! missing separaters in generated JSON
"contentImports" : [
"/tmp/myfile.txt""/tmp/myfile.txt"
]

{
"kind" : "buildInfo",
"binary" : "../generated/linux/release/64/dmd",
"version" : "v2.078.1-beta.1-494-g479722a-dirty",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to have frontendVersion here too.

Copy link
Member

@wilzbach wilzbach Jan 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also you can't check for neither the version nor binary in the testsuite - they are supposed to change ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are "grepped" out along with some other types of lines (see json-postscript.sh). I thought keeping an example in the actual expected results didn't hurt though.

Copy link
Member

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to generate the JSON file?
It makes future updates of the JSON file more difficult as the line numbers on the diff won't reflect the ones in the generated script. Also it's easy to screw up the JSON generation. How about moving the things that are platform-specific into a different test?

@marler8997 marler8997 force-pushed the extendJson branch 7 times, most recently from 4f8dfc7 to 3e18689 Compare January 27, 2018 22:17
@marler8997
Copy link
Contributor Author

marler8997 commented Jan 30, 2018

@andralex has determined that adding policies to std.json is not a desirable enhancement (dlang/phobos#6059 (comment)). Given that now we can't use std.json to sanitize the file, using the provided JSON parser seems like a solid option to me.

@CyberShadow
Copy link
Member

Given that now we can't use std.json to sanitize the file,

Sorry, I don't think we finished the previous conversation. Why not use std.json, and rely on its current behavior of sorting key names lexicographically? Won't that allow achieving the same goal without adding a JSON parser to the DMD repository?

@marler8997
Copy link
Contributor Author

That would work "functionally". The only drawback is that the two files won't be in the same order that was generated from DMD, which would look odd. The order of the fields as generated by DMD makes sense, i.e. name first, followed by other identifying properties, followed by more verbose fields such as large objects. Would be nice to keep it in the same order. I refuse to accept that sanitizing a JSON file while preserving the original object field order is an insurmountable task in D :)

@CyberShadow
Copy link
Member

The only drawback is that the two files won't be in the same order that was generated from DMD, which would look odd.

IMHO that's fine, considering the circumstances, but it's not up to me to decide.

I refuse to accept that sanitizing a JSON file while preserving the original object field order is an insurmountable task in D :)

Certainly not insurmountable, but it may be worth keeping our feet on the ground in this case. It's just a test case file that hopefully won't be edited / looked at often at all, so maybe just go for the simplest solution.

@marler8997
Copy link
Contributor Author

marler8997 commented Jan 30, 2018

Certainly not insurmountable, but it may be worth keeping our feet on the ground in this case. It's just a test case file that hopefully won't be edited / looked at often at all, so maybe just go for the simplest solution.

Yes I see your point. I'll concede this and go with your solution. Be prepared for a large json.out diff :)

@marler8997
Copy link
Contributor Author

Ok I caved, the JSON sanitizer uses std.json so all the object fields are sorted alphabetically.

@marler8997
Copy link
Contributor Author

Ping. Trying to get this PR finalized and integrated so we have some time before the 2.079 release to modify rdmd to use it (dlang/tools#292).

@JinShil
Copy link
Contributor

JinShil commented Feb 2, 2018

Restarted Jenkins

Copy link
Member

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but please add a short changelog entry.

*fileNode = JSONValue(normalizeFile(fileNode.str));
newModules.put(JSONValue(semanticModule));
}
*modulesArrayPtr = newModules.data;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holy moly, is std.json ugly. We really need to put a warning in this module to prevent people from using it.

@@ -1,930 +1,976 @@
[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI for the future: typically automatically run commands are separated in different commits, s.t. reviewers don't have to look at them too closely. In this case, GH hides the file anyways.

else if (name == "offset")
removeNumber(&value.object[name]);
else
sanitizeSyntaxNode(value.object[name]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of this (I know that's how the status quo is).
Ideally this would be configurable via a flag, s.t. special platform-specific, short tests which test that the deco is correctly set in the json output can be added, but that's definitely not something which shouldn't block this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I added a --keep-deco flag that can be used by tests in the future.

@wilzbach wilzbach added the Needs Changelog A changelog entry needs to be added to /changelog label Feb 2, 2018
@marler8997 marler8997 force-pushed the extendJson branch 2 times, most recently from 1471901 to dc7fb7a Compare February 2, 2018 16:40
@marler8997
Copy link
Contributor Author

Added the changelog

@wilzbach wilzbach removed the Needs Changelog A changelog entry needs to be added to /changelog label Feb 2, 2018
Copy link
Member

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for putting all the hard work into this!
I'm going to merge this now to unblock your work on rdmd - could you do a quick follow up to cover the uncovered output of the imported files? Thanks!

foreach (file; m.contentImportedFiles)
{
item(file);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't covered.

@wilzbach wilzbach merged commit cca2a1e into dlang:master Feb 2, 2018
}
else
{
writeln("Error: too many command line arguments");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stderr?

@marler8997 marler8997 deleted the extendJson branch February 6, 2018 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants