Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize sourcemap processing #1617

Closed

Conversation

OlivierNicole
Copy link
Contributor

@OlivierNicole OlivierNicole commented May 23, 2024

This improves the efficiency of sourcemap processing, chiefly during linking, but also in serializing and deserializing sourcemaps.

Currently, when linking with source maps enabled, js_of_ocaml will decode the source maps for each input file completely into an in-memory explicit mapping, then merge them, then re-encode the merged sourcemap. The merging algorithm involves superlinear sorting operations. However, the format of sourcemaps allows to avoid sorting for the concatenation of generated files.

In addition, all operations performed by linking (concatenation of files, removal and addition of lines) can be performed directly on the encoded form, which further avoids a full parsing to an in-memory representation.

When compiling js_of_ocaml with itself, the final linking step is accelerated by a speedup of about 2.3x with OCaml 5.2.0+fp:

recent trunk (bfbeb69) this PR
331.8 ms ± 6.1 ms 143.3 ms ± 1.1 ms

fix #1446

@OlivierNicole
Copy link
Contributor Author

OlivierNicole commented May 23, 2024

I can’t seem to reproduce the CI failure:

File "toplevel/examples/lwt_toplevel/dune", line 2, characters 8-16:
2 |  (names toplevel)
            ^^^^^^^^
(cd _build/default/toplevel/examples/lwt_toplevel && ../../../compiler/bin-js_of_ocaml/js_of_ocaml.exe link --source-map-inline -o toplevel.bc.js .toplevel.eobjs/jsoo/toplevel.bc.runtime.js ../../../.js/!effects+toplevel/stdlib/stdlib.cma.js ../../../.js/!effects+toplevel/compiler-libs.common/ocamlcommon.cma.js ../../../.js/!effects+toplevel/compiler-libs.bytecomp/ocamlbytecomp.cma.js ../../../.js/!effects+toplevel/menhirLib/menhirLib.cma.js ../../../.js/!effects+toplevel/gen/gen.cma.js ../../../.js/!effects+toplevel/sedlex/sedlex.cma.js ../../../.js/!effects+toplevel/yojson/yojson.cma.js ../../../compiler/lib/.js_of_ocaml_compiler.objs/jsoo/!effects+toplevel/js_of_ocaml_compiler.cma.js ../../../lib/runtime/.jsoo_runtime.objs/jsoo/!effects+toplevel/jsoo_runtime.cma.js ../../../lib/js_of_ocaml/.js_of_ocaml.objs/jsoo/!effects+toplevel/js_of_ocaml.cma.js ../../../.js/!effects+toplevel/uutf/uutf.cma.js ../../../.js/!effects+toplevel/re/re.cma.js ../../../.js/!effects+toplevel/tyxml.functor/tyxml_f.cma.js ../../../.js/!effects+toplevel/tyxml/tyxml.cma.js ../../../.js/!effects+toplevel/react/react.cma.js ../../../.js/!effects+toplevel/reactiveData/reactiveData.cma.js ../../../lib/tyxml/.js_of_ocaml_tyxml.objs/jsoo/!effects+toplevel/js_of_ocaml_tyxml.cma.js ../../../compiler/lib-dynlink/.js_of_ocaml_compiler_dynlink.objs/jsoo/!effects+toplevel/js_of_ocaml_compiler_dynlink.cma.js ../../../.js/!effects+toplevel/compiler-libs.toplevel/ocamltoplevel.cma.js ../../lib/.js_of_ocaml_toplevel.objs/jsoo/!effects+toplevel/js_of_ocaml_toplevel.cma.js ../../../.js/!effects+toplevel/lwt/lwt.cma.js ../../../lib/lwt/.js_of_ocaml_lwt.objs/jsoo/!effects+toplevel/js_of_ocaml_lwt.cma.js ../../../.js/!effects+toplevel/graphics/graphics.cma.js ../../../lib/deriving_json/.js_of_ocaml_deriving.objs/jsoo/!effects+toplevel/js_of_ocaml_deriving.cma.js ../../../.js/!effects+toplevel/str/str.cma.js ../../../.js/!effects+toplevel/dynlink/dynlink.cma.js ../../../lib/lwt/graphics/.graphics_js.objs/jsoo/!effects+toplevel/graphics_js.cma.js ../../../.js/!effects+toplevel/ocaml-compiler-libs.common/ocaml_common.cma.js ../../../.js/!effects+toplevel/ppxlib.astlib/astlib.cma.js ../../../.js/!effects+toplevel/stdlib-shims/stdlib_shims.cma.js ../../../.js/!effects+toplevel/ppxlib.ast/ppxlib_ast.cma.js ../../../.js/!effects+toplevel/ocaml-compiler-libs.shadow/ocaml_shadow.cma.js ../../../.js/!effects+toplevel/ppxlib.print_diff/ppxlib_print_diff.cma.js ../../../.js/!effects+toplevel/ppx_derivers/ppx_derivers.cma.js ../../../.js/!effects+toplevel/ppxlib.traverse_builtins/ppxlib_traverse_builtins.cma.js ../../../.js/!effects+toplevel/sexplib0/sexplib0.cma.js ../../../.js/!effects+toplevel/ppxlib.stdppx/stdppx.cma.js ../../../.js/!effects+toplevel/ppxlib/ppxlib.cma.js ../../../ppx/ppx_js/as-lib/.ppx_js.objs/jsoo/!effects+toplevel/ppx_js.cma.js ../../../ppx/ppx_js/lib/.ppx_js_rewriter.objs/jsoo/!effects+toplevel/ppx_js_rewriter.cma.js .toplevel.eobjs/jsoo/dune__exe.cmo.js .toplevel.eobjs/jsoo/dune__exe__B64.cmo.js .toplevel.eobjs/jsoo/dune__exe__Colorize.cmo.js .toplevel.eobjs/jsoo/dune__exe__Graphics_support.cmo.js .toplevel.eobjs/jsoo/dune__exe__Ocp_indent.cmo.js .toplevel.eobjs/jsoo/dune__exe__Indent.cmo.js .toplevel.eobjs/jsoo/dune__exe__Ppx_support.cmo.js .toplevel.eobjs/jsoo/dune__exe__Toplevel.cmo.js ../../../.js/!effects+toplevel/stdlib/std_exit.cmo.js --linkall)
../../../compiler/bin-js_of_ocaml/js_of_ocaml.exe: You found a bug. Please report it at https://github.com/ocsigen/js_of_ocaml/issues :
Error: File "compiler/lib/source_map_io.yojson.ml", line 109, characters 12-18: Assertion failed

This test builds fine on my machine.

Edit: never mind, it was an off-by-one error in an assertion.

@OlivierNicole
Copy link
Contributor Author

One test is failing because reading “index maps” (composite source maps) is not supported for now—something that is not required for js_of_ocaml to function. I will fix this test soon.

@rickyvetter
Copy link
Contributor

I just tested this change on an incremental build for one of our largest executables and am getting this timing for js_of_ocaml link.

Pre: 2.469s
Post: 1.301s

Very cool!

@OlivierNicole
Copy link
Contributor Author

This is ready for review. I don’t quite understand the build problems on the 5.1 CI, but they don’t seem related to this PR.

@OlivierNicole OlivierNicole force-pushed the optim_sourcemap_link branch 5 times, most recently from 01f15f7 to fb76558 Compare June 11, 2024 10:16
@OlivierNicole
Copy link
Contributor Author

My bad, it actually was due to a missing :standard in a flags field of a Dune stanza. The CI passes now.

compiler/lib/source_map.ml Outdated Show resolved Hide resolved
compiler/lib/source_map.ml Outdated Show resolved Hide resolved
compiler/lib/source_map.ml Outdated Show resolved Hide resolved
compiler/lib/source_map.ml Outdated Show resolved Hide resolved
compiler/lib/js_output.ml Outdated Show resolved Hide resolved
compiler/lib/link_js.ml Outdated Show resolved Hide resolved
compiler/lib/source_map.ml Outdated Show resolved Hide resolved
compiler/lib/source_map.ml Outdated Show resolved Hide resolved
@OlivierNicole
Copy link
Contributor Author

Is anything blocking this PR still?

Copy link
Member

@hhugo hhugo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed some minor cleanup on the branch directly.

I think this goes in the right direction.

Changes would be easier to digest if they were in separate commits.

My understanding is that the Line_edits change is what gives use the most part of the improvement. Could we achieve similar perf using sections (Index sourcemap) ?

compiler/tests-compiler/sourcemap.ml Outdated Show resolved Hide resolved
compiler/tests-sourcemap/dump_sourcemap.ml Outdated Show resolved Hide resolved
compiler/lib/source_map.ml Outdated Show resolved Hide resolved
compiler/lib/source_map.ml Outdated Show resolved Hide resolved
compiler/lib/source_map_io.yojson.ml Outdated Show resolved Hide resolved
compiler/lib/source_map_io.yojson.ml Outdated Show resolved Hide resolved
compiler/lib/source_map_io.yojson.ml Outdated Show resolved Hide resolved
compiler/lib/source_map_io.yojson.ml Outdated Show resolved Hide resolved
compiler/lib/source_map.mli Outdated Show resolved Hide resolved
@OlivierNicole
Copy link
Contributor Author

OlivierNicole commented Jul 30, 2024

I’m trying to answer some of your comments but my answers are labelled as “pending”, I’m not sure why.

@OlivierNicole
Copy link
Contributor Author

Never mind, apparently I had started a review at some point in the past and never completed it, and apparently Github considers all your comments henceforth as pending

@OlivierNicole
Copy link
Contributor Author

Just to make sure we are on the same page, my intention is to wait for #1640 to be merged, rebase on it, and then address the rest of your new review (thanks!).

@hhugo
Copy link
Member

hhugo commented Aug 2, 2024

Just to make sure we are on the same page, my intention is to wait for #1640 to be merged, rebase on it, and then address the rest of your new review (thanks!).

What do you think about the following, on top of #1640.

@OlivierNicole
Copy link
Contributor Author

drop the indexed sourcemap for now (it seems it's not used)

It is used, to represent the sourcemap of the linker output (in Link_js).

@hhugo
Copy link
Member

hhugo commented Sep 4, 2024

#1640 has been merged, I'll let you rebase this one

@OlivierNicole
Copy link
Contributor Author

Done

@hhugo
Copy link
Member

hhugo commented Sep 5, 2024

I'm not very happy with the callback based approach used on Link_js. Without thinking much about it, I would assume that one could reconstruct the edit instruction inside let copy ic oc = ... only.

Maybe we could move the handling/parsing of Sourcemap index into a separate PR to remove some noise in here.

@hhugo
Copy link
Member

hhugo commented Sep 5, 2024

Regarding the representations of edits. We currently have per line instructions. Have you considered using a different approach (closer to what's done on master with reloc) using "copy n lines starting at pos n into position m" ?

@OlivierNicole
Copy link
Contributor Author

I considered it, but at the time discarded it as early optimisation. Now that the main source of slowdown as been eliminated, it might be worth trying.

@OlivierNicole
Copy link
Contributor Author

Regarding the representations of edits. We currently have per line instructions. Have you considered using a different approach (closer to what's done on master with reloc) using "copy n lines starting at pos n into position m" ?

Actually, what I had considered was to have the edits represented as a list of “copy / delete n lines” rather than a long list of “copy / delete 1 line”.

What you suggest would make Source_map.Mappings.edit no longer linear. In your suggestion, some sorting is involved, whereas in what is currently done, the destination line is implicitly represented by the position of the edit in the list of edits.

@hhugo
Copy link
Member

hhugo commented Oct 15, 2024

My understanding is that the Line_edits change is what gives use the most part of the improvement. Could we achieve similar perf using sections (Index sourcemap) ?

I hoped so initially, but it turns out that Link_js may remove or add metadata comments in the middle of a compilation unit, which pretty much killed my hope of simply using index maps.

Can we maybe revisit this approach. Maybe we could teach link_js to copy blocks of lines that do not mess up with sourcemap ? If we really want to drop comments, we could emit an empty line instead. We would probably need to emit Index map when compiling cma files so that we have one sourcemap per unit and we can easily drop dead units when linking.

@hhugo
Copy link
Member

hhugo commented Oct 17, 2024

I've opened #1715 instead. It removes the need to change encoded mappings.

@hhugo
Copy link
Member

hhugo commented Oct 22, 2024

Closing in favor of #1715

@hhugo hhugo closed this Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE REQUEST] Use source-map sections in js_of_ocaml link command.
3 participants