Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiler: modernize js parser #1391

Merged
merged 7 commits into from
Feb 1, 2023
Merged

Compiler: modernize js parser #1391

merged 7 commits into from
Feb 1, 2023

Conversation

hhugo
Copy link
Member

@hhugo hhugo commented Jan 22, 2023

The PR modernize to JavaScript support inside js_of_ocaml.

The PR started with a synchronization of the jsoo parser with what's found in semgrep
(see https://github.com/returntocorp/semgrep/blob/develop/languages/javascript/menhir/parser_js.mly)

The JS ast was modified to accommodate the new syntaxes.

The PR doesn't change the syntax used during code generation.
Here are a few ideas for the future:

  • use arrow function when possible the reduce generated code size
  • use let for mutable var, instead of using IIFE

New syntax added in the PR

  • destructing var [x,y] = [1,2]
  • let and const and their local scoping
  • spread ..., in expression and patterns
  • arrow (args) => exp
  • aync/await
  • yield and generator
  • Bigint
  • Nullish coalescing ??/.?
  • ForOf
  • get/set/computed property for object literal
  • string templates
  • class
  • new.target
  • super

Syntax not included in this PR

  • import/export
  • with (probably won't fix given it's deprecated)

X-links

fix #508
fix #972
fix #989
fix #1017
fix #1031

unlock #1161
Replace #1231

Checklist

  • Update ast
  • fix scope of let and const
  • Update parser
  • Update parser error recovery to handle ( ... ) => ..
  • Check/fix precedence in js_output
  • we need better tests for parser/printer (https://github.com/tc39/test262-parser-tests)
  • fix variable rewriting for function in object_literal

@hhugo hhugo changed the title Compiler: modernize js ast Compiler: modernize js parser Jan 22, 2023
@hhugo hhugo force-pushed the modern-js branch 5 times, most recently from 77f06f9 to b3c1caf Compare January 24, 2023 16:26
@hhugo
Copy link
Member Author

hhugo commented Jan 25, 2023

@vouillon, the PR is not ready yet but is in an "ok" state if you omit commit history.
Do you think you would be able to help with reviewing and fixing precedence in js_output.ml ?

compiler/lib/js_output.ml Outdated Show resolved Hide resolved
@vouillon
Copy link
Member

@vouillon, the PR is not ready yet but is in an "ok" state if you omit commit history. Do you think you would be able to help with reviewing and fixing precedence in js_output.ml ?

I have pushed for fixes

@hhugo
Copy link
Member Author

hhugo commented Jan 27, 2023

@vouillon, the PR is not ready yet but is in an "ok" state if you omit commit history. Do you think you would be able to help with reviewing and fixing precedence in js_output.ml ?

I have pushed for fixes

Thnaks, I've pushed another update to the parser.

@hhugo
Copy link
Member Author

hhugo commented Jan 27, 2023

Parsing https://github.com/tc39/test262-parser-tests/tree/master/pass

skip : 254
fail : 62
pass : 1667

Most of the remain errors are due to keyword that are meant to be accepted as ident.

failure details ``` failed to parse ../test262-parser-tests/pass/ffaf5b9d3140465b.js:1:3 let❌()

failed to parse ../test262-parser-tests/pass/fe2d3b945530c806.js:1:2
a
❌b

failed to parse ../test262-parser-tests/pass/fbcd793ec7c82779.js:1:0
❌<!-- foo

failed to parse ../test262-parser-tests/pass/fa59ac4c41d26c14.js:1:5
({let❌})

failed to parse ../test262-parser-tests/pass/f4a61fcdefebb9d4.js:1:32
var private, protected, public, ❌static

failed to parse ../test262-parser-tests/pass/f2e41488e95243a8.js:1:5
var [❌let] = a;

failed to parse ../test262-parser-tests/pass/f108a85d36ec9afc.js:1:16
({ *a() { yield ❌super.a(); } })

failed to parse ../test262-parser-tests/pass/f0fbbdabdaca2146.js:1:4
var ❌let;

failed to parse ../test262-parser-tests/pass/f0d9a7a2f5d42210.js:1:4
let ❌+ 1

failed to parse ../test262-parser-tests/pass/ee4e8fa6257d810a.js:1:3
let❌++

failed to parse ../test262-parser-tests/pass/e6643a557fe93de0.js:1:7
({yield❌} = 1);

failed to parse ../test262-parser-tests/pass/e5393f15b0e8585d.js:1:5
({let❌} = 1);

failed to parse ../test262-parser-tests/pass/e03ae54743348d7d.js:1:2
--❌> comment

failed to parse ../test262-parser-tests/pass/df696c501125c86f.js:1:7
for(let❌.let in 1);

failed to parse ../test262-parser-tests/pass/d3ac25ddc7ba9779.js:1:0
❌<!--

failed to parse ../test262-parser-tests/pass/d22f8660531e1c1a.js:1:4
var ❌static;

failed to parse ../test262-parser-tests/pass/c8565124aee75c69.js:1:10
for (a of ❌let) {}

failed to parse ../test262-parser-tests/pass/c532e126a986c1d4.js:2:2
;
--❌> HTML comment

failed to parse ../test262-parser-tests/pass/c442dc81201e2b55.js:1:7
for(let❌;;);

failed to parse ../test262-parser-tests/pass/c06df922631aeabc.js:1:13
if (1); else ❌function a(){}

failed to parse ../test262-parser-tests/pass/ba21e63736d8fd46.js:1:3
a: ❌function a(){}

failed to parse ../test262-parser-tests/pass/ba00173ff473e7da.js:1:26
/* block comment */ --❌> comment

failed to parse ../test262-parser-tests/pass/b8c98b5cd38f2bd9.js:1:4
({a:❌let} = 1);

failed to parse ../test262-parser-tests/pass/b563460c1031daf2.js:1:9
({ a() { ❌super.b(); } });

failed to parse ../test262-parser-tests/pass/b15ab152f8531a9f.js:1:0
❌<!-- comment

failed to parse ../test262-parser-tests/pass/aa7e721756949024.js:1:14
({ set a(b) { ❌super.c[1] = 2; } });

failed to parse ../test262-parser-tests/pass/a830df7cf2e74c9f.js:1:13
({ get a() { ❌super[1] = 2; } });

failed to parse ../test262-parser-tests/pass/a5aaa3992025795a.js:1:8
({yield ❌= 1} = 2);

failed to parse ../test262-parser-tests/pass/a4d62a651f69d815.js:1:7
if (a) ❌function b(){}

failed to parse ../test262-parser-tests/pass/a1594a4d0c0ee99a.js:1:13
for (let a = ❌let;;) {}

failed to parse ../test262-parser-tests/pass/9fe1d41db318afba.js:1:8
for(let ❌in 1);

failed to parse ../test262-parser-tests/pass/9f0d8eb6f7ab8180.js:2:4
/*
*/--❌>

failed to parse ../test262-parser-tests/pass/9aa93e1e417ce8e3.js:1:9
for (let ❌in a) {}

failed to parse ../test262-parser-tests/pass/946bee37652a31fa.js:1:21
/* block comment */--❌> comment

failed to parse ../test262-parser-tests/pass/8ec6a55806087669.js:2:3
a = b-->1;
--❌> nothing

failed to parse ../test262-parser-tests/pass/884e5c2703ce95f3.js:1:10
({ *a() { ❌super.b = 1; } });

failed to parse ../test262-parser-tests/pass/8462f068b299bca2.js:1:8
var {let❌, yield} = 1;

failed to parse ../test262-parser-tests/pass/818ea8eaeef8b3da.js:1:1
(❌let[let])

failed to parse ../test262-parser-tests/pass/7c6d13458e08e1f4.js:1:3
01.❌a

failed to parse ../test262-parser-tests/pass/6b36b5ad4f3ad84d.js:1:3
let❌.let = a

failed to parse ../test262-parser-tests/pass/6815ab22de966de8.js:1:7
for(let❌();;);

failed to parse ../test262-parser-tests/pass/660f5a175a2d46ac.js:1:1
(❌let[a])

failed to parse ../test262-parser-tests/pass/65401ed8dc152370.js:1:4
var ❌let = 1

failed to parse ../test262-parser-tests/pass/6498dcc494193cb4.js:2:2
{1
❌3
4
5}

failed to parse ../test262-parser-tests/pass/63c92209eb77315a.js:1:4
var ❌let

failed to parse ../test262-parser-tests/pass/6196b3f969486455.js:1:0
❌yield => 1

failed to parse ../test262-parser-tests/pass/5ecbbdc097bee212.js:1:10
for (a in ❌let) {}

failed to parse ../test262-parser-tests/pass/5d5b9de6d9b95f3e.js:1:2
--❌>

failed to parse ../test262-parser-tests/pass/5a2a8e992fa4fe37.js:1:5
--❌> comment

failed to parse ../test262-parser-tests/pass/59ae0289778b80cd.js:1:7
if (1) ❌function a(){} else;

failed to parse ../test262-parser-tests/pass/56e2ba90e05f5659.js:1:3
let❌;

failed to parse ../test262-parser-tests/pass/5654d4106d7025c2.js:1:3
let

failed to parse ../test262-parser-tests/pass/52aeec7b8da212a2.js:1:7
if (1) ❌function a(){}

failed to parse ../test262-parser-tests/pass/4f731d62a74ab666.js:1:1
(❌let[a]=b)

failed to parse ../test262-parser-tests/pass/4f5419fe648c691b.js:1:4
--❌>

failed to parse ../test262-parser-tests/pass/4ae32442eef8a4e0.js:1:0
❌<!--
;

failed to parse ../test262-parser-tests/pass/3dabeca76119d501.js:1:25
try {} catch (a) { if(1) ❌function a(){} }

failed to parse ../test262-parser-tests/pass/2ef5ba0343d739dc.js:1:3
let❌.let

failed to parse ../test262-parser-tests/pass/1c1e2a43fe5515b6.js:1:7
if (1) ❌function a(){} else function b(){}

failed to parse ../test262-parser-tests/pass/14199f22a45c7e30.js:1:4
let ❌= 1;

failed to parse ../test262-parser-tests/pass/1270d541e0fd6af8.js:1:0
❌<!-- HTML comment

failed to parse ../test262-parser-tests/pass/0cf1df0ef867a7f4.js:1:7
({yield❌})

</details>

@hhugo
Copy link
Member Author

hhugo commented Jan 30, 2023

@pmwhite, yet another PR that would happily receive some testing on your side

@pmwhite
Copy link
Contributor

pmwhite commented Jan 30, 2023

Just tested. Some valid javascript like the following got rejected:

var y = { async: 35}
              ^

at the c character, presumably because the parser doesn't think async is a valid identifier, since it is a keyword.

@hhugo
Copy link
Member Author

hhugo commented Jan 30, 2023

var y = { async: 35}
^

I've just pushed a fix

@hhugo hhugo marked this pull request as ready for review January 31, 2023 09:30
@pmwhite
Copy link
Contributor

pmwhite commented Jan 31, 2023

Okay, the file that was failing to parse before now succeeds, but the code generated for that file in the output is incorrect. I'm not sure how much to include, but I think the relevant part of the input and output files is this translation:

Input:

{key:"from",
 value:
   function from(field,get)
     {if(!get)get=function get(x){return x;};
      return this.compute([field],function(state){return get(state.field(field));});}}

Output:

{key:"from",
 value:
   function get(field,get)
     {if(! get)get = function(x){return x};
      return this.compute([field],function(state){return get(state.field(field))})}}

The bug manifests itself at runtime with an error like "Stuff.get is not a function"; in the original input file, however, the code calls Stuff.from instead of get, so I guess there is a translation happening at both the caller and callee; I don't know which one is the incorrect one.

@hhugo
Copy link
Member Author

hhugo commented Jan 31, 2023

@pmwhite, bug fixed by a153a3a

@hhugo
Copy link
Member Author

hhugo commented Jan 31, 2023

I've added some support for class, only leaving import and export unsupported.

@pmwhite
Copy link
Contributor

pmwhite commented Jan 31, 2023

Oh, that's awesome. Just tested again on the new commits and found no issues.

@hhugo hhugo merged commit f25cbd8 into master Feb 1, 2023
@hhugo hhugo deleted the modern-js branch February 1, 2023 00:47
@hhugo hhugo restored the modern-js branch February 1, 2023 10:16
@hhugo hhugo deleted the modern-js branch February 1, 2023 10:20
@TyOverby
Copy link
Collaborator

Parsing https://github.com/tc39/test262-parser-tests/tree/master/pass

@hhugo could we pull this into the repo and have them build automatically as part of the test suite?

@hhugo
Copy link
Member Author

hhugo commented Feb 15, 2023

I think so, but work is needed to have good integration.

@tpetr
Copy link

tpetr commented Feb 24, 2023

@hhugo thanks so much for this PR! Could you publish a new release to opam so that I can unblock myself in semgrep/semgrep#7220?

@hhugo
Copy link
Member Author

hhugo commented Feb 25, 2023

@tpetr, there is at least one more PR that I'd like include in the next release.
See https://github.com/ocsigen/js_of_ocaml/milestone/10.

hhugo pushed a commit to hhugo/opam-repository that referenced this pull request Mar 7, 2023
…s_of_ocaml-ppx_deriving_json, js_of_ocaml-ppx, js_of_ocaml-lwt and js_of_ocaml-compiler (5.1.0)

CHANGES:

## Features/Changes
* Lib: Added support for KeyboardEvent.getModifierState
* Misc: bump min ocaml version to 4.08
* Misc: remove some old runtime files to support some external libs
* Misc: switch to dune 3.7
* Effects: partial CPS transformation, resulting in much better performances, lower compilation time and smaller generated code
* Compiler: separate compilation can now drops unused units when linking (similar to ocamlc). (ocsigen/js_of_ocaml#1378)
* Compiler: specialize string to js-string conversion for all valid utf8 strings (previously just ascii)
* Compiler: JavaScript files generated by `js_of_ocaml` are now UTF-8 encoded.
* Compiler: use identifier for object literals when possible
* Compiler: Cache function arity (the length prop of a function is slow with v8)
* Compiler: The js lexer is now utf8 aware, recognize and emit utf8 ident
* Compiler: Update the js lexer with new number literal syntax
* Compiler: update js parser to support most es6 feature (ocsigen/js_of_ocaml#1391)
* Compiler: stop parsing the builtin js runtime if not necessary
* Compiler: improve js pretty printer (ocsigen/js_of_ocaml#1405)
* Compiler: improve debug location and speedup compilation (ocsigen/js_of_ocaml#1407)
* Toplevel: Enable separate compilation of toplevels
* Runtime: js backtrace recording controled by OCAMLRUNPARAM
* Runtime: support for zstd decompression of marshalled data (ocaml.5.1) (ocsigen/js_of_ocaml#12006)
* Runtime: stub out custom runtime events symbols for OCaml 5.1 (ocsigen/js_of_ocaml#1414)

## Bug fixes
* Effects: fix Js.export and Js.export_all to work with functions (ocsigen/js_of_ocaml#1417,ocsigen/js_of_ocaml#1377)
* Sourcemap: fix incorrect sourcemap with separate compilation
* Compiler: fix control flow analysis; some annotions were wrong in the runtime
* Compiler: js backtrace recording respected in the js runtime and when using effects
* Compiler: no longer fail on invalid source file (when the file is a directory)
* Runtime: fix the compilation of some mutually recursive functions
hhugo pushed a commit to hhugo/opam-repository that referenced this pull request Mar 7, 2023
…s_of_ocaml-ppx_deriving_json, js_of_ocaml-ppx, js_of_ocaml-lwt and js_of_ocaml-compiler (5.1.0)

CHANGES:

## Features/Changes
* Lib: Added support for KeyboardEvent.getModifierState
* Misc: bump min ocaml version to 4.08
* Misc: remove some old runtime files to support some external libs
* Misc: switch to dune 3.7
* Effects: partial CPS transformation, resulting in much better performances, lower compilation time and smaller generated code
* Compiler: separate compilation can now drops unused units when linking (similar to ocamlc). (ocsigen/js_of_ocaml#1378)
* Compiler: specialize string to js-string conversion for all valid utf8 strings (previously just ascii)
* Compiler: JavaScript files generated by `js_of_ocaml` are now UTF-8 encoded.
* Compiler: use identifier for object literals when possible
* Compiler: Cache function arity (the length prop of a function is slow with v8)
* Compiler: The js lexer is now utf8 aware, recognize and emit utf8 ident
* Compiler: Update the js lexer with new number literal syntax
* Compiler: update js parser to support most es6 feature (ocsigen/js_of_ocaml#1391)
* Compiler: stop parsing the builtin js runtime if not necessary
* Compiler: improve js pretty printer (ocsigen/js_of_ocaml#1405)
* Compiler: improve debug location and speedup compilation (ocsigen/js_of_ocaml#1407)
* Toplevel: Enable separate compilation of toplevels
* Runtime: js backtrace recording controled by OCAMLRUNPARAM
* Runtime: support for zstd decompression of marshalled data (ocaml.5.1) (ocsigen/js_of_ocaml#12006)
* Runtime: stub out custom runtime events symbols for OCaml 5.1 (ocsigen/js_of_ocaml#1414)

## Bug fixes
* Effects: fix Js.export and Js.export_all to work with functions (ocsigen/js_of_ocaml#1417,ocsigen/js_of_ocaml#1377)
* Sourcemap: fix incorrect sourcemap with separate compilation
* Compiler: fix control flow analysis; some annotions were wrong in the runtime
* Compiler: js backtrace recording respected in the js runtime and when using effects
* Compiler: no longer fail on invalid source file (when the file is a directory)
* Runtime: fix the compilation of some mutually recursive functions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants