Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntactic completion #1257

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

Syntactic completion #1257

wants to merge 10 commits into from

Conversation

let-def
Copy link
Contributor

@let-def let-def commented Feb 4, 2021

This PR implements completion based on language grammar (this complements the existing completion based on semantic information).

Here is a rough explanation of the algorithm, taken from the Syntactic_completion module:

Generate syntactic completion by analysing the parser stack.
The task is split in a few steps:

  1. First enumerate all reachable states by simulating all possible
    reductions.
    This is done by the Lookahead, Level and Stack modules. Lookahead keep
    tracks of a set of lookahead terminals: rather than simulating
    separately for each possible lookahead token, we regroup lookahead
    tokens that trigger the same reduction, such that a given reduction is
    simulated only once.
    Level keep track of all goto transitions to simulate on a given stack
    frame (hence the reductions are grouped per stack "level").
    Stack modules simulate the stack reduction.

    Auxiliary information are provided by:

    • Parser_complete.state_to_reduction_table: the list of reductions to
      simulate when in a given state, already structured per level
    • Parser_complete.state_goto_table: a naive but sufficient
      representation of the LR goto table.
      TODO: replace it by Menhir builtin goto table when possible (need to
      patch Menhir).
  2. After that, we have the list of all states that were reached, and for
    each state, the set of lookahead tokens that led to it.
    The list is ordered: the state that is the deepest in the stack comes
    first. This is done to favor completions that will close the most
    syntactic constructions over completions that might open new nested
    constructions.
    In practice, it means that in this example :
    module M = struct
    let v = if true then x
    Completing after the x, the suggestions will be:

    • first end, to close the structure
    • then in, to transforme the module let into an expression let
    • finally else, to turn the if then into an if then else.
      This order will be preserved by subsequent transformations.

    Then we turn each reached state into an "item set":

    • Parser_complete.state_closure_table associates to each state the
      states that can be reached by following "null" reductions.
      (e.g. if we are in let . rec? we can each let rec? . by assuming
      the rec flag is missing: rec? is a nullable reduction)
      TODO: maybe we can remove this step by simulating the closure from
      production definition, the runtime cost should be negligible.
    • Parser_complete.items_table associates a state to its itemset, in the
      form of a list of pair of (production, dot position)
  3. The item sets are transformed into sequence of symbols by looking
    them up in Parser_complete.productions table, that contains the
    definition of each production.
    Extra step: we need to simulate the "closure" of the itemset.
    For instance if we have an item that looks like . expression, we don't
    want to stop there and just suggest "expression", rather we want to
    expand the expression to its definition.
    This is done using Parser_complete.nonterminal_to_productions that lists
    all the productions that can produce a non-terminal.

  4. Now we have a list of symbols that constitutes valid continuations of
    the current parsing. We need to turn them into readable definitions that
    can be presented to the user.
    First, we keep only the ones that starts with tokens we consider
    "interesting" (mostly keywords) using "is_interesting_terminal".
    After that, for each starting terminal, we only keep the shortest
    sentence that can complete it. For instance,
    { if ... then ... , if ... then ... else ... ,
    let ... = ... , let ... = ... in ... }
    is simplified too
    { if ... then ... , let ... = ... }
    Then we turn terminals into text using Parser_printer and replace
    non-terminals by "..."

[completion_for_parser] runs to whole pipeline.

TODO:

  • The pipeline is not really efficient, some intermediate structures could be prune early to avoid a lot of redundant computations
  • Add tests to the testsuite
  • Tested in VIM. What about other editors?

@let-def
Copy link
Contributor Author

let-def commented Feb 4, 2021

Here is an example showing syntactic completion in an expression:
syntactic-compl

@let-def
Copy link
Contributor Author

let-def commented Feb 4, 2021

TODO:
Syntactic completion engine does not handle well completing when the cursor is on an existing word.
Fix handling when the cursor is

  • in the middle of a word
  • at the end of a word

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant