
Ambiguity in grammar for parsing alignment attributes, string attributes #40

Open
mewmew opened this issue Nov 22, 2018 · 3 comments

@mewmew
Member

mewmew commented Nov 22, 2018

The grammar contains an ambiguity when parsing global variable alignment attributes. More specifically, an alignment attribute of a global variable may be interpreted either as a GlobalAttr or as a FuncAttr, and since both the list of global attributes and the list of function attributes may be empty, this leads to a shift/reduce ambiguity in the parser.

From the ll.tm EBNF grammar:

GlobalDecl -> GlobalDecl
	: Name=GlobalIdent '=' ExternLinkage Preemptionopt Visibilityopt DLLStorageClassopt ThreadLocalopt UnnamedAddropt AddrSpaceopt ExternallyInitializedopt Immutable ContentType=Type (',' Section)? (',' Comdat)? (',' Align)? Metadata=(',' MetadataAttachment)+? FuncAttrs=(',' FuncAttribute)+?
;

FuncAttribute -> FuncAttribute
	: AttrString
	| AttrPair
	# not used in attribute groups.
	| AttrGroupID
	# used in functions.
	#| Align # NOTE: removed to resolve reduce/reduce conflict, see above.
	# used in attribute groups.
	| AlignPair
	| AlignStack
	| AlignStackPair
	| AllocSize
	| FuncAttr
;

Specifically, the end of the GlobalDecl line is of interest: (',' Align)? Metadata=(',' MetadataAttachment)+? FuncAttrs=(',' FuncAttribute)+?

Given that there are no metadata attachments, the alignment attribute (align 8) of the following LLVM IR:

@a = global i32 42, align 8

may be reduced either to a global attribute (i.e. Align before MetadataAttachment) or to a function attribute (i.e. FuncAttribute after MetadataAttachment).
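
To make the ambiguity concrete, here is a minimal Go sketch that counts how many ways a tail can be derived from a simplified version of the rule, `(',' Align)? (',' MetadataAttachment)* (',' FuncAttribute)*`. The function name `countParses` and the item kinds "align", "md" and "attr" are hypothetical, introduced only for illustration; this is not code from llir/llvm or from Textmapper.

```go
package main

import "fmt"

// countParses counts the derivations of a comma-separated tail under the
// simplified rule
//
//	(',' Align)? (',' MetadataAttachment)* (',' FuncAttribute)*
//
// Each item is one of the hypothetical kinds "align", "md" or "attr".
// alignIsFuncAttr controls whether Align is also a FuncAttribute alternative.
func countParses(items []string, alignIsFuncAttr bool) int {
	total := 0
	for _, useGlobalAlign := range []bool{false, true} {
		i := 0
		if useGlobalAlign {
			// The optional global Align must match the first item.
			if len(items) == 0 || items[0] != "align" {
				continue
			}
			i = 1
		}
		// Metadata attachments can only match "md" items, so this step is forced.
		for i < len(items) && items[i] == "md" {
			i++
		}
		// Everything that remains must be a function attribute.
		ok := true
		for _, item := range items[i:] {
			if item == "attr" || (item == "align" && alignIsFuncAttr) {
				continue
			}
			ok = false
			break
		}
		if ok {
			total++
		}
	}
	return total
}

func main() {
	fmt.Println(countParses([]string{"align"}, true))       // 2: ambiguous
	fmt.Println(countParses([]string{"align"}, false))      // 1: Align removed from FuncAttribute
	fmt.Println(countParses([]string{"align", "md"}, true)) // 1: metadata separates the two lists
}
```

With Align allowed as a function attribute, `, align 8` has two derivations; removing that alternative (as the commented-out `#| Align` line does) or placing a metadata attachment after the alignment leaves only one.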

The solution employed by the C++ parser is the opposite of maximal munch, as it will try to reduce rather than shift when possible.
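
As a rough illustration of that kind of eager decision, the following Go sketch commits a leading alignment to the global alignment slot as soon as it is seen, instead of keeping it open as a possible function attribute. It is only a sketch under assumptions: `classifyTail`, the item kinds "align", "md" and "attr", and the role names are hypothetical, and the code is not taken from the LLVM C++ parser or from llir/llvm.

```go
package main

import "fmt"

// classifyTail assigns a grammar role to each item of a global declaration
// tail, deciding eagerly and never backtracking.
// Items use the hypothetical kinds "align", "md" and "attr".
func classifyTail(items []string) ([]string, error) {
	roles := make([]string, 0, len(items))
	i := 0
	// A leading align is committed to the global alignment immediately.
	if i < len(items) && items[i] == "align" {
		roles = append(roles, "GlobalAlign")
		i++
	}
	// Then consume any metadata attachments.
	for i < len(items) && items[i] == "md" {
		roles = append(roles, "MetadataAttachment")
		i++
	}
	// Whatever remains must be a function attribute.
	for ; i < len(items); i++ {
		switch items[i] {
		case "attr", "align":
			roles = append(roles, "FuncAttribute")
		default:
			return nil, fmt.Errorf("unexpected item %q at position %d", items[i], i)
		}
	}
	return roles, nil
}

func main() {
	roles, err := classifyTail([]string{"align", "attr"})
	fmt.Println(roles, err) // [GlobalAlign FuncAttribute] <nil>
}
```

Because each decision is made with one item of lookahead and is never revisited, every tail gets at most one interpretation, at the cost of hard-coding which reading of `align` wins.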

@mewmew mewmew added this to the v0.3 milestone Nov 25, 2018
mewmew added a commit to llir/testdata that referenced this issue Nov 26, 2018
Three function declarations had to be removed from the
original test case as #40 has yet to be resolved.

	; Functions -- align
	; TODO: re-enable when llir/llvm#40 is resolved.
	;declare void @f.align2() align 2
	; CHECK: declare void @f.align2() align 2
	; TODO: re-enable when llir/llvm#40 is resolved.
	;declare void @f.align4() align 4
	; CHECK: declare void @f.align4() align 4
	; TODO: re-enable when llir/llvm#40 is resolved.
	;declare void @f.align8() align 8
	; CHECK: declare void @f.align8() align 8
@mewmew mewmew added the bug label Jun 30, 2019
mewmew added a commit that referenced this issue Dec 4, 2019
Note: this is not a complete fix, but a pragmatic one, as align
is more commonly used for function definitions than return
attributes.

Anyone well versed with LR-1 grammars, feel free to give hints
on how we may resolve this in a proper way.

ref: llir/grammar@71126f6.

Updates #40.
Updates #111.
@mewmew
Member Author

mewmew commented Dec 16, 2019

From #111 (comment)

Grammar related to Function String Attribute

test cases failing likely related to Function String Attribute grammar
  • llvm/test/Bitcode/attributes.ll
  • llvm/test/Transforms/Inline/inline-varargs.ll

align attribute

align used in call instruction
  • llvm/test/Analysis/ValueTracking/memory-dereferenceable.ll
    • syntax error at line 153
align used in return attribute
  • llvm/test/Transforms/InstCombine/assume-redundant.ll
    • syntax error at line 50
  • llvm/test/Transforms/LoopSimplify/unreachable-loop-pred.ll
    • syntax error at line 25

I don't know how to update the grammar to handle ambiguities related to align used in return attributes, function attributes, etc. The same goes for string attributes, and key-value attributes. The approach taken now is to simply allow the most common cases of these, and then (unfortunately) fail when we can't resolve the ambiguous grammar. I wish the grammar of LLVM IR was LR-1, but that does not seem to be the case.

If anyone knows of a clean approach to handle this, you are warmly invited to share your thoughts. We'd very much appreciate it, seeing as this annoying issue has yet to find a clean solution.

Cheers,
Robin

@mewmew mewmew removed the bug label Dec 16, 2019
@mewmew mewmew modified the milestones: v0.3, future Dec 16, 2019
@mewmew mewmew reopened this Dec 16, 2019
@mewmew mewmew changed the title from "Ambiguity in grammar for parsing global variable alignment attributes" to "Ambiguity in grammar for parsing alignment attributes, string attributes" Dec 16, 2019
@mewmew mewmew added the grammar label Dec 16, 2019
@mewmew mewmew mentioned this issue Sep 2, 2021
@mewmew mewmew mentioned this issue Dec 22, 2021
@dannypsnl
Member

Do we consider ANTLR? I was thinking so many issues you opened just cannot get fixed, maybe we should swap to a more stable parser generator? Anyway, it can't be more painful.

@mewmew
Member Author

mewmew commented Oct 20, 2022

> Do we consider ANTLR? I was thinking so many issues you opened just cannot get fixed, maybe we should swap to a more stable parser generator? Anyway, it can't be more painful.

@dannypsnl, haha, yeah I know, there are some pains with using Textmapper. I do think, however, that this is true for every parser generator.

That being said, feel free to do a proof of concept :)

I think every parser generator has pros and cons. So if ANTLR turns out to solve more issues than it creates, it may be worth it. However, it should be noted that this is a ton of work. So, expect to put in at least 2 full-time weeks before reaching feature parity. If you still feel like working on it, then definitely, go for it!

Cheers,
Robin
