Is it possible to create a lexer that owns the string it's provided? #450

Open
glyh opened this issue Dec 7, 2024 · 11 comments
Labels
help wanted, nice to have, question

Comments

@glyh

glyh commented Dec 7, 2024

As the title says. I would like a lexer that is stored persistently, but the String used to construct the lexer doesn't live as long as the lexer itself.

@glyh
Author

glyh commented Dec 7, 2024

I kind of need this because I'm trying to emulate streaming parsers with lalrpop: lalrpop/lalrpop#1008

@jeertmans
Collaborator

Hi @glyh, I think what you are asking for is a self-referential structure, which is not really possible in safe Rust. There might be a way to implement this, for example with a wrapper struct, but I have no experience with this…
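
One way to sidestep the self-referential borrow is to have the wrapper own the String and remember a byte offset instead of holding a borrowed lexer. The sketch below uses only the standard library. `OwningLexer`, `Tok`, `push_str`, and `next_token` are hypothetical names made up for illustration, not part of logos, and the tokenizing rule (runs of digits vs. runs of other non-whitespace characters) is only a placeholder.

```rust
/// Hypothetical token type for this sketch (not a logos type).
#[derive(Debug, PartialEq)]
pub enum Tok<'s> {
    Word(&'s str),
    Number(&'s str),
}

/// Owns the input and remembers how far lexing has progressed.
/// Storing a byte offset instead of a borrowed lexer avoids the
/// self-referential borrow.
pub struct OwningLexer {
    source: String,
    pos: usize,
}

impl OwningLexer {
    pub fn new(source: String) -> Self {
        Self { source, pos: 0 }
    }

    /// Append more input, as a streaming consumer might.
    pub fn push_str(&mut self, more: &str) {
        self.source.push_str(more);
    }

    /// Return the next token, borrowing its text from the owned string.
    pub fn next_token(&mut self) -> Option<Tok<'_>> {
        let rest = &self.source[self.pos..];
        let trimmed = rest.trim_start();
        // Absolute byte offset where the token starts.
        let start = self.pos + (rest.len() - trimmed.len());
        let first = trimmed.chars().next()?;
        // A token is a run of characters of the same kind as the first one.
        let same_kind =
            |c: char| !c.is_whitespace() && c.is_ascii_digit() == first.is_ascii_digit();
        let len = trimmed
            .find(|c: char| !same_kind(c))
            .unwrap_or(trimmed.len());
        self.pos = start + len;
        let text = &self.source[start..self.pos];
        Some(if first.is_ascii_digit() {
            Tok::Number(text)
        } else {
            Tok::Word(text)
        })
    }
}
```

Each returned token borrows from the wrapper, so it has to be consumed (or copied) before the next call to `next_token` or `push_str`. With a real lexer, the same idea would mean recreating a short-lived lexer from the stored offset on each call, which is an assumption about how one could adapt it rather than something logos documents.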

@jeertmans
Collaborator

But all of this might be possible if you implement Source for String: https://docs.rs/logos/latest/logos/source/trait.Source.html

@glyh
Author

glyh commented Dec 7, 2024

Hello, I just dug through the examples, but I found none that implement the Source trait and get it working with Logos. Would you mind providing a simple example?

Thanks a lot!

@jeertmans
Collaborator

If you click on the link I provided, you can see the source code used to implement Source for &str, for example. See https://docs.rs/logos/latest/src/logos/source.rs.html#109-177

@jeertmans
Collaborator

Note that, after reading the methods you need to implement, I am not sure it is actually possible to implement this trait on String, but by all means give it a try :)

@glyh
Author

glyh commented Dec 15, 2024

It turns out I can't implement this trait on String, as logos already provides an implementation for it.

Here's what I attempted to add:

impl Source for String {
    type Slice<'a> = &'a str;

    #[inline]
    fn len(&self) -> usize {
        self.len()
    }

    #[inline]
    fn read<'a, Chunk>(&'a self, offset: usize) -> Option<Chunk>
    where
        Chunk: self::Chunk<'a>,
    {
        #[cfg(not(feature = "forbid_unsafe"))]
        if offset + (Chunk::SIZE - 1) < self.len() {
            Some(unsafe { Chunk::from_ptr(self.as_ptr().add(offset)) })
        } else {
            None
        }
    }

    #[inline]
    #[cfg(not(feature = "forbid_unsafe"))]
    unsafe fn read_byte_unchecked(&self, offset: usize) -> u8 {
        Chunk::from_ptr(self.as_ptr().add(offset))
    }

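    #[inline]
    fn slice(&self, range: std::ops::Range<usize>) -> Option<Self::Slice<'_>> {
        // `str::get` via deref; `self.slice(range)` would call this method recursively.
        self.get(range)
    }

    #[inline]
    #[cfg(not(feature = "forbid_unsafe"))]
    unsafe fn slice_unchecked(&self, range: std::ops::Range<usize>) -> Self::Slice<'_> {
        // `str::get_unchecked` via deref, for the same reason as above.
        self.get_unchecked(range)
    }

    #[inline]
    fn is_boundary(&self, index: usize) -> bool {
        self.is_char_boundary(index)
    }

    #[inline]
    fn find_boundary(&self, mut index: usize) -> usize {
        // Advance to the next char boundary instead of recursing into `self.find_boundary`.
        while !self.is_char_boundary(index) {
            index += 1;
        }
        index
    }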
}

Here's what Rust complains about:

error[E0119]: conflicting implementations of trait `source::Source` for type `String`
   --> src/source.rs:288:1
    |
179 |   impl Source for String {
    |   ---------------------- first implementation here
...
288 | / impl<T> Source for T
289 | | where
290 | |     T: Deref,
291 | |     <T as Deref>::Target: Source,
    | |_________________________________^ conflicting implementation for `String`

@jeertmans
Collaborator

Hi @glyh, you are correct: Source is implemented for everything that derefs into a type that implements Source, which makes it impossible to provide a specialized implementation for String.

Specialization is not yet available in stable Rust, see the corresponding RFC.

Unfortunately, I am not sure how to work around this in a way that would not break existing code.
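
The overlap can be reproduced with the standard library alone. The sketch below uses a hypothetical `MiniSource` trait as a stand-in for logos's `Source`, reduced to a single made-up method `byte_len`. It is meant to illustrate the shape of the blanket implementation from the error message, not the actual logos code.

```rust
use std::ops::Deref;

/// Hypothetical stand-in for logos's `Source` trait, reduced to one method.
pub trait MiniSource {
    fn byte_len(&self) -> usize;
}

/// Base implementation on the unsized string slice.
impl MiniSource for str {
    fn byte_len(&self) -> usize {
        self.len()
    }
}

/// Blanket implementation: anything that derefs to a `MiniSource` is one too.
impl<T> MiniSource for T
where
    T: Deref,
    <T as Deref>::Target: MiniSource,
{
    fn byte_len(&self) -> usize {
        // `**self` is the deref target; forward the call to its implementation.
        (**self).byte_len()
    }
}

// Uncommenting the impl below reproduces E0119: `String: Deref<Target = str>`
// is already covered by the blanket impl above.
//
// impl MiniSource for String {
//     fn byte_len(&self) -> usize {
//         self.len()
//     }
// }

/// Generic helper accepting any `MiniSource`, sized or not.
pub fn source_len<S: MiniSource + ?Sized>(source: &S) -> usize {
    source.byte_len()
}
```

With these definitions, `source_len("abc")`, `source_len(&String::from("abc"))`, and `source_len(&Box::new(String::from("abc")))` all compile, because `String` and `Box<String>` pick up the trait through the blanket impl. That is also why a second, hand-written impl for `String` is rejected.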

jeertmans added the help wanted, question, and nice to have labels on Dec 18, 2024
@glyh
Author

glyh commented Dec 18, 2024

Well, if the implementation is already derived, why didn't this just work?

@jeertmans
Collaborator

What do you mean, that passing an owned String didn't work, or that

error[E0119]: conflicting implementations of trait `source::Source` for type `String`
   --> src/source.rs:288:1
    |
179 |   impl Source for String {
    |   ---------------------- first implementation here
...
288 | / impl<T> Source for T
289 | | where
290 | |     T: Deref,
291 | |     <T as Deref>::Target: Source,
    | |_________________________________^ conflicting implementation for `String`

failed? About the latter, this is because you cannot have two implementations of the same trait for the same type.

For the former, I guess this is because Lexer::new takes a reference to T: Source, so it will not take ownership of the source string.
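
When a constructor only borrows its input, one possible way to keep a lexer around indefinitely is to leak the owned string and borrow it for `'static`. The sketch below shows this with a hypothetical `WordLexer` (a whitespace splitter standing in for a real lexer) and a made-up helper `persistent_lexer`. Neither is part of logos, and the leaked memory is never freed, so this only suits inputs that should live for the rest of the program anyway.

```rust
/// Hypothetical borrowing lexer, standing in for a lexer built from `&str`.
pub struct WordLexer<'s> {
    rest: &'s str,
}

impl<'s> WordLexer<'s> {
    /// Only borrows the source: the caller keeps ownership.
    pub fn new(source: &'s str) -> Self {
        Self { rest: source }
    }
}

impl<'s> Iterator for WordLexer<'s> {
    type Item = &'s str;

    /// Yield the next whitespace-separated word.
    fn next(&mut self) -> Option<&'s str> {
        let trimmed = self.rest.trim_start();
        if trimmed.is_empty() {
            return None;
        }
        let end = trimmed
            .find(char::is_whitespace)
            .unwrap_or(trimmed.len());
        let (word, rest) = trimmed.split_at(end);
        self.rest = rest;
        Some(word)
    }
}

/// Leak the owned string to obtain a `'static` borrow, so the lexer can be
/// stored anywhere. The allocation is intentionally never freed.
pub fn persistent_lexer(source: String) -> WordLexer<'static> {
    let leaked: &'static str = Box::leak(source.into_boxed_str());
    WordLexer::new(leaked)
}
```

The resulting `WordLexer<'static>` can be stored in a struct field or returned from a function without any lifetime parameter on the owner. The cost is one leaked allocation per input string.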

@glyh
Author

glyh commented Dec 19, 2024

I was thinking about the former question, but yeah, thank you :)
