Support parsing only final frame #22

Open
NickCondron opened this issue Aug 23, 2022 · 5 comments

Comments

@NickCondron
Contributor

NickCondron commented Aug 23, 2022

Currently we support skipping the frame data entirely. There are some use cases where we only need the final frame (e.g. determining the winner after an LRA-start). It would be nice to support this without parsing every frame and then just picking the last one.

We might also want to support parsing only the first frame to handle the Sheik fix. In replays before 1.6.0, the game start event didn't correctly differentiate Zelda/Sheik, so you have to check the first frame to tell which one actually started the game.

@hohav
Owner

hohav commented Dec 13, 2022

Does the new placements info mostly cover this, as you see it? If so then I'm inclined to pass on implementing this, to avoid feature creep.

@NickCondron
Contributor Author

> Does the new placements info mostly cover this, as you see it? If so then I'm inclined to pass on implementing this, to avoid feature creep.

No, because the placements field is defined by how Melee determines the winner at the results screen, so it doesn't always match the definition of a win used in the competitive community.

For example, a timeout with equal stocks for both players will show a tie for first in the placements field even if the damage was different.

Also, a lot of players LRA-start at the end of games (especially to skip long KOs off the top). Being able to inspect the final frame(s) would let you check for such cases and make a more accurate win/loss determination.
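
As a rough illustration of the kind of check the final frame would enable, here is a minimal sketch. `FinalState` and `competitive_winner` are hypothetical names (not part of peppi), and the tiebreak rule (more stocks wins, then lower percent) is an assumption about the competitive definition, so adjust it to whatever ruleset you actually use.

```rust
use std::cmp::Ordering;

/// Hypothetical per-player snapshot taken from the final frame.
/// This is not a peppi type; it only holds what the check needs.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FinalState {
    pub port: u8,
    pub stocks: u8,
    pub percent: f32,
}

/// Returns the winning port, or `None` for a genuine tie.
/// Assumed rule: more stocks wins; with equal stocks, lower percent wins.
pub fn competitive_winner(a: FinalState, b: FinalState) -> Option<u8> {
    match a.stocks.cmp(&b.stocks) {
        Ordering::Greater => Some(a.port),
        Ordering::Less => Some(b.port),
        Ordering::Equal => match a.percent.partial_cmp(&b.percent) {
            Some(Ordering::Less) => Some(a.port),
            Some(Ordering::Greater) => Some(b.port),
            // Equal percent (or NaN): no winner can be decided.
            _ => None,
        },
    }
}
```

With this, a timeout at equal stocks but different damage yields a winner instead of the tie that the placements field would report.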

@NickCondron
Contributor Author

I think changing peppi to be somewhat lazy would make this issue irrelevant and simplify parts of peppi. Currently, peppi eagerly parses each event and passes the event struct to the relevant method of the Handlers trait. In an alternative lazy design, peppi would read an event code and pass a 'thunk' (https://wiki.haskell.org/Thunk) to the relevant Handlers method for that event. The struct that implements Handlers could then choose to evaluate the thunk (and receive the event struct) or not, in which case the parser would simply skip over that event.

This would let users decide at runtime which events to parse, avoiding unnecessary work. It would also remove the need for options like skip_frames and enable other use cases, like skipping only item events or skipping all events once a certain condition is met (e.g. after the first stock is taken).
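
To make the idea concrete, here is a minimal sketch of what a thunk-based handler could look like. Everything in it is hypothetical: `Thunk`, `ItemEvent`, `parse_item`, `dispatch_items`, and the 6-byte payload layout are made up for illustration, and this `Handlers` trait only mimics the shape of peppi's, not its actual signatures.

```rust
/// Hypothetical stand-in for a parsed event struct.
#[derive(Debug, PartialEq)]
pub struct ItemEvent {
    pub frame_index: i32,
    pub item_type: u16,
}

/// A thunk over one event's raw bytes; nothing is decoded until `force`.
pub struct Thunk<'a, T> {
    raw: &'a [u8],
    parse: fn(&[u8]) -> Option<T>,
}

impl<'a, T> Thunk<'a, T> {
    pub fn new(raw: &'a [u8], parse: fn(&[u8]) -> Option<T>) -> Self {
        Thunk { raw, parse }
    }

    /// Evaluate the thunk, decoding the event now.
    pub fn force(self) -> Option<T> {
        (self.parse)(self.raw)
    }
}

/// Assumed layout (illustration only): big-endian i32 frame index,
/// then big-endian u16 item type.
pub fn parse_item(raw: &[u8]) -> Option<ItemEvent> {
    if raw.len() < 6 {
        return None;
    }
    Some(ItemEvent {
        frame_index: i32::from_be_bytes([raw[0], raw[1], raw[2], raw[3]]),
        item_type: u16::from_be_bytes([raw[4], raw[5]]),
    })
}

pub trait Handlers {
    /// Default: drop the thunk, so the event is never decoded.
    fn item(&mut self, _thunk: Thunk<'_, ItemEvent>) {}
}

/// Stops decoding once the first item of the wanted type has been seen.
pub struct FindFirst {
    pub wanted: u16,
    pub found: Option<ItemEvent>,
    pub forced: usize, // how many thunks were actually evaluated
}

impl Handlers for FindFirst {
    fn item(&mut self, thunk: Thunk<'_, ItemEvent>) {
        if self.found.is_some() {
            return; // condition met: skip without parsing
        }
        self.forced += 1;
        if let Some(ev) = thunk.force() {
            if ev.item_type == self.wanted {
                self.found = Some(ev);
            }
        }
    }
}

/// Stand-in for the parser loop: hands each raw payload over as a thunk.
pub fn dispatch_items<H: Handlers>(payloads: &[&[u8]], handlers: &mut H) {
    for &raw in payloads {
        handlers.item(Thunk::new(raw, parse_item));
    }
}
```

The handler decides at runtime whether to pay for decoding: once `found` is set, later thunks are dropped untouched.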

I haven't fully fleshed out this idea, but I'm curious what you think.

@hohav
Owner

hohav commented Feb 25, 2023

My instinct is that this would hurt the performance of object-based parsing enough to be a problem, but I'd be interested to see some performance numbers. py-slippi does something like this, but that's because Python is so slow at low-level bit fiddling that the overhead was worth it.

@NickCondron
Contributor Author

I'm working on a prototype lazy parsing system. We'll see if it's faster or not, haha. It's still a WIP, but the basic idea is to scan the replay once to build an 'outline', reading only the event codes and the frame event indexes.

// Events that are parsed eagerly during the outline pass.
pub struct GameInfo {
    pub start: game::Start,
    pub end: game::End,
    pub metadata: metadata::Metadata,
    pub metadata_raw: serde_json::Map<String, serde_json::Value>,
}

// One frame's events as undecoded byte slices borrowed from the replay
// buffer. `None` means the event is absent for that frame/port.
pub struct FrameOutline<'a> {
    pub index: i32,
    pub start: Option<&'a [u8]>,
    pub end: Option<&'a [u8]>,
    pub pre_leaders: [Option<&'a [u8]>; NUM_PORTS],
    pub pre_follower: [Option<&'a [u8]>; NUM_PORTS],
    pub post_leaders: [Option<&'a [u8]>; NUM_PORTS],
    pub post_follower: [Option<&'a [u8]>; NUM_PORTS],
    // Up to 15 item events per frame.
    pub items: [Option<&'a [u8]>; 15],
}

// The whole replay: parsed game info plus one outline per frame.
pub struct GameOutline<'a> {
    pub info: GameInfo,
    pub gecko_codes: &'a [u8],
    pub frames: Vec<FrameOutline<'a>>,
}

This has a few advantages:

  1. Transformations to the game structure are super fast. For example, you can filter out rollback frames or ignore items without ever having to parse those events.
  2. You can quickly detect structural errors later in the replay file before spending time parsing the whole thing. In practice, if the replay file structure is sound, the replay is generally valid.
  3. You know the size of the replay, so you can efficiently allocate memory up front.
  4. For event-based parsing you only have to parse what you need, but making this ergonomic is a bit of a challenge.
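
As a sketch of advantage (1), filtering rollback frames only needs the frame indexes, never the event payloads. The example below uses a simplified, hypothetical `FrameSlot` in place of `FrameOutline`, and it assumes that when a frame index appears more than once, the last occurrence is the finalized one.

```rust
use std::collections::HashSet;

/// Simplified, hypothetical stand-in for `FrameOutline`: a frame index
/// plus the frame's undecoded bytes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FrameSlot<'a> {
    pub index: i32,
    pub raw: &'a [u8],
}

/// Keeps only the last occurrence of each frame index, preserving order.
/// Assumption: with rollback, the final copy of a frame is the real one.
pub fn drop_rollbacks<'a>(frames: &[FrameSlot<'a>]) -> Vec<FrameSlot<'a>> {
    let mut seen = HashSet::new();
    let mut kept = Vec::with_capacity(frames.len());
    // Walk backwards so the first copy we meet is the latest one.
    for frame in frames.iter().rev() {
        if seen.insert(frame.index) {
            kept.push(*frame);
        }
    }
    kept.reverse();
    kept
}
```

No payload byte is read here; only slices are copied around, which is why this kind of transformation is cheap on an outline.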

Disadvantages:

  1. You can't validate anything you avoid parsing, so depending on your use case you could process an unsound replay without ever detecting the error.
  2. It's slower to parse everything in two passes, but maybe this will be offset by advantage (3) above.
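
For reference, the outline pass itself can be sketched roughly as below. This is a simplified illustration, not the prototype's code: it assumes each event is a one-byte code followed by a payload whose size comes from a lookup table, and `outline`, `OutlineError`, and the flat `(code, payload)` output are hypothetical.

```rust
use std::collections::HashMap;

/// Structural errors that the outline pass can detect without decoding.
#[derive(Debug, PartialEq)]
pub enum OutlineError {
    UnknownCode { offset: usize, code: u8 },
    Truncated { offset: usize, code: u8 },
}

/// Scans `buf` once, recording each event's code and a borrowed slice of
/// its payload. Payload bytes are never interpreted.
/// `sizes` maps an event code to its payload size in bytes (assumed to
/// be known before the scan starts).
pub fn outline<'a>(
    buf: &'a [u8],
    sizes: &HashMap<u8, usize>,
) -> Result<Vec<(u8, &'a [u8])>, OutlineError> {
    let mut events = Vec::new();
    let mut pos = 0;
    while pos < buf.len() {
        let code = buf[pos];
        let size = *sizes
            .get(&code)
            .ok_or(OutlineError::UnknownCode { offset: pos, code })?;
        let start = pos + 1;
        // `get` returns None if the payload runs past the end of the file.
        let payload = buf
            .get(start..start + size)
            .ok_or(OutlineError::Truncated { offset: pos, code })?;
        events.push((code, payload));
        pos = start + size;
    }
    Ok(events)
}
```

A truncated or corrupted file fails here, before any event is decoded, and the length of the returned vector gives the allocation size needed for a second, decoding pass.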
