Parsing NBT #40

Open
gilesknap opened this issue Aug 14, 2022 · 2 comments

gilesknap commented Aug 14, 2022

I'm opening this issue to discuss NBT parsing.

(I note that NBTs are a Java edition feature and I'm not familiar with how similar information is handled in other editions)

I have implemented a simple deserializer for Stringified Named Binary Tag (SNBT) data, which is the format returned by commands like data get:

import json
import re

# strip any preamble before the first [ or { in command responses (benign for raw SNBT)
preamble_re = re.compile(r"[^\[{]*(.*)")
# strip the type identifiers of typed arrays, e.g. the "I;" in [I; 1, 2]
list_types_re = re.compile(r"[LBI];")
# match every item that is not inside double quotes
unquoted_re = re.compile(r'([-+.A-Za-z0-9_]+)(?=([^"]*"[^"]*")*[^"]*$)')
# match quoted numeric values (optionally negative) with their type suffixes
integers_re = re.compile(r'"(-?\d+)[bBsSlL]?"')
no_decimal_floats_re = re.compile(r'"(-?\d+)[fFdD]"')
floats_re = re.compile(r'"(-?\d+\.\d+)[fFdD]"')


def parse_nbt(snbt_text: str) -> object:
    """
    Naive deserialization of an SNBT string into an object graph of Python types.

    Note that this is one way only since the following details are lost:
    - distinction between byte, short, int, long types (suffixes of b,s,none,l)
    - distinction between float, double types (suffixes of f,d)
    - distinction between SNBT and raw JSON (enclosed in single quotes)

    See https://minecraft.fandom.com/wiki/NBT_format
    """
    text = preamble_re.sub(r"\1", snbt_text)
    text = list_types_re.sub(r"", text)
    # quote every bare item so that the result is close to JSON
    text = unquoted_re.sub(r'"\1"', text).replace("'", "")
    # unquote the numbers again, dropping their type suffixes
    text = no_decimal_floats_re.sub(r"\1.0", text)
    text = floats_re.sub(r"\1", text)
    text = integers_re.sub(r"\1", text)
    # booleans are kept as the strings "True" and "False"
    text = text.replace('"true"', '"True"').replace('"false"', '"False"')

    return json.loads(text)

I'm not sure the above approach is worthy of the nicely typed mcipc library.

There is a lot more work to do to make a serializable NBT class in Python. A useful NBT class would need to:

  • represent all of the numeric types that are not native to Python
  • support arithmetic with Python floats and ints
  • represent pure JSON attributes (so they can be enclosed in single quotes when serialized)
  • support dot notation for accessing child nodes
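As a rough sketch of the first and last points, something like the following might work. The names NbtByte and NbtCompound are hypothetical and are not part of mcipc; only integer arithmetic on a byte tag is covered here.

```python
class NbtByte(int):
    """Hypothetical byte tag: behaves as an int but remembers its SNBT suffix."""

    suffix = "b"

    def to_snbt(self) -> str:
        return f"{int(self)}{self.suffix}"

    def __add__(self, other):
        # Keep the tag type when a plain Python int is added.
        return type(self)(int(self) + int(other))

    __radd__ = __add__


class NbtCompound(dict):
    """Hypothetical compound tag: a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


chest = NbtCompound(Items=[NbtCompound(Slot=NbtByte(0), Count=NbtByte(5))])
chest.Items[0].Count += 10
print(chest.Items[0].Count.to_snbt())  # 15b
```

Subclassing int and dict keeps the objects usable wherever plain Python values are expected, at the cost of having to override each arithmetic operator that should preserve the tag type.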

This would mean you could do something like this:

# increase the number of items in slot 0 of the chest at 626, 73, -1654
nbt = client.data.get(block=Vec3(626, 73, -1654))
nbt.Items[0].Count += 10
client.data.merge(block=Vec3(626, 73, -1654), nbt=nbt)

So is this worth implementing? As far as I can tell, NBT serialization would only be needed for the following commands:

  • data merge
  • summon
  • These commands when used with a block entity
    • setblock
    • give
    • fill

The dumb deserializer above, on the other hand, is useful for querying information about Players, Mobs, Entities, Block Entities, etc.


gilesknap commented Aug 14, 2022

Out of interest, here is a function in MCIWB that uses the parser:

https://github.com/gilesknap/mciwb/blob/dev/src/demo/arrows.py

Note that it makes extracting a position from the NBT quite easy. Also note that this code creates an NBT to send in the data get, but it's trivial enough that a serializer would not have added a great deal.
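To illustrate the kind of extraction meant here, below is a minimal sketch that works on the plain dictionary returned by the parser. It assumes the entity data carries its position as a three-element list under the Pos key; the helper name entity_position and the sample values are invented for illustration and are not copied from arrows.py.

```python
from typing import Tuple


def entity_position(nbt: dict) -> Tuple[float, float, float]:
    """Return (x, y, z) from a parsed entity NBT dictionary."""
    x, y, z = nbt["Pos"]
    return float(x), float(y), float(z)


# Sample shaped like parsed entity data (values invented).
arrow = {"Pos": [626.5, 73.0, -1654.25], "inGround": 1}
print(entity_position(arrow))  # (626.5, 73.0, -1654.25)
```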

@gilesknap

UPDATE:

Some good news on this. When merging NBT data, the command is happy to take any number for numeric types and cast it appropriately.

This means that, at present, the following code adds 10 extra items to slot 0 of a chest:

In [58]: nbt = parse_nbt(c.data.get(block=Vec3(625, 73, -1646)))

In [59]: nbt['Items'][0]["Count"] += 10

In [60]: c.data.merge(block=Vec3(625, 73, -1646),nbt=str(nbt))
Out[60]: 'Modified block data of 625, 73, -1646'

This appears to mean that the only special handling needed for serialization is for the quoted JSON snippets.
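A minimal sketch of what such a serializer could look like is shown below. The function names quote and to_snbt are hypothetical. Numbers are written without type suffixes, relying on the casting behaviour observed above, and any string that contains double quotes (as an embedded JSON snippet would) is wrapped in single quotes instead of being escaped.

```python
def quote(text: str) -> str:
    """Quote a string for SNBT, using single quotes around JSON-like text."""
    escaped = text.replace("\\", "\\\\")
    if '"' in text and "'" not in text:
        # Typical for embedded JSON: single quotes avoid escaping every ".
        return "'" + escaped + "'"
    return '"' + escaped.replace('"', '\\"') + '"'


def to_snbt(value) -> str:
    """Serialize plain Python types (as produced by a JSON-style parser) to SNBT."""
    if isinstance(value, bool):
        # NBT has no boolean type; bytes 1b and 0b are used instead.
        return "1b" if value else "0b"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return quote(value)
    if isinstance(value, dict):
        pairs = ", ".join(f"{quote(str(k))}: {to_snbt(v)}" for k, v in value.items())
        return "{" + pairs + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(to_snbt(v) for v in value) + "]"
    raise TypeError(f"cannot serialize {type(value).__name__}")


sign = {"Text1": '{"text":"hello"}', "Items": [{"Slot": 0, "Count": 15}]}
print(to_snbt(sign))
# {"Text1": '{"text":"hello"}', "Items": [{"Slot": 0, "Count": 15}]}
```

Compared with str(nbt), this avoids Python-specific output such as True or doubled escaping, but it has not been tested against a server.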
