-
-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: flatten the query dataframe for contracts #2196
feat: flatten the query dataframe for contracts #2196
Conversation
@@ -676,6 +676,8 @@ def query( | |||
columns_ls = validate_and_expand_columns(columns, ContractLog) | |||
data = map(partial(extract_fields, columns=columns_ls), contract_events) | |||
df = pd.DataFrame(columns=columns_ls, data=data) | |||
# Note: The below check is to check for `event_arguments` in the request. | |||
# If `event_arguments` exists in the request, we flatten the field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So originally my idea was that "*"
or specific event arg names mentioned in the columns would go grab those fields from the logs specifically, and really only if event_arguments
was present in the columns would you keep it, otherwise it should be flattened in the resulting dataframe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I went through all of Ape, trying to find a good place to make this manipulation, and it looked pretty painful. I tried to do make this happen earlier, but there is no information available for what those dict keys are going to be in the event_arguments
field. I can't know until we make the query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment not correct then? I would think unless event_arguments
is in the request, we should flatten that field
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was flattening it either way. If you want:
columns="name,event_arguments"
I planned to flatten event_arguments
. We don't have to though, wasn't sure if we wanted to always flatten it or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good to think through the scenario where one of the event arguments might be called name
or something else that ContractLog has
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good to think through the scenario where one of the event arguments might be called name or
contract_address
is the one that usually bites me... I always mix up if it the one on ContractLog or a custom event input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good to think through the scenario where one of the event arguments might be called
name
or something else that ContractLog has
This is something I was contemplating here. If that were to happen, pandas will automatically rename both columns to name_x, and name_y.
Do we want to keep event_arguments
in the columns names? In the case of the uniswap v2 contract, we'd get event_arguments.token0, event_arguements.token1, event_arguments.pair, event_arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to keep
event_arguments
in the columns names? In the case of the uniswap v2 contract, we'd getevent_arguments.token0, event_arguements.token1, event_arguments.pair, event_arguments.
I don't think so, it seems much more natural to me to work with it like this:
df = factory.PoolCreated.query("pair", ...)
pools = pool(p for p in df["pair"])
df = df["pair"] | pd.Dataframe(dict(token0=p.token0_balance, token1=p.token1_balance) for p in df["pair"])
df
| pair | token0 | token1 |
| ----- | ------ | ------ |
| 0x... | 123... | 456... |
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to keep
event_arguments
in the columns names? In the case of the uniswap v2 contract, we'd getevent_arguments.token0, event_arguements.token1, event_arguments.pair, event_arguments.
I don't think so, it seems much more natural to me to work with it like this:
df = factory.PoolCreated.query("pair", ...) pools = pool(p for p in df["pair"]) df = df["pair"] | pd.Dataframe(dict(token0=p.token0_balance, token1=p.token1_balance) for p in df["pair"]) df | pair | token0 | token1 | | ----- | ------ | ------ | | 0x... | 123... | 456... | ...
I don't disagree, for this contract (UniswapV2), we do have an empty field. I think it's an ID? Maybe we just drop the empty field? Idk if we want that or not.
If we want the ability to query for sub-fields, it's going to take quite a bit of work. In BaseContractLog
we have the event_arguments
field as a dict. Since that field can be anything depending on the contract, I believe we'd need a model_validator in that model where we can pull apart event_arguments
and set the keys of that value to keys of the model itself.
@model_validator(mode="before")
@classmethod
def validate_event_args(cls, v, values):
if (event_args := values.get("event_arguments")):
for k, v in event_args.items():
# Note: we can do a check on the values.keys() to make sure we aren't duplicating values
values[k] = v
return values
I'm not 100% sure this will work, I've never created fields in a model on the fly before. But if it does, we'd have to ensure the downstream functionality remains intact. As long as we don't remove event_arguments
we should be okay. Maybe we add something to the field names we create so we know those are generated fields? Not sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed test_query
isn't testing much from ContractLog
at all!
@@ -676,6 +676,8 @@ def query( | |||
columns_ls = validate_and_expand_columns(columns, ContractLog) | |||
data = map(partial(extract_fields, columns=columns_ls), contract_events) | |||
df = pd.DataFrame(columns=columns_ls, data=data) | |||
# Note: The below check is to check for `event_arguments` in the request. | |||
# If `event_arguments` exists in the request, we flatten the field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good to think through the scenario where one of the event arguments might be called name or
contract_address
is the one that usually bites me... I always mix up if it the one on ContractLog or a custom event input
As part of the acceptance criteria of this PR, I can add some testing for |
4de65a6
to
15199a6
Compare
15199a6
to
01ad403
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this cause issues with clashing of fields?
For example contract_address
is a common Event input argument, and contract_address
is also a regular field on the logs.
event_arguments
keeps those separate and avoids name-clashing.
01ad403
to
da59170
Compare
probably, I have to do this in a smarter way I don't like the way I implemented this |
This pull request is considered stale because it has been open 30 days with no activity. Remove stale label, add a comment, or make a new commit, otherwise this PR will be closed in 5 days. |
@fubuloubu this has gone stale, do we want to close this for now and come back to it? This work was to help with the Uniswap work |
What I did
Adding the ability to flatten
event_arguments
in queries for contracts.fixes: #
How I did it
Added a check for the
event_arguments
fieldHow to verify it
use the
uniswap-sdk feat/add-v2
branchChecklist