Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Do not merge] Add support for subaddresses #23

Closed
wants to merge 4 commits into from

Conversation

j-berman
Copy link
Contributor

@j-berman j-berman commented Dec 30, 2021

PLEASE READ

Submitted a PR to the light wallet spec highlighting a different way of supporting subaddresses: monero-project/meta#647

I anticipate closing this PR, and opening a new PR implementing the proposal accepted there.


Update [1/3/21]: scanner picks up on outputs sent to subaddresses when there are 3+ outputs in a transaction involving subaddresses

Overview

This PR adds support for subaddresses in monero-lws. All monero-lws server endpoints that take an address can instead take a subaddress with this PR. For example, to login with a testnet subaddress:

curl  -w "\n" -X POST http://127.0.0.1:8443/login -d '{"address": "BffjVfkhQiZZfc4mU9xHGZQHNfgvz1PG367Pwhed2yob1qXf96eMydYT8tTkev8RBsMQMyJChTdqae5JN1sFjknFRWBEqQh", "view_key": "f747f4a4838027c9af80e6364a941b60c538e67e9ea198b6ec452b74c276de06", "create_account": true, "generated_locally": false}'

I implemented a migration to add a falsey is_subaddress boolean to every address stored in the db, since all addresses already stored must be primary addresses. The migration runs the first time you run monero-lws, then doesn't run again. I took inspiration from monero/src/blockchain_db/lmdb/db_lmdb.cpp in how to structure the migration.

Also, drawing attention to 2 things in particular in this PR:

1. Potential subaddress leak

If you give someone your subaddress, they can use your subaddress to query the server to see if your subaddress is stored on the server, but cannot gain information beyond the fact that it is or is not stored. They would still need your view key to get more information beyond the fact that the subaddress is stored. The fact that it's stored is revealed to someone calling any of the endpoints and providing a subaddress with a different view key from the one already stored, which would return with a "bad view key" error iff already stored. This leak does not affect primary addresses.

I figure the added complexity to avoid this leak may not be worth the gain for monero-lws, since if you give someone your subaddress, and also give them the endpoint of your monero-lws server, you may already be comfortable with them knowing you're monitoring that subaddress on your own server already. I figure this would be a worse implication for MyMonero (and not monero-lws) since you probably don't want to tell someone who you give your subaddress to that you're using MyMonero by default (they would be able to find out if you're using MyMonero with just your subaddress), and therefore it's worth preventing in MyMonero's case if they start supporting subaddresses.

Without modifying the light wallet API, the simplest safe solution to prevent this I can think of (with no timing attack) would be to allow storage of all subaddress/view key pairs (i.e. allow duplicate subaddresses across different view keys). monero-lws admins could approve create/import requests by referencing the view key, rather than address (so an admin accepting a single view key could approve multiple create/import requests at once). There would be no uniqueness constraint on addresses this way, but rather on address/view key pairs. The plumbing to get this working would be a bit more work, but happy to do it if it's the desired solution.

2. Equality check on secret keys

The equality check on crypto::secret_key uses libsodium's constant time crypto_verify_32. This is critical. If it didn't compare in constant time, and compared the lws::db::view key types with a naive byte comparison, it could leak the view key via a timing attack.

caveat: a user can see if another user's subaddress is
stored on the server by querying with a bad view key
All addresses stored prior *must* have been primary addresses,
therefore explicitly mark them all as primary addresses
@trasherdk
Copy link
Contributor

I'm curious to why you choose to return bad view key and not just a 401 response?

@j-berman
Copy link
Contributor Author

j-berman commented Dec 30, 2021

I'm curious to why you choose to return bad view key and not just a 401 response?

TL;DR because I believe a timing attack could theoretically still leak whether or not it's stored, and figured since it's not a perfect defense, may as well return the clearest possible error to users who fat finger the view key (or copy the wrong one).

There are 2 cases that need to be handled:
(1): the signup case (create_account set to true)
(2): all other cases

On (1) the signup case, if a user provides a duplicate subaddress but different view key than one already stored, it would need to return successfully without writing the create request to the db, otherwise a failure response indicates the subaddress is already stored (in the happy case, when you provide a fresh subaddress/view key, it writes the create request to the db). Not writing the create request to the db takes less time than writing it to the db. So someone should be able to theoretically infer if the account is already stored or not by timing how long the response takes, regardless of what view key they provide.

On (2) it's a bit more subtle. If you hit a route and the server sees the subaddress is stored, it will compare the view key provided in the REST request with the one stored. This would take more time to do than if the subaddress simply were not present in the db in the first place. Therefore someone recording response times could potentially still learn that the view key they're providing is incorrect, and subaddress is stored or not. So I figured as long as it's not a perfect defense, might as well return bad view key to make it clearer what the issue is to an honest user fat fingering the view key.

The reason I suggested actually storing all address/view key pairs as a solution is to guarantee the round trip time to server is expected not to reveal any additional information to someone who doesn't have the correct view key.

EDIT: clarification

@trasherdk
Copy link
Contributor

Well, okay.. That kinda makes sense 😃

The timing attack part. Wouldn't that require the attacker to be on the same server to make sense?
At any time, I'm running at least 8 VPS on my server, each running multiple services, all sharing one physical network interface, all battling for CPU slots, not to mention RAM. Response times would depend on what else is going on, I would think.

@j-berman
Copy link
Contributor Author

I think you're right there's a solid chance the timing attack is impractical to pull off in a lot of (maybe even all?) cases. And perhaps that's a good enough reason to handle the return cases as described (signup always succeeds, others fail with non-descript error), rather than just assume all servers are automatically compromise-able by this theoretical timing attack.

But I would also guess for certain honest users and certain circumstances, they may be more likely to be vulnerable. E.g. maybe a user who runs the server on machine with a slow HDD or running low on space or other poor specs, and is only running monero-lws so response times are more consistent. Even then I'm not sure how feasible it is.

Maybe there is also some way for an attacker to hit the server in a way that would make it more likely to pull off with accuracy. I'm not sure.

It feels like leaving a potential blind spot that I felt more comfortable either just assuming is compromised, or removing the blind spot entirely. But perhaps you are right that maybe it's worth assuming it may be difficult if not impossible to pull off. Still, I wouldn't feel comfortable assuming I'm safe from the attack. I originally wrote the code this way we're discussing, and included a comment that there may be a timing attack vulnerability and it just felt wrong to leave in and justify the more complex behavior to always return success on signup, even when not technically succeeding and not writing the request to the db

@trasherdk
Copy link
Contributor

trasherdk commented Dec 31, 2021

So, assuming your environment has not been compromised, timing attacks are off the table.
Now, correct me if I got something wrong. Sub-addresses are derived from your primary. Why would you need/want to sign up again, using a sub-address/view-key, when you already got a primary account? Shouldn't you be able to fork out a sub-address, and expect your primary to register a "payment" ?

Edit: No matter what, one should always assume a hostile environment, and all users are not friendly. Just in case 😃

BTW: Happy new year 🍻

@j-berman
Copy link
Contributor Author

j-berman commented Jan 1, 2022

Upon re-reading prior discussions, I forgot I didn't handle the case when there are 3+ outputs in a tx involving subaddresses. Will implement it as part of this PR too.

Shouldn't you be able to fork out a sub-address, and expect your primary to register a "payment" ?

The server could implement a lookahead when scanning for outputs like wallet2 does. So for all primary addresses stored, it could also derive X subaddresses (50*200 is the default in wallet2), and then compare against all of these subaddresses when scanning for outputs. This would be harder to scale for a server (10k subaddress comparisons per output, for all primary addresses stored on the server).

Generally this would also be a bigger change to the light wallet REST API in that the tx's and outputs and info returned for a primary address are also from all subaddresses. I think a larger change to the API is ultimately warranted to support subaddresses in the most optimal way (also including support for the account/subaddress abstraction). The idea with this PR was to support them in a way that meshes with the API today without too much hassle, and later on think through a larger API change and collaborate with other light wallet ecosystem folks.

I discussed this^ a bit with @vtnerd over DM, and you can see some related discussion on it here.

Why would you need/want to sign up again, using a sub-address/view-key, when you already got a primary account?

Alternatively, without too much hassle we could maintain the format of the API, but on the backend require the primary address to be stored first before allowing new subaddresses to be added to the watch-list via /login. This is another plausible way to prevent the timing attack. When a user tries to login with a subaddress, the backend could first see if the view key is stored alongside the primary address, then derive a large number of subaddresses and see if the one the user provides is among them. It could start out with the default lookahead of 50*200, and need to think further on how to handle the case with more subaddresses.

So, assuming your environment has not been compromised, timing attacks are off the table.

I wouldn't say this with 100% confidence. I read this as "the attacker needs to have access to the environment to pull off the timing attack", but I think in theory the attacker may be able to pull it off without special privileges over the environment where the server is hosted. For example, if the user hosting the server has a disk that happens to write extremely slowly, then I think they'd be vulnerable to the timing attack in the signup case (i.e. in this case, it's just dependent on the user's existing environment, and isn't something the attacker necessarily needs to have control over).

Also: happy New Year to you as well!! 😃 an enjoyable discussion to cap the year :)

@j-berman j-berman changed the title Add support for subaddresses [WIP] Add support for subaddresses Jan 1, 2022
@trasherdk
Copy link
Contributor

Look-ahead on 10K sub-addresses could become expensive on a server hosting wallets for multiple users, and is probably not a good idea for a program with light in the name 😄 so some method to add sub-addresses is needed.

If sub-addresses are treated as new accounts, that's pretty much it. But if they are linked to a primary address, I think some kind of short lived session should exist, created by a login primary-address + view-key, or the adding of sub-addresses should be an extension to the login endpoint request. How is that?

@selsta
Copy link

selsta commented Jan 1, 2022

idea for a program with light in the name

The wallet is light, not the server :)

But did you measure how much resources it takes for 10k subaddresses lookahead? If I had to guess it should be next to unnoticeable with a slight RAM increase.

Subaddresses use a lookup table so there is no impact on scanning time.

@trasherdk
Copy link
Contributor

Syncing 10K blocks on a i7-5500/16GB gobbles up 80% CPU for 3-4 mins. one primary and no account/sub-addresses used, both client and server running off SSD. Not exactly light load.

But, if this thing can handle, say 100 primary, with 10K look-ahead, without noticeable impact on performance, that would be nice 😃

@selsta
Copy link

selsta commented Jan 1, 2022

Syncing 10K blocks on a i7-5500/16GB gobbles up 80% CPU for 3-4 mins. one primary and no account/sub-addresses used, both client and server running off SSD. Not exactly light load.

3-4 mins sounds a bit long for scanning 10k wallet blocks, but I don't have monero-lws compiled to test it myself. On my system monero-wallet-cli takes less than 20 seconds to scan 10k blocks.

But, if this thing can handle, say 100 primary, with 10K look-ahead, without noticeable impact on performance, that would be nice 😃

If implemented correctly subaddresses shouldn't have any impact on scanning time, only RAM, but this has to be tested :)

@trasherdk
Copy link
Contributor

The 10K blocks synced was done by monero-wallet-gui on Windows, against a "local" monerod running on Slackware 14.2.

Connecting to monerod and syncing a wallet, seems to have very limited impact on the server +2-3% CPU

@j-berman
Copy link
Contributor Author

j-berman commented Jan 1, 2022

The 10,000 subaddress comparisons per output could just be a hash table lookup (constant time avg lookup), but I would think the 10,000 subaddress public spend keys would be read into memory for each primary address stored on the server. Napkin math: 10,000 * 32 bytes * number of primary addresses stored = 0.32 mb per primary address stored = 1 GB for ~3,000 primary addresses = 10 GB for ~30,000 addresses = 100 GB for ~300,000 addresses etc. So it would become a significant consumer of RAM when we're talking 10's-100's of thousands of primary addresses stored on the server.

An additional caveat, as mentioned in that other subaddress PR, because of the 3+ output case, when scanning tx's involving subaddresses searching for outputs belonging to subaddresses, I'll need to use the additional tx public keys stored for each output to derive shared secrets for each additional tx public key. This will add to scanning time depending on how many 3+ output tx's involving subaddresses there are.

Naturally it makes sense to discuss the resource cost of a lookahead approach when scanning here, but just to be clear I think deciding on this optimal lookahead approach is out of scope for this PR, mostly since it would need a larger change to the light wallet API which is upstream

The goal with this PR is to get subaddresses working without a change to the API. Which makes sense to do for the long term anyway imo in case a user wants to more granularly work with subaddresses.

If sub-addresses are treated as new accounts, that's pretty much it

That's this PR's approach to get initial subaddress support working :)

But if they are linked to a primary address, I think some kind of short lived session should exist, created by a login primary-address + view-key, or the adding of sub-addresses should be an extension to the login endpoint request. How is that?

This is doable without adding to the endpoint's expected params instead of the approach taken in this PR, like how I mentioned starting at "Alternatively" in the above comment. Just require the user hit the server with the primary address first. This is a tradeoff though: if you give the server your primary address and view key, the server can derive all subaddresses and see all outputs belonging to your account. Maybe users might not want to give their primary address to the server for this reason, to allow scanning of some subaddresses on a case-by-case basis when the user wants. Figure it would make sense to leave as an option for users

@trasherdk
Copy link
Contributor

I don't think the memory requirement are scary at all. If you need to host 10K primary accounts with sub-addresses, you probably can afford dedicated hardware to run it on.

Also, I don't see the use-case, where you would need a hosted wallet, for single dedicated sub-addresses, but maybe that's just me. Wasn't sub-addressed supposed to be single use anyway?

Yeah, we got a bit side tracked withe the look-ahead discussion. I still think a 401 response is better than a more descriptive error, thus not leaking information.

@j-berman
Copy link
Contributor Author

j-berman commented Jan 2, 2022

If you need to host 10K primary accounts with sub-addresses, you probably can afford dedicated hardware to run it on.

Fair point, though at some point it would be infeasible (3 million addresses = 1 TB of RAM, 30 million = 10 TB etc.) [Edit: though at that point, some type of sharding solution probably makes sense to scan on different shards in parallel anyway]

Also, I don't see the use-case, where you would need a hosted wallet, for single dedicated sub-addresses, but maybe that's just me. Wasn't sub-addressed supposed to be single use anyway?

Fair enough, I agree not a strong use case. Probably would be very few users if any who wouldn't send their primary address to a light waller server if they're using a light wallet server and were given the option not to. Not much reason to do that.

So perhaps it would be a reasonable solution in this PR to require the primary address be stored before storing any subaddresses for an existing view key. The only challenge with this approach then is dealing with the lookahead in a solid way.

I'm ok with the non-descriptive error approach as well (even though I would still feel uneasy about the timing attack vector), but curious to hear your thoughts on the solution to allow duplicate subaddresses to be stored for different view keys, and for admins to manually accept create/import requests by referencing a view key rather than an address? This should definitively close the timing attack vector.

In summary, there are 4 options on the table for this PR:
(1) as is with descriptive error that says "bad view key"
(2) nondescriptive error + signup always returns success (timing attack may still be an issue)
(3) require primary address be stored first before storing subaddresses (need to handle lookahead correctly)
(4) allow duplicate subaddresses to be stored

I personally still like option 4 if we don't want to leak that a subaddress is stored. It is simplest to reason through I think

@trasherdk
Copy link
Contributor

I guess I don't understand your option 4. Is the view-key derived per sub-address or the primary view-key?

I like the login with primary address/key thingy.
That would fail probes early, while costing a DB search (time), returning valid or invalid. 200 Ok token or 401 Invalid credential
That would give option to validate sub-address/view-key doing some math.

I also think, you are much deeper into this Monero thingy, than me, so I'm probably better served by your instinct 😃

3+ output transactions involving subaddresses place derivations
in tx extra additional tx pub keys. Be sure to check
additional derivations when scanning outputs sent to subaddress
@j-berman j-berman changed the title [WIP] Add support for subaddresses Add support for subaddresses Jan 4, 2022
@j-berman
Copy link
Contributor Author

j-berman commented Jan 4, 2022

Update: scanner now picks up on 3+ output tx's involving subaddresses


I guess I don't understand your option 4. Is the view-key derived per sub-address or the primary view-key?

In option 4, there would be 0 server-side validation that the subaddress is derived from the secret view key. The server would simply assume any subaddress / secret view key pairs it receives are valid and store them. Even if the provided subaddress is not derived from the secret view key, it will store in the db and the server would attempt to scan outputs sent to this invalid subaddress / secret view key pair. The scanner simply would not pick up on outputs sent to this invalid pair, and the server would never return tx's associated with this invalid key pair (get_address_txs e.g. would succeed with an empty array every time). If the honest user comes along later and tries to login with the valid subaddress / secret view key pair, then the server would allow them to, would store this new key pair, would be able to scan and return tx's associated with this correct key pair etc.

Option 3 has grown on me too for the same reason: it fails early on and the DB is kept clean with valid subaddresses. I'll think further/look to wallet2 some more on handling the lookahead in that option.

I also think, you are much deeper into this Monero thingy, than me, so I'm probably better served by your instinct smiley

You've been around longer than me so your instincts/insights have been super helpful :) EDIT: like on the subaddress use case thing discussed above where I thought maybe users might not want to send primary addresses to the server if they can avoid it. Helpful to realize that's probably not something people would want to do, and so option 3 actually seems pretty solid

@j-berman
Copy link
Contributor Author

j-berman commented Jan 6, 2022

Heads up: going to start using the term "standard" address instead of "primary" since it seems more widely referred to as "standard" address.

More importantly, I think this is what both you guys (@trasherdk and @selsta) were initially thinking too, but I'm starting to think the simplest (and most future-proof) thing would be for the server to just not accept subaddresses over the API at all, rather:

  • the server would just implement whatever default lookahead the server host wants (monero-lws could default to 50*200)
  • /get_address_info, /get_address_txs, and /get_unspent_outs would return data for an entire wallet, rather than just the standard address
  • the client would manage subaddress creation on its end without touching the server (client adds "accounts" and "addresses within accounts" locally, gives them out to counter-parties when requesting payment, server will pick up on them via lookahead when scanning for received outs)

With this approach, the API wouldn't really change much, just that the aforementioned endpoints would return wallet-level info by default, rather than standard address-level info.

If there is demand from lots of individual light wallet users to pre-load tons of subaddresses before accepting payments to them, the server can also support a user-specific desired lookahead over the /login endpoint. This seems like more of a nice-to-have than a necessity imo. I think we can get away without it until the transition to the next gen protocol.

I think this API approach is also most in line with what's brewing with JAMTIS. In JAMTIS, the server wouldn't even be able to separate data by subaddresses, and would just scan wallet-level potential received outs. The client in JAMTIS would manage the lookahead and subaddress state.

Ultimately, it just does not seem desirable to treat subaddresses as distinct scanned wallets at the API layer and on the backend as this PR does. As discussed above, users aren't going to use the feature like that. Users will want wallet-level data, and that seems to be the long term direction we are headed anyway with JAMTIS. It doesn't really seem useful to allow subaddresses in place of the address param in the API, and it brings annoying complications with it (like how to safely deal with the leak as discussed above).

Planning to submit a PR to the light wallet REST API spec fleshing this out. And I'll probably end up closing this PR and opening a new one implementing that.

@j-berman j-berman changed the title Add support for subaddresses [Do not merge] Add support for subaddresses Jan 7, 2022
@vtnerd
Copy link
Owner

vtnerd commented Apr 6, 2024

This has already been implemented, so the PR is no longer relevant.

@vtnerd vtnerd closed this Apr 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants