
Add indexed addresses to pool-v2.json #43

Closed
wants to merge 1 commit into from


@natsoni natsoni commented Dec 4, 2023

This PR adds many of the indexed coinbase addresses in the block DB to pools-v2.json.
I used this script to generate the file (I can add it to the PR if needed).

To avoid bloating the file with too many addresses, I did not add those of pools that pay miners directly in the coinbase transaction: doing so results in thousands of addresses per mining pool (e.g. Luke-Jr's OG pool Eligius). I can include them if needed, but it felt like too much.

@sha2fiddy

Hi @ncois, could you please describe how you are differentiating between pools that pay directly from the coinbase tx to miners, and other coinbase tx's which have multiple outputs?

Without knowing your exact filtering logic, I do see value in including all outputs of coinbase tx's. I myself could filter down the complete list by >1 tx paying to each address, and/or by total amount sent to each address... but I understand if most users would prefer a more concise list.
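The filtering described here could look roughly like the sketch below. It is only an illustration: the `(txid, address, value_sats)` tuple layout, the function name `filter_addresses`, and both thresholds are assumptions, not anything taken from the PR's script. It combines the two criteria with "and"; an "or" variant would be a one-word change.

```python
from collections import defaultdict


def filter_addresses(outputs, min_txs=2, min_total_sats=0):
    """Keep addresses paid by at least `min_txs` coinbase txs and
    receiving at least `min_total_sats` in total.

    `outputs` is an iterable of (txid, address, value_sats) tuples,
    one per coinbase output (hypothetical layout).
    """
    txs_per_address = defaultdict(set)
    total_per_address = defaultdict(int)
    for txid, address, value_sats in outputs:
        txs_per_address[address].add(txid)
        total_per_address[address] += value_sats
    return sorted(
        address
        for address in txs_per_address
        if len(txs_per_address[address]) >= min_txs
        and total_per_address[address] >= min_total_sats
    )
```

With the default `min_txs=2`, an address that appears in a single coinbase transaction is dropped, which matches the ">1 tx paying to each address" idea.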

@natsoni
Author

natsoni commented Dec 5, 2023

Hi @sha2fiddy, thanks for the feedback!

The data table I used only contains one coinbase address per block (the address of the first output of the coinbase).
For example, if there is more than one spendable output in the coinbase transaction, like this one, the result of the command SELECT coinbase_address FROM blocks WHERE height=414563 is only 1JBJTJy5dNr5dKrXwWyb5XVy8sSofCzDh2.
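To make the "one address per block" limitation concrete, here is a minimal sketch that runs the same query against an in-memory SQLite table. The two-column schema is an assumption for illustration only (the real block DB has a different engine and more columns), and `make_demo_db` and `coinbase_address_at` are hypothetical helper names.

```python
import sqlite3


def make_demo_db():
    """Build a tiny stand-in for the blocks table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE blocks (height INTEGER PRIMARY KEY, coinbase_address TEXT)"
    )
    # Only the first coinbase output's address is stored per block.
    conn.execute(
        "INSERT INTO blocks (height, coinbase_address) VALUES (?, ?)",
        (414563, "1JBJTJy5dNr5dKrXwWyb5XVy8sSofCzDh2"),
    )
    return conn


def coinbase_address_at(conn, height):
    """Return the single indexed coinbase address for a block, or None."""
    row = conn.execute(
        "SELECT coinbase_address FROM blocks WHERE height = ?", (height,)
    ).fetchone()
    return row[0] if row else None
```

Whatever other outputs the coinbase transaction has, a table shaped like this cannot return them.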

> could you please describe how you are differentiating between pools that pay directly from the coinbase tx to miners, and other coinbase tx's which have multiple outputs?

Looking at each coinbase transaction separately, I can't tell pools that pay miners directly from the coinbase tx apart from other coinbase txs with multiple outputs. Worse, the script only has access to the first output of each coinbase, so it can't "see" all the addresses of these special transactions.

However, I was able to roughly differentiate these two kinds of pools by looking at the size of the set of all addresses used by each pool. I figured that for pools paying miners directly, the first coinbase output often differs from block to block (i.e. the coinbase address "seen" by the script is often a new one when looping through the blocks), resulting in thousands of addresses associated with that pool. Meanwhile, regular pools whose coinbase txs have multiple outputs generally don't change the first output of the coinbase that often, resulting in a much lower number of associated addresses. Of course, these criteria also depend on the size of the pool and how long it was active, so the classification is far from perfect.
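A rough sketch of this heuristic, not the actual script: collect the distinct first-output addresses per pool and flag pools whose set grows past a cutoff. The `(pool_name, first_output_address)` input layout, the label strings, and the `max_addresses=1000` cutoff are all assumptions; the thread only says "thousands".

```python
from collections import defaultdict


def classify_pools(blocks, max_addresses=1000):
    """Label each pool by how many distinct first-output addresses it used.

    `blocks` is an iterable of (pool_name, first_output_address) pairs,
    one per block (hypothetical layout). The cutoff is an assumed value.
    """
    addresses_per_pool = defaultdict(set)
    for pool_name, address in blocks:
        addresses_per_pool[pool_name].add(address)
    return {
        pool_name: (
            "pays_miners_directly"
            if len(addresses) > max_addresses
            else "regular"
        )
        for pool_name, addresses in addresses_per_pool.items()
    }
```

As noted in the comment, a large, long-lived regular pool can also cross any fixed cutoff, so a count alone will misclassify some pools.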

Hope this explanation is not too confusing. This is quite empirical but I thought it was a reasonable guess.

> Without knowing your exact filtering logic, I do see value in including all outputs of coinbase tx's. I myself could filter down the complete list by >1 tx paying to each address, and/or by total amount sent to each address... but I understand if most users would prefer a more concise list.

Absolutely, it would be better to include all coinbase outputs and also their values, but the table I worked with didn't contain this information and indexing it myself might take a while. I might try tho.

@sha2fiddy

@ncois, understood and thanks for the detailed response. While most pools do spend the entire coinbase to a single address, this is not always the case and certainly is not enforced by any protocol rules. I often wondered why a lot of block explorers / data providers only display the first indexed output 🤔 as a single "coinbase address".

In any case, great work and thanks for this contribution - more block-to-pool association data here is good.

@natsoni
Author

natsoni commented Dec 6, 2023

Hi @sha2fiddy, I just uploaded a table indexing all the addresses that ever appeared in a coinbase transaction, which I thought you would be interested in. For each address, it gives the pool(s) the address is associated with and the number of sats it mined with each of those pools.
You can find the data here: https://raw.githubusercontent.com/ncois/mining-pools/index-addresses/mining_addresses.csv

A row looks like this:

| address | pool_ids | pool_values |
| --- | --- | --- |
| 3KfRwBSLzeGcZ76SHog2m9Kkhnb6N8dzSc | [6, 98] | [147319130014, 6348135066] |

This means that the address 3KfRwBSLzeGcZ76SHog2m9Kkhnb6N8dzSc is associated with:

  • BTC.com (ID 6) because it mined 1473.19130014 BTC with this pool
  • okpool.top (ID 98) because it mined 63.48135066 BTC with this pool
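The row above can be read as in the following sketch, which pairs each pool ID with its value and converts sats to BTC. It assumes the CSV has a header row with the three column names shown and that the bracketed lists are quoted fields parseable as JSON; the exact quoting of mining_addresses.csv is not confirmed here, and `parse_rows` is a hypothetical name.

```python
import csv
import io
import json
from decimal import Decimal

SATS_PER_BTC = Decimal(100_000_000)


def parse_rows(text):
    """Yield (address, {pool_id: btc_mined}) for each CSV row.

    Assumes columns `address`, `pool_ids`, `pool_values`, with the two
    list columns stored as quoted JSON-style arrays.
    """
    for row in csv.DictReader(io.StringIO(text)):
        pool_ids = json.loads(row["pool_ids"])
        pool_values = json.loads(row["pool_values"])
        # Decimal avoids float rounding when dividing by 1e8.
        yield row["address"], {
            pool_id: Decimal(sats) / SATS_PER_BTC
            for pool_id, sats in zip(pool_ids, pool_values)
        }


sample = (
    "address,pool_ids,pool_values\n"
    '3KfRwBSLzeGcZ76SHog2m9Kkhnb6N8dzSc,"[6, 98]","[147319130014, 6348135066]"\n'
)
rows = list(parse_rows(sample))
```

For the sample row, pool 6 maps to 1473.19130014 BTC and pool 98 to 63.48135066 BTC, matching the figures in the list above.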

I will try to integrate this data directly on the address page of the explorer.

@nymkappa
Member

nymkappa commented Dec 7, 2023

@0xB10C you may be interested in this PR. Would be interesting to get your point of view on this. Maybe @ncois would apply this to https://github.com/bitcoin-data/mining-pools as well.

@natsoni
Author

natsoni commented Dec 7, 2023

I am closing this PR to open a new one that adds a file, pools_addresses.csv, listing all addresses related to the mining pools referenced in pools-v2.json, along with the scripts used to generate the data. The table will be used to implement issue mempool/mempool#513.

@natsoni natsoni closed this Dec 7, 2023
@natsoni natsoni deleted the add-indexed-addresses branch December 8, 2023 12:13