Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: IP Address Formatting in DB #153

Open
2 tasks
nelsonic opened this issue Mar 11, 2022 · 10 comments
Open
2 tasks

BUG: IP Address Formatting in DB #153

nelsonic opened this issue Mar 11, 2022 · 10 comments

Comments

@nelsonic
Copy link
Member

On our catch up call, @SimonLab noted that the IP addresses are not being stored correctly.

flyctl postgres connect -a hits-db
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
--More--

Ref: https://en.wikipedia.org/wiki/0.0.0.0

image

I suspect this might be a Fly.io thing ...

Todo

  • Investigate why the IP addresses aren't being stored correctly by briefly logging them to sdout (IO.inspect) so that we can simply tail the logs and see if the issue is at the database level.
    • Report back your findings!
@nelsonic
Copy link
Member Author

The reason we want to fix this ASAP is so that we can start doing Geo IP lookups for #27 🗺️

@SimonLab
Copy link
Member

from: https://elixirforum.com/t/how-to-show-clients-ipv4-on-page/41110

By default, Plug puts the IP of the peer in conn.remote_ip. When running behind a load balancer, that will be the internal IP of the load balancer.

Could be due to https://fly.io/docs/about/pricing/#anycast-ip-addresses maybe

@nelsonic
Copy link
Member Author

Good detective work! 🔍 👍
So this is the IP address of the load balancer forwarding the requests ... 🤦‍♂️
Please do some IO.inspect logging to find out if we can get the client IP. 🤞

@SimonLab
Copy link
Member

Looking at https://fly.io/docs/reference/runtime-environment/#request-headers
specifically:
image

image

and

image

@SimonLab
Copy link
Member

SimonLab commented Mar 15, 2022

I've added some logs to see the requests headers and deploy the application (flyctl deploy):
image

This hit is me trying to see my ip address, however the fly-client-ip and the x-forwarded-for don't contain the right ip and might still refer to fly.io server.

We can see the useragent is github-camo and this might be why my ip address is not displayed when visiting the badge via the github page:
image

see https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/about-anonymized-urls

We can see the image src is different:
image

Visiting the badge url directly without using the Github (eg https://hits.dwyl.com/dwyl/hits.svg), in this case I can see my ip address correctly in the x-forwarded-for header and the useragent matches my browser

@nelsonic
Copy link
Member Author

@SimonLab good detective work. Makes sense. So GitHub is proxying requests for all externally hosted images. 👍
I think in that case we should just add a check for the github-camo useragent and register it in the DB. 💾
We aren't going to be able to do any Geo lookups which is both sad 😞 and great because it's one fewer feature. 💭

@SimonLab
Copy link
Member

Yes unfortunately I don't think it's possible to get the locations as I guess most of the badge are viewed via Github.

I think in that case we should just add a check for the github-camo useragent and register it in the DB

We already save the useragent in the database with

useragent = Useragent.insert(%{"name" => useragent, "ip" => ip})

or do you mean something else to add?

@nelsonic
Copy link
Member Author

Yeah, that's probably enough. I was just thinking that they might have a block of IP addresses that github-camo makes outbound requests from and we could just ignore them completely if the ua is github-camo. But what we already have is fine. 👍
Let's not spend any more time on this from the GitHub Hits perspective and simply move onto other higher priority tasks.
It would be good to store the full useragent + IP data for other pages e.g: https://tachyons-bootstrap.dwyl.com/ or at least check if that page, given that it's hosted by GitHub pages, is proxying the hits badge requests ... 💭

@SimonLab
Copy link
Member

SimonLab commented Mar 15, 2022

I just checked which request header values we get visiting https://tachyons-bootstrap.dwyl.com/ and the ip address matches correctly. I'll create a PR to extract the ip address from x-forwarded-for header.

see: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For
image

@nelsonic
Copy link
Member Author

@SimonLab nice one. Thanks. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants