Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove scope from access log when using json logging formatter #859

Merged
merged 1 commit into from
Dec 9, 2020

Conversation

euri10
Copy link
Member

@euri10 euri10 commented Nov 20, 2020

Fixes #858 as a 1st step that ensures no sensitive headers are leaked through scope in access logs in case people are using a json formatter

Will work on an implementation that let you access those headers / scope later when there is consensus / discussion about how it should be done.

Gunicorn let you print it with a custom format

Copy link
Member

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Is there any chance this could be breaking to some users? Any case, this 1st step seems like a clear yes; we can reassess a proper implementation afterwards.

@euri10 euri10 merged commit 0660e74 into encode:master Dec 9, 2020
@euri10
Copy link
Member Author

euri10 commented Dec 9, 2020

Sounds good. Is there any chance this could be breaking to some users? Any case, this 1st step seems like a clear yes; we can reassess a proper implementation afterwards.

I think it can. It requires quite a specific setup though so chances are slim, but you never know, people get fancy when they need something, at least we'll know :)

Screenshot_20201112-211749

florimondmanca added a commit that referenced this pull request Dec 9, 2020
@florimondmanca florimondmanca mentioned this pull request Dec 9, 2020
@euri10 euri10 deleted the 858_leaked_scope branch December 12, 2020 10:31
euri10 added a commit to euri10/uvicorn that referenced this pull request Dec 12, 2020
euri10 added a commit to euri10/uvicorn that referenced this pull request Dec 12, 2020
@euri10 euri10 mentioned this pull request Dec 12, 2020
florimondmanca added a commit that referenced this pull request Dec 12, 2020
* Version 0.13.1

* Add #859

* Apply suggestions from code review

Co-authored-by: euri10 <[email protected]>

Co-authored-by: euri10 <[email protected]>
@alippai
Copy link

alippai commented Dec 17, 2020

I used to put the username from request / response headers into the access log using a custom formatter. How should I do that in the new version?

@euri10
Copy link
Member Author

euri10 commented Dec 18, 2020

I used to put the username from request / response headers into the access log using a custom formatter. How should I do that in the new version?

hi @alippai sorry to hear I broke your use case !

If you could open a feature request with what you previously we can discuss and see if we can provide a solution in uvicorn directly

in the meantime, I'd take a look at opentelemetry, I personally uses it to trace my apps and adding the username is rather simple. There are probably other solutions, but that's the 1st that comes to mind.

@alippai
Copy link

alippai commented Dec 18, 2020

@euri10 I'm always happy to move forward, opentelemetry sounds promising, there is no need to maintain legacy on a free, open-source project. That kind of support just burns out the developers and kills the project eventually.
Perhaps you have a tip for a few questions:

@euri10
Copy link
Member Author

euri10 commented Dec 18, 2020

@euri10 I'm always happy to move forward, opentelemetry sounds promising, there is no need to maintain legacy on a free, open-source project. That kind of support just burns out the developers and kills the project eventually.
Perhaps you have a tip for a few questions:

Yes ! Use an instrumentation : it adds a middleware that produces "automatic" traces, the asgi one is probably what you want, I know there is one for Starlette and FastAPI too, I'm also using the asyncpg one for the db...

then should you need more granular / precise information you can always add things yourself to the traces.

For instance I'm doing on some endpoints: trace.get_current_span().set_attribute("enduser.id", current_user.email)

as for the collector I'm using Cloud Trace which works fine so far, but I must say I have a very superficial view of this part so I won't be your best advisor :)

It's a brand new world so pretty scary at first but the guys at opentelemetry-python on gitter are super reactive and helpful so I'd try and see if I were you :)

@gregbeech
Copy link

This change has broken our custom loggers which were using the scope to send the relevant information to Datadog using its standard logging attributes. I can't see any way to access half the information now such as user agent, etc. in logs. We get some of the information in traces but those are sampled so it's not the full record of things happening. Is there going to be some way to access the request data like headers (for User-Agent) and so on brought back?

@euri10
Copy link
Member Author

euri10 commented Feb 2, 2021

This change has broken our custom loggers which were using the scope to send the relevant information to Datadog using its standard logging attributes. I can't see any way to access half the information now such as user agent, etc. in logs. We get some of the information in traces but those are sampled so it's not the full record of things happening. Is there going to be some way to access the request data like headers (for User-Agent) and so on brought back?

sorry for that @gregbeech
currently the only way I see to access the headers would be in a middleware at the app level, so that doesn't fit your reqs that it's found in the logs.
maybe @florimondmanca has some ideas !
there's currently no work on adding that back, but indeed that's something worth considering.

@florimondmanca
Copy link
Member

florimondmanca commented Feb 2, 2021

@gregbeech Hi,

Yeah, same as @euri10 I would say a custom middleware to dump any info you'd like to Datadog sounds more sensible. Relying on extras passed by the logger sounds... Well, relatively okay tbh, just too constrained when à middleware you have full control over should do just fine. Hope that helps.

Edit: hmm, looks like I misread your comment. You were sending logs from Uvicorn to DD and attaching attributes from extras, I see. Can you do things differently, perhaps using a parser if that info is present in the log lines? You might also be able to wrap the app to log things manually with whichever info from the ASGI scope you're interested in. I don't think we'll be adding this back in core given the security discussions above.

@euri10 euri10 mentioned this pull request Feb 24, 2021
@tonkolviktor
Copy link

Hi @florimondmanca, @euri10,

thanks for the discussion so far. I'd like to keep the discussion open.

I would reather easily agree that having the option to dump the whole scope object to the log is not ideal.
However I still see the value being able to customize log messages / the alternatives are also far from ideal.

My use-case is that, I get in a token via HTTP Header. I check/resolve it to a token name, now I want to add it to the access log (eg via logging a response header value). This way I would have a proper access log.

  1. Hacking with logging
    a)
    One option is to disable the "official" access log and implement my from 0. (I cannot have two access logs)
    In this case I have to copy/reimplement all the code/tought that went into how to do this properly (catching exception, formatting messages, etc)
    b)
    I could just add my extra information in a separate log line, however the correlation becomes difficult: matching process id, timestamp, etc.

  2. Starlette (FastAPI)
    I have a problem that a custom Starlette middleware cannot be the most outter one:
    https://github.com/encode/starlette/blob/master/starlette/applications.py#L68
    This means if I have my own custom access logging, but I change the response code in a 'Exception' handler I'll not see that change in my logging.

  3. Using a telemetry/prometheus could be an option, but seems to be a bit of an overkill.
    a) still fails on point 2.
    b) if we now talk about bigger companies than a small startup, one has some requirements to preserve system acces information. This is typically done via logfiles.

It would make me happy if you'd consider the above points and think of a mutually acceptable way of customizing the logs when required.

Viktor

@KlausGlueckert
Copy link

KlausGlueckert commented Jun 21, 2021

@euri10 : this change also broke our complete custom logging. We need to log Request-Header information in fastapi, so we wrote a custom logger to access the "scope" key to get to the header infos. Now we must revert back to 0.13.0 unfortunately and have no solution.

Here is how we did it: https://stackoverflow.com/questions/62894952/fastapi-gunicon-uvicorn-access-log-format-customization

Ideally, these infos should somehow be accessible programmatically when defining a custom logger to be used from FastAPI

@euri10
Copy link
Member Author

euri10 commented Jun 21, 2021

#947 is probably something worth looking at if I understand your issue correctly

@KlausGlueckert
Copy link

@euri10 i checked quickly, not sure if it makes it possible to access header infos, it just talks about tracking request/response times.

This is also an important needed feature. My hack for this was to write this in the header as an attribute from within the fast-api async function of the endpoint. In that way I can measure the "server-time", and not the response time which is biased by a lot of other things (location, bandwith etc). Then I can access als this info in the "scope" and print it to console. We use Splunk for parsing the logs, so I just need to print my logs to the console, done.

Therefore, I also agree with @Kludex that this customization is probably best made from within uvicorn

@Kludex
Copy link
Sponsor Member

Kludex commented Jun 21, 2021

In theory, that PR will allow writing header info in the access format as well.

@euri10
Copy link
Member Author

euri10 commented Jun 21, 2021

In theory, that PR will allow writing header info in the access format as well.

this, a solid alternative being a middleware.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scope sent in access logs by default
7 participants