Support response.raw #438

Open
xyb opened this issue Nov 19, 2024 · 7 comments
Labels
enhancement New feature or request

Comments


xyb commented Nov 19, 2024

Is your feature request related to a problem? Please describe.
While developing a plugin for HTTPie, I noticed that curl_cffi's requests-like interface does not expose the response.raw attribute, which HTTPie requires.

Describe the solution you'd like

>>> from curl_cffi.requests import Session
>>> with Session(impersonate="chrome") as session:
...   response = session.get("https://httpbin.org/get")
...   print(response.raw)
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
AttributeError: 'Response' object has no attribute 'raw'

>>> from requests import Session
>>> with Session() as session:
...   response = session.get("https://httpbin.org/get")
...   print(response.raw)
...
<urllib3.response.HTTPResponse object at 0x103a09c30>

Describe alternatives you've considered
None

Additional context
None

@xyb xyb added the enhancement label Nov 19, 2024
Author

xyb commented Nov 19, 2024

I’ve uploaded my plugin to demonstrate how to reproduce the issue:

❯ httpie --debug plugins install httpie-curl-cffi

❯ http --debug https://httpbin.org/get
...
http: error: AttributeError: 'Response' object has no attribute 'raw'


Traceback (most recent call last):
  File "/Users/xyb/.virtualenvs/httpie-curl-cffi/bin/http", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/xyb/.virtualenvs/httpie-curl-cffi/lib/python3.11/site-packages/httpie/__main__.py", line 9, in main
    exit_status = main()
                  ^^^^^^
  File "/Users/xyb/.virtualenvs/httpie-curl-cffi/lib/python3.11/site-packages/httpie/core.py", line 162, in main
    return raw_main(
           ^^^^^^^^^
  File "/Users/xyb/.virtualenvs/httpie-curl-cffi/lib/python3.11/site-packages/httpie/core.py", line 140, in raw_main
    handle_generic_error(e)
  File "/Users/xyb/.virtualenvs/httpie-curl-cffi/lib/python3.11/site-packages/httpie/core.py", line 100, in raw_main
    exit_status = main_program(
                  ^^^^^^^^^^^^^
  File "/Users/xyb/.virtualenvs/httpie-curl-cffi/lib/python3.11/site-packages/httpie/core.py", line 213, in program
    for message in messages:
  File "/Users/xyb/.virtualenvs/httpie-curl-cffi/lib/python3.11/site-packages/httpie/client.py", line 114, in collect_messages
    response = requests_session.send(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xyb/.virtualenvs/httpie-curl-cffi/lib/python3.11/site-packages/requests/sessions.py", line 718, in send
    extract_cookies_to_jar(self.cookies, request, r.raw)
                                                  ^^^^^
AttributeError: 'Response' object has no attribute 'raw'

Owner

lexiforest commented Nov 19, 2024

Unfortunately, it's not possible to implement this attribute. curl/libcurl always decompresses the response in the streaming callback, whereas response.raw should return a stream of the raw, i.e. still compressed, content.
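
To make the distinction above concrete, here is a minimal standard-library sketch. It assumes a gzip-encoded body and does not touch curl_cffi or requests at all; `payload`, `wire_bytes`, `raw_stream` and the other names are made up for illustration.

```python
import gzip
import io

# Hypothetical response body, for illustration only.
payload = b'{"url": "https://httpbin.org/get"}'

# What travels on the wire when the server answers with Content-Encoding: gzip.
wire_bytes = gzip.compress(payload)

# A raw stream, as response.raw is described above, yields the wire bytes untouched.
raw_stream = io.BytesIO(wire_bytes)
raw_chunk = raw_stream.read()

# A client that decompresses before handing data over only ever shows this form.
decoded = gzip.decompress(wire_bytes)

print(raw_chunk[:2] == b"\x1f\x8b")  # gzip magic number is still present
print(decoded == payload)
```

Once a client has decompressed the data before handing it to Python, the `raw_chunk` form can no longer be recovered, which is the limitation described in the comment above.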

@lexiforest lexiforest closed this as not planned Nov 19, 2024
Owner

It may be possible with curl_easy_recv, but that would take a significant amount of work. Let's keep this open and revisit it in the future.

@lexiforest lexiforest reopened this Nov 19, 2024
Author

xyb commented Nov 19, 2024

> Unfortunately, it's not possible to implement this attribute. curl/libcurl will automatically unzip the response in the streaming callback no matter what, whereas response.raw should return the streaming content of the raw i.e. compressed content.

As shown in the previous example, response.raw is an instance of urllib3.response.HTTPResponse. However, implementing response.raw directly may not be the best solution. It would be more effective to trace the call stack and find the most appropriate entry point for addressing the issue. Unfortunately, that requires a deeper dive into the requests library, which I am unable to dedicate time to at the moment.

Owner

Hi, I just took another look at your stack trace. It seems that what is missing here is requests.Session.send(), not response.raw, so the problem is now simpler to solve.


vevv commented Nov 20, 2024

Here's a wrapper I use to add raw-like reading functionality. I only ever need content that's either decompressed or not compressed to begin with, so this works well for me. Though even with requests I've never run into compressed content.

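from curl_cffi import requests  # assumed: the Response type discussed in this thread


class RawReader:
    """Minimal stand-in for response.raw; yields decoded (not raw) bytes."""

    def __init__(self, response: requests.Response):
        self.response = response
        self._chunks = response.iter_content()  # create the iterator only once
        self._buffer = b""  # bytes already fetched but not yet returned

    def read(self, amt: int | None = None):
        # Fetch chunks until amt bytes are buffered (or until EOF if amt is None).
        while amt is None or len(self._buffer) < amt:
            chunk = next(self._chunks, None)
            if chunk is None:
                break
            self._buffer += chunk

        if amt is None:
            amt = len(self._buffer)
        # Return at most amt bytes and keep the remainder for the next call.
        data, self._buffer = self._buffer[:amt], self._buffer[amt:]
        return data

# resp.raw = RawReader(resp)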


ligix commented Jan 5, 2025

I came across a situation where I needed access to the (decoded) contents of the response via a file-like object.

Below is the wrapper class I used, with complete type hints (only read, close and the ability to use it as a context manager are implemented, since I didn't need more than that).

response_to_file_like.py
from curl_cffi import requests
from typing import BinaryIO, IO, Iterable, Iterator
from collections.abc import Buffer  # requires Python 3.12+

ReadableBuffer = Buffer | bytes


class ResponseToFileLike(BinaryIO):
    def __init__(self, response: requests.Response, chunk_size: int|None=512) -> None:
        self.response = response
        self.data_iterable: Iterator[bytes] = response.iter_content(chunk_size=chunk_size)
        self.extra_data = bytearray()

    def __enter__(self):
        return self

    def close(self):
        self.response.close()

    def fileno(self):
        raise NotImplementedError

    def flush(self):
        raise NotImplementedError

    def isatty(self):
        raise NotImplementedError

    def read(self, n: int|None=-1, /):
        # The glossary lists binary files as one of the categories of file objects
        # and names io.BytesIO as an example of a binary file.
        # io.BufferedIOBase.read documents the behaviour mirrored here:
        # a size of None or a negative size reads until EOF.
        # https://docs.python.org/3/glossary.html#term-file-object
        # https://docs.python.org/3/glossary.html#term-binary-file
        # https://docs.python.org/3/library/io.html#io.BufferedIOBase.read
        if n is None or n < 0:
            # Drain the iterator and return everything. Slicing with [:-1]
            # here would silently drop the last byte.
            for c in self.data_iterable:
                self.extra_data.extend(c)
            ret, self.extra_data = self.extra_data, bytearray()
            return bytes(ret)

        needed_data = n - len(self.extra_data)

        try:
            while needed_data > 0:
                needed_data -= len(c := next(self.data_iterable))
                self.extra_data.extend(c)
        except StopIteration:
            pass

        ret, self.extra_data = self.extra_data[:n], self.extra_data[n:]
        return bytes(ret)

    def readable(self):
        raise NotImplementedError

    def readline(self, limit=-1, /):
        raise NotImplementedError

    def readlines(self, hint=-1, /):
        raise NotImplementedError

    def seek(self, offset, whence=0, /):
        raise NotImplementedError

    def seekable(self):
        raise NotImplementedError

    def tell(self):
        raise NotImplementedError

    def truncate(self, size=None, /):
        raise NotImplementedError

    def writable(self):
        raise NotImplementedError

    def write(self: IO[bytes], s: ReadableBuffer, /) -> int:
        raise NotImplementedError

    def writelines(self: IO[bytes], lines: Iterable[ReadableBuffer], /) -> None:
        raise NotImplementedError

    def __next__(self):
        raise NotImplementedError

    def __iter__(self):
        raise NotImplementedError

    def __exit__(self, type, value, traceback, /):
        self.close()
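
A possible alternative to spelling out every method, sketched here and not taken from the thread: the standard library's io.RawIOBase only needs readable() and readinto(), and wrapping it in io.BufferedReader then supplies read(), readline(), iteration and context-manager support. `IterStream` and the stub `chunks` list are hypothetical names. In real use the chunks would presumably come from `response.iter_content()` on a streaming response, which is an assumption this sketch does not exercise.

```python
import io
from typing import Iterable


class IterStream(io.RawIOBase):
    """Hypothetical adapter: expose an iterable of byte chunks as a raw stream."""

    def __init__(self, chunks: Iterable[bytes]) -> None:
        super().__init__()
        self._chunks = iter(chunks)
        self._leftover = b""  # part of the current chunk not yet handed out

    def readable(self) -> bool:
        return True

    def readinto(self, b) -> int:
        # Skip empty chunks: returning 0 would wrongly signal EOF.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # real EOF
        n = min(len(b), len(self._leftover))
        b[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


# Stub data standing in for the chunks of a streamed response body.
chunks = [b"hello ", b"", b"world\n", b"bye"]
reader = io.BufferedReader(IterStream(chunks))

first = reader.read(3)    # exactly three bytes
line = reader.readline()  # rest of the first line, spanning two chunks
rest = reader.read()      # everything up to EOF
```

The trade-off is that closing the underlying response is not handled here. A close() override that calls the response's close() would still be needed, as in the class above.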

4 participants