Handle non-UTF-8 characters #134

Open
adamtheturtle opened this issue Sep 16, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@adamtheturtle
Contributor

subprocess-tee raises an error when the subprocess prints bytes that are not valid UTF-8.
The standard library's subprocess does not, so the two behave differently here.

Reproduction

# my_script.sh
# Prints the bytes 0xC0 0x80, which are not valid UTF-8.
echo -e "\xC0\x80"

# reproducer.py
import subprocess
import subprocess_tee

print("Subprocess:")

subprocess.run(args=["bash", "my_script.sh"])

print("Subprocess tee:")

subprocess_tee.run(args=["bash", "my_script.sh"])

With subprocess, the invalid bytes are displayed as placeholder characters, while subprocess-tee fails with:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte
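
For comparison, here is a minimal sketch of how the standard library behaves when it is asked to capture or decode the output itself. In the reproducer above, subprocess.run does not capture anything, so the bytes reach the terminal undecoded. The sketch assumes a child Python process standing in for my_script.sh so that it runs without bash; `encoding` and `errors` are the documented subprocess.run parameters.

```python
import subprocess
import sys

# Stand-in for my_script.sh (an assumption, so the sketch runs without bash):
# the child writes the same invalid bytes, 0xC0 0x80, followed by a newline.
child = [sys.executable, "-c", r"import sys; sys.stdout.buffer.write(b'\xc0\x80\n')"]

# Captured as raw bytes, nothing is decoded, so nothing can fail.
raw = subprocess.run(child, capture_output=True, check=True)

# Asked to decode with errors="replace", each invalid byte becomes U+FFFD.
replaced = subprocess.run(
    child, capture_output=True, check=True, encoding="utf-8", errors="replace"
)
```

Here `raw.stdout` holds the original bytes and `replaced.stdout` is a str with the invalid bytes substituted.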

Fix

I suggest passing errors="replace" to line.decode().
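
A minimal sketch of that suggestion, assuming the output is decoded one line at a time. `decode_line` is a hypothetical helper used only for illustration; it is not the actual subprocess-tee code.

```python
def decode_line(line: bytes) -> str:
    """Decode one line of subprocess output without raising on invalid bytes.

    Hypothetical helper: it only illustrates errors="replace".
    """
    return line.decode("utf-8", errors="replace")


line = b"\xc0\x80\n"  # the bytes printed by my_script.sh

# Strict decoding (the default) raises, as in the traceback above.
try:
    line.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace", each undecodable byte becomes U+FFFD.
text = decode_line(line)
```

Valid UTF-8 input is unaffected by the change; only bytes that cannot be decoded are substituted.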

ssbarnea added the bug label Dec 6, 2024
@ssbarnea
Member

ssbarnea commented Dec 6, 2024

Hmm... not sure, but okay. I will accept a patch.
