TypeError: the JSON object must be str, bytes or bytearray, not NoneType #301

Closed
hu0514 opened this issue Jul 5, 2021 · 11 comments

@hu0514

hu0514 commented Jul 5, 2021

  File "/data/leisu/leisu_env/lib/python3.7/site-packages/googletrans/client.py", line 219, in translate
    parsed = json.loads(data[0][2])
  File "/data/local/python3.7/lib/python3.7/json/__init__.py", line 341, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not NoneType

@qinghuan1998

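I ran into the same error. After some testing, I found that it occurs when the text to translate exceeds 5000 characters, or when an empty string "" is passed in. However, the API documentation says the maximum for a single text is 15K characters, so I'm a bit confused too; I came to the issues looking for an answer.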

@alperenkaplan

I believe you are hitting this because your text is too long. I think the documentation says the character limit is 15k, but I get the same error whenever I send anything longer than 5k. Try sending your text in 5k chunks and see if the issue persists. Make sure the chunks end with complete sentences, though. Then you can concatenate the results.
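
The chunking idea above could be sketched roughly like this. It is only an illustration: `split_into_chunks` is a hypothetical helper (not part of googletrans), the 5000 default is the limit observed in this thread rather than a documented one, and the sentence detection is a simple regex on `.`, `!` and `?` that will not suit every language.

```python
import re


def split_into_chunks(text, max_chars=5000):
    """Split text into chunks of at most max_chars, breaking between sentences where possible.

    Hypothetical helper; max_chars=5000 is the limit reported in this thread.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit has to be cut hard.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would exceed the limit: close the current chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be translated on its own and the results joined with a space. Empty or whitespace-only input yields an empty list, so nothing is sent in that case.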

@benjaminvanrenterghem

benjaminvanrenterghem commented Jul 28, 2021

Your content is too long; I recommend keeping it under 5000 characters, as mentioned above.
It would be a good enhancement for googletrans to check this automatically and split the content into chunks itself.
Until then, you can split your content into chunks manually and translate them, as in the example below:

```python
from time import sleep

import backoff
# Assumption: the original snippet never imported GoogleTranslator. The
# translate(text, dest=..., src=...) call and the .text attribute used below
# match googletrans, so its Translator class is aliased here.
from googletrans import Translator as GoogleTranslator


class Translator:
    def __init__(self):
        self.client = GoogleTranslator()
        self.sleep_in_between_translations_seconds = 10
        self.source_language = "en"
        # Stay below the ~5000-character limit reported in this thread.
        self.max_chunk_size = 4000

    def __createChunks(self, corpus):
        # Fixed-size slices; note that these can cut a sentence in half.
        return [corpus[i:i + self.max_chunk_size] for i in range(0, len(corpus), self.max_chunk_size)]

    def __sleepBetweenQuery(self):
        print('Sleeping for {}s after translation query..'.format(self.sleep_in_between_translations_seconds))
        sleep(self.sleep_in_between_translations_seconds)

    # Retry with exponential backoff on any exception.
    @backoff.on_exception(backoff.expo, Exception, max_tries=150)
    def Translate(self, content, dest_language_code):
        try:
            print('Attempting to translate to lang={}'.format(dest_language_code))
            if len(content) > self.max_chunk_size:
                print('Warning: Content is longer than allowed size of {}, breaking into chunks'.format(self.max_chunk_size))
                results_list = []
                for chunk in self.__createChunks(content):
                    r = self.client.translate(chunk, dest=dest_language_code, src=self.source_language)
                    self.__sleepBetweenQuery()
                    results_list.append(r.text)
                # Concatenate the translated chunks in their original order.
                return "".join(results_list)
            else:
                res = self.client.translate(content, dest=dest_language_code, src=self.source_language)
                self.__sleepBetweenQuery()
                return res.text
        except Exception as e:
            print(e)
            raise  # re-raise so that backoff can retry
```

@benjaminvanrenterghem

Apparently this error also occurs when the input is only whitespace or otherwise contains nothing to translate. Something to keep in mind.

@NawtJ0sh

If you can't get this working, you can try #268 instead.

@stale

stale bot commented Oct 14, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Oct 14, 2021
@ghost

ghost commented Oct 19, 2021

I'm getting this error with just the README's example of bulk translating.

translations = translator.translate(['The quick brown fox', 'jumps over', 'the lazy dog'], dest='ko')

the entire error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-c4a9f98d835f> in <module>
----> 1 translations = translator.translate(['The quick brown fox', 'jumps over', 'the lazy dog'], dest='ko')

C:\ProgramData\Anaconda3\lib\site-packages\googletrans\client.py in translate(self, text, dest, src)
    217
    218         data = json.loads(resp)
--> 219         parsed = json.loads(data[0][2])
    220         # not sure
    221         should_spacing = parsed[1][0][0][3]

C:\ProgramData\Anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    339     else:
    340         if not isinstance(s, (bytes, bytearray)):
--> 341             raise TypeError(f'the JSON object must be str, bytes or bytearray, '
    342                             f'not {s.__class__.__name__}')
    343         s = s.decode(detect_encoding(s), 'surrogatepass')

TypeError: the JSON object must be str, bytes or bytearray, not NoneType

Only bulk translation fails, however. Translating a single string works just fine.
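
Since single-string calls reportedly succeed, one possible workaround is to loop over the list yourself. This is only a sketch: `translate_each` is a hypothetical helper, not part of googletrans, and `translate_one` is whatever callable you supply, for example `lambda t: translator.translate(t, dest='ko').text`.

```python
def translate_each(translate_one, texts):
    """Translate items one at a time instead of passing the whole list.

    translate_one is a caller-supplied callable (hypothetical name) that
    takes one string and returns its translation.
    """
    results = []
    for text in texts:
        try:
            results.append(translate_one(text))
        except TypeError:
            # The failure discussed in this thread; record None and keep going
            # so one bad item does not abort the whole batch.
            results.append(None)
    return results
```

Items that fail come back as `None`, which keeps the output aligned with the input list and makes the failing entries easy to spot.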

@stale stale bot removed the wontfix label Oct 19, 2021
@stale

stale bot commented Dec 19, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Dec 19, 2021
@stale stale bot closed this as completed Dec 26, 2021
@Secret-chest

It also happens if the string is empty. I translate strings from a JSON file, so I added this check before translating:

if sourceTranslation == "" or sourceTranslation is None:
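
A slightly fuller version of that guard, which also covers the whitespace-only case mentioned earlier in the thread, might look like the sketch below. Both function names are hypothetical, and `translate_one` stands for your own call into the translator (an assumption, not a googletrans API).

```python
def has_translatable_content(text):
    """Return True only for strings that contain something besides whitespace."""
    return isinstance(text, str) and text.strip() != ""


def safe_translate(translate_one, text):
    """Skip the request for None, empty or whitespace-only input.

    translate_one is a caller-supplied callable (hypothetical name) that
    takes one string and returns its translation.
    """
    if not has_translatable_content(text):
        # Nothing to translate: hand the input back unchanged.
        return text
    return translate_one(text)
```

Returning the input unchanged keeps empty entries in place when writing the results back to the JSON file.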

@ganbayard

result=df.to_json(orient="index",default_handler=str).encode('utf-8')

@Absolutorcinus

Absolutorcinus commented Mar 23, 2022

I am trying to get the source language from a sentence:

langs = translator.detect(['Hehehe, di daerah '])
for lang in langs:
    print(lang.lang, lang.confidence)

I get the following error:

TypeError Traceback (most recent call last)

in ()
----> 1 langs = translator.detect(['Hehehe, di daerah '])
2 for lang in langs:
3 print(lang.lang, lang.confidence)

2 frames

/usr/local/lib/python3.7/dist-packages/googletrans/client.py in detect(self, text)
367
368 def detect(self, text: str):
--> 369 translated = self.translate(text, src='auto', dest='en')
370 result = Detected(lang=translated.src, confidence=translated.extra_data.get('confidence', None), response=translated._response)
371 return result

/usr/local/lib/python3.7/dist-packages/googletrans/client.py in translate(self, text, dest, src)
217
218 data = json.loads(resp)
--> 219 parsed = json.loads(data[0][2])
220 # not sure
221 should_spacing = parsed[1][0][0][3]

/usr/lib/python3.7/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
339 else:
340 if not isinstance(s, (bytes, bytearray)):
--> 341 raise TypeError(f'the JSON object must be str, bytes or bytearray, '
342 f'not {s.__class__.__name__}')
343 s = s.decode(detect_encoding(s), 'surrogatepass')

TypeError: the JSON object must be str, bytes or bytearray, not NoneType
