[BUG] - Hashtag search does not seem to fetch all media data, even when using a loop #1175
Comments
Can you send your full code?
I can also confirm that this was a problem even before 6.4: version 6.3.0 could not get past 45 videos. With 6.4 and later we can finally fetch a larger number (I reached 340 videos), but it loops through the same videos again and again, and YouTube_dlp, to which I pass the fetched URLs, keeps reporting "already downloaded". Here is my code:
Beyond that, it also seems to fetch videos that do not belong to the corresponding hashtag, for example a video that is not even under the hashtag 'coldplayathens'.
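As a client-side guard against the repeated results described above, one option is to track the video IDs already seen and stop paging once several pages in a row bring nothing new. The following is only a minimal sketch: `fetch_page` is a hypothetical callable standing in for whatever actually fetches a page of results (it is not part of TikTokApi), and the assumption that each item is a dict with an `"id"` key is mine.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Set

# Hypothetical page fetcher: takes (cursor, count) and returns a list of video dicts.
FetchPage = Callable[[int, int], Awaitable[List[Dict]]]


async def collect_unique(fetch_page: FetchPage, page_size: int = 30,
                         max_videos: int = 1000,
                         max_stale_pages: int = 3) -> List[Dict]:
    """Collect videos page by page, skipping IDs that were already seen.

    Stops when max_videos is reached, when a page comes back empty, or when
    max_stale_pages consecutive pages contain no new IDs.
    """
    seen: Set[str] = set()
    unique: List[Dict] = []
    cursor = 0
    stale_pages = 0
    while len(unique) < max_videos and stale_pages < max_stale_pages:
        page = await fetch_page(cursor, page_size)
        if not page:
            break  # the source has nothing more to return
        cursor += len(page)
        added = 0
        for video in page:
            if video["id"] in seen:
                continue  # duplicate of an earlier result
            seen.add(video["id"])
            unique.append(video)
            added += 1
        # Reset the counter on progress, otherwise count one more stale page.
        stale_pages = 0 if added else stale_pages + 1
    return unique[:max_videos]


async def fake_fetch_page(cursor: int, count: int) -> List[Dict]:
    """Stand-in source that only ever knows 45 videos and then repeats them."""
    return [{"id": str((cursor + i) % 45)} for i in range(count)]


if __name__ == "__main__":
    videos = asyncio.run(collect_unique(fake_fetch_page))
    print(len(videos), "unique videos")
```

This does not make the endpoint return more data; it only ends the loop cleanly instead of handing the same URLs to the downloader over and over.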
I want to search for Ukraine-related videos in the America region, but I can only fetch 30-50 records. TikTok itself shows 7.1M records for this hashtag. Can we download all of them, or is there any way to search by time range?
My code snippet:
import random
from datetime import datetime, timedelta
from time import sleep


async def search_videos_hashtag(hashtag, time_from, time_to, current_video_amount=0,
                                count=100, times=0) -> None:
    global result, api, current_os, result_tik_id_set
    format_style = '%m/%d/%y' if current_os == 'Windows' else '%Y/%m/%d'
    sleep(random.Random().randint(a=3, b=5))
    temp = 0
    temp_video_amount = current_video_amount
    if api is not None:
        if len(api.sessions) == 0:
            await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=3, headless=False)  # ms_token is None
        async for searchRes in api.hashtag(hashtag).videos(count=count, cursor=current_video_amount):
            temp += 1
            current_video_amount += 1
            # Extend the end date by one day so that time_to itself is included
            time_to_add_one_day = int((
                datetime.fromtimestamp(format_str_timestamp(time_to, format_style)) +
                timedelta(days=1)).timestamp())
            if (format_str_timestamp(time_from, format_style) <= searchRes.as_dict['createTime'] <= time_to_add_one_day
                    and searchRes.id not in result_tik_id_set):
                author = construct_author_metadata(searchRes)
                publish = construct_publish_metadata(searchRes)
                author.append_publish(publish)
                result.append(author)
                result_tik_id_set.add(searchRes.id)
                print('append one tik tok data, current search: ' + str(current_video_amount))
        # No new videos came back: fall back to related videos of what we already have
        if temp_video_amount == current_video_amount:
            sleep(random.Random().randint(a=3, b=5))
            video_urls = list(map(lambda res: res.publish[0].link, result))
            for url in video_urls:
                await search_related_videos(url, time_from, time_to, required_video_amount=count,
                                            current_video_amount=0,
                                            count=int(count / len(video_urls)))
        # Fewer results than requested: retry from the current cursor, at most 100 times
        if temp < count and times < 100:
            await search_videos_hashtag(hashtag, time_from, time_to, current_video_amount,
                                        count, times=times + 1)
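The time-range check in the snippet above can be isolated into a small helper, which makes it easier to test separately from the fetching. This is a sketch under assumptions: `createTime` is treated as a Unix timestamp in seconds (which is how the snippet compares it), the date strings are interpreted as UTC days (the snippet uses local time), and the function names are illustrative rather than part of any library.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List, Tuple


def day_range_to_epoch(date_from: str, date_to: str,
                       fmt: str = "%Y/%m/%d") -> Tuple[int, int]:
    """Return (start, end) Unix timestamps covering both days in full.

    Assumption: dates are UTC days. `end` is exclusive, i.e. midnight at the
    start of the day after `date_to`.
    """
    start = datetime.strptime(date_from, fmt).replace(tzinfo=timezone.utc)
    end = (datetime.strptime(date_to, fmt).replace(tzinfo=timezone.utc)
           + timedelta(days=1))
    return int(start.timestamp()), int(end.timestamp())


def filter_by_create_time(videos: Iterable[Dict], date_from: str, date_to: str,
                          fmt: str = "%Y/%m/%d") -> List[Dict]:
    """Keep only the videos whose createTime falls inside the day range."""
    start, end = day_range_to_epoch(date_from, date_to, fmt)
    return [v for v in videos if start <= int(v["createTime"]) < end]


if __name__ == "__main__":
    sample = [
        {"id": "a", "createTime": 1704067200},  # 2024-01-01 00:00:00 UTC
        {"id": "b", "createTime": 1704240000},  # 2024-01-03 00:00:00 UTC
    ]
    kept = filter_by_create_time(sample, "2024/01/01", "2024/01/02")
    print([v["id"] for v in kept])
```

Note that this only filters results after they have been fetched; it does not make the hashtag endpoint itself search by time range.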