百度图片爬虫

用于爬取一定类型的图片进行深度学习训练

Start

pip install -r requirement -i https://pypi.douban.com/simple

Use

python function.py

效果

使用文档

get_images_rul

class Crawler:

    @staticmethod
    def get_images_url(word: str, num: int, original: bool = True,
                       timeout: int = __CONCURRENT_TIMEOUT) -> (bool, bool, list):

参数

word:str:搜索关键词
num:int:搜索数量
original:bool, optional:是否下载原图，默认为False
timeout:int, optional:请求timeout，默认为60s

返回

net:bool:网络连接状态
num:bool:图片数量是否足够
urls:list:获取的urls，每项为一个dict，

download_images

class Crawler:

    @staticmethod
    def download_images(urls: list, rule: tuple = ('.png', '.jpg'),
                        path: str = 'download', timeout: int = __CONCURRENT_TIMEOUT,
                        concurrent: int = __CONCURRENT_NUM, command: bool = True) -> (int, int):

参数

urls: list: 需要爬的图片列表，格式与get_images_url返回的相同
rule: tuple, optional: 允许下载的格式，默认为('.png', '.jpg')
path: str, optional: 图片下载的路径，默认为'download'
timeout: int, optional: 请求 timeout, 默认为60(s)
concurrent: int, optional: 并行下载的数量，默认为100
command: bool, optional: 是否在控制台显示进度条，默认为True

返回

success: int: 下载成功的数量
failed: int: 下载失败的数量

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
img		img
README.md		README.md
function.py		function.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

百度图片爬虫

Start

Use

效果

使用文档

get_images_rul

download_images

About

Releases

Packages

Languages

Feng1909/Baidu_image

Folders and files

Latest commit

History

Repository files navigation

百度图片爬虫

Start

Use

效果

使用文档

get_images_rul

download_images

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages