chore: cleaned project repository
jonasfroeller committed Jul 3, 2024
1 parent 6bc7e78 commit a028b1e
Showing 29 changed files with 117 additions and 2,677 deletions.
49 changes: 49 additions & 0 deletions ABOUT_THIS_TEMPLATE.md
@@ -196,3 +196,52 @@ docs: ## Build the documentation.
switch-to-poetry: ## Switch to poetry package manager.
init: ## Initialize the project based on an application template.
```

## README - Python Project Template

A low dependency and really simple to start project template for Python Projects.

See also
- [Flask-Project-Template](https://github.com/rochacbruno/flask-project-template/) for a full feature Flask project including database, API, admin interface, etc.
- [FastAPI-Project-Template](https://github.com/rochacbruno/fastapi-project-template/) The base to start an openapi project featuring: SQLModel, Typer, FastAPI, JWT Token Auth, Interactive Shell, Management Commands.

### HOW TO USE THIS TEMPLATE

> **DO NOT FORK** this is meant to be used from **[Use this template](https://github.com/rochacbruno/python-project-template/generate)** feature.
1. Click on **[Use this template](https://github.com/rochacbruno/python-project-template/generate)**
3. Give a name to your project
(e.g. `my_awesome_project` recommendation is to use all lowercase and underscores separation for repo names.)
3. Wait until the first run of CI finishes
(Github Actions will process the template and commit to your new repo)
4. If you want [codecov](https://about.codecov.io/sign-up/) Reports and Automatic Release to [PyPI](https://pypi.org)
On the new repository `settings->secrets` add your `PYPI_API_TOKEN` and `CODECOV_TOKEN` (get the tokens on respective websites)
4. Read the file [CONTRIBUTING.md](CONTRIBUTING.md)
5. Then clone your new project and happy coding!

> **NOTE**: **WAIT** until first CI run on github actions before cloning your new project.
### What is included on this template?

- 🖼️ Templates for starting multiple application types:
* **Basic low dependency** Python program (default) [use this template](https://github.com/rochacbruno/python-project-template/generate)
* **Flask** with database, admin interface, restapi and authentication [use this template](https://github.com/rochacbruno/flask-project-template/generate).
**or Run `make init` after cloning to generate a new project based on a template.**
- 📦 A basic [setup.py](setup.py) file to provide installation, packaging and distribution for your project.
Template uses setuptools because it's the de-facto standard for Python packages, you can run `make switch-to-poetry` later if you want.
- 🤖 A [Makefile](Makefile) with the most useful commands to install, test, lint, format and release your project.
- 📃 Documentation structure using [mkdocs](http://www.mkdocs.org)
- 💬 Auto generation of change log using **gitchangelog** to keep a HISTORY.md file automatically based on your commit history on every release.
- 🐋 A simple [Containerfile](Containerfile) to build a container image for your project.
`Containerfile` is a more open standard for building container images than Dockerfile, you can use buildah or docker with this file.
- 🧪 Testing structure using [pytest](https://docs.pytest.org/en/latest/)
- ✅ Code linting using [flake8](https://flake8.pycqa.org/en/latest/)
- 📊 Code coverage reports using [codecov](https://about.codecov.io/sign-up/)
- 🛳️ Automatic release to [PyPI](https://pypi.org) using [twine](https://twine.readthedocs.io/en/latest/) and github actions.
- 🎯 Entry points to execute your program using `python -m <neptun_webscraper>` or `$ neptun_webscraper` with basic CLI argument parsing.
- 🔄 Continuous integration using [Github Actions](.github/workflows/) with jobs to lint, test and release your project on Linux, Mac and Windows environments.

> Curious about architectural decisions on this template? read [ABOUT_THIS_TEMPLATE.md](ABOUT_THIS_TEMPLATE.md)
> If you want to contribute to this template please open an [issue](https://github.com/rochacbruno/python-project-template/issues) or fork and send a PULL REQUEST.
[❤️ Sponsor this project](https://github.com/sponsors/rochacbruno/)
16 changes: 16 additions & 0 deletions CONTRIBUTING.md
@@ -12,6 +12,22 @@ This instructions are for linux base systems. (Linux, MacOS, BSD, etc.)
- Enter the directory `cd tech-stack-ai-configuration-data-scraper`
- Add upstream repo `git remote add upstream https://github.com/jonasfroeller/tech-stack-ai-configuration-data-scraper`

## Installing Dependencies besides Python

> Needed for JS support, if websites load content after page load client side.
### Install Playwright

```bash
npm i [email protected] --global
```

### Install the browser binaries

```bash
playwright install
```
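After both install steps, it can help to confirm that the tools are reachable from the shell. The following is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it only checks that executables named `npm` and `playwright` are on `PATH`, not that the browser binaries were downloaded.

```python
import shutil


def missing_tools(tools=("npm", "playwright")):
    """Return the names of tools that are not found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("npm and playwright are both available.")
```

If a tool is reported as missing, re-running the corresponding install command above is the likely remedy.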

## Setting up your own virtual environment

Run `make virtualenv` to create a virtual environment.
10 changes: 2 additions & 8 deletions HISTORY.md
@@ -2,12 +2,6 @@ Changelog
=========


0.1.2 (2021-08-14)
0.1.0 (yyyy-mm-dd)
------------------
- Fix release, README and windows CI. [Bruno Rocha]
- Release: version 0.1.0. [Bruno Rocha]


0.1.0 (2021-08-14)
------------------
- Add release command. [Bruno Rocha]
- commit message [Author]
107 changes: 40 additions & 67 deletions README.md
@@ -1,94 +1,67 @@
# Neptun Webscraper

# Python Project Template
[![codecov](https://codecov.io/gh/jonasfroeller/tech-stack-ai-configuration-data-scraper/branch/main/graph/badge.svg?token=tech-stack-ai-configuration-data-scraper_token_here)](https://codecov.io/gh/jonasfroeller/tech-stack-ai-configuration-data-scraper)
[![CI](https://github.com/jonasfroeller/tech-stack-ai-configuration-data-scraper/actions/workflows/main.yml/badge.svg)](https://github.com/jonasfroeller/tech-stack-ai-configuration-data-scraper/actions/workflows/main.yml)

```
make virtualenv
source .venv/bin/activ
```
## Install it from PyPI

```bash
pip install neptun_webscraper
```
make fmt
```

A low dependency and really simple to start project template for Python Projects.

See also
- [Flask-Project-Template](https://github.com/rochacbruno/flask-project-template/) for a full feature Flask project including database, API, admin interface, etc.
- [FastAPI-Project-Template](https://github.com/rochacbruno/fastapi-project-template/) The base to start an openapi project featuring: SQLModel, Typer, FastAPI, JWT Token Auth, Interactive Shell, Management Commands.

### HOW TO USE THIS TEMPLATE

> **DO NOT FORK** this is meant to be used from **[Use this template](https://github.com/rochacbruno/python-project-template/generate)** feature.
1. Click on **[Use this template](https://github.com/rochacbruno/python-project-template/generate)**
3. Give a name to your project
(e.g. `my_awesome_project` recommendation is to use all lowercase and underscores separation for repo names.)
3. Wait until the first run of CI finishes
(Github Actions will process the template and commit to your new repo)
4. If you want [codecov](https://about.codecov.io/sign-up/) Reports and Automatic Release to [PyPI](https://pypi.org)
On the new repository `settings->secrets` add your `PYPI_API_TOKEN` and `CODECOV_TOKEN` (get the tokens on respective websites)
4. Read the file [CONTRIBUTING.md](CONTRIBUTING.md)
5. Then clone your new project and happy coding!
## Usage

> **NOTE**: **WAIT** until first CI run on github actions before cloning your new project.
```py
from neptun_webscraper import BaseClass
from neptun_webscraper import base_function

### What is included on this template?
BaseClass().base_method()
base_function()
```

- 🖼️ Templates for starting multiple application types:
* **Basic low dependency** Python program (default) [use this template](https://github.com/rochacbruno/python-project-template/generate)
* **Flask** with database, admin interface, restapi and authentication [use this template](https://github.com/rochacbruno/flask-project-template/generate).
**or Run `make init` after cloning to generate a new project based on a template.**
- 📦 A basic [setup.py](setup.py) file to provide installation, packaging and distribution for your project.
Template uses setuptools because it's the de-facto standard for Python packages, you can run `make switch-to-poetry` later if you want.
- 🤖 A [Makefile](Makefile) with the most useful commands to install, test, lint, format and release your project.
- 📃 Documentation structure using [mkdocs](http://www.mkdocs.org)
- 💬 Auto generation of change log using **gitchangelog** to keep a HISTORY.md file automatically based on your commit history on every release.
- 🐋 A simple [Containerfile](Containerfile) to build a container image for your project.
`Containerfile` is a more open standard for building container images than Dockerfile, you can use buildah or docker with this file.
- 🧪 Testing structure using [pytest](https://docs.pytest.org/en/latest/)
- ✅ Code linting using [flake8](https://flake8.pycqa.org/en/latest/)
- 📊 Code coverage reports using [codecov](https://about.codecov.io/sign-up/)
- 🛳️ Automatic release to [PyPI](https://pypi.org) using [twine](https://twine.readthedocs.io/en/latest/) and github actions.
- 🎯 Entry points to execute your program using `python -m <neptun_webscraper>` or `$ neptun_webscraper` with basic CLI argument parsing.
- 🔄 Continuous integration using [Github Actions](.github/workflows/) with jobs to lint, test and release your project on Linux, Mac and Windows environments.
```bash
python -m neptun_webscraper
```

> Curious about architectural decisions on this template? read [ABOUT_THIS_TEMPLATE.md](ABOUT_THIS_TEMPLATE.md)
> If you want to contribute to this template please open an [issue](https://github.com/rochacbruno/python-project-template/issues) or fork and send a PULL REQUEST.
```bash
neptun_webscraper
```

[❤️ Sponsor this project](https://github.com/sponsors/rochacbruno/)
## Docker Hub Scraper

<!-- DELETE THE LINES ABOVE THIS AND WRITE YOUR PROJECT README BELOW -->
```bash
python -m neptun_webscraper dockerhub --query=python
```

---
# neptun_webscraper
## Quay IO Scraper

[![codecov](https://codecov.io/gh/jonasfroeller/tech-stack-ai-configuration-data-scraper/branch/main/graph/badge.svg?token=tech-stack-ai-configuration-data-scraper_token_here)](https://codecov.io/gh/jonasfroeller/tech-stack-ai-configuration-data-scraper)
[![CI](https://github.com/jonasfroeller/tech-stack-ai-configuration-data-scraper/actions/workflows/main.yml/badge.svg)](https://github.com/jonasfroeller/tech-stack-ai-configuration-data-scraper/actions/workflows/main.yml)
```bash
python -m neptun_webscraper quay --query=python
```
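Both scraper commands pass a spider name followed by a `--query` option. As a rough sketch of how such arguments can be parsed, the block below mirrors the `argparse` calls visible in the `cli.py` hunk of this commit. The wrapper function `build_parser` is hypothetical and exists only for illustration; the real `main()` also starts a Scrapy crawl, which is left out here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one required spider name and an optional query."""
    parser = argparse.ArgumentParser(description="Neptune WebScraper CLI")
    # Positional arguments are required by default, so a `default=` value
    # would only take effect together with nargs="?"; none is set here.
    parser.add_argument(
        "spider",
        choices=["dockerhub", "quay"],
        help="Choose a spider to run",
    )
    parser.add_argument(
        "--query", default="", help="Search query for the registry"
    )
    return parser


# Parse an explicit argument list instead of sys.argv, for illustration.
args = build_parser().parse_args(["quay", "--query=python"])
print(args.spider, args.query)
```

Because `spider` is a required positional with fixed `choices`, calling the CLI without a spider name, or with an unknown one, makes `argparse` print a usage error and exit.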

Awesome neptun_webscraper created by jonasfroeller
## Development

## Install it from PyPI
## Create a virtualenv

```bash
pip install neptun_webscraper
make virtualenv
source .venv/bin/activ
```

## Usage

```py
from neptun_webscraper import BaseClass
from neptun_webscraper import base_function
## Format the code

BaseClass().base_method()
base_function()
```bash
make fmt
```

## Lint the code

```bash
$ python -m neptun_webscraper
#or
$ neptun_webscraper
make lint
```

## Development
## Contributing

More commands are in the [Makefile](Makefile).

Read the [CONTRIBUTING.md](CONTRIBUTING.md) file.
11 changes: 1 addition & 10 deletions neptun_webscraper/base.py
@@ -2,16 +2,7 @@
neptun_webscraper base module.
This is the principal module of the neptun_webscraper project.
here you put your main classes and objects.
Be creative! do whatever you want!
If you want to replace this with a Flask application run:
$ make init
and then choose `flask` as template.
Main classes and objects live here.
"""

# example constant variable
NAME = "neptun_webscraper"
39 changes: 4 additions & 35 deletions neptun_webscraper/cli.py
@@ -1,13 +1,3 @@
"""CLI interface for neptun_webscraper project.
Be creative! do whatever you want!
- Install click or typer and create a CLI app
- Use builtin argparse
- Start a web application
- Import things from your .base module
"""

import argparse

from scrapy.crawler import CrawlerProcess
@@ -18,38 +8,17 @@

def main(): # pragma: no cover
"""
The main function executes on commands:
CLI interface for neptun_webscraper project.
This is the program's entry point. The main function executes on commands:
`python -m neptun_webscraper` and `$ neptun_webscraper `.
This is your program's entry point.
You can change this function to do whatever you want.
Examples:
* Run a test suite
* Run a server
* Do some other stuff
* Run a command line application (Click, Typer, ArgParse)
* List all available tasks
* Run an application (Flask, FastAPI, Django, etc.)
---
Choose between different spiders.
Examples:
```
python -m neptun_webscraper dockerhub --query=python
```
```
python -m neptun_webscraper quay --query=python
```
"""

parser = argparse.ArgumentParser(description="Neptune WebScraper CLI")
parser.add_argument(
"spider",
choices=["dockerhub", "quay"],
help="Choose the spider to run",
default="dockerhub",
help="Choose a spider to run",
)
parser.add_argument(
"--query", default="", help="Search query for the registry"
6 changes: 1 addition & 5 deletions neptun_webscraper/spiders/dockerhub.py
@@ -8,18 +8,14 @@
from scrapy.selector import Selector
from scrapy_playwright.page import PageMethod

# playwright Settings
# needed for JS support, if websites load content dynamically lazy
# npm i [email protected] --global, playwright install

SCRAPY_SETTINGS = {
"DOWNLOAD_HANDLERS": {
"http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
"https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
},
"REQUEST_FINGERPRINTER_IMPLEMENTATION": "2.7",
"PLAYWRIGHT_BROWSER_TYPE": "chromium",
"USER_AGENT": None, # using browser user agent instead
"USER_AGENT": None, # None => using browsers user agent instead
}


11 changes: 0 additions & 11 deletions neptun_webscraper/spiders/logs/20240701200912.json

This file was deleted.
