docs: add mkdocs (#4)
* docs: add mkdocs

* docs: add mike + automatic push from main
bdura committed Apr 19, 2024
1 parent 77650fe commit 59d39ca
Showing 11 changed files with 1,077 additions and 9 deletions.
34 changes: 34 additions & 0 deletions .github/workflows/documentation.yml
@@ -0,0 +1,34 @@
name: Documentation

on:
  workflow_dispatch:
  push:
    branches: [main]

permissions:
  contents: write

jobs:
  Documentation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install poetry
        run: |
          pip install poetry
          poetry install --only docs
      - name: Set up Git
        run: |
          git config user.name ${{ github.actor }}
          git config user.email ${{ github.actor }}@users.noreply.github.com
      - name: Build documentation
        run: |
          git fetch origin gh-pages
          poetry run mike delete main
          poetry run mike deploy --push main
3 changes: 3 additions & 0 deletions .gitignore
@@ -23,3 +23,6 @@ coverage.xml
 # Data
 data/
 *.db
+
+# MkDocs
+site
1 change: 1 addition & 0 deletions changelog.md
@@ -0,0 +1 @@
# Changelog
5 changes: 5 additions & 0 deletions docs/api-reference.md
@@ -0,0 +1,5 @@
# API Reference

::: persil
    options:
        show_source: true
146 changes: 146 additions & 0 deletions docs/index.md
@@ -0,0 +1,146 @@
# Overview

Persil is a pure-Python parsing library that draws much (most, let's be honest)
of its inspiration from the excellent [Parsy](https://github.com/python-parsy/parsy) library.

Hence the name, "Persil" ([pɛʁ.sil] or [pɛʁ.si]), the French word for parsley:
a most questionable pun on `Parsy -> Parsley -> Persil`,
in case anyone missed it.

Like Parsy, Persil is a _"monadic parser combinator library for LL(infinity) grammars"_.
As a rough approximation, you can think of Persil as a typed "fork" of Parsy.
However, although the APIs are very similar, there are [notable differences](#compatibility-with-parsy)
that you might want to review if you're coming from Parsy.

If you're merely looking for a _somewhat_ type-aware version of Parsy, you may be looking for
`parsy-stubs`. Mypy can use it to infer most of the types, but you'll find that
shortcuts had to be taken in many cases.

## Getting started

Persil is a pure-Python library. You can install it with pip:

```shell
pip install persil
```

Then you can use Persil much the same way you would use Parsy,
and enjoy the developer experience that type hints bring.

### A basic example

```python
from persil import regex

year = regex(r"\d{4}").map(int)
```

This example should drive home the point that Persil is heavily inspired by Parsy.
The only difference in this particular case is type-safety:
the Persil version knows that `year` is a parser that expects
a `str`, and outputs an `int`.
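
To make the typing claim concrete, here is a minimal, self-contained sketch of how a generic parser class lets a type checker follow `map`. It is not Persil's implementation: `ToyParser` and `toy_regex` are hypothetical names invented for this illustration, and the `(next_index, value)` return shape is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

Out = TypeVar("Out")
T = TypeVar("T")


@dataclass(frozen=True)
class ToyParser(Generic[Out]):
    """Hypothetical stand-in for a typed parser over `str` input."""

    # Takes the input and a start index, returns (next_index, value).
    run: Callable[[str, int], tuple[int, Out]]

    def map(self, fn: Callable[[Out], T]) -> "ToyParser[T]":
        # The return annotation is what lets a type checker infer
        # the output type of the mapped parser.
        def mapped(text: str, index: int) -> tuple[int, T]:
            next_index, value = self.run(text, index)
            return next_index, fn(value)

        return ToyParser(mapped)


def toy_regex(pattern: str) -> ToyParser[str]:
    compiled = re.compile(pattern)

    def run(text: str, index: int) -> tuple[int, str]:
        match = compiled.match(text, index)
        if match is None:
            raise ValueError(f"expected {pattern!r} at index {index}")
        return match.end(), match.group(0)

    return ToyParser(run)


year = toy_regex(r"\d{4}").map(int)  # a type checker sees ToyParser[int]
next_index, value = year.run("2024-04-19", 0)
```

With this shape, passing `year` where a `ToyParser[str]` is expected is flagged by the type checker instead of failing at runtime.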

### More complex parsers

Parsy uses generator functions as a most elegant solution to define complex parsers.

While you can still use this approach with Persil, you're encouraged to favour
the `from_stream` decorator:

```python
from persil import Stream, from_stream


@from_stream
def parser(
    stream: Stream[str],
) -> CustomType:
    # `parser_a`, `parser_b`, `parser_c` and `CustomType` are placeholders.
    a = stream(parser_a)
    b = stream(parser_b)
    c = stream(parser_c)

    return CustomType(a, b, c)
```

The equivalent code, using `generate` instead (deprecated in Persil):

```python
@generate
def parser() -> Generator[Parser, Any, CustomType]:
    a = yield parser_a
    b = yield parser_b
    c = yield parser_c

    return CustomType(a, b, c)
```

The main issue with `generate` is that the results of intermediate parsers cannot be typed,
whereas `Stream.__call__` plays nicely with modern Python tooling like mypy.
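
The sketch below illustrates why a call-based stream is friendlier to type checkers than `yield`: the return type of `__call__` is tied to the type parameter of the parser it receives. This is a toy written for illustration, not Persil's code. `ToyStream`, `toy_from_stream`, `digit` and `dot` are hypothetical names, and parsers are modelled as plain functions returning `(next_index, value)`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")
Out = TypeVar("Out")


class ToyStream:
    """Hypothetical stream: wraps the input and tracks the current index."""

    def __init__(self, text: str, index: int = 0) -> None:
        self.text = text
        self.index = index

    def __call__(self, parser: Callable[[str, int], tuple[int, T]]) -> T:
        # The result type T comes from the parser itself, so every
        # intermediate value is typed at the call site.
        next_index, value = parser(self.text, self.index)
        self.index = next_index
        return value


def toy_from_stream(fn: Callable[[ToyStream], Out]) -> Callable[[str], Out]:
    # Toy decorator: turns a stream-consuming function into a parse function.
    def parse(text: str) -> Out:
        return fn(ToyStream(text))

    return parse


def digit(text: str, index: int) -> tuple[int, int]:
    if index < len(text) and text[index].isdecimal():
        return index + 1, int(text[index])
    raise ValueError(f"expected a digit at index {index}")


def dot(text: str, index: int) -> tuple[int, str]:
    if text.startswith(".", index):
        return index + 1, "."
    raise ValueError(f"expected '.' at index {index}")


@toy_from_stream
def version(stream: ToyStream) -> tuple[int, int]:
    major = stream(digit)  # inferred as int
    stream(dot)  # inferred as str, discarded
    minor = stream(digit)  # inferred as int
    return major, minor


parsed = version("3.9")
```

With a generator, `major = yield digit` can only be typed through the generator's single send type, which is shared by every `yield` in the function. That is why the `generate` version above falls back to `Any`.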

## Relation with Parsy

First of all, I am not affiliated in any way with the Parsy project.

### Rationale

Parsy's last commit is from a year ago at the time of writing. Moreover, although the authors
have started some development to propose a typed version of their library, efforts
in that area have stalled for two years.

### Compatibility with Parsy

Although Persil draws most of its inspiration from Parsy, maintaining a one-for-one
equivalence with the latter's API **is NOT among Persil's goals**.

For those coming from Parsy, here are some notable differences:

- the `Result` type is now a union of `Ok` and `Err`, which allows for a more type-safe API.
- `Err` is its own error: it inherits from `Exception` and can be raised.
- Persil introduces the `Stream` class, a wrapper around the input that applies parsers sequentially
  and handles the index bookkeeping.
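
As a rough illustration of the first two points, here is a self-contained sketch of a result type built as a union of a success value and a raisable error. The names `ToyOk`, `ToyErr` and `ToyResult`, as well as their fields, are hypothetical and may not match Persil's actual `Ok` and `Err` classes.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass
class ToyOk(Generic[T]):
    value: T
    index: int  # where parsing should continue


class ToyErr(Exception):
    """Hypothetical failure type: a result value that can also be raised."""

    def __init__(self, index: int, expected: str) -> None:
        super().__init__(f"expected {expected} at index {index}")
        self.index = index
        self.expected = expected


ToyResult = Union[ToyOk[T], ToyErr]


def parse_digit(text: str, index: int) -> "ToyResult[int]":
    if index < len(text) and text[index].isdecimal():
        return ToyOk(int(text[index]), index + 1)
    return ToyErr(index, "a digit")


def unwrap(result: "ToyResult[T]") -> T:
    # isinstance narrows the union, so both branches are fully typed.
    if isinstance(result, ToyErr):
        raise result
    return result.value


ok_value = unwrap(parse_digit("7", 0))
failure = parse_digit("x", 0)
```

Because the failure case is an ordinary exception instance, callers can either inspect it as a value or raise it directly, without a separate wrapper exception.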

## Performance tips

Since Persil takes a functional approach, every transformation on a parser produces a new parser.
With that in mind, the way you define/use/combine parsers may substantially affect performance.

Consider the following example:

```python
from datetime import datetime

from persil import Stream, from_stream, regex, string


@from_stream
def datetime_parser(stream: Stream[str]) -> datetime:
    year = stream.apply(regex(r"\d{4}").map(int))
    stream.apply(string("/"))
    month = stream.apply(regex(r"\d{2}").map(int))
    stream.apply(string("/"))
    day = stream.apply(regex(r"\d{2}").map(int))
    return datetime(year, month, day)
```

The resulting `datetime_parser` will re-create three new regex parsers **every time** it is run.

A much better alternative:

```python
from datetime import datetime

from persil import Stream, from_stream, regex, string


year_parser = regex(r"\d{4}").map(int)
day_month_parser = regex(r"\d{2}").map(int)
slash_parser = string("/")

@from_stream
def datetime_parser(stream: Stream[str]) -> datetime:
    year = stream.apply(year_parser)
    stream.apply(slash_parser)
    month = stream.apply(day_month_parser)
    stream.apply(slash_parser)
    day = stream.apply(day_month_parser)
    return datetime(year, month, day)
```

That way, the lower-level parsers are only defined once.
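
To see the difference without installing anything, the following sketch counts how many parser objects get built in each style. It uses a hypothetical `CountingParser` class rather than Persil's own parsers, so it only illustrates the principle and says nothing about actual timings.

```python
import re


class CountingParser:
    """Hypothetical parser that counts how many instances get built."""

    created = 0

    def __init__(self, pattern: str) -> None:
        CountingParser.created += 1
        self.pattern = re.compile(pattern)

    def parse(self, text: str) -> int:
        match = self.pattern.fullmatch(text)
        if match is None:
            raise ValueError(f"{text!r} does not match {self.pattern.pattern!r}")
        return int(text)


def parse_year_inline(text: str) -> int:
    # Builds a new parser object on every call.
    return CountingParser(r"\d{4}").parse(text)


YEAR = CountingParser(r"\d{4}")  # built once, at import time


def parse_year_hoisted(text: str) -> int:
    return YEAR.parse(text)


years = ["1999", "2024", "2038"]

before = CountingParser.created
inline_values = [parse_year_inline(y) for y in years]
inline_built = CountingParser.created - before  # one parser per call

before = CountingParser.created
hoisted_values = [parse_year_hoisted(y) for y in years]
hoisted_built = CountingParser.created - before  # no new parser
```

Both versions return the same values. Only the number of objects constructed along the way differs.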
38 changes: 38 additions & 0 deletions docs/scripts/plugin.py
@@ -0,0 +1,38 @@
from pathlib import Path

from mkdocs.config import Config
from mkdocs.structure.files import File, Files
from mkdocs.structure.pages import Page

ADD_TO_DOCS = [
    "changelog.md",
]
VIRTUAL_FILES = dict[str, str]()


def on_files(files: Files, config: Config):
    """
    Add virtual files.
    """

    all_files = [file for file in files]

    for path in ADD_TO_DOCS:
        content = Path(path).read_text()
        VIRTUAL_FILES[path] = content

        file = File(
            path,
            config["docs_dir"],
            config["site_dir"],
            config["use_directory_urls"],
        )
        all_files.append(file)

    return Files(all_files)


def on_page_read_source(page: Page, config: Config) -> str | None:
    if page.file.src_path in VIRTUAL_FILES:
        return VIRTUAL_FILES[page.file.src_path]
    return None
52 changes: 52 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,52 @@
site_name: Persil

repo_url: https://github.com/bdura/persil

theme:
  name: material
  palette:
    - scheme: default
      toggle:
        icon: material/brightness-4
        name: Switch to dark mode
    - scheme: slate
      toggle:
        icon: material/brightness-7
        name: Switch to light mode

markdown_extensions:
  - admonition
  - pymdownx.superfences
  - pymdownx.highlight
  - pymdownx.inlinehilite
  - pymdownx.snippets
  - pymdownx.tabbed:
      alternate_style: true
  - footnotes

nav:
  - index.md
  - api-reference.md
  - changelog.md

watch:
  - persil/

plugins:
  - search
  - mkdocstrings:
      handlers:
        python:
          options:
            allow_inspection: true
            docstring_style: numpy
            docstring_section_style: spacy
            heading_level: 2
            members_order: source
            show_bases: false
            show_signature: false
            merge_init_into_class: true
  - mike

hooks:
  - docs/scripts/plugin.py
38 changes: 30 additions & 8 deletions persil/parser.py
@@ -14,9 +14,9 @@ class Parser(Generic[Input, Output]):
     """
     A Parser is an object that wraps a function whose arguments are
     a string to be parsed and the index on which to begin parsing.
-    The function should return either Result.success(next_index, value),
+    The function should return either `Result.success(next_index, value)`,
     where the next index is where to continue the parse and the value is
-    the yielded value, or Result.failure(index, expected), where expected
+    the yielded value, or `Result.failure(index, expected)`, where expected
     is a string indicating what was expected, and the index is the index
     of the failure.
     """
@@ -105,6 +105,14 @@ def combine(
         self,
         other: "Parser[Input, T]",
     ) -> "Parser[Input, tuple[Output, T]]":
+        """
+        Returns a parser which, if the initial parser succeeds, will
+        continue parsing with `other`. It will produce a tuple
+        containing the results from both parsers, in order.
+        The resulting parser fails if `other` fails.
+        """
+
         @Parser
         def combined_parser(stream: Input, index: int) -> Result[tuple[Output, T]]:
             res1 = self(stream, index)
@@ -142,7 +150,7 @@ def parse_partial(
         """
         Parses the longest possible prefix of a given string.
         Returns a tuple of the result and the unparsed remainder,
-        or raises ParseError
+        or raises `ParseError`.
         """

         result = self(stream, 0)
@@ -176,7 +184,8 @@ def map(
         map_function: Callable[[Output], T],
     ) -> "Parser[Input, T]":
         """
-        Returns a parser that transforms the produced value of the initial parser with map_function.
+        Returns a parser that transforms the produced value of the initial parser
+        with `map_function`.
         """

         @Parser
@@ -189,7 +198,7 @@ def mapped_parser(stream: Input, index: int) -> Result[T]:
     def result(self, value: T) -> "Parser[Input, T]":
         """
         Returns a parser that, if the initial parser succeeds, always produces
-        the passed in ``value``.
+        the passed in `value`.
         """

         @Parser
@@ -203,11 +212,24 @@ def result_parser(stream: Input, index: int) -> Result[T]:

         return result_parser

-    def times(self, min: int, max: int | None = None) -> "Parser[Input, list[Output]]":
+    def times(
+        self,
+        min: int,
+        max: int | None = None,
+        check_next: bool = False,
+    ) -> "Parser[Input, list[Output]]":
         """
-        Returns a parser that expects the initial parser at least ``min`` times,
-        and at most ``max`` times, and produces a list of the results. If only one
+        Returns a parser that expects the initial parser at least `min` times,
+        and at most `max` times, and produces a list of the results. If only one
         argument is given, the parser is expected exactly that number of times.
+        Parameters
+        ----------
+        min
+            Minimal number of times the parser should match.
+        max
+            Maximal number of times the parser should match.
+            Equal to `min` by default.
         """
         if max is None:
             max = min
6 changes: 6 additions & 0 deletions persil/stream.py
@@ -13,6 +13,12 @@ class SoftError(Exception):


 class Stream(Generic[In]):
+    """
+    The `Stream` API lets you apply parsers iteratively, and handles
+    the index bookkeeping for you. Its design goal is to be used with
+    the `from_stream` decorator.
+    """
+
     def __init__(self, inner: In, index: int = 0):
         self.inner = inner
         self.index = index