Initial commit
ikonst committed Nov 24, 2021
0 parents commit 87107bb
Showing 17 changed files with 555 additions and 0 deletions.
25 changes: 25 additions & 0 deletions .github/workflows/release.yaml
@@ -0,0 +1,25 @@
name: Release

on:
  release:
    types: [published]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.8'
      - name: Install dependencies
        run: pip install -r requirements-dev.txt
      - name: Build packages
        run: python setup.py sdist
      - name: Publish to PyPI
        if: github.event_name == 'release' && startsWith(github.ref, 'refs/tags')
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}
30 changes: 30 additions & 0 deletions .github/workflows/test.yaml
@@ -0,0 +1,30 @@
name: Tests

on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version:
          - '3.8'

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          pip install -r requirements-dev.txt
      - name: Run tests
        run: |
          pytest
6 changes: 6 additions & 0 deletions .gitignore
@@ -0,0 +1,6 @@
.idea
*.egg-info/
__pycache__
build/
dist/
.coverage
25 changes: 25 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,25 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.0.1
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
  - repo: https://github.com/asottile/reorder_python_imports
    rev: v2.6.0
    hooks:
      - id: reorder-python-imports
        args: ['--py38-plus']
  - repo: https://github.com/psf/black
    rev: 21.11b1
    hooks:
      - id: black
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.910-1
    hooks:
      - id: mypy
        additional_dependencies:
          - --no-compile
          - ruyaml==0.20.0
          - jschon==0.7.3
46 changes: 46 additions & 0 deletions README.md
@@ -0,0 +1,46 @@
`jschon-sort` sorts a JSON or YAML document according to its JSON Schema:
object properties are ordered to match the order in which JSON Schema properties (that match them) are declared.

The "jschon" name relates to it being based on the [jschon](https://github.com/marksparkza/jschon) library
for JSON Schema handling.

## Motivation

Per the JSON RFC, an object is an unordered collection. In practice, within serialized JSON or YAML files,
a particular order of properties can benefit readability: for example,
`{"start": 10, "end": 20}` reads more naturally than the naive lexicographic order of `{"end": 20, "start": 10}`
(which would result from `json.dumps(..., sort_keys=True)`).
While there are [several](https://github.com/json-schema/json-schema/issues/119)
[attempts](https://github.com/json-schema-org/json-schema-spec/issues/571)
to introduce property ordering into JSON Schema, here we're taking a different approach.
By leveraging the fact that the JSON Schema itself is written with human maintainers in mind,
we can extrapolate the intuitive order from the ordering of the JSON Schema definitions and apply it to the document itself.
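
To make the contrast concrete, here is a small standard-library sketch of the `sort_keys=True` behavior mentioned above. It does not use `jschon-sort`; the variable names are illustrative only.

```python
import json

doc = {"start": 10, "end": 20}

# By default, json.dumps keeps the dict's insertion order.
as_written = json.dumps(doc)

# sort_keys=True forces lexicographic key order, so "end" comes before "start".
lexicographic = json.dumps(doc, sort_keys=True)

print(as_written)
print(lexicographic)
```

Neither output is driven by the schema; the first merely reflects whatever order the document happened to be written in.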

## Example

Given **schema**:

```json
{
  "type": "object",
  "properties": {
    "range": {
      "type": "object",
      "properties": {
        "start": {"type": "number"},
        "end": {"type": "number"}
      }
    }
  }
}
```

the following **document**:

```json
{"range": {"end": 20, "start": 10}}
```
would be reordered as:
```json
{"range": {"start": 10, "end": 20}}
```
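
The idea can be sketched in plain Python. This is a deliberately simplified illustration, not the jschon-based implementation: the hypothetical helper `sort_by_schema_properties` only follows `properties` and a single-schema `items`, and it skips validation, `$ref`, and every other keyword. Keys that the schema does not declare are assumed to go last, ordered by name.

```python
import json
import math
from typing import Any


def sort_by_schema_properties(doc: Any, schema: Any) -> Any:
    """Reorder object keys to follow the order of the schema's "properties" (simplified sketch)."""
    if isinstance(doc, dict):
        properties = schema.get("properties", {}) if isinstance(schema, dict) else {}
        order = {name: idx for idx, name in enumerate(properties)}
        # Undeclared keys get an infinite rank so they sort last; the key name breaks ties.
        keys = sorted(doc, key=lambda k: (order.get(k, math.inf), k))
        return {k: sort_by_schema_properties(doc[k], properties.get(k, {})) for k in keys}
    if isinstance(doc, list):
        items = schema.get("items", {}) if isinstance(schema, dict) else {}
        return [sort_by_schema_properties(v, items) for v in doc]
    return doc


schema = {
    "type": "object",
    "properties": {
        "range": {
            "type": "object",
            "properties": {
                "start": {"type": "number"},
                "end": {"type": "number"},
            },
        }
    },
}
doc = {"range": {"end": 20, "start": 10}}

result = sort_by_schema_properties(doc, schema)
print(json.dumps(result))
```

Only the key order changes; the values stay attached to their original keys.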
5 changes: 5 additions & 0 deletions jschon_sort/__init__.py
@@ -0,0 +1,5 @@
from ._main import sort_doc_by_schema

__all__ = [
    'sort_doc_by_schema',
]
96 changes: 96 additions & 0 deletions jschon_sort/_main.py
@@ -0,0 +1,96 @@
import copy
import math
from typing import Dict
from typing import List
from typing import Tuple

import jschon.jsonschema
from jschon.json import AnyJSONCompatible


def _get_sort_keys_for_json_nodes(node: jschon.JSON) -> Dict[jschon.JSONPointer, Tuple[int, ...]]:
    """
    Gets a mapping from JSON nodes (as JSON pointers) to sort keys (as tuples of integers) that match their position
    within the JSON.
    """
    mapping = {}

    def _recurse(node: jschon.JSON, node_sort_key: Tuple[int, ...]) -> None:
        if node.type == "object":
            for idx, v in enumerate(node.data.values()):
                new_loc = (*node_sort_key, idx)
                mapping[v.path] = new_loc
                _recurse(v, new_loc)
        elif node.type == "array":
            for idx, v in enumerate(node.data):
                new_loc = (*node_sort_key, idx)
                _recurse(v, new_loc)

    _recurse(node, ())

    return mapping


def sort_doc_by_schema(doc_data: AnyJSONCompatible, schema_data: AnyJSONCompatible) -> AnyJSONCompatible:
    schema_json = jschon.JSON(schema_data)
    schema_sort_keys = _get_sort_keys_for_json_nodes(schema_json)

    try:
        schema = jschon.JSONSchema(schema_data)
    except jschon.CatalogError:
        # jschon only supports newer jsonschema drafts
        schema_data = copy.copy(schema_data)
        schema_data['$schema'] = "https://json-schema.org/draft/2020-12/schema"
        schema = jschon.JSONSchema(schema_data)

    doc_json = jschon.JSON(doc_data)
    res = schema.evaluate(doc_json)
    if not res.valid:
        raise ValueError('Document failed schema validation')

    doc_sort_keys: Dict[jschon.JSONPointer, Tuple[int, ...]] = {}

    def _traverse_scope(scope: jschon.jsonschema.Scope) -> None:
        for child in scope.iter_children():
            doc_sort_keys[child.instpath] = schema_sort_keys[child.path]
            _traverse_scope(child)

    _traverse_scope(res)

    end_sort_key = (math.inf,)

    def _sort_json_node(node: AnyJSONCompatible, json_node: jschon.JSON) -> AnyJSONCompatible:
        """Traverses the nodes while also keeping a pointer at a high-level JSON object (to get the JSON pointers)."""
        if json_node.type == "object":
            key_sort_keys: Dict[str, Tuple[Tuple[float, ...], str]] = {}

            properties: List[Tuple[str, AnyJSONCompatible]] = []

            k: str
            v: AnyJSONCompatible
            v_json: jschon.JSON
            for (k, v), v_json in zip(node.items(), json_node.data.values()):
                properties.append((k, _sort_json_node(v, v_json)))
                # Keys which don't map to the schema (e.g. undefined properties when additionalProperties is missing,
                # defaulting to true) are assumed to come last (end_sort_key).
                # As a tie breaker for multiple such undefined properties, we use the key's name.
                # TODO: update jschon to add additional properties to res.children when appropriate
                key_sort_keys[k] = doc_sort_keys.get(v_json.path, end_sort_key), k

            properties.sort(key=lambda pair: key_sort_keys[pair[0]])

            # to maintain YAML round-trip data, copy node and re-populate
            node_copy = node.copy()
            node_copy.clear()
            node_copy.update(properties)

            return node_copy

        elif json_node.type == "array":
            return [_sort_json_node(node[idx], v_json) for idx, v_json in enumerate(json_node.data)]

        return node

    # we recurse down both the "JSON" and the actual document, and mutate only the actual document
    # which is the primitive type that we can serialize back to JSON/YAML easily
    return _sort_json_node(doc_data, doc_json)
22 changes: 22 additions & 0 deletions jschon_sort/_yaml.py
@@ -0,0 +1,22 @@
from typing import Any
from typing import NamedTuple

import ruyaml.representer


class YamlIndent(NamedTuple):
    mapping: int
    sequence: int
    offset: int


def create_yaml(*, indent: YamlIndent) -> ruyaml.main.YAML:
    def _null_representer(self: ruyaml.representer.BaseRepresenter, data: None) -> Any:
        return self.represent_scalar('tag:yaml.org,2002:null', 'null')

    yaml = ruyaml.main.YAML()
    yaml.indent(**indent._asdict())
    yaml.preserve_quotes = True  # type: ignore[assignment]
    yaml.width = 4096  # type: ignore[assignment]
    yaml.Representer.add_representer(type(None), _null_representer)
    return yaml
62 changes: 62 additions & 0 deletions jschon_sort/cli.py
@@ -0,0 +1,62 @@
import argparse
import json
import sys

import jschon

from ._main import sort_doc_by_schema
from ._yaml import create_yaml
from ._yaml import YamlIndent


def main():
    jschon.create_catalog('2020-12')

    parser = argparse.ArgumentParser(
        prog='jschon-sort',
        description="Sorts a JSON or YAML document to match a JSON Schema's order of properties",
    )
    parser.add_argument('path', help='path to the JSON / YAML document')
    parser.add_argument('schema_path', help='path to the JSON Schema document')
    parser.add_argument(
        '--dry-run',
        '-n',
        dest='dry_run',
        help='if set, result is not persisted back to the original file',
        action='store_true',
    )
    parser.add_argument('--indent', type=int, dest='indent', default=4, help='indent size')
    parser.add_argument(
        '--yaml-indent',
        type=lambda s: YamlIndent(*map(int, s.split(','))),
        dest='yaml_indent',
        metavar='MAPPING,SEQUENCE,OFFSET',
        default=YamlIndent(2, 4, 2),
        help='YAML indent size',
    )
    args = parser.parse_args()

    is_yaml = args.path.endswith('.yaml') or args.path.endswith('.yml')
    yaml = create_yaml(indent=args.yaml_indent)
    with open(args.path) as f:
        if is_yaml:
            doc_data = yaml.load(f)
        else:
            doc_data = json.load(f)

    with open(args.schema_path) as f:
        schema_data = json.load(f)

    sorted_doc_data = sort_doc_by_schema(doc_data, schema_data)

    if not args.dry_run:
        if is_yaml:
            with open(args.path, 'w') as f:
                yaml.dump(sorted_doc_data, f)
        else:
            with open(args.path, 'w') as f:
                json.dump(sorted_doc_data, f, indent=args.indent)


if __name__ == '__main__':
    main()  # pragma: no cover
Empty file added jschon_sort/py.typed
Empty file.
16 changes: 16 additions & 0 deletions pyproject.toml
@@ -0,0 +1,16 @@
[tool.black]
line-length = 120
skip-string-normalization = true

[tool.mypy]
warn_return_any = true
warn_unused_configs = true
strict_optional = true
strict_equality = true
warn_no_return = true
check_untyped_defs = true
warn_redundant_casts = true
show_error_codes = true
implicit_reexport = false
warn_unreachable = true
disallow_incomplete_defs = true
5 changes: 5 additions & 0 deletions requirements-dev.txt
@@ -0,0 +1,5 @@
pytest
pytest-cov
ruyaml==0.20.0
jschon==0.7.3
-e .
41 changes: 41 additions & 0 deletions setup.cfg
@@ -0,0 +1,41 @@
[metadata]
name = jschon-sort
version = 0.0.1
description = Sorts a JSON or YAML document to match a JSON Schema's order of properties
long_description = file: README.md
long_description_content_type = text/markdown
url = https://www.github.com/ikonst/jschon-sort
maintainer = Ilya Konstantinov
maintainer_email = [email protected]
classifiers =
    Programming Language :: Python :: 3

[options]
packages = find:
install_requires =
    jschon
    ruyaml
python_requires = >=3.8

[options.package_data]
jschon_sort =
    py.typed

[options.packages.find]
exclude = tests*

[options.entry_points]
console_scripts =
    jschon-sort = jschon_sort.cli:main

[tool:pytest]
addopts = --cov

[coverage:run]
branch = true

[coverage:report]
fail_under = 100
show_missing = true
omit =
    setup.py
3 changes: 3 additions & 0 deletions setup.py
@@ -0,0 +1,3 @@
from setuptools import setup

setup()