Commit

proper readme, prepare for release
Fogapod committed Nov 7, 2023
1 parent dab6f4c commit 6d84014
Showing 10 changed files with 111 additions and 9 deletions.
12 changes: 10 additions & 2 deletions Cargo.toml
@@ -3,9 +3,17 @@ name = "pink_accents"
version = "0.0.1"
authors = ["fogapod"]
edition = "2021"
description = "Accent system based on string pattern matching"
description = "Replacement of patterns in string to simulate speech accents"
repository = "https://github.com/Fogapod/pink_accents"
homepage = "https://github.com/Fogapod/pink_accents"
license = "AGPL-3.0"
exclude = ["/.github", "/examples/accents", "/tests/sample_text.txt"]
keywords = ["text"]
categories = ["text-processing"]
exclude = [
"/.github",
"/examples",
"/tests/sample_text.txt",
]

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

104 changes: 99 additions & 5 deletions README.md
@@ -1,9 +1,103 @@
# pink accents

Accent system based on string pattern matching.
Lets you define a set of patterns to be replaced in a string. This is a glorified regex replace, or rather a sequence of them. The primary use case is simulating silly speech accents.

Originally based on python [pink-accents](https://git.based.computer/fogapod/pink-accents) and
primarily developed for [ssnt](https://github.com/SS-NT/ssnt/tree/main)
game.
Originally based on the Python [pink-accents](https://git.based.computer/fogapod/pink-accents) and primarily developed for the [ssnt](https://github.com/SS-NT/ssnt/tree/main) game.

See examples in [examples](examples/accents) folder.
Currently unusable on its own because you cannot construct `Accent` using internal structures, but there is a plan to support programmatic definitions.

## Types of replacements

An accent is a sequence of rules which are applied in order.
Each rule consists of a regex pattern and a replacement. When a regex match occurs, the replacement is called. It then decides what to put in place of the match (if anything).
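
The "applied in order" part can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Rule` and `apply_rules` are hypothetical names, and plain substring replacement stands in for regex matching so the sketch needs no dependencies.

```rust
/// Hypothetical stand-in for a rule: a literal pattern and its replacement.
/// The real crate matches with regexes; substring matching is used here only
/// to keep the sketch dependency-free.
struct Rule {
    pattern: &'static str,
    replacement: &'static str,
}

/// Applies every rule in order; each rule sees the output of the previous one.
fn apply_rules(rules: &[Rule], input: &str) -> String {
    let mut text = input.to_owned();
    for rule in rules {
        text = text.replace(rule.pattern, rule.replacement);
    }
    text
}
```

Because each rule works on the result of the previous one, a later rule can rewrite what an earlier rule produced, so the order of rules in an accent matters.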

Possible replacements are:

- `Noop`: Do not replace
- `Simple`: Puts string as is
- `Any` (recursive): Selects random replacement with equal weights
- `Weights` (recursive): Selects replacement based on relative weights
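
How the four kinds could resolve, including the recursive ones, is sketched below. The `Replacement` enum and `resolve` method are hypothetical and only mirror the names in the list above; they are not the crate's actual types. Random numbers are passed in through an assumed `rolls` iterator so the sketch stays deterministic and needs no external crate.

```rust
/// Hypothetical model of the four replacement kinds; not the crate's real types.
enum Replacement {
    Noop,
    Simple(String),
    Any(Vec<Replacement>),
    Weights(Vec<(u64, Replacement)>),
}

impl Replacement {
    /// Returns `Some(text)` to substitute, or `None` to leave the match as is.
    /// `rolls` is an assumed source of random numbers supplied by the caller.
    fn resolve(&self, rolls: &mut impl Iterator<Item = u64>) -> Option<String> {
        match self {
            Replacement::Noop => None,
            Replacement::Simple(text) => Some(text.clone()),
            Replacement::Any(items) => {
                if items.is_empty() {
                    return None;
                }
                // Equal weights: every item covers the same share of rolls.
                let index = (rolls.next().unwrap_or(0) % items.len() as u64) as usize;
                items[index].resolve(&mut *rolls)
            }
            Replacement::Weights(items) => {
                let total: u64 = items.iter().map(|(weight, _)| *weight).sum();
                if total == 0 {
                    return None;
                }
                // Walk the cumulative weights until the roll falls inside one.
                let mut point = rolls.next().unwrap_or(0) % total;
                for (weight, item) in items {
                    if point < *weight {
                        return item.resolve(&mut *rolls);
                    }
                    point -= *weight;
                }
                None
            }
        }
    }
}
```

With weights 32, 16, 8 and 1 the total is 57, so in this sketch the last item is picked for exactly one roll value out of 57.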

## Serialized format

The `deserialize` feature provides an opinionated way of defining rules, specifically designed for speech accents.
Deserialization is primarily developed to support the [ron](https://github.com/ron-rs/ron) format, which has its quirks, but it should work in json and maybe others.

Full reference:

```ron
(
    // on by default, tries to match input case with output after each rule
    // for example, if you replaced "HELLO" with "bye", it would use "BYE" instead
    normalize_case: true,
    // pairs of (regex, replacement)
    // this is same as `patterns` except that each regex is surrounded with \b to avoid copypasting.
    // `words` are applied before `patterns`
    words: [
        // this is the simplest rule to replace all "windows" words (separated by regex \b)
        // occurrences with "linux", case sensitive
        ("windows", Simple("linux")),
        // this replaces word "OS" with one of replacements, with equal probability
        ("os", Any([
            Simple("Ubuntu"),
            Simple("Arch"),
            Simple("Gentoo"),
        ])),
    ],
    // pairs of (regex, replacement)
    // this is same as `words` except these are used as is, without \b
    patterns: [
        // inserts one of the honks. first value of `Weights` is relative weight. higher means more likely
        ("$", Weights([
            (32, Simple(" HONK!")),
            (16, Simple(" HONK HONK!")),
            (08, Simple(" HONK HONK HONK!")),
            // ultra rare sigma honk - 1 / 57
            (01, Simple(" HONK HONK HONK HONK!!!!!!!!!!!!!!!")),
        ])),
    ],
    // accent can be used with severity (non negative value). higher severities can either extend
    // lower level or completely replace it.
    // default severity is 0. higher ones are defined here
    severities: {
        // extends previous severity (level 0, base one in this case), adding additional rules
        // below existing ones. words and patterns keep their relative order though - words are
        // processed first
        1: Extend(
            (
                words: [
                    // even though we are extending, defining same rule will overwrite result.
                    // relative order of rules remains the same: "windows" will remain first
                    ("windows", Simple("windoos")),
                ],
                // extend patterns, adding 1 more rule
                patterns: [
                    // replacements can be nested arbitrarily
                    ("[A-Z]", Weights([
                        // 50% to replace capital letter with one of the Es
                        (1, Any([
                            Simple("E"),
                            Simple("Ē"),
                            Simple("Ê"),
                            Simple("Ë"),
                            Simple("È"),
                            Simple("É"),
                        ])),
                        // 50% to do nothing, no replacement
                        (1, Noop),
                    ])),
                ],
            ),
        ),
        // replace severity 1 entirely. in this case with nothing. remove all rules on severity 2+
        2: Replace(()),
    },
)
```
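
The `Extend` and `Replace` behaviour described in the comments above can be sketched roughly as below. `Rules`, `Severity` and `next_level` are hypothetical names used for illustration, and replacements are reduced to plain strings. The sketch models a single rule list, whereas the format keeps `words` and `patterns` as separate lists; the merge logic is an assumption based only on those comments.

```rust
/// Hypothetical rule list: (pattern, replacement) pairs in definition order.
type Rules = Vec<(String, String)>;

/// Hypothetical mirror of the `Extend` / `Replace` severity variants.
enum Severity {
    Extend(Rules),
    Replace(Rules),
}

/// Builds the rule list for one severity level from the previous level.
fn next_level(previous: &Rules, severity: &Severity) -> Rules {
    match severity {
        // Replace discards everything inherited from lower levels.
        Severity::Replace(rules) => rules.clone(),
        Severity::Extend(extra) => {
            let mut merged = previous.clone();
            for (pattern, replacement) in extra {
                if let Some(index) = merged.iter().position(|(p, _)| p == pattern) {
                    // Same pattern: overwrite in place, keeping its position.
                    merged[index].1 = replacement.clone();
                } else {
                    // New pattern: append below the existing rules.
                    merged.push((pattern.clone(), replacement.clone()));
                }
            }
            merged
        }
    }
}
```

Under these assumptions, extending level 0 with a redefined `"windows"` rule keeps that rule first, while `Replace` with an empty list leaves no rules at that level.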

See more examples in [examples](examples) folder.
File renamed without changes.
2 changes: 1 addition & 1 deletion examples/accents/example.ron → examples/example.ron
@@ -14,7 +14,7 @@
// this is same as `patterns` except that each regex is surrounded with \b to avoid copypasting.
// `words` are applied before `patterns`
words: [
// this is the simplest rule to replace all "windows" words (separated by regex \b)
// this is the simplest rule to replace all "windows" words (separated by regex \b)
// occurrences with "linux", case sensitive
("windows", Simple("linux")),
// this replaces word "OS" with one of replacements, with equal probability
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion src/accent.rs
@@ -647,7 +647,7 @@ mod tests {
fn example_accents() {
let sample_text = fs::read_to_string("tests/sample_text.txt").expect("reading sample text");

for file in fs::read_dir("examples/accents").expect("read symlinked accents folder") {
for file in fs::read_dir("examples").expect("read symlinked accents folder") {
let filename = file.expect("getting file info").path();
println!("parsing {}", filename.display());

