SixArm/usv-lib-rust-crate


Unicode Separated Values (USV)

Unicode Separated Values (USV) is a data format that uses Unicode separator characters to mark the boundaries between data parts: units, records, groups, and files.

The USV repo is https://github.com/sixarm/usv.

USV characters

Separators:

  • File Separator (FS) is U+001C or U+241C ␜

  • Group Separator (GS) is U+001D or U+241D ␝

  • Record Separator (RS) is U+001E or U+241E ␞

  • Unit Separator (US) is U+001F or U+241F ␟

Modifiers:

  • Escape (ESC) is U+001B or U+241B ␛

  • End of Transmission (EOT) is U+0004 or U+2404 ␄

Liners:

  • Carriage Return (CR) is U+000D

  • Line Feed (LF) is U+000A
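
The character lists above can be turned into a small lookup. The following is a minimal sketch using only the standard library; the `Special` enum and the `classify` function are hypothetical names chosen for illustration and are not part of the usv crate.

```rust
/// Kinds of USV special characters, following the lists above.
/// Hypothetical type for illustration; not part of the usv crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Special {
    FileSeparator,
    GroupSeparator,
    RecordSeparator,
    UnitSeparator,
    Escape,
    EndOfTransmission,
    Liner,
}

/// Classify one character. Each separator and modifier matches both its
/// control code point and its visible symbol; ordinary content gives `None`.
pub fn classify(c: char) -> Option<Special> {
    match c {
        '\u{001C}' | '\u{241C}' => Some(Special::FileSeparator),
        '\u{001D}' | '\u{241D}' => Some(Special::GroupSeparator),
        '\u{001E}' | '\u{241E}' => Some(Special::RecordSeparator),
        '\u{001F}' | '\u{241F}' => Some(Special::UnitSeparator),
        '\u{001B}' | '\u{241B}' => Some(Special::Escape),
        '\u{0004}' | '\u{2404}' => Some(Special::EndOfTransmission),
        '\u{000D}' | '\u{000A}' => Some(Special::Liner),
        _ => None,
    }
}
```

For instance, `classify('␟')` and `classify('\u{001F}')` both yield `Some(Special::UnitSeparator)`, while `classify('a')` yields `None`.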

Units

use usv::*;
let input = "a␟b␟";
let units: Units = input.units().collect();
assert_eq!(units, ["a", "b"]);

Records

use usv::*;
let input = "a␟b␟␞c␟d␟␞";
let records: Records = input.records().collect();
assert_eq!(records, [["a", "b"],["c", "d"]]);

Groups

use usv::*;
let input = "a␟b␟␞c␟d␟␞␝e␟f␟␞g␟h␟␞␝";
let groups: Groups = input.groups().collect();
assert_eq!(groups, [[["a", "b"],["c", "d"]],[["e", "f"],["g", "h"]]]);

Files

use usv::*;
let input = "a␟b␟␞c␟d␟␞␝e␟f␟␞g␟h␟␞␝␜i␟j␟␞k␟l␟␞␝m␟n␟␞o␟p␟␞␝␜";
let files: Files = input.files().collect();
assert_eq!(files, [[[["a", "b"],["c", "d"]],[["e", "f"],["g", "h"]]],[[["i", "j"],["k", "l"]],[["m", "n"],["o", "p"]]]]);

Token

A token is the underlying USV enumeration used when parsing a string into output:

pub enum Token {
    Unit(String),
    UnitSeparator,
    RecordSeparator,
    GroupSeparator,
    FileSeparator,
    EndOfTransmission,
}
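
To show how a string might map onto these tokens, here is a minimal standard-library sketch. It is not the crate's own parser: `tokenize_sketch` is a hypothetical function, the `Token` enum is redeclared so the example is self-contained, and escape and liner handling are deliberately left out as a simplifying assumption.

```rust
/// Redeclared here so the sketch stands alone; mirrors the enum above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Token {
    Unit(String),
    UnitSeparator,
    RecordSeparator,
    GroupSeparator,
    FileSeparator,
    EndOfTransmission,
}

/// Hypothetical tokenizer: splits `input` into content and separator tokens.
/// Assumption: escapes and liners are not interpreted in this sketch.
pub fn tokenize_sketch(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut content = String::new();
    for c in input.chars() {
        let separator = match c {
            '\u{001F}' | '\u{241F}' => Token::UnitSeparator,
            '\u{001E}' | '\u{241E}' => Token::RecordSeparator,
            '\u{001D}' | '\u{241D}' => Token::GroupSeparator,
            '\u{001C}' | '\u{241C}' => Token::FileSeparator,
            '\u{0004}' | '\u{2404}' => Token::EndOfTransmission,
            _ => {
                // Ordinary character: keep accumulating unit content.
                content.push(c);
                continue;
            }
        };
        // A unit separator always closes a unit, even an empty one;
        // any other separator only flushes content that is pending.
        if separator == Token::UnitSeparator || !content.is_empty() {
            tokens.push(Token::Unit(std::mem::take(&mut content)));
        }
        let done = separator == Token::EndOfTransmission;
        tokens.push(separator);
        if done {
            // Everything after End of Transmission is ignored.
            break;
        }
    }
    if !content.is_empty() {
        tokens.push(Token::Unit(content));
    }
    tokens
}
```

Under these assumptions, `tokenize_sketch("a␟b␟␞")` gives a unit `a`, a unit separator, a unit `b`, a unit separator, and a record separator, in that order.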

Type aliases

  • Token = described above

  • Tokens = Vec<Token>

  • Unit = String

  • Units = Vec<Unit>

  • Record = Units

  • Records = Vec<Record>

  • Group = Records

  • Groups = Vec<Group>

  • File = Groups

  • Files = Vec<File>
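
The aliases nest one level at a time. The sketch below assumes each `Vec` holds the alias one level beneath it (for example, `Units` as a vector of `Unit`) and redeclares the aliases so it compiles on its own. The `records_sketch` function is hypothetical: it splits on the record and unit separators with the standard library only and is not the crate's implementation.

```rust
// Aliases redeclared for a self-contained example (element types assumed).
pub type Unit = String;
pub type Units = Vec<Unit>;
pub type Record = Units;
pub type Records = Vec<Record>;
pub type Group = Records;
pub type Groups = Vec<Group>;
pub type File = Groups;
pub type Files = Vec<File>;

/// Hypothetical helper: split a USV string into records of units.
/// `split_terminator` drops the empty piece after a trailing separator,
/// so "a␟b␟␞" yields one record with two units.
pub fn records_sketch(input: &str) -> Records {
    input
        .split_terminator(|c: char| c == '\u{001E}' || c == '\u{241E}')
        .map(|record| {
            record
                .split_terminator(|c: char| c == '\u{001F}' || c == '\u{241F}')
                .map(|unit| unit.to_string())
                .collect::<Record>()
        })
        .collect()
}
```

Because `Records` expands to `Vec<Vec<String>>`, the result can be compared directly against nested vectors or arrays of string slices, as the assertions in the sections above do.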

About

Unicode Separated Values (USV) library Rust crate
