Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added unicode generators #119

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

P-Andersson
Copy link

No documentation yet, though the implementation is tested and supports utf8 I'm looking for some feedback regarding a few issues.

  • Currently, the fundamental part here is the Unicode Codepoint, an integer which is identical to utf32 encoding. I'm not sure if it is a good idea to keep these concepts separate from each other or not.
  • The algorithm used to generate the codepoint heavily favors the low end of the possible range of values (the ASCII range) though higher values still come up a significant amount of the time. The reasoning is that characters that are likely to be specially treated (newlines, spaces, tabs...) occur there. This also gives a better distribution of byte sizes for utf8 than a pure uniform random distribution of codepoint values would.
  • Currently, it only generates valid unicode values (though still values that has been assigned no symbols yet). I'm not sure if invalid ones are interesting to create.

@emil-e
Copy link
Owner

emil-e commented Mar 17, 2016

Nice! I will review this as soon as I have the time, have been a little busy lately :)

/// within the the basic multilingual plane, though
/// any valid Unicode codepoint may be generated.
template<typename T, typename RandomType>
T generateCodePoint(rc::detail::BitStream<RandomType>& stream);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary qualification, this is in rc::detail already.

@emil-e
Copy link
Owner

emil-e commented Mar 19, 2016

Will continue review later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants