Normalize function #12

Open
libre-man opened this issue Mar 5, 2017 · 1 comment

Comments

@libre-man

libre-man commented Mar 5, 2017

For the data processing I do from time to time, it is often really useful to have a 'normalize' function. Such a function converts each Unicode character to its ASCII 'equivalent' character or string. This is useful because many older systems do not support Unicode, so to match strings I have to convert the Unicode ones to their ASCII equivalents. The function I use for this is a simple lookup in a large table, like this:

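(defun normalize (string)
  "Return STRING with each character replaced by its ASCII equivalent."
  (check-type string string)
  ;; +unicode-lookup-table+ is indexed by character code; each entry is
  ;; the ASCII replacement (a character or a string).
  (format nil "~{~A~}"
          (loop :for el :across string
                :collect (aref +unicode-lookup-table+
                               (char-code el)))))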

I think such a function could be really useful in a unicode library. Is this something that fits this library and how would you feel about a pull request to add this functionality?

@hanshuebner
Member

Support for transliteration in cl-unicode would be useful, but it should be extensible to support new or user-defined transliteration schemes. See http://cldr.unicode.org/index/cldr-spec/transliteration-guidelines and http://www.unicode.org/cldr/charts/latest/transforms/index.html for some schemes known to the Unicode Consortium.

At the very least, the function provided by cl-unicode should accept an argument to indicate which scheme to use. normalize is not a good function name; transliterate seems better.
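
To make the suggestion concrete, here is a minimal sketch of what a scheme-aware interface might look like. Everything in it is an assumption for illustration: `*transliteration-schemes*`, `register-transliteration-scheme`, the `:scheme` keyword and the `:latin-ascii` name are hypothetical and not part of cl-unicode, and the four-entry table stands in for one that would really be generated from CLDR data.

```lisp
;; Hypothetical sketch: none of these names exist in cl-unicode.
(defvar *transliteration-schemes* (make-hash-table :test #'eq)
  "Maps a scheme name (a keyword) to a hash table from character to
replacement string.")

(defun register-transliteration-scheme (name mappings)
  "Register scheme NAME from MAPPINGS, an alist of (character . string)."
  (let ((table (make-hash-table :test #'eql)))
    (loop :for (char . replacement) :in mappings
          :do (setf (gethash char table) replacement))
    (setf (gethash name *transliteration-schemes*) table)
    name))

(defun transliterate (string &key (scheme :latin-ascii))
  "Return a fresh string with each character of STRING replaced according
to SCHEME. Characters without an entry in the scheme are copied unchanged."
  (check-type string string)
  (let ((table (or (gethash scheme *transliteration-schemes*)
                   (error "Unknown transliteration scheme: ~S" scheme))))
    (with-output-to-string (out)
      (loop :for char :across string
            :do (write-string (gethash char table (string char)) out)))))

;; A tiny demonstration scheme. CODE-CHAR is used instead of literal
;; non-ASCII characters so the file does not depend on its source encoding.
(register-transliteration-scheme
 :latin-ascii
 (list (cons (code-char #xE9) "e")     ; e with acute accent
       (cons (code-char #xFC) "u")     ; u with diaeresis
       (cons (code-char #xDF) "ss")    ; sharp s
       (cons (code-char #xE6) "ae")))  ; ae ligature

;; Usage: a string containing the sharp s is mapped to plain ASCII.
(transliterate (format nil "Stra~Ce" (code-char #xDF)) :scheme :latin-ascii)
```

A hash table per scheme keeps the tables sparse and lets users register their own schemes at run time, whereas a single vector indexed by character code needs an entry for every code point. The trade-off is a hash lookup per character instead of an array access.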
