Where does the definition of natural language draw the line between natural languages and other human languages? #16

Open
cstrobbe opened this issue Sep 30, 2022 · 1 comment

@cstrobbe

The definition of natural language currently reads:

> Natural Language (sometimes just language) refers to the spoken, written, or signed communications used by human beings. (...)

What criteria are silently assumed to differentiate between natural languages and other human languages?

  1. Languages that evolved versus languages that were constructed for communication between people? This would exclude languages such as Esperanto, Ido and Klingon.
  2. Languages that people learn as a native language versus ones that aren't? ("Natural language" is sometimes defined as a language learnt as a native language.) This would seem to include Esperanto (which is said to have native speakers) but would exclude Latin, Homeric Greek and a number of languages that have recently become extinct.

The ISO 639 language codes don't exclude ancient languages, extinct languages or constructed languages.

In 2008, WCAG 2.0 intentionally used the term "human language" instead of "natural language", to avoid a term that might be interpreted as excluding constructed languages such as Esperanto and extinct or historical languages such as Latin. Content in these languages exists online and can be identified using ISO 639 language tags.

@aphillips
Collaborator

> What criteria are silently assumed to differentiate between natural languages and other human languages?

There are no such criteria because there is no "other" here. While some dictionary definitions prefer a narrow definition of "natural" (as a contrast with "artificial" languages such as Klingon), in computing the term primarily means "human language" through its association with "natural language processing". The contrast here is with machine languages.

We could add "human language" as an additional term here. Note that the definition here is drawn from and is meant to reflect those found in BCP47 (which is our preferred reference for language identification) such as:

> Language tags are used to help identify languages, whether spoken, written, signed, or otherwise signaled, for the purpose of communication. This includes constructed and artificial languages but excludes languages not intended primarily for human communication, such as programming languages.
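To illustrate the point about scope, here is a minimal Python sketch showing that the constructed and historical languages named in this thread are tagged the same way as any living language. The helper names (`primary_subtag`, `describe`) and the `EXAMPLES` table are hypothetical and hand-picked for this thread. The check covers only the 2-to-3-letter shape of the primary language subtag, not registration in the IANA registry, and it does not handle grandfathered or private-use tags.

```python
import re

# Shape of an ISO 639-based primary language subtag in BCP 47: 2 or 3 ASCII letters.
# This checks shape only; it does not consult the IANA subtag registry.
PRIMARY_SUBTAG = re.compile(r"[A-Za-z]{2,3}")

# Hand-picked examples (an assumption of this sketch, not a registry extract).
# The category labels follow the wording used in this thread.
EXAMPLES = {
    "eo": ("Esperanto", "constructed"),
    "io": ("Ido", "constructed"),
    "tlh": ("Klingon", "constructed"),
    "la": ("Latin", "historical or extinct"),
    "grc": ("Ancient Greek", "historical or extinct"),
}


def primary_subtag(tag: str) -> str:
    """Return the lowercased primary language subtag of a simple language tag."""
    first = tag.split("-", 1)[0]
    if not PRIMARY_SUBTAG.fullmatch(first):
        raise ValueError(f"not a 2-3 letter primary language subtag: {tag!r}")
    return first.lower()


def describe(tag: str) -> str:
    """Describe a tag using the small example table above."""
    code = primary_subtag(tag)
    name, category = EXAMPLES.get(code, ("not in this sketch", "unknown"))
    return f"{code}: {name} ({category})"


if __name__ == "__main__":
    for tag in ("eo", "io", "tlh", "la", "grc"):
        print(describe(tag))
```

Nothing in the tag syntax distinguishes these from tags such as `en` or `fr`, which is consistent with the BCP47 text quoted above.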

@aphillips aphillips self-assigned this Sep 30, 2022