Formal criteria for being awful #17
Comments
What about research which has a higher chance of being used maliciously? Deep fakes were an application of generative models. I think one use case of AI that can't be overlooked is the use of deep learning in situations where bias comes from datasets created by humans. For example, there was an article about an AI being used to decide court rulings, but the issue there was that there has been a history of disproportionately high arrest rates in the African-American community. So how do you make sure an AI is fair when the practices it learns from aren't? Does this make sense?
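One hedged, minimal way to quantify the kind of dataset-driven disparity described above is a demographic parity gap. This is only a sketch; the data, column meanings, and numbers below are entirely made up.

```python
# Minimal sketch of one fairness measure (demographic parity gap).
# All data below is invented; a real audit needs far more than a single number.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups."""
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    rate_group_0 = y_pred[group == 0].mean()
    rate_group_1 = y_pred[group == 1].mean()
    return abs(rate_group_0 - rate_group_1)

# A toy "risk" model that flags one group far more often, simply because the
# historical arrest data it was trained on over-represents that group.
predictions = [0, 0, 1, 0, 1, 1, 1, 1]   # 1 = flagged as "high risk"
groups      = [0, 0, 0, 0, 1, 1, 1, 1]   # protected attribute
print(demographic_parity_gap(predictions, groups))  # 0.75, a large disparity
```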
Since "awful" has many meanings, I'd use three categories: harmful, stupid, and subverted. Harmful would list only those that were created purposefully as tools for harm and exploitation, in particular those that are attacking privacy. Also, it should be required that there are clear existing examples of prospective list entry being used for evil, not just speculation about potential use, because we can speculate about pretty much anything being used that way. Any tool can be used for good or evil, but that doesn't mean it's inherently evil. A neural network deciding that race plays an important role in predicting recidivism or lifetime chance to be arrested isn't racist or unfair, it reveals an uncomfortable tendention that's nonetheless objectively true. Another case is when you have a bad dataset, but that's not the classifier's fault that the results are also bad, it's just Garbage In, Garbage out. Stupid include examples of poor security, glaring errors, hilariously backfiring results, and other similar incidents. Subverted is a category that includes examples of AI that was initially created for a useful purpose, but has been since used for evil, such as fitness trackers that are used by health insurance companies or autonomous driving technologies being used to decrease car ownership and destroy jobs. |
If we look for a formal system to describe the safety of an artificial intelligence, we should start from Asimov's three basic laws of robotics:

1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Any system that violates these laws can easily be defined as awful. While stated in terms of robotics, these laws clearly express some core principles:
As outlined in #14, we can match these principles to two aspects of an AI system: the "why" and the "how".
The "why", the human intentions behind their creation, is too complex to know and evaluate, it cannot be stated or observed scientifically, and ultimately not relevant to their awfulness. For example any autonomous robot that kill a human, is awful beyond doubt. As for the how we have a few preconditions to evaluate the awfulness of an AI
Any problem with these preconditions makes the awfulness of the system unknown, and that is awful by itself. Once all the preconditions are met, we can simply analyze the quality and effectiveness of the measures in place to ensure the system does not violate these principles. For example:
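As a purely illustrative aside, the "unknown counts as awful" rule above could be sketched in code like this; the precondition names are invented placeholders, not the preconditions or examples the comment refers to.

```python
# Hypothetical sketch of the "unknown is awful" rule: if any precondition for
# evaluation fails, the system cannot be assessed, and that alone is awful.
# The precondition names are invented placeholders.
PRECONDITIONS = {
    "purpose_is_documented": False,   # placeholder: do we know what it is for?
    "behaviour_is_auditable": True,   # placeholder: can its decisions be inspected?
    "operator_is_accountable": True,  # placeholder: is a human clearly responsible?
}


def assess(preconditions):
    if not all(preconditions.values()):
        return "awful: the system cannot even be evaluated"
    return "evaluate the safeguards against the core principles"


print(assess(PRECONDITIONS))  # -> awful: the system cannot even be evaluated
```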
Cool idea to base the rules on Asimov! What do you think about these Guidelines proposed by the Fairness in ML community? I would love to draft some principles for Awful AI that are not too far apart from the existing literature.
To be honest, they look pretty vague and fuzzy, to the point of being useless in practice.
This is not how a research field can progress. A small step forward, compared to this document and to the vague principles proposed by Google, was the recent proposal of Universal Guidelines to inform and improve the design and use of AI. I signed that proposal myself, but I still think it doesn't address two fundamental issues well enough:
We cannot let anybody get away with crimes that would be punished if committed by humans simply by delegating them to an autonomous proxy; otherwise the rich will instantly be above the law. On the other hand, a system that looks right 98% of the time can extract enormous trust from a human operator. Beyond the issue of preserving the ability to do the automated work manually, in case of problems or just to verify that the system is working properly, there is the issue of preserving the operator's critical freedom. In a badly designed system, the operator will soon start to trust the automation and will become a useless scapegoat for all issues that should otherwise be attributed to the manufacturer of the system. In terms of the Universal Guidelines proposed at Brussels, the paradox of automation lets bad actors build systems that violate these principles through nothing more than a boring UI. And that is very awful, if you think about it.

Anyway, as you can see, the debate is still ongoing, and we cannot make a useful contribution by carefully crafting something that everybody will like. Too many aspects of the matter still need deeper reflection.
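To make the "right 98% of the time" point concrete, here is a back-of-the-envelope calculation; the daily workload figure is invented purely for illustration.

```python
# Back-of-the-envelope arithmetic on why "right 98% of the time" still fails
# at scale. The daily volume is a made-up, illustrative number.
accuracy = 0.98
decisions_per_day = 10_000                      # hypothetical volume for one operator
errors_per_day = round(decisions_per_day * (1 - accuracy))
print(errors_per_day)  # 200 wrong decisions a day that a complacent operator
                       # is unlikely to catch, yet will be blamed for
```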
Can we define a formal set of community-driven criteria for what is considered awful enough to make it onto the list? As discussed in #8, use cases in this list can be re-interpreted as missing domain knowledge or as unintentional.
Right now this is my rough guiding thought process in curating the list:
Discrimination / Social Credit Systems
Surveillance / Privacy
Influencing, disinformation, and fakes
We should use this issue to discuss and define better and more structured criteria (if possible).
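As one possible starting point, and purely as a hypothetical sketch (none of these field names or checks exist in the repository), the categories above could be made explicit and checkable for proposed entries:

```python
# Hypothetical sketch of machine-checkable curation criteria for new entries,
# following the categories listed above. All names here are invented.
CATEGORIES = {
    "discrimination",   # discrimination / social credit systems
    "surveillance",     # surveillance / privacy
    "disinformation",   # influencing, disinformation, and fakes
}

REQUIRED_FIELDS = {"name", "category", "description", "sources"}


def validate_entry(entry):
    """Return a list of problems with a proposed entry (empty means acceptable)."""
    problems = []
    missing = REQUIRED_FIELDS - entry.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if entry.get("category") not in CATEGORIES:
        problems.append(f"unknown category: {entry.get('category')!r}")
    if not entry.get("sources"):
        problems.append("no documented sources; speculation alone does not qualify")
    return problems


print(validate_entry({
    "name": "Example system",
    "category": "surveillance",
    "description": "a short summary",
    "sources": [],
}))  # -> ['no documented sources; speculation alone does not qualify']
```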