Observe the impact of bias in machine learning models and explore ways to eliminate such bias.
Bias refers to an unfair prejudice for or against a person, group, or thing. As machine learning becomes an integral part of our daily lives, a natural question arises: does machine learning suffer from bias as well? In this article, I'll dive into the problem and its implications, as well as the many ways to eliminate bias in machine learning models.
The potential of machine learning is impressive: it can power self-driving cars, accurately identify cancer in X-rays, and predict our interests based on past behavior. But while machine learning brings many advantages, it also presents many challenges. One of them is that the classifications and predictions a model produces can be biased. These biases are not benign. Depending on the decisions a model drives, they can lead to a wide range of consequences. It is therefore important to understand how bias is introduced into machine learning models, how to test for it, and how to eliminate it.
COMPAS, a tool for assessing the sentencing and parole of convicted offenders, is a well-known example of bias in machine learning. With many prisons overcrowded, the hope was that such an assessment could identify prisoners who are least likely to reoffend; those prisoners could then be considered for early release to make room for new inmates. A risk score is computed from a long questionnaire about each prisoner, including questions such as whether the prisoner's parents were ever in prison or whether friends or acquaintances have been in prison (but not the prisoner's race).
The tool was found to be reasonably successful at predicting which offenders would become recidivists, but when race was brought into the analysis, errors appeared in its predictions. It's worth pointing out that the jury is still out: the company that developed COMPAS subsequently provided data to support the results of its algorithm. Still, the case shows that wherever the question of bias can be raised, such systems will be challenged and will need to supply evidence of their fairness.
Machine learning is seen as a key tool in HR for a variety of use cases, from providing training recommendations to recruiting and other tactical activities. In 2014, Amazon began developing a system to screen candidates, automating the process of identifying the most promising applicants based on the text of their resumes. But Amazon later found that the algorithm seemed to favor men over women when selecting talent for engineering roles. After discovering the algorithm's lack of fairness and making several attempts to correct it, Amazon eventually abandoned the system.
The Google Photos app classifies objects by identifying what appears in an image, but users found that its labels exhibited a form of racial bias. Amazon's Rekognition, a commercial facial analysis system, has likewise been found to show gender and racial bias.
The final example is Microsoft's Tay Twitter bot. Tay was a conversational AI (chatbot) that learned by interacting with people on Twitter. The algorithm mined public data to build conversational models while continuously learning from its interactions on Twitter. Unfortunately, not all of Tay's interactions were positive, and Tay quickly absorbed the biases of the people it talked to. Even with a machine model, you reap what you sow.
Whatever the source of the bias, the recommendations of machine learning algorithms have a real impact on individuals and groups. Models that contain bias can help perpetuate that bias in a self-fulfilling way. It is therefore imperative to detect bias in these models and eliminate it as far as possible.
Bias can be attributed simply to the data, but its source is often elusive. It usually relates to how the data was collected, to the content of the data (does it contain features the model should ignore?), and to how the model itself is trained (for example, how "good" and "bad" outcomes are defined in the model's classification context).
If a machine learning algorithm is trained solely on daytime driving data, letting the model drive at night can lead to tragic results. This is different from human prejudice, but it demonstrates the same underlying problem: a dataset that does not represent the problem at hand.
Bias can also appear where we don't expect it. Amazon's recruiting tool, for example, penalized words used by some candidates and rewarded words used by others. In that case, the penalized terms were words more commonly used by women, and women were also underrepresented in the dataset: Amazon's tool was trained primarily on male resumes collected over a ten-year period, which biased it toward resumes written in language more typical of male applicants.
Humans can also inadvertently amplify bias in machine learning models. Human bias can be unconscious (also known as implicit bias), which means people may introduce bias without even being aware of it.
Let's look at how to detect bias in machine learning models and how to eliminate it.
Bias in machine learning datasets and models is common enough that many leaders in machine learning development now provide tools to help detect and address it.
Detecting bias starts with the dataset. A dataset may not represent the problem space (for example, training an autonomous vehicle using only daytime driving data). A dataset may also contain data that should not be taken into account (for example, an individual's race or gender). These are known as sample bias and prejudice bias, respectively.
Because data is often cleaned before being used to train or test a machine learning model, there is also exclusion bias: removing features that a developer considers irrelevant but that actually matter. Measurement bias occurs when the data collected for training differs from the data seen in production, for example when the training dataset is collected with one type of camera but the production data comes from a camera with different characteristics.
Finally, there is algorithmic bias, which comes not from the data used to train the model but from the model itself: the way the model is developed or trained leads to unfair results.
Now that I've provided examples of bias, let's look at how to detect and prevent it in machine learning models. We will explore solutions from Google, Microsoft, and IBM, as well as other open-source options.
Google's What-If Tool (WIT) is an interactive tool that lets users investigate machine learning models visually. WIT is now part of the open-source TensorBoard web application and provides the ability to analyze datasets and trained TensorFlow models. WIT lets you manually edit examples from your dataset and see the effect of the changes by running them back through the model. It also generates partial dependence plots that show how the model's results change as a feature is varied. WIT can apply a variety of fairness criteria to analyze model performance (for example, optimizing for group unawareness or equal opportunity). WIT is easy to use and includes a variety of demos to help users get started quickly.
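As a rough illustration, here is a minimal sketch of launching WIT inside a notebook via the witwidget package; `examples` (a list of tf.Example protos) and `predict_fn` (a function returning model scores for a batch of examples) are placeholders you would supply, and the label names are made up.

```python
# A minimal sketch: render the What-If Tool in a notebook.
# `examples` and `predict_fn` are placeholders to be supplied by the reader.
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

config_builder = (
    WitConfigBuilder(examples)                   # list of tf.Example protos
    .set_custom_predict_fn(predict_fn)           # fn: list of examples -> scores
    .set_label_vocab(["negative", "positive"])   # illustrative label names
)
WitWidget(config_builder, height=600)            # interactive WIT UI in the notebook
```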
IBM's AI Fairness 360 is one of the most comprehensive toolkits for detecting and mitigating bias in machine learning models. AI Fairness 360 is an open-source toolkit with more than 70 fairness metrics and more than 10 bias mitigation algorithms. The mitigation algorithms include optimized preprocessing, reweighing, disparate-impact removal, prejudice-remover regularization, and more. The metrics include Euclidean and Manhattan distance, statistical parity difference, and others. AI Fairness 360 comes with extensive tutorials and documentation. You can also use an interactive demo covering three datasets, including the COMPAS recidivism dataset, to explore the bias metrics and then apply a mitigation algorithm to see how the results compare with the original model. Designed as an open-source project, the toolkit allows researchers to add their own fairness metrics and mitigation algorithms.
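To make this concrete, here is a minimal sketch (not a full analysis) of the AI Fairness 360 workflow on the COMPAS data: compute the statistical parity difference, apply the Reweighing pre-processing algorithm, and compute the metric again. The group encodings follow the toolkit's COMPAS loader, and the dataset CSV must be obtained separately as the library's documentation describes.

```python
# Sketch: measure a fairness metric before and after the Reweighing algorithm.
from aif360.datasets import CompasDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

dataset = CompasDataset()      # requires the COMPAS CSV per the AIF360 docs
privileged = [{"race": 1}]     # group encodings as defined by the loader
unprivileged = [{"race": 0}]

# Statistical parity difference: difference in favorable-outcome rates
# between the unprivileged and privileged groups (0.0 means parity).
before = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=unprivileged,
                                  privileged_groups=privileged)
print("Before reweighing:", before.statistical_parity_difference())

# Reweighing assigns instance weights that balance outcomes across groups.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
transformed = rw.fit_transform(dataset)

after = BinaryLabelDatasetMetric(transformed,
                                 unprivileged_groups=unprivileged,
                                 privileged_groups=privileged)
print("After reweighing:", after.statistical_parity_difference())
```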
IBM researchers have also proposed a "composable bias rating of AI services," a bias-scoring system for machine learning models. The third-party scoring system envisioned here would be used to validate machine learning models for bias.
Just as Microsoft's Tay learned bias from interactions in the wild, datasets themselves can contain bias. The University of Maryland and Microsoft Research recently published a paper titled "What are the biases in my word embedding?", which established a method for using crowdsourcing to identify bias in text encodings (natural language). Word embeddings represent text in a high-dimensional space through feature vectors, and these feature vectors support vector arithmetic. This enables analogy puzzles such as "man is to king as woman is to x." Computing x yields "queen," which is a reasonable answer. But other analogies reveal bias: "man is to computer programmer as woman is to homemaker" reflects gender bias, and "father is to doctor as mother is to nurse" has the same problem. Microsoft demonstrated the ability to automatically detect bias in text encodings using association tests, inspired by the Implicit Association Test (IAT) that is widely used to measure human bias. The findings were then validated through crowdsourcing to confirm the bias.
Through this process, users of a word embedding can reduce the bias it carries.
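For readers who want to try the analogy arithmetic described above, here is a small sketch using gensim's pretrained GloVe vectors; the model name and query words are only illustrative.

```python
# Sketch: vector arithmetic over word embeddings to probe analogies.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # pretrained word vectors

# "man is to king as woman is to x"  ->  x is approximately king - man + woman
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# The same offset applied to occupation words can surface gendered associations.
print(vectors.most_similar(positive=["programmer", "she"], negative=["he"], topn=3))
```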
Bias has become one of the hottest areas of machine learning research over the past few years, and frameworks for detecting and correcting model bias are emerging.
Local Interpretable Model-agnostic Explanations (LIME) can be used to understand why a model made a specific prediction. LIME works with any model and provides a human-understandable explanation for a given prediction.
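As a quick sketch of how LIME is typically used on tabular data: assume a fitted scikit-learn classifier `clf`, training data `X_train`, a test row `X_test[0]`, and a `feature_names` list; all of these are placeholders.

```python
# Sketch: explain a single prediction with LIME on tabular data.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=np.array(X_train),     # placeholder training data
    feature_names=feature_names,         # placeholder feature names
    class_names=["denied", "approved"],  # illustrative class labels
    mode="classification",
)

# Which features pushed the model toward its decision for this one instance?
explanation = explainer.explain_instance(X_test[0], clf.predict_proba, num_features=5)
print(explanation.as_list())  # (feature condition, weight) pairs
```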
FairML is a toolbox that audits models by quantifying the relative importance of the model's inputs. This relative importance can then be used to assess the fairness of the model. FairML, too, works with any black-box model.
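FairML's own API is not reproduced here; as a sketch of the underlying idea, quantifying how strongly each input drives the model's output, the snippet below uses scikit-learn's permutation importance on a placeholder fitted classifier `clf` with held-out data `X_test`, `y_test`, and a `feature_names` list.

```python
# Sketch: estimate the relative importance of each input to a trained model.
from sklearn.inspection import permutation_importance

result = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda pair: -pair[1]):
    # A high importance for a protected attribute (e.g. gender) is a red flag.
    print(f"{name}: {score:.3f}")
```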
In many cases, a machine learning model is a black box: you can give it inputs and observe its outputs, but the way it maps inputs to outputs is hidden inside the trained model. Explainable models can expose how a machine learning model arrives at its conclusions, but until such models are universally adopted, another alternative is to keep a human in the loop.
Human-in-the-loop is a hybrid approach that pairs traditional machine learning with humans who monitor the model's results. This allows someone to notice when an algorithm's output should not be trusted or when the dataset is biased. Recall how Microsoft used crowdsourcing to validate its word-embedding bias findings; human-in-the-loop is a practical hybrid model that can be adopted today. Fairness itself is a double-edged sword, and there is still no consensus on a mathematical definition of fairness. We can adopt the four-fifths rule or other parity measures, but each has its shortcomings.
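As an illustration of the four-fifths rule mentioned above, the sketch below computes the disparate-impact ratio from made-up selection counts: the unprivileged group's selection rate should be at least 80% of the privileged group's.

```python
# Sketch: the four-fifths (disparate impact) rule with illustrative counts.
def disparate_impact(selected_unpriv, total_unpriv, selected_priv, total_priv):
    """Ratio of the unprivileged selection rate to the privileged selection rate."""
    return (selected_unpriv / total_unpriv) / (selected_priv / total_priv)

ratio = disparate_impact(selected_unpriv=30, total_unpriv=100,
                         selected_priv=50, total_priv=100)
print(f"disparate impact = {ratio:.2f}; passes four-fifths rule: {ratio >= 0.8}")
```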
Until we can build fully transparent and explainable models, we will need to rely on toolsets like these to measure and correct bias in machine learning models. Thankfully, these toolsets are feature-rich, offering a large number of fairness metrics and bias mitigation algorithms.