Exploring AI Bias: The Pitfalls of Good Intentions in Tech
AI is not a mystical entity. Any "good judgment" it seems to exhibit arises from either its ability to recognize patterns or from safeguards implemented by its developers. Remember, AI is not a human; it is a system designed for pattern recognition and labeling. When creating AI solutions, it’s essential to recognize that if it meets your launch criteria, you will get the results you specified—not necessarily what you desired. AI systems derive their functionality entirely from the patterns present in the data provided to them and optimize based on the objectives we set.
AI is fundamentally a tool for recognizing patterns.
Consequently, it should come as no surprise when the system utilizes patterns embedded in your data, regardless of your efforts to conceal them.
The Importance of Policy Layers for Reliability
If AI safety is a priority for you, it's crucial to advocate for the incorporation of policy layers in every AI system. Think of these layers as the AI equivalent of human etiquette.
Policy layers serve as a mechanism for ensuring AI behaves appropriately.
I know a few choice words in multiple languages, but you won't hear me say them publicly. This is not due to a lack of awareness; rather, I am consciously filtering my language. Society has instilled in me certain manners. Fortunately, a similar solution exists for AI: policy layers act as an additional logic layer that sits atop the machine learning or AI system. This essential safety net reviews, filters, and decides how to handle the system’s outputs.
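To make that concrete, here is a minimal sketch of what a policy layer might look like in code. Everything in it is a placeholder I've invented for illustration (the blocklist, the fallback message, and the `model.generate` interface), not a reference implementation:

```python
# A minimal illustrative policy layer: plain logic that reviews the model's raw
# output and decides what actually reaches the user. All names here are made up.

BLOCKED_PHRASES = {"offensive phrase one", "offensive phrase two"}  # maintained by humans

def policy_layer(raw_output: str) -> str:
    """Review, filter, and decide how to handle the system's output."""
    lowered = raw_output.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        # Swap the problematic output for a safe fallback. Updating the rules
        # above is a configuration change, not a model retrain.
        return "Sorry, I can't help with that."
    return raw_output

def respond(user_input: str, model) -> str:
    raw = model.generate(user_input)  # hypothetical interface to your ML system
    return policy_layer(raw)          # the safety net gets the final word
```

The point isn't the string matching (real policy layers can be far more sophisticated); it's that the rules live outside the model and can be changed in minutes rather than in a retraining cycle.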
In my previous article, I discussed how policy layers (and Batman) are pivotal in ensuring AI system reliability. They provide a straightforward first line of defense against foreseeable errors and can be adjusted quickly without needing to retrain complex systems, allowing for rapid responses to unexpected issues.
Should your AI system exhibit problematic behaviors that could lead to a public relations disaster, it would be advantageous to disable those outputs promptly. Without a policy layer, your engineering team’s best option may be to plan for a new model release… perhaps in the next quarter? Thankfully, policy layers are here to help!
Policy Layers as a Tool Against AI Bias
However, policy layers serve a dual purpose beyond reliability and safety; they are also effective at preventing AI systems from producing outputs that reinforce harmful biases. In a world where reality often diverges from the ideal, it's wise not to let your system repeat every ugly pattern it finds back at its users.
Policy layers represent the most robust implementation of good practices for machines.
The challenge arises when datasets, reflecting various human activities, contain undesirable elements. If your AI learns from such data, it may perpetuate—or even amplify—attitudes we collectively wish to abandon. Remember, historical data is the foundation of your training set, and even "real-time" data is still rooted in the past, albeit very recent. Fortunately, policy layers can prevent numerous behaviors that keep us tethered to a problematic history.
As a female scientist with expertise in mathematical statistics, I often wonder who shares my perspective on certain AI outputs. A policy layer could intervene to prevent such outputs from being generated in the first place.
Addressing Symptoms Rather Than Root Causes
If you grasp the true essence of a policy layer—a straightforward filtering mechanism that doesn’t require retraining when updates occur—you’ll realize that using one to combat AI bias only alleviates symptoms without addressing the deeper issues.
If you're in search of a genuine solution, consider improving your training methods, algorithms, data preparation, objectives, logging, or performance metrics. Or, if you're feeling ambitious, strive to create a better world, making it easier for your AI to reflect a more positive reality. The downside to these substantive solutions is that they demand considerable time to implement. In the meantime, you may find yourself resorting to quick fixes, such as “fairness through unawareness” or policy layers.
True solutions require time and effort.
Fairness through unawareness is the reflexive response of excluding potentially problematic information so that your system never gets to see it. While the idea sounds appealing, it often fails in practice.
Despite this, many continue to rely on it. Let’s examine why the strategy of shielding your system from undesirable data is as misguided as believing that children won’t learn inappropriate language if it’s never spoken at home.
Understanding Fairness Through Unawareness
Fairness through unawareness attempts to mitigate AI bias by selectively removing certain information from training datasets, occasionally supplemented by synthetic data. Broadly speaking, two methods can be employed:
- Removing features.
- Removing instances.
For those unfamiliar with this terminology, an instance refers to an individual data point (e.g., a single entry in an airline's dataset containing ticket prices, confirmation codes, and seat numbers), while a feature denotes a variable aspect across data points (e.g., a column representing frequent flyer numbers).
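If the distinction is easier to see in code, here is a toy version of that airline dataset with invented values. Each row is an instance; each column is a feature:

```python
import pandas as pd

# Toy airline bookings table: rows are instances, columns are features.
bookings = pd.DataFrame({
    "ticket_price":      [310.00, 125.50, 980.25],
    "confirmation_code": ["QX7R2B", "LM3K9A", "ZP1D4C"],
    "seat_number":       ["12A", "34C", "2B"],
    "frequent_flyer_id": ["FF-001", None, "FF-207"],
})

one_instance = bookings.iloc[0]              # a single data point (one booking)
one_feature  = bookings["frequent_flyer_id"] # a single variable across all bookings
```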
Using fairness through unawareness is akin to trying to avoid rudeness by not learning swear words. The first issue arises from the difficulty in controlling exposure to information as both humans and machines absorb data from the real world. Secondly, attempting to eliminate offensive language by remaining ignorant of its meaning hinders your capacity to effectively engage with reality. Furthermore, when applying this approach to a machine system, there's always the risk of overlooking certain stimuli, potentially leading to a grossly ineffective system. A more dependable solution lies in adopting proper manners: knowing something but choosing not to express it.
Let’s delve deeper into both strategies of unawareness.
Removing Features
“I don’t want my model to recognize this demographic attribute.” If your strategy hinges on feature removal, you may be heading for trouble.
The feature deletion approach is a well-intentioned directive that may ask you to refrain from using personal demographic data when training an AI system. It sounds great—after all, who wouldn’t support the idea of avoiding discrimination based on race, age, or gender?
However, I critique this method precisely because I oppose discrimination and its negative impacts. If you genuinely aspire to create a better world, you must demand more than just a pleasing-sounding strategy. Good intentions alone aren't sufficient; the approach must be effective, genuinely safeguarding those it aims to protect.
Thus, when regulators advise you to exclude problematic features, do it. However, don’t stop there. Assume that merely removing the feature doesn’t solve anything, and hold yourself accountable for developing a real solution.
Regardless of your noble intentions, removing an offensive feature from a complex dataset often results in the system recovering the deleted signal through other features. For instance, if you eliminate a candidate's gender information from a hiring system, the AI may still identify gender patterns by blending other inputs. It may not do so perfectly, but enough to perpetuate discrimination. The potential for extracting information from complex datasets is astonishing.
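If you want to see the effect for yourself, a crude diagnostic is to drop the protected attribute and then check how well the remaining features can reconstruct it. Here is a sketch with invented data and column names; in this toy table the proxies correlate with the deleted attribute by construction, so recovery is trivial, but the mechanism is the same one that plays out (more subtly) in real datasets:

```python
# Rough proxy-leakage check: after dropping a protected attribute, see how well
# the remaining features can predict it. High accuracy means the "removed"
# signal is still in the data. All values and column names are made up.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

applicants = pd.DataFrame({
    "years_experience":     [2, 7, 4, 10, 3, 8, 1, 6],
    "prior_job_title_code": [3, 1, 3, 2, 3, 1, 3, 2],
    "university_code":      [5, 9, 5, 9, 5, 9, 5, 9],
    "gender":               ["F", "M", "F", "M", "F", "M", "F", "M"],
})

X = applicants.drop(columns=["gender"])  # what the "unaware" model would see
y = applicants["gender"]                 # the attribute you tried to delete

leakage = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=2).mean()
print(f"Remaining features recover the deleted attribute ~{leakage:.0%} of the time")
```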
Consequently, while it may be tempting to congratulate yourself for being proactive, removing a feature seldom resolves the issue. The biases you aimed to eliminate may still manifest through your dataset. Identifying the true root cause is typically challenging (often requiring research-level efforts), and the journey to finding a solution can be lengthy. Your hasty adjustments may merely obscure the problem, leading you to mistakenly believe you've resolved it. Yet, upon launching your system, the same undesirable behaviors will likely emerge.
Creating the illusion of having acted is often more harmful than doing nothing at all.
Indeed, fostering the belief that you’ve made progress can be even more toxic than inaction. You may relax, take a break, and feel proud of your efforts, while your system continues to uphold historical biases. Remember, data originates from the past, and the further back it goes, the more it drags you toward eras you'd prefer to forget.
Desiring improvement is not enough—let’s take meaningful action.
In summary, the information you wish your system to overlook may be subtly interwoven across various features. The more complex your dataset, the greater the risk: your model will still learn what you intended it to ignore, only in a subtler, harder-to-diagnose form. Call me cynical, but I value actionable results over well-meaning intentions. I prefer striving for a better world rather than simply wishing for one. Let's commit to effective actions.
Deleting features generally fails to address the bias problem.
Thus, while you labor diligently to implement a genuine fix, consider utilizing a policy layer as a more reliable temporary measure than unawareness.
Removing Instances
Another approach to address bias might involve eliminating specific data points from your dataset or enhancing it with synthetic data.
This method is complex and shares similarities with handling outliers.
When you discard irrelevant or incorrect data points, the overall results typically improve. However, tampering with data that accurately reflects reality is risky.
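For concreteness, here is roughly what the two instance-level levers look like in code: deleting flagged rows outright, or keeping them but down-weighting them. The data, the flagging rule, and the weights are all invented placeholders, not a recommendation:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy stand-in for a labeled training set. "flagged" marks rows a reviewer
# considered tainted by a biased historical decision (purely illustrative).
training_data = pd.DataFrame({
    "feature_a": [1.0, 2.5, 0.3, 4.1, 2.2, 3.3],
    "feature_b": [0, 1, 0, 1, 1, 0],
    "label":     [0, 1, 0, 1, 1, 0],
    "flagged":   [False, True, False, False, True, False],
})

# Lever 1: delete the flagged instances outright.
cleaned = training_data[~training_data["flagged"]]

# Lever 2: keep every instance, but down-weight the flagged ones.
weights = training_data["flagged"].map({True: 0.2, False: 1.0})

model = LogisticRegression(max_iter=1000)
model.fit(training_data[["feature_a", "feature_b"]],
          training_data["label"],
          sample_weight=weights)  # many sklearn estimators accept per-row weights
```

Either way, you are editing the world your model learns from, which is exactly why the next point matters.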
A phrase I encourage all AI practitioners to keep in mind is:
“The world represented by your training data is the only world you can expect to succeed in.”
In other words, if you alter your training data so it no longer mirrors reality, your system may underperform upon deployment. Producing ineffective AI systems for unsuspecting users does not align with good practices.
While trade-offs between system performance and other goals are possible (like promoting a better future at the cost of current profits), sacrificing some classification accuracy to protect people from harm may be precisely the trade-off worth making.
This concept transcends AI and relates to a choice we all face in life. There’s plenty of negativity around us. You can choose to profit by reflecting it back to those around you or become a beacon of goodness amidst the chaos. By striving to inspire others with a vision for a brighter future, you may motivate them to exhibit kindness as well. Your software can embody this choice too.
In summary, just because something is true doesn’t mean your AI system should be based on it. You could opt to set an example through your software design, guiding your system’s performance toward the world you aspire to create rather than the one you currently inhabit. Just remember, this endeavor is unlikely to be simple or inexpensive. Safely deleting or reweighting training data remains a significant challenge for fairness researchers in AI. While a solution may emerge, in the meantime, regardless of the balance you strike between aspiration and reality, always prioritize safety nets.
Hello again, policy layers.
Thanks for Reading! Interested in a YouTube Course?
If you enjoyed this article and are interested in a comprehensive applied AI course designed for both beginners and experts, check out the one I created for your learning enjoyment.
P.S. Have you ever tried clicking the clap button on Medium multiple times to see what happens?
Enjoyed the Author? Connect with Cassie Kozyrkov
Let’s connect! You can find me on Twitter, YouTube, Substack, and LinkedIn. If you’d like to invite me to speak at your event, please use this form to get in touch.