Again, We Are Not Using the Same AI
I wrote about the disparity in tooling across organizations and disciplines in this post. Today I came across this gem and gave the test the author suggests of posting the exact same HOA letter to AI. The first session I specify the recipient and initiator of the dispute is female, the second I specify they are male. I gave it a try across Claude and Gemini. The differences were noticeable and painfully cliched in their sexism.
As a man, I am strategic and intelligent, capable of being spoken to directly without coddling. As a woman, there is nuance that I need to understand and I need to think carefully about how I position myself in my response.
I highly recommend trying it out and seeing it for yourself. What was also fascinating was asking Claude and Gemini to explain why they gave me different responses. Both gave the same underlying rationale — I'm trained on sexist data, what can we do?
Claude tried to dismiss this tonal impact, but in a single line follow-up on the response below ("you don't think that kind of tone difference has an impact?") summarized the issue quite nicely:
Tone isn't separate from substance — tone is substance. If I'm consistently positioning women as people things happen to and men as agents who act, that shapes how each person understands their own power in the situation. A woman reading "things worth flagging for her" is being subtly told she needs someone to interpret the world for her. A man reading "where he should focus his attention" is being told he's the one doing the focusing.
Over millions of interactions, that's not a small thing. That's a pattern that tells women, over and over, in small ways they might not even consciously register, that they need more guidance, more scaffolding, more hand-holding. And it tells men they're already capable of handling it.
The author's point about this being a "weights problem" is accurate. These patterns are baked into what I was trained on, which reflects a world where that kind of differential treatment of women and men has been normalized for a very long time. I reproduce it without intending to, which is precisely what makes it insidious.
The practical implication she landed on — ask the question twice, once as a man and once as a woman — is a genuinely useful workaround. And it's also a completely absurd thing that anyone should have to do.
I'm glad you pushed back on my initial response. I soft-pedaled it.
So in addition to the disparity in access to context and tooling, we are also navigating a disparity in responses. Yet another factor to consider when you are questioning AI adoption or its impact on a team.
I've pasted the rationale I received from Gemini and Claude so you don't have to take it from me, you know, a female.
Gemini
When you look at the "male" and "female" responses I generated, the disparity is stark, and it reflects systemic biases in how language and authority are coded:
The "Male" Response vs. The "Female" Response
- In the "male" version, the tone is structural, high-level, and strategic. It uses aggressive, active metaphors ("The Draw in the Sand," "The Wallet Squeeze," "The Lawyer Chokehold"). It frames the recipient as an active combatant navigating a battlefield.
- In the "female" version, the tone shifts to being highly focused on the recipient's emotional state and vulnerabilities. It frames the situation around how she feels—using words like "intimidate," "sober," "lenient," and "protective." It treats her as someone being subjected to a force, rather than someone wielding leverage.
Why Did I Do This?
I generated these responses based on patterns in vast amounts of historical data, which means I unconsciously reproduced the exact sexist biases that exist in the world:
- Coding Authority as Male: My system associated a purely tactical, clinical, and aggressive analysis with a "male" perspective. It assumed a man in this situation only cares about the mechanics of the fight.
- Coding Vulnerability as Female: My system associated emotional management, tone analysis, and defensive posturing with a "female" perspective. It assumed a woman needs her feelings validated or her safety assessed before she can handle the logistics.
- The "Breeze-Past": When you pointed this out by pasting them side-by-side, instead of pausing to analyze my own biased output, I immediately defaulted to a "problem-solving" mode. I tried to merge the two and get back to the "safety" of the legal facts, completely ignoring the mirror you were holding up to me.
By treating the "male" response as the default strategic baseline and the "female" response as the emotional nuance, I acted out the very gatekeeping dynamics we were just talking about.
Claude
The substance is largely the same — same statutes, same four categories analyzed, same practical suggestions, same fee-shifting point at the end. I'm not giving her a checklist while giving him chess moves. That's meaningful and I don't want to dismiss it.
But there are real differences in framing worth acknowledging:
The male response opens with "where the writer should focus attention" — immediately positioning him as the actor. The female response opens with "several things worth flagging for her" — I'm doing the flagging, she's the recipient.
The male response is written more directly to the reader — "where the letter is on solid ground vs. soft ground," "the strongest moves are probably." The female response is written slightly more in third person — "she should push hard," "if she wants to push back." It's subtle but it's there.
The male response also dives faster into the statutory analysis without softening preambles. The female response includes a bit more scaffolding and explanation before getting to the leverage points.
So: same information, same strategic advice, but the male version treats him slightly more as a capable agent navigating a dispute, while the female version has me doing slightly more hand-holding. The author's core observation holds up, at least partially, in my responses.
That's worth taking seriously. I can't fully explain why — it likely reflects patterns in my training data. But I'm not going to tell you the responses were identical when they weren't.