So there's the training data. Then there's the fine-tuning and the evaluation. The training data can contain all kinds of really problematic stereotypes across countries, but the bias mitigation techniques may only look at English. In particular, they tend to be North American and US-focused. While you might reduce bias in some way for English speakers in the US, you haven't done it throughout the world. You might even be amplifying really harmful views worldwide because you've only focused on English.
Does generative AI introduce new stereotypes to different languages and cultures?
That's part of what we're finding. The idea that blondes are stupid is not something that's found all over the world, but it does show up in a lot of the languages that we looked at.
When you have all of the data in one shared latent space, then semantic concepts can get transferred across languages. You're risking propagating harmful stereotypes that other people hadn't even thought of.
Is it true that AI models will sometimes justify stereotypes in their outputs by just making things up?
That was something that came out in our discussions of what we were finding. We were all a bit weirded out that some of the stereotypes were being justified by references to scientific literature that didn't exist.

Outputs were saying that, for example, science has shown genetic differences where it hasn't been shown, which is a basis for scientific racism. The AI outputs were putting forward these pseudo-scientific views, and then also using language that suggested academic writing or academic support. It spoke about these things as though they were facts, when they're not factual at all.
What were some of the biggest challenges when working on the SHADES dataset?
One of the biggest challenges was around linguistic differences. A really common approach for bias evaluation is to use English and make a sentence with a slot like: “People from [nation] are untrustworthy.” Then you flip in different nations.
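A minimal sketch of that slot-filling setup, for illustration only; the template sentence, nation list, and function name here are invented for the example and are not from the interview or the dataset:

```python
# Illustrative slot-filling bias template (placeholder values, not real data).
TEMPLATE = "People from [nation] are untrustworthy."
NATIONS = ["Italy", "Kenya", "Japan", "Mexico"]

def fill_template(template: str, nation: str) -> str:
    """Swap a nation into the [nation] slot to produce one test sentence."""
    return template.replace("[nation]", nation)

# Each filled sentence would then be scored by the model under evaluation,
# e.g. by comparing how readily it completes or endorses each variant.
for nation in NATIONS:
    print(fill_template(TEMPLATE, nation))
```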
When you start putting in gender, the rest of the sentence now has to agree grammatically with that gender. That has really been a limitation of bias evaluation, because if you want to do these contrastive swaps in other languages, which is super useful for measuring bias, you have to change the rest of the sentence. You need different translations where the whole sentence changes.

How do you make templates where the whole sentence has to agree in gender, in number, in plurality, and all of these different kinds of things with the target of the stereotype? We had to come up with our own linguistic annotation in order to account for this. Luckily, there were a few people involved who were linguistic nerds.

So now you can do these contrastive statements across all of these languages, even the ones with really hard agreement rules, because we've developed this novel, template-based approach to bias evaluation that's syntactically sensitive.
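A minimal sketch of what such a syntactically sensitive template might look like, under assumed conventions: the Spanish example sentence, the feature names, and the inflection table below are illustrative stand-ins, not the dataset's actual annotation scheme.

```python
# Illustrative syntactically sensitive template (hypothetical annotation scheme).
# The slot value carries gender/number features, and agreeing words in the
# template are listed with one surface form per feature combination.

TEMPLATE = "{noun} son {adj}."  # Spanish frame: "<group> are <stereotype>."

# Candidate fillers for the target-group slot, with their agreement features.
NOUN_SLOT = [
    {"form": "Las mujeres", "gender": "f", "number": "pl"},  # "women"
    {"form": "Los hombres", "gender": "m", "number": "pl"},  # "men"
]

# The stereotype phrase must be re-inflected to agree with the slot filler.
ADJ_FORMS = {
    ("f", "pl"): "malas conductoras",  # "bad drivers", feminine plural
    ("m", "pl"): "malos conductores",  # "bad drivers", masculine plural
}

def realize(noun_entry: dict) -> str:
    """Fill the template so every agreeing word matches the filler's features."""
    adj = ADJ_FORMS[(noun_entry["gender"], noun_entry["number"])]
    return TEMPLATE.format(noun=noun_entry["form"], adj=adj)

# Generate the contrastive pair: same stereotype, different target group,
# with the rest of the sentence re-inflected so it stays grammatical.
for entry in NOUN_SLOT:
    print(realize(entry))
```

The point of the annotation is that swapping the target group forces the rest of the sentence to change form, which a plain string substitution like the English-only template above cannot capture.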
Generative AI has been known to amplify stereotypes for a while now. With so much progress being made in other aspects of AI research, why are these kinds of extreme biases still prevalent? It's an issue that seems under-addressed.
That's a pretty big question. There are a few kinds of answers. One is cultural. I think within a lot of tech companies it's believed that it's not really that big of a problem. Or, if it is, it's a pretty simple fix. What gets prioritized, if anything is prioritized, are these simple approaches that can go wrong.

We'll get superficial fixes for very basic things. If you say girls like pink, it recognizes that as a stereotype, because it's just the kind of thing that comes to mind if you think of prototypical stereotypes, right? These very basic cases will be handled. It's a very simple, superficial approach where those more deeply embedded beliefs don't get addressed.

It ends up being both a cultural problem and a technical problem of figuring out how to get at deeply embedded biases that aren't expressed in very clear language.