A million pictures to check manually

Reducing bias in facial recognition requires data sets of representative pictures. Idiap decided to go a step further by helping to create an ethical data set of pictures.

If you are a white man, algorithms usually perform well at recognizing your face. By contrast, the success rate is lower if you are a woman or have a different skin colour. These biases do not mean that the artificial intelligence is programmed to discriminate. They arise because the computer programs are trained to recognize faces using examples that do not reflect human diversity. To compensate for these inequalities and improve the security of facial recognition systems, the Biometrics Security and Privacy research group is creating a new and more reliable database. This initiative is linked to a partnership with a private company operating in the security field.

Ethics and confidentiality

“Usually, this kind of picture annotation work, which produces metadata such as gender or eye colour, is crowdsourced on online platforms such as Amazon Mechanical Turk (MTurk),” explains Sébastien Marcel, head of the research group. “Anyone can contribute and be paid a small amount for checking a small chunk of the data set. Besides the resulting uberisation of such tasks, there are confidentiality issues.” The data set created at Idiap is related to a project with an industrial partner. For security reasons, the project agreement specifies that the data cannot be distributed and must stay at Idiap.

Reliability to fight biases

In an office at Idiap, four people sit in front of their screens, comparing sets of pictures in order to validate them. “The hardest challenge is to stay focused,” states Magali. “To succeed, you have to take frequent breaks,” adds Josselin. “Every hour,” points out Oriane. Eight hours a day, for one or two weeks, they contribute to this long and very demanding task. “It’s sometimes harder with some types of pictures or people,” explains Léo.

The cost of this task is significant: around 20,000 Swiss francs. The young people hired to carry it out are discovering Idiap and the challenges of picture annotation while being paid 20 Swiss francs per hour. The same kind of job offered by foreign companies can pay up to 20 times less. “The fact that we are doing this ourselves allows us to check the quality of our database more quickly,” points out Sébastien Marcel. “The ‘cleaner’ a database, the better and more reliable the results of the machine learning program.”

More information

- Biometrics Security and Privacy research group
- Swiss Center for Biometrics Research and Testing