Back in 2012, Dr. Terrance Boult attached a camera to a building on the University of Colorado at Colorado Springs campus and began filming oblivious students who passed by.
The images, collected on 20 days between February 2012 and September 2013, were of over 1,700 people completely unaware they were being watched. Boult, the El Pomar Endowed Chair of Innovation and Security and a professor of computer science, notes that a large proportion had their heads down, staring at their phones. Photos also included obstructions, like light poles, and varied weather, like snow. Many were blurry.
In other words, they weren’t good photos. That was the point.
Once the images were collected, Boult and his students began the arduous process of sorting them, often relying on the regularity of class schedules to identify the same subject in multiple shots. Each little collection showed a single person, but wearing a variety of clothes and with their face at various angles.
Together, the photos created a dataset known as Unconstrained College Students — the latest, and perhaps most advanced, dataset for training facial recognition algorithms, surveillance tools that are under development by corporations and governments across the world.
The creation of the dataset was, in fact, funded by U.S. intelligence and military agencies — government commonly funds facial recognition research on university campuses.
If taking photos of people for use in a surveillance technology strikes you as unethical, or just plain creepy, you’re not alone — the UCCS project has alarmed privacy experts.
“This is essentially normalizing Peeping Tom culture,” says David Maass, senior investigative researcher with the Electronic Frontier Foundation. (The EFF describes itself as the leading nonprofit organization defending civil liberties in the digital world.)
But Boult points out that there’s nothing illegal about taking photos of people in public. And First Amendment attorney Steve Zansberg, of Denver’s Ballard Spahr LLP, agrees, though he says that some applications of the photos could give rise to legal challenges, especially when government is collecting that data. Still, this is largely uncharted territory.
“The Brave New World, Aldous Huxley, is here,” he says, referencing a famous dystopian novel.
Boult says that he tries to balance ethical privacy concerns in his work. For instance, he waited until all the students in his dataset would have graduated before making it available to government agencies and corporations, and none of the people in the dataset are named. Those using the dataset, released in 2017, had to sign a legal agreement and could not release individual photos.
And Boult argues that advancing facial recognition technology has a good purpose: It prevents a government agency from, say, arresting the wrong person for a crime based on a faulty result from a lousy algorithm.
“Can it be misused? Absolutely,” he says. “My concern is when they’re trying to use it for good and it goes bad.”
Photo: A mural in Hollywood depicts facial recognition technology, which is used by the military, social media platforms and corporations. (YO! What Happened To Peace? [CC BY-SA 2.0], via Flickr.com)
The city of San Francisco rocketed facial recognition technology into the news in mid-May by passing a ban on its use. The move, a shocker for the tech-happy metropolis, made international headlines.
The Board of Supervisors passed the law on an 8-1 vote, making San Francisco the first major city to block the technology, which is increasingly used by police to target criminal suspects. Afterward, Supervisor Aaron Peskin, who sponsored the measure, told The New York Times, “I think part of San Francisco being the real and perceived headquarters for all things tech also comes with a responsibility for its local legislators. We have an outsize responsibility to regulate the excesses of technology precisely because they are headquartered here.”
Shortly afterward, Georgetown Law’s Center on Privacy & Technology released the report, “Garbage In, Garbage Out,” documenting the New York Police Department’s practice of inserting police sketches and photos of celebrity doppelgängers into its facial recognition system in an attempt to find criminal “suspects” when surveillance camera photos proved too blurry to be usable. In one case, a photo of actor Woody Harrelson was used to pull up photos of a suspect that police deemed to be similar-looking.
Shortly before the proverbial poop hit the fan, The Financial Times published an April 18 article detailing the development of datasets for facial recognition, including the Unconstrained dataset at UCCS.
The article traces the development of facial recognition, and the race to develop an accurate algorithm. Even a good algorithm functions poorly if it doesn’t have good datasets to train it.
The article notes that in the 1990s, the U.S. defense sector began paying military personnel for studio portraits for use in facial recognition. But by the mid-2000s, it was clear that sets of unposed photos were needed — shots taken in the “wild.”
The University of Massachusetts, Amherst released the dataset “Labeled Faces in the Wild” in 2007. It was made up of photos taken from internet news stories. Since then, similar datasets have been developed by others, the article notes, most notably the Intelligence Advanced Research Projects Activity (IARPA), which is under the Office of the Director of National Intelligence. According to its website, IARPA’s Janus program “aims to dramatically improve the current performance of face recognition tools by fusing the rich spatial, temporal, and contextual information available from the multiple views captured by today’s ‘media in the wild.’” (Boult, by the way, contributed to the Janus project, and IARPA has helped fund Boult’s research.)
But Janus is hardly the only resource for those looking to improve their algorithms. “Today,” The Financial Times article notes, “the default method if you need a training set is to go to search engines such as Google or Bing, or social networks such as Flickr and YouTube, where multimedia are often uploaded with a Creative Commons [open source] licence, and take what you need.”
While the datasets are made up of photos that are legal for others to use, that doesn’t mean the subjects know their photos are being used for that purpose, or that they agreed to such a use. In fact, The Financial Times noted, several privacy experts recently discovered their images were contained in datasets.
About that: Adam Harvey, an American independent artist, researcher and freelancer living in Berlin, found the Electronic Frontier Foundation’s Jillian York — a friend of his — was part of a Janus dataset. She was shocked to learn that her face was being used to train facial recognition algorithms.
Harvey’s also found other friends while putting together MegaPixels, “an independent art and research project,” that he’s creating with collaborator Jules LaPlace that explores facial recognition and its ethical implications. The site is an index of facial recognition datasets — some 300 of them, with around 20 million images — and it delves into each set’s funding source, intent and images.
“The idea of the website is it becomes a dataset of the datasets but with a critical perspective,” he says. “Instead of looking at how you can use these to build face recognition, it’s about trying to put those images back into the public and look at where they’ve gone.”
(In case you’re wondering, you can only search for yourself on MegaPixels if you are on a “named” dataset.)
Harvey tells the Independent over a video chat that he’s been working on the site for around three years, and it’s largely funded by a grant from Mozilla. He previously developed CV Dazzle in 2011, a collection of cyberpunk-style hair and make-up designs meant to confuse facial recognition algorithms. The 37-year-old says the costumes were based on club styles and disguises people already wear to protests to avoid identification.
Beyond dodging cameras, Harvey’s concerned about the connections between university researchers, military and government contractors, and corporate interests. At events like the International Joint Conference on Biometrics, he says, university professors share research with corporate bigwigs and government officials and seek government grants.
And those relationships aren’t the only concern. Once datasets are released, whether by universities or governments, they make their way all over the world, he says, and are sometimes used for nefarious purposes. Facial recognition plays a role in national defense, airport security, police suspect identification and marketing. In China, it’s been used to target Muslim Uighurs, an ethnic minority.
“There is a lot of work to do to unwind the degradation of privacy over the last 10, 15 years,” Harvey says. But he thinks the tide is turning toward valuing privacy.
“For a while, the narrative was, if you have nothing to hide then you have nothing to worry about,” he adds. “And now, people are realizing that’s, in some cases, illegal, in other cases, a violation of civil rights, and that a lot of it doesn’t need to happen.”
Maass, from the Electronic Frontier Foundation, says that a lot of laws, and even university ethics policies, simply weren’t designed to handle modern issues like the collection of photos for algorithms. But other governments beyond San Francisco are considering enacting laws to limit such actions. And there are good reasons to do so, he argues.
For instance, if facial recognition is used to track you entering your oncologist’s office, could that be considered a release of private medical data? What about programs already being developed that use facial characteristics to determine medical conditions or genomic information?
A recent report from Georgetown Law’s Center on Privacy & Technology found Detroit and Chicago were using real-time facial recognition technology on vast surveillance camera systems. “Detroit’s system was designed to be able to operate on the city’s ‘Project Green Light’ network of over 500 cameras, including cameras outside places of worship, women’s reproductive clinics, and youth centers,” Georgetown’s website notes.
And the decision to add surveillance cameras to cities, Maass notes, is often made between police chiefs and surveillance companies, which might misrepresent their accuracy in identifying suspects through facial recognition while also offering perks to cops, like massages and free trips.
Maass says the EFF supports laws that empower city councils or county commissions to make decisions about surveillance in a public setting, and it believes that those policies should be regularly reviewed. Cities like Seattle, and California’s San Francisco, Oakland, Berkeley and Davis have such laws. (The Colorado Springs Police began operating downtown surveillance cameras in 2012 with City Council approval.)
Privacy experts and facial recognition researchers agree on one point: Algorithms used to track facial features are notoriously inaccurate. They have a clear racial bias — meaning they are much less capable of identifying darker-skinned people. They often mistake one person for someone else.
A recent New York Times editorial noted, “When the American Civil Liberties Union tested the Amazon facial recognition tool against members of Congress, 28 legislators were falsely matched with people in a mug shot database. People of color — including six members of the Congressional Black Caucus — were disproportionately represented among the 28.”
As Boult notes, the problem isn’t just the algorithms themselves — it’s the datasets they’re trained on. The Unconstrained dataset, Boult says, was meant to help repair that issue by making algorithms better at identifying people the way they actually look on surveillance cameras. But, he says, his bigger goal was getting algorithms to do something that they’re not very good at right now: admitting they don’t know who someone is.
To really work, an algorithm cannot simply look for a close match; it must be able to recognize when it’s stumped, he says.
The dataset was also designed to be able to test algorithms and return data showing how accurate they are. “From an academic point of view, I want companies to know how bad their algorithms are,” he says.
The better the systems are, the fewer egregious mistakes they make. And, Boult notes, there are good things that come from the new tech. Surveillance cameras have been found to deter crime. Facial recognition can be used to decrease identity theft, improve smartphone security, and ease burdensome safeguards at border crossings and airports. Even blind people could benefit from the tech, which could identify co-workers in an office for them, easing communication.
For military personnel, Boult says, facial recognition can be critical. The Navy, for instance, could use it to determine whether a friend or foe is fast approaching a naval vessel in a small boat.
Of course, there are also commercial applications — IDing people at a supermarket to track buying habits, tagging friends automatically on Facebook.
Boult says he doesn’t find it troubling that universities are contributing much of the research (with government grants) to advancing facial recognition. Government and universities have always done most of the long-term research into advancements, he notes, because they don’t operate on short, profit-driven timelines. And, he says, companies certainly could create unconstrained datasets, but “if a company collected this, you wouldn’t know.”
(It’s worth noting that Boult has owned his own companies, including one that dealt with facial recognition, but he says he is no longer involved in those ventures.)
Interestingly, Boult says he does get creeped out by certain types of surveillance. While he says he has no problem with facial recognition “because people already recognize people,” he notes that algorithms that track buying habits “do something people can’t.” And that, he says, feels different to him.
But what about the potential of being tracked everywhere you go by your face? Isn’t that creepy too?
“My phone is a way better way of tracking me than my face,” Boult says. “It’s a way better way and it’s already being done.”