Their company, located in a city near their parents’ village in Henan province, provides an essential early service in the AI process, labeling images and videos to help make computers smarter. Before a self-driving car can learn to avoid hitting people or trees, it must learn what people and trees look like — by digesting thousands of images labeled by thousands of humans.
Demand for labeling is exploding in China as large tech companies, banks and others attempt to use AI to improve their products and services. Many of these companies are clustered in big cities like Beijing and Shanghai, but the lower-tech labeling business is spreading some of the new-tech money out to smaller towns, providing jobs beyond agriculture and manufacturing.
The science is mired in controversy in China, where the ruling Communist Party is using AI to help it identify and track people in mass-surveillance programs, most prominently in the largely Muslim province of Xinjiang, according to Human Rights Watch. The rights group has raised concerns that China’s private sector has aided the government surveillance by providing AI-powered software and other services.
Yi said his business, Ruijin Science & Tech, mostly works for Chinese tech giants Baidu and Alibaba, labeling footage captured by autonomous cars. Data labeling for autonomous vehicles is taking off in the United States and China, as both countries invest heavily in the technology.
“I was working on online game promotion and never heard about the AI labeling business,” Yi said during a break in his office, serving green tea in small ceramic cups. He and his partners noticed other small firms getting into the field and decided to try it out with an initial investment of $15,000. “We believed this business could become better and bigger,” he said.
Just outside his office, in a large room resembling a field house, employees worked cheek-by-jowl at long rows of computers, examining blurry images filmed by self-driving vehicles.
The employees, who earn $350 to $550 a month, drew digital boxes around each object on the screen, and labeled them from a drop-down menu — vehicle, human, obstacle, animal. If they selected “vehicle,” another drop-down menu with more options appeared — small car, motorbike, truck, train.
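For readers curious what such a label might look like as data, here is a minimal sketch in Python. It is an illustration only, not the software Ruijin or its clients use: the names `TAXONOMY`, `BoxLabel` and `make_label`, and the pixel-coordinate layout, are assumptions. The categories and vehicle subtypes are the ones the workers described, and the second menu is modeled as required only for "vehicle."

```python
from dataclasses import dataclass
from typing import Optional

# Two-level menu as described by the workers. Only "vehicle" is reported
# to open a second menu, so the other categories have no subtypes here.
TAXONOMY = {
    "vehicle": ["small car", "motorbike", "truck", "train"],
    "human": [],
    "obstacle": [],
    "animal": [],
}


@dataclass(frozen=True)
class BoxLabel:
    """One labeled box on one image (hypothetical record layout)."""
    x: int        # left edge of the box, in pixels (assumed convention)
    y: int        # top edge of the box, in pixels (assumed convention)
    width: int
    height: int
    category: str
    subtype: Optional[str] = None


def make_label(x, y, width, height, category, subtype=None):
    """Build a BoxLabel, rejecting choices the menus would not allow."""
    if width <= 0 or height <= 0:
        raise ValueError("box must have positive width and height")
    if category not in TAXONOMY:
        raise ValueError(f"unknown category: {category!r}")
    allowed = TAXONOMY[category]
    if allowed and subtype not in allowed:
        raise ValueError(f"{category!r} needs a subtype from {allowed}")
    if not allowed and subtype is not None:
        raise ValueError(f"{category!r} has no second menu")
    return BoxLabel(x, y, width, height, category, subtype)


# Example: a truck and a pedestrian near the road in the same frame.
frame_labels = [
    make_label(120, 80, 200, 150, "vehicle", "truck"),
    make_label(400, 95, 40, 110, "human"),
]
```

Checking each choice against the menu when the label is created mirrors what the drop-down menus do on screen: a worker cannot pick a subtype that does not belong to the chosen category.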
“Sometimes there could be a train at a crossing,” explained 30-year-old Kang Qing, though he added he’d not yet encountered one after 18 months on the job. The workers mostly label objects located directly on the road, but when they see a human near the road, they label that, too, because a human could theoretically move into traffic.
“When I was young I heard about AI from robots in movies. It was a term that sounded mysterious to me,” Kang said. “It’s still mysterious, but I’ve learned more about it, and I’ve developed a more reasonable view. It’s humans setting the rules for AI, and the scary feeling mostly comes from the movies, I think.”
Kang and others said they hadn’t yet seen self-driving cars in the real world, but the vehicles are being tested in bigger cities.
At the Beijing headquarters of Baidu — China’s answer to Google — autonomous cars, buses and sweepers roam the campus. The company is testing self-driving cars in 13 cities, where they cruise around in regular traffic. The Lincoln-brand cars, made by Ford, have a human driver at the wheel as a backup in case something goes wrong.
For now the cars don’t ferry passengers, but Baidu says it plans to introduce a self-driving taxi service in the city of Changsha in October. The taxis, called Apollo Go, will be connected to the city’s new 5G wireless Internet network, Baidu said. The company is using cars made by Chinese company FAW Group for that project.
In the United States, Google offshoot Waymo last year launched the nation’s first commercial self-driving taxi service, in the Phoenix area. Industry experts say U.S. autonomous-car companies generally outsource their image labeling to freelancers who work at home, or to lower-cost workers in countries such as India or the Philippines.
Much of Ruijin’s business comes directly from Baidu and Alibaba, but sometimes the company gets jobs through outsourcing companies, said Liu Zhanjie, one of Yi’s business partners. On a few occasions, that work has involved drawing digital boxes around photos of human faces, for use in payment apps, Liu said.
Facial-recognition screens are cropping up at retailers across China, allowing customers to pay by having their faces scanned. Amos Toh, an AI researcher at Human Rights Watch, said there is concern among rights activists that the proliferation of facial recognition for commercial use could give the Chinese government more pools of data to access for public surveillance.
Liu said Ruijin’s clients were using the data for commercial purposes. How else it might be used down the road “is beyond our job or knowledge,” he said.
On a recent afternoon, Ruijin workers were handling a job for Microsoft, which had hired the company through an intermediary. The assignment was simple: drawing boxes around handwritten Japanese or Korean characters. A different company would later label each character with its proper name, so a computer could recognize the text. The process, known as optical character recognition, or OCR, is a basic form of AI.
The Ruijin employees guessed the work was aimed at improving a translation app, but they didn’t know for sure. Microsoft declined to comment.
“We can’t understand the text,” said He Yongchao, 29, a high school graduate from the area whose previous tech experience was limited to playing computer games. “We just draw a frame around the image to crop the text — that’s all we’re responsible for.”
AI used to sound extraordinary to him, He said. “Now I have more knowledge about it. ... The intelligence becomes intelligent based on a huge amount of manual work.”
Some data-labeling businesses have moved up the food chain. BasicFinder, a company in Beijing, has developed an online platform to handle data collection and labeling. Its customers include government clients, which have used the platform to label text, images and vocal recordings, said chief operating officer Yolanda Huang.
She didn’t know the ultimate purpose of the government’s work. “We don’t really think about it. We are a third-party service provider,” she said.
BasicFinder has also worked for a self-driving-car project at the University of California at Berkeley, a job the firm got through its founder’s connections, Huang said. UC Berkeley said BasicFinder is one of 10 data-labeling providers its autonomous vehicle program has used.
At Ruijin, sales this year are on track to be about 10 times as high as last year, Yi said from his desk, beneath a framed Chinese proverb: “Ride on the crest of success.”
The company has hired dozens of new employees to keep pace, finding them through social media ads and a billboard near the office. Yi said he mostly seeks out young high school graduates, between 18 and 35, who are handy with computers.
In the past, some of these young people might have departed for bigger cities to look for work, “but now they don’t have to,” Yi said.
As the partner in charge of marketing, Yi often travels to Beijing and other large cities to meet with clients. Negotiating with tech giants like Baidu and Alibaba can be hard, and he is always glad to get home, he said.
“I visit Beijing often, but I don’t like it,” he said. “I enjoy the pace of life here in my small hometown.”