Since imperial times Chinese governments have yearned for a perfect surveillance state. Will big data now deliver it?
On 5 July 2009, residents of Xinjiang, China’s far western province, found the internet wasn’t working. It’s a regular frustration in remote areas, but it rapidly became apparent that this time it wasn’t coming back. The government had hit the kill switch on the entire province when a protest in the capital Ürümqi by young Uighur men (of the area’s indigenous Turkic population) turned into a riot against the Han Chinese, in which at least 197 people were killed.
The shutdown was intended to prevent similar uprisings by the Uighur, long subjected to religious and cultural repression, and to halt revenge attacks by Han. In that respect, it might have worked; officially, there was no fatal retaliation, but in retrospect the move came to be seen as an error.
Speaking anonymously, a Chinese security advisor described the blackout as ‘a serious mistake… now we are years behind where we could have been in tracking terrorists’. Young Uighur learnt to see the internet as hostile territory – a lesson reinforced by the arrest of Ilham Tohti, a popular professor of economics, on trumped-up charges of extremism linked to an Uighur-language website he administered. ‘We turn off our phones before we talk politics’, a tech-savvy Uighur acquaintance remarked.
The Uighur continued to consume digital media, but increasingly in off-line form, whether viewing discs full of Turkish TV series or jihadist propaganda passed on memory sticks. Where once Chinese media reports claimed that arrested Uighur had been visiting ‘separatist’ websites, now they noted drawers full of burnt DVDs and flash drives.
A series of brutal terrorist attacks early in 2014 reinforced the lesson for the Chinese authorities; by driving Uighur off-line they had thrown away valuable data. Last summer, the Public Security University in Beijing began recruiting overseas experts in data analysis, including, I’m told, former members of the Israeli security forces.
In functioning democratic societies, information is gathered from numerous independent sources: universities, newspapers, non-government organisations (NGOs), pollsters. But for the Chinese Communist Party, the idea of independent, uncontrolled media remains anathema. Media is told to ‘direct public opinion’, not reflect it.
In the 2000s, a nascent civil society was gradually forming, much of it digital. Social media, public forums, online whistle-blowing, and investigative journalism offered ways to expose corrupt officials and to force the state to follow its own laws. But in the past three years all such efforts have been crushed with fresh ruthlessness. Lawyers, journalists and activists who were once public-opinion leaders have been jailed, exiled, banned from social media, silenced through private threats, or publicly humiliated by being forced to ‘confess’ on national television. Ideological alternatives to the Party, such as house churches, have seen heightened persecution. And the life was choked from services such as Weibo, the Chinese Twitter-alike, with thousands of accounts banned and posts deleted.
Yet this has left Beijing with the same problem it has always faced. From the beginning, central government has tried simultaneously to gather information for itself and to keep it out of the hands of the public.
A vast array of locals, from seismological surveyors to secret police services, gathered data for the government, but in its progress through the hierarchies of party and state the data inevitably became distorted for political and personal ends. Now some people in government see technology as a solution: a world in which data can be gathered directly, from bottom to top, circumventing the distortions of hierarchy and the threat of oversight alike. But for others, even letting go to the minimal degree necessary to gather such data presents a threat to their own power.
Ren (a pseudonym), a native Beijinger in his late 20s, spent his college years in the West fervently defending China online. But, he now says: ‘I realised that I didn’t know what was going on, and there are so many problems everywhere.’ Back in China, and working for the government, he sees monitoring social media as the best way for the government to keep abreast of and respond to public opinion, allowing a ‘responsible’ authoritarianism. Corrupt officials can be identified, local problems brought to the attention of higher levels, and the public’s voice heard. At the same time, data-analysis techniques can be used to identify worrying groupings in certain areas, and predict possible ‘mass group incidents’ (riots and protests) before they occur.
‘Now that we’ve taken out the “big Vs”,’ Ren told me, ‘we shouldn’t worry about ordinary people speaking.’ ‘Big Vs’ are famous and much-followed ‘Verified’ users on Weibo and other social media services; celebrities, but also ‘public intellectuals’ who have been systematically eliminated over the past three years. With potential rallying points such as opinion leaders or alternative ideologies crushed, the government can view the seething mass of public grievances as a potential source of information, not a direct challenge.
The central government is acutely aware of how little it knows about the country it rules, as fragmented local authorities contend to bend data to their own ends. For instance, the evaluation of officials’ performance by their superiors is, formally, deeply dependent on statistical measurements, predominantly local gross domestic product (GDP) growth. (Informally, it depends also on family connections and outright bribery.) As a result, officials go to great lengths to juke the stats. As Li Keqiang, now the Chinese premier, told a US official in 2007 when he was Party Secretary of Liaoning Province, in a conversation later released by WikiLeaks, GDP figures are ‘man-made’ and ‘for reference only’.
Li said he relied, as many analysts do, on proxy data that’s far harder to fake. To measure growth in his own province, for instance, he looked at electricity, volume of rail cargo and disbursed loans. But he also relied on ‘official and unofficial channels’ to find information on the area he ran, including ‘friends who are not from Liaoning to gather information [I] cannot obtain myself.’
Li’s dilemma would have been familiar to any past emperor. China’s rulers have always struggled to get good data, especially in the countryside. The imperial Chinese state did its best to make its vast and diverse population legible. Household registration systems, dating back to China’s ancient precursor kingdoms, tried to monitor subjects from birth to death. Government officials trudged their way to isolated hamlets across mountains, jungles and deserts. But at the same time, local leaders reported pleasant fictions to the capital to cover their backs.
The People’s Republic inherited these problems, but added to them an obsession with statistics, acquired from the Soviets. Communism was ‘scientific’, and so the evidence had to be manufactured to support it. Newspapers in the 1950s included paragraphs of figures about increased production and national dedication. Chinese reporters still cram unnecessary (and often fictitious) statistics into stories. (‘The new factory has an area of 2,794 square meters.’) ‘According to statistics’ is one of the most overused phrases in mainland writing.
All of this has made China a society in which real information is guarded with unusual jealousy, even within the government. For decades, even the most innocuous data was treated like a state secret. Even the phone numbers of government departments were given out only to a privileged few.
The government in Beijing is well aware that information it receives from below is mangled or invented on the way up. The National Bureau of Statistics (NBS), which chiefly manages industrial data, frequently demands more direct reporting; it has repeatedly called for businesses to send their information directly to the NBS, and has begun naming and shaming firms that don’t do so, as well as local authorities that it catches fixing numbers. In September 2013, for instance, it reported on its website that a Yunnanese county had inflated its industrial growth four-fold. But the NBS is largely ‘helpless’, a junior official at a more powerful body smugly told me, lacking the internal clout to enforce its demands.
One corrective has been sudden descents by higher authorities for ‘inspection tours’. But these are usually anticipated and controlled by local officials who have long since mastered the Potemkin arts. Another longstanding solution was the petitioning system, first institutionalised in the seventh century AD. It let individuals circumvent local officials and present their plea for justice directly to higher authorities, or even directly to the capital. The system is still in place, handling millions of requests a year. But it has never worked, with petitioners more likely to be branded as troublemakers, beaten up, or imprisoned, than for their information to reach anyone of note. Partly the problem is that one of the metrics used to measure officials is the number of petitioners their district produces, the theory being that good governance produces fewer complaints, and so corruption has been incentivised.
The constant interference of middlemen is why some in central government are so excited by the possibility of gathering data directly. Take the contentious issue of population. Incentives to distort information cut two ways; under the one-child policy, rural families often try to avoid reporting births at all, but rural authorities have a strong incentive to over-report their population, since they receive size-linked benefits from the centre. Urban areas, meanwhile, have a strong incentive to under-report population figures, as they’re supposed to be limiting the speed of urbanisation to controllable levels. Beijing’s official population is 21.5 million, but public transport figures suggest the real figure might be 30-35 million.
In theory, China’s surveillance state already generates massive amounts of personal data that could provide government with valuable information. The ID card, now radio-frequency-based, is central to Chinese citizens’ lives, required from banks to hospitals. A centralised database lets ordinary people check ID numbers against names online and confirm identities. But individual transactions with the ID card often go unrecorded unless the Public Security Bureau (PSB) – essentially, the local police station – has already taken an interest in you. And so the files of perceived dissidents are thick, but the records of ordinary life are thin. Even if central agencies go looking for the information, it is distorted en route from municipal, then provincial PSBs.
Despite the vast amounts of data produced within the government, Chinese scientists and officials often find themselves turning to the same sources as Western ones. They’ve seized on projects from abroad that demonstrate how analysis could potentially map population mobility through mobile-phone usage. The mass of consumer data produced by the online shopping services run by the $255 billion Alibaba group is another huge bonanza. Now smartphones produce more of the information that the government needs than secret policemen do.
In and of itself, China’s search for data is morally neutral. As the US political scientist James Scott points out in Seeing Like a State (1998), population data can equally be used for universal vaccinations or genocidal round-ups.
If big data is used by China’s central government to identify corrupt officials, pinpoint potential epidemics and ease traffic, that can only be laudable. Better data would also help NGOs seeking to aid a huge and complex population, and firms looking to invest in China’s future. The flow of data could circumvent vested interests and open up the country’s potential. For Professor Shi Yong, deputy director of the Research Center on Fictitious Economy and Data Science in Beijing, this is a moral issue, not just a question of governance. ‘The data comes from the people,’ he said strongly, ‘so it should be shared by the people.’
Most people in China don’t want to protest against government. They want to know where the good schools are, how clean the air is, and what mortality rates are at local hospitals. Shi returned to China after spending two decades at universities in the US because he was excited by the possibilities of China’s growing information society.
‘Let’s say I want to move to a small city here,’ he told me. ‘I want to know school districts, rent, health: we don’t have this information easily available. Instead, people use personal contacts to get it.’ Shi says that there’s huge resistance to the idea of open data, from within the government and even more from businesses. ‘They might want to protect the way they run their business, they may want to hide something.’ One of his current projects is working with the People’s Bank of China (PBOC) on establishing a nationwide personal credit-rating system.
‘Actually,’ Shi told me, ‘they have two databases: one for personal information and one for companies’ information, and they wanted us to work on both. But I said no, we would only work on the first. This data is very beautiful! Better than the American data, because all the other banks must send the information directly to the PBOC, the central bank, every day.’ The company data, in contrast, was bad enough to be unworkable. ‘You know garbage in, garbage out? With data analysis, small garbage in, big garbage out.’
Shi highlighted the ways in which the internet had already opened up the provinces for the central government. ‘Look at the PX protests,’ he said, pointing to the local outrage in August 2011 in Dalian and elsewhere against factories producing the chemical paraxylene (PX). ‘Two decades ago, that would have gone nowhere. But this time, the higher authorities took notice of it.’
Small injections of information have already had a palpable effect in China. Air pollution comes in two main forms: relatively large particles called PM 10, and relatively small ones called PM 2.5. For years, Chinese cities published only PM 10 figures, and further skewed statistics by picking selectively from less polluted areas. But after independent monitors, including the US Embassy in Beijing, began putting their own PM 2.5 figures online hourly, which spread rapidly through social media, public pressure eventually forced a shift in official policy.
The crucial issue is who gets to see and to use the data. If it’s limited to officials, however pure-minded their intentions, all it will do is reinforce the reach of the state. China’s strong data protection and privacy laws function primarily not to protect citizens from state intrusion, but to shield officials and businessmen from public scrutiny. Resistance to opening up officials’ property registration details is extremely fierce.
Even if opened up, this information means nothing without tools to find it. In China, much of that searching is filtered through the web services and search engine Baidu, which is based in Beijing and commands three-quarters of search revenue on the mainland. Like much Chinese online innovation, Baidu profited from the government’s fears of foreign firms, which created a walled garden in which domestic products thrived. After Google announced it would cease censoring searches in China in 2010, the US giant was effectively blocked on the mainland, its share of searches falling from 36 per cent in 2009 to 1.6 per cent in 2013. But Baidu had to fight off internal competitors too, including ChinaSo, the search engine created last year by the merger of the People’s Daily newspaper and the Xinhua news agency, both state-run.
Baidu recently announced that it would launch a big-data engine to allow the public to search and analyse its available data. The firm already works with the Ministry of Transport, using data drawn from the search results on its map service to predict travel trends and help manage traffic. In a project ‘inspired by’ Google Flu Trends, it’s also working with health authorities to predict epidemic outbreaks.
Baidu is widely criticised for co-operating with the authorities over censorship, and for its dependence on paid advertising which puts the highest-paying companies at the top of search results. That’s why, as Ren explained: ‘If you search for public opinion (minyi), you get two pages of car results’ – minyi uses the same characters as an automobile brand.
Yet the firm also puts up some quiet, informal resistance to government intrusion. It maintains less personal information on users than Google does, for instance, partly because it has fewer integrated services, but it also wipes its own records of search histories far more frequently than Western firms. Insiders say that in meetings with the authorities, Baidu plays an active role in speaking up for greater online freedoms.
That might be why Baidu isn’t popular among many in government. ‘The Party’s publicity department invited Zhou Xiaoping [a young, ultra-nationalist blogger] to speak recently,’ Ren told me. ‘Much of the speech was a rant against Baidu, how they were “rightists” [pro-US, civil rights, and free markets]. Do you know, he said, that if you search “police brutality” on Baidu, you get results about China? Why are the results not about the US, he asked. He got rounds and rounds of applause.’
Whatever measures some firms take, intrusion by the state is hard to resist. A draconian new draft for a national security law, likely to be introduced this year, specifies that the state has full access to any data it demands – already the case in practice – and that any foreign firm working in China must keep all their Chinese data inside the country. It also envisages extensive camera networks and the use of facial-recognition software on a vast scale.
Shi described to me how personal banking and credit information ‘is being used as part of the anti-corruption campaign to identify the networks of corrupt officials’, who in China often hide their graft – whether it’s property or cash – by putting it in the names of friends or family. Using data analysis, Shi suggested, the Party’s investigators could root out such previously opaque networks.
Identifying and targeting friends and family, however, is also a technique that the Chinese state has traditionally used against dissidents and whistleblowers. In earlier times, ideological deviance could cause a man’s entire family to be persecuted, or even executed. Even today, the threat of children forced out of school or spouses fired from jobs is part of the toolset deployed against ‘troublemakers’. In Xinjiang, meanwhile, the network-analysis techniques the authorities use to identify terrorists are also deployed against peaceful independence activists, academic dissidents such as Tohti (whose students were marched out to testify against him), and Islamic teachers.
When I asked Shi about the increasing discussion in the West over government surveillance, he suggested that it would come in time in China. ‘We’re not at that stage yet,’ he said. ‘Right now, we’re just setting up the basic infrastructure. In time, we’ll have the kinds of legal protections that developed countries do.’
That might happen. But I’ve been hearing from well-meaning people ever since I came to China more than a decade ago that the rule of law is right around the corner. The corner’s still there. But now it has a CCTV camera on it.
James Palmer is a British writer and editor who works closely with Chinese journalists. His latest book is The Death of Mao (2012). He lives in Beijing.
Originally posted via “Will China use big data as a tool of the state?”