Data is, in essence, a lot of numbers. They mean something only when put together and analyzed. However, the analysis can be lost on someone who is not intimately familiar with the data and/or how to read the analysis. It is here that data visualization helps.
In the social and political saga surrounding the question of net neutrality, what is often overlooked is the data war going on behind the scenes. The real fuel behind the debate is the enormous volume of data we generate with each search and click.
As a marketable commodity, large-scale audience data has completely transformed the global economic landscape in less than a decade. The emergence of GAFA (Google, Amazon, Facebook and Apple) germinated a disruptive new business model that capitalizes on what many consider to be the new oil: Data.
Based on a study published by eMarketer in September 2017, we can see how user-data companies (UDC) now hold the top five positions among the largest brands in the world.
In 2006, five of the top 10 brands were retailers. By 2017, nine of the top 10 brands in the world were UDCs.
The business of user data
The nature of the data business model can be understood by the relationship between its three core pillars: The internet user, who generates the data; the content publisher, who offers the internet user a service (often free) in exchange for personal data; and the advertiser, who buys data from content publishers in order to run more effective marketing campaigns.
The schema below attempts to illustrate the nature of this internet user data paradigm:
By having more control over an individual’s internet usage, those companies are in a position to adjust prices in ways that could significantly benefit their bottom line. For example, AT&T could decide that from now on, given the large bandwidth used by Netflix, the latter would have to pay a usage fee to maintain its regular website streaming speed.
Conversely, the internet service provider (ISP) could just as well charge internet users an extra fee to maintain their Netflix streaming at a regular or faster speed. In an extreme case of greed, the ISP could overcharge both Netflix and its user.
But there is more to it than that.
The real reason Verizon bought AOL and Yahoo!
In 2015, Fortune published what it called the “real reason Verizon bought AOL.” In that article, journalist Kevin Fitchard observed:
“Verizon isn’t trying to create an Internet powerhouse with this investment. It’s likely just trying to gain some type of foothold in the changing online industry, as its traditional communications business slows down.”
Fitchard is alluding to the dominance of the data business model that gave rise to GAFA. As such, we can see why telecom companies like Verizon that control the internet channels through which the data is transmitted would also want to control — and take advantage of — the data itself. As Fitchard further observes in the same article:
“While AOL may be most known for its dial-up services and growing content empire — which includes The Huffington Post, Engadget and TechCrunch — it also has put together a sophisticated suite of advertising technologies for online and traditional media that no other company (aside from Google and Facebook) can match.”
The advertising technology in question, commonly referred to as programmatic advertising, uses advanced machine learning and artificial intelligence (AI) on the data generated by online user behaviour, and tracked by browser cookies or device IDs stored in mobile applications. Much of the advertising performance offered by Google, Facebook, AOL and others is largely attributed to their investments in this kind of technology, which Verizon can now leverage.
As described in an email by John Cosley, director of marketing for Microsoft search advertising, digital ads are “perhaps by far the most lucrative application of AI [and] machine learning in the industry.”
The birth of a super entity
To maximize the power of these advertising algorithms, companies need to secure big data. Since internet users are the prime generators of this precious raw material, publishers need to continually increase the number of visitors coming to their websites or mobile applications.
In a move to secure that expansion, shortly after its acquisition of AOL, Verizon bought Yahoo!, Google’s competitor in the search engine market. Yahoo! also has access to the entire Microsoft advertising network and its user data.
To assess the impact of this streak of acquisitions on Verizon’s total user reach compared with that of Google and Facebook, we used comScore data from May 2017, made available courtesy of Adviso Conseil. The comScore platform is essentially audience analytics software that tracks the data coming from most of the large desktop and mobile publishers in the world.
The competitive advantage of this merger, now a super-entity called Oath by Verizon, stands out immediately when one looks at the combined reach of AOL, HuffPost and Yahoo!.
The U.S. and the rest of the world
The best way to illustrate the direct relationship between data and net neutrality is to simply ask the following question:
If a telecommunication company like Verizon were in a position to compete with Google and Facebook for data dollars, what happens if it also controls the data pipeline used by its competitors?
The answer is obvious. If U.S. telecoms can capriciously control internet access, while also controlling platforms that compete with GAFA, what stops them from impeding the pipeline of their competitors? Absolutely nothing.
I came across Open Banking by chance, and after reading some articles about it, I think it is a mind-blowing idea. Open Banking has the potential to give power back to the bank customer. Not much power, but at least the customer will be able to shop around for better deals.
I doubt it will come to Canada in the way it should, because the banks here will certainly go out of their way to squash it thoroughly. I mean, just look at what happened to Tangerine Bank, formerly ING Direct Canada.
While many are planning trips to their home towns to attend family reunions, millions more Chinese citizens have been blacklisted by authorities, labelled as “not qualified” to book flights or high-speed train tickets.
The word “credit” in Chinese – xinyong (信用) – is a core tenet of traditional Confucian ethics, which can be traced back to the late 4th century BC. In its original context, xinyong is a moral concept that indicates one’s honesty and trustworthiness. In the past few decades, its meaning has been extended to include financial creditworthiness.
So what does “credit” mean in the Social Credit System?
It is a question Chinese authorities have been exploring for more than 10 years. When the plan of constructing a Social Credit System was first proposed in 2007, the primary goal was to restore market order by leveraging the financial creditworthiness of businesses and individuals.
Gradually the scope of the project has infiltrated other aspects of daily life.
One shared focus of the country’s existing pilot schemes is to generate a standardised reward and punishment system based on a citizen’s credit score.
Most pilot cities have used a points system, whereby everyone starts off with a baseline of 100 points. Citizens can earn bonus points up to the value of 200 by performing “good deeds”, such as engaging in charity work or separating and recycling rubbish. In Suzhou city, for example, one can earn six points for donating blood.
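The points arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not any city’s actual implementation. The baseline of 100 and the six points for donating blood come from the description above. The sketch assumes that “up to the value of 200” is a ceiling on the total score, and the function name, the deed labels and the charity point value are hypothetical.

```python
BASELINE = 100  # starting score reported for most pilot cities
CEILING = 200   # assumed cap on the total score

# Points per deed. Only the blood-donation value (Suzhou) comes from the text;
# the charity value is a made-up placeholder.
DEED_POINTS = {
    "donate_blood": 6,
    "charity_work": 5,  # hypothetical
}


def score_after(deeds):
    """Return the score after a sequence of good deeds, capped at CEILING."""
    score = BASELINE
    for deed in deeds:
        score = min(CEILING, score + DEED_POINTS[deed])
    return score


print(score_after(["donate_blood"]))
print(score_after(["donate_blood"] * 20))
```

Under these assumptions a single blood donation moves a citizen from 100 to 106 points, and repeated deeds stop counting once the ceiling is reached.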
Publishing the details of blacklisted citizens online is a common practice, but some cities choose to take public shaming to another level.
Several provinces have been using TV and LED screens in public spaces to expose people. In some regions authorities have remotely personalised the dial tones of blacklisted debtors so that callers will hear a message akin to: “the person you are calling is a dishonest debtor.”
It is important for a country to be able to enforce court orders, but when the judicial and legislative systems sometimes malfunction, as they do in China, it raises questions about whether the ability to expose and punish without due process can lead to abuses of power.
Liu Hu, a vocal journalist who has criticised government officials on social media, was accused of “spreading rumour and defamation”. While seeking legal redress in early 2017, he realised that he had been blacklisted as “untrustworthy” and prohibited from purchasing plane tickets.
Liu’s story may be an isolated incident, but it demonstrates how the system could potentially be used to push the government’s agenda and to crack down on dissent.
The role of big data in the project has received broad media attention outside China due to concerns about how the Chinese government may use its power to further intensify surveillance.
For example, Chinese tech giants Alibaba and Tencent are testing user credit files based on behavioural data gathered through people’s use of social media and e-commerce sites. To date, few operational details have been released about the country’s plan to integrate user data from online platforms into a central system overseen by the government.
This will soon change. Since last December, the National Development and Reform Commission and Central Bank of China have been approving pilot plans to integrate big data with the Social Credit System. As one of China’s first pilot provinces, Guizhou province was selected to showcase a government-led experiment of a big data-empowered Social Credit System.
Guizhou is one of the poorest provinces in China, and is mostly known for being the home of Maotai – a high-quality liquor. This seemingly random choice of location is actually tactical. Unbeknown to most, since 2015 this rural backwater has been fast becoming the country’s hub of big data.
In 2017, tech giants Google, Microsoft, Baidu, Huawei and Alibaba established research facilities and data centres in the region. In 2018, Apple is following suit and transferring its Chinese iCloud server to a local company.
Guizhou’s position as the country’s data centre makes it an ideal social laboratory for the local government’s Social Credit System experiments.
Turning the system back on the government
While some might view China’s Social Credit System as something out of dystopian fiction, if properly implemented the system can have positive impacts – especially when used to keep government officials and business owners accountable.
Most pilot schemes target companies as stringently as individuals. Firms with a history of environmental damage or product safety concerns are now regularly exposed on online blacklists.
Government officials can also be found on online blacklists. As of December 2017, more than 1,100 government officials had been blacklisted as untrustworthy. Such a move to expose corruption is arguably more beneficial to Chinese society than public shaming of jaywalkers.
When dealing with data, a common assumption is that data either proves or disproves something, straight up. There is no ambiguity. Or at least, that is what one usually assumes about data.
However, the truth is that where there is data, there is bound to be uncertainty. And visualizing uncertainty is an important part of visualizing data if one is to responsibly present data. This post does an excellent job of explaining the pros and cons of various ways of visualizing uncertainty in data.
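As a small illustration of what uncertainty looks like before it is drawn, the sketch below computes a mean and an approximate 95% interval for a handful of measurements and prints a plain-text error bar. The sample values are made up, and the function names and the normal approximation (1.96 standard errors, which is rough for so few points) are assumptions for this example rather than anything taken from the post mentioned above.

```python
import math
import statistics


def mean_with_interval(values, z=1.96):
    """Return (mean, low, high) using a normal-approximation interval."""
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - z * std_err, mean + z * std_err


def text_error_bar(low, mean, high, scale=2):
    """Draw the interval as dashes, with 'o' marking the mean."""
    left = round((mean - low) * scale)
    right = round((high - mean) * scale)
    return "|" + "-" * left + "o" + "-" * right + "|"


sample = [12.0, 15.0, 11.0, 14.0, 13.0]  # made-up measurements
mean, low, high = mean_with_interval(sample)
print(f"mean = {mean:.2f}, interval = [{low:.2f}, {high:.2f}]")
print(text_error_bar(low, mean, high))
```

Reporting the interval next to the mean, even in this crude form, makes it harder to read a single number as more certain than it is.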
As much as this blog is about data, it is worth acknowledging that data, first, has to be collected. As data has become a more and more prominent topic in the media and more and more faith is put into data, the following quote reminds me of the chink in data’s armour:
“The government are very keen on amassing statistics. They collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of these figures comes in the first instance from the chowkidar (village watchman in India), who just puts down what he damn pleases.” – From Wikipedia
As a job seeker, I apply to plenty of jobs online.
Job applications are mainly submitted in one of two ways:
Upload (cover letter and) resume
Create account, and then upload (cover letter and) resume
The Ontario government has a simple, straightforward interface that asks for some personal information and then accepts an upload of a cover letter and resume as one file.
Then, there are the websites where you have to create an account to upload your cover letter and resume. This LinkedIn post very accurately captures the frustrations associated with this system. While I do not agree with everything in the post, its points are valid.
This article accurately captures the overall frustrations associated with searching for a job, including the above-mentioned job application systems. This article focuses solely on frustrations, without commenting on the job application systems. To wrap up the picture, this article talks about how the reality of job applicants is not reflected in the numbers.
But to come back to the main point: many of the websites where I submitted my job applications were operated by ICIMS or Taleo, among others. ICIMS and Taleo are Applicant Tracking Systems (ATSs) – automated tools to help companies parse through thousands of job applications so they can spend less time reading letters and resumes and just hire someone to do the work. Automation gives rise to more automation – just as companies use ATSs to automate hiring, companies are popping up to automate the job application process itself, to help applicants beat the ATS.
The part that I would like to note is that an applicant may end up submitting applications to many different companies, all of which use ICIMS as their ATS. In such an instance, the usefulness of LinkedIn as a single platform vanishes – it would instead be useful to create a profile on the ATSs like ICIMS, Taleo and others and then just apply for jobs through them!
Others, like Deloitte, have their own ATS, but it is hosted in the US even if you are applying for a job based outside the US. Data stored in the US falls under US law, which can give the US government access to it.
ATSs, as far as I know, do not store their data in Canada – as most are US-based, the data also ends up there. This raises the issue of data sovereignty – even though I am an applicant in Canada, applying for a job in Canada, my data will end up in the US. Granted, by using Gmail I am already giving up my data to the US, but that is because I want the free email service. How does that argument apply to my job search? As a Canadian applying for a job in Canada, I can reasonably expect my job application and related data to stay in Canada. Yet I am forced to give up my data sovereignty just to be able to apply for a job, let alone be hired! (Not that it is right for a person to have to give up their data sovereignty to be hired either.)
By forcing job applicants to give up control over their own data, the job application process takes advantage of the vulnerable status of the applicants, makes them further vulnerable, and also violates their data sovereignty. The question here is not why the job applicant continues applying, but why non-US-based companies are happily giving up their own data to the US.
I came across this excellent article that asks a very pertinent question – if the Nobel Prize is awarded for work done for the betterment of mankind, shouldn’t the knowledge be freely available?
Open access to knowledge and information is an increasingly important issue that everyone needs to care about. As the Internet spreads, the barriers to accessing ever-increasing amounts of information are coming down. Yet much of the credible information remains unavailable to the public, and some individuals, such as Aaron Swartz, have taken extreme actions to change that.
What are your thoughts on open access to knowledge and information?