Ever Heard of a Black Swan Lake?
The pool of Big Data is an ever-expanding body of pure and pristine information. It covers many surfaces and fields of study, faithfully reflecting the nature of the landscape around it. Many data analysts can be seen walking along the lake’s shore, trying to decipher patterns, trends, and behaviors in the data. One such lake of information was that of the American elections.
Most analysts had concluded that, albeit by a slim margin, it would be Hillary Clinton who would win at the polls and convince the populace that ‘Hillary for America’ was the brightest path to the future. Her campaign was impressively fueled by the waters of the Data Lake. Her digital strategy was strong, and the data analytics behind the polls suggested that a woman would soon be installed in the Oval Office. However, as that Big Data Lake was being cleaved apart by Clinton, a rebellious outlier was slowly circling her. A Black Swan was about to settle into the lake and disprove all that it stood for.
Black Swan events are phenomena that break the norm and go against what is expected or predicted. Of course, there are now acres of newsprint explaining Donald Trump’s rise to power; according to one magazine, he was taken literally but not seriously, whereas the world should have taken him seriously but not literally. The political climate of the West is moving dangerously towards the right. In the words of CNN’s Van Jones, a ‘white-lash’ has swept the United States in the aftermath of globalization and the onslaught of migrants, which together have squeezed the American Rust Belt out of jobs. But how could the polls have got it wrong? Has Big Data failed? Will the Data Lake now only be home to smug, albeit tantalizing, bevies of Black Swans?
In the weeks and months leading up to the election, Hillary Clinton’s triumph seemed inevitable. Pollsters and pundits were not arguing about who the victor would be but about how much she would win by! Then Tuesday morning dawned, bright and uncertain, and the world was never the same again. The American map bled red as Clinton’s miscalculation turned to Trump’s triumph.
Hillary Clinton was analyzing data, but in retrospect her source may have been wrong. Her campaign was using an algorithm known as ‘Ada’ to help it locate its advantages and play to them. But such ardent number crunching overlooked a deep furrow of frustration and dissatisfaction that ran the length of middle America. The Rust Belt voters were unable to connect with the white-collar Clinton, whose aim to shatter the glass ceiling fell on uninterested ears of both genders.
Trump, on the other hand, relied on instinct. In his own unique way, drenched in pomposity and a roguish charm, he promised the country the moon on a platter, a wall along the Mexican border, and a chance to overturn the system they had grown to distrust, even hate. His campaign manager fed off the energy of the crowd, the mood of the electorate. So is Big Data inherently unable to handle certain real-life situations?
Human beings can be unpredictable. The information one receives from them can be clouded by subjectivity, a fact that most algorithms fail to compensate for, as Hillary Clinton discovered much to her dismay. Clinton supporters may have overstated their support, and Trump supporters might have held their tongues in front of the media they loathed, leaving the polls fractionally off the mark. But a fraction was all it took.
However, this is not a failure of Big Data per se but of forecasting. It wasn’t the analysis that was to blame as much as the human error of analysts. If you are looking into the Data Lake to see the future, then you may be disappointed. Data is just a collection of facts and figures, not a seer; one cannot stake an entire election on it. You can’t judge a nation’s hopes, disappointments, and aspirations from a row of numbers. There are certain qualitative aspects of the human psyche that no amount of Data Analytics can read.
To get a clearer picture of a prediction, one must look at a variety of data sets from a variety of sources. Anyone who had looked at the Data Lake carefully, and perhaps not been carried away by media hysteria, would have realized that there were ripples in the polls and a change in the winds: the tide was turning against Clinton in the subtlest of ways. The danger lies in trusting the Data Lake too much. Its waters look like they can support life but will ultimately drown those who rely on them completely.
Data Science is a means, not an end. It is a tool that can lead to probabilities, and inherent in that very word is the margin for error, for change. One must balance the qualitative and quantitative aspects of data, especially in a social science like politics.
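To put a rough number on that margin for error, here is a minimal Python sketch. It assumes a simple random sample and a 95% confidence level, and the poll figures in it are hypothetical, not taken from any real 2016 poll. Real polls also carry errors this formula cannot see, such as respondents who overstate or hide their preference.

```python
import math

def margin_of_error(share, sample_size, z=1.96):
    """Approximate margin of error for one polled share.

    Assumes a simple random sample; z=1.96 corresponds to ~95% confidence.
    """
    return z * math.sqrt(share * (1.0 - share) / sample_size)

def lead_is_within_noise(share_a, share_b, sample_size, z=1.96):
    """Return True when the gap between two candidates could be sampling noise."""
    lead = share_a - share_b
    # Variance of the difference between two shares drawn from the same sample.
    variance = (share_a + share_b - lead ** 2) / sample_size
    return abs(lead) < z * math.sqrt(variance)

if __name__ == "__main__":
    # Hypothetical poll: 1,000 respondents, 46% versus 44%.
    moe = margin_of_error(0.46, 1000)
    print(f"margin of error: +/-{moe * 100:.1f} points")
    print("lead within noise:", lead_is_within_noise(0.46, 0.44, 1000))
```

Under these assumptions, a two-point lead in a poll of a thousand people sits well inside the sampling noise, so such a poll describes a close race rather than a winner.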
Thus, as Trump ruffles feathers in the nation, the future of Big Data is clear. The Data Lake shall live on and grow in both size and depth. It is those who look into its waters who must change their perspective. Trump famously said, “What separates the winners from the losers is how a person reacts to each new twist of fate.” And in a world where Data Lakes can turn into pools of volatility and uncertainty, only the truly adaptable have a shot at gold.