Big Data Ethics: Weaving ethics into the fabric of Big Data

  • Author: Sahana Rajan


The arrival of Big Data in information science was a technological leap for which our existing world of machinery was unprepared. Just as every new species needs a fertile and nourishing environment to survive, so does every new stage in information technology. Every invention must grow in an atmosphere that includes a parallel ethical code to restrain its misuse. Where does ownership of data begin and end? What kind of data should be accessible to each designation within an organization? Who decides in what form and for how long such data must be stored? While some such questions fall within the purview of law, the ones above confront everyone who comes into contact with Big Data and works with it every day.

Gartner defines Big Data as “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation”. As Kord Davis (the author of a recent and comprehensive book, “Ethics of Big Data”) points out, the ethics of working with Big Data is the challenge of creating a safe atmosphere where innovation can take place with minimal risk. The challenge is all the more pressing because of the interdisciplinary reach of Big Data. It penetrates our social, political and cultural geography, not only by extracting raw material for insights that affect our daily lives but by becoming that raw material. Ethics and law seem to move more slowly, and risk becoming obsolete, when compared with the pace at which our information technology develops.

Pinterest recently announced a feature that lets you select a portion of a photo and search its inventory for similar photos. Such intricate data processing raises questions about the avenues through which access to personal or private data could open up. These questions are not new: they were the subject of public discussion in the New York Times in 2012. The report laid out the case of the social networking app ‘Path’, which was alleged to be extracting and storing personal data (such as names and contact information) without the explicit consent of its users. Responses to this news ranged from ethical judgements to scattered legal reactions.

One of the ways Big Data influences you every day is through targeted ads on social networking sites. Entering such a site can feel like flipping through the scrapbook of a stalker who has collected your virtual footprint and now waits for you, the consumer, to give in. In such a scenario, organizations have to be individually responsible for the values they allow to govern their databases.

In the vast discussion on the ethical nature of Big Data, we often misjudge the impact of Big Data itself. Big Data is not simply an enormous store of information but a breeding ground for power and money. Four major principles can guide the ethics of Big Data:

Privacy Laws

A manual of rules and laws on how information can be accessed and used will mediate the relationship between data and the different parties involved. Any privacy law must first identify its target audience and the type of data being protected. Above all, the law-making panel must define the difference between secret and private, and identify the fields within which these laws will apply.

Redefining Boundaries: Confidential and Accessible

A spectrum of information about users is collected through the services we subscribe to and rely on: WiFi locators, picture galleries and GPS. In a scenario where the lines between flows of information are so fluid, it becomes important to make distinctions subtler than the inadequate dichotomy of entirely public or entirely private.

Declouding the mirrors of communication: Transparency

It is now possible to create a virtual world in which you are entertained courtesy of the stream of web footprints you leave while browsing or simply being online. This makes it important to acknowledge that every time you click on or visit a page, bits and pieces of you are probably being collected, and these can be used for targeted mailing and notifications. Transparency in the mediation of data can be achieved through privacy laws and through cooperative consent between individuals and organizations. The latter would define the rights of cyber citizens and the duties of the data-diggers.

Identity Care

Once all the “Dos” have been decided, it is essential to seal Big Data ethics with a range of “Don’ts”. What kinds of information should no organization or enterprise have access to? Even if such data were available, how do we ensure that it is not used to draw undesirable inferences? This would require Predictive Analytics, as a discipline, to explore business ethics and declare lines of conduct.

Alongside the crowd of antivirus and anti-spyware tools, the age of Big Data shows the need for a vaccine against data theft and misuse. The four principles above can serve as the seeds from which an ethics regulating the production and distribution of data can grow. Determining the participants in the information lifecycle, and assigning each level in the hierarchy its corresponding powers, will allow us to systematize the task of forming an ethics for Big Data.
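To make the idea of assigning powers to each participant in the information lifecycle a little more concrete, here is a minimal sketch in Python. It is only an illustration: the role names, the list of actions and the powers given to each role are hypothetical assumptions made for this example, not a proposed standard.

```python
from enum import Enum, auto


class Action(Enum):
    """Things that can be done with data during its lifecycle (assumed list)."""
    COLLECT = auto()
    STORE = auto()
    ANALYZE = auto()
    SHARE = auto()
    DELETE = auto()


# Hypothetical participants and the powers designated to each of them.
# Both the roles and their powers are assumptions made for illustration.
ROLE_POWERS = {
    "data_subject": {Action.DELETE},
    "collector": {Action.COLLECT, Action.STORE},
    "analyst": {Action.ANALYZE},
    "data_steward": {Action.STORE, Action.SHARE, Action.DELETE},
}


def is_permitted(role: str, action: Action) -> bool:
    """Return True only if the role was explicitly granted the action."""
    # Unknown roles get an empty set of powers, so they are denied by default.
    return action in ROLE_POWERS.get(role, set())


if __name__ == "__main__":
    print(is_permitted("analyst", Action.ANALYZE))
    print(is_permitted("analyst", Action.SHARE))
    print(is_permitted("unknown_role", Action.COLLECT))
```

The design choice worth noting is that anything not explicitly granted is refused, which mirrors the pairing of “Dos” with “Don’ts” described under Identity Care.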

Laws and regulations alone will not be sufficient to implement these principles effectively. What is required is a holistic rethinking of how we have perceived the internet and the World Wide Web. It is time to start looking at our screens as extensions of ourselves which, because they have become windows into us, need to be guarded through a series of measures.