Researching Big Data in Indian Governance
A Case Study by
The Centre for Internet & Society
Bangalore, India
Research Objectives
- Explore methodologies for researching Big Data - a nascent topic in India.
- Explore the reality and potential of Big Data in governance in India.
- Explore the potential harms and benefits of Big Data in India and provide recommendations for anticipatory regulation.
Scope
The use of Big Data in governance in India is still emerging, and the stage at which 'Big Data' is relevant differs across schemes. For some, 'Big Data' is a tool that has been recognized as important. For others, 'Big Data' analytics and techniques are a means to operationalize a scheme. Still other schemes have not explicitly cited 'Big Data' but are structured in such a way that the use or generation of 'Big Data' is a possibility.
Based on either the potential use or generation of Big Data, or a public statement on its use or generation, the case study focuses on the following:
- The Aadhaar scheme: Implemented since 2010, a 12-digit identity number based on biometrics that can be used to authenticate individuals for delivery of the Public Distribution System (PDS).
- Digital India: In the process of implementation, a governance campaign comprising nine pillars to transform India's economy and empower the citizen.
- Smart Cities: In the process of conceptualization, a five-year project to develop 100 smart cities across India to drive economic growth and improve quality of life.
Research Questions
- What are the objectives/promises of a scheme?
- Are there assumptions made within a scheme?
- How does Big Data pertain to a scheme? Is it generated or used or both?
- What is the data flow within a scheme? Specifically, where and how is information collected? What is the mechanism for consent? How is information shared or disclosed?
- What has been the public dialogue around a scheme?
- What legislation and policy are applicable?
- Does the government department host a privacy policy for the scheme?
- Are private companies involved in implementing the scheme? If yes, are these foreign or domestic? If yes, is there a clear data policy for these organizations?
- Algorithmic decision making - Issues
- Use and reuse
Research Methodology
- Literature review
- Analysis of media reports, government notes, press releases, legislation, conference inputs, contracts, tenders, and policy
- Interviews with experts and site visits
- Review of government websites
- Right to Information Requests
Data Flow
Qualifying Big Data for the Case Study
Self Identified: Scheme policy documents describe the use of Big Data analytics and techniques.
Publicly Identified: Publicly available third-party sources describe the scheme as using Big Data, or describe Big Data as a critical component of the scheme.
Potentially Identified: The scheme's consent mechanism, infrastructure, size of population serviced, and sharing of data point to Big Data, or more generally the scheme will enable a quantified society.
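The three qualifying categories above can be recorded in a simple structure when screening schemes. The following is a minimal sketch only, not part of the case study's own tooling; the class names, field names, and the `qualify` function are hypothetical labels chosen for illustration, and a scheme may fall under more than one category.

```python
from dataclasses import dataclass
from enum import Enum


class Identification(Enum):
    SELF = "self identified"
    PUBLIC = "publicly identified"
    POTENTIAL = "potentially identified"


@dataclass
class SchemeEvidence:
    # Hypothetical fields: one per qualifying criterion listed above.
    name: str
    policy_docs_cite_big_data: bool = False
    third_party_sources_cite_big_data: bool = False
    # Structural indicators noted by the researcher,
    # e.g. ("size of population serviced", "sharing of data").
    structural_indicators: tuple = ()


def qualify(evidence):
    """Return every category under which a scheme qualifies."""
    categories = []
    if evidence.policy_docs_cite_big_data:
        categories.append(Identification.SELF)
    if evidence.third_party_sources_cite_big_data:
        categories.append(Identification.PUBLIC)
    if evidence.structural_indicators:
        categories.append(Identification.POTENTIAL)
    return categories
```

For instance, a placeholder scheme recorded with only `structural_indicators=("sharing of data",)` would qualify as potentially identified and nothing else, while a scheme with no recorded evidence would return an empty list and fall outside the case study.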
Data Flow Research
Towards this the case study seeks to identify the following in the context of Big Data and Governance in India:
- Access and consent
- Generation and analysis
- Potential and present uses and reuse
- Promises and assumptions
- Policy implications
- Public perception
- Potential impact on citizens, society, and governance
- Potential regulatory interventions and solutions
The Importance of Data Flow
Mapping out the flow of data in each scheme is important in understanding:
- Whether Big Data is, or could be, generated and used.
- Where 'data gaps' could be in the research, i.e. which areas are opaque or not accessible.
- Which data flow processes are good and which are poor or inadequate, and what the potential benefits or harms are.
Data sources
- Where is data being collected from?
Consent
The consent mechanism can indicate whether indiscriminate sharing and re-purposing of data might happen, while a lack of consent, or only minimal consent, can also be an indicator of inadequate policy.
The form of consent taken in different schemes varies and can include:
- Explicit: Consent is taken for each collection and/or use
- Implicit: Consent is understood to be given when entering a space or engaging with a service.
- Generic: Consent is taken for initial collection, but consent for future uses is not taken.
Collection
The way the data is collected could have a bearing on the size of the data collected and on how it can be analysed, shared and used. Whether the provision of data is mandatory, voluntary or quasi-mandatory also raises questions of citizens' rights and agency in the use and re-use of the data.
Proactive | Reactive | Ongoing
Mandatory | Voluntary | Quasi
Data Ownership and liability
Data ownership is important in identifying forms of redress available to the individual and the liability of those collecting and using the data.
- Big Data can complicate the issue of data ownership as data changes hands and new insights are derived.
- In governance, the involvement of public private partnerships can complicate the question of data ownership and liability.
- In a context like India, where data protection standards do not clearly extend to public bodies, questions of ownership and liability are important in understanding the rights of individuals in relation to their data.
Type of data
The type of data collected and the source of that data are important in understanding the potential implications for individuals' rights, including privacy, as the data is used and re-used:
- Data or metadata
- Quantitative or Qualitative
- Primary or secondary
- Direct or indirect
Veracity of the Data
Storage
Aspects of the storage of data can impact citizens' privacy:
- Duration
- Format
- Security
Analysis
Analysis can impact privacy, discrimination, and marginalization
- Method used to reach a conclusion
- Data used to reach a conclusion
- Use of such conclusion
Sharing
The way in which data is shared and retrieved can result in convergence
- Seeding
- Merging
- One time disclosure
Use
- Limited use
- Re-purposed
Deletion
- By the individual
- By the department
- Completely deleted
- Partially deleted
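The data flow dimensions listed so far (consent, collection, sharing, use) can be captured per scheme in a small record, which also makes the 'data gaps' mentioned earlier easy to list. This is a sketch under stated assumptions: the type names, field names, and the `data_gaps` helper are hypothetical and are not drawn from the case study; only the category values come from the lists above.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import Optional


class Consent(Enum):
    EXPLICIT = "explicit"
    IMPLICIT = "implicit"
    GENERIC = "generic"


class Provision(Enum):
    MANDATORY = "mandatory"
    VOLUNTARY = "voluntary"
    QUASI = "quasi"


class Sharing(Enum):
    SEEDING = "seeding"
    MERGING = "merging"
    ONE_TIME_DISCLOSURE = "one time disclosure"


class Use(Enum):
    LIMITED = "limited use"
    REPURPOSED = "re-purposed"


@dataclass
class DataFlowProfile:
    # None means the researcher could not establish this dimension.
    scheme: str
    consent: Optional[Consent] = None
    provision: Optional[Provision] = None
    sharing: Optional[Sharing] = None
    use: Optional[Use] = None


def data_gaps(profile):
    """List the dimensions that remain opaque for a scheme."""
    return [f.name for f in fields(profile)
            if getattr(profile, f.name) is None]
```

A profile filled in only for consent and use would report `provision` and `sharing` as gaps, which could then be targeted through the interviews and Right to Information requests listed under the research methodology.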
Data updation
- Frequency
- Source and veracity of data
Pivots for Governance
Size of information collected
Policy and Big Data
Potential Research Methods
Thank you!
Researching Big Data in Indian Governance
By Elonnai Hickok