Big Data in Indian Governance

A Case Study by the Centre for Internet and Society

Research Objectives

 

Understand the potential harms and benefits of Big Data in India and provide recommendations for anticipatory regulation.

Towards this, the case study seeks to identify the following in the context of Big Data and governance in India:

  • uses 

  • generation

  • promises

  • policy implications

  • public perception 

  • potential impact on citizens, society, and governance 

  • potential regulatory interventions and solutions

 

Scope

The use of Big Data in governance in India is still emerging, and the stage at which 'Big Data' becomes relevant differs across schemes. For some, 'Big Data' is a tool that has been recognized as important. For others, 'Big Data' analytics and techniques are a means to operationalize the scheme. Still other schemes have not explicitly cited 'Big Data' but are structured in such a way that they could use or generate it.

Based on either the potential use or generation of Big Data, or a public statement on its use or generation, the case study focuses on the following:

  • The Aadhaar scheme: A 12-digit identity number based on biometrics that can be used to authenticate individuals.
  • Digital India:  (Amber please add sentence)
  • Smart Cities: A five-year project to develop 100 smart cities across India to drive economic growth and improve quality of life.

Qualifying Big Data 

Self Identified: Scheme policy documents describe the use of Big Data analytics and techniques 

Publicly Identified: Publicly available third-party sources describe the scheme as using Big Data, or describe Big Data as a critical component of the scheme.

Potentially Identified: The scheme's consent mechanism, infrastructure, size of population serviced, and sharing of data point to the potential use or generation of Big Data. More generally, this covers schemes that will enable a quantified society.

(Amber please add the schemes that you have identified as potentially using or generating big data)

Research Questions

  • What are the objectives/promises of a scheme?
  • Are there assumptions made within a scheme?  
  • How does Big Data pertain to a scheme? Is it generated or used or both? 
  • What is the data flow within a scheme? Specifically, where and how is information collected, what is the mechanism for consent, and how is information shared or disclosed?
  • What has been the public dialogue around a scheme?
  • What is the applicable legislation/policy?
  • Does the government department host a privacy policy for the scheme? 
  • Are private companies involved in implementing the scheme? If yes, are these foreign or domestic? If yes, is there a clear data policy for these organizations? 
  • Algorithmic decision making - Issues
  • Use and reuse

Research Methodology

  • Literature review 
  • Analysis of media reports, government notes, press releases, legislation, conference inputs, contracts, tenders, and policy 
  • Interviews with experts and site visits 
  • Review of government websites
  • Right to Information Requests 

Literature Review 

Consent 


Privacy


Potential harms and benefits


Knowledge 


Big Data and Governance in India: Overview 

 

Big Data in Indian Governance

The use of Big Data is still at a nascent stage in India. Ways in which the Government is beginning to use Big Data include:

  • Informing policy and decisions: analytics for poverty line and mygov.in 

  • Operationalizing and improving schemes: The Department of Electronics and Information Technology has published an IoT policy recognizing the importance of Big Data in the delivery of government services.

  • Growing domestic analytics capacity: Industry body NASSCOM is setting up 20 analytics centres of excellence to grow domestic capacity and undertake government projects. The Big Data Initiative under the Department of Science and Technology has issued a call for proposals to build Big Data capacity.

Digital India Overview

  • Vanya add summary 

Aadhaar Overview

 

  • Vanya add summary 

100 Smart City Overview

  • Vanya add summary 

Overview of Promises and Objectives  

  • 18 Efficient service delivery
  • 16 Accessibility
  • 13 Integration and data consolidation
  • 11 Automation and Monitoring
  • 8 Transparency and Accountability
  • 7 Interoperability and common standards
  • 2 Political and social empowerment 
  • 2 Reduction of fraud 
  • 2 Data driven decision making   
  • 2 Conclusiveness
  • 1 Digital Security 
  • 1 Universal Identity  
  • 1 Financial inclusion  


Overview of Assumptions 

  • The data ecosystem in a scheme is accurate and thus the results are accurate   
  • Data driven decisions will allow for targeted and accurate implementation
  • Data driven decisions will save money 

Data Flow 

Consent

  • Explicit
    • Digital India - scheme A
  • Implicit
  • Generic

Examples of consent


Data Ownership

  • Aadhaar: The public-private partnership model makes it unclear who owns enrollment data and transaction data.
  • Smart Cities: Not yet determined
  • Digital India: In some cases the government owns the data, while in others the lack of documentation and of a clear, publicly available policy makes it unclear who owns the data.

Collection

  • Proactive
  • Reactive
  • Ongoing

Type and source of data

  • Aadhaar: Directly from the individual and from service providers adopting the identifier
  • Digital India: Directly from the individual and from digital platforms such as social media. 
  • Smart Cities: Directly from the individual and from the devices and systems they interact with

Storage


Analysis


Sharing & Retrieval


Use

  • Elonnai, I have mapped the use of all initiatives in the Excel sheet shared, but am not sure how we want to represent this information. To me, the Re-use slide seems to cover the information we want to bring out: that there are no restrictions on use

Reuse

  • All initiatives under Digital India are silent on the issue of re-use of data collected
  • It is, however, clear from the objectives of a few initiatives that data is intended to be shared and re-used
  • In the absence of laws on data minimization and purpose limitation, data can be used indiscriminately for any purpose

Deletion

  • Of the 34 initiatives, 31 initiatives are engaged in data collection
  • Of these 31 initiatives, 22 have been implemented fully or partially
  • None of these 22 initiatives provides a mechanism for individuals to delete their personal records
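
The counts above form a simple funnel, and the shares they imply can be worked out directly. The following is a minimal sketch in Python that uses only the figures reported in this slide. The variable names and the share helper are illustrative assumptions and are not part of the study.

```python
# Counts as reported in the Deletion slide; no new data is introduced here.
TOTAL_INITIATIVES = 34   # initiatives covered by the study
COLLECTING_DATA = 31     # initiatives engaged in data collection
IMPLEMENTED = 22         # of those 31, implemented fully or partially
WITH_DELETION = 0        # of those 22, offering deletion of personal records


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Each step of the funnel is expressed relative to the step before it.
funnel = {
    "collect data (of all initiatives)": share(COLLECTING_DATA, TOTAL_INITIATIVES),
    "implemented (of those collecting data)": share(IMPLEMENTED, COLLECTING_DATA),
    "offer deletion (of those implemented)": share(WITH_DELETION, IMPLEMENTED),
}

for step, percent in funnel.items():
    print(f"{step}: {percent}%")
```

Expressing each step relative to the previous one keeps the denominators explicit, which matters here because the slide's three counts each refer to a different base.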

Data updation

  • Of the 34 initiatives covered, 12 initiatives allow for some updation of data collected
  • In most cases, this updation facility is provided only where it is required for monitoring and governance

Pivots for Governance


Access - whose data is driving the intelligence


Size of information collected 

  • Digital India Schemes: The size of data varies across schemes. Most initiatives also plan to digitize and leverage the information collected by analog means in past decades. For instance, over 12 crore PAN numbers are registered under IncomeTaxIndia, and 5.7 crore passport holders are registered under Passport Seva.
  • Aadhaar: As of 30 October 2015, UIDAI had generated more than 92.68 crore Aadhaar numbers
  • Smart Cities: Not yet at the stage of implementation

Policy and Big Data 

Digital India Policy Ecosystem 

  • The 11 Central Government initiatives under Digital India deal with subject matter covered under 17 central legislations
  • The 14 State Government initiatives deal with subject matter covered under 21 central legislations and corresponding state legislations
  • The other 9 initiatives under this study attract provisions from 4 other legislations

 

Schemes and Privacy Policies 

  • Only 20 out of 34 initiatives have clear privacy policies on their websites 
  • Most policies mention collection of the following information – Internet protocol (IP) address, domain name, browser type, operating system, the date and time of the visit and the pages visited, and if you reached this website from another website, the address of that referring website.
  • Some policies provide that information collected can be divulged to any governmental organisations or law enforcement agencies


Section 43A of the IT Act and Big Data

Areas in which India's current data protection standards would not be adequate in a 'big data' scenario include:

  • Scope 
  • Definition of PI and SPI 
  • Consent 
  • Notice of collection 
  • Access and correction 
  • Security 
  • Data Breach 
  • Opt in and out 
  • Disclosure of Information 
  • Privacy Policy 
  • Remedy 

Big Data and Potential Legal Hurdles 

There are potential legal hurdles to the collection and use of different types of digital data. For example:

  • s.69 and access to GPS data for smart city traffic management

Public Dialogue 

Smart Cities 

  • The timeline for the implementation of the smart city initiative is too short for what it seeks to achieve

  •  In the smart city scheme, technology is being relied upon to 'smooth over' city level problems.

  • The Smart City initiative assumes that technology is neutral, and the realities of urban data politics are not being considered

  • The Smart City initiative raises questions about socio-spatial consequences

  • The smart city initiative has not considered the need for interoperable standards 

  • The inter-departmental and inter-organizational cooperation that the initiative needs is lacking

  • Smart cities risk exclusion and marginalization

  • Smart Cities are an example of a western practice being imposed in the Indian context

  • Smart Cities represent a top-down application of technology

  • Smart Cities bring together open data and big data 

Aadhaar 

  • Aadhaar can enable function creep and convergence 
  • Aadhaar could be used to profile or surveil individuals 

e-Gov 


e-Kranti


Initial Observations

  • From a policy perspective, India has yet to consider the implications of Big Data or how policy will need to adapt. This will be particularly important as India's data protection standards do not apply to the public sector.

  • Many aspects related to 'data flow' are not publicly available or are not adequate from a privacy or Big Data perspective (for example, consent)

  • Efficient service delivery is an objective that cuts across all schemes and projects. 

  • The public dialogue in India has raised concerns about privacy, surveillance, convergence, marginalization, discrimination, and equality that could arise from these projects, but has not raised concerns about anti-competitive practices.

  • Data is being equated with the truth, and services are creating project-specific ecosystems of 'truth'. For example, the UIDAI has set up a web-enabled Analytics portal which functions as a common data source and serves as a 'single source of truth for the organization'.

  • Big Data in governance requires public-private partnerships. This complicates issues of liability and data ownership and creates a 'black box' around the data practices of both the government and private companies.

  • Data that is used and re-used for governance purposes is not always collected within a legal framework

Initial Observations 

  • New schemes that deliver government services are replacing rights-based legislation.

Challenges to the research


Questions and policy windows we are still pursuing 


Potential Research Methods 


Thank you!