Unified Beneficiary Identification Database Project

Data Management Division, ICTMS
UBID TWG

OUTLINE:

  1. About the Project
  2. Technical Working Group
  3. Scope
  4. Work Plan
  5. Integration of CRIMS v.4

About the Project

Business Intelligence, Data Visualization, and Big Data

Business Intelligence: the technologies, applications, and practices for the collection, integration, analysis, and presentation of business information. The purpose of Business Intelligence is to support better business decision-making.

2016: DSWD became part of the Task Force on Big Data for Official Statistics

to generate high-quality data with a strong spatial dimension and the sectoral and socioeconomic disaggregation required in monitoring national development plans as well as the Agenda 2030 on Sustainable Development
AmBisyon Natin 2040

However, to make the most of these technologies, it is important to establish the fundamentals of the data environment (i.e., data governance, architecture, standards, and management).

 

Thus, it is important to continue improving our capacity to collect, manage, and secure our data.

 

But how reliable and accurate are our data?

4 Anchors of Trust in Data and Analytics

  1. quality
  2. effectiveness
  3. integrity
  4. resilience

a single source of truth for the DSWD beneficiaries from its key programs and projects


Objective of the project

The project aims to organize and store beneficiary data collected by different programs and projects of DSWD.

Specific functions of UBID

  • Increased productivity among National Project Management Offices (NPMOs), Offices, Bureaus, Services, and Units (OBSUs), and other stakeholders who use beneficiary data in the delivery of their services

  • Reduced human error in beneficiary identification and in managing information about project and program beneficiaries

  • Easier identification of whether a client is, or is still, qualified to benefit from one or more DSWD programs

  • Improved communication among UBID stakeholders

  • Improved business intelligence on DSWD beneficiary data

  • Better decision-making on how to improve our services

SCOPE

  • OBSUs & Data Owners
  • Beneficiary Data from the whole country up to Barangay level
  • September 4 to December 6, 2019

Work Plan

Meeting with Listahanan and Pantawid

Data Cleansing (Pantawid, NHTO)

Technical Meeting

Sept. 24 - Oct. 8

Meeting with UCT, Kalahi, and SLP

Technical Meeting

Sept. 25 - Oct. 18

Data Cleansing (UCT)

Data Cleansing (SLP)

Sept. 25 - Oct. 22

Oct. 28 - Nov. 18

Data Cleansing (PMB)

Technical Meetings

Deduplication and Merging

Sharing of Clean Data to ICTMS

Spatial Integration

New Field for UBID on Individual Databases

Dashboard Creation and Continuous Update

CRFs

Format

lat, long

(format) memo, advanced copy

ICTMS

FUZZY NAME-MATCHING ALGORITHM

Variable          Definition
LastNameSR        Last name of the individual to be matched
FirstNameSR       First name of the individual to be matched
MiddleNameSR      Middle name of the individual to be matched
FMiddleNameSR     First letter of the middle name of the individual to be matched
LastNameLR        Last name of the Listahanan 2* individual
FirstNameLR       First name of the Listahanan 2* individual
MiddleNameLR      Middle name of the Listahanan 2* individual
FMiddleNameLR     First letter of the middle name of the Listahanan 2* individual


  1. Compute LastNameMatchPercentageJW = similarity of LastNameSR and LastNameLR using the Jaro-Winkler algorithm
  2. Compute LastNameMatchPercentageL = similarity of LastNameSR and LastNameLR using the Levenshtein algorithm
  3. Compute FirstNameMatchPercentageJW = similarity of FirstNameSR and FirstNameLR using the Jaro-Winkler algorithm
  4. Compute FirstNameMatchPercentageL = similarity of FirstNameSR and FirstNameLR using the Levenshtein algorithm
  5. Compute MiddleNameMatchPercentageJW = similarity of MiddleNameSR and MiddleNameLR using the Jaro-Winkler algorithm
  6. Compute IsMiddleNameEqual = whether FMiddleNameSR and FMiddleNameLR are equal
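The two similarity measures named in these steps can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual implementation: the function names are hypothetical, both functions return a value from 0 to 1 (multiply by 100 for a percentage), and the way Levenshtein distance is converted to a similarity (1 - distance / length of the longer name) is an assumption, since the slides do not say how the percentage is derived.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """ASSUMPTION: similarity = 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def jaro_similarity(a: str, b: str) -> float:
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters match only if they are within this distance of each other.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for ca, cb in zip(a[:4], b[:4]):
        if ca != cb:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)


print(round(jaro_winkler_similarity("MARTHA", "MARHTA"), 4))  # → 0.9611
print(levenshtein_distance("KITTEN", "SITTING"))              # → 3
```

Names would normally be normalized first (for example, trimmed and converted to one letter case) so that differences in capitalization or spacing do not lower the scores.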


Compute FinalMatchPercentage =
    45% * AVERAGE(FirstNameMatchPercentageJW, FirstNameMatchPercentageL)
  + 53% * AVERAGE(LastNameMatchPercentageJW, LastNameMatchPercentageL)
  +  1% * MiddleNameMatchPercentageJW
  +  1% * IsMiddleNameEqual
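As a worked example of the weighting, the formula can be written as a small Python function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, all similarity inputs are percentages from 0 to 100, and the middle-initial check is assumed to contribute 100 when the initials are equal and 0 otherwise (the slides do not state how the flag is scaled).

```python
def final_match_percentage(first_jw: float, first_l: float,
                           last_jw: float, last_l: float,
                           middle_jw: float,
                           middle_initial_equal: bool) -> float:
    """Weighted score: 45% first name, 53% last name, 1% + 1% middle name."""
    first = (first_jw + first_l) / 2
    last = (last_jw + last_l) / 2
    # ASSUMPTION: an equal middle initial counts as 100, otherwise 0.
    initial = 100.0 if middle_initial_equal else 0.0
    # Integer weights divided once at the end, so a perfect match is exactly 100.
    return (45 * first + 53 * last + 1 * middle_jw + 1 * initial) / 100


# Illustrative inputs only: last and middle names identical,
# first name slightly different (96% Jaro-Winkler, 90% Levenshtein).
score = final_match_percentage(96, 90, 100, 100, 100, True)
print(round(score, 2))  # → 96.85
```

Because the middle name carries only 2% of the weight in total, a record with a completely different middle name can still score 98% when the first and last names are identical.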

HYBRID FUZZY NAME-MATCHING ALGORITHM

Variable          Definition
MetaLastNameSR    Result of the Metaphone algorithm on the last name of the individual to be matched
MetaFirstNameSR   Result of the Metaphone algorithm on the first name of the individual to be matched
MetaLastNameLR    Result of the Metaphone algorithm on the last name of the Listahanan 2* individual
MetaFirstNameLR   Result of the Metaphone algorithm on the first name of the Listahanan 2* individual


  1. Compute LastNameHMatchPercentageJW = similarity of MetaLastNameSR and MetaLastNameLR using the Jaro-Winkler algorithm
  2. Compute FirstNameHMatchPercentageJW = similarity of MetaFirstNameSR and MetaFirstNameLR using the Jaro-Winkler algorithm


Compute FinalMatchPercentage =
    45% * AVERAGE(FirstNameMatchPercentageJW, FirstNameMatchPercentageL, FirstNameHMatchPercentageJW)
  + 53% * AVERAGE(LastNameMatchPercentageJW, LastNameMatchPercentageL, LastNameHMatchPercentageJW)
  +  1% * MiddleNameMatchPercentageJW
  +  1% * IsMiddleNameEqual
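The effect of adding the phonetic (Metaphone-based) score to each average can be illustrated with a short Python sketch. The function and parameter names are hypothetical, each name takes three percentages (Jaro-Winkler on the raw name, Levenshtein on the raw name, and Jaro-Winkler on the Metaphone codes), and the middle-initial flag is again assumed to count as 100 when equal and 0 otherwise.

```python
from typing import Sequence


def hybrid_final_match_percentage(first_scores: Sequence[float],
                                  last_scores: Sequence[float],
                                  middle_jw: float,
                                  middle_initial_equal: bool) -> float:
    """first_scores / last_scores: (JW, Levenshtein, JW-on-Metaphone), each 0..100."""
    first = sum(first_scores) / len(first_scores)
    last = sum(last_scores) / len(last_scores)
    # ASSUMPTION: an equal middle initial counts as 100, otherwise 0.
    initial = 100.0 if middle_initial_equal else 0.0
    return (45 * first + 53 * last + 1 * middle_jw + 1 * initial) / 100


# Illustrative inputs only: the first names are spelled differently
# (96% and 90%) but sound almost the same (99% on the Metaphone codes).
score = hybrid_final_match_percentage((96, 90, 99), (100, 100, 100), 100, True)
print(round(score, 2))  # → 97.75
```

With the same raw-spelling scores, the non-hybrid formula would give 96.85, so in this example a close phonetic match raises the final percentage by about one point.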


If FinalMatchPercentage = 100%: DIRECT MATCH

If FinalMatchPercentage = 90-99%: POSSIBLE MATCH

Else (< 90%): NOT A MATCH; UNIQUE
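These decision rules can be sketched as a single Python function. The function name and the returned labels are hypothetical. Two points are assumptions rather than part of the slides: scores that fall between 99% and 100% are treated as possible matches, and a tiny tolerance is used when testing for 100% so that floating-point rounding does not demote a perfect match.

```python
def classify_match(final_match_percentage: float) -> str:
    """Map a FinalMatchPercentage (0..100) to one of the three outcomes."""
    tolerance = 1e-9  # ASSUMPTION: guards against floating-point rounding
    if final_match_percentage >= 100.0 - tolerance:
        return "DIRECT MATCH"
    # ASSUMPTION: anything from 90% up to (but not including) 100% is "possible".
    if final_match_percentage >= 90.0:
        return "POSSIBLE MATCH"
    return "UNIQUE"


for score in (100.0, 96.85, 89.99):
    print(score, classify_match(score))
```

Records classified as possible matches would still need a follow-up rule or manual review before being merged.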

 

What's Next?

 

  • What do we do with possible matches? (90%-99% FinalMatchPercentage)
  • Name matching still cannot address fraudulent identities (different names, same person) unless the beneficiaries' biometrics are also collected.
  • Discuss other technical questions about data cleansing and deduplication.
  • Revisit policies on beneficiary eligibility per program/project and how we can apply them to a possible search system for UBID
  • CRF Management
  • How to update data once there is an existing UBID
  • Communication protocols for data updates, CRFs, data mismatch, etc.
  • Changes in TWG members or technical personnel

UBID Project

By Andi