Unified Beneficiary Identification Database Project
Data Management Division, ICTMS
UBID TWG
OUTLINE:
- About the Project
- Technical Working Group
- Scope
- Work Plan
- Integration of CRIMS v.4
About the Project
Business Intelligence, Data Visualization, and Big Data
Business Intelligence covers the technologies, applications, and practices for the collection, integration, analysis, and presentation of business information. Its purpose is to support better business decision-making.
In 2016, DSWD became part of the Task Force on Big Data for Official Statistics, to generate high-quality data with strong spatial dimension and sectoral and socioeconomic disaggregation required in monitoring national development plans as well as the Agenda 2030 on Sustainable Development.
AmBisyon Natin 2040
However, to make the most of these technologies, it is important to establish the fundamentals of the data environment (i.e., data governance, architecture, standards, and management).
Thus, it is important to continue improving our capacity to collect, manage, and secure our data.
But how reliable and accurate are our data?
4 Anchors of Trust on Data and Analytics
- quality
- effectiveness
- integrity
- resilience
a single source of truth for the DSWD beneficiaries from its key programs and projects
The project aims to organize and store beneficiary data collected by different programs and projects of DSWD.
Objective of the project
Specific functions of UBID
- Increased productivity among National Project Management Offices (NPMOs), Offices, Bureaus, Services, and Units (OBSUs), and other stakeholders who use beneficiary data in the delivery of their services
- Reduced human error when dealing with beneficiary identification and managing information about project and program beneficiaries
- Easier identification of whether a client is, or is still, qualified to benefit from one or more DSWD programs
- Improved communication among UBID stakeholders
- Improved business intelligence on DSWD beneficiary data
- Better decision-making on how to improve our services
SCOPE
- OBSUs & Data Owners
- Beneficiary Data from the whole country up to Barangay level
- September 4 to December 6, 2019
Work Plan
Meeting with Listahanan and Pantawid
Data Cleansing (Pantawid, NHTO)
Technical Meeting
Sept. 24 - Oct. 8
Meeting with UCT, Kalahi, and SLP
Technical Meeting
Sept. 25 - Oct. 18
Data Cleansing (UCT)
Data Cleansing (SLP)
Sept. 25 - Oct. 22
Oct. 28 - Nov. 18
Data Cleansing (PMB)
Technical Meetings
Deduplication and Merging
Sharing of Clean Data to ICTMS
Spatial Integration
New Field for UBID on Individual Databases
Dashboard Creation and Continuous Update
CRFs
Format
lat, long
(format) memo, advanced copy
ICTMS
FUZZY NAME-MATCHING ALGORITHM
Variable | Definition |
---|---|
LastNameSR | Last name of the individual to be matched |
FirstNameSR | First name of the individual to be matched |
MiddleNameSR | Middle name of the individual to be matched |
FMiddleNameSR | First letter of the middle name of the individual to be matched |
LastNameLR | Last name of the Listahanan 2* individual |
FirstNameLR | First name of the Listahanan 2* individual |
MiddleNameLR | Middle name of the Listahanan 2* individual |
FMiddleNameLR | First letter of the middle name of the Listahanan 2* individual |
FUZZY Name-matching algorithm
- Compute LastNameMatchPercentageJW = similarity of LastNameSR and LastNameLR using the Jaro-Winkler algorithm
- Compute LastNameMatchPercentageL = similarity of LastNameSR and LastNameLR using the Levenshtein algorithm
- Compute FirstNameMatchPercentageJW = similarity of FirstNameSR and FirstNameLR using the Jaro-Winkler algorithm
- Compute FirstNameMatchPercentageL = similarity of FirstNameSR and FirstNameLR using the Levenshtein algorithm
- Compute MiddleNameMatchPercentageJW = similarity of MiddleNameSR and MiddleNameLR using the Jaro-Winkler algorithm
- Compute IsMiddleNameEqual = check whether FMiddleNameSR and FMiddleNameLR are equal
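The two similarity measures named in the steps above can be sketched in plain Python as follows. The slides do not say how each measure is turned into a percentage, so this sketch assumes the Levenshtein distance is normalized by the length of the longer string, and that Jaro-Winkler uses the customary prefix scale of 0.1 over at most 4 leading characters. All function names are illustrative, not taken from the project. Scores are returned as fractions from 0 to 1; multiply by 100 for a percentage.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """ASSUMPTION: similarity = 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def jaro_similarity(a: str, b: str) -> float:
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters match only if they are equal and within this window.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)
```

For example, `levenshtein_distance("kitten", "sitting")` is 3, and `jaro_winkler_similarity("MARTHA", "MARHTA")` is about 0.961. In practice the names would likely be normalized first (for example, trimmed and upper-cased) so that case and spacing differences do not lower the score; that step is not described in the slides.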
Compute FinalMatchPercentage =
45% * AVERAGE(FirstNameMatchPercentageJW, FirstNameMatchPercentageL) + 53% * AVERAGE(LastNameMatchPercentageJW, LastNameMatchPercentageL) + 1% * MiddleNameMatchPercentageJW + 1% * IsMiddleNameEqual
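As a rough illustration of the weighting above, the formula can be written as a small Python function. This is a sketch, not the project's code: the function and parameter names are hypothetical, inputs are assumed to be percentages from 0 to 100, and IsMiddleNameEqual is assumed to contribute 100 when the middle initials are equal and 0 otherwise.

```python
from statistics import mean


def final_match_percentage(first_jw: float, first_l: float,
                           last_jw: float, last_l: float,
                           middle_jw: float, is_middle_equal: bool) -> float:
    """Weighted name score: 45% first name, 53% last name, 2% middle name."""
    first = mean([first_jw, first_l])   # average of Jaro-Winkler and Levenshtein
    last = mean([last_jw, last_l])
    # ASSUMPTION: the equality check is scored as 100 (equal) or 0 (not equal).
    middle_flag = 100.0 if is_middle_equal else 0.0
    return 0.45 * first + 0.53 * last + 0.01 * middle_jw + 0.01 * middle_flag


# Worked example with made-up scores:
# 0.45 * 93 + 0.53 * 100 + 0.01 * 80 + 0.01 * 100 = 41.85 + 53 + 0.8 + 1 = 96.65
score = final_match_percentage(96, 90, 100, 100, 80, True)
```

Because the last name carries 53% and the first name 45%, the middle name can move the result by at most 2 percentage points, so a differing middle name alone leaves the score at 98% or higher.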
HYBRID FUZZY NAME-MATCHING ALGORITHM
Variable | Definition |
---|---|
MetaLastNameSR | Result of Metaphone Algorithm on the last name of the individual to be matched |
MetaFirstNameSR | Result of Metaphone Algorithm on the first name of the individual to be matched |
MetaLastNameLR | Result of Metaphone Algorithm on the last name of the Listahanan 2* individual |
MetaFirstNameLR | Result of Metaphone Algorithm on the first name of the Listahanan 2* individual |
HYBRID FUZZY NAME-MATCHING ALGORITHM
- Compute LastNameHMatchPercentageJW = compute similarity of MetaLastNameSR and MetaLastNameLR using Jaro-Winkler Algorithm
- Compute FirstNameHMatchPercentageJW = compute similarity of MetaFirstNameSR and MetaFirstNameLR using Jaro-Winkler Algorithm
Compute FinalMatchPercentage =
45% * AVERAGE(FirstNameMatchPercentageJW, FirstNameMatchPercentageL, FirstNameHMatchPercentageJW) + 53% * AVERAGE(LastNameMatchPercentageJW, LastNameMatchPercentageL, LastNameHMatchPercentageJW) + 1% * MiddleNameMatchPercentageJW + 1% * IsMiddleNameEqual
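The hybrid formula differs from the plain fuzzy formula only in averaging a third, phonetic score into the first-name and last-name terms. A minimal sketch, assuming all inputs are percentages from 0 to 100 and that the equality flag counts as 100 or 0 (the function name and signature are hypothetical):

```python
from typing import Sequence


def weighted_name_score(first_scores: Sequence[float],
                        last_scores: Sequence[float],
                        middle_jw: float,
                        is_middle_equal: bool) -> float:
    """45% * average(first-name scores) + 53% * average(last-name scores)
    + 1% * middle-name Jaro-Winkler + 1% * middle-initial equality."""
    first = sum(first_scores) / len(first_scores)
    last = sum(last_scores) / len(last_scores)
    # ASSUMPTION: equality is scored as 100 (equal) or 0 (not equal).
    middle_flag = 100.0 if is_middle_equal else 0.0
    return 0.45 * first + 0.53 * last + 0.01 * middle_jw + 0.01 * middle_flag


# Hybrid case: three scores per name (Jaro-Winkler, Levenshtein, and
# Jaro-Winkler on the Metaphone codes). The numbers are made up.
hybrid = weighted_name_score([96, 90, 100], [100, 100, 100], 80, True)
# 0.45 * 95.33... + 0.53 * 100 + 0.8 + 1 = 97.7
```

Passing two scores per name instead of three reproduces the non-hybrid formula, so one function can serve both variants. In this made-up example the phonetic score of 100 lifts the first-name average from 93 to about 95.3, which is the intended effect when two spellings sound alike.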
If FinalMatchPercentage = 100%: DIRECT MATCH
If FinalMatchPercentage is 90-99% (at least 90% but below 100%): POSSIBLE MATCH
Else (<90%): NOT A MATCH; UNIQUE
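The three outcomes above could be applied with a small helper like the one below. It is a sketch under two assumptions that the slides do not state: scores strictly between 99% and 100% are treated as possible matches, and a tiny tolerance is used when testing for 100% because the weighted sum is computed in floating point. The function name and return labels are illustrative.

```python
import math


def classify_match(final_match_percentage: float) -> str:
    """Map a FinalMatchPercentage (0 to 100) to one of three outcomes."""
    # ASSUMPTION: allow for floating-point rounding when checking for 100%.
    if final_match_percentage > 100.0 or math.isclose(
            final_match_percentage, 100.0, abs_tol=1e-9):
        return "DIRECT MATCH"
    # ASSUMPTION: anything from 90% up to (but not including) 100% is possible.
    if final_match_percentage >= 90.0:
        return "POSSIBLE MATCH"
    return "UNIQUE"


labels = [classify_match(p) for p in (100.0, 99.5, 90.0, 89.99)]
# labels == ["DIRECT MATCH", "POSSIBLE MATCH", "POSSIBLE MATCH", "UNIQUE"]
```

Records labeled as possible matches would still need a follow-up step, such as manual review, before being merged.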
What's Next?
- What do we do with possible matches? (90%-99% FinalMatchPercentage)
- The algorithm still cannot address fraudulent identities (different names, same person) unless the beneficiaries' biometrics are also collected.
- Discuss other technical questions about data cleansing and deduplication.
- Revisit policies on beneficiary eligibility per program/project and how we can apply them in a possible search system for UBID.
- CRF Management
- How to update data once there is an existing UBID
- Communication protocols for data updates, CRFs, data mismatch, etc.
- Changes in TWG members or technical personnel
UBID Project
By Andi