Real-time pre-authorization in ride-hailing: switching from rules to ML
Andrey Lukyanenko
Senior DS @ Careem
About me
- ~4 years as an ERP-system consultant
- DS since 2017
- Led a medical chatbot project
- Led an R&D CV team
- Senior DS at Careem: anti-fraud, recommendation system, LLM-based products
A slide about Careem
Content
- Pre-authorization: what is it?
- Why did we decide to switch from rules to ML?
- Challenges of building the model
- Data Preparation
- Model training
- Model deployment
- Results
Pre-authorization
Rules vs ML models
Challenges of building the model
- Measuring real performance: if the new model denies a transaction, we never learn whether it was fraudulent
- Prediction threshold optimization
- Latency
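One way to approach the threshold question above is to score a labeled validation set and pick the cutoff with the lowest business cost. The sketch below is only an illustration, not the method used in the talk: the function name `pick_threshold`, the two cost weights, and the toy data are assumptions. It also presumes reliable fraud labels, which the first bullet notes are missing for transactions that were denied.

```python
from dataclasses import dataclass


@dataclass
class ThresholdResult:
    threshold: float
    cost: float


def pick_threshold(scores, labels, cost_false_decline=1.0, cost_missed_fraud=5.0):
    """Return the candidate threshold with the lowest total cost.

    A transaction is declined when score >= threshold.
    labels: 1 = fraud, 0 = legitimate.
    Both cost weights are hypothetical and would come from the business.
    """
    # Every observed score is a candidate; +inf means "never decline".
    candidates = sorted(set(scores)) + [float("inf")]
    best = None
    for t in candidates:
        false_declines = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        missed_fraud = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = cost_false_decline * false_declines + cost_missed_fraud * missed_fraud
        # Strict comparison keeps the lowest threshold among ties.
        if best is None or cost < best.cost:
            best = ThresholdResult(threshold=t, cost=cost)
    return best


# Toy validation data (made up for illustration).
toy_scores = [0.1, 0.2, 0.8, 0.9]
toy_labels = [0, 0, 1, 1]
result = pick_threshold(toy_scores, toy_labels)
```

The quadratic loop is fine for a sketch; on real volumes a sorted cumulative count would do the same work in one pass.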
Data Preparation
- How much historical data to use?
- Feature engineering and selection
- Prevent leakage: use only data that is available at prediction time
- Check data for discrepancies
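The leakage point above can be made concrete with a point-in-time feature: when building a training row for a transaction, only events strictly before the transaction's timestamp are counted. This is a minimal sketch under assumed names (`trips_before`, the 30-day window, and the sample timestamps are all hypothetical), not the feature pipeline from the talk.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def trips_before(trip_times_by_user, user_id, as_of, window=timedelta(days=30)):
    """Count a user's trips in [as_of - window, as_of).

    trip_times_by_user maps user_id -> list of trip timestamps sorted ascending.
    Trips at or after `as_of` are excluded, so the feature never sees
    information that was unavailable when the prediction was made.
    """
    times = trip_times_by_user.get(user_id, [])
    lo = bisect_left(times, as_of - window)  # first trip inside the window
    hi = bisect_left(times, as_of)           # first trip at or after as_of
    return hi - lo


# Hypothetical history for one user.
history = {
    "u1": [
        datetime(2024, 1, 1),
        datetime(2024, 1, 10),
        datetime(2024, 1, 20),
        datetime(2024, 2, 5),
    ]
}

# Feature value for a transaction made exactly at the 20 January trip:
# that trip itself and the later one must not be counted.
count_at_jan_20 = trips_before(history, "u1", datetime(2024, 1, 20))
```

Computing the same function offline for training and online for serving is one way to keep the two consistent.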
Model training
Model deployment
- Preparing features to be available in real-time
- Checking for discrepancies between training and production data
- Internal system for model training and deployment on AWS
- Running the model in shadow mode
-- Number of "special" trips per user for one day; {date} is a template placeholder
SELECT user, COUNT(*)
FROM user_trips
WHERE trip_type = 'special'
AND day = {date}
GROUP BY user
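The shadow-mode bullet above can be sketched as follows, assuming "shadow mode" means the model scores live traffic while the existing rules still make the decision. All names here (`decide`, `rule_engine`, `model`, the `"allow"`/`"deny"` strings, the logger name) are hypothetical and do not describe the internal system mentioned in the talk.

```python
import logging

logger = logging.getLogger("pre_auth_shadow")


def decide(transaction, rule_engine, model, threshold):
    """Return the rule-based decision; log the model's decision for comparison.

    The model never affects the outcome, and a model failure must not
    break the live pre-authorization flow.
    """
    rule_decision = rule_engine(transaction)
    try:
        score = model(transaction)
        model_decision = "deny" if score >= threshold else "allow"
        logger.info(
            "shadow txn=%s rule=%s model=%s score=%.4f",
            transaction.get("id"), rule_decision, model_decision, score,
        )
    except Exception:
        # Swallow and log: shadow scoring is best effort.
        logger.exception("shadow model failed for txn=%s", transaction.get("id"))
    return rule_decision


# Toy stand-ins for the real components.
def toy_rules(txn):
    return "allow"


def toy_model(txn):
    return 0.9


def broken_model(txn):
    raise RuntimeError("feature store unavailable")


live_decision = decide({"id": "t1"}, toy_rules, toy_model, threshold=0.5)
fallback_decision = decide({"id": "t2"}, toy_rules, broken_model, threshold=0.5)
```

The logged pairs of rule and model decisions can later be compared to estimate how the model would have behaved before it is allowed to act.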
Results
Contacts