Step 0: Drop variables with a high missing rate (> 80%): troponin_high, procalcitonin, fibrinogen, troponin_normal
Step 1: Naively impute missing data points of each variable using functional PCA {fdapace}
Step 2: Drop the rows with the most (originally) missing values, and record the proportion of rows dropped for each patient (pdrop)
Step 3: Put NAs back in the CRP variable where it was missing.
Step 4: Fit a model for CRP with Leukocytes, Albumin and pdrop as predictors (mixed-effects model, XGBoost, Amelia II), using the rows where CRP is observed
Step 5: Use the fitted model to predict the missing CRP values.
Step 6: Repeat Steps 3–5 separately for each of the other variables with missing data (Leukocytes and Albumin).
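
The steps above can be sketched in plain Python as follows. This is only an illustration of the data flow, not the pipeline itself: a column-mean fill stands in for the functional PCA imputation of Step 1, and an ordinary least-squares fit stands in for the mixed-effects / XGBoost / Amelia II models of Step 4. All function names, the "patient" key and the max_missing cutoff are assumptions made for the example; the steps do not state how many missing values make a row eligible for dropping.

```python
from collections import defaultdict

def drop_sparse_columns(rows, cols, threshold=0.8):
    """Step 0: keep columns whose missing rate does not exceed `threshold`."""
    return [c for c in cols
            if sum(r[c] is None for r in rows) / len(rows) <= threshold]

def mean_fill(rows, cols):
    """Step 1 stand-in (NOT functional PCA): fill gaps with the column mean.
    Also returns, per row, the set of columns that were originally missing."""
    means = {}
    for c in cols:
        obs = [r[c] for r in rows if r[c] is not None]
        means[c] = sum(obs) / len(obs)
    mask = [{c for c in cols if r[c] is None} for r in rows]
    filled = [{**r, **{c: means[c] for c in m}} for r, m in zip(rows, mask)]
    return filled, mask

def drop_sparse_rows(rows, mask, max_missing, id_key="patient"):
    """Step 2: drop rows with more than `max_missing` (hypothetical cutoff)
    originally missing values; attach the per-patient dropped share as pdrop."""
    total, dropped = defaultdict(int), defaultdict(int)
    for r, m in zip(rows, mask):
        total[r[id_key]] += 1
        dropped[r[id_key]] += len(m) > max_missing
    kept = [(dict(r, pdrop=dropped[r[id_key]] / total[r[id_key]]), m)
            for r, m in zip(rows, mask) if len(m) <= max_missing]
    return [r for r, _ in kept], [m for _, m in kept]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda k: abs(M[k][i]))
        M[i], M[p] = M[p], M[i]
        for k in range(n):
            if k != i:
                f = M[k][i] / M[i][i]
                M[k] = [a - f * c for a, c in zip(M[k], M[i])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_ols(X, y):
    """Step 4 stand-in: least squares with intercept via the normal equations."""
    Z = [[1.0] + list(x) for x in X]
    k = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(A, b)

def impute_target(rows, mask, target, predictors):
    """Steps 3-5: treat `target` as missing again wherever it was originally
    missing, fit on the rows where it was observed, and predict the rest."""
    train = [r for r, m in zip(rows, mask) if target not in m]
    beta = fit_ols([[r[p] for p in predictors] for r in train],
                   [r[target] for r in train])
    return [beta[0] + sum(b * r[p] for b, p in zip(beta[1:], predictors))
            if target in m else r[target]
            for r, m in zip(rows, mask)]
```

For Step 6, the same call would be repeated with "leuk" and then "alb" (hypothetical column names) as the target, each time using the naively filled values of the other variables plus pdrop as predictors. Note that the least-squares stand-in fails if a predictor is constant in the training rows, for example if no rows were dropped and pdrop is 0 for every patient.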