NMDP [002 SP_B]

Vidhi Lalchand, Ph.D.

 IMU Biosciences

March 7, 2026

[203 donors]

x \in \mathbb{R}^{d_{\text{main}}} \text{ where } d_{\text{main}} \approx 2.3k
z_o \in \mathbb{R}^{d_{\text{ord}}} \text{ where } {d_{\text{ord}}} = 5
z_n \in \mathbb{R}^{d_{\text{nom}}} \text{ where } {d_{\text{nom}}} = 3

The data blocks are cell ratios, ordinal covariates, and nominal covariates:

  • \(x\): the immune features (cell ratios), log1p-transformed and scaled.

  • \(z_o\): embedding of the ordinal features ["cytoelnnew_ord", "drigp_ord", "hctcigp_ord", "drcmvpr_ord", "age"].

  • \(z_n\): one-hot encoding of the nominal features ["amltype", "donor_group", "drsexmatch"].
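A minimal sketch of the two preprocessing steps named above, on made-up data. The slides say only that the immune features are "log1p transformed and scaled", so z-score standardization is an assumption here. The helper names `log1p_standardize` and `one_hot` and the category labels are hypothetical.

```python
import numpy as np

def log1p_standardize(x_train, x_new, eps=1e-8):
    """Apply log1p, then z-score using statistics from x_train only (assumed scaling)."""
    t_train = np.log1p(x_train)
    mu = t_train.mean(axis=0)
    sd = t_train.std(axis=0)
    return (np.log1p(x_new) - mu) / (sd + eps)

def one_hot(labels, categories):
    """One indicator column per category, in the order given by `categories`."""
    index = {c: j for j, c in enumerate(categories)}
    out = np.zeros((len(labels), len(categories)))
    for i, lab in enumerate(labels):
        out[i, index[lab]] = 1.0
    return out

# Toy data (made up): 4 donors, 3 non-negative cell-ratio features.
rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=1.0, size=(4, 3))
x_scaled = log1p_standardize(x, x)

# Hypothetical labels for a single nominal covariate with three levels.
z_n = one_hot(["a", "b", "a", "c"], categories=["a", "b", "c"])
print(x_scaled.shape, z_n.shape)
```

In a cross-validated setting, the scaling statistics would normally be computed on the training fold and reused on the validation fold, which is why the helper takes `x_train` and `x_new` separately.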

The model consists of three components:


(i) a nonlinear MLP applied to the main features,
(ii) direct linear contributions from the covariates,
(iii) an ordinal–nominal interaction term.

 

\text{logit} = \underbrace{f_{\text{MLP}}(x_{\mathrm{immune}})}_{\text{non-linear function of immune features}} + \underbrace{\beta_o^\top z_o + \beta_n^\top z_n}_{\text{linear contribution term of ordinal and nominal covariates}} + \underbrace{\alpha (z_o^\top W z_n)}_{\text{interaction effect}}
p(y=1 \mid x_{\text{immune}}, z_o, z_n) = \sigma\!\left( \text{logit} \right)
\sigma(t) = \frac{1}{1 + e^{-t}}

where

\text{logit} = \underbrace{f_{\text{MLP}}(x_{\mathrm{immune}})}_{\text{non-linear function of immune features}} + \underbrace{\beta_o^\top z_o + \beta_n^\top z_n}_{\text{linear contribution term of ordinal and nominal covariates}} + \underbrace{\alpha (z_o^\top W z_n)}_{\text{interaction effect}}

Parameters: the weights of the MLP, the linear weights for the covariates \(\beta_o, \beta_n\), the interaction scale \(\alpha\), and the interaction matrix \(W\) of size \(d_{\text{ord}} \times d_{\text{nom}}\).
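The three components can be sketched as a single forward pass. This is an illustration under stated assumptions, not the author's implementation: ReLU activations are assumed, layer normalization and dropout are omitted, \(d_{\text{main}}\) is shrunk to 8 for the toy example, and all parameter names in the dictionary are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def mlp(x, p):
    """Two hidden layers (ReLU assumed), one scalar output per row."""
    h = np.maximum(x @ p["W1"] + p["b1"], 0.0)
    h = np.maximum(h @ p["W2"] + p["b2"], 0.0)
    return (h @ p["W3"] + p["b3"]).squeeze(-1)

def logit(x, z_o, z_n, p):
    """f_MLP(x) + beta_o^T z_o + beta_n^T z_n + alpha * z_o^T W z_n, row by row."""
    linear = z_o @ p["beta_o"] + z_n @ p["beta_n"]
    # Row-wise bilinear form z_o^T W z_n.
    interaction = p["alpha"] * np.einsum("io,on,in->i", z_o, p["W"], z_n)
    return mlp(x, p) + linear + interaction

# Toy dimensions; only d_ord = 5 and d_nom = 3 follow the slides.
rng = np.random.default_rng(0)
n, d_main, d_ord, d_nom = 4, 8, 5, 3
p = {
    "W1": rng.normal(scale=0.1, size=(d_main, 64)), "b1": np.zeros(64),
    "W2": rng.normal(scale=0.1, size=(64, 32)), "b2": np.zeros(32),
    "W3": rng.normal(scale=0.1, size=(32, 1)), "b3": np.zeros(1),
    "beta_o": rng.normal(size=d_ord),
    "beta_n": rng.normal(size=d_nom),
    "W": rng.normal(size=(d_ord, d_nom)),
    "alpha": 1.0,
}
x = rng.normal(size=(n, d_main))
z_o = rng.normal(size=(n, d_ord))
z_n = rng.normal(size=(n, d_nom))

prob = sigmoid(logit(x, z_o, z_n, p))  # p(y = 1 | x, z_o, z_n) for each row
print(prob.shape)
```

Setting `alpha` to zero switches the interaction term off and leaves the MLP and linear terms unchanged, which is one way to read the role of the interaction scale.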

Parameter tensor            Number of elements
W_int                       50
interaction_scale           1
ln_main.weight              2314
ln_main.bias                2314
fc1.weight                  148096
fc1.bias                    64
fc2.weight                  2048
fc2.bias                    32
fc3.weight                  32
fc3.bias                    1
ln_ord.weight               5
ln_ord.bias                 5
ord_head.weight             5
ord_head.bias               1
nom_head.weight             10
nom_head.bias               1
interaction_head.weight     10
interaction_head.bias       1
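The listed counts are consistent with a 2314 → 64 → 32 → 1 MLP, as the short check below shows. The shapes are inferred from the counts and are an assumption, not taken from the model code. In particular, `W_int 50` together with `nom_head.weight 10` suggests the one-hot nominal block is 10 columns wide, so that the interaction matrix would be 5 × 10.

```python
from math import prod

# Shapes inferred from the listed element counts (assumed, not read from the code).
shapes = {
    "fc1.weight": (64, 2314),
    "fc1.bias": (64,),
    "fc2.weight": (32, 64),
    "fc2.bias": (32,),
    "fc3.weight": (1, 32),
    "fc3.bias": (1,),
    "W_int": (5, 10),
}

# Number of elements per tensor is the product of its dimensions.
counts = {name: prod(shape) for name, shape in shapes.items()}
for name, count in counts.items():
    print(f"{name:12s} {count}")
```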
Fold 1/5

Epoch 010 | Train Loss: 0.937 | Val Loss: 1.054 | Val Acc: 0.500 | Val Acc Positive: 0.333 | LR: 0.000100
Epoch 020 | Train Loss: 0.931 | Val Loss: 1.047 | Val Acc: 0.481 | Val Acc Positive: 0.417 | LR: 0.000070
Epoch 030 | Train Loss: 0.812 | Val Loss: 1.039 | Val Acc: 0.481 | Val Acc Positive: 0.417 | LR: 0.000070
Epoch 040 | Train Loss: 0.824 | Val Loss: 1.036 | Val Acc: 0.481 | Val Acc Positive: 0.417 | LR: 0.000070
Epoch 050 | Train Loss: 0.788 | Val Loss: 1.034 | Val Acc: 0.486 | Val Acc Positive: 0.417 | LR: 0.000070
Fold 1 AUC: 0.475
Fold 1 ACC: 0.564
Fold 1 ACC Positive: 0.500

Fold 2/5

Epoch 010 | Train Loss: 0.901 | Val Loss: 0.862 | Val Acc: 0.542 | Val Acc Positive: 0.583 | LR: 0.000100
Epoch 020 | Train Loss: 0.838 | Val Loss: 0.850 | Val Acc: 0.542 | Val Acc Positive: 0.583 | LR: 0.000100
Epoch 030 | Train Loss: 0.930 | Val Loss: 0.839 | Val Acc: 0.583 | Val Acc Positive: 0.667 | LR: 0.000100
Epoch 040 | Train Loss: 0.792 | Val Loss: 0.832 | Val Acc: 0.625 | Val Acc Positive: 0.750 | LR: 0.000100
Epoch 050 | Train Loss: 0.768 | Val Loss: 0.830 | Val Acc: 0.667 | Val Acc Positive: 0.750 | LR: 0.000100
Fold 2 AUC: 0.756
Fold 2 ACC: 0.718
Fold 2 ACC Positive: 0.750

Fold 3/5

Epoch 010 | Train Loss: 0.966 | Val Loss: 0.924 | Val Acc: 0.593 | Val Acc Positive: 0.667 | LR: 0.000100
Epoch 020 | Train Loss: 0.946 | Val Loss: 0.929 | Val Acc: 0.593 | Val Acc Positive: 0.667 | LR: 0.000100
Epoch 030 | Train Loss: 0.818 | Val Loss: 0.940 | Val Acc: 0.574 | Val Acc Positive: 0.750 | LR: 0.000100
Epoch 040 | Train Loss: 0.835 | Val Loss: 0.954 | Val Acc: 0.537 | Val Acc Positive: 0.750 | LR: 0.000100
Epoch 050 | Train Loss: 0.734 | Val Loss: 0.959 | Val Acc: 0.477 | Val Acc Positive: 0.750 | LR: 0.000100
Fold 3 AUC: 0.559
Fold 3 ACC: 0.538
Fold 3 ACC Positive: 0.750

Fold 4/5

Epoch 010 | Train Loss: 1.064 | Val Loss: 0.982 | Val Acc: 0.614 | Val Acc Positive: 1.000 | LR: 0.000100
Epoch 020 | Train Loss: 1.058 | Val Loss: 0.979 | Val Acc: 0.569 | Val Acc Positive: 1.000 | LR: 0.000100
Epoch 030 | Train Loss: 0.935 | Val Loss: 0.974 | Val Acc: 0.542 | Val Acc Positive: 1.000 | LR: 0.000100
Epoch 040 | Train Loss: 0.912 | Val Loss: 0.974 | Val Acc: 0.542 | Val Acc Positive: 0.909 | LR: 0.000100
Epoch 050 | Train Loss: 0.906 | Val Loss: 0.981 | Val Acc: 0.542 | Val Acc Positive: 0.909 | LR: 0.000100
Fold 4 AUC: 0.576
Fold 4 ACC: 0.474
Fold 4 ACC Positive: 0.818

Fold 5/5

Epoch 010 | Train Loss: 1.063 | Val Loss: 1.023 | Val Acc: 0.500 | Val Acc Positive: 0.333 | LR: 0.000100
Epoch 020 | Train Loss: 1.004 | Val Loss: 1.011 | Val Acc: 0.500 | Val Acc Positive: 0.333 | LR: 0.000100
Epoch 030 | Train Loss: 0.875 | Val Loss: 0.999 | Val Acc: 0.481 | Val Acc Positive: 0.333 | LR: 0.000070
Epoch 040 | Train Loss: 0.890 | Val Loss: 0.994 | Val Acc: 0.522 | Val Acc Positive: 0.333 | LR: 0.000049
Epoch 050 | Train Loss: 0.784 | Val Loss: 0.995 | Val Acc: 0.564 | Val Acc Positive: 0.333 | LR: 0.000049
Fold 5 AUC: 0.593
Fold 5 ACC: 0.658
Fold 5 ACC Positive: 0.333

MLP_w_Dropout (5-fold CV) - Mean of all folds
Threshold: 0.49 - AUC: 0.592 ± 0.092
Threshold: 0.49 - ACC: 0.590 ± 0.087
Threshold: 0.49 - ACC Positive: 0.630 ± 0.184
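
The summary lines can be recomputed from the per-fold values printed in the log. In this sketch the ± values are taken as the population standard deviation (ddof = 0) across the five folds, which reproduces the reported figures to three decimals; the dictionary name `fold_metrics` is hypothetical.

```python
from statistics import fmean, pstdev

# Per-fold values copied from the log above.
fold_metrics = {
    "AUC": [0.475, 0.756, 0.559, 0.576, 0.593],
    "ACC": [0.564, 0.718, 0.538, 0.474, 0.658],
    "ACC Positive": [0.500, 0.750, 0.750, 0.818, 0.333],
}

# Mean and population standard deviation across folds.
summary = {name: (fmean(vals), pstdev(vals)) for name, vals in fold_metrics.items()}
for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.3f} ± {std:.3f}")
```

With only five folds and a fold-to-fold AUC spread of about 0.09, the mean AUC of 0.592 carries considerable uncertainty.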
\operatorname{logit}\!\bigl(\Pr(y_i = 1 \mid \mathbf{x}_i, \mathbf{u}_i, \mathbf{v}_i)\bigr) = \sum_{b=1}^{B} f_b\!\left(\mathbf{x}_i^{(b)}\right) + \boldsymbol{\beta}_{\mathrm{nom}}^\top \mathbf{u}_i \\ + \boldsymbol{\beta}_{\mathrm{ord}}^\top \mathbf{v}_i + \sum_{b=1}^{B} \mathbf{u}_i^\top \mathbf{g}_b^{\mathrm{nom}}\!\left(\mathbf{x}_i^{(b)}\right) + \sum_{b=1}^{B} \mathbf{v}_i^\top \mathbf{g}_b^{\mathrm{ord}}\!\left(\mathbf{x}_i^{(b)}\right) + b_0

Let the immune features be partitioned into \(B\) parent-subtree blocks:

\mathbf{x}_i = \bigl(\mathbf{x}_i^{(1)}, \ldots, \mathbf{x}_i^{(B)}\bigr)

where \(\mathbf{x}_i^{(b)}\) is the feature vector for subtree \(b\). Let:

  • \(\mathbf{u}_i\) = nominal covariate block

  • \(\mathbf{v}_i\) = ordinal covariate block

where:

  • \(f_b(\mathbf{x}_i^{(b)})\) is the main nonlinear effect of subtree \(b\)

  • \(\mathbf{g}^{\mathrm{nom}}_b(\mathbf{x}_i^{(b)})\) is a vector-valued function giving the interaction of subtree \(b\) with each nominal covariate

  • \(\mathbf{g}^{\mathrm{ord}}_b(\mathbf{x}_i^{(b)})\) is the analogous interaction function for ordinal covariates
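
The block-wise logit can be sketched as follows. The slides do not specify the functional form of \(f_b\) or the \(\mathbf{g}_b\) functions, so this example uses small random tanh maps as placeholders. The block sizes, the helper names `scalar_map` and `vector_map`, and all toy values are assumptions for illustration.

```python
import numpy as np

def scalar_map(d_in, rng):
    """Placeholder f_b: maps an (n, d_in) block to one value per sample."""
    a = rng.normal(scale=0.3, size=d_in)
    return lambda xb: np.tanh(xb @ a)

def vector_map(d_in, d_out, rng):
    """Placeholder g_b: maps an (n, d_in) block to an (n, d_out) array."""
    A = rng.normal(scale=0.3, size=(d_in, d_out))
    return lambda xb: np.tanh(xb @ A)

def block_logit(x_blocks, u, v, f, g_nom, g_ord, beta_nom, beta_ord, b0):
    """Block main effects + linear covariate terms + block-by-covariate interactions."""
    out = u @ beta_nom + v @ beta_ord + b0
    for b, xb in enumerate(x_blocks):
        out = out + f[b](xb)                          # f_b(x^(b))
        out = out + np.sum(u * g_nom[b](xb), axis=1)  # u^T g_b^nom(x^(b))
        out = out + np.sum(v * g_ord[b](xb), axis=1)  # v^T g_b^ord(x^(b))
    return out

# Toy setup: B = 3 blocks of made-up sizes, 6 samples.
rng = np.random.default_rng(0)
n, block_dims, d_nom, d_ord = 6, [4, 3, 5], 3, 5
x_blocks = [rng.normal(size=(n, d)) for d in block_dims]
u = rng.normal(size=(n, d_nom))
v = rng.normal(size=(n, d_ord))
f = [scalar_map(d, rng) for d in block_dims]
g_nom = [vector_map(d, d_nom, rng) for d in block_dims]
g_ord = [vector_map(d, d_ord, rng) for d in block_dims]
beta_nom = rng.normal(size=d_nom)
beta_ord = rng.normal(size=d_ord)
b0 = 0.1

logits = block_logit(x_blocks, u, v, f, g_nom, g_ord, beta_nom, beta_ord, b0)
print(logits.shape)
```

Because every term is attached to a single block \(b\), the contribution of each subtree to the logit can be read off separately.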
