Yuan Meng
NV Search DSML
query-to-taxonomy classification
(hierarchical, as opposed to flat, classification of search queries into a taxonomy)
"chicken"
"Pet Care"
"Meat & Poultry"
"Cats"
"Dogs"
"Cat Food"
"Dry Cat Food"
"Wet Cat Food"
L1
L2
L3
L4
"Poultry"
"Chicken"
"Chicken Breast"
"Chicken Thigh"
why not flat classification? 🤔
- inconsistency: predictions can violate the hierarchy, e.g., "Features" >> "Sports" (concrete check below)
- huge number of classifiers: a local approach needs one classifier per node or per level, which doesn't scale to large taxonomies
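To make the inconsistency point concrete, here is a hypothetical check (PARENT and is_consistent are illustration only, not from HiAGM): nothing stops a flat multi-label model from predicting a child without its parent:

# hypothetical consistency check for a flat multi-label prediction
PARENT = {"Cats": "Pet Care", "Cat Food": "Cats", "Wet Cat Food": "Cat Food"}

def is_consistent(predicted):
    # every predicted label's parent must also be predicted
    return all(PARENT[label] in predicted for label in predicted if label in PARENT)

print(is_consistent({"Pet Care", "Cats", "Cat Food"}))  # True
print(is_consistent({"Pet Care", "Cat Food"}))          # False: "Cats" is missing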
HiAGM: Hierarchy-Aware Global Model for Hierarchical Text Classification (Zhou et al., ACL 2020)
HiAGM has three components:

1. TextRCNN: encodes the text input
2. structure encoder: aggregates label information along the taxonomy
3. fusion method: combines 1 & 2, in one of two ways
   - inductive fusion: hierarchy-aware multi-label attention (HiAGM-LA)
   - deductive fusion: hierarchy-aware text feature propagation (HiAGM-TP)

the TextRCNN stacks three stages:
- RNN: contextual info
- CNN: n-gram features
- top-k max pooling: key information
def forward(self, inputs, seq_lens):
    # RNN over the embedded tokens -> contextual features
    text_output, _ = self.rnn(inputs, seq_lens)
    text_output = self.rnn_dropout(text_output)
    # (batch, seq_len, hidden) -> (batch, hidden, seq_len) for Conv1d
    text_output = text_output.transpose(1, 2)
    # top k max pooling after CNN: keep the k strongest n-gram activations
    topk_text_outputs = []
    for conv in self.convs:
        convolution = F.relu(conv(text_output))
        topk_text = torch.topk(convolution, self.top_k)[0].view(
            text_output.size(0), -1
        )
        topk_text = topk_text.unsqueeze(1)
        topk_text_outputs.append(topk_text)
    return topk_text_outputs

snippet from the TextEncoder class (code)
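To see the shapes, here is a standalone sketch of one conv + top-k pooling step; the dimensions (hidden size 300, 100 filters, k = 4) are placeholders, not the paper's config:

import torch
import torch.nn as nn
import torch.nn.functional as F

text_output = torch.randn(2, 300, 50)     # (batch, hidden, seq_len), already transposed
conv = nn.Conv1d(in_channels=300, out_channels=100, kernel_size=3)
convolution = F.relu(conv(text_output))   # (2, 100, 48)
top_k = 4
topk_text = torch.topk(convolution, top_k)[0].view(text_output.size(0), -1)
print(topk_text.shape)                    # torch.Size([2, 400]): k strongest activations per filter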
structure encoder:
- before learning: assign priors to the parent-child edges (estimated from label frequencies in the training set)
- during learning: a node's representation informs both its parent and its children (bottom-up and top-down message passing)
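A minimal sketch of the prior idea, assuming the prior for an edge parent → child is estimated from label co-occurrence counts in the training set (the repo's exact formula may differ):

from collections import Counter

# toy gold label paths from a hypothetical training set
paths = [
    ("Pet Care", "Cats", "Cat Food"),
    ("Pet Care", "Cats", "Cat Food"),
    ("Pet Care", "Dogs"),
    ("Meat & Poultry", "Poultry", "Chicken"),
]
counts = Counter(label for path in paths for label in path)

def edge_prior(parent, child):
    # P(child | parent) from label frequencies
    return counts[child] / counts[parent]

print(edge_prior("Pet Care", "Cats"))  # 2/3: two of three "Pet Care" samples continue to "Cats"
print(edge_prior("Pet Care", "Dogs"))  # 1/3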
def _soft_attention(text_f, label_f):
    # text_f: (batch, K, text_dim); label_f: (N, label_dim), text_dim == label_dim
    att = torch.matmul(text_f, label_f.transpose(0, 1))             # (batch, K, N)
    weight_label = functional.softmax(att.transpose(1, 2), dim=-1)  # (batch, N, K)
    label_align = torch.matmul(weight_label, text_f)                # (batch, N, text_dim)
    return label_align

snippet from the soft-attention step of HiAGM-LA (code)
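A dummy-shape run (sizes invented for illustration, reusing _soft_attention from the snippet above) shows what the soft attention returns: one text vector per taxonomy label:

import torch
from torch.nn import functional

text_f = torch.randn(2, 8, 300)   # (batch, K pooled text vectors, text_dim)
label_f = torch.randn(11, 300)    # (N labels, label_dim), label_dim == text_dim

label_align = _soft_attention(text_f, label_f)
print(label_align.shape)          # torch.Size([2, 11, 300])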
main results

- ablations compare different component choices (structure encoder, fusion method)
- winner: deductive fusion (HiAGM-TP) with a GCN structure encoder
- analysis: depth of the GCN vs. depth of the taxonomy