Supervised Knowledge May Hurt Novel Class Discovery Performance
Ziyun Li 1, Jona Otholt 1, Ben Dai 2, Di Hu 3,
Christoph Meinel 1, Haojin Yang 1
1 Hasso Plattner Institute, 2 Chinese University of Hong Kong, 3 Renmin University of China
NCD Background
Novel class discovery (NCD) is a machine learning task focused on discovering new classes in unlabeled data, given a labeled set whose classes are disjoint from the novel ones.
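Concretely, the NCD data setup can be sketched as follows (toy NumPy data; the feature dimensions and class split are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy features: 6 classes, 50 samples each (illustrative data).
X = rng.normal(size=(300, 16))
y = np.repeat(np.arange(6), 50)

# NCD split: classes 0-3 are "known" (labeled set);
# classes 4-5 are "novel", and their labels are hidden at training time.
known_classes = [0, 1, 2, 3]
labeled_mask = np.isin(y, known_classes)

X_l, Y_l = X[labeled_mask], y[labeled_mask]  # supervised info (X | Y)
X_u = X[~labeled_mask]                       # unsupervised info (X); Y_u stays hidden

# Goal: cluster X_u into the novel classes, optionally transferring
# knowledge learned from (X_l, Y_l).
```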
How can we borrow supervised knowledge and break the category constraint?
NCD: Existing Methods
Vaze et al. (CVPR 2022) Generalized Category Discovery
Fini et al. (ICCV 2021) A Unified Objective for Novel Class Discovery
What makes the implementation of NCD possible?
Supervised info \( \mathbf{X} | Y \)
Unsupervised info \( \mathbf{X} \)
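A minimal sketch of how the two information sources are combined during training, loosely in the spirit of a unified objective: a supervised cross-entropy term on \( \mathbf{X} \mid Y \) plus a pseudo-label term on \( \mathbf{X} \). The hard-argmax pseudo-labels below are a crude stand-in for the balanced (e.g. Sinkhorn / swapped-prediction) pseudo-labels used by actual methods:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ncd_loss(logits_l, y_l, logits_u):
    """Toy combined NCD objective (illustrative, not the exact published loss).

    - Supervised term: cross-entropy on the labeled head against Y.
    - Unsupervised term: cross-entropy against hard pseudo-labels
      (argmax of the unlabeled head), standing in for the balanced
      pseudo-labels used in practice.
    """
    p_l = softmax(logits_l)
    sup = -np.mean(np.log(p_l[np.arange(len(y_l)), y_l] + 1e-12))

    p_u = softmax(logits_u)
    pseudo = p_u.argmax(axis=1)
    unsup = -np.mean(np.log(p_u[np.arange(len(pseudo)), pseudo] + 1e-12))

    return sup + unsup
```

Both terms share one cross-entropy form, which is the appeal of a unified objective: labeled and unlabeled samples flow through the same loss, differing only in where the targets come from.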
NCD: Outline
DL:
- More data is better...
- design a DL architecture
STAT:
- under this kind of assumption you should ...
Motivating question:
Is supervised knowledge always helpful?
NCD: Metric
Suppose we learn a mapping \(\mathbf{p}\) from the training samples.
How do we measure the effectiveness of \(\mathbf{p}\)?
Recall: MMD (maximum mean discrepancy)
Muandet et al. (2017) Kernel Mean Embedding of Distributions: A Review and Beyond
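For two samples \(X = \{x_i\}_{i=1}^n\) and \(Z = \{z_j\}_{j=1}^m\), the biased estimate of the squared MMD under a kernel \(k\) is \(\frac{1}{n^2}\sum_{i,i'} k(x_i, x_{i'}) + \frac{1}{m^2}\sum_{j,j'} k(z_j, z_{j'}) - \frac{2}{nm}\sum_{i,j} k(x_i, z_j)\). A minimal RBF-kernel estimator (the bandwidth choice below is illustrative):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # k(a, b) = exp(-gamma * ||a - b||^2), computed for all pairs.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def mmd2(X, Z, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples X and Z."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Z, Z, gamma).mean()
            - 2.0 * rbf_kernel(X, Z, gamma).mean())
```

The estimate is zero when the two samples coincide and grows as the distributions drift apart, which is what makes it usable as a distribution-level similarity score between the labeled and unlabeled sets.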
Fini et al. (ICCV 2021) A Unified Objective for Novel Class Discovery
Yet, in practice, \(Y_u\) is unknown...
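One practical workaround when \(Y_u\) is unknown is to replace it with cluster assignments on the unlabeled features, giving the "pseudo" variant of the metric. The minimal k-means below is an illustrative stand-in for whatever clustering an actual pipeline would use:

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal k-means producing pseudo-labels for the unlabeled set."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Pseudo Y_u: cluster assignments stand in for the unknown novel labels,
# and can then feed a class-wise similarity such as the (pseudo) transfer flow.
```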
NCD: Benchmark
Conclusion: Semantic Similarity and Accuracy are consistent, which supports the proposed benchmark.
Conclusion: Semantic Similarity, Accuracy, and the (pseudo) transfer flow are consistent, which supports the proposed metric.
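The hierarchy-based benchmark construction can be sketched with a toy class tree (the tree and the distance choice below are illustrative stand-ins for the ImageNet/WordNet hierarchy): labeled/unlabeled class pairs drawn from nearby subtrees give high semantic similarity, distant subtrees give low similarity.

```python
# Toy class hierarchy: child -> parent (illustrative, not the real ImageNet tree).
PARENT = {
    "husky": "dog", "beagle": "dog", "dog": "mammal",
    "tabby": "cat", "siamese": "cat", "cat": "mammal",
    "sedan": "car", "suv": "car", "car": "vehicle",
    "mammal": "entity", "vehicle": "entity",
}

def ancestors(c):
    """Path from a class up to the root."""
    path = [c]
    while c in PARENT:
        c = PARENT[c]
        path.append(c)
    return path

def tree_distance(a, b):
    """Hops from a to b through their lowest common ancestor."""
    pa, pb = ancestors(a), ancestors(b)
    common = next(x for x in pa if x in pb)
    return pa.index(common) + pb.index(common)

print(tree_distance("husky", "beagle"))  # nearby subtree: easy transfer -> 2
print(tree_distance("husky", "sedan"))   # distant subtree: hard transfer -> 6
```

Sampling labeled/unlabeled splits at increasing tree distance yields benchmark settings of increasing difficulty.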
NCD: Supervised Info May Hurt
Conclusion: supervised information with low semantic relevance to the unlabeled set may hurt NCD performance.
Conclusion: the pseudo transfer flow can serve as a practical reference for deciding what sort of labeled data to use in NCD.
Application: Data Selection
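A sketch of the data-selection idea (the function and the example scores are hypothetical; any class-wise semantic-similarity estimate, such as the pseudo transfer flow, could supply them):

```python
def select_labeled_classes(class_scores, budget):
    """Keep only the labeled classes whose (pseudo) similarity score
    to the unlabeled set is highest; drop the rest before NCD training.

    class_scores: dict mapping labeled class name -> similarity score
    budget: number of labeled classes to keep
    """
    ranked = sorted(class_scores, key=class_scores.get, reverse=True)
    return ranked[:budget]

# Example: prefer semantically close classes as the source of supervision.
scores = {"wolf": 0.81, "truck": 0.12, "fox": 0.74, "table": 0.05}
print(select_labeled_classes(scores, budget=2))  # -> ['wolf', 'fox']
```

The same ranking can drive data combining: instead of dropping low-score classes outright, weight or mix labeled sources according to their similarity to the unlabeled set.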
Application: Data Combining
Contributions
- We find that using supervised knowledge from the labeled set may lead to suboptimal performance on NCD datasets with low semantic similarity between the labeled and unlabeled sets. Based on this finding, we propose two practical methods and achieve ∼3% and ∼5% improvements on CIFAR-100 and ImageNet compared to the state of the art.
- We introduce a theoretically reliable metric to measure the semantic similarity between labeled and unlabeled sets. Mutual validation between the proposed metric and the benchmark shows that the metric strongly agrees with NCD performance.
- We establish a comprehensive benchmark with varying degrees of difficulty based on ImageNet by leveraging its hierarchical semantic similarity.
Thank you!