termex

1. Papers read

2. Tentative workflow designed

3. Code

Review Literature:

Statistical Approaches:

Linguistic/Semantics-based approaches:

Remove stopwords and other unnecessary characters such as punctuation. (DONE)
POS-tag the text and keep only nouns, verbs, and adjectives. (DONE)
Keep the remaining words as candidate domain-relevant labels and manually remove any irrelevant ones. The result becomes the label set for evaluation. (IN PROGRESS)
Extract the actual domain terms and analyze the approaches.
Evaluate the results, then make further corrections.
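
The steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code. The stopword list is a tiny placeholder. The function names (clean_tokens, keep_content_words, precision_recall_f1) are hypothetical. The POS filter assumes the tokens were already tagged with Penn Treebank tags by some external tagger. The choice of set-based precision, recall, and F1 against the manual label set is also an assumption about how evaluation might be done.

```python
import string

# Placeholder stopword list (assumption); a real run would use a fuller list.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "or",
             "is", "are", "to", "for", "with"}

# Penn Treebank tag prefixes for nouns, verbs, and adjectives.
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ")


def clean_tokens(text):
    """Lowercase, strip punctuation from token edges, and drop stopwords."""
    tokens = (t.strip(string.punctuation) for t in text.lower().split())
    # Tokens that were pure punctuation are empty after stripping; drop them.
    return [t for t in tokens if t and t not in STOPWORDS]


def keep_content_words(tagged_tokens):
    """Keep words from (word, tag) pairs tagged as noun, verb, or adjective."""
    return [word for word, tag in tagged_tokens
            if tag.startswith(CONTENT_TAG_PREFIXES)]


def precision_recall_f1(extracted, gold):
    """Compare extracted terms with the manual label set as plain sets."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Stripping punctuation only at token edges keeps hyphenated candidates such as "graph-based" intact. Exact set matching is the simplest scoring choice and treats multi-word or inflected variants as misses.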

Future Work

Explore further approaches: graph-based, statistical, linguistic, and aspect-based.