Dimitrina Zlatkova (AI)
Daniel Kopev (AI)
Atanas Atanasov (AI)
SemEval-2018
Data: So this happened :) with my girls @user and erinblonshine. ️ #29 #sacredhearttattoo @ Sacred…
Label:
red_heart
two_hearts
blue_heart
purple_heart
camera_with_flash
camera
Original: So this happened :) with my girls @user and erinblonshine. ️ #29 #sacredhearttattoo @ Sacred…
Pattern replace: So this happened __smile__ with my girls @user and erinblonshine. ️ #29 #sacredhearttattoo @ Sacred…
Tokenize: ['so', 'this', 'happened', '__smile__', 'with', 'my', 'girls', 'and', 'erinblonshine', 'sacred', 'heart', 'tat', 'too', 'sacred']
Char filter: So this happened __smile__ with my girls and erinblonshine#sacredhearttattoo Sacred
Stop words: ['happened', 'smile', 'girls', 'erinblonshine', 'sacred', 'heart', 'tat', 'sacred']
Lemmatize: ['happen', 'smile', 'girl', 'erinblonshine', 'sacred', 'heart', 'tat', 'sacred']
POS Tagger: [('so','RB'),('this','DT'),('happened','VBD'),('smile','NN'), ('with','IN'),('my','PRP$'), ('girls','NNS'),('and','CC'), ('erinblonshine','NN'),('sacred','JJ'),('heart','NN'),('tat','VB'),('too', 'RB'),('sacred','VBD')]
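The first preprocessing steps can be sketched with the standard library alone. This is a minimal illustration, not the authors' code: the emoticon map, the stop-word list and the regular expressions are hypothetical, the character filter is assumed to run before tokenization, and hashtag segmentation, lemmatization and POS tagging are left out because they need external resources.

```python
import re

# Hypothetical emoticon map and stop-word list; the slides do not give the real ones.
EMOTICONS = {":)": "__smile__", ":(": "__sad__"}
STOP_WORDS = {"so", "this", "with", "my", "and", "too"}


def pattern_replace(text):
    """Replace known emoticons with placeholder tokens."""
    for emoticon, placeholder in EMOTICONS.items():
        text = text.replace(emoticon, placeholder)
    return text


def char_filter(text):
    """Drop @mentions and purely numeric hashtags, then strip other symbols."""
    text = re.sub(r"@\w*", " ", text)      # '@user' and a bare '@'
    text = re.sub(r"#\d+\b", " ", text)    # hashtags such as '#29'
    return re.sub(r"[^\w\s#]", " ", text)  # punctuation and emoji residue


def tokenize(text):
    """Lowercase and split into word tokens (underscores count as word characters)."""
    return re.findall(r"\w+", text.lower())


def remove_stop_words(tokens):
    """Drop stop words and strip the underscores around placeholder tokens."""
    return [t.strip("_") for t in tokens if t not in STOP_WORDS]


def preprocess(text):
    return remove_stop_words(tokenize(char_filter(pattern_replace(text))))


print(preprocess("So this happened :) with my girls @user #29"))
```

On the shortened example tweet this yields the same leading tokens as the "Stop words" step above.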
- fun, sun
- fun, sun
- sun
- sun
- sun
- sun
- park
Positive: "pos_0", "pos_.15", "pos_.20", "pos_.27", "pos_.4", "pos_above"
Negative: "neg_0", "neg_.15", "neg_.25", "neg_.35", "neg_.6", "neg_above"
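The bucket names suggest that a continuous sentiment score is discretized into one categorical feature per polarity. A minimal sketch of such a mapping follows; the bounds are read off the feature names, while the rule "first bucket whose upper bound is at least the score" is an assumption, as is the helper name `bucket`.

```python
from bisect import bisect_left

# Upper bounds taken from the feature names; the mapping rule itself is assumed.
POS_BOUNDS = [0.0, 0.15, 0.20, 0.27, 0.4]
POS_NAMES = ["pos_0", "pos_.15", "pos_.20", "pos_.27", "pos_.4", "pos_above"]
NEG_BOUNDS = [0.0, 0.15, 0.25, 0.35, 0.6]
NEG_NAMES = ["neg_0", "neg_.15", "neg_.25", "neg_.35", "neg_.6", "neg_above"]


def bucket(score, bounds, names):
    """Return the name of the first bucket whose upper bound is >= score."""
    return names[bisect_left(bounds, score)]


print(bucket(0.1, POS_BOUNDS, POS_NAMES))  # pos_.15
print(bucket(0.9, NEG_BOUNDS, NEG_NAMES))  # neg_above
```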
^010011000 | got qot gott g0t gotz qott gottt gawt ghot gotcho goht ggot |
^111010100010 | lmao lmfao lmaoo lmaooo lool rofl loool lmfaoo lmfaooo lmaoooo |
^111010100011 | haha hahaha hehe hahahaha hahah aha hehehe ahaha hah hahahah hahaa ahah |
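Rows like these map spelling variants to a shared bit-string path, so a classifier can treat "haha" and "hahah" as the same signal. The sketch below turns tokens into cluster-prefix features. It only holds the three rows shown above, and the prefix lengths and the `cluster_` feature naming are hypothetical choices, not taken from the slides.

```python
# The three cluster rows from the slide; a real cluster file holds many more.
CLUSTERS = {
    "010011000": "got qot gott g0t gotz qott gottt gawt ghot gotcho goht ggot",
    "111010100010": "lmao lmfao lmaoo lmaooo lool rofl loool lmfaoo lmfaooo lmaoooo",
    "111010100011": "haha hahaha hehe hahahaha hahah aha hehehe ahaha hah hahahah hahaa ahah",
}

# Invert the table: word -> bit-string path of its cluster.
WORD_TO_PATH = {
    word: path for path, words in CLUSTERS.items() for word in words.split()
}


def cluster_features(tokens, prefix_lengths=(4, 8, 12)):
    """Return one feature per (token cluster, prefix length); unknown tokens are skipped."""
    features = set()
    for token in tokens:
        path = WORD_TO_PATH.get(token)
        if path is None:
            continue
        for length in prefix_lengths:
            features.add("cluster_" + path[:length])
    return features


print(sorted(cluster_features(["haha", "rofl"], prefix_lengths=(11, 12))))
```

Shorter prefixes merge neighbouring clusters: with an 11-bit prefix, the "lmao" and "haha" rows fall into the same feature.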
Model | Precision | Recall | F1 Macro (%) |
Multinomial Naive Bayes | 0.05 | 0.21 | 1.763 |
Logistic Regression with L-BFGS | 0.22 | 0.28 | 13.16 |
MLP, 2 hidden layers, ReLU | 0.26 | 0.26 | 17.898 |
Random Forest (50 estimators) | 0.20 | 0.26 | 16.167 |
SVM, tf-idf | 0.23 | 0.27 | 19.554 |
SVM, Twitter embeddings | 0.16 | 0.18 | 8.522 |
AdaBoost, Extra Tree base | 0.15 | 0.19 | 7.825 |
SVM+AdaBoost+Random Forest | 0.25 | 0.24 | 13.764 |
SVM+AdaBoost+MLP | 0.25 | 0.28 | 20.106 |
(10k train, 1k test)
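For reference, macro-averaged scores can be computed as below. This is a plain-Python sketch of the standard definition (per-class scores averaged with equal weight); whether the numbers in the table were produced exactly this way is not stated in the slides. Because the averaging happens per class, the macro F1 is generally not the harmonic mean of the macro precision and macro recall.

```python
def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1) over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


p, r, f1 = macro_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"])
print(round(p, 3), round(r, 3), round(f1, 3))
```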
Model | Precision | Recall | F1 Macro (%) |
CNN | 0.15 | 0.14 | 12.034 |
RNN with LSTM | 0.24 | 0.17 | 13.106 |
HANN | 0.30 | 0.13 | 15.999 |
(10k train, 1k test)
Model | Precision | Recall | F1 Macro (%) |
SVM, tf-idf | 0.30 | 0.33 | 23.3 |
HANN | 0.30 | 0.13 | 22.518 |
(488k train, 50k test)
valentine, loveofmylife, heart full, heart
cool kid, sunglasses, coolin, shade, cool, sunglass
ti season, christmastree, tree, christmas tree, merry christmas, merry, christmas
pretty pink, breast, pink, breast cancer
daze, beachin, sunshine state, fun sun, sunny day, sun, sunny, sunshine
veteran day, murica, veteran, america, ivoted, election, merica, vote, usa
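Keyword groups like these could be used as binary features: one flag per group, set when any of its keywords occurs in the tweet. The sketch below is an assumption about how such lists might be applied. The group names (`kw_heart`, `kw_cool`, and so on) are hypothetical, since the slides do not name the groups; the keywords are copied as listed.

```python
# Keyword lists copied from the slide; the group names are hypothetical.
KEYWORD_GROUPS = {
    "kw_heart": ["valentine", "loveofmylife", "heart full", "heart"],
    "kw_cool": ["cool kid", "sunglasses", "coolin", "shade", "cool", "sunglass"],
    "kw_christmas": ["ti season", "christmastree", "tree", "christmas tree",
                     "merry christmas", "merry", "christmas"],
    "kw_pink": ["pretty pink", "breast", "pink", "breast cancer"],
    "kw_sun": ["daze", "beachin", "sunshine state", "fun sun", "sunny day",
               "sun", "sunny", "sunshine"],
    "kw_usa": ["veteran day", "murica", "veteran", "america", "ivoted",
               "election", "merica", "vote", "usa"],
}


def keyword_features(tokens):
    """Map each group name to True if any of its keywords appears as whole tokens."""
    # Pad with spaces so that multi-word keywords match only on token boundaries.
    padded = " " + " ".join(tokens) + " "
    return {
        name: any(f" {keyword} " in padded for keyword in keywords)
        for name, keywords in KEYWORD_GROUPS.items()
    }


print(keyword_features(["merry", "christmas", "tree"]))
```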
23.3