Automatic Text Summarization
Luis Manuel Román García
ITAM
Contents
- Problem description
- Extractive/Abstractive text summarization
- Progress
Problem Description
Automatic text summarization is the task of producing a concise, fluent summary of a text while preserving its key information and overall meaning.
Automatic text summarization is challenging because human summarizers usually read a text in full to build an understanding of it, and only then write a summary highlighting its main points.
Because computers lack human knowledge and language capability, automatic text summarization is a difficult, non-trivial task.
In general, there are two approaches to automatic summarization: extraction and abstraction.
Extractive summarization methods work by identifying important sections of the text and reproducing them verbatim; the summary therefore consists only of sentences taken from the original text.
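As an illustration of the extractive idea, here is a minimal sketch that scores each sentence by the average corpus frequency of its words and returns the top sentences verbatim, in their original order. The function names, the tiny stopword list, and the naive sentence splitter are assumptions made for this example, not part of the system described in these slides.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that", "on", "for"}


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def extractive_summary(text, n_sentences=1):
    sentences = split_sentences(text)
    # Word frequencies over the whole text act as a crude importance signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])  # restore original order
    return " ".join(sentences[i] for i in chosen)


text = "Cats sleep a lot. Cats chase mice and cats purr. Dogs bark."
summary = extractive_summary(text, n_sentences=1)
```

Every sentence in the output is copied unchanged from the input, which is the defining property of an extractive method.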
Abstractive summarization methods aim to convey the important material in a new way. In other words, they interpret and analyze the text using advanced natural language techniques in order to generate a new, shorter text that conveys the most critical information from the original.
Abstractive Summarization
Extractive/Abstractive Text Summarization
Summarization Methods
- Intermediate representations
  - Topic representation
    - Word frequency
    - Latent semantic analysis
  - Indicator representation
    - Sentence length
    - Position in text
    - Presence of named entities
Extractive Methods
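The indicator representation listed above can be sketched as a small feature vector per sentence. This is a minimal sketch under stated assumptions: the feature names, the weights, and the capitalization heuristic standing in for a real named-entity recognizer are all hypothetical choices for illustration.

```python
import re


def indicator_features(sentences):
    """Return one feature dict per sentence (feature names are hypothetical)."""
    n = len(sentences)
    features = []
    for i, sentence in enumerate(sentences):
        tokens = re.findall(r"\w+", sentence)
        # Crude stand-in for NER: capitalized tokens that do not start the sentence.
        entity_like = [t for t in tokens[1:] if t[0].isupper()]
        features.append({
            "length": len(tokens),           # sentence length in tokens
            "position": 1.0 - i / n,         # earlier sentences get a higher value
            "has_entity": bool(entity_like), # presence of an entity-like token
        })
    return features


def score(feature, weights=(0.1, 1.0, 0.5)):
    # Linear combination with arbitrary illustrative weights (an assumption).
    w_len, w_pos, w_ent = weights
    return (w_len * feature["length"]
            + w_pos * feature["position"]
            + w_ent * feature["has_entity"])


sentences = [
    "Police in Aguascalientes stopped a suicide attempt.",
    "It happened on Tuesday.",
    "more details later",
]
feats = indicator_features(sentences)
```

In practice the weights would be learned from labeled summaries rather than fixed by hand.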
Summarization Methods
- Complex NLP models
  - Bayesian networks
  - Pointer networks
  - Sequence to sequence
Abstractive Methods
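One way pointer mechanisms are combined with sequence-to-sequence models in summarization is the pointer-generator variant, where the decoder mixes a distribution over a fixed vocabulary with a copy distribution over the source tokens. Whether this is the variant meant in the list above is an assumption; the sketch below only shows the mixing step, with all probabilities supplied by hand rather than produced by a trained model.

```python
def mix_generate_and_copy(p_gen, vocab_dist, attention, source_tokens):
    """Combine generation and copy probabilities for one decoding step.

    p_gen         -- probability of generating from the vocabulary (0..1)
    vocab_dist    -- dict mapping vocabulary words to probabilities (sums to 1)
    attention     -- attention weights over the source tokens (sums to 1)
    source_tokens -- source tokens aligned with the attention weights
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must be in [0, 1]")
    # Generation part: scale the vocabulary distribution by p_gen.
    final = {word: p_gen * p for word, p in vocab_dist.items()}
    # Copy part: add (1 - p_gen) * attention mass to each source token,
    # which lets out-of-vocabulary source words receive probability.
    for token, weight in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


final = mix_generate_and_copy(
    p_gen=0.5,
    vocab_dist={"the": 0.5, "man": 0.5},
    attention=[0.25, 0.75],
    source_tokens=["man", "Aguascalientes"],
)
```

Here "Aguascalientes" is absent from the vocabulary but still ends up with non-zero probability because it can be copied from the source.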
Progress
- Multiple text, multiple source approach
Modules
- Advanced location identifier
- NER
- Event disambiguation
- Semantic similarity
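The slides do not say how the semantic similarity module is implemented. As a purely lexical stand-in, the sketch below compares two titles with cosine similarity over accent-stripped bag-of-words counts; the function names are hypothetical, and a real module would more likely rely on embeddings or a similar semantic representation.

```python
import math
import re
import unicodedata
from collections import Counter


def normalize(text):
    # Lowercase and strip accents so that accented and unaccented forms match.
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def bag_of_words(text):
    return Counter(re.findall(r"[a-z0-9]+", normalize(text)))


def cosine_similarity(a, b):
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


similarity = cosine_similarity(
    "Ya son 126 suicidios en Aguascalientes",
    "Se supera la cifra de suicidios en Aguascalientes",
)
```

Titles that share many words score close to 1, and titles with no words in common score 0.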
{'geometry': {'type': 'Point',
              'coordinates': [-102.29171, 21.866713]},
 'type': 'Feature',
 'properties': {
     'box': 80,
     'doc_type': 'news',
     'dates': ['2017-11-01 03:56:22', '2017-11-01 00:00:00', '2017-11-03 00:00:00'],
     'url': [u'http://www.noticieroelcirco.mx/policias-municipales-impidieron-que-se-suicidara-un-joven-en-aguascalientes/',
             u'http://www.hidrocalidodigital.com/local/articulo.php?idnota=132067',
             u'http://www.noticieroelcirco.mx/ya-son-126-suicidios-en-el-ano-en-aguascalientes-hombre-se-ahorco-en-su-casa/'],
     'sources': [u'Noticiero el Circo', u'Hidrocalido Digital', u'Noticiero el Circo'],
     'titles': [u'\xa1Polic\xedas municipales impidieron que se suicidara un joven en Aguascalientes! \u2013 Noticiero El Circo',
                u'Se supera la cifra hist\ufffdrica de suicidios en Aguascalientes',
                u'\xa1Ya son 126 suicidios en el a\xf1o en Aguascalientes: hombre se ahorc\xf3 en su casa! \u2013 Noticiero El Circo'],
     'words': ['circo', 'hombre', 'ahorco', 'noticiero', 'suicidios', 'ano', 'supera', 'impidieron', 'histrica', 'cifra', 'policias', 'municipales', 'suicidara', 'joven']}}
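A record like the one above follows the GeoJSON Feature layout, with one entry per grouped document in each list under 'properties'. The sketch below shows one way such a record could be read; the helper name and the returned keys are hypothetical, the example record is a trimmed copy of the one above, and treating the lists as parallel (one entry per document) is an assumption.

```python
from datetime import datetime


def describe_event(feature):
    """Summarize a grouped-event record (helper name and output keys are hypothetical)."""
    props = feature["properties"]
    dates = [datetime.strptime(d, "%Y-%m-%d %H:%M:%S") for d in props["dates"]]
    # GeoJSON stores coordinates as [longitude, latitude].
    lon, lat = feature["geometry"]["coordinates"]
    return {
        "n_documents": len(props["sources"]),     # assumes one source entry per document
        "n_sources": len(set(props["sources"])),  # distinct outlets
        "first_seen": min(dates),
        "last_seen": max(dates),
        "lon": lon,
        "lat": lat,
    }


# Trimmed copy of the record shown above (only the fields used here).
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-102.29171, 21.866713]},
    "properties": {
        "dates": ["2017-11-01 03:56:22", "2017-11-01 00:00:00", "2017-11-03 00:00:00"],
        "sources": ["Noticiero el Circo", "Hidrocalido Digital", "Noticiero el Circo"],
    },
}
info = describe_event(feature)
```

For this record the three documents come from two distinct outlets and span 2017-11-01 to 2017-11-03.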
State of the App
Text summarization
By Luis Roman
Text summarization
- 1,354