Aspect/sentiment-aware review summarization (Seminal)

Opinion summarization has traditionally been approached with unsupervised, weakly supervised, and few-shot learning techniques.

[Yang 18] instead studies abstractive summarization in an end-to-end manner, without hand-crafted features or templates, using an encoder-decoder framework with multi-factor attention. Specifically, a mutual attention mechanism interactively learns representations of context, sentiment, and aspect words within reviews, acting as the encoder. The learned representations are incorporated into the decoder, which generates summaries via an attention fusion network. In addition, the summarizer is jointly trained with a text categorization task, which helps learn a category-specific text encoder, locating salient aspect information and capturing variations in style and wording across text categories.
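To make the fusion idea concrete, here is a minimal toy sketch: dot-product attention is computed separately over three factor representations (context, sentiment, aspect), and the resulting context vectors are combined with fixed gate weights. The vectors, dimensions, and gate values are illustrative assumptions, not the paper's actual parameterization.

```python
import math

def attend(query, keys):
    # dot-product attention: softmax over scores, weighted sum of keys
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    dim = len(keys[0])
    return [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]

def fuse(query, context_h, sentiment_h, aspect_h, gates=(0.5, 0.25, 0.25)):
    # attention fusion (toy): gate-weighted sum of the three factor contexts;
    # in the paper the fusion weights would be learned, not fixed
    parts = [attend(query, hs) for hs in (context_h, sentiment_h, aspect_h)]
    dim = len(parts[0])
    return [sum(g * p[d] for g, p in zip(gates, parts)) for d in range(dim)]
```

At each decoding step, `fuse` would supply the decoder with a single context vector that blends what the three factor-specific encoders attend to.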

[Elsahar 20] proposes a self-supervised setup that treats an individual document as the target summary for a set of similar documents. This setting makes training simpler than previous approaches, relying only on the standard log-likelihood loss and mainstream models. Hallucination is addressed through control codes that steer generation towards more coherent and relevant summaries. Benchmarks on two English datasets against graph-based and recent neural abstractive unsupervised models show that the proposed method generates summaries of superior quality, with high sentiment and topic alignment with the input reviews. This is confirmed by a human evaluation that focuses explicitly on the faithfulness of generated summaries.
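The self-supervised pair construction can be sketched as a leave-one-out procedure: each review serves as a pseudo-summary, and its most similar neighbours, prefixed with control codes, form the multi-document input. The token-overlap similarity and the control-code tokens below are assumptions for illustration, not the paper's exact choices.

```python
def jaccard(a, b):
    # token-overlap similarity between two reviews
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / max(1, len(sa | sb))

def make_pairs(reviews, k=2, codes=("<ent_hotel>", "<pol_pos>")):
    # leave-one-out pseudo pairs: each review acts as the target summary,
    # its k most similar neighbours (prefixed with control codes) as input
    pairs = []
    for i, target in enumerate(reviews):
        others = [r for j, r in enumerate(reviews) if j != i]
        inputs = sorted(others, key=lambda r: -jaccard(r, target))[:k]
        src = " ".join(codes) + " " + " </s> ".join(inputs)
        pairs.append((src, target))
    return pairs
```

A standard sequence-to-sequence model trained with log-likelihood on such pairs then learns to compress several reviews into one, with the codes available at inference to steer generation.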

[Amplayo 21] allows generation of customized summaries based on aspect queries (e.g., describing the location or rooms of a hotel). Using a review corpus, a synthetic training dataset of (review, summary) pairs is created, enriched with aspect controllers induced by a multi-instance learning (MIL) model that predicts the aspects of a document at different levels of granularity. A pretrained model is fine-tuned on this dataset, and aspect-specific summaries are generated by modifying the aspect controllers.
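A toy sketch of the controller idea: here a rule-based seed-lexicon tagger stands in for the MIL model, inducing aspect codes that are prepended to the input. The lexicon, the aspect names, and the code format are all illustrative assumptions.

```python
# Hypothetical seed lexicon standing in for the MIL aspect predictor.
ASPECT_SEEDS = {
    "location": {"location", "near", "walk", "downtown"},
    "room": {"room", "bed", "clean", "bathroom"},
    "service": {"staff", "service", "friendly", "desk"},
}

def tag_aspects(sentence):
    # tag a sentence with every aspect whose seed words it mentions
    words = set(sentence.lower().replace(".", "").split())
    return sorted(a for a, seeds in ASPECT_SEEDS.items() if words & seeds)

def make_controlled_pair(reviews, summary_sentence):
    # prepend the induced aspect codes to the concatenated reviews,
    # pairing them with the aspect-specific target sentence
    aspects = tag_aspects(summary_sentence)
    codes = " ".join(f"<{a}>" for a in aspects)
    return f"{codes} " + " </s> ".join(reviews), summary_sentence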

[Brazinskas 21] collects a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training. However, the number of reviews per product is large (320 on average), making summarization, and especially training a summarizer, impractical. Moreover, the content of many reviews is not reflected in human-written summaries, so a summarizer trained on random review subsets hallucinates. To deal with both challenges, the task is formulated as jointly learning to select informative subsets of reviews and to summarize the opinions expressed in those subsets. The choice of review subset is treated as a latent variable, predicted by a small and simple selector and then fed into a more powerful summarizer. For joint training, amortized variational inference and policy gradient methods are used.
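A toy sketch of the selection signal: a selector parameterized by per-review logits samples a review mask (the latent variable), a stand-in reward measures how well the selected subset covers the summary tokens, and a REINFORCE-style update adjusts the logits. This is a drastic simplification of the paper's amortized variational training; every function here is an illustrative assumption.

```python
import math
import random

def reward(subset, summary_tokens, reviews):
    # stand-in reward: fraction of summary tokens covered by selected reviews
    covered = set()
    for i in subset:
        covered |= set(reviews[i].split())
    return len(covered & summary_tokens) / max(1, len(summary_tokens))

def reinforce_step(logits, reviews, summary, rng, lr=0.5):
    # sample an independent-Bernoulli review mask from the selector logits
    summary_tokens = set(summary.split())
    probs = [1.0 / (1.0 + math.exp(-l)) for l in logits]
    subset = [i for i, p in enumerate(probs) if rng.random() < p]
    r = reward(subset, summary_tokens, reviews)
    # score-function (REINFORCE) update: d log p(mask) / d logit_i,
    # scaled by the reward of the sampled subset
    for i, p in enumerate(probs):
        grad = (1.0 - p) if i in subset else -p
        logits[i] += lr * r * grad
    return r
```

In the paper the reward signal comes from the summarizer itself and the selector is trained jointly with it; the sketch only shows why a subset-valued latent variable calls for policy-gradient-style estimators.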

[Yang 18] Aspect and Sentiment Aware Abstractive Review Summarization

[Elsahar 20] Self-Supervised and Controlled Multi-Document Opinion Summarization

[Amplayo 21] Aspect-Controllable Opinion Summarization

[Brazinskas 21] Learning Opinion Summarizers by Selecting Informative Reviews
