You talk what you read: Understanding News Comment Behavior
by Dispositional and Situational Attribution
Abstract
Many news comment mining studies are based on the assumption that comment is explicitly linked to the corresponding news. In this paper, we observed that users’ comments are also heavily influenced by their individual characteristics embodied by the interaction history. Therefore, we position to understand news comment behavior by considering both the dispositional factors from news interaction history, and the situational factors from corresponding news. A three-part encoder-decoder framework is proposed to model the generative process of news comment. The resultant dispositional and situational attribution contributes to understanding user focus and opinions, which are validated in applications of reader-aware news summarization and news aspect-opinion forecasting.
1 Introduction
Increasingly more people express their opinions on news articles through online services recently, such as news portals and microblogs. The resulting vast number of comments clearly reflect thoughts and feelings of individuals. Mining these comments thus has important applications with practical socio-political and economical benefits. Hou et al. (2017) Boltužić and Šnajder (2014)
Existing related work has explored comment data in different scenarios. One typical line of studies Pontiki et al. (2016) Peng et al. (2020) Yan et al. (2021) concentrated on Sentiment Analysis tasks, especially Aspect-based Sentiment Analysis(ABSA) which aims to identify the aspect term, its corresponding sentiment polarity, and the opinion term. Another research line was based on the interaction between news and comments. A fundamental assumption for these studies is that comments have clear correspondence with certain aspects of the corresponding news. Based on the explored news-comments correspondence, Hou et al. (2017) aligned comments to news topics, which improves readers’ news browsing experience, Yang et al. (2020) introduced a new task which leverages reading and commenting history to predict a user’s future opinions to unseen news, and researchers also developed automatic news commenting algorithms to encourage user engagement and interactions Qin et al. (2018). For example, Wang et al. (2021) incorporated reader-aware factors to generat diversified comments. Li et al. (2019) modeled the news as a topic interaction graph to capture the main point of the article, which enhances the correspondence between generated comments and news.
However, sometimes comments not explicitly link to the corresponding news. Table 1(top) illustrates an example of news and associated comments that Wuhan’s patients with COVID-19 in the hospital are fully recovered.
Title: Live screen! The last batch of COVID-19 patients in Wuhan have been discharged from hospital. |
Body: “Finally!” On April 26, a COVID-19 patient surnamed Ding was discharged from Wuhan Pulmonary Hospital, and #all hospitalized COVID-19 patients in Wuhan were cleared#. Netizen: The day of wuhan’s recovery is the Chinese New Year. |
Comment: |
User-1: Congratulations! This is a memorable day! |
User-2: The last batch of Jiangsu Medical teams to aid Hubei went home. Thank you. |
User-3: Great China! |
Partial news-comment history of User-2: |
news: Jiangsu launched a level 1 public health emergency response to prevent the spread of the virus. |
comment: Each student has been screened in my daughter’s school today. |
\hdashline[2.5pt/5pt] \hdashline[2.5pt/5pt] news: The second group of medical workers from Jiangsu province relay to Wuhan. |
comment: Salute to the most beautiful people. |
\hdashline[2.5pt/5pt] \hdashline[2.5pt/5pt] news: Four cases of pneumonia caused by the COVID-19 were confirmed in Jiangsu province, all with recent travel history to Wuhan. |
comment: Suzhou is the first city in Jiangsu province to find (confirmed cases). |
The comments about “memorable day” and “China” link entities to the corresponding news, however, the comments about “Jiangsu” from user-2 has no explicit correspondence. By retrieving user-2’s news interaction (i.e., reading and commenting) history as shown in Table 1 (bottom), we find that he/she heavily concerns about the topics related to “Jiangsu”, which gives rise to the above comment combining the news’ topics on COVID-19 and user-2’s individual focus on “Jiangsu”.
To further investigate whether the above phenomenon is common, we study how comment entities distribute in corresponding news and their interaction history on 85,179 comments from 1,275 users on NetEase News111https://news.163.com/. Table 2 shows the distribution of comments with respect to the entity linking. We observe that 34% comments have no key entities clearly linked to the corresponding news, among which nearly 2/3 (20%) appear only in users’ interaction history. The result demonstrates that user comment is related to not only the corresponding news but also user’s individual characteristics embodied by the interaction history.
Inspired by this, in this paper, we position to understand news comment behavior by modeling both user’s interaction history and the corresponding news. According to the attribution theory Heider (2013) Heider and Simmel (1944), human behavior attribution can be divided into dispositional attribution(e.g. emotions, attitudes, abilities, etc.) and situational attribution(e.g. event or external pressure etc.). Intuitively, in the news comments scenario, mining interaction history and the corresponding news contribute to dispositional and situational attribution respectively. Based on the above analysis and conclusion, we develop a three-part generative framework named DS-Attributor to understand news comment behavior by Dispositional and Situational Attribution. The first part is Dispositional Factor Encoder to model individual characteristics with both aspect and opinion user topic preferences. The second part is Situational Factor Encoder exploiting the user-derived aspect topics from dispositional factor to detect the focused aspects of specific news. Finally, the mined opinion topics of dispositional factor and the detected situational factor are integrated into the Dynamic Comment Decoder module to generate comments.
Entities of comments appear in | Percentage |
---|---|
only corresponding news | 21% |
corresponding news & history | 55% |
only history | 20% |
neither | 14% |
Contributions. We summarize the main contributions of this paper as follows:
-
•
We position the problem of understanding news comment behavior by both situational and dispositional attribution.
-
•
We propose a novel encoder-decoder framework to model the comment generation process by combining the comment history and corresponding news.
-
•
The resultant dispositional and situational attribution is validated to enable applications like news aspect-opinion forecasting and reader-aware news summarization.
2 Notations and Problem Definition
Our goal is to understand news comment behavior by dispositional and situational attribution through a generative framework. Specifically, given a user, the model needs to use his/her historical comments to mine dispositional factors and detect the situational factors from a specific piece of news, then generates comment with the mined dispositional and situational factors. Let denote a set of users. For each user , assume to include all comments posted before timestep . The comment denotes a sequence of words as , where , and is the number of words in . Let denotes a piece of news which haven’t been read by before as includes news title and sentences of news body. Based on the above notations, we formally define the problem as follows:
Problem 1(Dispositonal and Situational Comment Attribution) Given the historical comments of user and a specific piece of news , the goal of Dispositonal and Situational Comment Attribution is: (1) to mine dispositional factor which includes preferences regarding both aspect and opinion topics from , (2) to detect situational factor from news , (3) to generate comment on news based on dispositional and situational factors.

3 Methodology
We present the overall framework DS-Attributor in Figure 1, which includes three main modules. The Dispositional Factor Mining Encoder aims at modeling dispositional factors involving with both aspect and opinion topic preferences , from users’ historical comments (see Section 3.1). In Situational Factor Encoder, given aspect topic preference and news , the goal is to get representation for each news sentence, and measure the corresponding importance by a weighted aspect vetor (see Section 3.2). Finally, in Dynamic Comment Decoder, users opinion vector is obtained and incorporated with and of each sentence to generate the observed comment (see Section 3.3). We will elaborate the details of each module below.
3.1 Dispositional Factor Encoder
In this subsection, our goal is to mine dispositional factor from users’ historical comments. The comment is mainly composed of aspect and opinon terms Pontiki et al. (2016), for example, in the sentence Great China!, the aspect term is “China”, and the opinion term is “Great”. Therefore, we model the dispositional factor as the user’s preferences of aspect and opinion.
Comment Disentanglement. We first pretrain a Comment Disentanglement module based on Neural Topic Model Dieng et al. (2020), which can be used to extract aspect and opinion topic distributions from the comment. In addition, the aspect topic vectors and opinion topic vectors also can be obtained, where is the dimension of topic vetors, and are numbers of aspect and opinion topic respectively. Specifically, as shown in the left of Figure 1, for each comment, we employ Bag-of-Words(BOW) feature vector to represent it. Then we use two parallel VAE-based structures to reconstruct aspect BOW target and opinion BOW target respectively. and are defined as:
(4) | ||||
(8) |
where , , are elements of , , respectively. During inference, given a comment BOW feature vector, we can get both aspect topic distribution and opinion topic distribution .
Historical Aspect-Opinion Modeling. For each user , the historical sequences of both aspect and opinion topic distributions can be obtained by performing comment disentanglement on the historical comments . Denote , as the derived historical sequences of aspect and opinion topic distributions. We introduce two different LSTMs to process the above distribution sequences and output user preferences of aspect topic and opinion topic respectively. Specifically, the -th hidden states are given by:
(9) | ||||
(10) |
After recursive updating, we encode user’s propositional preference in user-aspect topic and user-opinion topic .
3.2 Situational Factor Encoder
In this subsection, our goal is to detect situational factor from the news. Since not all sentences contribute equally to motivate users to comment, we introduced an attention-based method using weighted aspect vector to measure importance of news sentences with respect to a specific user.
Hierarchical News Encoder. Firstly, the news is embedded as by a hierachical news encoder. Assume the news contains sentences and each sentence contains words. with represents the words in the th sentence. Given a sentence, we first use Bi-LSTM to get word embeddings from both directions for words as:
(11) |
An attention mechanism Vaswani et al. (2017) is introduced to aggregate the representation of those informative words to form a sentence vector as follows:
(12) | ||||
(13) | ||||
(14) |
To get the news embedding, we aggregate the sentence vectors by attentive pooling as:
(15) | ||||
(16) | ||||
(17) |
Importance Measurement. Since reflects the user’s aspect preference for news content, we employ it to analyse the importance of sentences in the news. Specifically, we first take and as inputs to predict aspect topic distribution , and the weighted aspect vector is then calculated as:
(18) | ||||
(19) |
where is the th aspect vector obtained from Comment Disentanglement module. Note that in the training stage, the true aspect topic distribution is available. During inference, we predict aspect topic distribution by Eqn. (19). So in order to learn during the training stage, a KL term is added into the final loss function:
(20) |
Similarly, =D_KL(^z_s——z_s)^z_sg_i ∈[0,1]
3.3 Dynamic Comment Decoder
Considering the opinion topic preference, we design a comment decoder to dynamically integrate the opinion vector and the news context vector to generate comments. The formula of the decoder is defined by:
(21) |
where denotes vector concatenation. The comment word is then sampled from output distribution based on the concatenation of decoder state as:
(22) |
During training, cross-entropy loss is employed as the optimization objective. The decoder takes the embedding of the previously decoded word , the context vector and the dynamic opinion state as input to update its state. The context vector is a weighted sum of encoder’s sentence representations calculated by:
(23) | ||||
(24) | ||||
(25) |
The dynamic opinion state is initialized by the opinion vetor which is calculated similar with (see Eqn. (19)), and decays by a certain amount at each time step. This process is described as:
(26) | ||||
(27) | ||||
(28) | ||||
(29) |
where denotes element-wise multiplication. Once the decoding process is completed, the opinion state is expressed on context vector completely, and the comment is generated.
The above three modules are jointly trained with the following overall loss function
(30) |
where , are hyperparameters to balance between different modules. Then we jointly train all components according to Eqn. (30). After dispositional and situational comment attribution, we can obtain , , and the decoder attention to support the following experiments and applications.
4 Experiments
4.1 Experiments Setup
Datasets. Since existing news datasets do not include user interaction history to satisfy situational attribution, we construct a new dataset DS-News from NetEase News, one of the most popular online news platfrom in China. We set 10 random users and crawl users who commented on the same news article using Breadth First Search. For a specific user, we crawl his/her interaction history which consists of a sequence of news-comment pairs. After removing users with too short interaction history, 1,275 examined users are collected with totally 124,918 comments. Table 1(top) visually shows a news comment instance. The statistics of DS-News are summarized as shown in Table 3.
Dataset attributes | number |
---|---|
Total number of users | 1,275 |
Total number of news | 97,937 |
Total number of comments | 124,918 |
Avg. length of user histories | 97.63 |
Avg. number of news words | 382.77 |
Avg. number of comment words | 17.10 |
Compared Methods. In order to evaluate the effectiveness of the proposed DS-Attributor, we implemented the following baselines and DS-Attributor variants for comparison in terms of the news comment generation task.
-
•
Seq2seq(Qin et al. (2018)): this model follows the framework of seq2seq model with attention. We use the title together with the content as input.
-
•
Hierarchical-Attention(Yang et al. (2016)): this model takes all the content sentences as input and applies hierarchical attention as the encoder to get the sentence vectors and document vector. A RNN decoder with attention is applied. The document vector is used as the initial state for RNN decoder.
-
•
Graph2seq(Li et al. (2019)): this model constructs the input news as a topic interaction graph, and it takes the GCN as encoder and the LSTM as decoder to generate news comment.
-
•
DS-Attributor(w/o IM): DS-Attributor without Importance Measurement.
-
•
DS-Attributor(w/o OV): DS-Attributor without integrating opinon vector .
Evaluation protocols. We use BLEU-1, BLEU-2 Papineni et al. (2002), ROUGE-L Lin (2004), CIDErVedantam et al. (2015) and METEOR Banerjee and Lavie (2005) as metrics to evaluate the performance of different models. A popular NLG evaluation tool nlg-eval 222https://github.com/Maluuba/nlg-eval is used to compute these metrics.
Implementation details. For pretraining Comment Disentanglement, we use a vocabulary with the top 20k frequent words in the entire data. The number of aspect topics and opinion topics are set to 40 and 6 respectively. The dimensions of the latent topic vectors are both set to 300. We pretrain the model using Adam Kingma and Ba (2014) with learning rate 0.001. For Historical Aspect-Opinion Modeling, we use different two-layer LSTMs with hidden size 64 to model aspect and opinion topics respectively. For sentence encoder, we use a two-layer Bi-LSTM with hidden size 128. For importance measurement, we employ attention mechanism with hidden size 256. We use a two-layer LSTM with hidden size 512 as decoder. For our method, and are both set to 0.4 333More implementation details and results by tuning the hyperparameters are available in the supplementary material.. The batch size is set to 64. Those parameters are optimized by Adam optimizer with learning rate 0.001 and trained for 200 epochs with learning rate decay.
Methods | BLEU-1 | BLEU-2 | ROUGE-L | METEOR | CIDEr |
---|---|---|---|---|---|
Seq2seq | 0.101 | 0.021 | 0.091 | 0.046 | 0.029 |
Graph2seq | 0.108 | 0.020 | 0.093 | 0.044 | 0.023 |
Hierarchical-Attention | 0.102 | 0.022 | 0.092 | 0.044 | 0.037 |
DS-Attributor(w/o IM) | 0.118 | 0.027 | 0.103 | 0.051 | 0.034 |
DS-Attributor(w/o OV) | 0.121 | 0.027 | 0.094 | 0.053 | 0.034 |
DS-Attributor | 0.125 | 0.029 | 0.108 | 0.054 | 0.039 |
4.2 Quantitative Experimental Results
Quantitative evaluation results are shown in Table 4. The proposed DS-Attributor outperforms the baselines on all 5 evaluation metrics. This demonstrates the advantage of exploring the dispositional factors in modeling news comment behaviors. In Table 6 and Table 7, we illustrate some example aspect and opinion topics discovered from news interaction history. We can see that aspect topics describe different focuses and interests of users, and opinion topics help understand user sentimental preferences. Among the baseline methods, Hierarchical-Attention generally performs better than Seq2seq and Graph2seq. A possible reason is that Hierarchical-Attention captured and aggregated the key information in the news through hierarchical attention mechanism.
On all 5 evaluation metrics, DS-Attributor achieved superior performance than the two variants, showing the contribution of important sentence measurement and opinion integration. Key observations include: (1) The performance of DS-Attributor(w/o IM) decreased significantly on BLEU and METEOR, which indicates that leveraging weighted aspect vetor is beneficial to remove irrelevant information and detect users’ focused aspects of specific news. (2) When opinion vector is removed, DS-Attributor(w/o OV) performs poorly on ROUGE-L and CIDEr, which shows that mining users’ opinion preference does provide prior information to understand the sentimental tendency and thus help accurately predict comment reaction.
Title: A college in Wuhan apologizes for the requisition of student dormitories: Improper disposal of items will be compensated |
Body: The college issued a letter of apology on February 10 in response to the requisition of students’ dormitories. The college received a notice from the city government on February 7, and then requisitioned some dormitories as COVID-19 medical isolation sites by February 9. The college apologized for the improper disposal of students’ belongings and promised to compensate students for any loss of belongings after verification and disinfect the dormitories in the next semester. Experts: medical support. In recent days, a number of university dormitories in Wuhan have been requisitioned as quarantine observation points in response to the COVID-19 outbreak. For students’ personal belongings, many schools said they would be sealed up for special storage. |
Seq2Seq: 我就想知道是什么时候的? (I just want to know when?) |
Graph2Seq: 我觉得这就是在黑,因为我觉得是个什么原因 (I think it’s slander because I think it’s a reason) |
Hierarchical-Attention: 这是要被封了吗? (Is this going to be blocked?) |
Comment-1: 全国学校都是一个样,都是血泪了 Schools all over the country are the same, sad |
Comment-2: 什么时候可以开学? When can I go to school? |
Comment-3: 大逆不道!我想见宿舍! Outrageous! I want to see the dorm! |
Topic No. | Topic words |
---|---|
Aspect 1 | fan, star, hero, entertainment |
Aspect 4 | player, football, champion, fans, team |
Aspect 17 | news, society, problem, comments |
Aspect 20 | virus, human, earth, Black, Wuhan |
Aspect 38 | teacher, school, student, university |
Aspect 39 | world, people, politics, protest, danger |
Topic No. | Topic words |
---|---|
Opinion 1 | like, not bad, nice, pretty, delicious |
Opinion 2 | hope, protect, isolate, normal, alive |
Opinion 3 | development, hope, try hard, solve |
Opinion 4 | no, no way, not enough, disbelief |
4.3 Case Study
In order to better understand how dispositional and situational attribution contribute to the comment behavior, as shown in Table 5, we visualize the generated comments for one specific news regarding the requisition of a university in Wuhan as a medical isolation place. Compared to the baselines only leveraging modeling the situational factors, DS-Attributor can generate appropriate comments according to different users considering the dispositional factors mined from their interaction history. It is shown that the generated comments for three example users from DS-Attributor contain diverse and more clear focuses. Regarding comment-3 which expresses complaint about the decision of dormintory requisition, we examine the corresponding situational and dispositional factors. In particular, for situational factor, we highlight the news sentence with blue color that users focus most on with the highest attention value . From situational attribution we can see that the user is concerned about dormitories and personal belongings in this news. For dispositional factor, from the aspect and opinion distributions , , we find that this user has the highest preference for Aspect#38 and Opinion#4 . As shown in Table 6 and Table 7, Aspect#38 talks about school, and Opinion#4 indicates negative sentiment. This gives rise to the dissatisfaction in the comment and helps detect the user’s actual focus in the news.
5 Applications
By situational and dispositional attribution, the proposed DS-Attributor can enable applications other than comment generation. In this section, we introduce two possible applications by employing the learned during situational and dispositional attribution.
5.1 News Aspect-Opinion Forcasting
In this subsection, we introduce a useful application aggregating the predicted comments to forecast the audience’s focus and opinion for future news. We will use an example of news to illustrate this application which describes Vietnam exposed a large-scale protest in the streets of the people. Specifically, for the given news, 200 users are selected as test subjects to predict their focused aspect distribution and corresponding opinion distribution . For simplicity, we obtain above two topics distribution of 200 users, and analyse the topics with the highest weights respectively.
We observe that most people concentrated on Aspect#39 which talks about social topics(e.g., politics, pretest, etc.). The keywords of generated comments on Aspect#39 are shown in the Figure 2(left), which are closely related to the news content. As for people’s opinions on Aspect#39, we visualize the opinion distribution in the Figure 2(right). Specifically, most people express positive emotions, such as “protect”, “alive” in Opinion#2 and “hope”, “solve” in Opinion#3, and the others express opposition to this matter(e.g., no, disbelief, etc.). Therefore, DS-Attributor provides the possibility to predict people’s reaction before or at the early stage of news release. By examining the users from certain community, By examining the users from a certain community, we can also support fine-grained aspect-opinion forecast. This will enable timely and effective public opinion management.
5.2 Reader-aware News Summarization
DS-Attributor derives users’ aspect preferences as by-product, which helps understand the subjective focus on news. Therefore, instead of objective news summarization as most current studies conduct by only analyzing the correlation between news sentences, we can exploit the derived user aspect preference to support a novel subjective news summarization. Specifically, we introduce subjective user factors into the traditional objective news summarization solution, by fusing the score (see section 3.2) and the decoder attention (see section 3.3) to update the similarity matrix of standard TextRank Mihalcea and Tarau (2004) as
(31) |
where is cosine similarity of two sentence vectors, is defined as
(32) |
is defined similar to , and , , are coefficients. The final sentence importance score is estimated after performing TextRank. With ROUGE-L as the evaluation metric, we compared three summarization strategies on 100 news: (1) Standard TextRank: to extract top-k sentences as summary without reader factors. (2) Single-user: to randomly select one from 20 users’ top-k results times, and average the ROUGE-L scores. (3) Multi-user: to randomly select users’ summary results and choose sentences by voting each time, repeat times and average the ROUGE-L scores.
The evaluation results of different methods are shown in Figure 3.


From the results, we have the following conclusions: (1) reader-aware summarization strategies show better performance than standard TextRank, because subjective reader factor is useful to extract the news article highlight. (2) the multi-reader strategy achieves superior performance when is small, which shows that the common interest of multiple readers is beneficial to mining the main purpose of the news. Users’ interests disperse as increases, where multi-user strategy obtains close performance to single-user strategy but still clearly outperforms the standard TextRank-based solution. Note that this evaluation is conducted with news title as the ground-truth. In practical scenarios, by exploiting the dispositional preference for a specific individual or group of users, we can develop applications like customized and even personalized news summarization.
6 Conclusion
In this paper, we have proposed an encoder-decoder framework, DS-Attributor, for modeling the comment generation process by combining both situational and dispositional factors. Following this study, we are working towards the following directions: (1) modeling comment attribution with news event, e.g., associating the discovered global aspect topics to the local news event aspects; (2) exploring more applications by employing the derived situational and dispositional factors, e.g., customized news summarization, comment-driven news recommendation.
References
- Banerjee and Lavie [2005] Satanjeev Banerjee and Alon Lavie. Meteor: An automatic metric for mt evaluation with improved correlation with human judgments. In Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization, pages 65–72, 2005.
- Boltužić and Šnajder [2014] Filip Boltužić and Jan Šnajder. Back up your stance: Recognizing arguments in online discussions. In Proceedings of the First Workshop on Argumentation Mining, pages 49–58, 2014.
- Dieng et al. [2020] Adji B Dieng, Francisco JR Ruiz, and David M Blei. Topic modeling in embedding spaces. Transactions of the Association for Computational Linguistics, 8:439–453, 2020.
- Heider and Simmel [1944] Fritz Heider and Marianne L. Simmel. An experimental study of apparent behavior. American Journal of Psychology, 57:243–259, 1944.
- Heider [2013] Fritz Heider. The psychology of interpersonal relations. Psychology Press, 2013.
- Hou et al. [2017] Lei Hou, Juanzi Li, Xiao-Li Li, Jie Tang, and Xiaofei Guo. Learning to align comments to news topics. ACM Transactions on Information Systems (TOIS), 36(1):1–31, 2017.
- Kingma and Ba [2014] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- Li et al. [2019] Wei Li, Jingjing Xu, Yancheng He, Shengli Yan, Yunfang Wu, et al. Coherent comment generation for chinese articles with a graph-to-sequence model. arXiv preprint arXiv:1906.01231, 2019.
- Lin [2004] Chin-Yew Lin. Rouge: A package for automatic evaluation of summaries. In Text summarization branches out, pages 74–81, 2004.
- Mihalcea and Tarau [2004] Rada Mihalcea and Paul Tarau. Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing, pages 404–411, 2004.
- Papineni et al. [2002] Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the annual meeting of the Association for Computational Linguistics, pages 311–318, 2002.
- Peng et al. [2020] Haiyun Peng, Lu Xu, Lidong Bing, Fei Huang, Wei Lu, and Luo Si. Knowing what, how and why: A near complete solution for aspect-based sentiment analysis. In AAAI, 2020.
- Pontiki et al. [2016] Maria Pontiki, Dimitrios Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammad Al-Smadi, Mahmoud Al-Ayyoub, Yanyan Zhao, Bing Qin, Orphée De Clercq, et al. Semeval-2016 task 5: Aspect based sentiment analysis. In International workshop on semantic evaluation, pages 19–30, 2016.
- Qin et al. [2018] Lianhui Qin, Lemao Liu, Victoria Bi, Yan Wang, Xiaojiang Liu, Zhiting Hu, Hai Zhao, and Shuming Shi. Automatic article commenting: the task and dataset. arXiv preprint arXiv:1805.03668, 2018.
- Vaswani et al. [2017] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in neural information processing systems, pages 5998–6008, 2017.
- Vedantam et al. [2015] Ramakrishna Vedantam, C Lawrence Zitnick, and Devi Parikh. Cider: Consensus-based image description evaluation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4566–4575, 2015.
- Wang et al. [2021] Wei Wang, Piji Li, and Hai-Tao Zheng. Generating diversified comments via reader-aware topic modeling and saliency detection. arXiv preprint arXiv:2102.06856, 2021.
- Yan et al. [2021] Hang Yan, Junqi Dai, Xipeng Qiu, Zheng Zhang, et al. A unified generative framework for aspect-based sentiment analysis. arXiv preprint arXiv:2106.04300, 2021.
- Yang et al. [2016] Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. Hierarchical attention networks for document classification. In Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, pages 1480–1489, 2016.
- Yang et al. [2020] Fan Yang, Eduard Dragut, and Arjun Mukherjee. Predicting personal opinion on future events with fingerprints. In Proceedings of the International Conference on Computational Linguistics, pages 1802–1807, 2020.