
OATS: Opinion Aspect Target Sentiment Quadruple Extraction Dataset for Aspect-Based Sentiment Analysis

Abstract

Aspect-based sentiment analysis (ABSA) delves into understanding sentiments specific to distinct elements within a user-generated review. The goal is to determine: a) the target entity being reviewed, b) the high-level aspect to which it belongs, c) the sentiment words used to express the opinion, and d) the sentiment expressed toward the targets and the aspects. In this paper, we present the OATS dataset, which encompasses three fresh domains and consists of 20,000 sentence-level quadruples and 13,000 review-level tuples. Our new corpora address the following gaps: the lack of domain diversity, the limited data for intricate quadruple extraction tasks, and an occasional oversight of the synergy between sentence-level and review-level sentiments. Moreover, to elucidate OATS's potential and shed light on the various ABSA subtasks it can support, we conducted in-domain and cross-domain experiments, establishing initial baselines for the community. This new dataset enables the community to work on a more holistic approach to ABSA, while at the same time contributing to diversifying the domains typically covered in ABSA.

Keywords: ABSA, ABSA Dataset, ASQP, ACOS, ASTE


1.   Introduction

The trend in analyzing online reviews has shifted from extracting a broad understanding of consumer opinions on overall product performance to a more granular examination of individual product aspects. This shift demands a different approach to analyzing reviews. Aspect-based sentiment analysis (ABSA) emerged as the answer to this nuanced requirement, focusing on sentiment pertaining to specific aspects of an entity Hu-2004. However, current datasets often fall short of capturing the complete spectrum of ABSA. The main goal of ABSA is to identify the target of the opinion, the aspect category it belongs to, the opinion phrase, and the sentiment polarity associated with the opinion. One significant limitation is the inability to perform joint detection of all ABSA elements because at least one critical component is missing from the human-annotated reviews. This limitation stunts the potential of ABSA tasks. Despite the SemEval datasets’ popularity, many of their sentences group multiple aspects under a single sentiment polarity. As noted by MAMS-dataset, this approach simplifies the ABSA task, reducing it to mere sentence-level sentiment analysis. As a solution, they proposed a new large-scale Multi-Aspect Multi-Sentiment dataset, where each sentence has at least two independent aspects with different sentiment polarities. However, their practice of introducing a "miscellaneous" category or a neutral sentiment whenever a sentence contains only one opinion tuple makes the task more challenging without making it more realistic.

Recently released ABSA datasets such as ACOS ACOS and ASQP ASQP provide comprehensive annotations for all four elements. However, they are largely limited to the long-standing traditional domains of restaurants and laptops, a trend that has been observed since 2014. In contrast, newer datasets (e.g., DM-ASTE; Dom-Exp-ASTE) address domain diversity by including home appliances, fashion, groceries, and more, moving beyond the typical restaurant and laptop reviews. Yet, these innovative datasets lack the critical aspect category annotations required for a holistic joint detection of all elements.

(a) Inter-annotator agreement F1-scores:
Dataset   | Tgt. Span | Opi. Ph. Span | Asp-Sent | Quadruple
Amazon_FF | 72.57     | 69.72         | 85.43    | 65.78
Coursera  | 78.26     | 71.26         | 79.63    | 68.56
Hotels    | 74.78     | 72.05         | 87.32    | 73.84

(b) Current ASQP dataset statistics:
Dataset | #sents | #pos  | #neg  | #neu | #Total
Rest-15 | 1,562  | 1,710 | 701   | 85   | 2,496
Rest-16 | 2,024  | 2,293 | 877   | 125  | 3,295
Total   | 3,586  | 4,003 | 1,578 | 210  | 5,791
Table 1: (a) Left: Inter-Annotator agreement F1-scores for the OATS Datasets. Tgt. Span: Target span extraction. Opi. Ph. Span: Opinion Phrase span extraction. Asp-Sent: Aspect Category and Sentiment combination categorization. (b) Right: Current ASQP Dataset Statistics

Several crucial insights emerge when delving into the landscape of ABSA datasets sampreeth-ABSA-Datasets-Survey-arxiv. First, ABSA components are defined and structured inconsistently across sources, accentuating an urgent need for a universal, standardized format. Second, accurately detecting opinion words, which are central to discerning sentiment polarity, and linking them to their corresponding aspect terms remains an open challenge. For instance, in the hotel review ’the room was spotless and large, bathroom and kitchen were fine for our needs and the king size bed was great’, the word ’spotless’ indicates a positive sentiment polarity for the aspect ’rooms cleanliness’, while ’large’ pertains to ’rooms design_features’. Correctly associating these opinion words with their respective aspects and targets is essential. Third, the current datasets, while extensive, do not always reflect real-world complexities. Different domains, such as healthcare, education, or entertainment, have unique terminologies, sentiment expressions, and contextual nuances. Moreover, the prevailing focus on English in ABSA datasets sidelines many low-resource languages, neglecting challenges tied to idiomatic expressions, linguistic structures, and cultural contexts inherent to these languages. The need for richer annotations, similar to ASQP, is further emphasized by recent advancements in unified models using generative frameworks T5-ABSA-sampreeth; T5-ABSA-towards-generative, which highlight the potential of models that jointly solve ABSA tasks. Lastly, current ABSA research largely adopts a sentence-centric approach, risking misinterpretation by neglecting the inter-sentence context in reviews. While one might suggest merging sentence-level predictions, this does not remedy the foundational issue: sentiments in one sentence can be influenced by others, and without the full review context, even merged predictions can be off the mark. Although DM-ASTE contains complete reviews of up to 250 words, it addresses only the ASTE task. Thus, there is a clear need for additional review-level ABSA datasets.

Given these challenges, the OATS dataset has been developed with a vision to rectify most of the existing gaps. The contributions of OATS are:

  • Introduces both sentence-level quadruples and review-level tuples, capturing sentiments in all their granularity.

  • The dataset spans multiple domains, ensuring its applicability across a diverse range of ABSA tasks.

  • OATS will be publicly available in two formats: XML (as introduced by 14-dataset, catering to detailed character-level annotations) and Text (following the format set by ASQP).

  • We provide baselines for our dataset, focusing on four primary tasks (ASD, ASTE, TASD, and ASQP) from the 14 ABSA subtasks.

2.   OATS Dataset

In this section, we start by discussing the sources for the OATS dataset. The following subsections discuss the annotation process, including inter-annotator agreement, relevant data statistics, and OATS's relevance to ABSA and the wider NLP community.

(a) Overall statistics:
Domain    | #Revs. | #Sent. | #Rev. Op. | #Sent. Op. | #Total Op.
Amazon_FF | 1,794  | 8,913  | 4,326     | 8,260      | 12,586
Coursera  | 1,702  | 8,278  | 5,350     | 7,875      | 13,225
Hotels    | 1,497  | 7,963  | 7,416     | 11,335     | 18,751
Total     | 4,993  | 25,154 | 17,092    | 27,470     | 44,562

(b) Average statistics:
Stats/Domain            | Amazon_FF   | Coursera    | Hotels
Avg. Sentences/Review   | 4.96        | 4.81        | 5.31
Avg. Length of Sentence | 71.34       | 79.02       | 75.95
Avg. Length of Review   | 359.9       | 391.55      | 405.29
Avg. Opinions/Sentence  | 0.92 (1.25) | 0.95 (1.27) | 1.42 (1.71)
Avg. Opinions/Review    | 2.4 (2.48)  | 3.16 (3.18) | 4.96 (5.37)
Table 2: (a) Left: OATS Dataset Overall Statistics (b) Right: OATS Average Statistics. The values inside the () are the average opinions per sentence/review, excluding those with zero opinions.

We derived three distinct English review datasets from multiple sources. Our corpus is primarily built from text reviews available on Kaggle (consistent with the CC0 and ODbL licenses) and resources provided by BeerAdvocate-TripAdvisor-dataset.

Amazon Fine Foods Dataset

This dataset, referred to as Amazon_FF in this work, is extracted from a Kaggle competition (https://www.kaggle.com/snap/amazon-fine-food-reviews) containing around 500k reviews of fine foods from Amazon. These reviews span topics such as the quality of food or products, promptness of delivery, packaging standards, and product availability, among others. We curated 1,794 complete reviews from this dataset, consisting of over 8,900 sentences and approximately 8,200 opinion quadruples.

Coursera Dataset

Originating from a Kaggle competition, this dataset contains reviews scraped from the Coursera website, totaling close to 100k reviews. These reviews reflect diverse perspectives on the course’s quality, content, comprehensiveness, and the alignment of faculty lessons with course content. For our work, we selected 1,702 comprehensive reviews, which include roughly 8,200 sentences and about 7,800 opinion quadruples.

TripAdvisor Dataset

This dataset is based on data from BeerAdvocate-TripAdvisor-dataset, featuring over 100k hotel reviews. The reviews comment on various facets of the hotel experience, such as pricing, design attributes, and more. We collated 1,497 complete reviews from this extensive collection, resulting in approximately 8,000 sentences and nearly 11,300 opinion quadruples.

2.1.   Annotation Procedure

We relied on Upwork to identify three freelancers for the data annotation and employed the BRAT tool for annotation. We provided a comprehensive guideline document introducing them to the topic, the dataset, the BRAT tool, and the specific annotation requirements. Initially, annotator (A) took the lead in annotating a subset of the data (50 complete reviews), which was subsequently reviewed by another annotator (B) for corrections. This process was iterated twice, with pairs of annotators (A, C) and then (B, C) for checks and balances. We used 150 reviews (50 per annotator) from each domain for this phase of the agreement task. Any emerging disagreements were resolved through consultation with one of our NLP experts, and the resolutions were added as additional instructions for the annotators to follow in future reviews.

This process was repeated multiple times until a satisfactory inter-annotator agreement was reached among the three annotators. The remaining reviews were then divided equally among the three annotators for the main annotation phase. When all three annotators held differing views, a consensus was reached in collaboration with an expert annotator. Gold annotations were generated via majority voting. To ensure high annotation quality, for every 150 reviews, each annotator's work was reviewed by another annotator. If the agreement score fell below the initial baseline, that portion of the data was re-annotated until the score crossed the baseline.

According to not-kappa, the widely used Kappa metric token-kappa1; token-kappa2 may not be the best fit for measuring inter-annotator agreement on span-extraction annotation in textual data. The limitation arises from Kappa's requirement to count the number of negative cases, which is unidentifiable for spans, as they constitute sequences of words without a predetermined number of items to annotate in a text. The F-measure is often more suitable for gauging inter-annotator agreement in span-extraction annotation tasks such as target and opinion phrase extraction IAA-F1. We can calculate the F1-score by treating one annotator's annotations as the reference (gold annotations or ground truth) and the other's as a system's responses (predictions).
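As an illustration, the following is a minimal sketch of this agreement computation, assuming spans are represented as (start, end, label) tuples; the function and variable names are ours, not part of any released tooling:

def span_f1(reference, predicted):
    # Exact-match F1 between two annotators' span sets.
    # One annotator is treated as the reference (gold), the other as predictions.
    ref, pred = set(reference), set(predicted)
    tp = len(ref & pred)  # spans both annotators marked identically
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Target spans from annotator A (reference) vs. annotator B (predictions)
a_spans = [(4, 8, "TARGET"), (21, 29, "TARGET")]
b_spans = [(4, 8, "TARGET"), (35, 42, "TARGET")]
print(round(span_f1(a_spans, b_spans), 2))  # 0.5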

We computed the inter-annotator agreement scores, following this F1 metric, for each domain using several combinations of the ABSA elements, as shown in Table 1 (a). The average quadruple extraction agreement F1 score across the three domains is 69.39%. For the simpler task of aspect category and sentiment polarity identification, the average agreement F1 score is 84.12%. For the more challenging subtasks involving span extraction, such as target and opinion phrase extraction, the average agreement drops to approximately 75% and 71%, respectively. The quadruple score is significant in light of the inherent subjectivity and complexity involved in jointly identifying all four ABSA elements: it highlights the considerable consistency among our annotators while also indicating the task's complexity.

(a) Number of sentences/reviews by opinion count:
          | Sentence-Level                         | Review-Level
Domain    | 0-Op  | 1-Op   | 2-Op  | 3-Op | >3-Op | 0-Op | 1-Op | 2-Op | 3-Op | >3-Op
Amazon_FF | 1,957 | 4,529  | 890   | 170  | 42    | 47   | 175  | 626  | 490  | 183
Coursera  | 1,395 | 3,685  | 690   | 134  | 36    | 8    | 107  | 298  | 349  | 449
Hotels    | 1,454 | 2,910  | 1,167 | 499  | 337   | 116  | 11   | 58   | 138  | 883
Total     | 4,806 | 11,124 | 2,747 | 803  | 415   | 171  | 293  | 982  | 977  | 1,515

(b) Sentiment polarity distribution:
          | Sentence-Level        | Review-Level
Domain    | #pos   | #neg  | #neu | #pos   | #neg  | #neu | #conf
Amazon_FF | 5,577  | 1,187 | 234  | 2,900  | 606   | 74   | 82
Coursera  | 4,403  | 1,008 | 213  | 2,910  | 721   | 129  | 71
Hotels    | 6,952  | 1,207 | 169  | 4,557  | 817   | 110  | 53
Total     | 16,932 | 3,402 | 616  | 10,367 | 2,144 | 313  | 206
Table 3: (a) Left: OATS Dataset Statistics for the total number of sentences/reviews with the respective number of opinions (b) Right: OATS Dataset Statistics for the total number of positive, negative, neutral, and conflict sentiment polarities observed for the aspect categories at Review-Level and the aspect terms, aspect category, and opinion phrase triplets at the Sentence-Level in each domain.

2.2.   The OATS Dataset in Numbers

In Table 2 (a), we present basic statistics for each domain in the dataset: the total number of reviews, sentences, review-level opinion tuples, and sentence-level opinion quadruples. Table 2 (b) shows averaged statistics, such as sentences per review, sentence (and review) length, and the number of opinions per sentence (and review), with the domains as columns. The numbers in parentheses for opinions per sentence and per review are computed after excluding sentences and reviews with zero opinions. We also provide fine-grained statistics in Table 3 (a), such as the number of sentences and reviews with zero, one, two, three, and more than three opinions. The statistics on the number of opinions with different sentiment polarities, including positive, negative, neutral, and conflict, at both the sentence and review levels are presented in Table 3 (b).

The hotels domain has more opinions per sentence and per review than the other two domains. Despite having the most reviews (Table 2 (a)), the fine foods domain has the fewest opinions per sentence and per review among the three. One of the main reasons is that the number of zero-opinion sentences in the fine foods domain is higher than in the hotels and Coursera domains (Table 3 (a)). Furthermore, the roughly 1:5 ratio of two-opinion to one-opinion sentences in the Amazon_FF and Coursera domains pushes the average closer to one opinion per sentence. We initially filtered the reviews with a requirement that each cover at least two distinct aspect categories at the review level, which is clearly evident from the average number of opinions per review; however, this requirement does not carry over to the sentence level.

Figure 1: Distribution of Aspect Categories for Sentence-Level and Review-Level Annotations for the Amazon_FF Domain

Another important factor for ABSA datasets is the distribution of aspect categories. Figure 1 shows that distribution for the Amazon_FF domain. In all the datasets, the GENERAL attribute of different entities has the highest number of opinions, such as FOOD#GENERAL in Amazon_FF, COURSE#GENERAL and FACULTY#GENERAL in Coursera, and HOTEL#GENERAL, LOCATION#GENERAL, and SERVICE#GENERAL in the hotels domain. The review-level and sentence-level opinions have similar distributions over the aspect categories. We observed a similar distribution among the aspect categories in the other domains, so we present only one domain's plot as representative.

3.   Experimental Evaluation

This section will first list the tasks we conduct experiments on, followed by the baseline methods for each task. We divide the methods into task-specific and unified approaches. Finally, we discuss the evaluation metrics for the tasks.

(a) ASQP F1 scores:
Method       | Amazon_FF | Coursera | Hotels
GAS-T5       | 19.62     | 22.23    | 26.33
Paraphrase   | 20.84     | 19.78    | 34.51
Template-ILO | 21.01     | 21.24    | 24.62
Template-DLO | 20.39     | 21.96    | 26.59
GEN-SCL-NAT  | 20.36     | 20.20    | 28.58

(b) ASTE F1 scores:
Method      | Amazon_FF | Coursera | Hotels
BDTF (BERT) | 27.81     | 43.98    | 42.02
BMRC        | 49.48     | 50.85    | 50.71
GEN-SCL-NAT | 20.36     | 20.20    | 28.58
GAS         | 46.46     | 40.15    | 45.96
Paraphrase  | 46.64     | 39.30    | 46.37
Table 4: (a) Left: F1 scores of ASQP task on OATS (b) Right: F1 scores of ASTE task on OATS. Best results are highlighted in bold font.

3.1.   Tasks

We selected three major joint detection tasks for our sentence-level ABSA experiments: target-aspect-sentiment detection (TASD) TargetAspectSentimentJD, aspect sentiment triplet extraction (ASTE) ASTE-dataset, and aspect sentiment quadruple prediction (ASQP) ASQP. The rationale for choosing these tasks is that they encompass all four elements of ABSA in different combinations. Our emphasis on joint extraction tasks over single-element extraction is motivated not just by the findings of T5-ABSA-sampreeth, but also by broader research trends in the NLP community. For instance, the DREAM paper on entity-relation extraction successfully illustrates the advantages of jointly modeling evidence ranking, leading to enhanced performance DREAM-paper. Such joint models, tailored to manage multiple intertwined elements simultaneously, have consistently been shown to surpass models fine-tuned for single-element extraction in various domains.

For the review-level (or text-level) ABSA, where aspect category and sentiment tuples must be extracted, we chose the aspect sentiment joint detection task (ASD) 14-Utilizing-Bert for our experiments. The review-level task poses two main challenges. First, the entire review context, rather than each sentence in isolation, must be used to detect the aspect categories and the sentiment polarity of the review. Second, there can be multiple opinions on the same aspect category with multiple polarities, from which the dominant sentiment must be assigned to each category. In the review, “The room was mostly spotless and the bathroom pristine, but there was dust on the bedside table," two positive sentiments outweigh a milder negative one for the room’s cleanliness, leading to an overall positive polarity.

3.2.   Baseline Methods

We implement several representative models from various frameworks, including MRC-based approaches similar to MRC-based1; MRC-based2; MRC-based3, generation-based, and BERT-based frameworks BERT, for the task evaluations.

Task-specific methods

For ASTE, we use the following methods:

BMRC BMRC: an MRC-based method. It extracts aspect-oriented triplets and opinion-oriented triplets, then obtains the final results by merging the two directions.

BDTF (BERT) BDTF: a BERT-based method that uses table-filling to solve the problem. It transforms the ASTE task into detecting and classifying relation regions in the 2D table representing each triplet in addition to an effective relation representation learning approach to understand word and relation interactions.

The following methods are used for the ASQP task:

Template-ILO Template-Quad: a generation-based method. It is similar to ASQP but adds a step to identify the best permutation of the four ABSA elements at the instance level and combines multiple proper templates as data augmentation, instead of passing a single fixed-order template as input to the generative model.

Template-DLO Template-Quad: a generation-based method. It follows the same augmentation strategy as Template-ILO, but identifies the best permutation of the four ABSA elements at the dataset level rather than the instance level.

TASD Task-specific methods

For the TASD task, we chose TAS-BERT TargetAspectSentimentJD, which is a BERT-based method. It fine-tunes the pre-trained BERT model to solve the aspect-sentiment detection task using the classification token and then detects the targets corresponding to those tuples using the token classification with CRF/softmax decoding.

(a) TASD F1 scores:
Method       | Amazon_FF | Coursera | Hotels
TAS-BERT-BIO | 45.12     | 44.41    | 45.92
TAS-BERT-TO  | 47.51     | 42.77    | 45.76
T5-ABSA      | 51.61     | 44.57    | 49.78
GAS          | 43.04     | 41.53    | 50.69
Paraphrase   | 44.89     | 40.24    | 49.81

(b) ASD F1 scores (review-level R-ASD and sentence-level S-ASD):
                | Amazon_FF       | Coursera        | Hotels
Method          | R-ASD | S-ASD   | R-ASD | S-ASD   | R-ASD | S-ASD
BERT-pair-NLI-B | 88.22 | 93.25   | 91.31 | 95.92   | 91.84 | 96.82
BERT-pair-QA-B  | 88.42 | 93.61   | 91.51 | 96.69   | 91.93 | 97.18
QACG-BERT-NLI-M | 87.15 | 92.13   | 90.41 | 95.81   | 90.05 | 95.68
T5-ABSA         | 58.26 | 56.85   | 40.76 | 48.81   | 54.09 | 59.85
GAS-T5          | 65.23 | 57.57   | 46.61 | 56.79   | 58.61 | 66.9
Paraphrase      | 66.46 | 57.25   | 47.88 | 55.27   | 56.11 | 66.38
Table 5: (a) Left: F1 scores of in-domain TASD on OATS. (b) Right: Results for Aspect-Sentiment Joint Detection task at Review-level (R-ASD) and Sentence-level (S-ASD) on OATS datasets. Best results are highlighted in bold font.

ASD Task-specific methods

The below methods are used for both review-level (R-ASD) and sentence-level (S-ASD) tasks:

BERT-pair-NLI-B 14-Utilizing-Bert: a BERT-based model that takes the review context as segment A and, using the output of the [CLS] token, predicts whether the aspect category and sentiment polarity combination, expressed as an entailment hypothesis in segment B of BERT's input, is present.

BERT-pair-QA-B 14-Utilizing-Bert: a BERT-based model that uses question-answering format instead of entailment in BERT-pair-NLI-B.

QACG-BERT QACG-BERT: builds on CG-BERT, which adapts a context-aware Transformer to use context-guided (CG) softmax-attention, and improves it with a Quasi-Attention CG-BERT model that learns a compound attention mechanism supporting subtractive attention.

T5-ABSA T5-ABSA-sampreeth: a generative approach that takes the review context along with a task prefix as input to generate the respective task outputs in an auxiliary sentence-based or phrase-based format defined in their work.

Unified methods

The following generative frameworks can be applied to any sentence-level and review-level ABSA tasks:

GEN-SCL-NAT gen-scl-nat: a generation-based method. It combines a new generative format with a supervised contrastive learning objective to predict ASTE's triplets and ASQP's quadruples.

GAS GAS-T5: a generation-based method. It transforms the ASTE, ASQP, TASD, and ASD tasks into a text generation problem that inputs the review and generates all the respective combinations of ABSA opinion elements as output.

Paraphrase ASQP: a generation-based method. It is similar to GAS but transforms the output opinion elements into paraphrases that read as natural sentences. For implicit targets and opinions, the corresponding element is substituted with the word "it."
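To illustrate how such generative baselines linearize a quadruple, the sketch below shows one possible quad-to-paraphrase mapping; the exact template wording and sentiment-word mapping only approximate the Paraphrase paper, and the names are ours:

# Map a quadruple (target, category, opinion, sentiment) to a natural-language
# target sequence for a seq2seq model; implicit elements are replaced with "it".
SENTIMENT_WORD = {"positive": "great", "negative": "bad", "neutral": "ok"}

def quad_to_paraphrase(target, category, opinion, sentiment):
    target = target if target != "NULL" else "it"     # implicit target
    opinion = opinion if opinion != "NULL" else "it"  # implicit opinion phrase
    category = category.replace("#", " ").lower()
    return f"{category} is {SENTIMENT_WORD[sentiment]} because {target} is {opinion}"

print(quad_to_paraphrase("king size bed", "ROOMS#COMFORT", "great", "positive"))
# rooms comfort is great because king size bed is great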

3.3.   Evaluation

Following ASQP; DM-ASTE; TargetAspectSentimentJD, we use the F1 score to measure the performance of different approaches on all the tasks from Section 3.1. All experimental results are reported as the average of 5 runs with distinct random seeds. We divided each domain dataset into train, validation, and test sets with 80%, 10%, and 10% splits, respectively. A tuple, triplet, or quadruple is considered correct only if all of its predicted elements exactly match the gold-standard labels; any partial match is counted as a wrong prediction, following ASQP. We adopt the base versions of all the transformer models for our experiments, including BERT-base BERT, T5-base T5, RoBERTa-base roberta, and others.
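For clarity, the following is a minimal sketch of this exact-match scoring over gold and predicted quadruples; the representation and names are ours:

def exact_match_f1(gold_quads, pred_quads):
    # gold_quads and pred_quads: one set of (target, category, opinion, sentiment)
    # tuples per sentence; a prediction counts only if every element matches exactly.
    tp = fp = fn = 0
    for gold, pred in zip(gold_quads, pred_quads):
        tp += len(gold & pred)
        fp += len(pred - gold)
        fn += len(gold - pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [{("bed", "ROOMS#COMFORT", "great", "positive")}]
pred = [{("bed", "ROOMS#COMFORT", "great", "positive"), ("room", "ROOMS#GENERAL", "NULL", "neutral")}]
print(exact_match_f1(gold, pred))  # (0.5, 1.0, 0.666...)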

4.   Discussion and Insights

In this section, we present the results of different baseline systems on the selected tasks, highlighting the benefits and challenges of current methods on the OATS dataset and the specific task.

ASQP Table 4 (a) reports the performance on the ASQP quadruple extraction task for five baseline systems on the three datasets. At a high level, performance on the hotels domain is better than on the other two domains across all the baseline systems.

For the Amazon_FF domain, even though Template-ILO gave the best performance, all the other methods are not far behind. Similarly, for the Coursera domain, GAS-T5 performed better than Paraphrase and GEN-SCL-NAT, with the Template methods close behind. Paraphrase outperformed all the other methods in the hotels domain with a 34.51% F1 score.

Prior works have shown that the Template method significantly outperformed Paraphrase and GAS in the restaurant domain Template-Quad. That trend carried over only to the Amazon_FF domain, not to Coursera or Hotels. Similarly, GEN-SCL-NAT, which outperformed the Paraphrase method on the restaurant and laptop domains, failed to do so on any of the three OATS domains.

(a) Paraphrase method F1 on six ABSA joint tasks:
Domain/Task | TOWE  | TSD   | ASD   | TASD  | ASTE  | ASQP
Amazon_FF   | 31.29 | 65.27 | 57.25 | 44.89 | 46.64 | 20.84
Coursera    | 38.64 | 67.75 | 55.27 | 40.24 | 39.30 | 19.78
Hotels      | 35.75 | 62.34 | 66.38 | 49.81 | 46.37 | 34.51

(b) Explicit/implicit targets and opinion phrases:
Domain    | ET / IT       | EO / IO        | ET-EO | ET-IO | IT-EO | IT-IO
Amazon_FF | 2,999 / 5,261 | 6,780 / 1,480  | 2,491 | 508   | 4,298 | 972
Coursera  | 5,163 / 2,712 | 6,185 / 1,690  | 4,222 | 941   | 1,963 | 749
Hotels    | 8,654 / 2,820 | 10,193 / 1,281 | 7,927 | 727   | 2,266 | 554
Table 6: (a) Left: Performance Analysis on six main ABSA joint tasks using the Paraphrase method ASQP on OATS. (b) Right: Distribution of Explicit and Implicit Targets (ET and IT) and Explicit and Implicit Opinion Phrases (EO and IO) in the OATS datasets with their different combinations.

ASTE In Table 4 (b), we present the performance of five baseline systems on the ASTE task, which focuses on extracting the triplet of aspect term, opinion term, and sentiment polarity across the three datasets. Notably, BMRC emerges as the standout performer, surpassing the other baselines across all domains, while GEN-SCL-NAT consistently lags behind. BMRC's success can be attributed to its bidirectional formulation: instead of solely considering the relationship from targets to opinion phrases, it also incorporates the potential backward relation from opinion phrases to target expressions. This deviates from methods like seq2path and Opinion-Tree-Generation, which deploy two distinct paths in a tree structure for targets and opinion phrases to identify the quadruples.

R-ASD Table 5 (b) shows the aspect sentiment joint detection task results at the review level (R-ASD) and sentence level (S-ASD). BERT-based methods significantly outperform the generative approaches. One primary factor driving this difference is the augmentation strategy the BERT-pair methods use, which expands the dataset by the number of categories multiplied by the number of polarities. For example, with 27 aspect categories and 4 polarities in the Coursera domain, this results in a total of 894,024 instances (8,278 sentences × 27 × 4). Additionally, these methods transform the multi-label, multi-class classification of the ASD task into a simpler binary classification, making the task inherently easier than the challenge generative models face when extracting categories and polarities directly from a review sentence.
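A hypothetical sketch of this auxiliary-sentence expansion is shown below; the category list, auxiliary-sentence wording, and function names are ours and only approximate the BERT-pair setup:

# Pair every sentence with every (category, polarity) combination and label it 1
# only if that pair is annotated, turning ASD into binary classification.
CATEGORIES = ["COURSE#GENERAL", "FACULTY#GENERAL"]  # 27 categories in the real Coursera data
POLARITIES = ["positive", "negative", "neutral", "conflict"]

def expand(sentence, gold_pairs):
    # gold_pairs: set of (category, polarity) tuples annotated for this sentence
    instances = []
    for cat in CATEGORIES:
        for pol in POLARITIES:
            aux = f"{cat.replace('#', ' ').lower()} - {pol}"  # auxiliary sentence (segment B)
            label = 1 if (cat, pol) in gold_pairs else 0
            instances.append((sentence, aux, label))
    return instances

pairs = expand("Great lectures but slow grading.", {("COURSE#GENERAL", "positive")})
print(len(pairs))  # 8 here; 27 * 4 = 108 instances per sentence in the actual dataset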

TASD We present the results of the TASD experiments on all the domains in Table 5 (a). The generative models performed better than the BERT-based approaches on all the domain datasets. The T5-ABSA method performed best for the Amazon_FF and Coursera domains, while GAS performed best in the hotels domain. All the generative approaches are close to each other in the hotels domain, whereas the BERT-based methods trail the generative approaches by a significant margin in the Amazon_FF and hotels domains.

Explicit and Implicit Targets and Opinions We observe distinct patterns of explicit and implicit targets and opinions across the OATS domains from the statistics presented in Table 6 (b). Amazon_FF, for instance, showcases a predominance of implicit targets, suggesting users often leave their main subject of discussion implied. Conversely, Coursera leans more towards explicit target mentions, possibly indicative of the direct nature of feedback in educational settings. Notably, regardless of the domain, explicit opinions outnumber their implicit counterparts. This trend underlines that while users might leave their subjects (targets) implied, they tend to be direct about their sentiments or opinions on them. The Hotels domain stands out with the highest counts of explicit mentions for both targets and opinions, hinting at the straightforward nature of feedback in this domain. It is important to recognize these domain-specific trends as they aid in tailoring models appropriately, potentially improving their accuracy and adaptability.

Challenging combination of elements in OATS We tested the Paraphrase-T5 method ASQP on all the OATS datasets for six different ABSA tasks to gauge how hard it is to detect the various ABSA elements. The results are in Table 6 (a). The TSD task achieved the best results on all the domains, while TOWE proved the hardest. This shows that connecting an opinion phrase to its target is especially difficult in OATS. Interestingly, even though TOWE is effectively a subtask of ASTE, it does not perform as well; the sentiment information available in ASTE likely helps in picking out opinion phrases, giving it an edge.

Furthermore, integrating aspect categories into TSD to create the TASD task led to a noticeable performance drop. This highlights the challenge of pinpointing aspect categories for specific targets and could also explain the lower scores of the ASD and ASQP tasks in the Amazon_FF and Coursera domains compared to TSD. Interestingly, in the hotels domain, ASD emerged as the top-performing task, which contributed to superior TASD and ASQP performance relative to the other domains. The complex nature of ASQP, which demands establishing relationships between categories, opinion phrases, and targets, may account for its sub-optimal performance, stemming from the two reasons above.

5.   Significance of OATS

The main differences between the current ASQP datasets and OATS are: (a) we annotated the data from scratch with all the elements together instead of aligning annotations from several sources; (b) we include all the implicit aspect terms and opinion phrases, unlike ASQP, which excluded implicit opinion terms; (c) we provide both review-level and sentence-level annotations, facilitating analysis and experiments for review/text-level aspect-based sentiment analysis, which differentiates our dataset from MEMD-ABSA; (d) the ASQP corpus, which includes two restaurant datasets, contains approximately 5.8K opinion quadruples in total, which is just 80% of the opinion quadruples in the Coursera domain alone, the smallest domain in the OATS corpus; and (e) OATS has a total of approximately 27.5K opinion quadruples, almost five times as many as the ASQP corpus, and nearly 44.5K opinions once the review-level annotations are included.

The OATS corpus will help analyze and understand the inherent relationships among all the elements of ABSA and exploit them to solve the ABSA task holistically. The comprehensive nature of the OATS corpus not only allows the evaluation of the ASQP task but also unlocks the potential analysis and exploration of all the subtasks of ABSA, including TASD, ASTE, TOWE, TAD, and many more. As pointed out by ASQP, tackling various ABSA tasks in a unified framework enables knowledge to be easily transferred across related subtasks, which is especially beneficial under low-resource settings. It also allows cross-task transfer for subtasks that underperform when trained using only a task-specific dataset. The provided review-level annotations will help researchers who wish to explore ABSA for whole reviews rather than individual sentences or single-sentence reviews.

6.   Related Works

In this section, we describe the tasks of ABSA and their related datasets, followed by triplet and quadruple extraction task-specific methods.

6.1.   Current Datasets

When ABSA was first introduced as an NLP task by Hu_2004, the main aim was to extract the aspects mentioned in a sentence and then assign polarities to those aspects. A decade later, 14-dataset added another important element of ABSA: aspect categories. In the SemEval-2014 shared task on ABSA, two sub-tasks were drafted in addition to aspect term extraction and its sentiment polarity: aspect category detection and aspect category sentiment polarity. Two datasets, from the restaurant and laptop domains, were released as part of this shared task.

In the following years, 15-dataset; 16-dataset proposed two more ABSA shared tasks at SemEval, redefining the subtasks of ABSA and giving them a more concrete structure. Aspect categories were defined as entity-attribute pairs, aspect terms were called targets, and the sentiment polarity was assigned jointly to the targets and the aspect categories. Several datasets with these three elements were published across multiple languages and domains, including English, Dutch, Spanish, French, Russian, and Turkish restaurants, English laptops, Arabic hotels, and many others.

Later, TargetAspectSentimentJD utilized the data from SemEval-2015 and SemEval-2016 to define the target-aspect-sentiment joint detection task (TASD). Subsequently, researchers recognized the importance of detecting opinion phrases to identify the sentiment polarity and understand its relationship with targets and aspect categories. This resulted in three new sub-tasks with new datasets stemming from the SemEval challenges: aspect-opinion-pair extraction (AOPE) AOPE, target-opinion-word extraction (TOWE) TOWE, and aspect-sentiment-triplet-extraction (ASTE) ASTE-dataset. More recently, ABSA research has shifted towards the joint detection of all four elements, which has proven better at identifying the inter-relationships among the elements, thereby enhancing the performance of other subtasks. This task is called aspect sentiment quadruple prediction (ASQP) ASQP or aspect-category-opinion-sentiment joint detection (ACOS) ACOS. The ASQP task is evaluated on a dataset with quadruples created from the SemEval challenges, the ASTE triplets, and the TOWE and AOPE tuples. The ACOS task introduced two datasets from the restaurant and laptop domains with implicit and explicit aspect terms and opinion phrases.

The current ABSA dataset landscape is notably skewed towards particular domains such as restaurants and electronics. This stems from the easy availability and volume of data on review websites and other online platforms. This narrow focus restricts the wider application of ABSA, limiting its ability to deliver insights across different sectors. Expanding the diversity of ABSA datasets is essential to push research and applications beyond the boundaries of mere reviews. To address this limitation, we propose three new datasets for the quadruple extraction task from three new domains, along with implicit and explicit targets and opinion phrase annotations. We also include the review-level tuples for identifying the overall sentiment polarity for different aspect categories in a review. This dataset could be used to solve all the above-mentioned subtasks of ABSA.

6.2.   Related Methods

We focus on a few joint detection tasks: the TASD, ASTE, and ASQP for our research. sampreeth-ABSA-Datasets-Survey-arxiv have provided a detailed survey of the other datasets, tasks, and their challenges.

Triplet Extraction Tasks In ASTE research, three primary methodologies have emerged: MRC-based techniques BMRC, methods anchored on BERT and table-filling BDTF, and generative approaches gen-scl-nat; GAS-T5; ASQP. The MRC-based methods involve crafting a specific query for each component in the triplet, subsequently extracting them based on the response to this query. Generative methods, in contrast, frame the ASTE challenge as a sequence generation task and employ sequence-to-sequence (seq2seq) models. The triplets are then decoded using a specially tailored algorithm. In this study, we employ representative techniques from these three categories and investigate OATS using these methods.

For the TASD task, there are two paths: BERT-based methods TargetAspectSentimentJD and generative approaches T5-ABSA-sampreeth; ASQP. The BERT-based methods extract the aspect-sentiment tuple from a sentence using sentence-pair classification as a backbone 14-Utilizing-Bert, followed by extracting the targets for each pair using token classification with a BIO or softmax classifier. The generative approaches cast the task as something akin to abstractive summarization and use sequence-to-sequence models such as T5 and BART T5; bart to predict the triplets.

Quadruple Extraction Task Researchers have pointed out two promising directions. ACOS propose a two-stage method by extracting the aspect and opinion terms first. Then, these items are utilized to classify aspect categories and sentiment polarity. Another method is based on the generation model ASQP. By paraphrasing the input sentence, the quadruplet can be extracted end-to-end. In this work, we follow the generative direction and consider the order-free property of the quadruplet. A few other studies also proposed generative-based models with this paraphrasing as a backbone, including Template-Quad; gen-scl-nat.

7.   Conclusion

ABSA still presents many challenges in the rapidly developing field of generative artificial intelligence. We present a thoroughly curated ABSA dataset of quadruples, including both explicit and implicit targets and opinion phrases, spanning three new domains. In addition, the dataset incorporates review-level sentiment polarity for each aspect category, providing a comprehensive perspective of the sentiments expressed in the reviews. Our annotations outnumber those found in previous datasets. We conduct an in-depth study, provide comprehensive annotation guidelines and dataset statistics, and enable evaluations of generative and non-generative benchmarks on a range of common ABSA tasks.

8.   Ethics and Limitations

We identified the following limitations of this work:

  1. While the OATS dataset spans multiple domains, including the relatively unexplored domains of Amazon fine foods and Coursera course reviews, it exclusively contains English-language reviews. The dataset currently lacks reviews in low-resource languages. We plan to address this in future work by releasing multi-lingual datasets, inclusive of the low-resource setting.

  2. The experimental results showcased in this paper are purely in-domain; models were fine-tuned and tested within the same domain. In subsequent works, we aim to design experiments to tackle cross-domain and out-of-domain ABSA tasks.

  3. Our dataset incorporates review-level ABSA tuples focusing only on aspect categories and sentiment polarities. It currently omits review-level target expressions and opinion phrase annotations, which would render the dataset more comprehensive for ABSA. We earmark this as an avenue for future exploration.

We point out the following ethical considerations while building and using the OATS dataset:

  1. The data collected from platforms such as Amazon Finefoods, Coursera, and TripAdvisor Hotels come from public reviews provided by users. We ensured that all identifiable information, including usernames, avatars, and any other potentially identifying details, were removed to preserve the anonymity of the reviewers.

  2. Our data processing methodology focused on extracting and analyzing the content of the reviews without altering the original sentiment or meaning. We handled this data with the utmost care to ensure that the sentiments and opinions of the original reviewers were not misrepresented.

  3. We recognize that reviews from online platforms might not represent the complete spectrum of users’ opinions, as they may be influenced by various factors like platform algorithms, user demographics, and more. We urge users of this dataset to be aware of potential biases and always consider the data in the proper context.

  4. While the reviews were publicly available, the original authors might not have expected their content to be used in research. We’ve taken measures to respect their privacy, but future users of this dataset should also be aware of this consideration.

  5. While the reviews are publicly accessible, we acknowledge the platforms (Amazon Finefoods, Coursera, TripAdvisor Hotels) as the source of this data. We have only used this data for research and academic purposes, ensuring that we respect the terms of use for each platform.

9.   Acknowledgements

This research was supported by Adobe gift funding to the University of Houston collaboration and by the National Science Foundation (NSF) under grant #1910192. The authors acknowledge the use of the Sabine Cluster and the advanced support from the Research Community Data Core at the University of Houston to carry out the research presented here. We also want to acknowledge Viet Lai (an NLP researcher at the University of Oregon) for his constant support in maintaining the BRAT annotation tool and helping with all issues related to it.

Appendix A: OATS Corpus Properties

We set the following properties or requirements for the corpus to make it ideal for future works:

  • Each review should have both sentence-level and review/text-level annotations: The goal of this dataset is not just to provide sentence-level aspect-related information but also to include review-level or text-level annotations. That is, given a text, we perform the following two steps:

    1. Take each individual sentence in the text and extract all the aspect terms and opinion phrases, followed by categorizing them into aspect categories and sentiment polarity, respectively, forming a quadruple. Each sentence can have multiple such quadruples.

    2. After getting the quadruples from all the sentences, take the unique aspect categories from them and assign the sentiment polarity (positive, negative, neutral, or conflict) that is dominant among the sentiments assigned in those quadruples. If the positive and negative sentiments are equally strong, assign a conflict polarity to that aspect category (a minimal sketch of this aggregation appears after this list).

  • As little ambiguity as possible and easily understandable aspect categories: To avoid ambiguity when assigning aspect categories to a review or a sentence, one of the lessons learned from prior datasets is that those aspect categories should be nearly mutually exclusive. Additionally, the aspect categories should be relevant to the domain and easily comprehended by anyone.

  • Inclusion of both implicit targets and implicit opinions: As we mentioned in Section 1, there may be cases where the opinion target or the opinion expression could be implicit. We include both those cases and assign NULL to that element in the quadruple. If both are implicit, both the elements will be marked as NULL.

  • Start and End indices of the text spans should be provided: To facilitate diverse formulations of ABSA, we aim to provide the start and end indices of the opinion target and opinion phrase annotations for each quadruple. This could be helpful for question-answering type formulations.

  • Reviews should have an adequate number of sentences: Each review in the dataset should have a minimum number of sentences to help differentiate the sentence-level and review/text-level tasks of ABSA. If reviews are only one to two sentences long, review-level ABSA may degenerate to sentence-level ABSA, with sentence-level models alone achieving good performance. Therefore, we chose to keep only those reviews that are at least four to eight sentences long.
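As noted in step 2 above, here is a minimal sketch of the review-level aggregation, approximating "equally strong" by equal counts of positive and negative opinions; the names are ours:

from collections import Counter, defaultdict

def review_level_tuples(sentence_quads):
    # sentence_quads: all (target, category, opinion, sentiment) quadruples from one review.
    # The dominant sentiment per category wins; a positive/negative tie becomes "conflict".
    per_category = defaultdict(Counter)
    for _target, category, _opinion, sentiment in sentence_quads:
        per_category[category][sentiment] += 1

    tuples = []
    for category, counts in per_category.items():
        if counts["positive"] == counts["negative"] and counts["positive"] > 0:
            tuples.append((category, "conflict"))
        else:
            tuples.append((category, counts.most_common(1)[0][0]))
    return tuples

quads = [("room", "ROOMS#CLEANLINESS", "spotless", "positive"),
         ("bathroom", "ROOMS#CLEANLINESS", "pristine", "positive"),
         ("bedside table", "ROOMS#CLEANLINESS", "dust", "negative")]
print(review_level_tuples(quads))  # [('ROOMS#CLEANLINESS', 'positive')]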

Appendix B: Defining Fine-Grained Annotations

With the above requirements and properties in place, we began our process of finding aspect categories for each domain. For the TripAdvisor hotel reviews, we chose to use the aspect categories defined in the shared task organized by 16-dataset, which included a small Arabic dataset for hotel reviews. Seven entities and eight attributes result in 26 valid Entity#Attribute pairs, as shown in Table 7. We followed the steps below to define the entities and attributes that form an aspect category for the Coursera and Amazon fine foods domains:

  • From the entire review dataset, we generated a list of frequently occurring words and phrases.

  • From that list, we filtered all the nouns and noun phrases. Then, we applied a clustering algorithm on the nouns and noun phrases with the number of clusters varying from three to eight.

  • From those clusters, we manually verified intra-cluster word synonymy and relationships, which led us to identify the appropriate number of clusters.

  • From those clusters, we came up with the most generic entity-attribute pairs that could sufficiently represent the cluster.

Then, we selected a pool of 2000 reviews from the entire dataset with the following constraints:

  • There should be at least 4 sentences in each review

  • Each review should have at least 2 distinct aspect categories

We used a Siamese-BERT model Siamese-BERT to determine whether a review has at least two of the defined aspect categories in the domain. Specifically, we gave the review and the sentence “This review mentioned about [aspect category]” as input to the model. If the similarity score is greater than 0.75 for at least two aspect categories, we take that review for the annotation procedure. We present the entity-attribute pairs for each domain in Tables 7, 8, and 9.
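The filtering step can be sketched as follows, assuming the sentence-transformers library and an off-the-shelf bi-encoder checkpoint; the specific model name and helper function are our assumptions, not necessarily the exact setup used:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint; any Siamese/bi-encoder works

def keep_for_annotation(review, aspect_categories, threshold=0.75, min_matches=2):
    # Keep a review if it is similar (cosine > threshold) to the prompts of at least
    # min_matches aspect categories.
    prompts = [f"This review mentioned about {cat}" for cat in aspect_categories]
    review_emb = model.encode(review, convert_to_tensor=True)
    prompt_embs = model.encode(prompts, convert_to_tensor=True)
    scores = util.cos_sim(review_emb, prompt_embs)[0]
    return int((scores > threshold).sum()) >= min_matches

categories = ["course general", "faculty general", "assignments workload"]
print(keep_for_annotation("Great lectures, but the weekly workload was heavy.", categories))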

Table 7: Hotels domain possible Entity#Attribute pairs. Entities: HOTEL, ROOMS, ROOM_AMENITIES, FACILITIES, SERVICE, LOCATION, FOOD&DRINKS. Attributes: GENERAL, PRICES, DESIGN&FEATURES, CLEANLINESS, COMFORT, QUALITY, STYLE&OPTIONS, MISCELLANEOUS.
Table 8: Coursera courses domain possible Entity#Attribute pairs. Entities: PRESENTATION, ASSIGNMENTS, MATERIAL, GRADES, FACULTY, COURSE. Attributes: GENERAL, QUALITY, COMPREHENSIVENESS, RELATABILITY, WORKLOAD, RESPONSE, VALUE, QUANTITY.
Table 9: Amazon Fine Foods domain possible Entity#Attribute pairs. Entities: FOOD, SHIPMENT, AMAZON. Attributes: GENERAL, QUALITY, STYLE_OPTIONS, PRICES, DELIVERY, AVAILABILITY.
Figure 2: Example of an opinion quadruple from a sentence in the Coursera domain in BRAT format and the final XML format of the sentence-level and review-level annotations of a review