
Evaluating the Ebb and Flow: An In-depth Analysis of Question-Answering Trends across Diverse Platforms

Rima Hazra, Agnik Saha, Somnath Banerjee, Animesh Mukherjee
Indian Institute of Technology Kharagpur
[email protected]
[email protected], [email protected]
Abstract

Community Question Answering (CQA) platforms steadily gain popularity as they provide users with fast responses to their queries. The swiftness of these responses is contingent on a mixture of query-specific and user-related elements. This paper scrutinizes these contributing factors within the context of six highly popular CQA platforms, identified through their standout “answering speed”. Our investigation reveals a correlation between the time taken to yield the first response to a question and several variables: the metadata, the formulation of the questions, and the level of interaction among users. Additionally, by employing conventional machine learning models to analyze this metadata and these patterns of user interaction, we endeavor to predict which queries will receive their initial responses promptly.

1 Introduction

Community question-answering platforms have been steadily evolving over the past decade. Such portals allow users to post their queries and to answer the questions of others. These CQA platforms help experts share their knowledge and new users solve their queries. Initially, community question-answering platforms focused on providing the most relevant answers to users’ queries [14][2]. As these platforms keep growing, they focus not only on providing relevant answers but also on delivering them quickly [12][9]. The growth and maturity of such platforms is contingent on a number of variables, including the “quality of questions and answers”, the “response time of queries”, “user involvement”, and “user activity” [8]. Users with expertise in specific domains significantly help the platform’s growth. Based on their activeness on the platform and the importance of their posts, users are rewarded with reputation scores, badges, and additional privileges. In most such CQA platforms driven by a reputation system, the platform’s popularity depends on the response time [3] of a question (i.e., the time to receive the first answer). Users who post a query may therefore also want to know by when they can expect a response. StackExchange (https://stackexchange.com/) is one such CQA platform where users can ask questions and receive answers from other users. An analysis of such a platform’s dynamics helps the community maintainers adapt and design the framework to address various CQA-related problems [1][7][13][11][6][5][4].

In this paper, we examine the dynamics of different CQA platforms, relating the usual response time to other factors of the portals. We conduct an empirical study on six CQA platforms – Mathematics, Software Engineering, English, Game Development, Chemistry, and Money. To analyze the dynamics of these diverse platforms, we look into the following – (1) We characterize the questions in terms of textual information, linguistic characteristics, question tags, votes received by the questions etc., and obtain correlations of these features with the time required to respond to a question. (2) We characterize the users of the platforms in terms of their reputation and activity. Further, we build an asker-answerer graph (AAG) whose nodes are the users, with a link created when a question from one user is answered by another. Subsequently, we conduct an in-depth analysis of the characteristics of the users common to all the platforms. (3) For all the platforms, we use question metadata, question structure and user interaction as features to build standard machine learning models for classifying whether a newly posted question will receive a fast answer.

The majority of preceding studies [10] scrutinize the correlation between diverse question-related factors and response time. The novelty of our current research resides in our endeavor to comprehend how various inherent network-related features (reflected in the asker-answerer graph) coincide with question-related features in relation to response time. In this study, we make the following observations – (1) Short questions get faster responses. Chemistry and Software Engineering questions are harder to read. More tags mean slower replies. Mathematics, Software Engineering, and English answers come within an hour, others within a day. In Software Engineering and English, many questions get ≥ 3 answers. (2) Most users (~60%-70%) on all platforms stay inactive. Smaller domains like Money, Game Development, and Chemistry nevertheless have comparatively large fractions of active users. (3) XGBoost performs best at classifying new questions on most platforms, with the random forest model usually second best.

2 Dataset

We study six CQA platforms from StackExchange, with data up to March 2022. We pick the top three and bottom three platforms in answering speed from an initial group that includes Physics, Chemistry, Mathematics, English, Software Engineering, Game Development, AskUbuntu, Mathematica, Travel, and Money. Answering speed, distinct from response time, is the percentage of questions answered within time t (t = 10 mins, 20 mins, etc.). Mathematics, English, and Software Engineering are at the top in answering speed; Money, Game Development, and Chemistry are at the bottom. Each dataset includes the question title, body, tags, posting time, answers with timestamps, votes, and user reputations. Table 1 presents the dataset statistics.
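Written out explicitly (in our notation, consistent with the definition above), the answering speed of a platform at threshold t is

AS(t) = \frac{|\{\, q \in Q : r(q) \leq t \,\}|}{|Q|} \times 100,

where Q is the set of questions posted on the platform and r(q) is the response time of question q, i.e., the delay until its first answer arrives.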

Datasets | # Questions | # Unique tags | # Unique users | Age (in yr.)
SE | 60,801 | 1667 | 3,46,642 | 12 (2010)
GD | 53,427 | 1094 | 1,22,553 | 12 (2010)
MA | 14,79,363 | 1898 | 8,88,141 | 12 (2010)
EN | 1,24,373 | 981 | 3,50,766 | 12 (2010)
CH | 40,742 | 368 | 88,677 | 10 (2012)
MO | 35,494 | 1005 | 84,650 | 13 (2009)
Table 1: Basic statistics of the six community question answering platforms (SE: Software Engineering, GD: Game Development, MA: Mathematics, EN: English, CH: Chemistry, MO: Money).
Datasets | Average number of tags | Total number of answers | % of answered questions | % of questions having accepted answer | Average number of votes
SE | 2.802 | 1,72,275 | 95.44 | 57.78 | 6.6139
GD | 2.795 | 77,996 | 86.31 | 52.80 | 2.567
MA | 2.369 | 19,76,51 | 82.32 | 52.71 | 2.11
EN | 2.08 | 2,79,346 | 92.12 | 48.1 | 3.36
CH | 2.396 | 47,420 | 79.6 | 40.63 | 3.42
MO | 3.11 | 66,917 | 91.19 | 45.32 | 4.71
Table 2: Basic statistics about the question structure.

3 Empirical analysis

In order to compare the platforms we consider two different factors – (i) the characteristic features of the questions posted in a platform, and (ii) the interaction behaviour of users on a platform. In each case we correlate these factors with the response time, which is the time required to obtain the first answer after a question is posted.

3.1 Characterizing the questions

In order to characterize the questions on the different platforms, we explore various structural properties of the questions: the length of the title, the length of the body, and the linguistic characteristics of the question title and body. Table 2 reports the average number of tags per question, the total number of answers, the percentage of questions with at least one answer, the percentage of questions with an accepted answer, and the average number of votes received per question. From Table 2, we observe that 90%-95% of questions have received at least one answer on the SE, EN and MO platforms, while the CH platform has the lowest percentage (~79%) of answered questions. Further, SE, GD and MA have higher percentages of questions with accepted answers than the other platforms. The average number of votes per question is highest on the SE platform.
Length of the title and body: We compute the length of the title and of the body as the number of words present in each. We then take the top 20% of questions (Q_top) and the bottom 20% of questions (Q_bottom) based on response time: Q_top corresponds to the lowest response times and Q_bottom to the highest. On MA, EN and SE, the average body lengths of Q_top questions are 69.06, 71.86 and 150.70 respectively, while those of Q_bottom questions are 117.48, 102.55 and 189.70 respectively. Thus shorter questions get faster answers. The same trend is observed for MO, GD and CH, where the average body lengths for Q_top are 121.56, 133.63 and 81.197, and for Q_bottom 138.48, 166.15 and 106.117 respectively. We did not find much difference in title length between the Q_top and Q_bottom questions.
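The paper releases no code; the following minimal pandas sketch shows how the Q_top/Q_bottom split and the body-length statistic can be computed. The DataFrame schema (columns response_time and body) is our assumption, not the authors'.

import pandas as pd

def split_top_bottom(df: pd.DataFrame, frac: float = 0.2):
    # Sort by time-to-first-answer; the fastest-answered `frac` of questions
    # forms Q_top, the slowest-answered `frac` forms Q_bottom.
    df_sorted = df.sort_values("response_time")
    k = int(len(df_sorted) * frac)
    return df_sorted.head(k), df_sorted.tail(k)

def avg_body_length(questions: pd.DataFrame) -> float:
    # Length is the number of whitespace-separated words in the body.
    return questions["body"].str.split().str.len().mean()

# Toy usage with a synthetic frame:
df = pd.DataFrame({
    "body": ["short question"] * 5 + ["a much longer question body with many words"] * 5,
    "response_time": range(10),
})
q_top, q_bottom = split_top_bottom(df)
print(avg_body_length(q_top), avg_body_length(q_bottom))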
Linguistic characteristics: To understand the linguistic characteristics, we use three measures – the Flesch reading ease test (https://simple.wikipedia.org/wiki/Flesch_Reading_Ease; higher is better), the Coleman-Liau index (https://en.wikipedia.org/wiki/Coleman-Liau_index; lower is better) and the Automated Readability Index (https://en.wikipedia.org/wiki/Automated_readability_index; lower is better). We compare the readability scores of Q_top and Q_bottom questions. For the Flesch reading ease test (see Figure 1), the Q_top questions of MA, EN and SE have average scores of 68.53, 71.71 and 63.31 respectively, while the Q_bottom questions have average scores of 60.59, 68.86 and 58.14 respectively. For MO, GD and CH, the average scores of Q_top questions are 71.42, 62.75 and 63.22, while those of Q_bottom questions are 68.83, 55.26 and 58.87 respectively. Thus readability is better (easier) for Q_top questions across all platforms, irrespective of their popularity in terms of answering speed. This observation holds for the other two readability metrics as well (see Figure 1).
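One way to reproduce these readability scores is with the textstat Python package; the paper does not name its implementation, so this choice is our assumption.

import textstat

def readability(text: str) -> dict:
    return {
        "flesch_reading_ease": textstat.flesch_reading_ease(text),  # higher = easier
        "coleman_liau_index": textstat.coleman_liau_index(text),  # lower = easier
        "automated_readability_index": textstat.automated_readability_index(text),  # lower = easier
    }

def avg_flesch(bodies) -> float:
    # Average Flesch reading ease over a set of question bodies, e.g. those in Q_top.
    bodies = list(bodies)
    return sum(textstat.flesch_reading_ease(b) for b in bodies) / len(bodies)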

Figure 1: (a) Flesch Reading Ease, (b) Coleman-Liau index, and (c) Automated Readability Index scores for Q_top and Q_bottom questions; (d) average clustering coefficient and (e) average degree of the Q_top and Q_bottom AAGs.

Tag related analysis: For each platform, we obtain the 50 most frequently occurring tags across the posted questions. For SE, Java, Design and C# are the most popular tags; for GD, Unity, C++, C# and OpenGL are the most frequent; and for MO, United States and taxes are the most popular. For MA, EN and CH, we do not observe any particularly dominant set of tags. Further, for each portal, we calculate the average number of tags in Q_top and Q_bottom questions. On MA, EN and SE, the average numbers of tags in Q_top are 2.08, 2.03 and 2.53, and in Q_bottom 2.66, 2.17 and 3.02 respectively. On GD, CH and MO, the average numbers of tags in Q_top are 2.58, 2.28 and 3.00, and in Q_bottom 2.94, 2.54 and 3.23 respectively. We observe that Q_top questions carry fewer tags than Q_bottom questions. This possibly indicates that fewer tags make a question topically less confusing and route it more effectively to the people who can actually answer it.
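A small sketch of these tag statistics (we assume the per-question tag lists are already parsed, e.g. ["java", "design"]):

from collections import Counter

def most_frequent_tags(tag_lists, k=50):
    # Flatten the per-question tag lists and count tag occurrences.
    counts = Counter(tag for tags in tag_lists for tag in tags)
    return counts.most_common(k)

def avg_tag_count(tag_lists) -> float:
    tag_lists = list(tag_lists)
    return sum(len(tags) for tags in tag_lists) / len(tag_lists)

# e.g. compare avg_tag_count(q_top_tags) with avg_tag_count(q_bottom_tags)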

Votes received by the question: To understand the temporal pattern of votes received by the questions, we count the fraction of questions receiving 1 upvote, 2 upvotes and ≥ 3 upvotes on three consecutive days, i.e., the day the question is posted and the two subsequent days. On the same day, ~30% of SE questions receive more than 3 votes, whereas on the other platforms most questions receive one vote. On the second day, around 20% of questions receive one more vote on the SE, MO and CH platforms, and on SE and MO about 10% of questions receive at least three votes. On the third day, very few questions receive votes on any platform. This indicates that on all platforms the attention span of users for a particular question dies down within 48 hours of posting. For MA, EN and SE, the average votes for the questions in Q_top are 1.99, 5.31 and 15.82 respectively, and for the questions in Q_bottom 2.75, 2.60 and 4.44 respectively. For MO, GD and CH, the average numbers of votes in Q_top are 6.46, 4.63 and 4.21 respectively, while those in Q_bottom are 2.96, 2.08 and 5.17 respectively. Hence questions with lower response times typically receive more upvotes, irrespective of the popularity of the platform.
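The day-wise vote bucketing can be sketched as follows; the vote and question schemas (columns qid, ts, posted) are illustrative assumptions.

import pandas as pd

def daily_vote_buckets(votes: pd.DataFrame, questions: pd.DataFrame) -> pd.Series:
    # votes: one row per upvote, with question id 'qid' and timestamp 'ts';
    # questions: 'qid' and posting timestamp 'posted'.
    m = votes.merge(questions, on="qid")
    m["day"] = (m["ts"].dt.normalize() - m["posted"].dt.normalize()).dt.days
    m = m[m["day"].between(0, 2)]  # posting day and the two subsequent days
    per_q = m.groupby(["day", "qid"]).size().clip(upper=3)  # 3 encodes ">= 3"
    # Fraction of questions receiving 1, 2 or >=3 upvotes on each day.
    return per_q.groupby(level="day").value_counts(normalize=True)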

3.2 Characterizing the users

In this section we analyse the user behaviour on each of these platforms. The numbers of users on MA, SE and EN are approximately 888K, 346K and 350K, much larger than on the other platforms; MO, GD and CH have 84K, 122K and 88K users respectively. Table 3 shows basic statistics about these unique users: the percentage of users only posting questions, the percentage only posting answers, the percentage posting both questions and answers, and the percentage of users with at least one answer accepted by an asker.

Datasets | % users posting only questions | % users posting only answers | % users posting both Q&A | % users with at least one accepted answer
SE | 8.9 | 7.5 | 2 | 1.9
GD | 19 | 12.7 | 5.3 | 4.9
MA | 33 | 10 | 5 | 4
EN | 17 | 12.3 | 2.6 | 2.3
CH | 20 | 7 | 2.6 | 2.1
MO | 22.2 | 9.8 | 2.7 | 2.2
Table 3: Basic statistics of the user interaction.

Around 60%-70% of users remain inactive on all the platforms; inactive users neither post anything nor interact with anyone on the platform. The MA, SE and EN platforms have approximately 37.38%, 14.14% and 26.73% of users who have posted at least one question or one answer (active users); for MO, GD and CH, the percentages of active users are 29.38%, 26.47% and 25.29% respectively. According to Table 3, MA has the largest fraction of users (33%) posting only questions. In terms of writing answers, around 12% of users post only answers on the EN and GD platforms, more than on the other platforms. GD has the largest fraction of users (~5.3%) who have posted both questions and answers, and, once again, the largest fraction (~4.9%) with at least one accepted answer. Further, we conduct an in-depth analysis of the users who asked or answered questions in Q_top and Q_bottom respectively. Table 5 shows the numbers of askers and answerers and their overlaps for Q_top and Q_bottom. Below, we analyse various factors of the askers and answerers of the questions in Q_top and Q_bottom.

Reputation score of users: We separately observe the reputation scores of the askers and answerers of Q_top and Q_bottom questions. The asker is the user who asked the question on the portal and the answerer is the user who answered it. On MA, EN and SE, the average reputation scores of askers in Q_top are 559.07, 546.27 and 1001.96 respectively, while those of askers in Q_bottom are 705.52, 522.78 and 644.14 respectively. On MO, GD and CH, the average reputation scores of askers in Q_top are 585.71, 401.53 and 424.68 respectively, against 540.71, 307.14 and 561.10 in Q_bottom. For answerers, the average reputations in Q_top are 4063.39, 2147.42 and 3209.59 for MA, EN and SE respectively, and those in Q_bottom are 2001.24, 1622.79 and 1846.81 respectively. For MO, GD and CH, the average reputations of answerers in Q_top are 4017.05, 1117.52 and 1948.33 respectively, while those in Q_bottom are 1965.42, 552.84 and 1397.38 respectively. We observe three things here – (i) across all platforms, answerer reputation scores are consistently higher for Q_top questions, and asker reputations are typically higher as well; (ii) on the popular platforms the reputation scores of both askers and answerers are generally higher; and (iii) answerer reputations are several times higher than those of the askers.

Asker-Answerer graph (AAG): To understand the user interaction patterns on a platform, we build an undirected graph whose nodes are the users, with an edge between two users if one (the answerer) answers a question posted by the other (the asker). For every platform, we build two such AAGs, for the users who asked or answered the questions in Q_top and Q_bottom respectively (see Table 4). We then calculate the average degree and the average clustering coefficient of the whole network. In Figure 1 (e), we observe that the average degree of the Q_top AAG is higher than that of the Q_bottom AAG for all platforms except CH. For all the platforms, the average clustering coefficient of the Q_top AAG is higher than that of the Q_bottom AAG (see Figure 1 (d)). This indicates a higher density of interactions among askers and answerers in the Q_top AAG (exceptionally so on the SE, GD and MO platforms).
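A minimal networkx sketch of this construction and of the statistics reported in Figure 1 (d, e) and Table 4; extracting the (asker, answerer) pairs from the data dump is assumed to be done already.

import networkx as nx

def build_aag(pairs):
    # pairs: iterable of (asker_id, answerer_id) tuples, one per answered question.
    g = nx.Graph()
    g.add_edges_from(pairs)
    return g

def aag_stats(g: nx.Graph):
    n = g.number_of_nodes()
    avg_degree = 2 * g.number_of_edges() / n           # Figure 1 (e)
    avg_clustering = nx.average_clustering(g)          # Figure 1 (d)
    giant = max(nx.connected_components(g), key=len)   # Table 4, last column
    return avg_degree, avg_clustering, len(giant) / n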

Datasets | # nodes | # edges | fraction of nodes in largest connected component
SE (Q_top, Q_bottom) | 10319, 12756 | 11179, 11234 | 0.73, 0.53
GD (Q_top, Q_bottom) | 8011, 9453 | 8730, 8547 | 0.72, 0.42
MA (Q_top, Q_bottom) | 106948, 110284 | 218119, 210865 | 0.92, 0.865
EN (Q_top, Q_bottom) | 19609, 20672 | 21499, 21800 | 0.702, 0.71
CH (Q_top, Q_bottom) | 5350, 5336 | 5953, 6069 | 0.79, 0.73
MO (Q_top, Q_bottom) | 5093, 6041 | 5966, 6096 | 0.87, 0.69
Table 4: Comparison of network properties of the Q_top & Q_bottom AAGs.
Datasets | # askers in Q_top | # answerers in Q_top | # askers in Q_bottom | # answerers in Q_bottom | overlap of askers | overlap of answerers
SE | 7697 | 2861 | 8933 | 4701 | 797 | 878
GD | 5959 | 2249 | 6417 | 4941 | 843 | 1905
MA | 89557 | 14072 | 88772 | 39909 | 6891 | 18397
EN | 14533 | 4542 | 15564 | 6349 | 1148 | 1241
CH | 4408 | 1180 | 4021 | 1860 | 427 | 545
MO | 4497 | 762 | 4724 | 1789 | 256 | 472
Table 5: Asker and answerer behaviour in Q_top & Q_bottom.

Cross platform users: We find around 2739 users who are present on all six platforms. The fraction of these common users who posted at least one question is around 29%-30% for MA and EN and 18% for SE; for the CH, GD and MO platforms this fraction is comparatively lower (≤ 15%). The observations are similar for the number of common users posting an answer – the popular platforms have far larger fractions compared to the less popular ones. Next we compute the number of common users who asked/answered a question in Q_top and Q_bottom. The percentages of common users asking one or more questions on MA, EN and SE are 19.82%, 17.30% and 9.71% in Q_top and 20.0%, 16.24% and 9.49% in Q_bottom respectively. The percentages of common users on these three platforms who answered one or more questions in Q_top are 9.3%, 8.72% and 6.17% respectively; in Q_bottom these percentages are 12.92%, 7.7% and 6.4% respectively. For MO, GD and CH, the percentages of common users acting as askers in Q_top are 7.84%, 6.38% and 5.69% respectively, and in Q_bottom 8.06%, 6.09% and 5.69% respectively. For these three platforms, the percentages of common users acting as answerers in Q_top are 2.7%, 4.3% and 1.8% respectively, and in Q_bottom 4.2%, 4.9% and 2.1% respectively.
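These overlap computations reduce to set intersections; a short sketch under an assumed schema (platform name mapped to a set of user ids):

def common_user_fraction(platform_users: dict, group: set) -> float:
    # Users present on every platform who also appear in 'group'
    # (e.g. the askers of Q_top questions on one platform).
    common = set.intersection(*platform_users.values())
    return len(common & group) / len(common)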

Engagement of common users: To investigate the involvement of the common users (present in Q_top and Q_bottom as answerers) in the overall platform, we compute the average number of questions/answers posted by them at an aggregate level. For MA, EN and SE, the average numbers of questions posted by common users in Q_top are 22.68, 8.5 and 3.04, and in Q_bottom 21.60, 8.74 and 3.57 respectively. For these three platforms, the average numbers of answers posted by common users in Q_top are 228.83, 47.32 and 61.01, and in Q_bottom 166.7, 52.20 and 56.74 respectively. For MO, GD and CH, the average numbers of questions posted by common users in Q_top are 11.56, 5.1 and 5.61 respectively, and in Q_bottom 9.02, 5.45 and 7.55 respectively. For these three platforms, the average numbers of answers posted by common users in Q_top are 49.81, 24.74 and 15.55 respectively, and in Q_bottom 35.11, 21.43 and 14.22 respectively. Thus (i) on most platforms, common users engage more, especially in answering, in Q_top than in Q_bottom, and (ii) quite interestingly, the common users post far more answers than questions in both Q_top and Q_bottom, which indicates that they are more engaged in answering than in asking.

4 Question Category Prediction

4.1 Classification model

We consider the previously discussed properties of the questions as features and predict whether a question will belong to Q_top or Q_bottom. The features are – the length of the question body, the Flesch reading ease, the Coleman-Liau index, the automated readability index, the number of tags, the reputation of the asker, and the clustering coefficient of the asker in the asker-answerer graph. For every dataset, we randomly divide the data into three parts – training (70%), validation (10%) and test (20%); the number of instances in each split is given in Table 6, and a sketch of this setup follows the table. For this experiment, we use four standard machine learning classifiers – (a) logistic regression (LR), (b) support vector classifier (SVC), (c) random forest (RF) and (d) XGBoost. For evaluation, we use the overall precision, recall and macro F1 score.

Datasets | Train size | Valid size | Test size
SE | 15848 | 2268 | 4525
GD | 12660 | 1812 | 3615
MA | 325153 | 46543 | 92809
EN | 31119 | 4454 | 8884
CH | 8844 | 1266 | 2525
MO | 8875 | 1270 | 2534
Table 6: Number of instances present in training, validation and testing for all the datasets.
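A sketch of the feature matrix and the 70/10/20 split referenced above; the DataFrame column names are our assumptions, mirroring the feature list in Section 4.1.

from sklearn.model_selection import train_test_split

FEATURES = ["body_length", "flesch_reading_ease", "coleman_liau_index",
            "automated_readability_index", "num_tags",
            "asker_reputation", "asker_clustering_coef"]

def make_splits(df, seed=42):
    # label: 1 for Q_top, 0 for Q_bottom.
    X, y = df[FEATURES], df["label"]
    # Hold out 30%, then split it into 1/3 validation (10%) and 2/3 test (20%).
    X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.30, random_state=seed)
    X_va, X_te, y_va, y_te = train_test_split(X_rest, y_rest, test_size=2 / 3, random_state=seed)
    return (X_tr, y_tr), (X_va, y_va), (X_te, y_te)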

Parameter settings: The parameter settings for each model are given below; a sketch of the resulting search grids follows the per-model settings. For SVC, we consider values of the parameter C in the range 1-3 and the kernels poly, rbf and sigmoid. For the random forest classifier, the number of estimators ranges from 160 to 200 and the criterion is gini or entropy. For XGBoost, the number of estimators ranges from 160 to 200 and the learning rate from 0.1 to 0.3.

Logistic regression (LR): For all datasets, we found the default parameters to be the best.
Support vector classifier (SVC): For CH, the best value of C is 3 and the kernel is poly. For the other datasets, the best kernel is rbf; for EN and GD, C is set to 2 and 1 respectively, and for the remaining datasets C is 3.
Random forest (RF): For CH, EN, MO and MA, the best criterion is entropy; for GD and SE, it is gini. For MO, the number of estimators is 180; for the other datasets, it is 200.
XGBoost: For all the datasets, the best learning rate is 0.1. For CH, GD and MO, the number of estimators is 160; for EN and SE, it is 180; for MA, it is 200.
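Putting these settings together, a hedged sketch of the candidate grids and validation-based selection (the exact training code is not released with the paper):

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from xgboost import XGBClassifier

# Candidate grids reconstructed from the parameter ranges above.
CANDIDATES = {
    "LR": [LogisticRegression(max_iter=1000)],  # default parameters
    "SVC": [SVC(C=c, kernel=k)
            for c in (1, 2, 3) for k in ("poly", "rbf", "sigmoid")],
    "RF": [RandomForestClassifier(n_estimators=n, criterion=c)
           for n in (160, 180, 200) for c in ("gini", "entropy")],
    "XGBoost": [XGBClassifier(n_estimators=n, learning_rate=lr)
                for n in (160, 180, 200) for lr in (0.1, 0.2, 0.3)],
}

def select_best(models, X_tr, y_tr, X_va, y_va):
    # Pick the candidate with the highest macro F1 on the validation split.
    def score(m):
        return f1_score(y_va, m.fit(X_tr, y_tr).predict(X_va), average="macro")
    return max(models, key=score)

# Usage: best_rf = select_best(CANDIDATES["RF"], X_tr, y_tr, X_va, y_va)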

4.2 Results

Datasets | LR (P / R / macro F1) | SVC (P / R / macro F1) | RF (P / R / macro F1) | XGBoost (P / R / macro F1)
SE | 0.67 / 0.67 / 0.67 | 0.69 / 0.69 / 0.69 | 0.71 / 0.71 / 0.71 | 0.72 / 0.72 / 0.72
GD | 0.62 / 0.61 / 0.61 | 0.63 / 0.62 / 0.62 | 0.62 / 0.62 / 0.62 | 0.63 / 0.63 / 0.63
MA | 0.69 / 0.69 / 0.69 | 0.70 / 0.70 / 0.70 | 0.69 / 0.69 / 0.69 | 0.71 / 0.71 / 0.71
EN | 0.61 / 0.61 / 0.60 | 0.61 / 0.61 / 0.61 | 0.63 / 0.63 / 0.63 | 0.64 / 0.64 / 0.64
CH | 0.59 / 0.58 / 0.58 | 0.59 / 0.58 / 0.57 | 0.59 / 0.58 / 0.59 | 0.58 / 0.58 / 0.58
MO | 0.56 / 0.56 / 0.56 | 0.56 / 0.56 / 0.56 | 0.56 / 0.56 / 0.56 | 0.57 / 0.57 / 0.57
Table 7: Overall precision (P), recall (R) and macro F1 of the question classification models for all the datasets.

In Table 7, we present the overall precision, recall and macro F1 score for all the datasets. For SE, RF and XGBoost perform better than the other models. For GD, all the models perform similarly (0.61-0.63 macro F1). For MA, XGBoost performs best (0.71 macro F1). For EN, RF and XGBoost attain better F1 scores than the other models. For CH, RF performs slightly better (0.59 macro F1) than the other models. For MO, all the models attain similar performance (0.56-0.57 macro F1).

5 Conclusion

In this paper, we study question-answering trends across six diverse CQA platforms. We find that metadata, question structure and user interaction patterns all affect response time: shorter, clearer questions with fewer tags are answered faster, and high-reputation users engage more with quickly answered questions. These question and asker features can predict fast responses, and we use standard machine learning models to classify new questions based on them. In future work, we plan to incorporate these factors into deep learning models that predict response times for new questions using text, metadata and asker features.

References

  • Anderson et al. [2012] Ashton Anderson, Daniel Huttenlocher, Jon Kleinberg, and Jure Leskovec. Discovering value from community activity on focused question answering sites: A case study of stack overflow. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’12, page 850–858, New York, NY, USA, 2012. Association for Computing Machinery. ISBN 9781450314626. doi: 10.1145/2339530.2339665. URL https://doi.org/10.1145/2339530.2339665.
  • Bachschi et al. [2020] Timur Bachschi, Aniko Hannak, Florian Lemmerich, and Johannes Wachs. From asking to answering: Getting more involved on stack overflow, 2020.
  • Bhat et al. [2014] Vasudev Bhat, Adheesh Gokhale, Ravi Jadhav, Jagat Pudipeddi, and Leman Akoglu. Min(e)d your tags: Analysis of question response time in stackoverflow. In 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), pages 328–335, 2014. doi: 10.1109/ASONAM.2014.6921605.
  • Hazra et al. [2021] Rima Hazra, Hardik Aggarwal, Pawan Goyal, Animesh Mukherjee, and Soumen Chakrabarti. Joint autoregressive and graph models for software and developer social networks. In Djoerd Hiemstra, Marie-Francine Moens, Josiane Mothe, Raffaele Perego, Martin Potthast, and Fabrizio Sebastiani, editors, Advances in Information Retrieval, pages 224–237, Cham, 2021. Springer International Publishing. ISBN 978-3-030-72113-8.
  • Hazra et al. [2023a] Rima Hazra, Arpit Dwivedi, and Animesh Mukherjee. Is this bug severe? a text-cum-graph based model for bug severity prediction. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2022, Grenoble, France, September 19–23, 2022, Proceedings, Part VI, page 236–252, Berlin, Heidelberg, 2023a. Springer-Verlag. ISBN 978-3-031-26421-4. doi: 10.1007/978-3-031-26422-1_15. URL https://doi.org/10.1007/978-3-031-26422-1_15.
  • Hazra et al. [2023b] Rima Hazra, Debanjan Saha, Amruit Sahoo, Somnath Banerjee, and Animesh Mukherjee. Duplicate question retrieval and confirmation time prediction in software communities. CoRR, abs/2309.05035, 2023b. doi: 10.48550/ARXIV.2309.05035. URL https://doi.org/10.48550/arXiv.2309.05035.
  • Mondal et al. [2021] Saikat Mondal, C M Khaled Saifullah, Avijit Bhattacharjee, Mohammad Masudur Rahman, and Chanchal K. Roy. Early detection and guidelines to improve unanswered questions on stack overflow. In 14th Innovations in Software Engineering Conference (Formerly Known as India Software Engineering Conference), ISEC 2021, New York, NY, USA, 2021. Association for Computing Machinery. ISBN 9781450390460. doi: 10.1145/3452383.3452392. URL https://doi.org/10.1145/3452383.3452392.
  • Moutidis and Williams [2021] Iraklis Moutidis and Hywel TP Williams. Community evolution on stack overflow. PLOS ONE, 16(6):e0253010, 2021.
  • Wang et al. [2013] Shaowei Wang, David Lo, and Lingxiao Jiang. An empirical study on developer interactions in stackoverflow. In Proceedings of the 28th Annual ACM Symposium on Applied Computing, SAC ’13, page 1019–1024, New York, NY, USA, 2013. Association for Computing Machinery. ISBN 9781450316569. doi: 10.1145/2480362.2480557. URL https://doi.org/10.1145/2480362.2480557.
  • Wang et al. [2018a] Shaowei Wang, Tse-Hsun Chen, and Ahmed E. Hassan. Understanding the factors for fast answers in technical q&a websites: An empirical study of four stack exchange websites. In 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), pages 884–884, 2018a. doi: 10.1145/3180155.3182521.
  • Wang et al. [2018b] Shaowei Wang, Tse-Hsun Chen, and Ahmed E. Hassan. Understanding the factors for fast answers in technical q&a websites: An empirical study of four stack exchange websites. In 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), pages 884–884, 2018b. doi: 10.1145/3180155.3182521.
  • Yazdaninia et al. [2021] Mohamad Yazdaninia, David Lo, and Ashkan Sami. Characterization and prediction of questions without accepted answers on stack overflow, 2021.
  • Zhang et al. [2021a] H. Zhang, S. Wang, T. Chen, Y. Zou, and A. E. Hassan. An empirical study of obsolete answers on stack overflow. IEEE Transactions on Software Engineering, 47(04):850–862, apr 2021a. ISSN 1939-3520. doi: 10.1109/TSE.2019.2906315.
  • Zhang et al. [2021b] Haoxiang Zhang, Shaowei Wang, Tse-Hsun (Peter) Chen, and Ahmed E. Hassan. Are comments on stack overflow well organized for easy retrieval by developers? ACM Trans. Softw. Eng. Methodol., 30(2), feb 2021b. ISSN 1049-331X. doi: 10.1145/3434279. URL https://doi.org/10.1145/3434279.