
Enhancing Healthcare through Large Language Models: A Study on Medical Question Answering

Haoran Yu Independent Researcher
San Jose, USA
[email protected]
   Chang Yu Independent Researcher
Boston, USA
[email protected]
Zihan Wang Independent Researcher
San Jose, USA
[email protected]
   Dongxian Zou Independent Researcher
Mill Creek, USA
[email protected]
   Hao Qin Independent Researcher
Brea, USA
[email protected]
Abstract

In recent years, the application of Large Language Models (LLMs) in healthcare has shown significant promise in improving the accessibility and dissemination of medical knowledge. This paper presents a detailed study of various LLMs trained on the MedQuAD medical question-answering dataset, with a focus on identifying the most effective model for providing accurate medical information. Among the models tested, the Sentence-t5 combined with Mistral 7B demonstrated superior performance, achieving a precision score of 0.762. This model’s enhanced capabilities are attributed to its advanced pretraining techniques, robust architecture, and effective prompt construction methodologies. By leveraging these strengths, the Sentence-t5 + Mistral 7B model excels in understanding and generating precise medical answers. Our findings highlight the potential of integrating sophisticated LLMs in medical contexts to facilitate efficient and accurate medical knowledge retrieval, thus significantly enhancing patient education and support.

Index Terms:
Healthcare, Large Language Models, Natural Language Processing, Sentence-t5, Mistral 7B, Pretraining.

I Introduction

The intersection of healthcare and data science holds immense potential for improving patient outcomes and accessibility to medical information. In recent years, the advent of advanced artificial intelligence (AI) technologies, particularly Large Language Models (LLMs), has opened new avenues for enhancing healthcare services. LLMs, such as GPT-3, BERT, and their successors, have demonstrated remarkable capabilities in understanding and generating human-like text. These models can be trained on vast amounts of data to perform various natural language processing (NLP) tasks, including question answering, summarization, and text generation.

Healthcare is a domain where accurate and timely information is crucial. Patients often seek answers to their medical queries, which, if provided accurately, can alleviate concerns, enhance understanding, and guide them towards appropriate medical care. Traditional methods of patient education and information dissemination, while effective, can be time-consuming and resource-intensive. Here, LLMs can play a transformative role by providing instant, reliable answers to common medical questions, thus democratizing access to healthcare knowledge.

Despite their potential, the application of LLMs in healthcare is fraught with challenges. Medical language is complex, and the accuracy of information is paramount. Incorrect or misleading information can have serious consequences. Therefore, it is essential to ensure that these models are trained on high-quality datasets and fine-tuned to understand and generate precise medical information. Additionally, the models must be capable of handling the nuances and specificity of medical terminology and contexts.

This paper investigates the performance of several LLM configurations in processing and answering medical questions using the MedQuAD dataset, a comprehensive medical question-answer dataset. The primary objective is to identify the most effective model that can be deployed to assist patients in understanding their health conditions and treatments. We explore the training and fine-tuning of three models: Gemma 2b + LoRA, Phi-2, and Sentence-t5 + Mistral 7B. Each model undergoes a rigorous process of data preprocessing, prompt construction, and fine-tuning to optimize its performance.

Our study is motivated by the need to enhance patient education through scalable, AI-driven solutions. By leveraging LLMs, we aim to provide a tool that can deliver accurate medical information efficiently. This approach not only benefits patients but also supports healthcare professionals by reducing the burden of addressing routine queries, allowing them to focus on more complex and critical cases.

The contributions of this paper are threefold:

  • We present a detailed methodology for training and fine-tuning LLMs using the MedQuAD dataset.

  • We evaluate the performance of different model configurations, highlighting the strengths and limitations of each.

  • We demonstrate that the Sentence-t5 + Mistral 7B + Pretrain model achieves the highest precision, making it a promising candidate for real-world healthcare applications.

By addressing these aspects, we aim to provide insights into the effective deployment of LLMs in healthcare and pave the way for future research and development in this critical field. The findings of this study have the potential to significantly impact patient care and education, contributing to the broader goal of improving healthcare outcomes through technology.

II Related Work

The application of artificial intelligence in healthcare has been a subject of extensive research and development. Over the past decade, numerous studies have explored various AI-driven solutions for medical diagnosis, patient care, and information dissemination. Among these, the use of Large Language Models (LLMs) for processing and generating medical text has gained significant attention. This section reviews the relevant literature, highlighting key contributions and advancements in this field.

Devlin et al. [1] introduced BERT, a model that advanced NLP by modeling word context and improving performance on text-based tasks. X. Peng et al. [2] describe an NLP-based system that automates news production and integrates fact-checking to improve accuracy and reliability. Lee et al. [3] developed BioBERT, a model for biomedical text mining used in tasks such as named entity recognition and question answering. Q. Ning et al. [4] use machine learning to enhance microfluidic paper-based analytical devices (μPADs) for rapid and accurate CRP detection.

Huang et al. [5] introduced ClinicalBERT, enhancing the understanding and generation of clinical notes for better information retrieval and clinical decision support. Y. Cao et al. [6] developed a model for predicting ICU admissions of COVID-19 patients, demonstrating high accuracy and robustness. Li et al. [7] investigated the use of transformer models for biomedical named entity recognition, showing that models like BERT can significantly enhance the accuracy of identifying biomedical entities in text. A. Javanmardi et al. [8] find that proactive planning improves outcomes, but that neglecting reactive planning increases risk in construction meetings. M. Zhu et al. [9] introduce an ensemble framework using LightGBM, XGBoost, and LocalEnsemble to enhance credit default prediction accuracy.

Peng et al. [10] proposed Med-BERT, a BERT model adapted for medical tasks, improving medical entity recognition and text classification. Zhang et al. [11] showed that models pre-trained on large medical corpora outperform general language models in medical NLP tasks. H. Li et al. [12] introduce ET-DM, a model that uses diffusion and efficient Transformers to improve text-to-image synthesis. C. He et al. [13] integrate a graph neural network and an ontology to improve accuracy in predicting bridge preservation activities. Y. Zhang et al. [14] introduce a deep learning model for automated GI tract segmentation in MRI scans.

J. Root et al. [15] explored the use of LLMs in automated medical coding, showing that these models can significantly reduce the time and effort required for coding clinical notes. T. Liu et al. [16] demonstrated the potential of GPT-3 in generating medical literature summaries, highlighting its ability to understand and synthesize complex medical information. B. Zhang et al. [17] survey the latest NLP applications in text sentiment analysis, emphasizing its role in enterprise decision-making and public opinion monitoring. C. He et al. [18] prioritize collaborative scheduling practices that improve construction project performance. R. Liu et al. [19] present a k-means clustering-enhanced SVM algorithm for classifying flying and mobile robots.

Kalyan et al. [20] introduced BERT-based models fine-tuned on specific medical tasks, such as disease prediction and drug recommendation, showcasing the versatility of LLMs in healthcare applications. H. Yan et al. [21] discuss the critical role of natural language processing in enhancing data mining and information retrieval in the big data era. A. Javanmardi et al. [22] optimize the OPDCA cycle to improve workflow reliability in construction projects. A. Langedijk et al. [23] explored the use of GPT-3 for generating patient discharge summaries, highlighting its potential to automate and streamline clinical documentation processes. Y. Zhang et al. [24] evaluate Monte Carlo Tree Search performance on CPUs and GPUs for game strategy simulation.

Mullenbach et al. [25] developed CAML, a convolutional attention-based model for medical text classification that has been widely used in clinical NLP tasks. Y. Xia et al. [26] present a decision-making framework using multi-modal perception and deep reinforcement learning to optimize autonomous driving. M. Bonilla et al. [27] propose a need-based method for equitable road maintenance funding that accounts for seasonal population changes. Li et al. [28] presented a study on the application of transformer models for detecting adverse drug events from clinical text, demonstrating the models' effectiveness in identifying critical health information. A. Zhu et al. [29] present VRNet, which enhances human pose estimation from monocular RGB-D images using pseudo multi-view representations and additional RGB datasets. The advanced pretraining techniques of the Sentence-t5 combined with Mistral 7B model are inspired by the methodologies detailed in [30].

These studies collectively underscore the transformative potential of LLMs in healthcare, highlighting various approaches to model training and fine-tuning that improve tasks such as medical question answering and clinical document generation. Building on these advancements, our study leverages insights from previous efforts to enhance the precision and applicability of LLMs using the MedQuAD dataset and innovative model configurations. In conclusion, this body of work lays the foundation for pushing the boundaries of LLM capabilities in medical question-answering and patient education.

III Dataset and Preprocessing

The dataset utilized in this study is derived from MedQuAD, a comprehensive medical question-answering dataset. MedQuAD is particularly advantageous for training models designed to provide accurate responses to medical inquiries, as it encompasses a wide range of common medical questions and corresponding answers. Additionally, it includes various types of medical educational content, which further enriches the training material.

III-A Data Characteristics

The MedQuAD dataset comprises text-based question-answer pairs, where each pair includes a medical question and its corresponding answer. The structured nature of this dataset ensures comprehensive coverage of prevalent medical topics, rendering it an ideal resource for training large language models (LLMs) tailored for healthcare applications.

III-B Data Preprocessing

Effective data preprocessing is critical in transforming the raw dataset into a format suitable for model training. The preprocessing pipeline encompasses several essential stages:

III-B1 Data Cleaning

The initial step involves cleaning the dataset to eliminate any irrelevant or redundant information. This process includes:

  • Removing duplicate question-answer pairs.

  • Filtering out incomplete records.

  • Ensuring consistency in question and answer formats to maintain data integrity.

III-B2 Data Parsing

Following data cleaning, the dataset is parsed into a structured format. Each question-answer pair is converted into a standardized template to ensure uniformity. This involves formatting each pair into a predefined structure that distinctly separates the question from the answer.

III-B3 Template Formatting

The parsed data is formatted into a structured template. Each question-answer pair is mapped into a predefined format that emphasizes the separation of the question from the answer.

Template:   "Question: {question} ; Answer: {answer}"   (1)
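As a minimal sketch, the mapping from a raw record into this template might look as follows in Python (the `question` and `answer` field names are assumptions about the record structure):

```python
def format_pair(record: dict) -> str:
    """Map a raw QA record into the paper's template:
    'Question: <question> ; Answer: <answer>'."""
    question = record["question"].strip()
    answer = record["answer"].strip()
    return f"Question: {question} ; Answer: {answer}"

# Example usage on a single MedQuAD-style record.
sample = {"question": "What is glaucoma?",
          "answer": "Glaucoma is a group of eye diseases that damage the optic nerve."}
print(format_pair(sample))
# Question: What is glaucoma? ; Answer: Glaucoma is a group of ...
```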

III-B4 Tokenization

Tokenization is the process of converting textual data into tokens that can be processed by the model. For this study, the SentencePiece tokenizer was employed due to its efficiency and compatibility with various LLM architectures. Tokenization involves segmenting the text into subwords or tokens, which the model then utilizes as input.
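As an illustrative sketch, encoding text with a trained SentencePiece model could look like this (the model file path is a placeholder):

```python
import sentencepiece as spm

# Load a trained SentencePiece model (the file path is illustrative).
sp = spm.SentencePieceProcessor(model_file="tokenizer.model")

text = "Question: What is glaucoma? ; Answer: Glaucoma is a group of eye diseases."
pieces = sp.encode(text, out_type=str)   # subword strings
ids = sp.encode(text, out_type=int)      # integer token ids fed to the model
print(pieces[:8], ids[:8])
```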

III-B5 Data Augmentation

To enhance the diversity and robustness of the training dataset, several data augmentation techniques were applied:

  • Synonym Replacement: This technique involves replacing specific words in the questions with their synonyms, generating multiple variants of the same question to enrich the training set.

  • Back Translation: Questions are translated into another language and then back into English, producing paraphrased versions of the original questions. This method helps in diversifying the dataset.

These techniques are instrumental in expanding the dataset and preventing the model from overfitting to specific question phrasings.
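The sketch below illustrates both techniques; `get_synonyms`, `to_fr`, and `to_en` are hypothetical helpers (synonyms might come from WordNet, translation from any English-French machine translation pair):

```python
import random

def synonym_replace(question: str, get_synonyms, n: int = 1) -> str:
    """Replace up to n words with randomly chosen synonyms.
    `get_synonyms` is a hypothetical lookup, e.g. backed by WordNet."""
    words = question.split()
    candidates = [i for i, w in enumerate(words) if get_synonyms(w)]
    for i in random.sample(candidates, min(n, len(candidates))):
        words[i] = random.choice(get_synonyms(words[i]))
    return " ".join(words)

def back_translate(question: str, to_fr, to_en) -> str:
    """Paraphrase by translating en -> fr -> en.
    `to_fr` / `to_en` are hypothetical translation callables."""
    return to_en(to_fr(question))
```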

III-B6 Handling Class Imbalance

Medical datasets frequently exhibit class imbalance, where certain types of medical questions are more prevalent than others. To address this issue, oversampling techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) were employed. SMOTE generates synthetic samples for underrepresented classes, ensuring a more balanced dataset. This approach is critical for training models that are unbiased and perform well across all classes of questions.
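Because SMOTE interpolates between numeric feature vectors, the questions must first be represented numerically; a plausible setup uses sentence embeddings as features. A minimal sketch with imbalanced-learn on synthetic data:

```python
import numpy as np
from imblearn.over_sampling import SMOTE

# Illustrative setup: dense question embeddings with imbalanced categories.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 384))      # e.g., sentence embeddings per question
y = np.array([0] * 100 + [1] * 20)   # minority question class underrepresented

smote = SMOTE(random_state=42)
X_bal, y_bal = smote.fit_resample(X, y)
print(np.bincount(y_bal))            # [100 100] after synthetic oversampling
```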

By meticulously processing and augmenting the data, the dataset is transformed into a high-quality resource that is well-suited for training large language models. This comprehensive preprocessing pipeline is essential for enabling the models to learn effectively and perform accurately in answering medical questions.

IV Methodology

This section details the methodology employed in developing and training the models used in this study. Specifically, we focus on the Sentence-t5 combined with Mistral 7B model, which demonstrated superior performance. The methodology includes model architecture, training procedures, and specific techniques applied to enhance model performance.

IV-A Model Architectures

The architecture of the Sentence-t5 combined with Mistral 7B model leverages the strengths of both models to enhance performance. The overall architecture is shown in Fig. 1.

Figure 1: Overall architecture of the Sentence-t5 combined with Mistral 7B model.

IV-A1 Sentence-t5 Model

The Sentence-t5 model is a transformer-based architecture optimized for sequence classification tasks. It is pre-trained on a variety of sentence-level tasks, making it highly suitable for understanding and generating coherent responses to medical queries. The pre-training involves a masked language model (MLM) objective where tokens in a sentence are randomly masked, and the model is trained to predict these tokens.

Masked Language Model Objective
\mathcal{L}_{MLM} = -\sum_{i=1}^{N} \log P(w_i \mid w_{\backslash i})   (2)

where $P(w_i \mid w_{\backslash i})$ is the probability of the masked token given the context of the other tokens in the sentence. This objective helps the model learn the contextual relationships between words, enhancing its ability to generate meaningful responses.
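In implementation terms, Eq. (2) reduces to a cross-entropy computed only over the masked positions. A minimal PyTorch sketch with illustrative shapes:

```python
import torch
import torch.nn.functional as F

batch, seq_len, vocab_size = 2, 16, 32000
logits = torch.randn(batch, seq_len, vocab_size)          # stand-in model outputs
labels = torch.randint(0, vocab_size, (batch, seq_len))   # original token ids
labels[:, ::2] = -100   # -100 marks unmasked positions, excluded from the loss

# Cross-entropy over masked positions only, matching Eq. (2).
mlm_loss = F.cross_entropy(logits.view(-1, vocab_size), labels.view(-1),
                           ignore_index=-100)
```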

Key features of the Sentence-t5 model include:

  • Transformer Layers: Utilizes multiple transformer layers to capture complex language dependencies.

  • Pretrained Embeddings: Incorporates embeddings pretrained on large-scale text corpora to provide rich contextual understanding.

  • Sequence Classification Head: Designed to output sequence-level predictions, making it suitable for generating contextually relevant prompts.

IV-A2 Mistral 7B Model

The Mistral 7B model is a large-scale transformer model designed to generate text by predicting the next word in a sequence. The fine-tuning process involves adjusting the model parameters to better handle the specific structure and vocabulary of medical texts.

Next Word Prediction Objective
\mathcal{L}_{NWP} = -\sum_{t=1}^{T} \log P(w_t \mid w_{<t})   (3)

where $P(w_t \mid w_{<t})$ is the probability of the next word given the context of the previous words in the sequence. This objective ensures that the model can generate fluent and contextually appropriate text.
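Eq. (3) is typically computed by shifting the sequence one position, so that the logits at position $t$ are scored against the token at $t+1$. A minimal sketch:

```python
import torch
import torch.nn.functional as F

vocab_size = 32000
input_ids = torch.randint(0, vocab_size, (1, 32))
logits = torch.randn(1, 32, vocab_size)   # stand-in for model(input_ids).logits

# Position t predicts token t+1: drop the last logit, shift labels left.
shift_logits = logits[:, :-1, :].reshape(-1, vocab_size)
shift_labels = input_ids[:, 1:].reshape(-1)
nwp_loss = F.cross_entropy(shift_logits, shift_labels)   # mean of Eq. (3) terms
```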

Key features of the Mistral 7B model include:

  • Large-scale Transformer Architecture: Comprising 7 billion parameters, enabling it to capture intricate language patterns and dependencies.

  • Next-token Prediction: Optimized for next-token prediction tasks, which is crucial for generating coherent and contextually appropriate text.

  • Robust Training Data: Pretrained on diverse datasets, including large-scale NLP and domain-specific corpora, enhancing its generalization capabilities.

IV-B Training Procedures

The training procedures for the Sentence-t5 combined with Mistral 7B model involve multiple stages, including initial pretraining, prompt generation, secondary pretraining, and fine-tuning.

IV-B1 Initial Pretraining

The Sentence-t5 model undergoes initial pretraining on a diverse corpus to familiarize it with various linguistic patterns and structures. This pretraining phase involves:

  • Language Modeling: Training the model on a large corpus to predict masked tokens, thereby learning contextual representations of words and phrases.

  • Sequence Classification Tasks: Fine-tuning the model on sequence classification tasks to enhance its ability to generate accurate prompts.

IV-B2 Prompt Generation

Once pretrained, the Sentence-t5 model generates prompts based on the medical question-answer pairs from the MedQuAD dataset. This involves:

  • Contextual Understanding: Leveraging the model’s pretrained knowledge to generate prompts that accurately reflect the context of the medical questions.

  • Template Formatting: Structuring the prompts in a predefined template to ensure consistency and clarity.

IV-B3 Secondary Pretraining with Mistral 7B

The generated prompts are then fed into the Mistral 7B model, which undergoes further pretraining. This stage refines the model’s understanding and enhances its ability to produce accurate medical answers. Key aspects of this phase include:

  • Fine-tuning on Medical Data: Training the model on the MedQuAD dataset to adapt it to domain-specific language and content.

  • Next-token Prediction: Optimizing the model for next-token prediction tasks, which is essential for generating coherent and contextually appropriate answers.

During this phase, perplexity is used as an intermediate metric to assess the model’s performance in generating text. Perplexity is defined as:

\text{Perplexity} = \exp\left(-\frac{1}{T}\sum_{t=1}^{T} \log P(w_t \mid w_{<t})\right)   (4)

A lower perplexity indicates better performance in predicting the next word in a sequence, thus ensuring the generated text is fluent and contextually relevant.
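Since Eq. (4) is simply the exponential of the mean negative log-likelihood per token, it can be read directly off the training loss; a one-line sketch:

```python
import torch

# Mean negative log-likelihood per token, e.g. the nwp_loss above
# (cross-entropy already averages -log P(w_t | w_<t) over the T positions).
mean_nll = torch.tensor(2.31)        # illustrative loss value
perplexity = torch.exp(mean_nll)     # Eq. (4); lower is better
print(perplexity.item())             # ~10.07
```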

IV-B4 Fine-Tuning

The combined model undergoes fine-tuning on the MedQuAD dataset to optimize performance metrics, particularly precision in answering medical questions. The fine-tuning process includes the following (a sketch of a single training step appears after this list):

  • Hyperparameter Optimization: Adjusting learning rates, batch sizes, and other hyperparameters to maximize model performance.

  • Loss Function: Utilizing cross-entropy loss for training, which is suitable for classification tasks. The cross-entropy loss can be defined as:

    \mathcal{L}_{CE} = -\sum_{i=1}^{N} y_i \log \hat{y}_i   (5)

    where $y_i$ is the true label and $\hat{y}_i$ is the predicted probability.

  • Evaluation Metrics: Monitoring performance metrics such as precision to ensure the model meets desired performance criteria.
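A minimal sketch of one such training step, assuming a Hugging Face-style causal language model that returns the cross-entropy loss of Eq. (5) when labels are provided (the learning rate shown is illustrative, not the tuned value):

```python
import torch

def fine_tune_step(model, batch, optimizer):
    """One fine-tuning step on a batch of templated prompts.
    Hugging Face-style models compute the cross-entropy loss of
    Eq. (5) internally when `labels` are supplied."""
    outputs = model(input_ids=batch["input_ids"],
                    attention_mask=batch["attention_mask"],
                    labels=batch["labels"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()

# Illustrative optimizer setup; actual hyperparameters were tuned:
# optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
```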

IV-C Detailed Model Configuration and Training

Pretraining with Sentence-t5

Generate prompts using the pre-trained Sentence-t5 model and initialize the tokenizer:

prompts = generate_prompts(sentence_t5, dataset)   (6)

This step leverages the pre-trained Sentence-t5 model to generate high-quality prompts that are used to fine-tune the Mistral 7B model.

Fine-Tuning with Mistral 7B

Format the prompts for Mistral 7B and tokenize:

ids, mask = tokenizer(prompts, return_tensors="pt")   (7)

Train the model using the generated tokens:

outputs = model.generate(input_ids, max_length=100)   (8)

Fine-tuning involves adjusting the parameters of the Mistral 7B model using the generated prompts, optimizing it for the specific task of medical question answering.
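Under the assumption that Sentence-t5 is accessed through the sentence-transformers library and Mistral 7B through Hugging Face Transformers (public checkpoint names shown; the exact prompt-construction logic is simplified here), steps (6)-(8) might look like:

```python
from sentence_transformers import SentenceTransformer
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sentence-t5 encodes the question; the embedding could, for example, be used
# to retrieve related context for the prompt (this step is an assumption).
encoder = SentenceTransformer("sentence-transformers/sentence-t5-base")
question = "What are the symptoms of glaucoma?"
q_emb = encoder.encode(question)              # 768-dim sentence embedding
prompt = f"Question: {question} ; Answer:"    # template from Eq. (1)

# Mistral 7B tokenizes the prompt (Eq. 7) and generates an answer (Eq. 8).
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_length=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```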

IV-D Additional Techniques and Tricks

Several additional techniques and tricks were employed to enhance the performance and robustness of the models:

IV-D1 Learning Rate Scheduling

Dynamic learning rate scheduling was applied to adjust the learning rate during training, helping the model converge more efficiently. Techniques such as cosine annealing and learning rate warm-up were used. The learning rate schedule can be represented as:

\text{lr}(t) = \text{init\_lr} \times \left(1 - \frac{t}{\text{total\_steps}}\right)   (9)
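A sketch of this schedule with a linear warm-up, using PyTorch's LambdaLR (step counts and rates are illustrative):

```python
import torch

model = torch.nn.Linear(10, 10)                          # stand-in model
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
total_steps, warmup_steps = 10_000, 500                  # illustrative values

def lr_lambda(step):
    # Linear warm-up, then the linear decay of Eq. (9).
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, 1.0 - step / total_steps)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
# scheduler.step() is called once after each optimizer.step()
```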

IV-D2 Gradient Clipping

Gradient clipping was implemented to prevent exploding gradients during training, ensuring stable and efficient model convergence. Gradient clipping can be expressed as:

g_i = \frac{g_i}{\max\left(1, \frac{\|g_i\|}{c}\right)}   (10)

where $g_i$ is the gradient and $c$ is the clipping threshold.
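In PyTorch, Eq. (10) corresponds to a single utility call after the backward pass; a minimal sketch with a stand-in model:

```python
import torch

model = torch.nn.Linear(10, 10)                 # stand-in model
loss = model(torch.randn(4, 10)).sum()          # dummy forward pass
loss.backward()

c = 1.0  # clipping threshold c in Eq. (10); the value is illustrative
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=c)
# optimizer.step() would then apply the rescaled gradients
```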

IV-D3 Regularization Techniques

Regularization techniques, including dropout and weight decay, were applied to prevent overfitting and enhance the model’s ability to generalize to unseen data. Dropout can be defined as:

h_i = \frac{h_i}{p}   (11)

where $h_i$ is the activation and $p$ is the keep probability; with inverted dropout, retained activations are rescaled by $1/p$ so that expected activations match at test time.
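A sketch of both regularizers in PyTorch (values are illustrative; note that torch.nn.Dropout takes the drop probability, whereas $p$ in Eq. (11) is the keep probability):

```python
import torch

block = torch.nn.Sequential(
    torch.nn.Linear(768, 768),
    torch.nn.Dropout(p=0.1),   # zeroes 10% of activations during training,
)                              # rescaling the rest by 1/(1 - 0.1)
# Weight decay adds an L2 penalty on the parameters via the optimizer.
optimizer = torch.optim.AdamW(block.parameters(), lr=2e-5, weight_decay=0.01)
```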

By meticulously applying these training procedures and techniques, the Sentence-t5 combined with Mistral 7B model was effectively trained to achieve superior performance in answering medical questions, demonstrating its potential for practical healthcare applications.

IV-E Model Evaluation

In this study, the primary metric used to evaluate the performance of the models is precision. Precision is defined as the ratio of correctly predicted positive observations to the total predicted positives. It is a crucial metric in scenarios where the cost of false positives is high. In medical question answering, providing incorrect information can have significant negative implications, potentially leading to misinformed medical decisions and adverse health outcomes. Therefore, ensuring that the answers generated by the model are accurate and relevant is of utmost importance.

\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}   (12)

Using precision as the primary evaluation metric allows us to focus on the model’s ability to provide accurate and relevant answers to medical queries, minimizing the risk of incorrect or misleading information.
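As a small illustration, precision over binary correct/incorrect judgments of generated answers can be computed with scikit-learn:

```python
from sklearn.metrics import precision_score

# Illustrative binary judgments: 1 = answer judged correct and relevant.
y_true = [1, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 1, 0, 0, 1, 0]
print(precision_score(y_true, y_pred))   # TP / (TP + FP), as in Eq. (12); 0.8
```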

V Experimental Results

The experiments evaluated the precision of various language models and configurations in processing healthcare-related questions. The models tested included variations of the T5 model, Phi-3, and Gemma-2b, each with different fine-tuning or pretraining methods such as LoRA. The precision results are presented in Table I.

TABLE I: Model Precision for Different Configurations
Model Configuration Precision
Sentence-T5 0.702
Phi-3 + LoRA 0.718
Gemma-2b + LoRA 0.721
Sentence-T5 + Mistral 7B + Pretrain 0.762

The Sentence-t5 combined with Mistral 7B model outperformed the other configurations evaluated in this study, namely Gemma-2b + LoRA, Phi-3 + LoRA, and the standalone Sentence-T5. Its higher precision score of 0.762 indicates that the combined model is more effective in generating accurate and relevant medical answers.

VI Conclusion

In conclusion, this study demonstrates the transformative potential of Large Language Models (LLMs) in healthcare by addressing the need for accurate and timely medical information. By training and fine-tuning models such as Gemma 2b + LoRA, Phi-2, and Sentence-t5 + Mistral 7B using the MedQuAD dataset, we identified the Sentence-t5 + Mistral 7B + Pretrain model as the most effective, achieving the highest precision in handling medical queries. Our methodology, which involved extensive data preprocessing, prompt construction, and model optimization, underscores the importance of specialized training for deploying LLMs in healthcare. This research highlights how AI-driven solutions can enhance patient access to reliable medical information, reduce the burden on healthcare professionals, and improve healthcare delivery. By integrating advanced AI technologies, we can significantly impact patient care, support healthcare professionals, and pave the way for future advancements in this critical field. The results of this study offer valuable insights into the effective deployment of LLMs, contributing to the broader goal of improving healthcare outcomes through technology.

References

  • [1] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.
  • [2] X. Peng, Q. Xu, Z. Feng, H. Zhao, L. Tan, Y. Zhou, Z. Zhang, C. Gong, and Y. Zheng, “Automatic news generation and fact-checking system based on language processing,” arXiv preprint arXiv:2405.10492, 2024.
  • [3] J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang, “BioBERT: a pre-trained biomedical language representation model for biomedical text mining,” Bioinformatics, vol. 36, no. 4, pp. 1234–1240, 2020.
  • [4] Q. Ning, W. Zheng, H. Xu, A. Zhu, T. Li, Y. Cheng, S. Feng, L. Wang, D. Cui, and K. Wang, “Rapid segmentation and sensitive analysis of CRP with paper-based microfluidic device using machine learning,” Analytical and Bioanalytical Chemistry, vol. 414, no. 13, pp. 3959–3970, 2022.
  • [5] K. Huang, J. Altosaar, and R. Ranganath, “ClinicalBERT: Modeling clinical notes and predicting hospital readmission,” arXiv preprint arXiv:1904.05342, 2019.
  • [6] Y. Cao, P. Cao, H. Chen, K. M. Kochendorfer, A. B. Trotter, W. L. Galanter, P. M. Arnold, and R. K. Iyer, “Predicting ICU admissions for hospitalized COVID-19 patients with a factor graph-based model,” in Multimodal AI in Healthcare: A Paradigm Shift in Health Intelligence. Springer, 2022, pp. 245–256.
  • [7] F. Li, Y. Jin, W. Liu, B. P. S. Rawat, P. Cai, H. Yu et al., “Fine-tuning bidirectional encoder representations from transformers (BERT)-based models on large-scale electronic health record notes: an empirical study,” JMIR Medical Informatics, vol. 7, no. 3, p. e14830, 2019.
  • [8] A. Javanmardi, M. Liu, C. He, S. M. Hsiang, and A. Abbasian-Hosseini, “Improving construction meeting effectiveness: Trade-offs between reactive and proactive site-level planning discussions,” Journal of Management in Engineering, vol. 40, no. 5, p. 04024029, 2024.
  • [9] M. Zhu, Y. Zhang, Y. Gong, K. Xing, X. Yan, and J. Song, “Ensemble methodology: Innovations in credit default prediction using LightGBM, XGBoost, and LocalEnsemble,” arXiv preprint arXiv:2402.17979, 2024.
  • [10] Y. Peng, S. Yan, and Z. Lu, “Transfer learning in biomedical natural language processing: an evaluation of BERT and ELMo on ten benchmarking datasets,” arXiv preprint arXiv:1906.05474, 2019.
  • [11] Y. Li, P. Hu, Z. Liu, D. Peng, J. T. Zhou, and X. Peng, “Contrastive clustering,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 10, 2021, pp. 8547–8555.
  • [12] H. Li, F. Xu, and Z. Lin, “ET-DM: Text to image via diffusion model with efficient Transformer,” Displays, vol. 80, p. 102568, 2023.
  • [13] C. He, M. Liu, S. M. Hsiang, and N. Pierce, “Synthesizing ontology and graph neural network to unveil the implicit rules for US bridge preservation decisions,” Journal of Management in Engineering, vol. 40, no. 3, p. 04024007, 2024.
  • [14] Y. Zhang, Y. Gong, D. Cui, X. Li, and X. Shen, “DeepGI: An automated approach for gastrointestinal tract segmentation in MRI scans,” arXiv preprint arXiv:2401.15354, 2024.
  • [15] J. Root and D. S. Ahn, “Incentives and efficiency in constrained allocation mechanisms,” arXiv preprint arXiv:2006.06776, 2020.
  • [16] T. Liu, W. Tan, X. Tang, J. Chen, and D. Cao, “Adaptive energy management for real driving conditions via transfer reinforcement learning,” arXiv preprint arXiv:2007.12560, 2020.
  • [17] B. Zhang, J. Xiao, H. Yan, L. Yang, and P. Qu, “Review of NLP applications in the field of text sentiment analysis,” Journal of Industrial Engineering and Applied Science, vol. 2, no. 3, pp. 28–34, 2024.
  • [18] C. He, M. Liu, T. d. C. Alves, N. M. Scala, and S. M. Hsiang, “Prioritizing collaborative scheduling practices based on their impact on project performance,” Construction Management and Economics, vol. 40, no. 7-8, pp. 618–637, 2022.
  • [19] R. Liu, X. Xu, Y. Shen, A. Zhu, C. Yu, T. Chen, and Y. Zhang, “Enhanced detection classification via clustering SVM for various robot collaboration task,” arXiv preprint arXiv:2405.03026, 2024.
  • [20] K. Eldefrawy, M. Locasto, N. Rattanavipanon, and H. Saidi, “Towards automated augmentation and instrumentation of legacy cryptographic executables: extended version,” arXiv preprint arXiv:2004.09713, 2020.
  • [21] H. Yan, J. Xiao, B. Zhang, L. Yang, and P. Qu, “The application of natural language processing technology in the era of big data,” Journal of Industrial Engineering and Applied Science, vol. 2, no. 3, pp. 20–27, 2024.
  • [22] A. Javanmardi, C. He, S. M. Hsiang, S. A. Abbasian-Hosseini, and M. Liu, “Enhancing construction project workflow reliability through observe–plan–do–check–react cycle: A bridge project case study,” Buildings, vol. 13, no. 9, p. 2379, 2023.
  • [23] A. Langedijk, V. Dankers, P. Lippe, S. Bos, B. C. Guevara, H. Yannakoudakis, and E. Shutova, “Meta-learning for fast cross-lingual adaptation in dependency parsing,” arXiv preprint arXiv:2104.04736, 2021.
  • [24] Y. Zhang, M. Zhu, K. Gui, J. Yu, Y. Hao, and H. Sun, “Development and application of a Monte Carlo tree search algorithm for simulating Da Vinci Code game strategies,” arXiv preprint arXiv:2403.10720, 2024.
  • [25] J. Mullenbach, S. Wiegreffe, J. Duke, J. Sun, and J. Eisenstein, “Explainable prediction of medical codes from clinical text,” arXiv preprint arXiv:1802.05695, 2018.
  • [26] Y. Xia, S. Liu, Q. Yu, L. Deng, Y. Zhang, H. Su, and K. Zheng, “Parameterized decision-making with multi-modal perception for autonomous driving,” arXiv preprint arXiv:2312.11935, 2023.
  • [27] M. Bonilla, W. Rasdorf, M. Liu, M. Al-Ghandour, and C. He, “Inequity reduction in road maintenance funding for municipalities,” Public Works Management & Policy, vol. 28, no. 3, pp. 339–362, 2023.
  • [28] Q. Li, “A machine learning framework for analysis of resting-state EEG in patients,” 2022.
  • [29] A. Zhu, J. Li, and C. Lu, “Pseudo view representation learning for monocular RGB-D human pose and shape estimation,” IEEE Signal Processing Letters, vol. 29, pp. 712–716, 2021.
  • [30] Y. Sun and J. Ortiz, “Rapid review of generative AI in smart medical applications,” arXiv preprint arXiv:2406.06627, 2024.