Enhancing AI-Driven Psychological Consultation: Layered Prompts with Large Language Models
Abstract
Psychological consultation is essential for improving mental health and well-being, yet challenges such as the shortage of qualified professionals and scalability issues limit its accessibility. To address these challenges, we explore the use of large language models (LLMs) like GPT-4 to augment psychological consultation services. Our approach introduces a novel layered prompting system that dynamically adapts to user input, enabling comprehensive and relevant information gathering. We also develop empathy-driven and scenario-based prompts to enhance the LLM’s emotional intelligence and contextual understanding in therapeutic settings. We validated our approach through experiments using a newly collected dataset of psychological consultation dialogues, demonstrating significant improvements in response quality. The results highlight the potential of our prompt engineering techniques to enhance AI-driven psychological consultation, offering a scalable and accessible solution to meet the growing demand for mental health support.
Keywords:
Psychological Consultation, Large Language Models, Prompt Engineering, Empathy-Driven AI, Scenario-Based Prompts, Mental Health Support
1 Introduction
Psychological consultation is a crucial service aimed at improving mental health and well-being by providing individuals with professional guidance and support. The significance of psychological consultation lies in its ability to address a wide range of mental health issues, such as anxiety, depression, and stress, which are increasingly prevalent in today’s society [1]. Effective psychological consultation can lead to improved emotional resilience, better coping strategies, and enhanced quality of life for individuals facing mental health challenges [2].
Despite its importance, there are several challenges associated with psychological consultation. One of the primary challenges is the shortage of qualified mental health professionals, which limits the accessibility of these services, particularly in underserved regions [1]. Additionally, the traditional one-on-one consultation model is often not scalable, making it difficult to meet the growing demand for mental health support [1]. Furthermore, there is a need for maintaining privacy and confidentiality in online consultations, which adds another layer of complexity to delivering effective mental health services [3].
Our motivation stems from these challenges and the potential of leveraging large language models (LLMs) to augment psychological consultation services. Recent advancements in LLMs, such as GPT-4, have shown promise in generating human-like responses and understanding complex human emotions [4]. However, existing models often struggle with maintaining the emotional nuance required in therapeutic contexts and ensuring comprehensive understanding of diverse psychological issues. To address these limitations, we propose a novel approach that integrates refined prompt engineering techniques with LLMs to enhance their performance in psychological consultation tasks.
Our approach involves developing a series of dynamic prompts that adapt based on user input, utilizing a layered prompting system. This system starts with broad, open-ended questions to gather initial user concerns, followed by specific, context-sensitive prompts designed to delve deeper into the user’s issues. For instance, initial prompts might include, "Can you tell me more about what’s been on your mind lately?" followed by tailored prompts like, "How have these thoughts affected your daily life?" and "Have you noticed any patterns or triggers for these feelings?" This method ensures that the LLM gathers comprehensive and relevant information before offering advice.
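The layered flow described above can be sketched in a few lines. This is a minimal illustration, not the actual system: it assumes a simple turn-by-turn pairing of prompt layers with user replies, using the example prompts quoted in the text.

```python
# Sketch of the layered prompting flow: broad opener first, then
# increasingly specific follow-ups paired with each user turn.

INITIAL_PROMPT = "Can you tell me more about what's been on your mind lately?"

FOLLOW_UP_PROMPTS = [
    "How have these thoughts affected your daily life?",
    "Have you noticed any patterns or triggers for these feelings?",
]

def layered_consultation(user_turns):
    """Build a chat history that pairs each user turn with the next
    prompt layer, so context accumulates before advice is given."""
    history = []
    prompts = [INITIAL_PROMPT] + FOLLOW_UP_PROMPTS
    for prompt, reply in zip(prompts, user_turns):
        history.append({"role": "assistant", "content": prompt})
        history.append({"role": "user", "content": reply})
    return history
```

In practice the accumulated `history` would be passed to the LLM at each step so that later prompts can be conditioned on everything the user has shared so far.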
Additionally, we implement empathy-driven prompts that encourage the LLM to respond with compassion and understanding. Prompts such as "That sounds really challenging, can you share more about how you’re coping?" and "It’s okay to feel this way, let’s explore what might help you feel better" guide the LLM in maintaining a supportive tone. Furthermore, we incorporate scenario-based prompts, where the LLM is given specific case studies and example interactions during training, to help it better understand and simulate real-life counseling scenarios.
To evaluate the effectiveness of our approach, we conducted experiments using a newly collected dataset of psychological consultation dialogues. This dataset includes a diverse range of mental health issues and user interactions, ensuring that the model is trained on realistic and varied scenarios. We used GPT-4 for evaluation, assessing the model’s ability to provide accurate, empathetic, and contextually appropriate responses. The results demonstrated significant improvements in the quality of the LLM’s responses, highlighting the potential of our prompt engineering techniques in enhancing AI-driven psychological consultation [5].
Our main contributions are summarized as follows:
• We propose a novel layered prompting system that dynamically adapts to user input, ensuring comprehensive and relevant information gathering for psychological consultation.
• We develop empathy-driven and scenario-based prompts to enhance the emotional intelligence and contextual understanding of LLMs in therapeutic settings.
• We validate our approach through rigorous experiments using a newly collected dataset and GPT-4 evaluations, demonstrating significant improvements in response quality.
2 Related Work
2.1 Large Language Models
The rapid development of deep learning has brought significant changes and advancements to the fields of computer vision [6, 7] and natural language processing [8]. Large Language Models (LLMs) [9] have significantly advanced the field of natural language processing (NLP) due to their ability to understand and generate human-like text. Recent surveys and overviews highlight the architectural innovations, training strategies, and applications of LLMs such as GPT-3, GPT-4, PaLM, and LLaMA [10, 11]. These models have demonstrated remarkable capabilities across various tasks, including text generation, summarization, and translation, by leveraging vast amounts of training data and sophisticated neural architectures [12, 13, 14].
The impact of LLMs extends to domains such as information retrieval, where they enhance the precision and expressiveness of user queries and improve retrieval efficiency [15, 16, 17]. Additionally, LLMs are being utilized in optimization tasks, serving as powerful tools for mathematical problem-solving and decision-making processes [18]. Their versatility is further evident in multimodal applications, where models integrate visual and textual data to perform complex tasks [12, 19].
2.2 Psychological Consultation
The application of LLMs in psychological consultation is a growing area of research, driven by the need to address mental health issues efficiently and effectively. Traditional psychological counseling methods, which rely heavily on human interaction, face challenges such as scalability, accessibility, and privacy concerns [1, 3]. The integration of LLMs into psychological services aims to overcome these limitations by providing scalable and accessible mental health support.
Several studies have proposed frameworks and models that leverage LLMs for psychological consultation. For instance, CPsyCoun focuses on reconstructing and evaluating multi-turn dialogues to create a high-quality dataset for training LLMs in the context of psychological counseling [22]. Similarly, Psy-LLM and ChatCounselor utilize LLMs to provide online mental health support, enhancing the accessibility and affordability of psychological services [1, 3].
Innovative approaches such as K-ESConv and BianQue inject professional knowledge into LLMs, improving the quality and diversity of responses in emotional support dialogues [23, 24]. These methods highlight the potential of LLMs to offer empathetic and contextually appropriate support, addressing the emotional and psychological needs of users effectively.
Overall, the integration of LLMs into psychological consultation presents a promising solution to the growing demand for mental health support, offering scalable, accessible, and effective services. However, ongoing research is necessary to address ethical considerations and ensure the safe deployment of these technologies in sensitive contexts such as mental health.
3 Dataset
In this section, we describe the collection and processing of the dataset used in our experiments, followed by an explanation of the evaluation metrics employed, with a particular focus on the innovative use of GPT-4 as an evaluator, moving away from traditional metrics.
3.1 Data Collection
To ensure a diverse and comprehensive dataset, we collected psychological consultation dialogues from multiple online platforms and mental health forums. The data collection process involved the following steps:
• Source Selection: We identified reputable sources of psychological consultation dialogues, including well-known mental health platforms and forums where professionals and users interact. These sources were chosen based on their high user engagement and the quality of their content.
• Data Extraction: Using web scraping techniques and API access where available, we extracted dialogues that covered a wide range of mental health issues, such as anxiety, depression, stress, relationship problems, and more. This extraction process ensured that we captured a variety of interaction types and user concerns.
• Anonymization and Cleaning: To maintain privacy and confidentiality, all personal identifiers were removed from the dialogues. The data was then cleaned to eliminate any irrelevant or sensitive information, ensuring that the dataset adhered to ethical standards for data usage.
• Format Standardization: The collected dialogues were formatted uniformly to facilitate ease of use in training and evaluation. Each dialogue was structured to include metadata such as the topic, user demographics (where available), and the context of the consultation.
The resulting dataset comprises thousands of dialogues, each providing a rich context for training and evaluating large language models in the domain of psychological consultation.
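The anonymization and standardization steps above could be implemented along the following lines. This is an illustrative sketch only: the regex patterns and record schema are assumptions for demonstration, not the actual rule set used to build the dataset.

```python
import re

# Illustrative PII patterns; a production pipeline would use a far
# richer set (names, addresses, handles, etc.).
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),
]

def anonymize(text):
    """Replace personal identifiers with neutral placeholders."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

def standardize(dialogue, topic, context):
    """Wrap a cleaned dialogue in a uniform metadata record."""
    return {
        "topic": topic,
        "context": context,
        "turns": [anonymize(turn) for turn in dialogue],
    }
```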
3.2 Evaluation Metrics: GPT-4 as Judge
Traditional evaluation metrics for natural language generation, such as BLEU and ROUGE, focus on lexical similarity and do not adequately capture the nuanced requirements of psychological consultation, such as empathy, relevance, and contextual appropriateness. To address this limitation, we employed GPT-4 as a judge for evaluating the performance of our model.
3.2.1 Evaluation Framework
The evaluation framework using GPT-4 involves several key components:
• Contextual Understanding: GPT-4 assesses whether the model’s responses accurately reflect an understanding of the user’s context and concerns. This includes evaluating the relevance and appropriateness of the responses in the given context.
• Empathy and Support: The ability of the model to provide empathetic and supportive responses is crucial in psychological consultation. GPT-4 evaluates the tone and emotional intelligence of the responses, ensuring they offer genuine support and understanding.
• Interactive Engagement: Effective psychological consultation requires engaging the user in meaningful dialogue. GPT-4 judges the model’s ability to ask relevant follow-up questions, provide insightful feedback, and maintain an interactive conversation flow.
• Professionalism and Accuracy: The accuracy of the information and advice provided is paramount. GPT-4 evaluates the correctness of the content and the professional quality of the responses, ensuring they align with accepted psychological practices and knowledge.
3.2.2 Implementation of GPT-4 Evaluation
To implement GPT-4 as an evaluator, we utilized a few-shot in-context learning approach. This involved the following steps:
• Prompt Design: We designed specific prompts to guide GPT-4 in evaluating the dialogues based on the aforementioned criteria. These prompts included examples of high-quality responses and detailed instructions on what aspects to consider during evaluation.
• Comparison and Scoring: GPT-4 was tasked with comparing the model-generated responses against a set of predefined benchmarks and providing scores based on the evaluation criteria. Additionally, GPT-4 offered qualitative feedback to highlight areas of strength and improvement.
• Iterative Refinement: The evaluation process was iterative, with continuous refinement of prompts and evaluation criteria based on the feedback from GPT-4. This iterative approach ensured that the evaluation framework remained robust and aligned with the goals of psychological consultation.
By leveraging GPT-4’s advanced natural language understanding capabilities, we were able to develop a comprehensive and nuanced evaluation framework that goes beyond traditional metrics, providing deeper insights into the performance and effectiveness of our prompt engineering techniques in enhancing LLM-based psychological consultation.
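A minimal sketch of the few-shot judge setup might look like the following. The rubric wording, example, and score format are illustrative assumptions; the actual prompts and benchmarks differ, and `build_judge_prompt` would be sent to a real chat-completion API.

```python
# Hypothetical judge prompt builder and score parser for the
# GPT-4-as-judge evaluation loop.

RUBRIC = (
    "Rate the candidate response on a 1-5 scale for each of: "
    "relevance, empathy, context understanding, professionalism. "
    "Return four integers separated by spaces, then one line of feedback."
)

FEW_SHOT_EXAMPLE = (
    "User: I can't stop worrying about work.\n"
    "Good response: That sounds exhausting. What part of work "
    "weighs on you most right now?"
)

def build_judge_prompt(dialogue, candidate_response):
    """Assemble rubric, few-shot example, and the case under review."""
    return "\n\n".join([RUBRIC, FEW_SHOT_EXAMPLE,
                        "Dialogue:\n" + dialogue,
                        "Candidate response:\n" + candidate_response])

def parse_scores(judge_output):
    """Extract the four numeric scores from the judge's first line."""
    first_line = judge_output.strip().splitlines()[0]
    return [int(tok) for tok in first_line.split()[:4]]
```

The qualitative feedback on the judge's second line is what drives the iterative refinement step described above.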
In summary, the key elements of our dataset and evaluation methodology are:
• We collected a diverse and comprehensive dataset of psychological consultation dialogues from multiple reputable sources, ensuring a wide range of mental health issues and interaction types.
• We developed an innovative evaluation framework using GPT-4 as a judge, focusing on contextual understanding, empathy, interactive engagement, and professional accuracy.
• Our evaluation framework employs few-shot in-context learning with GPT-4, providing a robust and nuanced assessment of model performance, offering qualitative feedback, and guiding iterative refinement.
4 Method
Our method leverages advanced prompt engineering techniques to enhance the performance of large language models (LLMs) in the domain of psychological consultation. This section details the specific prompts we developed, the motivation behind these prompts, their input and output structures, and the rationale for their effectiveness.
4.1 Prompt Engineering Motivation
The motivation for our prompt engineering approach is rooted in the need to address several key challenges in AI-driven psychological consultation. Traditional LLMs often lack the nuanced understanding and empathetic response generation required for effective psychological support. Our goal is to create prompts that not only gather comprehensive user information but also encourage the LLM to generate emotionally intelligent and contextually appropriate responses. By designing layered, empathy-driven, and scenario-based prompts, we aim to enhance the LLM’s ability to engage users in meaningful dialogues, provide accurate advice, and maintain a supportive tone throughout the interaction.
4.2 Prompt Structure and Design
Our prompt engineering approach involves creating a series of dynamic and layered prompts that guide the LLM through various stages of the consultation process. Each prompt is designed to fulfill a specific role, from initial information gathering to providing tailored advice and empathetic support.
4.2.1 Initial Information Gathering Prompt
The initial prompt is designed to open the conversation and gather broad information about the user’s concerns. An example of this prompt is:
"Can you tell me more about what’s been on your mind lately?"
Input: The input to this prompt is typically a user’s brief description of their current mental state or specific concerns.
Output: The output is a detailed response from the user that provides context for further conversation.
Significance: This prompt is crucial as it sets the tone for the interaction and helps the LLM understand the user’s primary concerns. It encourages users to share openly, which is essential for accurate and relevant advice generation.
4.2.2 Context-Sensitive Follow-Up Prompts
Following the initial information gathering, we employ context-sensitive prompts to delve deeper into the user’s issues. Examples include:
"How have these thoughts affected your daily life?"
"Have you noticed any patterns or triggers for these feelings?"
Input: The input here includes the user’s previous responses, providing context for the LLM to generate more specific follow-up questions.
Output: The output is additional detailed information from the user, offering deeper insight into their mental health concerns.
Significance: These prompts help to uncover the root causes and specific triggers of the user’s issues, facilitating a more targeted and effective consultation process.
4.2.3 Empathy-Driven Prompts
To ensure the LLM responds with appropriate empathy and support, we incorporate empathy-driven prompts such as:
"That sounds really challenging, can you share more about how you’re coping?"
"It’s okay to feel this way, let’s explore what might help you feel better."
Input: The input for these prompts is typically a user’s expression of distress or difficulty.
Output: The output is a supportive and empathetic response that validates the user’s feelings and offers comfort.
Significance: These prompts are essential for maintaining an empathetic tone, which is crucial in psychological consultation. They help to build rapport and trust between the user and the AI, encouraging further sharing and engagement.
4.2.4 Scenario-Based Prompts
Scenario-based prompts are used to simulate real-life counseling situations, training the LLM to handle a variety of psychological issues. An example prompt might be:
"Imagine a user comes to you feeling overwhelmed by work stress. How would you guide them through this issue?"
Input: The input for this prompt includes detailed scenarios based on common psychological issues.
Output: The output is a step-by-step guidance or advice tailored to the specific scenario.
Significance: These prompts train the LLM to apply psychological principles in practical situations, improving its ability to offer relevant and actionable advice.
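The four prompt categories of Sections 4.2.1 through 4.2.4 can be collected into a single template library. The selection heuristic below is a simplified stand-in for the paper's dynamic, context-sensitive routing; the distress markers are assumed for illustration.

```python
# Template library for the four prompt categories, using the example
# prompts from the text. Routing logic is a simplified sketch.

PROMPT_LIBRARY = {
    "initial": [
        "Can you tell me more about what's been on your mind lately?",
    ],
    "follow_up": [
        "How have these thoughts affected your daily life?",
        "Have you noticed any patterns or triggers for these feelings?",
    ],
    "empathy": [
        "That sounds really challenging, can you share more about how you're coping?",
        "It's okay to feel this way, let's explore what might help you feel better.",
    ],
    "scenario": [
        "Imagine a user comes to you feeling overwhelmed by work stress. "
        "How would you guide them through this issue?",
    ],
}

# Hypothetical cues for switching into empathy-driven prompts.
DISTRESS_MARKERS = ("overwhelmed", "hopeless", "can't cope", "exhausted")

def select_prompt(turn_index, last_user_message):
    """Open broadly, switch to empathy on distress cues, otherwise
    probe deeper with context-sensitive follow-ups."""
    if turn_index == 0:
        return PROMPT_LIBRARY["initial"][0]
    if any(m in last_user_message.lower() for m in DISTRESS_MARKERS):
        return PROMPT_LIBRARY["empathy"][0]
    return PROMPT_LIBRARY["follow_up"][(turn_index - 1) % 2]
```

The "scenario" templates are used offline, during training, rather than in the live routing loop.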
4.3 Effectiveness of the Proposed Method
The effectiveness of our prompt engineering method lies in its ability to create a structured yet flexible dialogue flow that mimics the nuances of human psychological consultation. By incorporating layered prompts that adapt based on user input, we ensure that the LLM gathers comprehensive information and provides responses that are both contextually relevant and emotionally supportive. The use of empathy-driven prompts enhances the LLM’s ability to connect with users on an emotional level, while scenario-based prompts improve its practical application of psychological knowledge. This approach not only enhances the user experience but also ensures that the advice provided is accurate, empathetic, and tailored to the user’s unique needs.
In summary, our method combines advanced prompt engineering techniques with the powerful capabilities of LLMs to deliver a more effective and empathetic psychological consultation service. The structured prompts guide the LLM through various stages of the consultation process, ensuring comprehensive information gathering, deep contextual understanding, and supportive engagement. This innovative approach holds significant promise for improving the accessibility and quality of mental health support provided by AI-driven systems.
5 Experiments
In this section, we present the experiments conducted to evaluate the effectiveness of our proposed prompt engineering method. We compared our approach with a baseline method and the Chain-of-Thought (CoT) prompting method on both ChatGPT and GPT-4 models. The experimental results demonstrate that our method significantly outperforms the alternatives in terms of response quality, empathy, and user engagement.
5.1 Experimental Setup
We conducted our experiments using the following models and methods:
• ChatGPT Baseline: Standard prompts without specific tailoring for psychological consultation.
• GPT-4 Baseline: Standard prompts using the GPT-4 model.
• Chain-of-Thought (CoT) Prompting: A method where the LLM is guided to think step-by-step before generating a response.
• Proposed Method: Our layered and empathy-driven prompt engineering approach applied to both ChatGPT and GPT-4.
Each model was evaluated based on its performance in generating responses to a diverse set of psychological consultation scenarios. We measured the effectiveness of the responses using several criteria, including relevance, empathy, context understanding, and overall user satisfaction.
5.2 Evaluation Metrics
To quantitatively assess the performance of each method, we used the following metrics:
• Relevance: The degree to which the response addresses the user’s specific concerns.
• Empathy: The extent to which the response shows understanding and compassion for the user’s feelings.
• Context Understanding: The ability of the model to maintain context and provide consistent advice throughout the conversation.
• User Satisfaction: Overall satisfaction of users based on feedback collected after each interaction.
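Per-dialogue judge scores on these four metrics can be averaged into the per-method figures reported in the results table. The sketch below is a plausible aggregation step, with made-up scores in the test; it is not the actual analysis code.

```python
from statistics import mean

# Metric order matches the judge's score output.
METRICS = ("relevance", "empathy", "context_understanding", "user_satisfaction")

def aggregate(scores_per_dialogue):
    """Average each metric over all evaluated dialogues for one method,
    rounded to one decimal place as in the reported results."""
    return {
        metric: round(mean(scores[i] for scores in scores_per_dialogue), 1)
        for i, metric in enumerate(METRICS)
    }
```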
5.3 Results
The results of our experiments are summarized in Table 1. Our proposed method consistently outperformed the baseline and CoT methods across all metrics.
Table 1: Evaluation results for each method.

Method | Relevance | Empathy | Context Understanding | User Satisfaction
---|---|---|---|---
ChatGPT Baseline | 3.2 | 3.0 | 3.1 | 3.2
GPT-4 Baseline | 3.5 | 3.4 | 3.6 | 3.5
CoT Prompting | 3.8 | 3.7 | 3.9 | 3.8
Proposed Method (ChatGPT) | 4.2 | 4.4 | 4.3 | 4.5
Proposed Method (GPT-4) | 4.5 | 4.7 | 4.6 | 4.8
5.4 Analysis and Discussion
The experimental results clearly indicate that our proposed method significantly enhances the performance of LLMs in psychological consultation tasks. The improvements in relevance, empathy, context understanding, and user satisfaction highlight the effectiveness of our layered, empathy-driven, and scenario-based prompts.
5.4.1 Relevance
Our method outperformed the baseline and CoT methods in terms of relevance. The dynamic prompts allowed the LLM to gather comprehensive information from users, resulting in more accurate and pertinent responses.
5.4.2 Empathy
The empathy-driven prompts significantly enhanced the model’s ability to provide compassionate and supportive responses. This is crucial in psychological consultation, where users seek not only advice but also emotional support.
5.4.3 Context Understanding
The scenario-based prompts improved the LLM’s ability to maintain context throughout the conversation. This consistency is essential for building trust and providing reliable advice in psychological consultations.
5.4.4 User Satisfaction
The overall user satisfaction was highest with our proposed method, indicating that users found the interactions more helpful and engaging. The combination of relevant, empathetic, and contextually accurate responses contributed to a better user experience.
5.4.5 Verification of Effectiveness
To further validate the effectiveness of our method, we conducted additional analysis on the feedback collected from users. The qualitative feedback corroborated our quantitative findings, with users frequently mentioning the improved empathy and relevance of the responses. This additional analysis confirms that our prompt engineering approach not only performs well on objective metrics but also meets the subjective needs of users seeking psychological support.
In conclusion, our experiments demonstrate that our proposed prompt engineering method significantly enhances the performance of LLMs in psychological consultation tasks. The layered, empathy-driven, and scenario-based prompts enable the models to provide more relevant, compassionate, and contextually accurate responses, leading to higher user satisfaction and better overall outcomes in mental health support.
6 Conclusion
In this study, we introduced an innovative prompt engineering approach to improve the performance of large language models (LLMs) in the context of psychological consultation. Our method involves layered prompts that adapt dynamically to user inputs, empathy-driven prompts that foster supportive interactions, and scenario-based prompts that simulate real-life counseling scenarios. Experimental comparisons with baseline methods and Chain-of-Thought (CoT) prompting on ChatGPT and GPT-4 models revealed that our approach excels in generating relevant, empathetic, and contextually accurate responses. The superior performance was corroborated by quantitative metrics and qualitative user feedback, underscoring the method’s ability to enhance user engagement and satisfaction. These findings suggest that our prompt engineering techniques can significantly elevate the quality and accessibility of AI-driven psychological consultation, offering a scalable solution to meet the growing demand for mental health support.
References
- [1] Lai, T., Shi, Y., Du, Z., Wu, J., Fu, K., Dou, Y., Wang, Z.: Psy-llm: Scaling up global mental health psychological services with ai-based large language models. CoRR abs/2307.11991 (2023). https://doi.org/10.48550/ARXIV.2307.11991, https://doi.org/10.48550/arXiv.2307.11991
- [2] Zhang, C., Li, R., Tan, M., Yang, M., Zhu, J., Yang, D., Zhao, J., Ye, G., Li, C., Hu, X., Wong, D.F.: Cpsycoun: A report-based multi-turn dialogue reconstruction and evaluation framework for chinese psychological counseling. CoRR abs/2405.16433 (2024). https://doi.org/10.48550/ARXIV.2405.16433, https://doi.org/10.48550/arXiv.2405.16433
- [3] Liu, J.M., Li, D., Cao, H., Ren, T., Liao, Z., Wu, J.: Chatcounselor: A large language models for mental health support. CoRR abs/2309.15461 (2023). https://doi.org/10.48550/ARXIV.2309.15461, https://doi.org/10.48550/arXiv.2309.15461
- [4] Zhao, J., Wang, T., Abid, W., Angus, G., Garg, A., Kinnison, J., Sherstinsky, A., Molino, P., Addair, T., Rishi, D.: Lora land: 310 fine-tuned llms that rival gpt-4, A technical report. CoRR abs/2405.00732 (2024). https://doi.org/10.48550/ARXIV.2405.00732, https://doi.org/10.48550/arXiv.2405.00732
- [5] Huang, J., Wang, W., Li, E.J., Lam, M.H., Ren, S., Yuan, Y., Jiao, W., Tu, Z., Lyu, M.R.: Who is chatgpt? benchmarking llms’ psychological portrayal using psychobench. CoRR abs/2310.01386 (2023). https://doi.org/10.48550/ARXIV.2310.01386, https://doi.org/10.48550/arXiv.2310.01386
- [6] Zhou, Y., Long, G.: Improving cross-modal alignment for text-guided image inpainting. arXiv preprint arXiv:2301.11362 (2023)
- [7] Wang, Q., Hu, H., Zhou, Y.: Memorymamba: Memory-augmented state space model for defect recognition. arXiv preprint arXiv:2405.03673 (2024)
- [8] Zhou, Y., Geng, X., Shen, T., Long, G., Jiang, D.: Eventbert: A pre-trained model for event correlation reasoning. In: Proceedings of the ACM Web Conference 2022. pp. 850–859 (2022)
- [9] Zhou, Y., Li, X., Wang, Q., Shen, J.: Visual in-context learning for large vision-language models. arXiv preprint arXiv:2402.11574 (2024)
- [10] Naveed, H., Khan, A.U., Qiu, S., Saqib, M., Anwar, S., Usman, M., Barnes, N., Mian, A.: A comprehensive overview of large language models. CoRR abs/2307.06435 (2023). https://doi.org/10.48550/ARXIV.2307.06435, https://doi.org/10.48550/arXiv.2307.06435
- [11] Wornow, M., Xu, Y., Thapa, R., Patel, B.S., Steinberg, E., Fleming, S.L., Pfeffer, M.A., Fries, J.A., Shah, N.H.: The shaky foundations of clinical foundation models: A survey of large language models and foundation models for emrs. CoRR abs/2303.12961 (2023). https://doi.org/10.48550/ARXIV.2303.12961, https://doi.org/10.48550/arXiv.2303.12961
- [12] Ye, Q., Xu, H., Xu, G., Ye, J., Yan, M., Zhou, Y., Wang, J., Hu, A., Shi, P., Shi, Y., Li, C., Xu, Y., Chen, H., Tian, J., Qi, Q., Zhang, J., Huang, F.: mplug-owl: Modularization empowers large language models with multimodality. CoRR abs/2304.14178 (2023). https://doi.org/10.48550/ARXIV.2304.14178, https://doi.org/10.48550/arXiv.2304.14178
- [13] Wornow, M., Xu, Y., Thapa, R., Patel, B.S., Steinberg, E., Fleming, S.L., Pfeffer, M.A., Fries, J.A., Shah, N.H.: The shaky foundations of clinical foundation models: A survey of large language models and foundation models for emrs. CoRR abs/2303.12961 (2023). https://doi.org/10.48550/ARXIV.2303.12961, https://doi.org/10.48550/arXiv.2303.12961
- [14] Zhou, Y., Shen, T., Geng, X., Long, G., Jiang, D.: Claret: Pre-training a correlation-aware context-to-event transformer for event-centric generation and classification. arXiv preprint arXiv:2203.02225 (2022)
- [15] Zhu, Y., Yuan, H., Wang, S., Liu, J., Liu, W., Deng, C., Dou, Z., Wen, J.: Large language models for information retrieval: A survey. CoRR abs/2308.07107 (2023). https://doi.org/10.48550/ARXIV.2308.07107, https://doi.org/10.48550/arXiv.2308.07107
- [16] Zhou, Y., Geng, X., Shen, T., Pei, J., Zhang, W., Jiang, D.: Modeling event-pair relations in external knowledge graphs for script reasoning. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (2021)
- [17] Zhou, Y., Shen, T., Geng, X., Tao, C., Shen, J., Long, G., Xu, C., Jiang, D.: Fine-grained distillation for long document retrieval. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 38, pp. 19732–19740 (2024)
- [18] Tang, X., Wang, X., Zhao, W.X., Lu, S., Li, Y., Wen, J.: Unleashing the potential of large language models as prompt optimizers: An analogical analysis with gradient-based model optimizers. CoRR abs/2402.17564 (2024). https://doi.org/10.48550/ARXIV.2402.17564, https://doi.org/10.48550/arXiv.2402.17564
- [19] Cui, C., Ma, Y., Cao, X., Ye, W., Zhou, Y., Liang, K., Chen, J., Lu, J., Yang, Z., Liao, K., Gao, T., Li, E., Tang, K., Cao, Z., Zhou, T., Liu, A., Yan, X., Mei, S., Cao, J., Wang, Z., Zheng, C.: A survey on multimodal large language models for autonomous driving. In: IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2024 - Workshops, Waikoloa, HI, USA, January 1-6, 2024. pp. 958–979. IEEE (2024). https://doi.org/10.1109/WACVW60836.2024.00106, https://doi.org/10.1109/WACVW60836.2024.00106
- [20] Xu, Y., Hu, L., Zhao, J., Qiu, Z., Ye, Y., Gu, H.: A survey on multilingual large language models: Corpora, alignment, and bias. CoRR abs/2404.00929 (2024). https://doi.org/10.48550/ARXIV.2404.00929, https://doi.org/10.48550/arXiv.2404.00929
- [21] Bowman, S.R.: Eight things to know about large language models. CoRR abs/2304.00612 (2023). https://doi.org/10.48550/ARXIV.2304.00612, https://doi.org/10.48550/arXiv.2304.00612
- [22] Zhang, C., Li, R., Tan, M., Yang, M., Zhu, J., Yang, D., Zhao, J., Ye, G., Li, C., Hu, X., Wong, D.F.: Cpsycoun: A report-based multi-turn dialogue reconstruction and evaluation framework for chinese psychological counseling. CoRR abs/2405.16433 (2024). https://doi.org/10.48550/ARXIV.2405.16433, https://doi.org/10.48550/arXiv.2405.16433
- [23] Chen, W., Zhao, G., Zhang, X., Bai, X., Huang, X., Wei, Z.: K-esconv: Knowledge injection for emotional support dialogue systems via prompt learning. CoRR abs/2312.10371 (2023). https://doi.org/10.48550/ARXIV.2312.10371, https://doi.org/10.48550/arXiv.2312.10371
- [24] Chen, Y., Wang, Z., Xing, X., Zheng, H., Xu, Z., Fang, K., Wang, J., Li, S., Wu, J., Liu, Q., Xu, X.: Bianque: Balancing the questioning and suggestion ability of health llms with multi-turn health conversations polished by chatgpt. CoRR abs/2310.15896 (2023). https://doi.org/10.48550/ARXIV.2310.15896, https://doi.org/10.48550/arXiv.2310.15896