Utilizing Large Language Models for Information Extraction from Real Estate Transactions

Yu Zhao¹* Haoxiang Gao²*
¹ University of Toronto ² Motional AD LLC
[email protected], [email protected]

Abstract

Real estate sales contracts contain crucial information for property transactions, but manual data extraction can be time-consuming and error-prone. This paper explores the application of large language models, specifically transformer-based architectures, for automated information extraction from real estate contracts. We discuss challenges, techniques, and future directions in leveraging these models to improve efficiency and accuracy in real estate contract analysis. We generate synthetic contracts from a real-world transaction dataset and use them to fine-tune a large language model, which yields significant improvements in quantitative metrics as well as qualitative improvements on information retrieval and reasoning tasks.

* These authors contributed equally to this work.

1 Introduction

Real estate transactions involve complex legal documents, such as sales contracts, that outline terms and conditions agreed upon by buyers and sellers. Extracting key information from these contracts is essential for various purposes, including due diligence, risk assessment, and compliance. However, manual review and extraction of data from these documents can be labor-intensive and prone to errors.

Real estate transactions are uniquely complex due to several distinctive factors. One key aspect is the presence of contingencies, which are conditions that must be met for the transaction to proceed, such as financing or inspections. These contingencies add negotiation layers and uncertainty. Additionally, real estate transactions often involve an executory period spanning weeks or months, allowing time for inspections and repairs before final closing. Property ownership is transferred through a deed, a legal document that conveys ownership rights and must be carefully drafted and executed. Furthermore, transactions entail various liabilities like environmental issues or property defects, requiring disclosure and mitigation to minimize risk. These factors highlight the specialized expertise needed to navigate real estate transactions successfully.

In this paper, we investigate the use of large language models for extracting structured data from real estate sales contracts. Traditionally, LSTM recurrent networks Schmidhuber et al. (1997) and Transformers Vaswani et al. (2017) have been widely used to analyze sequential data in a variety of domains. We explore techniques for document preprocessing, model fine-tuning, and information extraction.

2 Related Work

Recent advances in natural language processing (NLP) Chang et al. (2024), particularly with language models like BERT (Bidirectional Encoder Representations from Transformers) Devlin et al. (2018) and GPT (Generative Pre-trained Transformer) Brown et al. (2020), offer promising solutions for automating the extraction of information from text. These models have proven effective in a number of applications requiring sophisticated context understanding and reasoning Gao et al. (2024), and they excel at understanding and generating human-like text, making them suitable for complex document analysis tasks.

Traditional statistical and machine learning approaches such as Conditional Random Fields (CRFs), Support Vector Machines (SVMs), and Hidden Markov Models (HMMs) have been explored to extract named entities from legal contracts Nadeau and Sekine (2007) Betts and Jaep (2016) Surden (2021) Cui et al. (2023) Zamani and Schwartz (2017). Joshi et al. (2018) applied a sequence of information retrieval and traditional machine learning methods to determine the governing law of a contract.

While there has been ample research on large language models, applications of machine learning models to law remain scarce. To the best of the authors’ knowledge, applications of large language models to real estate transactions have not yet been studied in depth.

3 Motivations

The motivation for employing large language models (LLMs) in reading and understanding real estate transaction contracts is multifaceted. One key benefit is the optimization of attorney time. LLMs can swiftly analyze lengthy contracts, identify critical clauses, and flag potential issues, enabling attorneys to focus their efforts on higher-level legal analysis and strategic decision-making. This streamlined approach not only enhances productivity but also ensures that legal professionals can devote more time to addressing complex aspects of the transaction, ultimately providing greater value to their clients.

Furthermore, LLMs offer valuable tools for real estate agents, buyers, and sellers to comprehend contract terms effectively. By translating legal jargon into layman’s terms, LLMs empower individuals without legal expertise to understand the key provisions and implications of the contract. This enhanced understanding fosters transparency and facilitates informed decision-making during the negotiation and execution of real estate transactions. Real estate agents can better advise their clients, and buyers/sellers can navigate contractual terms with confidence, leading to smoother and more successful transactions overall.

When combined with inspection reports, appraisal reports, and past transaction history from recorded conveyance records, Large Language Models (LLMs) have the potential to consolidate a wealth of information and streamline the transaction reporting process. This integration allows LLMs to distill key insights and generate concise reports summarizing critical details such as property condition, valuation, ownership history, and legal implications. The ability of LLMs to synthesize disparate information into a cohesive and easily digestible format not only saves time but also enhances decision-making by providing stakeholders with a clear overview of the transaction’s key aspects and potential implications. Ultimately, leveraging LLMs alongside other pertinent reports and records enables a more efficient and informed approach to real estate transactions.

4 Methodology

To extract information from real estate contracts using large language models, we adopt the following methodology:

4.1 Data and Feature Preprocessing

Data processing for real estate transaction contracts in the context of large language models involves several steps that rely on matrix representations. Initially, the raw contract text is tokenized into a sequence of tokens $x_{1},x_{2},\dots,x_{n}$, where each token corresponds to a word or subword unit within the document. Each token is then mapped to a unique index by a vocabulary mapping function $\text{vocab}(x_{i})=v_{i}$, forming a tokenized input sequence $\mathbf{x}=(x_{1},x_{2},\dots,x_{n})$. Subsequently, tokens are embedded into dense vector representations using an embedding matrix $\mathbf{E}$, where each token $x_{i}$ is represented as the vector $\mathbf{e}_{i}=\mathbf{E}[v_{i},:]$. The embedded sequence $\mathbf{X}=(\mathbf{e}_{1},\mathbf{e}_{2},\dots,\mathbf{e}_{n})$ captures semantic and contextual information crucial for understanding real estate contract text.
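The tokenize, index, and embed steps described above can be illustrated with a minimal sketch. It assumes a whitespace tokenizer, a vocabulary built from the input itself, and a randomly initialized embedding matrix of width 8; all of these are hypothetical stand-ins for the subword tokenizer and learned embeddings of a real LLM.

```python
import random

def tokenize(text):
    # Whitespace tokenizer for illustration only; production LLMs use subword tokenizers.
    return text.lower().split()

def build_vocab(tokens):
    # Give every distinct token a unique integer index, in order of first appearance.
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

def embed(tokens, vocab, E):
    # Each token selects the row of E given by its vocabulary index.
    return [E[vocab[tok]] for tok in tokens]

tokens = tokenize("The Seller shall convey the Premises to the Purchaser")
vocab = build_vocab(tokens)

rng = random.Random(0)
d = 8  # hypothetical embedding width
E = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(len(vocab))]

X = embed(tokens, vocab, E)
print(len(X), len(X[0]))  # number of tokens, embedding width
```

Repeated tokens (here "the") share one vocabulary index and therefore one embedding row, which is why position information has to be supplied separately.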

Additionally, positional encodings Wu et al. (2021) Vaswani et al. (2017) can be incorporated to convey sequential information in the input. Sinusoidal positional encodings are computed from the position $pos$ of each token and the embedding dimension $d$:

$$\text{PE}(pos,2i)=\sin\left(\frac{pos}{k^{2i/d}}\right)$$

$$\text{PE}(pos,2i+1)=\cos\left(\frac{pos}{k^{2i/d}}\right)$$

where $i$ indexes the sine/cosine pairs within the embedding vector and $k$ is a tuning constant that controls how quickly the encoding varies with position ($k=10000$ in Vaswani et al. (2017)). These positional encodings complement token embeddings and aid in capturing sequential relationships within the contract text. It is worth noting that real estate documents exhibit relatively little sequential interdependence: each paragraph generally corresponds to a specific clause in a contract or a particular point in a report, so paragraphs are largely independent of one another.
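As a worked illustration of the two formulas, the following sketch builds the encoding table in plain Python. The function name and the default $k=10000$ are assumptions made for the example (the default follows Vaswani et al. (2017)); $d$ is assumed to be even.

```python
import math

def positional_encoding(n_positions, d, k=10000.0):
    """Return an n_positions x d table of sinusoidal encodings (d must be even)."""
    pe = [[0.0] * d for _ in range(n_positions)]
    for pos in range(n_positions):
        for i in range(d // 2):
            angle = pos / (k ** (2 * i / d))
            pe[pos][2 * i] = math.sin(angle)      # even dimensions use sine
            pe[pos][2 * i + 1] = math.cos(angle)  # odd dimensions use cosine
    return pe

pe = positional_encoding(n_positions=4, d=6)
print(pe[0])  # position 0: sin(0) = 0 and cos(0) = 1 in every pair
```

A smaller $k$ shortens the longest wavelength, so distant positions become distinguishable sooner; this is one way to read the remark that $k$ tunes how strongly sequential relationships are encoded.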

Furthermore, preprocessing steps such as padding, truncation, or batch formation can ensure uniform input dimensions and facilitate efficient training and inference. These mathematical operations and matrix manipulations are fundamental in transforming raw real estate transaction text into structured numerical inputs suitable for processing by large language models, enabling tasks such as contract analysis, information extraction, and legal document understanding.
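A minimal sketch of padding, truncation, and batch formation follows. The helper names, the pad index 0, and the attention-mask convention (1 for real tokens, 0 for padding) are assumptions for illustration, not details taken from the experiments.

```python
def pad_or_truncate(ids, max_len, pad_id=0):
    # Cut sequences longer than max_len, then right-pad shorter ones with pad_id.
    ids = list(ids[:max_len])
    n_pad = max_len - len(ids)
    mask = [1] * len(ids) + [0] * n_pad  # 1 = real token, 0 = padding
    return ids + [pad_id] * n_pad, mask

def make_batch(sequences, max_len, pad_id=0):
    # Bring every sequence to the same length so the batch forms a rectangular matrix.
    padded = [pad_or_truncate(seq, max_len, pad_id) for seq in sequences]
    input_ids = [ids for ids, _ in padded]
    attention_mask = [mask for _, mask in padded]
    return input_ids, attention_mask

input_ids, attention_mask = make_batch([[5, 9, 2], [7, 3, 8, 4, 6, 1]], max_len=4)
print(input_ids)       # [[5, 9, 2, 0], [7, 3, 8, 4]]
print(attention_mask)  # [[1, 1, 1, 0], [1, 1, 1, 1]]
```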

4.2 Fine-tuning Large Language Models

Fine-tuning large language models (LLMs) for real estate contract text data involves various approaches that leverage transfer learning, task-specific fine-tuning, and multi-task learning to enhance model performance and enable domain-specific learning. Transfer learning involves using pre-trained LLMs, like BERT Devlin et al. (2018) or GPT Achiam et al. (2023), trained on massive datasets for general language understanding. These models are then fine-tuned on real estate contract text data to adapt their learned features and knowledge to the nuances of real estate transactions. The fine-tuning process updates model parameters to better align with the target domain, enhancing the LLM’s ability to understand real estate-specific language and concepts.

Task-specific fine-tuning is another approach where pre-trained LLMs are fine-tuned on a dataset tailored specifically for real estate contracts. This method focuses on optimizing the LLM’s performance for a particular task, such as contract summarization or clause classification. By training the model on task-specific data, it learns to extract relevant information and make domain-specific predictions, effectively enhancing its ability to process real estate contract text.

Multi-task learning is a technique that involves training an LLM on multiple related tasks simultaneously Mahabadi et al. (2021), including real estate-specific tasks like contract interpretation and clause extraction. This approach has been used in numerous applications in specific domains Chakrabarty et al. (2019) because it encourages the model to learn shared representations across tasks, enabling it to generalize better and improve performance on each individual task Howard and Ruder (2018) Wallingford et al. (2022). Multi-task learning helps the LLM leverage common patterns and features across different real estate contract-related tasks, leading to enhanced domain-specific learning and more robust performance.

4.3 Information Extraction

One approach uses sequence labeling models, such as conditional random fields (CRFs), to capture structured information within real estate contracts. CRFs can model dependencies between tokens in text and assign labels to sequences corresponding to different contract elements like contingency clauses or price terms. When an LLM is trained together with a CRF on annotated real estate contract data, the combined model can learn to identify and extract key information effectively.

Furthermore, LLMs can leverage semantic parsing techniques to understand the semantics of real estate contract text and extract specific attributes like property details, contract conditions, and financial terms. Semantic parsing involves mapping natural language expressions to structured representations, enabling LLMs to interpret complex contract language and extract structured information like property attributes and contractual obligations.

Mathematically, these approaches involve training an LLM on labeled real estate contract data $\mathcal{D}$ so that it learns to predict key information categories from $\mathcal{C}$ given the contract text $x$. Let $f(x;\theta)$ represent the LLM parameterized by $\theta$, and let $\hat{y}=f(x;\theta)$ be the predicted information categories. The objective is to minimize a loss function $\mathcal{L}(\hat{y},y)$ that measures the discrepancy between the prediction $\hat{y}$ and the correct information category $y$ for the real estate contract text $x$.
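Written out over the labeled dataset, this objective is an empirical risk minimization. The specific loss is not fixed by the formulation above; as one common choice, and purely as an assumption for illustration, the cross-entropy over categories can be used when $f$ outputs a probability $\hat{y}_{c}$ for each category $c\in\mathcal{C}$:

$$\theta^{*}=\arg\min_{\theta}\;\frac{1}{|\mathcal{D}|}\sum_{(x,y)\in\mathcal{D}}\mathcal{L}\big(f(x;\theta),\,y\big)$$

$$\mathcal{L}(\hat{y},y)=-\sum_{c\in\mathcal{C}}y_{c}\log\hat{y}_{c}$$

Here $y_{c}=1$ if $c$ is the correct category for $x$ and $y_{c}=0$ otherwise, so the loss reduces to the negative log-probability that the model assigns to the correct category.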

5 Query

Once an LLM is fine-tuned, it is capable of answering a wide range of questions related to real estate transactions. For example, queries like “Describe the area of the property” can be effectively answered by the model, leveraging its understanding of geographic references and property descriptions. Similarly, questions about specific contract terms, such as “What are the contingencies in the contract?”, can be addressed based on the model’s training on legal language and contract structures. By fine-tuning the LLM with relevant real estate data and legal documents, the model gains the ability to interpret and respond to diverse inquiries related to property details, transaction terms, and legal provisions, providing valuable insights and information to users involved in real estate dealings. This versatility contributes to the efficiency and effectiveness of utilizing LLMs in real estate transactions, enhancing accessibility and understanding across various aspects of property transactions and contracts.

In fact, real estate professionals can create a tailored set of questions encompassing the crucial due diligence questions for each transaction. These questions can subsequently be inputted into the model for every deal, enabling the extraction of pertinent information and presenting key details to stakeholders involved in the transaction.
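The reusable question set described above can be sketched as a simple loop. Everything here is hypothetical scaffolding: the question list, the prompt wording, and `placeholder_model`, which stands in for a call to the fine-tuned LLM.

```python
DUE_DILIGENCE_QUESTIONS = [
    "Describe the area of the property.",
    "What are the contingencies in the contract?",
    "What is the closing date?",
]

def build_prompt(contract_text, question):
    # Keep the contract and the question clearly separated in the prompt.
    return (
        "Answer the question using only the contract below.\n\n"
        f"Contract:\n{contract_text}\n\nQuestion: {question}\nAnswer:"
    )

def run_checklist(contract_text, questions, answer_fn):
    # Ask every checklist question about one contract and collect the answers.
    return {q: answer_fn(build_prompt(contract_text, q)) for q in questions}

def placeholder_model(prompt):
    # Stand-in for the fine-tuned model; replace with a real inference call.
    return "(model answer)"

report = run_checklist(
    "The Seller shall convey the Premises to the Purchaser.",
    DUE_DILIGENCE_QUESTIONS,
    placeholder_model,
)
for question, answer in report.items():
    print(question, "->", answer)
```

Passing the model as a function keeps the checklist independent of any particular inference backend, so the same question set can be run against every deal.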

6 Dataset Generation

Several challenges arise when analyzing real estate transactions, one of the biggest being the scarcity of data due to privacy concerns. Public researchers have no access to a large volume of real estate transaction contracts between private buyers and sellers, and it is also difficult to acquire a dataset covering the full diversity and complexity of real estate contracts, e.g., differences between states and property types. This limitation hinders in-depth analysis and the derivation of meaningful insights into the real estate market. Several factors contribute to this data accessibility issue.

Firstly, real estate transactions are often considered private and confidential. As a result, the parties involved are hesitant to share their contracts with third parties, including researchers. This reluctance stems from concerns about privacy, potential legal implications, and the sensitive nature of the information disclosed in the contracts.

Secondly, the real estate market is highly fragmented and diverse. There are significant variations in laws, regulations, and practices across different states and localities. Additionally, the types of properties involved in transactions can range from residential homes to commercial buildings, each with its own unique set of contractual considerations. This diversity makes it challenging to collect a comprehensive dataset covering the entire real estate contract spectrum.

The absence of such a dataset has far-reaching implications for public researchers. It limits their ability to accurately assess market trends, analyze the impact of government policies, and evaluate the effectiveness of various real estate-related interventions. Without access to a comprehensive dataset, researchers are forced to rely on limited or incomplete information, which can lead to biased or inaccurate conclusions.

Moreover, real-world contracts typically come in unstructured formats, such as scanned images and PDFs, making it difficult to convert the raw contracts into structured text that machine learning models can consume. The resulting lack of usable data makes it difficult to identify trends and patterns and can lead to inaccurate or biased results.

On top of that, several other challenges arise when trying to analyze real estate transactions. These include:

• The complexity of real estate contracts: Real estate contracts are often long and complex and can contain a variety of clauses and conditions that can be difficult to interpret.

• The lack of standardization in real estate contracts: There is no standard format for real estate contracts, and the terms and conditions can vary widely from one contract to another. This can make it difficult to compare different contracts and identify trends.

• The need for expert knowledge: Analyzing real estate transactions requires a deep understanding of the real estate market and its legal framework. This can make it difficult for non-experts to conduct meaningful analyses.

Synthetic data has emerged as a valuable tool for training Large Language Models (LLMs) due to its potential to overcome limitations associated with real-world data, such as privacy concerns, biases, and limited availability of specific examples. Several methods have been explored to leverage synthetic data for LLM training Liu et al. (2024):

• Data Augmentation: Existing datasets are enhanced by generating additional variations of existing samples. This can involve techniques like paraphrasing, text transformations, or generating new examples based on specific patterns.

• Rule-Based Generation: Synthetic data is generated based on predefined rules or templates. This approach is useful when specific linguistic patterns or structures must be reinforced.

• Model-Based Generation: LLMs are utilized to generate synthetic text based on patterns learned from existing data. This method can produce diverse and realistic examples, but care must be taken to avoid perpetuating biases present in the original data.

• Hybrid Approaches: These combine multiple methods, such as using rule-based generation to create initial examples, followed by model-based generation to refine and expand the synthetic dataset.

To tackle the dataset challenge, we adopted the best practices of large language model training to create a synthetic dataset representing real-world scenarios of real estate transactions. This dataset is used to adapt general-purpose LLMs to be legal experts in real estate transactions. We collected public datasets on real estate transactions in different states and prepared templates as contexts to generate example contracts.

We obtained authentic real estate transaction datasets from public sources, covering real estate transactions in New York City NYC (2017). These datasets are in tabular formats, which can be easily processed and understood by machine learning algorithms, and the columns contain key attributes of real estate transactions, like city, lot number, transaction time, and prices.

To generate the synthetic contract, we prompt the LLM with the contract templates from each state and provide these ground-truth attributes as contexts. We designed a variety of questions and answers that can be used to evaluate the accuracy of information retrieval and reasoning tasks.

For clauses not available in the real transaction data, we adopted a rule engine, as described in Figure 1, to generate random contract terms and used them as ground-truth answers to evaluate the model’s reasoning capabilities.

Figure 1: Method to generate synthetic contracts
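The generation pipeline can be sketched as follows, under stated assumptions. The record fields, the rule values (inspection periods of 7, 10, or 14 days and deposits of 5% or 10%), the prompt wording, and the question templates are all hypothetical; the actual rule engine and templates are not specified here. The sketch only builds the prompt that a generating LLM would receive, together with the ground-truth question-answer pairs.

```python
import random

def sample_rule_terms(record, rng):
    # Rule engine stand-in: draw terms that the tabular data does not contain.
    inspection_days = rng.choice([7, 10, 14])
    deposit_pct = rng.choice([5, 10])
    return {
        "inspection_days": inspection_days,
        "deposit": record["price"] * deposit_pct // 100,
    }

def build_generation_prompt(template, record, terms):
    # The generating LLM receives the state template plus all ground-truth attributes.
    facts = {**record, **terms}
    fact_lines = "\n".join(f"- {key}: {value}" for key, value in facts.items())
    return (
        "Write a real estate sales contract that follows this template and "
        "uses exactly these facts.\n\n"
        f"Template:\n{template}\n\nFacts:\n{fact_lines}\n"
    )

def build_qa_pairs(record, terms):
    # Ground-truth answers come from the same facts that were given to the generator.
    return [
        ("What is the purchase price?", f"${record['price']:,}"),
        ("How many days does the Purchaser have for inspections?",
         f"{terms['inspection_days']} days"),
        ("How much is the escrow deposit?", f"${terms['deposit']:,}"),
    ]

rng = random.Random(42)
record = {"city": "New York", "lot": 12, "sale_date": "2017-03-01", "price": 500000}
terms = sample_rule_terms(record, rng)
prompt = build_generation_prompt("[state contract template text]", record, terms)
qa_pairs = build_qa_pairs(record, terms)
print(prompt)
```

Because the answers are derived from the sampled facts rather than from the generated text, they remain valid ground truth even when the generating model paraphrases the template.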

7 Model Training and Experiments

While the field of natural language processing has been dominated by massive language models with hundreds of billions of parameters, our work takes a different approach. We use a comparatively small large language model (LLM) with a few billion parameters, exemplified by models like LLaMA-8B Touvron et al. (2023) and Phi-3 Abdin et al. (2024). This deliberate choice is motivated by several factors that align with our application’s specific requirements and constraints. Firstly, deploying a smaller LLM significantly enhances privacy. Smaller models enable on-device processing, unlike their larger counterparts, which often necessitate cloud-based inference. Sensitive user data therefore remains on the user’s device, minimizing the risks associated with data transmission and storage on external servers. This is particularly critical in privacy-sensitive domains where data security is paramount. Secondly, smaller LLMs offer practical advantages in inference speed and fine-tuning. Their reduced scale translates to lower computational demands, making them easier to adapt and specialize using modest resources. This efficiency is particularly beneficial when working with synthetic datasets, as it allows for rapid experimentation and iterative refinement of the model on the specific task.

To further enhance the efficiency and performance of our chosen LLM, we employ Parameter-Efficient Fine-Tuning (PEFT) techniques Mangrulkar et al. (2022). PEFT methods are designed to adapt large language models for specific tasks by fine-tuning only a small subset of parameters while keeping most of the model’s weights frozen. By limiting the number of trainable parameters, PEFT significantly reduces the computational resources required for fine-tuning. This makes it feasible to adapt large models even with limited hardware. PEFT methods minimize the memory footprint during fine-tuning, enabling the training process on devices with less memory capacity. They also preserve the majority of the general knowledge stored in the pre-trained model parameters. One such method, Low-Rank Adaptation (LoRA) Hu et al. (2022), inserts adapters with trainable rank decomposition matrices into each layer of the transformer model. In our work, we carefully evaluate different PEFT methods to identify the most effective approach for fine-tuning our chosen LLM on the synthetic dataset. By combining the advantages of a smaller LLM with the efficiency of PEFT techniques, we balance performance, resource utilization, and privacy.

On top of the Llama-3.1-8B Instruct model Dubey et al. (2024), we adopt the following LoRA hyperparameter configuration for fine-tuning, which reduces the number of trainable parameters to only about 10 million.

• alpha = 64

• dropout = 0.05

• rank = 4

• target modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
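The figure of roughly 10 million trainable parameters can be checked with a short calculation. The layer dimensions below are assumptions taken from the publicly released Llama 3 8B configuration (hidden size 4096, feed-forward size 14336, 32 layers, and a key/value projection width of 1024 due to grouped-query attention); they are not stated in this paper. For a linear layer with input width $d_{in}$ and output width $d_{out}$, a rank-$r$ LoRA adapter adds $r(d_{in}+d_{out})$ parameters.

```python
HIDDEN = 4096         # assumed hidden size of Llama 3 8B
KV_DIM = 1024         # assumed key/value width (8 KV heads x 128 dims)
INTERMEDIATE = 14336  # assumed feed-forward width
NUM_LAYERS = 32       # assumed number of transformer layers
RANK = 4
ALPHA = 64

# (input width, output width) of each targeted linear layer
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(d_in, d_out, rank):
    # LoRA adds A (rank x d_in) and B (d_out x rank) next to the frozen weight.
    return rank * (d_in + d_out)

per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # LoRA scales the low-rank update by alpha / rank

print(per_layer, total, scaling)  # 327680 10485760 16.0
```

Under these assumptions the adapters contain 10,485,760 parameters, which is consistent with the "about 10 million" stated above and is a small fraction of the roughly 8 billion frozen weights.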

The model is fine-tuned for 3 epochs with our synthetic training dataset of over three thousand contracts with over 3 million tokens. We sampled 100 contracts from the held-out dataset as a validation set to monitor the loss. We can observe that the model converges within 3 epochs and starts over-fitting at the end of the third epoch, as shown in Figure 2. The model is trained on one NVIDIA A100 GPU, and the total training time is around 8 hours.

8 Evaluation

One of the most basic tasks that an LLM can perform is extracting information from a real estate contract. This includes information such as the buyer and seller’s names, the property address, the purchase amount, and the closing date. This information can be used to create various reports and documents, such as title reports, closing statements, and tax returns.

In addition to extracting information from contracts, LLMs can also be used to perform more complex tasks, such as logical reasoning with scenario-based questions. For example, an LLM could be used to determine whether or not a seller can collect earnest money if the buyer cannot obtain financing. This type of task requires the LLM to understand the specific terms of the contract and apply them to a given scenario.

To reduce hallucination, we insert random questions about details that are not stated in the contract. This helps the LLM learn that it should not generate information that is absent from the contract. Examples of such questions include:

• What is the property tax amount?

• What is the name of the insurance company providing the buyer’s homeowners insurance?

• What is the lender’s name providing the buyer’s mortgage?

• What is the date of the buyer’s loan application?
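One way to mix such questions into the training examples is sketched below. The refusal string and the helper name are assumptions made for illustration; the target answer used for unanswerable questions is not specified above.

```python
import random

UNANSWERABLE_QUESTIONS = [
    "What is the property tax amount?",
    "What is the name of the insurance company providing the buyer's homeowners insurance?",
    "What is the lender's name providing the buyer's mortgage?",
    "What is the date of the buyer's loan application?",
]

# Assumed target answer for questions the contract cannot answer.
REFUSAL = "This is not specified in the contract."

def add_unanswerable(qa_pairs, rng, k=1):
    # Append k randomly chosen unanswerable questions, then shuffle the order
    # so the model cannot learn their position.
    extra = [(question, REFUSAL) for question in rng.sample(UNANSWERABLE_QUESTIONS, k)]
    mixed = list(qa_pairs) + extra
    rng.shuffle(mixed)
    return mixed

mixed = add_unanswerable([("What is the purchase price?", "$500,000")], random.Random(0), k=2)
for question, answer in mixed:
    print(question, "->", answer)
```

The sketch assumes the caller has verified that the sampled questions really cannot be answered from the given contract; otherwise the refusal label would itself be wrong.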

8.1 Evaluation Metrics

Evaluating the performance of our fine-tuned LLM on the question-answering (QA) task requires careful selection of appropriate metrics. While accuracy, measured as the percentage of correctly answered questions, provides a basic indication of performance, it fails to capture the nuances of language and the complexities of QA. Therefore, we employ a combination of metrics that provide a more comprehensive assessment.

Exact Match (EM) measures the proportion of answers that perfectly match the ground truth, emphasizing precise language understanding and generation. However, EM can be overly strict, especially for unstructured answers. To address this, we utilize the F1 score, which considers the overlap between predicted and ground truth answers at the token level. F1 score provides a more lenient evaluation, rewarding models that capture the essential information even with minor variations in phrasing.
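Both metrics can be sketched in a few lines. The normalization used here (lowercasing and keeping only alphanumeric tokens) is an assumption; the exact normalization applied in the experiments is not described. The example answers are adapted from the qualitative examples later in this section.

```python
import re
from collections import Counter

def normalize(text):
    # Lowercase and keep alphanumeric tokens only (assumed normalization).
    return re.findall(r"[a-z0-9]+", text.lower())

def exact_match(prediction, reference):
    return normalize(prediction) == normalize(reference)

def token_prf(prediction, reference):
    # Token-level precision, recall, and F1 based on multiset overlap.
    pred, ref = normalize(prediction), normalize(reference)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return precision, recall, 2 * precision * recall / (precision + recall)

reference = "The property and all buildings and improvements thereon."
verbose = ("According to the contract, the seller shall convey the property "
           "and all buildings and improvements thereon.")

precision, recall, f1 = token_prf(verbose, reference)
print(exact_match(verbose, reference))  # False
print(precision, recall)                # 0.5 1.0
```

The verbose answer contains every reference token, so its recall is perfect, but half of its tokens are extra, which halves the precision. This is the pattern that separates verbose baseline answers from concise fine-tuned ones.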

Although the F1 score provides a quantitative measure of answer accuracy, it often falls short of capturing the semantic similarity between the predicted and ground truth answers. To address this limitation, we incorporate BERTScore Zhang et al. (2020) as an evaluation metric for our QA system. Unlike string-based comparisons, BERTScore leverages pre-trained contextual embeddings from BERT to compute the similarity between two sentences. This allows for a more nuanced assessment that considers the semantic meaning and contextual information embedded within the text, rather than relying solely on lexical overlap. By utilizing BERTScore, we aim to evaluate the quality of generated answers based on their semantic alignment with the ground truth, even if they exhibit variations in phrasing or lexical choices. This approach provides a more comprehensive and meaningful evaluation of the LLM’s ability to understand and respond to questions accurately.
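The core of BERTScore is a greedy matching of token embeddings by cosine similarity. The sketch below shows only that matching step on hand-written two-dimensional toy vectors, which are placeholders for contextual BERT embeddings; it omits the optional inverse-document-frequency weighting and baseline rescaling of the full metric.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def greedy_match_score(candidate, reference):
    # Precision: each candidate token is matched to its most similar reference token.
    precision = sum(max(cosine(c, r) for r in reference) for c in candidate) / len(candidate)
    # Recall: each reference token is matched to its most similar candidate token.
    recall = sum(max(cosine(r, c) for c in candidate) for r in reference) / len(reference)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1

# Toy embeddings: the candidate covers both reference tokens and adds one extra token.
reference_emb = [[1.0, 0.0], [0.0, 1.0]]
candidate_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

bs_precision, bs_recall, bs_f1 = greedy_match_score(candidate_emb, reference_emb)
print(bs_precision, bs_recall, bs_f1)
```

In this toy case recall is exactly 1 because every reference token has an identical counterpart, while the extra candidate token only partially matches and pulls precision below 1. Unlike token overlap, a paraphrased token with a nearby embedding still earns partial credit.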

Our evaluation demonstrates that, after fine-tuning, the LLM matches the reference answers in our synthetic dataset more closely, with improvements in both BERTScore precision and recall. The only regression is in text-matching recall: the pre-trained model without fine-tuning tends to generate verbose answers that quote sentences from the original contract rather than giving a direct answer, which inflates its recall while keeping its precision low.

Metric group	Metric	llama-3-8b-instruct (Baseline)	After fine-tuning
BERT Score	Precision	0.638	0.827
BERT Score	Recall	0.833	0.930
BERT Score	F1-Score	0.722	0.874
BERT Score	Match Rate (F1 > 0.7)	0.630	0.968
Text Matching	Precision	0.099	0.788
Text Matching	Recall	0.914	0.596
Text Matching	F1-Score	0.168	0.633
Text Matching	Match Rate (F1 > 0.5)	0.041	0.613

Table 1: Evaluation metrics show significant improvements after fine-tuning the LLM with the synthetic dataset.
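The match rate rows can be read as the share of evaluation examples whose F1 exceeds the stated threshold. The sketch below computes it on hypothetical per-example scores. It also assumes, without confirmation from the text, that the F1 rows are averages of per-example F1 values; under that assumption they need not equal the harmonic mean of the averaged precision and recall rows.

```python
def harmonic_mean(precision, recall):
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0

def match_rate(f1_scores, threshold):
    # Fraction of examples whose F1 is strictly above the threshold.
    return sum(f1 > threshold for f1 in f1_scores) / len(f1_scores)

# Hypothetical per-example (precision, recall) pairs, not taken from the experiments.
scores = [(1.0, 0.5), (0.2, 1.0), (0.8, 0.8), (0.1, 0.9)]

f1_scores = [harmonic_mean(p, r) for p, r in scores]
mean_f1 = sum(f1_scores) / len(f1_scores)
mean_precision = sum(p for p, _ in scores) / len(scores)
mean_recall = sum(r for _, r in scores) / len(scores)
f1_of_means = harmonic_mean(mean_precision, mean_recall)
rate = match_rate(f1_scores, threshold=0.5)

print(round(mean_f1, 3), round(f1_of_means, 3), rate)
```

On these toy scores the mean per-example F1 is lower than the F1 computed from the mean precision and recall, which illustrates why the two aggregation orders should not be expected to agree.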

8.2 Qualitative Analysis

We also provide some examples demonstrating the pros and cons of fine-tuning models.

8.2.1 Good Examples

Example 1: After fine-tuning, the LLM provides more succinct and direct answers, rephrased so that they are easier to understand.

Question: What happens if the Purchaser is not satisfied with the inspection?

Answer (Baseline): According to Section 3(a) of the contract, if the Purchaser is not satisfied with the condition of the Premises after conducting inspections by the specified date, they have the right to terminate this Contract by providing written notice to the Seller prior to the expiration of the inspection period.

Answer (Fine-tuned): Upon termination, the escrow deposit of $22,325 will be returned to the Purchaser, and all parties will be released from further obligations under this Contract.

Example 2: The pre-trained model tends to include irrelevant details in its answers.

Question: What will the seller convey to the purchaser?

Answer (Baseline): According to the contract, the seller (Ernestina Parker) shall sell and convey the property, together with all buildings and improvements thereon, to the purchaser (Jay Aschenbrenner). This includes the property described in Schedule A, which is 189 Schermernhorn Street, along with any rights to land lying in the bed of any street or highway adjoining the property.

Answer (Fine-tuned): The property and all buildings and improvements thereon.

Example 3: For terms unspecified in the contract, the baseline model generates answers with blank lines, without direct answers to the question.

Question: Who holds the earnest money deposit in escrow?

Answer (Baseline): According to paragraph 6, the earnest money deposit is held in escrow by at , in a segregated, interest-bearing account for the benefit of the parties.

Answer (Fine-tuned): Escrow holder. (Name and address to be filled)

8.2.2 Failure Cases

The fine-tuned model sometimes generates overly simplified answers that omit necessary details.

Question: Which costs is the Seller responsible for?

Answer (Baseline): Real estate agent commissions, transfer taxes, escrow fees According to Section 5 of the contract, the Seller is responsible for the following costs: Real estate agent commissions, Transfer taxes, All escrow fees associated with the Closing

Answer (Fine-tuned): Seller’s Costs.

The fine-tuned model sometimes doesn’t provide a direct answer.

Question: Is the property being sold ‘as is’?

Answer (Baseline): Yes, according to Clause 3, the Inspection Contingency, the property is being sold “as is” without any inspection contingency. This means that the Purchaser is assuming the property in its current condition, without the ability to conduct inspections or request repairs before closing.

Answer (Fine-tuned): No inspection contingency. Purchaser assumes property “as is”.

9 Future Directions

The application of large language models in real estate contract analysis opens up several avenues for future research and development:

• Multi-lingual Support: Extending models to handle contracts in different languages to facilitate global real estate transactions.

• Image Support: Extending models to handle inspections and appraisal reports with images.

• Pricing Guidance: Extending the models to guide pricing.

• Regulatory Compliance: Integrating legal compliance checks into automated contract analysis systems.

10 Conclusion

In this paper, we have discussed using large language models for extracting information from real estate sales contracts. By leveraging advanced NLP techniques, we can automate tedious tasks associated with contract analysis and improve efficiency in real estate transactions. Challenges such as legal complexity and ambiguity can be addressed using large language models and domain-specific fine-tuning. Future research should focus on enhancing multi-lingual support, semantic understanding, and regulatory compliance in automated contract analysis systems.

References

Abdin et al. [2024] Marah Abdin, Jyoti Aneja, et al. Phi-3 technical report: A highly capable language model locally on your phone, 2024.
Achiam et al. [2023] Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
Betts and Jaep [2016] Kathryn D Betts and Kyle R Jaep. The dawn of fully automated contract drafting: Machine learning breathes new life into a decades-old promise. Duke L. & Tech. Rev., 15:216, 2016.
Brown et al. [2020] Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners, 2020.
Chakrabarty et al. [2019] Tuhin Chakrabarty, Christopher Hidey, and Kathleen McKeown. IMHO fine-tuning improves claim detection. arXiv preprint arXiv:1905.07000, 2019.
Chang et al. [2024] Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, et al. A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology, 15(3):1–45, 2024.
Cui et al. [2023] Jiaxi Cui, Zongjian Li, Yang Yan, Bohua Chen, and Li Yuan. ChatLaw: Open-source legal large language model with integrated external knowledge bases. arXiv preprint arXiv:2306.16092, 2023.
Devlin et al. [2018] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
Dubey et al. [2024] Abhimanyu Dubey, Abhinav Jauhri, et al. The Llama 3 herd of models, 2024.
Gao et al. [2024] Haoxiang Gao, Yaqian Li, Kaiwen Long, Ming Yang, and Yiqing Shen. A survey for foundation models in autonomous driving. arXiv preprint arXiv:2402.01105, 2024.
Howard and Ruder [2018] Jeremy Howard and Sebastian Ruder. Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
Hu et al. [2022] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022.
Joshi et al. [2018] Sandeep Joshi, Parth Shah, and Amaresh Kumar Pandey. Location identification, extraction and disambiguation using machine learning in legal contracts. In 2018 4th International Conference on Computing Communication and Automation (ICCCA), pages 1–5, 2018.
Liu et al. [2024] Ruibo Liu, Jerry Wei, Fangyu Liu, Chenglei Si, Yanzhe Zhang, Jinmeng Rao, Steven Zheng, Daiyi Peng, Diyi Yang, Denny Zhou, et al. Best practices and lessons learned on synthetic data for language models. arXiv preprint arXiv:2404.07503, 2024.
Mahabadi et al. [2021] Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, and James Henderson. Parameter-efficient multi-task fine-tuning for transformers via shared hypernetworks. arXiv preprint arXiv:2106.04489, 2021.
Mangrulkar et al. [2022] Sourab Mangrulkar, Sylvain Gugger, Lysandre Debut, Younes Belkada, Sayak Paul, and Benjamin Bossan. PEFT: State-of-the-art parameter-efficient fine-tuning methods. https://github.com/huggingface/peft, 2022.
Nadeau and Sekine [2007] David Nadeau and Satoshi Sekine. A survey of named entity recognition and classification. Lingvisticae Investigationes, 30(1):3–26, 2007.
NYC [2017] NYC. NYC property sales, 2017. https://www.kaggle.com/datasets/new-york-city/nyc-property-sales.
Schmidhuber et al. [1997] Jürgen Schmidhuber, Sepp Hochreiter, et al. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
Surden [2021] Harry Surden. Machine learning and law: An overview. Research Handbook on Big Data Law, pages 171–184, 2021.
Touvron et al. [2023] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models, 2023.
Vaswani et al. [2017] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
Wallingford et al. [2022] Matthew Wallingford, Hao Li, Alessandro Achille, Avinash Ravichandran, Charless Fowlkes, Rahul Bhotika, and Stefano Soatto. Task adaptive parameter sharing for multi-task learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7561–7570, 2022.
Wu et al. [2021] Kan Wu, Houwen Peng, Minghao Chen, Jianlong Fu, and Hongyang Chao. Rethinking and improving relative position encoding for vision transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10033–10041, 2021.
Zamani and Schwartz [2017] Mohammadzaman Zamani and H Andrew Schwartz. Using Twitter language to predict the real estate market. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 28–33, 2017.
Zhang* et al. [2020] Tianyi Zhang*, Varsha Kishore*, Felix Wu*, Kilian Q. Weinberger, and Yoav Artzi. BERTScore: Evaluating text generation with BERT. In International Conference on Learning Representations, 2020.