A Simple and Fast Way to Handle Semantic Errors in Transactions

Jinghan Zeng The University of Chicago [email protected] , Eugene Wu Columbia University [email protected] and Sanjay Krishnan The University of Chicago [email protected]

Abstract.

Many computer systems are now being redesigned to incorporate LLM-powered agents, enabling natural language input and more flexible operations. This paper focuses on handling database transactions created by large language models (LLMs). Transactions generated by LLMs may include semantic errors, requiring systems to treat them as long-lived. This allows for human review and, if the transaction is incorrect, removal from the database history. Any removal action must ensure the database’s consistency (the ”C” in ACID principles) is maintained throughout the process.

We propose a novel middleware framework based on Invariant Satisfaction (I-Confluence), which ensures consistency by identifying and coordinating dependencies between long-lived transactions and new transactions. This middleware buffers suspicious or compensating transactions to manage coordination states. Using the TPC-C benchmark, we evaluate how transaction generation frequency, user reviews, and invariant completeness impact system performance. For system researchers, this study establishes an interactive paradigm between LLMs and database systems, providing an ”undoing” mechanism for handling incorrect operations while guaranteeing database consistency. For system engineers, this paper offers a middleware design that integrates removable LLM-generated transactions into existing systems with minimal modifications.

1. Introduction

For decades, system design has been centered around humans as the primary users and operators. In database management systems (DBMS), transactions are usually initiated and committed by human users. While syntactic errors can be easily detected and corrected through standard error handling, users may also introduce semantic errors or need additional time to decide whether to commit a transaction. To manage these challenges, researchers and engineers have developed ”undo” mechanisms, ensuring both data integrity and system recoverability.

The ARIES protocol (Algorithm for Recovery and Isolation Exploiting Semantics) has long been the gold standard for recovery, combining write-ahead logging, undo, and redo operations to efficiently manage fine-grained rollbacks (Mohan et al., 1992). However, managing semantic errors requires transactions to remain long-lived for user review, making ARIES potentially too storage-intensive for such use cases. Beyond ARIES, alternative approaches such as sagas have been developed for long-running transactions in distributed systems, enabling partial rollbacks through compensating transactions while maintaining system consistency (Garcia-Molina and Salem, 1987). ACTA is derivative work for sagas, and it extends sagas to include extended transactions while still providing compensating transactions and maintaining consistency(Chrysanthis and Ramamritham, 1992). Similarly, advancements in multi-version concurrency control (MVCC) allow systems to maintain historical states, enabling rollbacks to previous valid database states (Bernstein et al., 1987). Escrow methods, often utilized in high-concurrency environments, only allowed conditional updates and guarantees to facilitate safe rollbacks without locking resources (O’Neil, 1986). These mechanisms either allow ”undo” or give users a chance to store long-lived transactions and review the transactions to fix semantic errors.

Large Language Models (LLMs) are increasingly being seen as active operators within systems, either by translating human commands for system interaction or functioning autonomously to maintain the system based on predefined prompts. LLMs have demonstrated their ability to understand system structures and effectively query data (Zhang et al., 2024; Madden et al., 2024; Fernandez et al., 2023; Chen et al., 2024). While these actions typically do not modify the original database state and instead provide more accessible methods for data extraction, more complex interactions involve ”write” actions that alter the database state (Patil et al., 2024; Subramaniam and Krishnan, 2024; Patil et al., 2023; Wornow et al., 2024; Zhang et al., 2024). These write actions can introduce errors, which are often manageable through validation, as well as semantic errors, which may require user review for correction (Zhang et al., 2024; Xu et al., 2024; Patil et al., 2023).

An intuitive solution is to improve the prediction accuracy and reduce semantic errors of LLMs in adhering to user instructions and expectations. However, due to inherent limitations in LLMs, achieving perfect performance is unrealistic. Thus, we must treat incorrect transactions as errors and remove them when necessary. Existing approaches like ARIES, sagas, escrow, and MVCC provide partial solutions but fall short of addressing the challenges posed by long-lived transactions, especially those generated by LLMs. ARIES, for instance, does not support long-lived transactions. Sagas undo transactions by predefined independent compensating transactions, which cannot handle inter-dependent transactions that violate system constraints. Sagas also requires system implementation case by case. ACTA briefly explains how to handle transactions that cannot be undone. However, it does not discuss inter-dependent transactions, which may violate database constraints based on the other transaction’s state, not just an independent transaction itself. Escrow focuses on counter-type data, limiting its applicability. MVCC does not support selective removal of specific transactions. Approaches like rewriting history rely on commutativity checks to remove incorrect transactions and transactions affected by incorrect transactions, but commutativity imposes overly strict constraints (Liu et al., 2000). A more practical approach would relax this standard by enforcing database constraints rather than insisting on identical database states. Thus, we adapted and extended Invariant Confluence to check whether LLM-generated transactions require review and may need coordination with newer transactions. (Bailis et al., 2014). Coordinations stall certain newer transactions to guarantee that the undoing of long-lived transactions always leads to a consistent database state. We also reviewed additional papers on long-lived transactions; see the details in the discussion section.

Table 1. Comparison of Undo Strategies

Feature	ARIES	Sagas	ACTA	Escrow	Proposed Method
Long-Lived Undo		Yes	Yes	Yes	Yes
Unified Implementation	Yes			Yes	Yes
Dynamic Recoverability				Yes	Yes
Diverse Constraints	Yes	Yes	Yes		Yes

In this paper, Section 2 outlines the system settings and requirements. Sections 3 and 4 summarize the processes for identifying transactions that may require coordination for the consistency of long-lived transactions waiting for reviewing. Sections 5 and 6 explain the roles of transaction managers and middleware in supporting coordination mechanisms. Section 7 explores the interplay between system availability, human review, LLM-generated transactions, and the completeness of query and invariant information. Section 8 compares and discusses various strategies, highlighting their strengths and limitations.

At a high level, developers using this framework must first manually analyze the dependencies between each pair of SQL query templates used to interact with their database models. The dependency check details for each pair can be referenced in Table 4. After identifying the dependencies, developers need to register them in the dependency-checking function within the middleware (Section 6). Once a transaction request is received from the user, the middleware passes the dependency-checking function along with the new request to the transaction manager (Section 5). The transaction manager then asynchronously coordinates the transactions. Users can either check the status of their submitted transactions or review LLM-generated transactions (Section 6).

For researchers working on LLM-data system integration, they can focus on each component separately for optimization.

2. System Setting

Refer to caption — Figure 1. System Setting

We consider a system that operates in an environment that accepts transactions from human users and LLMs (Large Language Models). Occasionally, transactions may later be identified as incorrect by users, such as when users decide they no longer want to proceed with a transaction or when a semantic error in LLM-generated transactions leads to an imperfect execution of users’ instructions. For example, in Figure 1, after transaction 2 (T2), an account balance is 50 USD, and transaction 3 (T3) increments 10 USD while all new transactions T4 - T8 have a net effect that decreases by 60 USD. One database constraint requires that the account balance must be positive. T3 is an LLM-generated transaction; the user reviewed it and decided to remove it. In this case, T3 cannot be removed; otherwise, it will lead to a violation of the database constraint. This setting raises two key questions:

(1)

How can we isolate or remove these questionable transactions from LLMs, especially when they may no longer be needed, from database history?
(2)

In certain situations, removing a suspicious transaction might not be feasible while maintaining the consistency of the data system due to subsequent new transactions’ modification to the database.

Solution for (1): In this paper, we mainly suggest two approaches for suspicious transaction removal/separation, and we discussed the problem of other approaches:

Buffering Suspicious Transactions: Rather than committing suspicious transactions directly to the database, the web developer can store these transactions in a buffer. Once an administrative user reviews and either accepts or removes the transaction, the transaction manager within the system can then commit or discard the transaction accordingly.

Buffering Compensating Transactions: A compensating transaction is a specialized transaction used to ”undo” the effects of a previously committed transaction by applying an opposite action to restore the system to a consistent state (Korth et al., 1990). Unlike a rollback, which cancels a transaction before it is committed, a compensating transaction is applied after commitment to address cases where reversal is necessary without altering the original transaction history. For instance, in a banking system, if 100 USD is mistakenly transferred from Account A to Account B, the compensating transaction would transfer 100 USD back to Account A to correct the error. Compensating transactions is essential in complex systems where direct rollbacks may be impractical because the transaction has already been committed.

In our context, the system commits the transaction upon receipt and buffers its corresponding compensating transaction. If an administrative user reviews and decides to remove the transaction, the transaction manager within the system can then commit a compensating transaction to negate the effect of the suspicious transaction on the database. If the transaction is approved, no further action is needed for the database.

To ensure the database maintains consistency while also allowing for the removal or separation of suspicious transactions, we consider several approaches:

1) Full Database Locking: The simplest approach is locking the entire database whenever a suspicious transaction request is received. The system continues running but restricts other transactions until users review the suspicious transaction. This enforces strict serializability and ensures ACID properties, particularly consistency. 2) Sandbox Simulation: Another option is to create a sandbox to simulate transactions. As long as a compensating transaction can remove the effects of suspicious transactions while maintaining consistency, even after subsequent transactions, the suspicious transaction is safe to undo. 3) Granular Locking: Locking specific database parts (e.g., rows or tables) can also support removability. However, determining the appropriate granularity to maintain removability without compromising consistency is challenging. 4) Naive Buffering of Suspicious/Compensating Transactions: Suspicious transactions can be temporarily buffered without immediate execution. However, without coordination, a buffered transaction that is initially valid may become uncommittable due to subsequent modifications. Each of these approaches has limitations, which we will further discuss in Section 8.

5) Buffering with Logical Dependency Checks: A logical dependency check can identify new transactions that would compromise the validity of undoing a buffered transaction without committing the buffered transaction. By holding these identified transactions, the system can ensure that all buffered transactions remain valid. Specifically, if the system has knowledge of database invariants and transaction conditions, an \textit{Invariant Satisfaction} (or \textit{Invariant Confluence}) analysis can assess dependencies between buffered transactions and new transactions. Whether buffering suspicious transactions or compensating transactions, invariant analysis determines if new transactions should proceed or be held, maintaining consistency.

Solution for (2): We identify new transactions that will lead to the inconsistency; we will provide inconsistency detection in section 3:

Consistency: Consistency in this paper refers to the ”C” in the ACID property of database transactions (Gray et al., 1981; Haerder and Reuter, 1983). From the perspective of the database, the developers must guarantee its consistency; specifically, they need to ensure the predefined constraints from the table creators are not violated.

Dependency: We define dependency as follows: when two transactions arrive in sequence, the system may be unable to accept both due to consistency constraints. In this case, the second transaction is said to depend on the first. For example, consider a database constraint requiring an account balance to remain above 0 and assume an account currently holds 50 USD. If the first transaction deducts 40 USD and is marked as suspicious, it may be buffered for possible removal. However, if a second transaction attempts to deduct 20 USD, it will be removed or held, as committing both would violate the balance constraint. Thus, the second transaction depends on the first and requires coordination, such as buffering, to maintain consistency.

Recoverability: In the buffering compensating transaction approach, when a transaction commits to the database, it transforms the database from state 1 to state 2. If we later need to undo this transaction’s changes, we must apply a compensating transaction defined by the system designer. However, there is a risk that a compensating transaction could result in an inconsistent state. We define the ability to remove the committed Transaction 2 with consistency satisfaction as recoverability. In a system that buffers suspicious transactions, the buffered suspicious transaction can always be safely removed without affecting a consistent system state. From the perspective of buffering compensating transactions, the recoverability of a suspicious transaction is decided by its corresponding compensating transaction’s dependency on a new transaction. If a committed new transaction does not depend on its compensating transaction, the suspicious transaction can be recovered.

Availability: Availability is impacted by dependencies between transactions. When a suspicious or compensating transaction is buffered, new transactions that depend on it may also be held. The system will delay any dependent transactions until the buffered transaction receives final approval from administrators or users. This holding process reduces transaction throughput and affects overall system availability.

CAD theorem: There are trade-offs between consistency, availability, and dependency. In our system setting, we cannot achieve the three of them at the same time. This is similar to the CAP theorem in distributed systems (Gilbert and Lynch, 2002).

•

Consistency and Availability: To maintain both, we need to minimize the duration of dependencies by reducing the time that suspicious transactions are held.
•

Consistency and Dependency: Achieving both often requires sacrificing some availability, which may involve reducing the number of accepted transactions and delaying transaction completion within a defined time frame.
•

Availability and Dependency: To maintain availability and dependency, we must relax transaction consistency requirements. Specifically, fewer database or application constraints should be enforced at the system design stage.

2.1. Middleware

For web developers and system engineers, introducing a new framework in a modern enterprise is often discouraged due to concerns about cost and system reliability. Frameworks like LangChain address this by offering tool packages that integrate into larger systems as microservices (LangChain, 2024). Similarly, this project focuses on creating middleware compatible with existing web frameworks, minimizing disruption to current workflows. By avoiding changes to the database or data schema, our solution ensures data integrity and reliability without requiring complex system migrations.

The Model-View-Controller (MVC) (Burbeck, 1992) is a widely used design pattern in software development, especially for web applications, which divides an application into three interconnected components: Model, View, and Controller. The Model manages data and business logic, representing the application’s core functionality by handling data storage, retrieval, and processing. The View serves as the presentation layer, displaying data from the Model to the user and relaying user commands to the Controller. The Controller acts as an intermediary, processing user inputs, updating the Model, and instructing the View to display the updated data.

In our setting (in Figure 2), after the view receives the request from either users or LLM agents, it will not directly talk to the controller. Instead, the control workflow will be delegated to the middleware, which ensures that new transactions that depend on buffered transactions can be appropriately coordinated to maintain system consistency.

As we discussed before, there are two approaches to buffering: buffering suspicious transactions and buffering compensating transactions. We will use the second one as an example to explain how the middleware works.

2.2. Dependency-Check for a Middleware

We assumed web development based on MVC, and the middleware only received the transaction name and parameters. Usually, a transaction request triggers predefined transaction logic according to the transaction name and parameters (details in section 3). The web developer needs to pre-analyze that logic’s dependency and register it in the dependency check function. In the runtime, if the server receives one new transaction, it can always check its dependency with former transactions with the dependency check function.

2.3. Buffering Compensating Transactions

2.3.1. Request from the User

Assume that the User or LLM interacts with the system via an HTML interface, with actions on the webpage submitted through URL requests. Rather than directly calling database systems or ORM operations, these URL-based requests are first placed into a queue, where the Transaction Manager processes them periodically. Each transaction request is assigned a unique transaction ID before entering the queue. This ID is also logged or updated in the transaction status table with a status of “submitted,” which allows for universal tracking and identification of each transaction’s status.

In a separate process running parallel to the user-facing request handler, the Transaction Manager periodically retrieves requests from the queue, assigning each request for dependency check. The Transaction Manager (details in section 4) decides whether a new transaction should be buffered or committed based on its dependency on buffered transactions. It also checks if any transactions in the buffer are ready to be removed or executed.

2.3.2. Review a Buffered Transaction

If an admin or user identifies a suspicious buffered LLM-generated transaction, they can choose to accept or remove it. Once the approval or removal is triggered and submitted to the system via a URL request, the request is added to a decision buffer.

During each iteration, the Transaction Manager retrieves results from the decision buffer and removes the corresponding buffered transaction. It then commits these removed or “mature” buffered transactions and updates the status table accordingly before accepting new transactions.

2.3.3. Checking the Status

When a request for transaction status is received, the system retrieves the result or status from the status table using the transaction ID.

2.4. Buffering Suspicious Transactions

Same framework except for buffering suspicious transactions instead of compensating transactions.

2.5. API Documentation

(1)

Endpoint: transaction_request

Description: Initiates a transaction request and returns a unique transaction ID.
HTTP Method: POST
Parameters:
- •
  
  transaction_name (string): The name of the transaction.
- •
  
  transaction_parameters (object): Specific parameters required for the transaction.
Returns:
- •
  
  transaction_id (string): A unique identifier for the requested transaction.
(2)

Endpoint: transaction_review

Description: Accept or remove an LLM-generated and suspicious buffered transaction.
HTTP Method: POST
Parameters:
- •
  
  transaction_id (string): The unique identifier of the transaction to be processed.
Returns:
- •
  
  status (string): The processing result, indicating whether the transaction has been accepted or removed.
(3)

Endpoint: transaction_status

Description: Retrieves the status of the database query result of a specific transaction.
HTTP Method: POST
Parameters:
- •
  
  transaction_id (string): The unique identifier of the transaction.
Returns:
- •
  
  transaction_status (string/object): The current status or database query result related to the transaction.

3. Constraints in Long-Lived Transactions

3.1. Invariant Satisfaction

Invariant confluence: Invariant confluence (Bailis et al., 2014; Whittaker and Hellerstein, 2020), originally applied in distributed systems, ensures that, after merging, distributed data stores still satisfy a common invariant (i.e., database constraints). Specifically, starting from a common ancestor database state, there are two concurrent branches, each with a sequence of transactions, as shown in Figure 4. If each sequence results in a consistent database state (All S are consistent state) and the two branches can be merged without violating consistency, this state is called invariant confluence.

A key theorem of invariant confluence states that a set of transactions T can execute without coordination and converge to a consistent state if and only if T is invariant confluent. Coordination here means that if two transactions are not invariant confluent, one should be held to prevent concurrent execution. In the setting of this paper and Invariant Satisfaction, coordination requires that the latter be held until the former finishes reviewing.

The invariant confluence between two transactions can be assessed by logically verifying their operations against database invariants. If two transactions are not invariant confluent and the first transaction executes earlier, we say that Transaction 2 depends on Transaction 1.

Invariant Satisfaction: In our setting, while a suspicious transaction remains buffered, new transactions may be committed to the database in ways that prevent the buffered transaction from committing consistently. When treating the buffered transaction and a new transaction as parallel, concurrent operations, this scenario is similar to invariant confluence.

We propose that if the buffered transaction and the new transaction are invariant confluent, they can proceed without coordination. If they are not invariant confluent, the new transaction must be held until the buffered transaction is either accepted or removed by the user.

A key difference from traditional invariant confluence is that our setting involves only a single transaction in each ”branch.” We refer to this as Invariant Satisfaction (I-Satisfaction) to specify the relationship between buffered and new transactions. This also implies that if the transactions are not I-satisfied, and the buffered transaction occurs before the new transaction, then the new transaction depends on the buffered transaction.

3.2. Invariants Analysis

In this section, we outline a series of invariants that web developers should consider. We list and explain pairs of invariants and operations that do not meet Invariant Satisfaction requirements and, therefore, require coordination. (assuming 1. Transaction A is a suspicious transaction that might be removed and not exist; 2. Transaction B is a new transaction; 3. Transaction A occurs before Transaction B; 4. Transaction A and Transaction B are not committed to the database for a constraint validation check but are logically analyzed for dependency)

1 Database Constraints: These are built-in constraints within relational database systems. This section examines cases where these constraints do not meet Invariant Satisfaction requirements, necessitating coordination (Chamberlin and Boyce, 1974).

UNIQUE: Ensures all values in a column are unique. Non-Invariant Satisfaction Example: Consider a database table of user emails. If Transaction A inserts a user with the email "[email protected]" and Transaction B also inserts a user with the same email concurrently, merging these states would violate the uniqueness constraint. Coordination is required to prevent such conflicts.

FOREIGN KEY: Maintains integrity by preventing actions that would destroy links between tables. Non-Invariant Satisfaction Example: Suppose Transaction A inserts a new employee record with a foreign key reference to a department, and Transaction B deletes the same department concurrently. Merging these states would result in an invalid state with a ”dangling pointer.” Coordination ensures that dependent data remains consistent.

Auto-Increment: Automatically generates a unique identifier for new rows. Non-Invariant Satisfaction Example: If Transaction A inserts a new record with ID 100, and Transaction B inserts another record concurrently with ID 100 as well, the auto-increment constraint is violated. Coordination ensures unique ID generation across transactions.

CHECK Constraints: Enforces that values in a column must meet a specific condition. Non-Invariant Satisfaction Example: In a financial application, a column for account balance may have a CHECK constraint requiring that all balances be greater than or equal to zero. If Transaction A deducts $400 from an account with a balance of $400 and Transaction B deducts $100 simultaneously, the merged state could result in a negative balance, violating the constraint. Coordination ensures that these operations do not breach the CHECK condition. Without knowing the balance, two decrement transactions are not I-satisfaction for ¿ constraint; two increment transactions are not I-satisfaction for ¡ constraint; all transactions are I-satisfaction for = constraint.

2 Abstract Data Type Constraints:(Conway et al., 2012; Lynch and Merritt, 1993; Shapiro et al., 2011)

Counter Constraints: Similar to check constraints. Non-Invariant Satisfaction Example: In an accounting system, assume the balance of an account must remain positive. If Transaction A deducts $30 and Transaction B deducts $40 simultaneously from an account with only $50, the resulting balance after merging would be negative, violating the constraint. Coordination is needed to manage concurrent deductions.

Set, List, Map Size Constraints: Track the collection size or state accurately. Non-Invariant Satisfaction Example: In a collaborative document editing system, if Transaction A adds a new section to a list and Transaction B adds another section simultaneously, the final list size or state might not reflect both changes correctly. Coordination ensures the consistency of collection updates.

New Constraints (Not Mentioned by the Coordination Avoidance Paper)

Tree and Hierarchical Constraints: Ensure relationships between parent and child nodes adhere to specific rules. Non-Invariant Satisfaction Example: In a file system, Transaction A deletes a parent directory while Transaction B adds a new file to that directory. Merging these transactions without coordination would violate the hierarchical constraint, leading to an invalid state.

Graph Constraints: Enforce rules specific to graph-based data structures, such as avoiding circular references. Non-Invariant Satisfaction Example: In a social network, Transaction A adds a relationship where User A follows User B, and Transaction B adds a relationship where User B follows User A. Merging these states would create a cycle, violating the constraint. Coordination prevents such circular references.

3 Application Data Constraints (ADC): This section provides examples of data constraints defined by applications that go beyond what can be enforced by built-in database constraints (Nodine et al., 1992).

Sequential Order: Ensure data follow a specific sequence. Non-Invariant Satisfaction Example: In a task management application, Transaction A updates the start_date of a task, and Transaction B updates the end_date. If these transactions occur simultaneously and the start_date ends after the end_date when merged, the sequence is violated. Coordination ensures the proper order of operations.

Conditional Value Constraints: Values in certain fields depend on others in the data structure. Non-Invariant Satisfaction Example: In a customer management system, a VIP customer must have an account balance above $10,000. Transaction A modifies the customer’s VIP status, and Transaction B deducts from their balance. Without coordination, the merged result could reflect a VIP status with an insufficient balance, violating the constraint.

Session Data Consistency: Ensures that data remains stable and isolated within a user’s session, preventing modifications by other users during that session. Non-Invariant Satisfaction Example: In an online shopping cart, if Transaction A holds the session while adding an item and Transaction B from another user modifies the cart’s contents simultaneously, merging these actions could result in an inconsistent cart state. Coordination helps maintain session data stability and integrity.

4 Application Process Constraints (APC): This section provides examples of constraints for application procedures or workflows, focusing on the sequence and conditions of operations rather than the data state itself (Biliris et al., 1994; Nodine et al., 1992).

Sequence Requirements: Define that specific actions must occur in a specific order. Non-Invariant Satisfaction Example: In an order processing system, payment must be confirmed before delivery is initiated. Transaction A processes the delivery, while Transaction B confirms payment from the same order. If merged without coordination, delivery could occur before payment, violating the process requirements.

Grouped Actions: Enforce that certain operations must happen together. When adding a new student, they must be inserted into both the department and dorm tables. Non-Invariant Satisfaction Example: If Transaction A inserts the student into the department table and Transaction B inserts them into the dorm table, merging without coordination may leave an inconsistent state where the student is only present in one table.

Branching/Conditional Logic Constraints: Define constraints based on conditional paths in the data or process flow. Non-Invariant Satisfaction Example: If Transaction A updates $x$ , and Transaction B adjusts $y$ based on the new value of $x$ (e.g., if $x>0$ , $y\times 2$ ; otherwise, $y+2$ ). The existence of x decides the value of y, and they cannot be run concurrently since Transaction B needs to wait for the result of Transaction A.

4. I-Satisfaction Check, Completeness of Query, Invariants

Scenario 1: Assume that transaction type A is deducting $x_{1}$ USD from account balance of row $y_{1}$ , and transaction type B is deducting $x_{2}$ USD from account balance of row $y_{2}$ . Invariant: account balance $>0$ .

In a web development environment with a relational database, queries within a transaction are represented as SQL query templates, which may require users to input specific parameters. This means that transactions can have different levels of completeness depending on whether all required parameters are specified. For example, in Scenario 1, parameters like $x_{1}$ , $x_{2}$ , $y_{1}$ , and $y_{2}$ may or may not need to be defined by the user, depending on application logic. In more complex transactions, values for $x_{1}$ , $x_{2}$ , $y_{1}$ , and $y_{2}$ might be dynamically retrieved from a SELECT query.

On the other hand, we cannot assume that every developer or user has prior knowledge of all constraints. We will explore how varying levels of completeness in query information affect dependency checks for ensuring consistency. For Scenario 1, we explained the granularity of dependency checks, and in Table 2, we summarized dependency checks for other constraints based on actions and completeness of information.

Complete Query (No User Input), with Invariant Information: When no user input is required, the exact parameters for each transaction are known in advance. For example, transaction type A always deducts 2 USD from account row 1, and transaction type B also deducts 2 USD from account row 1. In this case, without knowing the balance, these two transactions require coordination. We can simply hard-code transaction type B to wait for type A to finish whenever they interact. Here, the second transaction depends on the first only if they both deduct from the same field in the same row, which has an invariant that ensures the value remains positive.

Complete Query (With User Input), with Invariant Information: When user input is required, dependencies between templates cannot be hard-coded. Instead, we can only identify the row affected once the user submits the necessary parameters. For example, if transaction A deducts 2 USD from account row $y_{1}$ and transaction B deducts 2 USD from account row $y_{2}$ , we need to dynamically check if $y_{1}$ and $y_{2}$ refer to the same row based on user inputs. If they do, coordination is required.

Partial Query with Invariants Info: If the row number from transaction 1 is read from the system, then transaction 1 is an incomplete partial query until the read is finished. To check the invariant between two transactions requires “locking” the entire column. In this case, if they deduct a field in the same column with an invariant requiring a positive value, the second transaction depends on the first transaction.

If the row number for transaction 1 is retrieved from the database system, then transaction 1 is considered an incomplete or partial query until this read operation is completed. To check the invariance between two transactions, it may be necessary to “lock” the entire column. In this case, any new deduction transaction depends on the first deduction transaction if they both deduct a field in the same column, with an invariant requiring that the value remain positive.

Query with No Invariants Info: In Scenario 1, if we do not know where the invariant has been applied to the table, the only solution is to ”lock” the table. If the second transaction modifies the same table, it should wait and be held. In cases with foreign key constraints, where there is no referencing/referenced table information, strict serializability should be enforced.

In Scenario 1, if we do not know where the invariant has been applied to the table, the only solution is to “lock” the entire table. If the second transaction modifies the same table, it must wait and be held. In cases involving foreign key constraints, where no information about referencing or referenced tables is available, strict serializability should be enforced.

Row/Column/Table/DB?: Is there a universal rule for determining whether a dependency check depends on operations to the same row, column, table, or entire database? The answer is no; it depends on the specific operations and invariant pairs involved. Strictly speaking, an Invariant Satisfaction Check (IC) acts as a form of locking, but it leverages knowledge of invariants to narrow the scope of what needs to be locked. When query and constraint information is incomplete, a broader scope of “locking” is required.

Consistency Validation: It is important to note that, in invariant confluence, all transactions have been committed within a branch. All intermediate and final states are valid under database constraints. In our setting, however, transactions are buffered and thus not immediately committed for constraints validation. The logical dependency check only identifies dependencies between transactions; it does not determine whether a transaction can ultimately commit while preserving consistency.

A buffered suspicious transaction only holds dependent transactions to prevent them from reducing the buffered suspicious transaction’s likelihood of committing successfully. For instance, in Scenario 1, a second transaction might consume a positive balance, which could make the first (deducting) transaction inconsistent. However, whether the first transaction can be committed while maintaining system consistency depends on the current balance. The consistency of an individual transaction is ultimately verified and enforced by the database’s validation logic (not needed for application-level constraints).

In the case of the dependent-upon (buffered) transaction is found inconsistent after physical constraints validation, it will be discarded, and the dependent transaction will not be affected, since the dependent-upon transaction does not change any database state.

Physical Dependency Check & Logical Dependency-Check: We knew that two concurrent decrement transactions may lead to a database state that violates database consistency. Rather than committing transactions to the database and relying on built-in constraints validation, we use logical reasoning to identify dependencies. There are several reasons we prefer a Logical Check over a physical check. 1) A logical check allows us to extend and support dependency checks that are not natively built into the system. 2) Dependency checking through physical transaction simulation is less efficient and will be explained in the discussion section.

Invariants	Transaction 1 Action	Transaction 2 Action	Complete Query with Invariant	Partial Query with Invariant: $T_{1}$ missing row/column information	Without Invariant: do not know which row/columns has constraint
Database Constraints
Uniqueness	Update/Insert	Update/Insert	Update/Insert the same column	$T_{2}$ Update/Insert the same table as $T_{1}$	$T_{2}$ Update/Insert the same table as $T_{1}$
Foreign Key	Delete key in a referenced table with no cascading delete	Insert to the referencing table	Insert row referencing the deleted row of the referenced key	Insert row referencing any key in the referenced table	Insert row referencing any key in the referenced table
Auto-Increment	Insert	Insert	Insert the same table	Insert the same table	Insert the same table
Check (less than)	Increment	Increment	Increment the same field	Increment the same column	Increment the same table
Check (more than)	Decrement	Decrement	Decrement the same field	Increment the same column	Increment the same table
Abstract Data Types Constraints
Counter (less than)	Increment	Increment	Increment the same field	Increment the same column	Increment the same table
Counter (more than)	Decrement	Decrement	Decrement the same field	Increment the same column	Increment the same table
Set, List, Map: size =	Mutation to the ADT	Mutation to the ADT	Mutate the same table (one table is a set)	Mutate the same table (one table is a set)	Mutate the same table (one table is a set)
Tree: Child has a Parent	Delete the parent	Insert a child	(A row is a node) Insert row as the child of the deleted row	Inserted row is in the same table as the deleted row	Inserted row is in the same table as the deleted row
Graph: No Circular Reference	Insert a reference $b$	Insert $b$ referencing $a$	(A row is an edge) Insert in the same reference column	(A row is an edge) Insert in the same reference column	(A row is an edge) Insert in the same table
Application Logic Data Constraints
Sequential Order: In a row, the start date is smaller than the end date	Insert/Update	Insert/Update	Insert/Update the same row	Insert/Update the same table	Insert/Update the same table
Conditional Value Constraints: If a customer’s status is VIP, their account balance must be above a certain threshold	Insert/Update (VIP status)	Insert/Update (deduct balance)	Inserted balance row constrained by the VIP row	Inserted balance row constrained by any row in the VIP status table	Inserted balance row constrained by any row in the VIP status table
Session Data Consistency: Data states do not change unexpectedly within a single session	Action from user 1	Action from user 2	Different users work in the same table	Different users work in the same table	Different users work in the same table
Application Logic Process Constraints
Sequence Requirements: Payments happen before deliveries	Insert (order is LLM generated)	Insert (delivery should wait)	Delivery insert row corresponds to a specific payment insert	Delivery insert corresponds to any insert in the payment table	Delivery insert corresponds to any insert in the payment table
Grouped Actions: Adding a student includes inserting them in both the department table and the dorm table	Insert	Insert	One must wait until the other appears
Branching/Conditional Logic Constraints: Transaction $A$ updates $x$ ; transaction $B$ if $x>0$ , $y\times 2$ , else $y+2$	Update (transaction 1)	Update (transaction 2)	Transaction $B$ needs to wait until transaction $A$ is finished

Table 2. Comparison of Transaction Actions under Different Query and Invariant Conditions

5. Transaction Manager

In this section, we consider Scenario 2: transaction type A adds $x_{1}$ USD to the account balance of the row $y_{1}$ , and transaction type B deducts $x_{2}$ USD from the account balance of the row $y_{2}$ . Invariant: The account balance must remain $>0$ .

Isolation Coordination: Isolation in this paper refers to the ”I” in the ACID properties of a database system, ensuring that transactions are processed independently to avoid conflicts (Gray et al., 1981). Isolation coordination involves concurrency control to guarantee the serializability of transactions. When two transactions have write-write, read-write, or write-read conflicts and are running concurrently, isolation coordination ensures that the outcome is equivalent to executing these operations in a specific sequential order. In our framework, the database manages isolation, while the middleware we proposed is responsible solely for maintaining consistency.

Consistency Coordination: We have previously discussed consistency and dependency. To ensure system consistency while allowing for the potential removal of suspicious transactions, coordination between transactions is necessary. For instance, if we have two transactions that are both type A (increment operations) in a given scenario, there is no dependency, and no coordination is required. Similarly, if one transaction is an increment and the other a decrement, no coordination is needed. However, if both transactions are decrement operations, a dependency exists, and the second transaction should wait until the first transaction is complete.

Since we cannot assume a specific isolation level in the database system, we consider that transactions without coordination constraints may be committed concurrently.

5.1. Processes vs. Transaction Manager

We may need a machine to coordinate transactions. In theory, all uncommitted transactions could be managed in separate processes and threads, which would host transactions waiting for locks, perform simulations, or carry out logical dependency checks between new transactions and all buffered suspicious or compensating transactions. However, this approach has high bandwidth costs. More importantly, it is impractical in a long-lived transaction scenario, as we discussed, where processes might need to wait for days for user reviews. A more efficient solution is to build a transaction manager that can track transactions, manage their state, and commit them as appropriate.

5.2. Transaction Manager Environment

Committed	Buffered	New
T1, T2, T3	(T4) (T5) (T6)	T7

In the above diagram, we provide an example of dependency checking, and there are three types of transactions:

Committed transactions constitute the current state of the database. Buffered transactions include either suspicious transactions or compensating transactions to support transaction separation/removal. Buffered transactions also include those that depend on other previous buffered transactions. New transactions are added to the buffer if they are marked as suspicious. The system then checks whether a new transaction depends on any buffered transaction. If there are $N$ buffered transactions, each new transaction requires $N$ dependency checks. When a dependency is identified, it is recorded accordingly, as characterized in Table 1.

5.2.1. How to manage/buffer transactions:

The transaction manager handles all incoming new transactions serially, one at a time. Each transaction is evaluated against Invariant-Satisfaction for dependency check. If there is no dependency: The transaction is materialized (committed) into the database. If dependencies exist: The transaction is retained in the buffer for future processing.

5.2.2. How to check dependency:

Assume we have two transactions, T1 and T2. Go to Table 1 and check whether these two have actions paired in the same row of the table. Then, based on the completeness of the information at the time of commit, they can get to know the dependency criteria. For example, T1 is UPDATE key = 1 from table1 where id = 1, AND T2 is Update key = 2 from table1 where id = 1, and we know the ”key” field has a uniqueness constraint. Then we go to table 1, go to row 1, and the ”Complete Query with Invariant” column since we see it mentioned the same actions as T1 and T2, and we have complete query information that we know those transactions are working on the same ”key” field which has uniqueness constraints. Then, dependency and coordination are needed. If the transaction has missing parameters related to columns or rows because it ”read” real-time information from the database, we should check ”Partial Query with Invariant: missing row/column information.” If we do not know which rows/columns/tables have this constraint, but we know there may be one somewhere, we should go to ”Without Invariant: do not know which row/columns has a constraint.”

5.2.3. How to represent the interdependence between transactions:

Transactions are retained in the buffer not just because they are suspicious but because their dependencies are logged in the Dependency Matrix. The Dependency Matrix tracks relationships between transactions, indicating interdependence, as shown in the table below.

5.2.4. Algorithm for Adding a New Transaction

This algorithm is implemented as process_transactions().

Input: DB, Transaction Buffer, Dependency Matrix DM, New transaction T
Output: Updated DB, Updated Transaction Buffer, Updated Dependency Matrix

If T is suspicious, it will be added to the DM and wait for administration processing. Then, the transaction manager checks T’s dependency against all existing transactions in the dependency matrix. For any transaction with more than one dependency identified, it specifies its dependency in the dependency matrix.

If T is not suspicious, Then the transaction manager checks T’s dependency against all existing transactions in the dependency matrix. Only after the dependency is identified will it be added to the transaction buffer, and its dependency will be specified. For non-suspicious transactions, if no dependency is detected, they will be committed to the DBMS.

5.2.5. Algorithm for Materializing a Buffered Transaction

This algorithm is implemented as check_for_materialization().

Input: DB, Transaction Buffer, Dependency Matrix DM
Output: Updated DB, Updated Transaction Buffer, Updated Dependency Matrix

The Transaction Manager regularly checks the Dependency Matrix. For each transaction in the Dependency Matrix:

(1)

First, check whether it is approved by the administrator or if it is not a suspicious transaction.
(2)

Then, check the dependency table vertically to see whether the transaction is still waiting for others.

If both conditions (1 and 2) are satisfied, the Transaction Manager will:

•

Remove its horizontal dependencies from the Dependency Matrix.
•

Release any transactions that depend on it.
•

Commit the transaction to the DBMS.

6. Experiment

In our system setting, we characterize the availability of the middleware and web service pipeline by the buffered rate. While end-to-end performance—from the user’s request to transaction completion—may serve as a more comprehensive availability metric, it is influenced by the specific implementations of the transaction manager and middleware, which are not the primary focus of this paper. Also, database performance can impact end-to-end performance. Thus we assume that the database completes all transactions immediately once they exit the middleware. We only simulate transaction management and do not commit it to the database.

\text{Buffered rate}=\frac{\text{Number of buffered transactions}}{\text{Number of total test transactions}}

TPC-C (Transaction Processing Performance Council Benchmark C) (Transaction Processing Performance Council, 2010) is a standardized benchmark for evaluating the performance of transaction processing systems. It simulates an order-entry environment, which includes operations such as processing new orders, payment transactions, order status checks, deliveries, and stock level monitoring. TPC-C is widely used to measure and compare the efficiency and scalability of database and transaction management systems under a complex, real-world workload.

We have a naive intuition that with more dependency between transactions, the buffered rate will increase. In a web service system, if the proportion of inter-dependent transactions is higher, the buffer rate will accordingly increase. Thus, a fixed distribution of transactions could give us a better experiment parameter control to explore other parameters that affect the buffered rate. This is also the reason we use TPC-C as the test framework because it assumes a roughly fixed distribution between 5 types of transactions.

A natural intuition suggests that as dependencies between transactions increase, the buffered rate will also rise. In a web service system, a higher proportion of interdependent transactions leads to a corresponding increase in the buffered rate. Therefore, using a fixed distribution of transactions provides better control over experimental parameters, allowing us to explore other factors affecting the buffered rate. This rationale also underlies our choice of the TPC-C framework for testing, as it maintains a roughly fixed distribution across five transaction types.

We implement our test frame by adapting and rewriting a TPC-C benchmark tool (py-tpcc) (Pavlo, 2011). It is originally used to generate TPC-C transactions, dock transactions to a database, and measure the throughput performance of the target database. We rewrite this package, dock the transactions to our transaction manager, and control the flow of transaction generation. In our experiments, we mainly control 2 factors, the interval between two LLM-generated suspicious transactions (SI), and the interval between two user reviews to accept or remove the suspicious transaction, (RI). For simplicity, we assume equal intervals for both RI and SI.

Question 1: Benchmark Comparison: how does the completeness of the query and invariant affect the buffered rate

From our previous discussion, we understand that with higher information completeness to queries and invariants, we can assess transaction dependency with finer granularity. For example, in Scenario 1, with complete query and invariant information, we identify a dependency if the second transaction modifies the same field in the same row as the first transaction, assuming a ”¿ 0” constraint on that field. However, without prior knowledge of invariants, we may need to block the entire system to ensure consistency for the first transaction.

We assume that a system without using invariant information can coordinate transactions in the TPC-C benchmark only through table-level locking. This is our baseline.

Therefore, we tested how dependency check granularity, under different levels of query and invariant completeness, affects the buffered rate. Our previous experiments used dependency checks based on Invariant Satisfaction. In TPC-C’s design, all invariant-related transaction queries are complete at the time of coordination checks, meaning a new transaction only depends on a buffered transaction if they modify the same field. To compare, we implemented a transaction manager without prior invariant knowledge using table-level dependency checks. This means that a second transaction is held or buffered as long as it modifies the same table as the first transaction.

Conclusion 1.1: When RI is infinite, and SI is set to 5, there is no user review to reduce buffered transactions. With 20 trials, the average buffered rate with field-level granularity dependency checks (complete invariants and query information) is lower than with table-level granularity checks (incomplete invariants and query information). This suggests that with complete information about invariants and queries, a system can achieve lower buffered rates through finer-grained coordination compared to cases with no invariant information (see Figure 11).

Conclusion 1.2: If user reviews occur at a rate of 80%, which removes 80% buffered transactions, (RI = 50, SI = 5, with 20 trials for each transaction length), the average buffered transaction rate in the case with complete information is higher than in the case without complete information (see Figure 11).

Benchmark Conclusion 1.3: Regardless of user reviews, the buffered rate in the baseline benchmark (without invariant information) is double that of the benchmark utilizing complete query and invariant information.

Question 2: Which type of transaction is most likely to cause congestion or hold new transactions, leading to an increase in the buffered rate?

From the manual analysis and invariant confluence, in TPC-C, the dependency exists between New-order and New-order Transactions; and between Delivery and Delivery Transactions.

Question 3: What factors related to suspicious transactions affect the transaction acceptance rate?

Conclusion 3.1 If the RI is set to infinity—meaning there is no review to remove buffered transactions that may congest new transactions—the buffered rate does not change significantly as the number of transactions increases.

In this scenario, we assume an SI of 5, where one out of every five transactions is labeled as an LLM-generated or suspicious transaction (e.g., True, False, False, False, False, True, …). We calculate all buffered rates 5 times and average the results. As shown in Figure 11, the average buffered rate remains steady at around 0.6 if the number of transactions is larger than 400, regardless of the increasing number of transactions.

Conclusion 3.2 If the RI is set to infinity, a longer SI leads to a decrease in the average buffered rate.

We varied the interval between two LLM-generated transactions, testing SIs of 2, 5, 10, and 50 (e.g., one LLM-generated transaction for every 2 transactions, with 20 trials per SI). As shown in Figure 11, the test case with higher SIs results in a lower average buffered rate.

Fact 3.3: If the user reviews and either approves or rejects all transactions at any point, all buffered transactions will be processed, resulting in a buffered transaction rate of 0.

Conclusion 3.4 A shorter RI results in a lower number of average buffered transactions.

Since reviewing all transactions would result in a buffered rate of 0, we tested the effect of partially reviewing transactions by randomly removing 80% of the transactions in the buffer. Assuming that human reviews occur less frequently than LLM-generated transactions, we set the RI (user review interval) to be a multiple of the SI (suspicious transaction interval), with values of (1, 2, 5, 10, 100) times the SI. For this test, we fixed the SI at 5, giving RI values of (5, 10, 25, 50, 500), with 20 trials per RI.

We focused on cases where the number of transactions exceeded the RI to ensure human processing within each test case, omitting scenarios without human processing. Consequently, a few lines in Figure 11 are truncated. Also, as shown in Figure 11, a shorter RI results in fewer buffered transactions.

Conclusion 3.5: When RI is fixed, the number of buffered transactions remains relatively stable, even as the total number of transactions increases.

Conclusion 3.6: Because the number of buffered transactions remains stable, the buffered transaction rate decreases as the total number of transactions increases (see Figure 11).

7. Discussion

7.1. Mechanism Comparison

We have discussed several mechanisms and approaches for addressing our system setting. However, we also mentioned they are not a good choice for our questions. How do these mechanisms compare, and why do we prefer buffering suspicious or compensating transactions with the invariant-satisfaction dependency check?

7.1.1. Naive Buffering

In naive buffering, incoming removable transactions are stored in a buffer and committed or removed if they are reviewed by the admin users. The naive buffering strategy does not assume any prior knowledge about invariants. Essentially, suspicious transactions are moved to the end of the transaction history, with no coordination between new transactions and buffered transactions. Thus, there is a risk that a transaction deemed consistency-valid may fail to commit (Scenario 1), and naive buffering does not guarantee the commit of a buffered transaction.

7.1.2. Transaction Simulation

Transaction simulation periodically checks if a compensating transaction can be used to remove an erroneous transaction. When a new transaction arrives, we set up a sandbox environment and commit various combinations of the new and compensating transactions to a database snapshot. As long as all combinations preserve system consistency, the new transaction can be committed. This simulation approach ensures that any committed transaction can be removed if necessary using its corresponding compensating transaction.

If there are buffered compensating transactions and new transactions ready for undo simulation, let $N$ be the number of buffered transactions and $n$ the number of new transactions. $(2^{N})\times n$ simulations may be needed to test all combinations. Meanwhile, our proposed buffering compensating/suspicious approach only requires $N\times n$ I-satisfaction dependency checks. Additionally, it’s important to note that simulation involving physical I/O is significantly more time-intensive than in-memory logical I-satisfaction checks.

7.1.3. Buffering Suspicious Transaction Is Better than Buffering Compensating Transaction ?

One key drawback of buffering compensating transactions is if a transaction is reviewed and removed by the user, the system actually interacts with the database twice for committing. It commits the original suspicious transaction and the undo compensating transaction. This doubles the transaction overhead if users remove all LLM-generated transactions. For buffering suspicious transaction approaches, no additional actions are required other than removing transactions from the buffer, and no database execution is needed.

7.1.4. Buffering Compensating Transaction is better than Buffering Suspicious Transaction:

Invariant satisfaction only provides guidance on whether two transactions need coordination if they are requested by users concurrently. However, invariant satisfaction does not provide any guarantee whether this transaction can be committed and satisfy database consistency. We summarize the constraints validation of the above-mentioned approaches below:

•

Pure Buffering: Constraint validation is done at the point of committing or through extra application logic.
•

Simulation: Constraint validation is tested at the simulation stage.
•

Buffering Suspicious Transactions with Invariant Satisfaction: If transaction 1 depends on transaction 2, the transaction will be constraint-validated until transaction 1 has been reviewed by the user and submitted to the database. If transaction 2 is not suspicious, it needs to wait until transaction 1 is finished. Thus, constraint validation has been delayed significantly.
•

Buffering Compensating Transactions with Invariant Satisfaction: The suspicious transaction can be committed and validated when the user requests. The corresponding compensating transaction is the same. There is no delay for the consistency validation check.

In scenarios with many consistency-invalid transactions that should be aborted, buffering compensating transactions enables immediate abortion and removal of the suspicious transaction at request time: When all transactions are consistency-invalid, buffering suspicious transactions wait a long for user review and unnecessarily pile up the buffer with invalid transactions.

7.2. Lock

A dependency check based on invariant satisfaction is similar to locking in a database system. The dependency check identifies conflicts between two transactions and holds the second transaction until the first one is completed. In databases, locks are primarily used to ensure isolation between transactions. If there is a write-write (WW), write-read (WR), or read-write (RW) conflict between two transactions, the lock is applied to the necessary data, requiring one transaction to wait until the other releases the lock (Eswaran et al., 1976). However, locks for isolation typically have fixed granularity, such as row-level, table-level, or system-level locks, which may be insufficient for maintaining consistency in complex coordination scenarios which has varying granularity requirements.

For example, In the context of ensuring consistency, 2PL may require a table-level lock. To be more specific, for consistency coordination required invariants like uniqueness, the former transaction must lock the whole table to make sure all later transactions will not insert any row to lower the former transaction’s consistency priority (shrinking the former transaction’s range of unique names) (Eswaran et al., 1976). Holding a table lock is also true for a foreign key since if I want to delete a referenced foreign key tuple, I need to lock the whole referencing table to make sure no new transaction will reference the deleted referenced key. For Invariant ¿, ¡, the system may only need to lock the corresponding rows to make sure no other transaction can modify it and negatively affect the former transaction’s consistency priority. Since the lock-based approach does not have information to invariant beforehand, it is necessary to assume a table-based lock to guarantee consistency.

For example, a two-phase locking (2PL) might enforce a table-level lock. It is a good granularity choice for invariants like uniqueness, where the first transaction must lock the entire table to prevent subsequent transactions from inserting rows that could violate the uniqueness constraint. However, it is not necessary for CHECK invariants with conditions (e.g., >, <); the system may only need to lock specific rows to prevent other transactions from altering them and compromising consistency. This fixation on granularity leads to coordination overhead.

7.3. Strict Serialization and Commutativity in Transactions

Transaction orders can lead to different database states if two transactions are not non-commutative (Eswaran et al., 1976). If we assume there is a ”correct” order of transaction, any change to this order may lead to an ”incorrect” state. Our framework can also ensure a favorable and ”correct” outcome by coordinating these transactions based on commutative checks (Ranjan et al., 2021).

For buffered suspicious transactions and new transactions, to ensure the final state is identical to that of the ”correct” order, we can use a commutativity check to identify necessary coordination. For a new transaction, if it is not commutative with any of the buffered transactions, then dependencies exist between the new transaction and the non-commutative buffered transactions. The coordination mechanisms remain the same as previously discussed. This approach not only guarantees a consistent database state but also ensures that the final state is identical to the ”correct” transaction order.

7.4. Undo with Compensating Transaction

A compensating transaction may not exist if there is no logical inverse for a given operation. For example, setting a value to zero eliminates any knowledge of its previous state, making it impossible to reverse mathematically. Similarly, sending a notification or email represents an external action that cannot be ”unsent” or undone once it leaves the system. This absence of an inverse action means that the system cannot revert these transactions in a straightforward manner, as the original state or condition cannot be restored.

Cascading compensating transactions occur when other transactions use the results of the current transaction (Garcia-Molina and Salem, 1987). If one transaction depends on or builds upon the outcome of another, reversing the initial transaction requires creating compensating transactions for all dependent ones. This cascade can become increasingly complex as each linked transaction must be undone or adjusted to maintain consistency.

7.5. Related Works

Undoing a transaction involves two key components: determining how to reverse the transaction’s effects and ensuring that compensating actions maintain system consistency. Traditional approaches (e.g., compensating transactions, ACTA, sagas) have largely addressed the first aspect—how to perform the compensation. However, the approach described here addresses both. Previous studies have explored ensuring consistency for long-lived transactions under weaker serializability assumptions.

Certain work has discussed verifying whether a compensating transaction can be applied based on write-read dependencies and introduced the notion of r-soundness, a weaker form of consistency ensuring that compensated or rolled-back states satisfy database constraints (Korth et al., 1990). While that research established the concept, our method provides an exact procedure for verifying r-soundness through Invariant-C.

An earlier analysis addressed concurrency control and recovery together but focused on short-lived operations and traditional recovery mechanisms, not on long-lived transactions or compensating strategies (Schek et al., 1993).

Additional studies employed abstract data types (ADTs) to manage long-lived transactions (Speegle and Gordon, 1992). This approach sometimes permitted concurrent execution without violating constraints, conceptually aligning with our integrity constraints (IC). However, while those works utilized alternative structures and focused less on preventing constraint violations, our approach employs a matrix-based mechanism explicitly designed to maintain database constraints.

Other efforts demonstrated that certain compatible transactions can commit without coordination if their semantics are aligned (Garcia-Molina, 1983), a concept similar to IC, yet they did not offer a comprehensive mechanism to identify and handle all potential conflicts as our IC-based solution does.

Further research categorized interactions among concurrent transactions using semantic considerations, discussing semantic and commutative compatibilities that maintain integrity and allow for free interleaving when operations result in equivalent states (Natrajan and Reynolds, 1999). However, these approaches did not focus on systematically enforcing database constraints.

Finally, some papers have summarized various serializability and correctness models—such as Predicate-wise Serializability (PSR), Quasiserializability (QSR), Cooperative Serializability (CoSR), and Setwise Serializability (SWSR). These models are group-based and restrict uncoordinated operations to occur only between groups rather than at the individual transaction level (Ramamritham and Chrysanthis, 1996). PSR, while conceptually related to IC, has been shown to improve the availability of long-lived transactions but only allows concurrent transactions between groups (Rastogi et al., 1993). This is stricter than r-soundness. In contrast, our Invariant-C approach allows for greater flexibility at the transaction level while still preserving global integrity.

Table 3. Comparison of Possible Transaction Management Strategies

Strategy	Transaction Required?	Invariant + Transaction Required?	Guaranteed Commit?	Physical Check?
Naive Buffer	✓			✓
Simulation	✓		✓	✓
Buffer Suspicious Transactions		✓	✓
Buffer Compensating Transactions		✓	✓
Commutativity + Buffer	✓		✓	maybe

8. Conclusion

In this work, we characterized the workflow of interactions between LLMs and databases in the presence of semantic errors. We introduced the concept of Invariant Satisfaction, which identifies the necessary coordination between long-lived, buffered transactions and new transactions. This ensures that new transactions do not lead to states where buffered transactions become irrecoverable. Invariant Satisfaction extends beyond database constraints to include application logic constraints, dynamically reducing coordination requirements based on the completeness of constraint and query information. We proposed a middleware framework for coordinating undo-able LLM-generated transactions that can be integrated into existing enterprise systems with minimal modifications.

This paper provides a robust solution for managing undo-able long-lived transactions and guarantees consistency. For system researchers, this study introduces an interactive paradigm between LLMs and databases, featuring an ”undoing” mechanism to handle incorrect operations while maintaining consistency. For system engineers, it offers a middleware design that efficiently incorporates undo-able LLM transactions into current workflows, ensuring reliability with minimal disruption.

References

(1)
Bailis et al. (2014) Peter Bailis, Alan Fekete, Michael J Franklin, Ali Ghodsi, Joseph M Hellerstein, and Ion Stoica. 2014. Coordination avoidance in database systems (Extended version). arXiv preprint arXiv:1402.2237 (2014).
Bernstein et al. (1987) Philip A Bernstein, Vassos Hadzilacos, Nathan Goodman, et al. 1987. Concurrency control and recovery in database systems. Vol. 370. Addison-wesley Reading.
Biliris et al. (1994) Alexandros Biliris, Shaul Dar, Narain Gehani, HV Jagadish, and Krithi Ramamritham. 1994. ASSET: A system for supporting extended transactions. ACM SIGMOD Record 23, 2 (1994), 44–54.
Burbeck (1992) Steve Burbeck. 1992. Applications programming in smalltalk-80 (tm): How to use model-view-controller (mvc). Smalltalk-80 v2 5 (1992), 1–11.
Chamberlin and Boyce (1974) Donald D Chamberlin and Raymond F Boyce. 1974. SEQUEL: A structured English query language. In Proceedings of the 1974 ACM SIGFIDET (now SIGMOD) workshop on Data description, access and control. 249–264.
Chen et al. (2024) Zui Chen, Lei Cao, Sam Madden, Tim Kraska, Zeyuan Shang, Ju Fan, Nan Tang, Zihui Gu, Chunwei Liu, and Michael Cafarella. 2024. SEED: Domain-Specific Data Curation With Large Language Models. arXiv:2310.00749 [cs.DB] https://arxiv.org/abs/2310.00749
Chrysanthis and Ramamritham (1992) Panos K Chrysanthis and Krithi Ramamritham. 1992. ACTA: the SAGA continues.
Conway et al. (2012) Neil Conway, William R Marczak, Peter Alvaro, Joseph M Hellerstein, and David Maier. 2012. Logic and lattices for distributed programming. In Proceedings of the Third ACM Symposium on Cloud Computing. 1–14.
Eswaran et al. (1976) Kapali P. Eswaran, Jim N Gray, Raymond A. Lorie, and Irving L. Traiger. 1976. The notions of consistency and predicate locks in a database system. Commun. ACM 19, 11 (1976), 624–633.
Fernandez et al. (2023) Raul Castro Fernandez, Aaron J. Elmore, Michael J. Franklin, Sanjay Krishnan, and Chenhao Tan. 2023. How Large Language Models Will Disrupt Data Management. Proc. VLDB Endow. 16 (2023), 3302–3309. https://api.semanticscholar.org/CorpusID:261193780
Garcia-Molina (1983) Hector Garcia-Molina. 1983. Using semantic knowledge for transaction processing in a distributed database. ACM Transactions on Database Systems (TODS) 8, 2 (1983), 186–213.
Garcia-Molina and Salem (1987) Hector Garcia-Molina and Kenneth Salem. 1987. Sagas. ACM Sigmod Record 16, 3 (1987), 249–259.
Gilbert and Lynch (2002) Seth Gilbert and Nancy Lynch. 2002. Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. Acm Sigact News 33, 2 (2002), 51–59.
Gray et al. (1981) Jim Gray et al. 1981. The transaction concept: Virtues and limitations. In VLDB, Vol. 81. 144–154.
Haerder and Reuter (1983) Theo Haerder and Andreas Reuter. 1983. Principles of transaction-oriented database recovery. ACM computing surveys (CSUR) 15, 4 (1983), 287–317.
Korth et al. (1990) Henry F Korth, Eliezer Levy, and Abraham Silberschatz. 1990. A formal approach to recovery by compensating transactions. University of Texas at Austin, Department of Computer Sciences.
LangChain (2024) LangChain. 2024. LangChain Documentation. https://www.langchain.com/ Accessed: 2024-11-16.
Liu et al. (2000) Peng Liu, Paul Ammann, and Sushil Jajodia. 2000. Rewriting histories: Recovering from malicious transactions. Security of Data and Transaction Processing (2000), 7–40.
Lynch and Merritt (1993) Nancy A Lynch and Michael Merritt. 1993. Atomic transactions: in concurrent and distributed systems. Morgan Kaufmann Publishers Inc.
Madden et al. (2024) Samuel Madden, Michael Cafarella, Michael Franklin, and Tim Kraska. 2024. Databases Unbound: Querying All of the World’s Bytes with AI. Proc. VLDB Endow. 17, 12 (Nov. 2024), 4546–4554. https://doi.org/10.14778/3685800.3685916
Mohan et al. (1992) Chandrasekaran Mohan, Don Haderle, Bruce Lindsay, Hamid Pirahesh, and Peter Schwarz. 1992. ARIES: A transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. ACM Transactions on Database Systems (TODS) 17, 1 (1992), 94–162.
Natrajan and Reynolds (1999) Anand Natrajan and Paul F Reynolds. 1999. Resolving concurrent interactions. In Proceedings 3rd IEEE International Workshop on Distributed Interactive Simulation and Real-Time Applications. IEEE, 85–92.
Nodine et al. (1992) Marian H Nodine, Sridhar Ramaswamy, and Stanley B Zdonik. 1992. A cooperative transaction model for design databases.
O’Neil (1986) Patrick E O’Neil. 1986. The escrow transactional method. ACM Transactions on Database Systems (TODS) 11, 4 (1986), 405–430.
Patil et al. (2024) Shishir G Patil, Tianjun Zhang, Vivian Fang, Roy Huang, Aaron Hao, Martin Casado, Joseph E Gonzalez, Raluca Ada Popa, Ion Stoica, et al. 2024. GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications. arXiv preprint arXiv:2404.06921 (2024).
Patil et al. (2023) Shishir G. Patil, Tianjun Zhang, Xin Wang, and Joseph E. Gonzalez. 2023. Gorilla: Large Language Model Connected with Massive APIs. arXiv preprint arXiv:2305.15334 (2023).
Pavlo (2011) Andy Pavlo. 2011. py-tpcc: Python Implementation of TPC-C. https://github.com/apavlo/py-tpcc Accessed: 2024-11-12.
Ramamritham and Chrysanthis (1996) Krithi Ramamritham and Panos K Chrysanthis. 1996. A taxonomy of correctness criteria in database applications. The VLDB Journal 5 (1996), 85–97.
Ranjan et al. (2021) Nalin Ranjan, Zechao Shang, Sanjay Krishnan, and Aaron J Elmore. 2021. Version Reconciliation for Collaborative Databases. In Proceedings of the ACM Symposium on Cloud Computing. 473–488.
Rastogi et al. (1993) Rajeev Rastogi, Sharad Mehrotra, Yuri Breitbart, Henry F Korth, and Avi Silberschatz. 1993. On correctness of non-serializable executions. In Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. 97–108.
Schek et al. (1993) Hans-Jörg Schek, Gerhard Weikum, and Haiyan Ye. 1993. Towards a unified theory of concurrency control and recovery. In Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. 300–311.
Shapiro et al. (2011) Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. 2011. A comprehensive study of convergent and commutative replicated data types. Ph.D. Dissertation. Inria–Centre Paris-Rocquencourt; INRIA.
Speegle and Gordon (1992) Gregory D Speegle and Andrew L Gordon. 1992. Quantifying the benefits of semantics. In Proceedings of the 1992 ACM annual conference on Communications. 423–430.
Subramaniam and Krishnan (2024) Pranav Subramaniam and Sanjay Krishnan. 2024. Intent-Based Access Control: Using LLMs to Intelligently Manage Access Control. arXiv:2402.07332 [cs.DB] https://arxiv.org/abs/2402.07332
Transaction Processing Performance Council (2010) Transaction Processing Performance Council. 2010. TPC Benchmark™ C Standard Specification, Revision 5.11. Retrieved from http://www.tpc.org/tpcc/.
Whittaker and Hellerstein (2020) Michael Whittaker and Joseph M Hellerstein. 2020. Checking invariant confluence, in whole or in parts. ACM SIGMOD Record 49, 1 (2020), 7–14.
Wornow et al. (2024) Michael Wornow, Avanika Narayan, Krista Opsahl-Ong, Quinn McIntyre, Nigam H. Shah, and Christopher Re. 2024. Automating the Enterprise with Foundation Models. arXiv:2405.03710 [cs.SE] https://arxiv.org/abs/2405.03710
Xu et al. (2024) Ziwei Xu, Sanjay Jain, and Mohan Kankanhalli. 2024. Hallucination is Inevitable: An Innate Limitation of Large Language Models. arXiv:2401.11817 [cs.CL] https://arxiv.org/abs/2401.11817
Zhang et al. (2024) Xiang Zhang, Khatoon Khedri, and Reza Rawassizadeh. 2024. Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases. arXiv:2404.08727 [cs.DB] https://arxiv.org/abs/2404.08727