Blockchain-enabled Data Governance for Privacy-Preserved Sharing of Confidential Data
Abstract
In a traditional cloud storage system, users benefit from the convenience it provides but also take the risk of certain security and privacy issues. To ensure confidentiality while maintaining data sharing capabilities, the Ciphertext-Policy Attribute-based Encryption (CP-ABE) scheme can be used to achieve fine-grained access control in cloud services. However, existing approaches are impaired by three critical concerns: illegal authorization, key disclosure, and privacy leakage.
To address these, we propose a blockchain-based data governance system that employs blockchain technology and attribute-based encryption to prevent privacy leakage and credential misuse. First, our ABE encryption system can handle multi-authority use cases while protecting identity privacy and hiding access policy, which also protects data sharing against corrupt authorities. Second, applying the Advanced Encryption Standard (AES) for data encryption makes the whole system efficient and responsive to real-world conditions. Furthermore, the encrypted data is stored in a decentralized storage system such as IPFS, which does not rely on any centralized service provider and is, therefore, resilient against single-point failures. Third, illegal authorization activity can be readily identified through the logged on-chain data. Besides the system design, we also provide security proofs to demonstrate the robustness of the proposed system.
Index Terms:
Attribute-based Encryption, Blockchain, Data Sharing, Governance, Multi-authority, Privacy Enhancing Technology
I Introduction
Notwithstanding the many advantages of cloud computing, which have led to its widespread adoption and continuous growth, it also presents certain risks that prompt the exploration of alternative architectures. In particular, due to the inherent centralization of cloud services, they can become a single point of failure. This presents issues regarding service availability, censorship, and end-user privacy concerns. These challenges are further exacerbated by potential insider attacks and the service provider’s own agency. For instance, Apple’s decision in 2021 to roll out a Child Sexual Abuse Material detection technology by scanning images stored in its iCloud service led to numerous collateral privacy concerns and criticisms[1]. Even though the original plan was eventually abandoned mainly because of strong public backlash, the fundamental vulnerability of such centralized systems being subject to privacy violations or censorship remains.
To address some of these issues inherent in centralized cloud storage, many encryption schemes such as AES, RSA, Proxy Re-encryption (PRE), Identity-based Encryption (IBE), and Attribute-based Encryption (ABE) have been used to secure data confidentiality [2, 3, 4, 5]. However, some encryption schemes may not be applicable for data sharing, which is a common use case.
As an example, consider private electronic health records (EHR) that can be accessed by a patient (data owner, DO) and all the hospitals (data users, DUs) that are involved in the patient’s treatments. When the patient wants to claim the cost of medical treatment, the insurance company, which is a new DU, needs to obtain the patient’s EHR to make a judgment. In a traditional public-key encryption system like RSA, the patient must re-encrypt the EHR using the public key of the insurance company. Similarly, in the IBE scheme, the patient must generate a new identity-based ciphertext for the insurance company. These access control options are limited in terms of flexibility and scalability, as the DO may have to take additional steps on demand to ensure that the encrypted content is accessible to a new DU. In Ciphertext-Policy Attribute-based Encryption (CP-ABE) schemes, each user is associated with a set of attributes. Successful decryption can be carried out only if a user’s attribute set satisfies the access policy embedded in the ciphertext. As a result, once the message has been encrypted using the CP-ABE, it becomes available to all authorized existing and future users. Therefore, CP-ABE is a promising solution for data confidentiality and data sharing.
Traditional CP-ABE schemes may rely on intermediary entities such as a Trusted Third Party (TTP) and a Central Authority (CA) for the security and trustworthiness of data access control. For ordinary cloud users with limited computation resources, the DO usually outsources the ciphertext to the TTP under a specified policy, and the DU may also delegate the decryption task to the TTP [6, 7]. These schemes also require the CA to verify attributes across different organizations and issue private keys to every user in the data-sharing system. Therefore, when analyzing the adversarial model of these schemes, researchers tend to assume that the TTP or CA is fully trusted [8, 7]. Such a system is vulnerable to misuse of access credentials: a malicious CA may purposefully issue attribute keys to unauthorized users. It is thus crucial to design decentralized alternatives for trust mechanisms and enforce traceability throughout the access control system.
A blockchain-based access control system with CP-ABE has the potential to be an effective solution to these problems. In the absence of a TTP to control the data, each node shares a distributed ledger that keeps track of a growing list of transactions, each verified and confirmed by a consensus mechanism before being recorded. The integrity of transactions can be secured by hashing, Merkle trees, time stamping, and incentive mechanisms. Based on these premises, several blockchain-based access control systems have been proposed since the emergence of public blockchain systems [9] and the advent of Attribute-based Encryption [10]. Some efforts leverage the immutable public ledger to build a transaction-based access control system for secure data sharing [11, 12, 13], while others leverage self-executing smart contracts to establish a smart contract-based access control system for flexibility and traceability [14, 15].
However, just employing blockchain technology and CP-ABE encryption for an access control system is inadequate for various real-world applications. On the one hand, information is not always shared inside a single domain or organization. For example, driver’s licenses and university registration information may be managed by separate entities. The same attribute authority cannot be responsible for both attribute management and key distribution. Therefore, multi-authority Attribute-based Encryption [16], first proposed by Chase in 2007, is used to solve the access problem involving attributes belonging to various authorities. This scheme permits any number of independent authorities to distribute secret keys, which are later selected by the data owner for encrypting a message.
Another concern is the privacy issue, which encompasses both policy-hiding and receiver privacy. Since in the classic CP-ABE schemes, an access structure specified in terms of user attributes is explicitly transmitted alongside ciphertext, whoever accesses the ciphertext is also aware of the corresponding access policy. Therefore, multi-authority CP-ABE schemes [16, 17, 18] are also unsuitable for certain use cases since access policies contain sensitive information. This calls for mechanisms to hide access policies for CP-ABE systems. Additionally, DU needs to provide a full set of user attributes to each authority for an attribute key, inevitably compromising the key receiver’s privacy.
In pursuit of addressing these concerns, several multi-authority CP-ABE schemes that feature policy-hiding have been proposed [19, 20, 21]. Despite these efforts, they do not completely fulfill various practical requirements. Schemes such as [19, 21] are prone to the leakage of DU’s confidential attribute information during the key generation or encryption process. Michalevsky and Joye introduced a fully policy-hiding decentralized CP-ABE scheme [20], which protects attribute information attached to the access policy and even addresses the issue of receiver privacy. However, we identified a vulnerability in this scheme as it is susceptible to rogue-key attack. In this type of attack, a malicious AA can register an aggregate public key using public information from other honest AAs. This key can be used to decrypt the ciphertext despite lacking the necessary attribute keys to satisfy the policy. The attack mechanism on [20] is elaborated in Section VI-F1.
As a result, the challenge of securely storing user data, enabling efficient data sharing, and managing multi-authority scenarios, while concurrently maintaining a balance of privacy, transparency and traceability constitutes a complex problem that requires innovative solutions.
I-A Contributions
In this paper, we propose a multi-party CP-ABE-based storage outsourcing system that uses blockchain technology to limit access credential misuse and privacy leakage. To the best of our knowledge, this is the first practical solution for storage outsourcing to achieve fine-grained access control with user anonymity and flexibility.
The core contributions of this work are as follows:
- We discern that the decentralized ABE scheme with policy-hiding presented in [20] inherently possesses security vulnerabilities. It is susceptible to a rogue-key attack: A malicious AA can decrypt the ciphertext without the knowledge of the attribute keys required to satisfy the policy. It also encounters a potential risk where an adversary might infer some sensitive information from the published ciphertext, a result of poorly chosen public parameters. These vulnerabilities are thoroughly analyzed in Section VI-F1 and Section VI-G1, respectively.
- To counteract the rogue-key attack and alleviate some potential risks, we modify the algorithms of Setup and Auth Setup in [20] as described in Definition 5. Firstly, we introduce a multi-party protocol inspired by [22] for public key generation, which is detailed in Trusted Setup of Section V-A. Secondly, we impose a prerequisite for each AA to prove the knowledge of published information during the process of Auth Setup. This is elaborated in Section V-B. We further demonstrate that our enhanced system successfully mitigates the aforementioned security concerns, as outlined in Section VI-F2 and Section VI-G2.
- In order to further decentralize our proposed system, we integrate blockchain technologies such as smart contracts and content addressing, alongside multi-authority attribute-based encryption. An overview of the system architecture is presented in Section IV and pseudo code of system contracts can be found in Section VIII. This hybrid approach enhances the practicality and security of the system, which makes it resilient against single-point failures and misuse of credentials. Given that transparency and traceability are inherent attributes of blockchain, a blockchain-enabled ABE system actually realizes a balanced solution for data sharing while simultaneously preserving privacy.
- Overall, we propose a secure, privacy-preserving data governance system based on blockchain technology and an improved decentralized CP-ABE scheme with policy-hiding. Using a combination of Attribute-based Encryption (ABE) and the Advanced Encryption Standard (AES) makes the system practical. The special ABE encryption scheme is capable of handling multi-authority use cases while protecting identity privacy and ABE’s policy. The adoption of AES helps assure the confidentiality of user data, which is furthermore stored in a decentralized storage system, specifically the InterPlanetary File System (IPFS), which does not rely on a central service provider and hence does not become a single point of failure.
| Approach | Authority | Policy | Universe | Policy-hiding | Receiver-hiding | Access Control | Storage | 
|---|---|---|---|---|---|---|---|
| [13] | Single | AND | Small | No | No | Smart Contract | IPFS | 
| [14] | Multiple | LSSS | Small | No | No | CSP | CSP | 
| [19] | Multiple | LSSS | Small | No | Yes | CSP | CSP | 
| [23] | Single | AND | Small | Partially | No | CSP | CSP | 
| [24] | Single | AND | Large | Fully | No | Smart Contract | CSP | 
| [25] | Single | LSSS | Large | Partially | No | CSP | CSP | 
| [26] | Single | LSSS | Large | Partially | No | CSP | CSP | 
| [27] | Multiple | LSSS | Large | Fully | No | CSP | CSP | 
| [28] | Multiple | LSSS | Large | Fully | No | CSP | CSP | 
| This work | Multiple | AND | Small | Fully | Yes | Smart Contract | IPFS | 
In Table I, we compare and position our proposed Blockchain-enabled data governance system with existing works [13, 14, 23, 27, 24, 25, 26, 19, 28] that are closely related to ours with regard to flexibility, scalability, privacy, and decentralization across the following assessment criteria:
1. Attribute Authority: Whether attribute management in the CP-ABE scheme rests with a single (and thus central) authority or with multiple authorities.
2. Policy: Whether the scheme supports a Linear Secret Sharing Scheme (LSSS), which covers AND, OR, and threshold gates, or only AND gates.
3. Attribute Universe: We define the complete set of supported attributes as attribute universe and only take into account two types of universe: large universe and small universe. In large universe ABE, the attribute universe size has no effect on the size of the system’s public key.
4. Privacy: There are two aspects of privacy involved in CP-ABE schemes: policy-hiding and receiver-hiding. For the policy-hiding scheme, the CP-ABE system is available in two forms: fully hidden and partially hidden. The former means that none of the attributes can be revealed from the access policies, whereas the latter refers to only hiding sensitive attribute values in the access policies. For the receiver-hiding scheme, it prevents any AAs from learning the full set of attributes the receiver (i.e., the DU) possesses, hence relieving the DU from disclosing them while requesting attribute keys.
5. Storage: From a technical perspective, traditional cloud service providers (CSPs) and decentralized storage systems such as IPFS, Storj, and Sia are two distinct popular solutions for data storage and sharing. CSPs may take advantage of their comprehensive control over data, but end users are exposed to the risks of a single point of failure, privacy violation, and censorship.
6. Access Control: We indicate whether access control enforcement is through a smart contract and thus logically decentralized or by a cloud service provider and thus logically centralized.
From Table I, we observe that very few schemes [27, 19, 28] achieve fine-grained access control and support multi-authority with privacy-preserving characteristics, such as policy-hiding and receiver privacy. However, they all rely on a trusted third party (TTP) or cloud service provider (CSP) to offer centralized storage and access control management and are thus susceptible to the inherent vulnerabilities of such centralized systems in terms of privacy issues. In contrast, our proposed scheme integrates an IPFS network for decentralized storage and a smart contract for access control management, so that both functions are decentralized.
I-B Organization
The rest of the paper is structured as follows: Section II contains related work that reviews traditional Attribute-based Encryption schemes and conducts an analysis of some recent solutions for access control systems with ABE technology, while Section III summarizes the preliminaries that the techniques developed in this paper build upon. The proposed system model and protocols are discussed in depth in Section IV and Section V. Section VI contains systematic security analysis. Finally, our conclusions and future plans are presented in Section VII.
II Related Work
Attribute-based encryption (ABE) was first introduced by Sahai and Waters in 2005 [10]. Subsequently, numerous proposals for single-authority ABE systems [29, 30] have been put forth. In these systems, the data owner (DO) encrypts data and employs a boolean formula over a set of attributes to restrict access. If the data user (DU) possesses secret keys, issued by a central authority (CA), that satisfy the boolean formula attached to the ciphertext, the DU can retrieve the original data. However, these single-authority ABE systems [10, 29, 30] encounter constraints such as performance bottlenecks and key escrow issues.
Therefore, Zhang et al. proposed an enhanced CP-ABE scheme [31], which alleviates the performance bottleneck issue by reducing the computation cost and ciphertext length. This work has been further explored in [13] to create a framework that integrates decentralized storage, smart contracts, and CP-ABE techniques to achieve fine-grained access control.
Another concern with the single-authority ABE system is key escrow, where the CA issues all the attribute secret keys, thereby gaining the ability to decrypt each ciphertext generated by data owners. To address this issue, Chase and Chow introduced Multi-authority Ciphertext-policy ABE (MA-CP-ABE) [16] without the need for a CA. Lewko and Waters further developed this multi-authority scheme in their work [17], allowing any authority to join or leave the system independently. Based on this work, Qin et al. designed a blockchain-based multi-authority access control scheme to address performance and single-point failure issues [14].
In an effort to extend the usability of CP-ABE schemes, Nishide et al. presented a desirable property, hidden access policy, in [23]. This approach protects sensitive attribute values while leaving attribute names public, denoted as partially-hiding. Since then, multiple enhanced schemes [32, 24, 25, 26, 27] have been proposed. To support a wide variety of access structures, a fully secure policy-hiding CP-ABE was proposed in [32]. Gao et al. used the optimized scheme of [32] to build a blockchain-based access control system in [24] that achieves trustworthy access while maintaining the privacy of policy and attributes. To improve the expressiveness of the access policy, a partially hidden CP-ABE scheme under the Linear Secret Sharing Scheme (LSSS) policy was proposed in [25]. Zhang et al. proposed a privacy-aware access control system in [26], denoted as PASH, which supports a large universe CP-ABE scheme with partially hidden CP-ABE. We note that there is another, stricter form of policy-hiding CP-ABE, fully hiding CP-ABE, with which no information about the attributes is revealed by the access policies. Currently, fully hiding CP-ABE can only be indirectly achieved by inner product encryption (IPE) or by using threshold policies [7]. There are several similar approaches providing policy-hiding as well as ensuring accountability for key abuse, for example, Li’s work [27] based on large universe Attribute-Based Encryption construction [33] and Wu’s scheme [34] based on attribute bloom filter (ABF) [35].
Nevertheless, most of the aforementioned schemes either lack policy-hiding or are single-authority ABE systems. This gap is addressed by multi-authority CP-ABE schemes with a hidden access policy [36, 37, 20, 28]. The multi-authority CP-ABE scheme featuring policy-hiding was initially introduced by Zhong et al. in [36], and subsequently improved by Belguith in [37], whose scheme significantly reduces computational cost by delegating the decryption work to a semi-trusted cloud server. In 2018, Michalevsky and Joye [20] put forward the first practical decentralized CP-ABE scheme with the policy-hiding property. This scheme provides a security proof in the random oracle model and supports various types of access policies, including conjunctions, disjunctions, and threshold policies. Furthermore, Michalevsky and Joye addressed the issue of receiver privacy through the use of vector commitment. However, it has its limitations, including its support for only a fixed number of attributes and authorities, and the requirement for coordination among authorities during the setup phase. Most critically, we demonstrate that it is vulnerable to a rogue-key attack, where a compromised authority may decrypt the ciphertext even in the absence of the requisite attribute keys.
In addition to the above, there are a few other proposals [19, 28] in this area that, unfortunately, give rise to additional issues. For instance, a system developed by Yang et al. in [19] keeps the user’s identity private from the attribute authority (AA) if they are not in the same domain. Yet, this approach creates a new privacy issue: users might ask AAs within their domain to request secret attribute keys from other AAs on their behalf, implying that an AA could potentially possess a complete set of a DU’s secret keys. Zhao et al. presented a data sharing scheme [28] that adopts the access policy of linear secret sharing scheme (LSSS) and supports a multi-authority CP-ABE scheme with policy-hiding to achieve the privacy-preserving functionality. However, this system is vulnerable to user key abuse due to its dependence on a single central authority for key generation.
III Preliminary
To begin, we revisit the foundational concepts employed within our system. A summary of the notation used throughout the manuscript is provided in Table II.
| Notation | Description | 
|---|---|
| | Two additive cyclic groups of prime order |
| | A multiplicative cyclic group of the same order . |
| | A set of integers modulo |
| | A security parameter that measures the input size of the system |
| | A prime number used for groups and |
| | Two indexes used to represent th (th) element or position in a sequence or set |
| | The parameter associated with the k-lin assumption, representing the linear independence of group elements. |
| | Public parameters for the use of Attribute-Based Encryption |
| | Public parameters for the use of Vector Commitment |
| | A scalar used for generation of and |
| | A set of secret elements used for Trusted Setup |
| | A set of secret elements used for Authority Setup |
| | A hash of committed elements in Trusted Setup |
| | A hash of committed elements in Authority Setup |
| | A proof of knowledge for an element |
| | An s-pair of the element in group . The superscript of denotes s-pair elements in group |
| | A list of s-pairs consisting of all the committed group elements |
| | A public key owned by AA, which is used for ABE encryption |
| | A secret key owned by AA, which is used for ABE encryption |
| | A secret matrix element in |
| | A secret vector element in |
| | A secret value in |
| | The secret exponents used in |
| | A component of the attribute key generated by the attribute authority that corresponds to an individual attribute. |
| | The consolidated secret key issued by an attribute authority. Given that an attribute authority can oversee multiple attributes, might comprise several components, each representing a distinct attribute. |
| | A policy vector |
| | An attribute vector |
| | A Vector Commitment associated with a specific Data User, derived from its attribute vector and global identifier |
| | A special message used in Vector Commitment |
| | An opening proof to reveal the Vector Commitment |
| | The elements in where |
| | A secret exponent of group element |
| | A collection of messages |
| | A set of attribute authorities in the universe |
| | The number of attribute authorities included in the system |
| | A set of attributes in the universe |
| | The number of supported attributes |
| | A set of attributes possessed by each attribute authority |
| | A set of attributes possessed by data user |
| | A file shared by data owner |
| | An AES key used to encrypt the file |
| | An encrypted shared file produced by the AES system |
| | Metadata consisting of information related to |
| | A keyword used in metadata to ease the data retrieval process |
| | Encrypted metadata produced by the ABE system |
| | A location address in IPFS for a file |
| | A blockchain address |
| | A Data User’s Global identifier |
| | A public key registered in a blockchain |
| | A private key registered in a blockchain |
| IPFS | The InterPlanetary File System | 
III-A Bilinear Mapping
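Consider $\mathcal{G}$ as an algorithm that accepts a security parameter $\lambda$ and constructs three multiplicative cyclic groups of prime order $p$: $\mathbb{G}_1$, $\mathbb{G}_2$, and $\mathbb{G}_T$. We introduce $e$ as a bilinear map, with $e: \mathbb{G}_1 \times \mathbb{G}_2 \rightarrow \mathbb{G}_T$. The bilinear map has the following characteristics:
1. Bilinearity: for all $g_1 \in \mathbb{G}_1$, $g_2 \in \mathbb{G}_2$ and $a, b \in \mathbb{Z}_p$, $e(g_1^{a}, g_2^{b}) = e(g_1, g_2)^{ab}$.
2. Non-degeneracy: $e(g_1, g_2) \neq 1$ for generators $g_1$ of $\mathbb{G}_1$ and $g_2$ of $\mathbb{G}_2$.
3. Computability: for all $g_1 \in \mathbb{G}_1$, $g_2 \in \mathbb{G}_2$, $e(g_1, g_2)$ can be efficiently computed.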
III-B Auxiliary methods and definitions
We assume access to an algorithm, denoted COMMIT, that takes strings of arbitrary length as input and produces outputs as determined by a random oracle. While this assumption aids our security analysis, practical implementations could use the BLAKE2 hash function in place of COMMIT.
Inputs that cannot be mapped directly to integers, in particular group elements, are represented as byte-strings.
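As a minimal sketch of how COMMIT might be instantiated in practice, the snippet below hashes a sequence of byte-strings with BLAKE2b from the Python standard library. The names `commit` and `encode_int`, the 32-byte digest size, and the length-prefixed framing are illustrative assumptions and are not taken from the scheme itself.

```python
import hashlib


def encode_int(value: int) -> bytes:
    """Encode a non-negative integer as a big-endian byte-string."""
    length = (value.bit_length() + 7) // 8 or 1  # at least one byte, so 0 -> b"\x00"
    return value.to_bytes(length, "big")


def commit(*parts: bytes) -> bytes:
    """Hash a sequence of byte-strings with BLAKE2b.

    Each part is prefixed with its length so that different splits of the
    same concatenated bytes produce different digests.
    """
    hasher = hashlib.blake2b(digest_size=32)
    for part in parts:
        hasher.update(len(part).to_bytes(8, "big"))
        hasher.update(part)
    return hasher.digest()
```

The length prefix is a design choice of this sketch: without it, `commit(b"ab", b"c")` and `commit(b"a", b"bc")` would hash the same byte stream.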
Additionally, we introduce several auxiliary methods to facilitate the verification procedure for certain special properties.
The following definitions and claims were first proposed in [22].
Definition 1.
Given a bilinear mapping , elements and . If , we may use the term to represent this relation.
Definition 2.
Given a bilinear mapping , and cyclic group of order , an s-pair is a pair such that , or ; and . For such an s-pair in or , we may represent it using the notation or respectively.
Claim 1.
SameRatio true) if and only if there exists such that is an s-pair in and is an s-pair in .
Proof.
Suppose there exist one element that , and . For some , we have and . As defined in Definition 1, none of the elements is .
Therefore,
if and only if , otherwise, false. As a result, no such exists. ∎
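To make the s-pair and SameRatio notions concrete, the following toy sketch may help. It is not a secure pairing: source-group elements are represented by their discrete logarithms (integers modulo a small prime), and the "pairing" is a modular exponentiation. The parameters `P_MOD`, `Q_ORD`, `G_T` and all function names are assumptions chosen for illustration; the check itself follows the usual formulation in [22], where two pairs share a ratio when the cross-pairings are equal.

```python
# Toy parameters (assumptions for illustration only): P_MOD = 2 * Q_ORD + 1 is
# a safe prime and G_T = 4 generates the subgroup of order Q_ORD in Z_P_MOD^*.
P_MOD, Q_ORD, G_T = 2039, 1019, 4


def pairing(x: int, y: int) -> int:
    """Toy bilinear map e(x, y) = G_T^(x * y).

    Source-group elements are written additively and represented by their
    discrete logarithms, so this map is insecure by design.
    """
    return pow(G_T, (x * y) % Q_ORD, P_MOD)


def make_s_pair(base: int, s: int) -> tuple[int, int]:
    """Return the s-pair (base, s * base) for nonzero base and s."""
    if base % Q_ORD == 0 or s % Q_ORD == 0:
        raise ValueError("base and s must be nonzero modulo the group order")
    return base % Q_ORD, (s * base) % Q_ORD


def same_ratio(pair_1: tuple[int, int], pair_2: tuple[int, int]) -> bool:
    """Check e(p, h) == e(q, f) for pair_1 = (p, q) and pair_2 = (f, h)."""
    p, q = pair_1
    f, h = pair_2
    return pairing(p, h) == pairing(q, f)
```

With these definitions, `same_ratio(make_s_pair(5, 123), make_s_pair(7, 123))` holds because both pairs hide the same scalar, whereas pairing a 123-pair with a 124-pair fails the check.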
Finally, we can construct our special s-pair as follows
Definition 3.
Given a bilinear mapping and a matrix , a special s-pair is a pair such that or ; and
For such a special s-pair in or , we may also denote it as . Given that a vector can be considered a matrix with a single column, we can also use the notation to represent an s-pair when .
III-C Assumptions
Given a bilinear mapping with associated generators and group order , our work builds upon a variety of standard assumptions, which are detailed below.
Assumption 1.
Symmetric External Diffie-Hellman (SXDH) assumption [38]
It is hard to distinguish  from  where . This also holds to the tuples  and  in different group.
Assumption 2.
K-Linear assumption [39]
It is hard to distinguish  from  where . This also holds in the group .
For , we have the construction of
| (1) | 
and
| (2) | 
, with .
Assumption 3.
Assumption 4.
Square Computational Diffie-Hellman assumption [40]
Given $(g, g^{a})$ for a random number $a \in \mathbb{Z}_p$ in a cyclic group $\mathbb{G}$ of order $p$, no PPT algorithm $\mathcal{A}$ outputs $g^{a^{2}}$ with non-negligible probability.
Assumption 5.
Knowledge of Coefficient assumption [41] 
Given a string of arbitrary length , and a uniformly chosen  (independent of ), an efficient algorithm  exists that can randomly generate  and . Meanwhile, for the same inputs , there is an efficient deterministic algorithm  capable of extracting a scalar . The probability that:
1. ‘succeeds’, meaning it satisfies the condition: SameRatio()
2. ‘fails’, meaning
is negligible.
III-D Proof of Knowledge
We adopt the well-established Schnorr Identification Protocol [42], utilizing it as our Non-interactive Zero-knowledge Proof (NIZK). Provided with an s-pair and a string , we establish NIZK (Algorithm 3). This can serve as a proof that the originator of the string is aware of in the s-pair .
Furthermore, we define VerifyNIZK (Algorithm 4), which checks the validity of the provided proof .
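Algorithms 3 and 4 are not reproduced in this section, so the following is only a hedged sketch of a Schnorr-style proof of knowledge made non-interactive with a hash-derived challenge (the Fiat-Shamir approach). It works in a tiny multiplicative group rather than a pairing group; the parameters `P`, `Q`, `G`, the function names, and the use of BLAKE2b for the challenge are assumptions for illustration.

```python
import hashlib
import secrets

# Toy Schnorr group (assumption): G has prime order Q in Z_P^*, with P = 2Q + 1.
P, Q, G = 2039, 1019, 4


def _challenge(public: int, commitment: int, context: bytes) -> int:
    """Derive the challenge from the statement, the commitment, and the string."""
    data = b"".join(v.to_bytes(2, "big") for v in (G, public, commitment))
    digest = hashlib.blake2b(data + context).digest()
    return int.from_bytes(digest, "big") % Q


def prove(secret: int, context: bytes) -> tuple[int, tuple[int, int]]:
    """Prove knowledge of `secret` behind public = G^secret, bound to `context`."""
    public = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1          # fresh random nonce in [1, Q-1]
    commitment = pow(G, nonce, P)
    challenge = _challenge(public, commitment, context)
    response = (nonce + challenge * secret) % Q
    return public, (commitment, response)


def verify(public: int, proof: tuple[int, int], context: bytes) -> bool:
    """Accept iff G^response == commitment * public^challenge (mod P)."""
    commitment, response = proof
    challenge = _challenge(public, commitment, context)
    return pow(G, response, P) == (commitment * pow(public, challenge, P)) % P
```

The `context` argument plays the role of the string that the proof is bound to; a group this small offers no security and only serves to show the algebra of the check.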
III-E Vector Commitment
We ensure our attribute-hiding property through the utilization of a Vector Commitment scheme, as described in [43]. The summarized scheme is as follows:
Definition 4.
This Vector Commitment system commits to an ordered sequence of attribute elements as commitment , then opens it in a certain position of to a corresponding attribute authority (AA), and finally proves that only authorized value existed in the previously supplied commitment . The system normally consists of 4 algorithms:
- Key Generation : This is a Decentralized Key Generation (DKG) algorithm. It takes as input the security parameter and the number of attribute authorities, , in the system, and outputs global public parameters where . The element is generated and published by . Following that, the elements can be issued by each based on the shared .
- Commitment : This algorithm is run by a Data User (DU). It takes as input the message generated based on the authorized attributes from , and outputs the commitment .
- Open : This algorithm is also run by a DU. It takes as input the auxiliary information and index , and outputs the opening proof .
- Verify : The Verify algorithm is run by AA. It takes as inputs the commitment , message , index , and opening proof , and outputs the result of the verification. It outputs when it accepts the proof.
III-F Decentralized Inner-Product Predicate Encryption
Definition 5.
A Multi-authority Attribute-based Encryption with policy-hiding scheme [20] consists of a tuple of Probabilistic Polynomial-Time (PPT) algorithms, such that:
- Setup : The global setup algorithm is a decentralized generation algorithm that takes as input the security parameter and then outputs the public parameters .
- Authority Setup : The authority setup algorithm is run by each . It takes as input public parameter and authority index , and outputs a pair of authority keys .
- Key Generation : The key generation algorithm is run by each . It takes as input the global public parameters , the authority index , its secret key , all the public keys , and DU’s global identifier and the attribute vector , and outputs the secret keys .
- Encryption : The data encryption algorithm is run by a Data Owner (DO). It takes as inputs the global parameters , the public keys of all the authorities , the ciphertext policy vector and a file , and outputs a ciphertext .
- Decryption : The decryption algorithm is run by the authorized DU. It takes as inputs the collection of secret keys from and the ciphertext , and outputs the message if the access policy has been satisfied.
It is important to note that the algorithms described above do not hide the attribute vector from the authorities. To achieve this, we incorporate the vector commitment as an input to the Key Generation process. The details of this procedure are provided in Section V.
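The predicate underlying inner-product encryption can be sketched in the clear: decryption is meant to succeed exactly when the inner product of the policy vector and the attribute vector is zero. The encoding below is the conjunction (AND) encoding commonly used in the inner-product predicate encryption literature; whether the scheme here uses exactly this encoding is an assumption, as are the toy modulus `Q` and the function names. In the actual scheme the inner product is evaluated inside the ciphertext and key material, and neither vector is revealed.

```python
import secrets

Q = 1019  # toy prime modulus (assumption); a real scheme would use the group order


def policy_vector(required: list[int]) -> list[int]:
    """Encode the AND policy 'slot i must equal required[i]' for every i."""
    coeffs = [secrets.randbelow(Q - 1) + 1 for _ in required]  # nonzero r_i
    offset = -sum(r * a for r, a in zip(coeffs, required)) % Q
    return coeffs + [offset]


def attribute_vector(values: list[int]) -> list[int]:
    """Append the constant slot 1 that pairs with the policy's offset term."""
    return [v % Q for v in values] + [1]


def satisfies(policy: list[int], attributes: list[int]) -> bool:
    """True iff <policy, attributes> = sum_i r_i * (v_i - a_i) is 0 mod Q."""
    if len(policy) != len(attributes):
        raise ValueError("vectors must have the same length")
    return sum(x * v for x, v in zip(policy, attributes)) % Q == 0
```

A matching attribute vector always satisfies the policy. A vector that differs in a single slot never does, while vectors differing in several slots pass only with probability about 1/Q, which is negligible for a cryptographically sized modulus.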
IV System Overview
IV-A System Architecture
The system architecture, shown in Figure 1, comprises the following logical entities:
Data Owner (DO): DO is an entity (individual or organization) that owns a certain file . For secure storage and sharing, DO encrypts using the AES key and uploads the encrypted file to the IPFS network, records the returned file location , and embeds and into the metadata which is subsequently encrypted using the ABE system and published in the Ethereum network.
Data User (DU): DU is a data client for DO. It asks the attribute authority AA for permission to get the necessary attribute secret keys , which are then used to decrypt the associated stored on the Ethereum network. After getting the key and the location from , DU can download the encrypted file from the IPFS network and recover the original file .
Attribute Authority (AA): AA is an entity (individual or organization) that contributes to the generation of the public parameters of the ABE system and the vector commitment scheme , publishes the public key for the DO to encrypt metadata , owns a set of attributes and issues secret key for the owned attributes upon the request of the DU.
Trusted Attribute Authority (): is a trusted attribute authority that mainly generates a secret key for DU and deploys system contracts for setup and registration. , unlike normal AA, owns no attributes but is in charge of a specific position in the attribute vector .
Service User (SU): In the system, SU is a general entity comprising DO, DU and AA.
Participant (): is a special entity that represents each AA during the process of Trusted Setup. The index of denotes the chronological order of each piece of public parameter generated and shared by AA.
Trusted Setup Contract (): The contract is deployed to the Ethereum network by the and can only be invoked by an authorized AA within the specified time window. It is responsible for generating the global public parameters .
Authority Setup Contract : Contract is deployed to the Ethereum network by the . It can only be invoked by the authorized AA within the specified time window. It is used to generate the global public parameters and to keep track of the valid information about AA’s address , public key , and supported attributes .
User Registration Contract : Contract is deployed to the Ethereum network by the and can be invoked by all the potential DUs. To register the in the system, DU needs to make sufficient payment to the and then gets back the , which can later be used to request a secret key from AA.
Utility Contract (): Contract is deployed to the Ethereum network by the and can only be invoked by other contracts deployed by . It is mainly used to verify group elements published by AA.
Log Contract (): Contract is deployed to the Ethereum network by the . When it receives a new transaction from DO, it records the encrypted data of metadata and triggers the event to the subscribers.
Blockchain: Each user (DO, DU, AA and ) possesses a pair of keys and a corresponding wallet address on the blockchain. Our system employs two decentralized networks: the IPFS peer-to-peer storage network for data storage and the Ethereum blockchain for data governance.
IV-B Interactions overview
In this section, we describe the overview of our proposed system to show how smart contracts, IPFS, vector commitment, and MA-CP-ABE with policy-hiding are composed together to build a secure, privacy-preserving, and blockchain-enabled data governance system. When it ought to be clear from the context, we omit most indices, like and of elements, and superscript in for readability.
1. Trusted Setup
   (1) First, a community of normal attribute authorities (AAs) with size and a special trusted attribute authority () must be determined. selects the security parameter and two generators for the bilinear mapping, and defines the following algorithms: COMMIT, NIZK, VerifyNIZK and powerMulti.
   (2) deploys one system contract and one utility contract .
   (3)
   (4) After that, AA computes and publishes the commitments as shown in Alg. 1 to , where .
   (5)
   (6) In Round 1, we define one attribute authority AA as participant , who first publishes group elements in : , based on the previously verified set of elements .
   (7) Participant computes based on previous using algorithm powerMulti (Alg. 5) and publishes these as the arguments of the function Compute of contract to check validity.
   (8) We define the last valid received by contract as one piece of the public parameter .
   (9) In Round 2, the first AA, also known as participant , publishes group elements in : , also based on the previously verified set of elements .
   (10) Participant , where , computes its based on previous using algorithm powerMulti and publishes these as the arguments of the function Generate.
   (11) We also define the last valid received by contract as the last piece of the public parameter . Therefore, we have .
2. Authority Setup
   (1) deploys contract for authority setup.
   (2) Each AA randomly samples another set of secret elements : a matrix , a vector , two numbers , a scalar and a scaled number . Using that, AA takes as secret keys, and publishes a corresponding set of s-pair to the contract .
   (3) After that, AA computes and publishes the commitment to , where .
   (4) Every AA then needs to prove the knowledge of elements by generating the proofs using algorithm NIZK as the argument of function Prove of contract for validity check.
   (5) We define each AA with index based on the receiving order of the complete set of valid and set attribute authority with index .
   (6) Therefore, we have the verified sets of elements and for each .
   (7) In the last stage, for each , needs to compute a set of group elements in . Then publishes these elements, with the number of supported attributes as the argument of the function Setup.
   (8) The contract checks the validity of these elements published by and then registers its address with the elements .
   (9) In the end, we have for vector commitment scheme.
3. Data User Registration
   (1) deploys contract for service registration.
   (2) Data User (DU) makes a direct registration payment to the contract to get the global identifier , which is the hash value of DU’s address .
   (3) Afterwards, DU can set up a secure channel with each that possesses the needed attributes and can verify DU’s identity.
   (4) verifies DU’s identity and sends back the set of acknowledged attributes through the secure channel.
   (5) Upon receiving all the from , DU defines a set of ‘N/A’ attributes for those that cannot issue the attribute set and finally gets a complete set of attributes by combining and together.
4. Key Gen
   (1) DU generates an attribute vector from set of attributes , creates a vector commitment for and sends it with opening proof to each through separate secure channels.
   (2) first checks the validity of its responsible part in the commitment using , then issues DU’s requested attribute secret key , and finally sends it back to DU through the channel.
   (3) Upon receiving responses from each , DU gets a complete set of secret keys .
5. Encryption and Upload
   (1) deploys the last system contract to record the encrypted information related to file .
   (2) Data Owner (DO) randomly samples an AES key , encrypts to obtain the ciphertext , and uploads it to the IPFS network.
   (3) After successfully receiving the from DO, IPFS network returns a special hash value as a file location on the IPFS network.
   (4) Then, DO constructs a metadata , specifies a policy vector based on selected attributes from each , uses published to encrypt the metadata and publishes this encrypted information to contract .
6. Download and Decryption
   (1) DU reads every new arriving from the contract and checks whether its secret keys satisfy the access policy to recover the metadata .
   (2) DU retrieves the file location and AES key from the metadata and requests the ciphertext from the IPFS network with the file location .
   (3) DU uses the AES key to recover the original file .
V System Design
In this section, we provide more details on Trusted Setup, Authority Setup, Data User Registration, Key Generation, Encryption and Upload, and Download and Decryption processes.
V-A Trusted Setup
The process of Trusted Setup consists of 4 stages: Initiate, Commit and Reveal, Verify, and Generate, and finally outputs the global public parameter for the ABE system.
In the initial three stages, each attribute authority AA sends its transactions independently to the contract . In contrast, during the final Generate stage, each participant (where we use the placeholder notation in Commit and Reveal to represent each AA) must send transactions to in a sequential manner. This sequentiality is necessary because each incoming transaction is generated based on the preceding ’s transaction received by contract .
V-A1 Initiate
At the start, a fixed-numbered community of size will be determined, which will include all of the normal attribute authorities AA and one special trusted authority . represents this community to set the global security parameter to be and the generators of the with prime order to be respectively. Therefore, the bilinear map can be .
also deploys two distinct system contracts: contract for trusted setup and contract , a utility contract that makes complex cryptographic computations available to the system. also specifies the deadlines for the Trusted Setup process and sets an authorized list to restrict access. To keep the description simple, we assume that each attribute authority AA submits the required transactions within the deadlines. The pseudocode of these two contracts is provided in Appendix VIII-A and VIII-D, respectively.
To realize the generation of , this process highly depends on the interaction between each attribute authority AA and contract , which has five main functions, Commit, Reveal, Prove, Compute and Generate with the help from contract . Generally, these functions can only be invoked by a blockchain address owned by AA, which is included in the authorized list , and executed before the pre-defined deadline . Some variables used here are listed below:
1. (Global Variable): The sender of the message or transaction.
2. (Global Variable): The current block timestamp in seconds.
3. (State Variable): A list of pre-verified blockchain addresses belonging to attribute authorities set by a trusted authority.
4. (State Variable): A deadline that only allows transaction calls within the pre-set time window for the stage Commit and Reveal.
5. (State Variable): A deadline that only allows transaction calls within the pre-set time window for the stage Verify.
6. (State Variable): A deadline that only allows transaction calls within the pre-set time window for the stage Compute and Generate.
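As an illustration only, the two access checks described above (membership in the authorized list and a per-stage deadline) can be sketched in Python. The class name `SetupContractSketch`, its fields, and the explicit `now` argument are hypothetical and are not taken from the contract pseudocode in the appendix; an on-chain contract would read the sender and timestamp from the transaction context instead.

```python
import time


class SetupContractSketch:
    """Toy stand-in for the access checks of a setup contract (hypothetical names)."""

    def __init__(self, authorized, deadline):
        self.authorized = set(authorized)  # pre-verified AA addresses
        self.deadline = deadline           # stage deadline, Unix time in seconds
        self.commitments = {}              # sender address -> commitment

    def commit(self, sender, commitment, now=None):
        # On-chain, `sender` and `now` would come from the transaction context.
        now = time.time() if now is None else now
        if sender not in self.authorized:
            raise PermissionError("sender is not an authorized AA")
        if now > self.deadline:
            raise TimeoutError("stage deadline has passed")
        self.commitments[sender] = commitment
```

A call from an address outside the list, or after the deadline, is rejected before any state is written, which mirrors the restriction stated for the contract functions.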
V-A2 Commit and Reveal
Every AA randomly picks a set of elements : a matrix , a matrix , their corresponding scalar values, and , and scaled matrices . (Footnote 1: The value of in this context does not derive from the security parameter . Rather, it is a reference to the k-Lin assumption 2 upon which our construction is predicated.)
Then, AA has
| (3) | 
and then generates a set of s-pair.
For such element , we refer to the s-pair in by and in by as Definition 2. These s-pair in both and are listed as follows:
• For matrix :
• For matrix :
• For scalar :
• For scalar :
• For scaled matrix :
• For scaled matrix :
Other than these s-pair listed above, AA also commits each element using Algorithm 1. For each :
Subsequently, the overall commitment is:
| (4) | 
After that, AA publishes the commitment to the contract through blockchain transaction. The state variable of (Algorithm 7) will store the value with the key as , also known as AA’s blockchain address. Apart from , we define a few state variables used in contract as follows:
1. (State Variable): A mapping collection from the blockchain address belonging to one attribute authority to its commitment
2. (State Variable): A mapping collection from the blockchain address belonging to one attribute authority to its unverified list of s-pairs in both and
3. (State Variable): A mapping collection from the blockchain address belonging to one attribute authority to its verified list of s-pairs in both and
After has been received by contract , the sender needs to reveal committed element by passing a list of s-pair in both and
| (5) | 
as argument of the function Reveal (Algorithm 8) before deadline , which checks the existence of the published by , and then verifies that indeed by using the utility function Hash (Algorithm 17), which works similarly to Algorithm 1. (Footnote 2: For brevity, we use this shorthand notation to represent the above concatenation, where takes on all values in the set .) Finally, each pair will be stored in another state variable with the key as .
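The commit-then-reveal pattern of this stage can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 over length-prefixed byte strings stands in for Algorithm 1 and the utility function Hash, and the names `commit`, `CommitReveal`, `do_commit`, and `reveal` are hypothetical rather than the contract's actual interface.

```python
import hashlib


def commit(elements):
    """Hash a list of serialized elements (SHA-256 is an assumed stand-in)."""
    h = hashlib.sha256()
    for e in elements:
        h.update(len(e).to_bytes(4, "big"))  # length prefix avoids ambiguous concatenation
        h.update(e)
    return h.hexdigest()


class CommitReveal:
    """Toy model of the Commit and Reveal functions and their state variables."""

    def __init__(self):
        self.commitments = {}  # sender -> published commitment
        self.unverified = {}   # sender -> revealed but not yet proven elements

    def do_commit(self, sender, digest):
        self.commitments[sender] = digest

    def reveal(self, sender, elements):
        if sender not in self.commitments:
            raise KeyError("no commitment published by this sender")
        if commit(elements) != self.commitments[sender]:
            raise ValueError("revealed elements do not match the commitment")
        self.unverified[sender] = list(elements)
```

Because the commitment is fixed before any element is revealed, an authority cannot choose its elements as a function of what other authorities reveal.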
V-A3 Verify
After deadline set by in the first stage Initiate, the system enters into the stage Verify. In this stage, we need to check that each attribute authority AA possesses the knowledge of the exponent used in the list of s-pair .
Every AA generates the proof using Algorithm 3 for each , and broadcasts these proofs as a list
| (6) | 
through a blockchain transaction to get them verified. The function Prove (Algorithm 9) from contract takes as input and performs this verification.
The function Prove first calls function SameRatio of contract (Algorithm 18), which works similarly to Algorithm 2, to examine the authenticity of the published and . Afterwards, it computes , and takes with verified and provided as input to the function CheckPoK of contract (Algorithm 19), which works similarly to Algorithm 4 and returns true if the given proof is valid. Finally, the function Prove will remove the list of s-pair from the state variable and store it in the state variable . This indicates that the AA possesses knowledge of the exponents for the set of s-pair.
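To make the notion of proving knowledge of an exponent concrete, the following sketch shows a generic Schnorr-style non-interactive proof (Fiat-Shamir with SHA-256). It is not a reproduction of Algorithms 3 and 4, which are not shown in this section and may differ; it also uses a toy subgroup of order 1019 modulo 2039 instead of pairing groups, so the parameters offer no real security.

```python
import hashlib
import secrets

# Toy parameters (assumption): P = 2Q + 1 with P, Q prime; G = 4 generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4


def _challenge(*values):
    """Fiat-Shamir challenge derived from the public transcript."""
    data = b"|".join(str(v).encode() for v in values)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(s):
    """Return h = G^s and a proof of knowledge of the exponent s."""
    h = pow(G, s, P)
    r = secrets.randbelow(Q - 1) + 1   # fresh nonce in [1, Q-1]
    t = pow(G, r, P)                   # prover's commitment
    c = _challenge(G, h, t)
    z = (r + c * s) % Q                # response
    return h, (t, z)


def verify(h, proof):
    """Accept iff G^z == t * h^c, which holds when z = r + c*s mod Q."""
    t, z = proof
    c = _challenge(G, h, t)
    return pow(G, z, P) == (t * pow(h, c, P)) % P
```

For example, `h, proof = prove(123)` followed by `verify(h, proof)` accepts, while changing the response `z` makes verification fail.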
V-A4 Compute and Generate
In this stage, the system will generate the public parameters for the attribute-based encryption in two rounds. We use the notation powerMulti below to denote Algorithm 5:
Round 1: We define the first attribute authority AA as participant , who broadcasts as argument of function Compute in contract (Algorithm 10). The elements are constructed as follows:
And the next participant , generates corresponding elements using Algorithm 5, and also broadcasts them to the contract :
After receiving the elements from the first participant , function Compute of checks the validity of each incoming element published by .
In the end, we define the last valid as one piece of the public parameter:
Round 2: In this round, we also define the first attribute authority AA as participant , who broadcasts as argument of the function Generate in contract (Algorithm 11). The elements are constructed as follows:
And the next participant where , generates corresponding elements using Algorithm 5, and also broadcasts them to the contract :
Invoked by these transactions, contract checks the validity of each received element and updates the value of .
As a result of this procedure, the final piece of the public parameter is defined as the last valid received:
At the end of this stage, each Service User (SU) can easily get the global public parameters of the attribute-based encryption system based on the published values.
| (7) | 
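The sequential nature of the two rounds can be illustrated with a small sketch. It rests on an assumption, since Algorithm 5 is not reproduced in this section: each participant is taken to raise the previously accepted element to its own secret exponent, so that the last accepted element combines every participant's secret. The toy group, the function names `power_multi` and `run_round`, and the omission of the on-chain validity checks are all simplifications.

```python
# Toy group (assumption): order-Q subgroup of Z_P^* generated by G, with P = 2Q + 1.
P, Q, G = 2039, 1019, 4


def power_multi(previous, secret):
    """Assumed shape of one contribution: previous element raised to the secret."""
    return pow(previous, secret, P)


def run_round(participant_secrets):
    """Return every published element; each one builds on its predecessor."""
    transcript = [G]
    for s in participant_secrets:
        transcript.append(power_multi(transcript[-1], s))
    return transcript


def final_parameter(participant_secrets):
    """The last valid element is taken as the piece of the public parameter."""
    return run_round(participant_secrets)[-1]
```

Under this assumption the final element equals the generator raised to the product of all secrets, so it remains unpredictable as long as at least one participant keeps its secret private.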
V-B Authority Setup
This step consists of 3 stages: Commit and Reveal, Verify, and Generate, and finally outputs another global public parameter and public key for each attribute authority AA.
V-B1 Initiate
The contract deployed by trusted authority has 4 main functions, Commit, Reveal, Prove and Generate, and also interacts with utility contract . Its accessibility is also limited by deadlines and the authorized list set by . The pseudo-code of contract is provided in Appendix VIII-B.
V-B2 Commit and Reveal
Every AA firstly samples a set of elements : a matrix , a vector , two secret elements , a scalar and a scaled element .
Therefore, the AA defines a set for vector commitment as:
and also designates:
as the secret key. Then, AA computes a set of s-pair for each element in both and . We refer to the s-pair in by , and the s-pair in by as Definition 2. These s-pair are defined as follows:
• For matrix
• For vector
• For element
• For element
• For scalar :
• For scaled element :
After that, AA computes using Algorithm 1 as follows:
| (8) | 
and broadcasts it to the contract through blockchain transaction as the argument of the function Commit (Algorithm 12), which will store the value into state variable if the transaction is valid. Apart from , several other state variables are defined in the contract , described as follows:
1. (State Variable): A mapping collection from the blockchain address belonging to one attribute authority to its commitment
2. (State Variable): A mapping collection from the blockchain address belonging to one attribute authority to its list of unverified elements
3. (State Variable): A mapping collection from the blockchain address belonging to one attribute authority to its list of unverified group elements in both and
After recorded by contract , AA needs to reveal each committed element by passing two lists of s-pair
| (9) | ||||
| (10) | 
as arguments of the function Reveal (Algorithm 13), which computes the hash result of and using utility contract (Algorithm 17), and compares the result with the value stored in state variable . Finally, the valid set of s-pair will be stored into state variables and of the contract respectively.
V-B3 Verify
The system enters into the stage Verify after set by trusted authority .
First of all, attribute authority AA generates the proof for each in both and , and broadcasts these proofs as a list
| (11) | 
through transaction before deadline . The function Prove of contract (Algorithm 14) takes input as the argument. It checks the validity of these proofs by using utility functions SameRatio and CheckPoK of contract (Algorithm 18 and 19).
We assign each AA an index , based on the order in which receives the complete set of valid published proofs . The trusted authority, denoted as , is assigned the index .
At the end of this stage, we have the verified elements and for each .
V-B4 Generate
In the last stage Generate, needs to generate a set of group elements , selects a reasonable number of supported attributes , and broadcasts them to contract before the deadline .
The elements provided by are constructed as follows:
Invoked by this transaction, function Generate of (Algorithm 15) firstly checks the validity of elements and the value of , and then records ’s blockchain address with the claimed attribute size . For example, if owns the set of attributes , the value of attribute size is .
| Address | …… | ||||||
|---|---|---|---|---|---|---|---|
| Attribute Authority | …… | ||||||
| Attribute Representation | …… | ||||||
After the contract receives all the pairs from , every Service User (SU) can get the global parameter for the vector commitment system
| (12) | 
and generate a mapping table that maps each blockchain address of to its corresponding information as shown in Table III. We use to represent the size of complete supported attributes set and to represent the owned attributes for each AA.
V-C Data User Registration
To get the global identifier which will be used in the process of Key Generation, the data user (DU) needs to register the owned address by calling the function user_register of the contract (Algorithm 16).
DU needs to send predefined amount of GWEI to the contract as the registration fee payment. This amount defaults to 1000000 GWEI, which is approximately equivalent to 1.63 USD as of September 2023. In return, DU receives a value , which is the hash value of DU’s address , also known as the of this transaction call.
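The arithmetic behind the registration fee, and the derivation of an identifier from an address, can be checked with a short sketch. The unit relations (1 GWEI = 10^9 wei, 1 ETH = 10^18 wei) are standard; the use of SHA-256 for the identifier is an assumption made only so the example runs with the Python standard library, since the text does not name the hash function here, and the function names are hypothetical.

```python
import hashlib
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETH = 10**18


def gwei_to_eth(gwei):
    """Convert an integer amount of GWEI to ETH as an exact decimal."""
    return Decimal(gwei * WEI_PER_GWEI) / Decimal(WEI_PER_ETH)


def global_identifier(address):
    """Hash a hex-encoded address (SHA-256 is an assumed stand-in)."""
    hex_part = address[2:] if address.startswith("0x") else address
    return hashlib.sha256(bytes.fromhex(hex_part)).hexdigest()


fee_eth = gwei_to_eth(1_000_000)               # default registration fee in ETH
implied_usd_per_eth = Decimal("1.63") / fee_eth  # price implied by the quoted USD value
```

The default fee of 1000000 GWEI is therefore 0.001 ETH, and the quoted value of 1.63 USD corresponds to an exchange rate of about 1630 USD per ETH.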
Following that, DU establishes a secure channel or employs some off-chain methods with that have the required attributes and can verify DU’s identity. We assume that DU is an agent for one insurance company, and the company itself runs the consensus node of in this system. Therefore, DU may easily get verified by showing an ID badge to the person who manages the . DU then requests that issues a set of attributes in regards to DU’s identity. The format of a set of attributes might look like this: out of the full set of attributes .
For those that do not contain the needed attributes or cannot verify DU’s identity, DU may just set to be with .
Finally, DU receives from , sets for , and combines them as the set of attributes which will be used in the process of Key Generation.
V-D Key Generation
Without loss of generality, we suppose that there is a total set of attributes , indexed from 1 to , and a total set of attribute authorities including , indexed from 1 to . Assume that the attribute authority has a subset of attributes , then we have for and and .
To get the secret key which is comprised of multiple key parts from , Data User (DU) must initially generate the attribute vector . This is based on that was acquired during the Data User Registration process.
Given that the DU’s set of attributes and the mapping Table III generated from the process Authority Setup, the attribute vector is set as follows:
1. Set the first entries such that =
2. Set the entry to be 1. ( is responsible for this entry)
Then, DU randomly chooses for each AA and combines the arguments with the attribute vector (represented as a bit string) to produce a committed value , where .
| Attribute Authority | |||||
|---|---|---|---|---|---|
| Attributes | entry | mid | senior | agent | manager | 
| Vector Element | |||||
| Element Value | 1 | 0 | 0 | 1 | 0 | 
| nonce | |||||
| COMMIT() | COMMIT() | ||||
In Table IV, for instance, we have two attribute authorities and that possess attributes (entry, mid, senior) and (agent, manager) respectively. If there is a data user DU with a set of attributes (entry, agent), DU’s attribute vector is . For and , DU samples two random values and then computes auxiliary information using COMMIT (Algorithm 1).
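The construction of the attribute vector in this example can be sketched as follows. The function name `build_attribute_vector` and the list-of-lists representation of the authorities' attributes are hypothetical; the encoding itself (1 for a possessed attribute, 0 otherwise, and a final entry fixed to 1 for the trusted authority) follows the two rules stated above and the element values in Table IV.

```python
def build_attribute_vector(authority_attrs, user_attrs):
    """Build the DU's attribute vector from the per-authority attribute lists.

    authority_attrs: list of attribute lists, one per normal authority, in index order.
    user_attrs: attributes issued to the DU during registration.
    """
    universe = [a for attrs in authority_attrs for a in attrs]
    owned = set(user_attrs)
    vector = [1 if a in owned else 0 for a in universe]
    vector.append(1)  # final entry, handled by the trusted authority
    return vector


# Example from Table IV: two authorities, DU holds (entry, agent).
vector = build_attribute_vector(
    [["entry", "mid", "senior"], ["agent", "manager"]],
    ["entry", "agent"],
)
```

Here the first five entries are 1, 0, 0, 1, 0, matching the element values in Table IV, followed by the final entry 1.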
In general, based on the public parameter for vector commitment , generated in Authority Setup, DU can calculate the commitment on the attribute vector from its :
To request the secret key part , DU establishes another secure channel with and sends commitment along with an opening and nonce . Such opening is calculated as follows:
Based on the DU’s , firstly retrieves the information of the set of attributes which have been issued in the previous process Data User Registration and then verifies commitment using the opening and nonce received
where is the value calculated by the issued to DU. If the above check passes, uses a pre-defined random oracle to generate masking value
and hash functions to generate where
Finally, computes the secret keys
| (13) | 
, which consists of key parts
| (14) | 
for each possessed attributes by and send back to the DU through the secure channel.
To get the special secret key part for the entry, DU also needs to communicate with trusted authority and provides the commitment with the opening and nonce through the secure channel. As shown in the Table III, sets the to be COMMIT() where and then checks the following equation
If it passes, computes the secret key part similar as
| (15) | 
and sends it back to DU.
Upon receiving all the responses from each , DU finally obtains a complete set of secret keys .
V-E Encryption and Upload
Given that Attribute-based Encryption (ABE) is significantly more expensive than symmetric key encryption[44], the files that the data owner (DO) wants to share are not directly encrypted with ABE. Instead, hybrid encryption of ABE and AES is used for efficiency.
First of all, DO randomly samples an AES key from the key space and encrypts the file to be the ciphertext .
Then, DO uploads the ciphertext to IPFS and records the file location returned by IPFS. The metadata message can then be constructed as:
Using the policy vector discussed above, DO samples a random vector , generates a policy vector acting as the ciphertext policy, and outputs the ciphertext consisting of
| (16) | |||
| (17) | |||
| (18) | 
, where .
Lastly, the ciphertext is sent to the contract with the optional keyword that may ease the data retrieval process. The contract will emit this new uploading information to the subscribers as follows:
V-F Download and Decryption
As Algorithm 6 shows, all the encrypted metadata will be recorded sequentially. If the Data User (DU) is interested in one of DO’s files, DU may subscribe to the event created by the contract and call the Function GET (Algorithm 6) to obtain the encrypted metadata .
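The record-and-notify behaviour of the log contract can be pictured with a small in-memory model. This is only a sketch: the class `LogContractSketch` and its methods are hypothetical names, callbacks stand in for blockchain event subscriptions, and the real interface is the one given in Algorithm 6.

```python
class LogContractSketch:
    """In-memory model of an append-only log that notifies subscribers."""

    def __init__(self):
        self.records = []      # sequential list of (sender, ciphertext, keyword)
        self.subscribers = []  # callbacks standing in for event subscriptions

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def upload(self, sender, ciphertext, keyword=""):
        index = len(self.records)
        self.records.append((sender, ciphertext, keyword))
        for notify in self.subscribers:
            notify(index, sender, keyword)  # emit the "new upload" event
        return index

    def get(self, index):
        """Return the encrypted metadata stored at the given position."""
        return self.records[index][1]
```

A DU that subscribes before an upload receives the record index through its callback and can then fetch the encrypted metadata with `get`.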
To decrypt the ciphertext , DU computes
and tries to recover the message
| (19) | 
If DU’s attribute vector satisfies the policy vector selected by DO, DU can retrieve the AES key and file location from the metadata . Finally, DU downloads the encrypted file from the IPFS based on the location , and uses to decrypt to recover the original file .
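The overall data path (symmetric encryption, content-addressed storage, and metadata carrying the key and location) can be sketched end to end. Two substitutions are made so the example runs with the Python standard library alone, and both are assumptions: a SHA-256 counter-mode keystream stands in for AES and must not be used in practice, and an in-memory dictionary keyed by the SHA-256 digest stands in for IPFS. The ABE encryption of the metadata is omitted entirely.

```python
import hashlib
import secrets


def keystream_xor(key, data):
    """Toy SHA-256 counter-mode keystream; a stand-in for AES, NOT secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class ContentStore:
    """In-memory stand-in for IPFS: the location is the hash of the content."""

    def __init__(self):
        self.blobs = {}

    def put(self, blob):
        location = hashlib.sha256(blob).hexdigest()
        self.blobs[location] = blob
        return location

    def get(self, location):
        blob = self.blobs[location]
        if hashlib.sha256(blob).hexdigest() != location:
            raise ValueError("stored content does not match its location")
        return blob


def owner_upload(store, file_bytes):
    """DO side: encrypt, store, and return the metadata (key and location)."""
    key = secrets.token_bytes(32)
    location = store.put(keystream_xor(key, file_bytes))
    return {"key": key, "location": location}


def user_download(store, metadata):
    """DU side: fetch the ciphertext by location and decrypt it with the key."""
    return keystream_xor(metadata["key"], store.get(metadata["location"]))
```

Only the small metadata record needs to pass through the comparatively expensive ABE layer; the file itself is handled by the symmetric cipher and the storage network.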
Correctness. Since , and , we can compute
If , we obtain and can recover the message.
VI System Analysis
VI-A Performance Analysis
In this section, we compare our proposed Blockchain-enabled data governance system with previously discussed related works [13, 14, 23, 27, 24, 25, 26, 19, 28] in terms of performance and security. In Table V, we position these aforementioned works that are closely related to our work, providing comprehensive comparisons across the following assessment criteria:
| Approach | Group | Security | Encryption | Decryption |
|---|---|---|---|---|
| [13] | Prime | S-ROM | exp | pair |
| [14] | Composite | F-ROM | exp+pair | pair+exp |
| [19] | Composite | F-ROM | exp | pair + exp |
| [23] | Prime | S-STM | pair* | pair* |
| [24] | Composite | F-STM | exp | pair |
| [25] | Prime | S-STM | exp | pair + exp |
| [26] | Composite | F-STM | exp* | pair + exp* |
| [27] | Prime | S-STM | exp | pair + exp |
| [28] | Prime | S-STM | pair + 4exp + 2pm | pair + exp |
| This work | Prime | S-ROM | exp | 2 pair + exp |
Several works did not present the computational complexity information. In such instances, we have either extrapolated the complexity on our own or referenced results presented in the survey by Zhang et al. [7]. The latter are highlighted with ∗.
1. Group: There are two types of groups used in CP-ABE: prime-order and composite-order groups. It is noted that the design of prime-order CP-ABE is more efficient than composite-order CP-ABE but is more challenging to achieve full security.
2. Security Model: Standard model (STM) and random oracle model (ROM) are two typical types of security models considered in the CP-ABE scheme. And adversaries are categorized into selective adversaries and adaptive adversaries. If a CP-ABE scheme is secure against adaptive adversaries under the standard model, we denote this scheme as F-STM achieving full security. Likewise, S-ROM represents a CP-ABE scheme robust against selective adversaries under the random oracle model, and F-ROM if it is secure against adaptive adversaries under the random oracle model.
3. Computation Cost: This assessment considers both the encryption and decryption costs in terms of their complexity, measured in terms of certain standard (cryptographic) operations. The following notations are used:
   – exp: exponentiation operation
   – pair: bilinear pairing operation
   – pm: point multiplication operation
   – : the access policy complexity for Linear Secret Sharing Scheme (LSSS)
   – : the size of attributes
   – : the size of attribute authorities
In terms of computational costs, bilinear pairing (pair) is the most expensive operation compared to exponentiation (exp) and point multiplication (pm), and the LSSS is a special matrix whose rows are labeled by attributes such that the cost of might be similar to or . Consequently, our scheme is superior to most of these works with respect to encryption and decryption cost, as shown in Table V, but it is restricted to conjunctive access policies. More precisely, for ciphertext consisting of
, costs exp, costs exp, and costs another exp as the result is given by each public key . Similarly, for decrypting ciphertext , the formula costs pair + exp.
VI-B Policy Hiding
Policy-hiding means that the ciphertext policy is hidden from inspection. In our approach, we achieve a weaker concept known as weakly attribute-hiding, which ensures that the policy remains unknown to all parties except those who can decrypt the ciphertext. Our access control system is constructed on top of the decentralized inner-product predicate encryption scheme in [20]. Combined with special policy encoding, the proposed construction in Michalevsky’s work is proved to be weakly attribute-hiding against chosen-plaintext attacks under Assumption 3 in the presence of corrupt authorities.
[20] also provides detailed proof that, in the absence of corrupted authorities, the advantage of a PPT adversary in winning a sequence of games, beginning with the actual scheme and ending with a challenge ciphertext, is negligible in the security parameter against a challenger .
However, the privacy of attributes cannot be maintained if adversary colludes with a corrupt authority and asks it to provide an attribute key for a special value .
Let represent a subset of attributes included in the ciphertext policy. One of the naive constructions of policy vector might be as follows:
1. Set the first entries such that =
2. Set the entry to .
If DO sets the policy vector to be and DU has the attribute vector to be , it is easy to get , which is the precondition of successful decryption. A trustworthy authority should only issue keys for values or . A corrupted authority, on the other hand, can provide an adversary with a key issued for a specific value , which may "satisfy" the policy vector despite lacking all necessary attributes. If the policy vector is still and the corrupted AA issues the key for the attribute vector , we again get . As a result, rather than using to indicate the absence or presence of an attribute, the enhanced scheme in [20] encodes the required attributes using randomly sampled over a large field, defeating an attempt to craft a key by arbitrarily selecting a value that would result in a zero inner product.
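The inner-product argument above can be made concrete with a small sketch. Because the exact vectors are elided in the text, the encoding used here is an assumption based on the usual conjunctive encoding for inner-product predicates: a coefficient per required attribute, zeros elsewhere, and a final entry equal to the negated sum of the coefficients. The modulus and all function names are illustrative.

```python
import secrets

Q = 2**61 - 1  # Mersenne prime used as a toy field modulus (assumption)


def policy_vector(universe, required, coeffs=None):
    """Encode a conjunctive policy; the naive encoding uses coefficient 1 everywhere."""
    if coeffs is None:
        coeffs = {a: 1 for a in required}
    x = [coeffs[a] % Q if a in required else 0 for a in universe]
    x.append(-sum(x) % Q)  # final entry cancels the sum of required coefficients
    return x


def random_coeffs(required):
    """Enhanced encoding: a random nonzero field element per required attribute."""
    return {a: secrets.randbelow(Q - 1) + 1 for a in required}


def satisfies(policy, key_vector):
    """Decryption succeeds only if the inner product is zero in the field."""
    return sum(p * v for p, v in zip(policy, key_vector)) % Q == 0


universe = ["entry", "mid", "senior", "agent", "manager"]
required = {"entry", "agent"}
naive = policy_vector(universe, required)

honest = [1, 0, 0, 1, 0, 1]   # holds both required attributes
partial = [1, 0, 0, 0, 0, 1]  # holds only "entry"
forged = [2, 0, 0, 0, 0, 1]   # corrupt authority issues value 2 for "entry"
```

With the naive encoding, `satisfies(naive, forged)` holds even though the key lacks the attribute agent, whereas with coefficients drawn by `random_coeffs` the forged value would have to match an unknown ratio of random field elements.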
VI-C Receiver Privacy
According to Definition 5, the receiver must provide its attribute vector to each attribute authority AA from which a key is requested. Consequently, AA learns not only whether the user possesses the attribute that AA owns, but also all of the user’s other attributes. This appears to violate the privacy of the user in a decentralized setting.
Therefore, Michalevsky and Joye propose an enhancement that provides additional privacy protection for the attribute vector in their work [20], which uses it to hide the set of receiver attributes from authorities. This technique is called position-binding Vector Commitments, introduced by Catalano and Fiore [43]. Our access control system is based on this work, but it has been slightly modified to work with asymmetric pairings , which meets two security requirements under Assumption 4:
1. Position Hiding: Even after seeing some openings for certain positions, an adversary cannot tell whether a commitment was made to a sequence of messages, or .
2. Position Binding: An adversary should not be able to open a vector commitment to two different messages at the same position.
Since hiding is not a critical property and can be easily achieved in the realization of vector commitments [43], only the proof of position binding is provided below:
Proof.
Suppose an efficient adversary can produce two valid openings to different messages at the same position . In that case, we can build an algorithm that uses to break the Square Computational Diffie-Hellman assumption.
As we know from Definition 4, for a sequence of messages and public parameter , the vector commitment is and the opening for position is .
The efficient algorithm takes as input a tuple and aims to compute to break Assumption 4.
First, selects a random position on which adversary will break the position binding. Next, chooses where and then computes:
Second, sets and runs , that outputs a tuple such that and both and are valid.
Finally, computes
Since openings verify the two messages correctly at position , then it holds:
which means that
Since ,
we have:
, which justifies the correctness of ’s output. Therefore, if the Square Computational Diffie-Hellman assumption holds, the scheme satisfies the position-binding property. ∎
VI-D Other Security Requirements
VI-D1 Trustability
Most existing solutions require an intermediary entity to ensure reliable and secure data management, resulting in expensive costs to prevent a single point of failure and privacy leakage. To overcome these obstacles, our approach employs the blockchain’s distribution, decentralization, transparency, and immutability characteristics. By publishing the encrypted metadata, which consists of the AES key and file location , to the blockchain, we can maintain the integrity of access control management without requiring any intermediaries.
VI-D2 Traceability
Our system can track and validate access control data on the blockchain. All activities, including setup, registration, key generation, encryption, and data uploading, are recorded as immutable transactions.
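To illustrate why immutably recorded transactions support traceability, the following minimal sketch models an append-only, hash-chained log. It is an in-memory stand-in for a blockchain rather than the system's actual contracts; the class `AuditLog`, its field names, and the sample actors and actions are hypothetical.

```python
import hashlib
import json


class AuditLog:
    """Toy append-only log; each entry commits to the hash of the previous one."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash, actor, action):
        payload = json.dumps([prev_hash, actor, action]).encode()
        return hashlib.sha256(payload).hexdigest()

    def append(self, actor, action):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {
            "prev": prev_hash,
            "actor": actor,
            "action": action,
            "hash": self._digest(prev_hash, actor, action),
        }
        self.entries.append(entry)
        return entry["hash"]

    def verify(self):
        """Recompute every link; an edited entry no longer matches its hash."""
        prev_hash = self.GENESIS
        for e in self.entries:
            if e["prev"] != prev_hash:
                return False
            if e["hash"] != self._digest(e["prev"], e["actor"], e["action"]):
                return False
            prev_hash = e["hash"]
        return True


log = AuditLog()
log.append("AA-1", "setup")            # hypothetical actors and actions
log.append("DU-7", "registration")
log.append("AA-1", "key generation")
intact = log.verify()

log.entries[1]["action"] = "encryption"  # simulate rewriting history
tampered_ok = log.verify()
```

In this toy model, changing any recorded action invalidates the chain from that entry onward, which is the property an auditor relies on when tracing authorization activity.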
VI-E Collusion Attack Analysis
A fundamental requirement for an ABE scheme is to prevent collusion attacks. Let and be two users, possessing sets of secret keys and , corresponding to the attribute vector . consists of key parts that enable obtaining a secret related to attribute from every attribute authority for ’s processed attributes element . consists of key parts that enable obtaining a secret related to attribute element from every attribute authority for ’s processed attributes . Collusion resistance means that and must not be able to mix their key parts to construct a secret key for a new vector such that . Otherwise, users could collude to decrypt a ciphertext that is not accessible to either of them.
Therefore, we propose two mechanisms to restrict key combinations:
1. Global Identifier associates each secret key with an identity by incorporating it into the key parts issued by the attribute authorities.
2. Masking Term maps a combination of the public keys of all other authorities , global identifier , and the vector commitment to a random element
It is worth noting that our scheme necessitates one special authority to be trusted specifically to issue keys only for . As previously discussed, this design choice ensures our scheme’s security, even if some corrupt authorities collude with adversaries to compute a key for any distinct value within the attribute vector .
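The role of the global identifier and the masking term can be illustrated with a toy model. The exact masking formula of the scheme is not reproduced in this section, so the pairwise-cancelling construction below is an assumption modelled on the description above: each pair of authorities derives a Diffie-Hellman value, hashes it with the global identifier and the vector commitment, and one side adds the result while the other subtracts it. The parameters `P`, `G`, `Q` and the helper names are hypothetical and far too small for real use.

```python
import hashlib

P = 2**127 - 1   # toy prime modulus (a Mersenne prime), not a secure group
G = 3            # toy base
Q = 2**61 - 1    # masks are scalars modulo this toy prime


def h(shared, gid, v):
    """Hash a pairwise shared value, the GID, and the commitment to a scalar."""
    data = f"{shared}|{gid}|{v}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def masking_term(i, sk_i, pubkeys, gid, v):
    """Mask for authority i: +h for lower-indexed peers, -h for higher-indexed ones."""
    total = 0
    for j, y_j in enumerate(pubkeys):
        if j == i:
            continue
        shared = pow(y_j, sk_i, P)  # equals pow(y_i, sk_j, P) on the peer's side
        total += h(shared, gid, v) if j < i else -h(shared, gid, v)
    return total % Q


sks = [11, 22, 33, 44]                      # hypothetical authority secrets
pks = [pow(G, sk, P) for sk in sks]
v = "commitment-to-attribute-vector"        # placeholder for the vector commitment

# Key parts issued for one identity: the masks cancel when combined.
honest = [masking_term(i, sks[i], pks, "alice", v) for i in range(4)]
honest_sum = sum(honest) % Q

# Key parts mixed from two identities: the masks no longer cancel.
mixed = [masking_term(i, sks[i], pks, "alice" if i < 2 else "bob", v)
         for i in range(4)]
mixed_sum = sum(mixed) % Q
```

In this sketch the masks vanish only when every key part was issued for the same identifier and the same commitment, which is the intuition behind binding key parts to a single user.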
VI-F Rogue-key Attack Analysis
First, we prove that the original scheme, as proposed by Michalevsky and Joye [20] and outlined in Definition 5, is vulnerable to a Rogue-key Attack. Following this, we discuss how our blockchain-based data governance mechanism effectively mitigates this vulnerability.
VI-F1 Attack
We demonstrate that a corrupt authority can learn the key parts corresponding to issued by trusted authority and thus decrypt the ciphertext in Michalevsky and Joye’s scheme [20], without having the required secret key satisfying the attached access policy. This is a typical attack, called a Rogue-key Attack, in which the adversary registers a public key that is a function of honest users’ keys [45].
Proof.
First, an adversarial authority waits until all the other attribute authorities have published their public keys, and then calculates its own public key as follows:
Second, an adversarial data user creates an attribute vector and requests key parts from all the attribute authorities .
As in the encryption phase, data owner DO collects all the public keys from each and outputs the ciphertext consisting of
, where
Third, given the ciphertext and received secret keys , adversary decrypts it as follows:
Since ’s attribute vector is , we have
(20)
To decrypt the ciphertext, we need to cancel out the last element in Equation 20. Therefore, adversary can use the published to calculate the attacking component .
Based on , we obtain and can further cancel the last component in Equation 20 as .
In the end, the adversary recovers the message by computing
∎
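The core trick of a rogue-key attack can be shown in its generic key-aggregation form [45]. The sketch below is not the attack on the ABE scheme itself, whose equations are not reproduced here; it only demonstrates, under toy parameters `P` and `G` and hypothetical secrets, how an authority that publishes last can choose a public key that cancels the honest keys.

```python
P = 2**127 - 1   # toy prime modulus, not a secure group
G = 3            # toy base

# Honest authorities publish their public keys first.
honest_sks = [1234567, 7654321]             # hypothetical secrets
honest_pks = [pow(G, sk, P) for sk in honest_sks]

# The adversary waits, then publishes g^x divided by the product of the others.
x = 424242                                   # exponent known to the adversary
product = 1
for pk in honest_pks:
    product = product * pk % P
rogue_pk = pow(G, x, P) * pow(product, -1, P) % P   # modular inverse, Python 3.8+

# Anything computed from the product of all public keys now depends only on x.
aggregate = rogue_pk
for pk in honest_pks:
    aggregate = aggregate * pk % P
```

The adversary never learns the honest secrets, yet it knows the discrete logarithm of the aggregate, which is why registration must force each authority to prove knowledge of its own secret.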
VI-F2 Proof of security of our approach
One way to mitigate rogue-key attacks is to require a proof of knowledge upon public key registration with a certificate authority (CA) [45]. In the absence of a CA, a Non-interactive Zero-knowledge Proof (NIZK) can be used to solve this security issue in a decentralized manner, as is done in our approach.
Proof.
In the process of Authority Setup, every AA needs to generate a commitment of secrets used for public key , and then creates its proof of knowledge using NIZK (Algorithm 3).
Since the NIZK is built upon the Schnorr Identification Protocol [42], our scheme is resistant to rogue-key attacks under Assumption 5, i.e., the Knowledge of Coefficient Assumption [41]. In particular, given a simple example , adversary cannot generate a valid NIZK of without knowing , except with negligible probability.
As defined in the NIZK of (Algorithm 3), a valid proof is such that
, where is deterministic by and a given string . Thus, if such is provided to and used to generate is unknown, can construct a pair for a given and also there exists an efficient deterministic that outputs .
However, because of the Knowledge of Coefficient Assumption [22], the probability that both
1. ‘succeeds’, meaning it satisfies the condition: SameRatio()
2. ‘fails’, meaning
is negligible. Thus, adversary cannot generate a valid proof without knowing based on the given information. In other words, can generate a public key with a valid NIZK only if has knowledge of ’s exponent . As such, our scheme is secure against the rogue-key attack: a malicious AA cannot register a that is a function of the others’ keys.
∎
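For intuition, a textbook Schnorr proof of knowledge made non-interactive with the Fiat-Shamir transform can be sketched as follows. Algorithm 3 is not reproduced in this section, so this is an assumed, simplified variant rather than the paper's exact construction: it works in the multiplicative group modulo a toy prime instead of a prime-order pairing group, and the names `prove`, `verify`, and the context strings are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1    # toy prime modulus, not a secure group
G = 3             # toy base
ORDER = P - 1     # exponents are reduced modulo the order of Z_P^*


def challenge(y, t, context):
    """Fiat-Shamir challenge derived from the statement, commitment, and context."""
    data = f"{G}|{y}|{t}|{context}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def prove(x, context):
    """Return public key y = G^x and a proof of knowledge of x."""
    y = pow(G, x, P)
    r = secrets.randbelow(ORDER)      # fresh nonce for every proof
    t = pow(G, r, P)
    c = challenge(y, t, context)
    s = (r + c * x) % ORDER
    return y, (t, s)


def verify(y, proof, context):
    """Accept iff G^s == t * y^c, which holds when s = r + c*x."""
    t, s = proof
    c = challenge(y, t, context)
    return pow(G, s, P) == t * pow(y, c, P) % P


y, proof = prove(987654321, "AA-3")           # hypothetical secret and label
accepted = verify(y, proof, "AA-3")

# A key derived from someone else's key cannot reuse that party's proof.
rogue_y = y * pow(pow(G, 5, P), -1, P) % P
rogue_accepted = verify(rogue_y, proof, "AA-3")
```

Because the challenge binds the public key being registered, an authority that sets its key as a function of others' keys, without knowing the resulting exponent, has no response that passes verification in this model.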
VI-G Inferring the Secret Vector in Ciphertext
In our further study of the Decentralized Policy-hiding ABE scheme [20], we also identified a potential risk associated with the generation of public parameters during Setup. The details of the risk and the proof of security are described in the following sections.
VI-G1 Vulnerability
As defined in Definition 5, a trusted third party (TTP) or an attribute authority (AA) needs to pick a set of random numbers , and then generate a random matrix with , which during Setup.
Proof.
If the above generation is conducted by a PPT adversary , or if the sensitive information is exposed, can infer the value of from any published ciphertext .
As we know, to generate a ciphertext , the data owner (DO) first samples a random vector , then creates a policy vector acting as the ciphertext policy, and finally outputs the ciphertext consisting of
, where and collected from published public keys .
Since
can use the result and known to calculate:
For the exponent , we have
It is easy to see that the th elements from exponents and partially cancel each other out, and the remaining element is the target exponent. Therefore, adversary can extract from the published ciphertext with the knowledge of .
∎
VI-G2 Proof of security with our approach
One potential mitigation of the inferring attack is to have multiple AAs jointly generate a composite public parameter , such that none of the individual AAs knows the secrets . This relies on the Knowledge of Exponent Assumption (KEA) introduced by Damgård [46]. More precisely:
Proof.
In Round 1 of Section V-A4, the first participant samples a random matrix and a scalar , constructs the elements , and publishes this information for the next participant.
Upon receiving the , samples its own random matrix and a scalar , calculates
using Algorithm 5, and also broadcasts them to the next participant.
Similarly, the next participant performs the same operation until the very last one. In the end, we have a composite
, and no participant learns the secrets of the others unless it colludes with every other participant, under Assumption 5.
Another security concern is consistency: an adversary with index can sample different and to compute and , rendering the final composite invalid and unusable.
Since a set of group elements
in both and has been committed and revealed by each AA as described in Section V-A2, we may validate that adversary’s is a proper multiple of ’s components:
The setup remains robust even if all the other participants collude, as long as at least one participant is honest and does not reveal its secrets. Hence, the greater the number of unrelated participants in a trusted setup, the lower the likelihood of an invalid and unusable public parameter [47].
∎
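The sequential structure of this setup can be sketched in a scalar-only toy model. The real protocol operates on matrices and group elements in a pairing group and validates each step with SameRatio, which requires a pairing and is omitted here; the function `contribute`, the parameters `P` and `G`, and the number of participants are assumptions made for illustration only.

```python
import secrets

P = 2**127 - 1   # toy prime modulus, not a secure group
G = 3            # toy base


def contribute(prev_value, secret):
    """One participant raises the running value to its private scalar."""
    return pow(prev_value, secret, P)


# Each participant samples a private scalar and never publishes it.
participant_secrets = [secrets.randbelow(P - 2) + 1 for _ in range(4)]

value = G
transcript = [value]            # the published intermediate values
for s in participant_secrets:
    value = contribute(value, s)
    transcript.append(value)

# Only someone knowing every scalar could recompute the composite exponent.
combined = 1
for s in participant_secrets:
    combined = combined * s % (P - 1)
expected = pow(G, combined, P)
```

The final value depends on the product of all private scalars, so in this model a single participant who keeps its scalar secret is enough to hide the composite exponent from everyone else.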
VII Concluding Remarks
Securing cloud data access and protecting identity privacy are legitimate concerns for many use cases, which this work addresses with a blockchain-based data governance system that is secure and privacy-preserving. A combination of attribute-based encryption (ABE) and the Advanced Encryption Standard (AES) makes the system efficient and responsive to real-world conditions. Our ABE encryption system can handle multi-authority scenarios while protecting identity privacy and hiding ABE’s policy.
However, because our system is built on top of Michalevsky and Joye’s scheme [20] and the setup of Bowe et al. [22], it inevitably inherits a few drawbacks. First, it only supports a fixed-size set of attributes and authorities, which means that any change to these may necessitate a new system setup or a new key for DU from all authorities. Second, because each AA’s public key must be shared with the others for the computation of masking terms , the system requires coordination among authorities during the setup phase. Third, under the desirable property called strong attribute-hiding, even an authorized user should be unable to learn which attributes were included in the encryption policy. Our system achieves only weak attribute-hiding, in which the policy is hidden from all entities other than authorized users. Finally, implementing a trusted setup to protect against rogue-key attacks further complicates the entire setup process, which limits scenarios in which authorities join and leave frequently. Addressing these shortcomings constitutes our planned future work.
References
- [1] Clark Mitchell. Activists respond to Apple choosing encryption over invasive image scanning plans, 2022.
- [2] M Sudha and M Monica. Enhanced security framework to ensure data security in cloud computing using cryptography. Advances in Computer Science and its Applications, 1(1):32–37, 2012.
- [3] OD Alowolodu, BK Alese, AO Adetunmbi, OS Adewale, and OS Ogundele. Elliptic curve cryptography for securing cloud computing applications. International Journal of Computer Applications, 66(23), 2013.
- [4] Liang Yan, Chunming Rong, and Gansen Zhao. Strengthen cloud computing security with federal identity management using hierarchical identity-based cryptography. In IEEE International Conference on Cloud Computing, pages 167–177. Springer, 2009.
- [5] Caixia Yang, Liang Tan, Na Shi, Bolei Xu, Yang Cao, and Keping Yu. Authprivacychain: A blockchain-based access control framework with privacy protection in cloud. IEEE Access, 8:70604–70615, 2020.
- [6] Matthew Green, Susan Hohenberger, and Brent Waters. Outsourcing the decryption of ABE ciphertexts. In 20th USENIX Security Symposium (USENIX Security 11), 2011.
- [7] Yinghui Zhang, Robert H Deng, Shengmin Xu, Jianfei Sun, Qi Li, and Dong Zheng. Attribute-based encryption for cloud computing access control: A survey. ACM Computing Surveys (CSUR), 53(4):1–41, 2020.
- [8] Ming Li, Shucheng Yu, Yao Zheng, Kui Ren, and Wenjing Lou. Scalable and secure sharing of personal health records in cloud computing using attribute-based encryption. IEEE transactions on parallel and distributed systems, 24(1):131–143, 2012.
- [9] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system. Decentralized Business Review, page 21260, 2008.
- [10] Amit Sahai and Brent Waters. Fuzzy identity-based encryption. In Annual international conference on the theory and applications of cryptographic techniques, pages 457–473. Springer, 2005.
- [11] Damiano Di Francesco Maesa, Paolo Mori, and Laura Ricci. Blockchain based access control. In IFIP international conference on distributed applications and interoperable systems, pages 206–220. Springer, 2017.
- [12] Aafaf Ouaddah, Anas Abou Elkalam, and Abdellah Ait Ouahman. Fairaccess: a new blockchain-based access control framework for the internet of things. Security and communication networks, 9(18):5943–5964, 2016.
- [13] Shangping Wang, Yinglong Zhang, and Yaling Zhang. A blockchain-based framework for data sharing with fine-grained access control in decentralized storage systems. IEEE Access, 6:38437–38450, 2018.
- [14] Xuanmei Qin, Yongfeng Huang, Zhen Yang, and Xing Li. A blockchain-based access control scheme with multiple attribute authorities for secure cloud data sharing. Journal of Systems Architecture, 112:101854, 2021.
- [15] Yiming Hei, Jianwei Liu, Hanwen Feng, Dawei Li, Yizhong Liu, and Qianhong Wu. Making ma-abe fully accountable: A blockchain-based approach for secure digital right management. Computer Networks, 191:108029, 2021.
- [16] Melissa Chase. Multi-authority attribute based encryption. In Theory of cryptography conference, pages 515–534. Springer, 2007.
- [17] Allison Lewko and Brent Waters. Decentralizing attribute-based encryption. In Annual international conference on the theory and applications of cryptographic techniques, pages 568–588. Springer, 2011.
- [18] Yannis Rouselakis and Brent Waters. Efficient statically-secure large-universe multi-authority attribute-based encryption. In International Conference on Financial Cryptography and Data Security, pages 315–332. Springer, 2015.
- [19] Yan Yang, Xingyuan Chen, Hao Chen, and Xuehui Du. Improving privacy and security in decentralizing multi-authority attribute-based encryption in cloud computing. IEEE Access, 6:18009–18021, 2018.
- [20] Yan Michalevsky and Marc Joye. Decentralized policy-hiding abe with receiver privacy. In European Symposium on Research in Computer Security, pages 548–567. Springer, 2018.
- [21] Leyou Zhang, Juan Ren, Li Kang, and Baocang Wang. Decentralizing multi-authority attribute-based access control scheme with fully hidden policy. International Journal of Network Security, 23(4):588–603, 2021.
- [22] Sean Bowe, Ariel Gabizon, and Matthew D Green. A multi-party protocol for constructing the public parameters of the pinocchio zk-snark. In International Conference on Financial Cryptography and Data Security, pages 64–77. Springer, 2018.
- [23] Takashi Nishide, Kazuki Yoneyama, and Kazuo Ohta. Attribute-based encryption with partially hidden encryptor-specified access structures. In International conference on applied cryptography and network security, pages 111–129. Springer, 2008.
- [24] Sheng Gao, Guirong Piao, Jianming Zhu, Xindi Ma, and Jianfeng Ma. Trustaccess: A trustworthy secure ciphertext-policy and attribute hiding access control scheme based on blockchain. IEEE Transactions on Vehicular Technology, 69(6):5784–5798, 2020.
- [25] Hui Cui, Robert H Deng, Junzuo Lai, Xun Yi, and Surya Nepal. An efficient and expressive ciphertext-policy attribute-based encryption scheme with partially hidden access structures, revisited. Computer Networks, 133:157–165, 2018.
- [26] Yinghui Zhang, Dong Zheng, and Robert H Deng. Security and privacy in smart health: Efficient policy-hiding attribute-based access control. IEEE Internet of Things Journal, 5(3):2130–2145, 2018.
- [27] Jiguo Li, Yichen Zhang, Jianting Ning, Xinyi Huang, Geong Sen Poh, and Debang Wang. Attribute Based Encryption with Privacy Protection and Accountability for CloudIoT. IEEE Transactions on Cloud Computing, 10(2):762–773, April 2022.
- [28] Chenbin Zhao, Li Xu, Jiguo Li, He Fang, and Yinghui Zhang. Toward secure and privacy-preserving cloud data sharing: Online/offline multiauthority cp-abe with hidden policy. IEEE Systems Journal, 16(3):4804–4815, 2022.
- [29] Vipul Goyal, Omkant Pandey, Amit Sahai, and Brent Waters. Attribute-based encryption for fine-grained access control of encrypted data. In Proceedings of the 13th ACM conference on Computer and communications security, pages 89–98, 2006.
- [30] John Bethencourt, Amit Sahai, and Brent Waters. Ciphertext-policy attribute-based encryption. In 2007 IEEE symposium on security and privacy (SP’07), pages 321–334. IEEE, 2007.
- [31] Yinghui Zhang, Dong Zheng, Xiaofeng Chen, Jin Li, and Hui Li. Computationally Efficient Ciphertext-Policy Attribute-Based Encryption with Constant-Size Ciphertexts. In Sherman S. M. Chow, Joseph K. Liu, Lucas C. K. Hui, and Siu Ming Yiu, editors, Provable Security, volume 8782, pages 259–273. Springer International Publishing, Cham, 2014.
- [32] Junzuo Lai, Robert H Deng, and Yingjiu Li. Fully secure cipertext-policy hiding cp-abe. In Information Security Practice and Experience: 7th International Conference, ISPEC 2011, Guangzhou, China, May 30–June 1, 2011. Proceedings 7, pages 24–39. Springer, 2011.
- [33] Yannis Rouselakis and Brent Waters. Practical constructions and new proof methods for large universe attribute-based encryption. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, pages 463–474, 2013.
- [34] Axin Wu, Yinghui Zhang, Xiaokun Zheng, Rui Guo, Qinglan Zhao, and Dong Zheng. Efficient and privacy-preserving traceable attribute-based encryption in blockchain. Annals of Telecommunications, 74:401–411, 2019.
- [35] Changyu Dong, Liqun Chen, and Zikai Wen. When private set intersection meets big data: an efficient and scalable protocol. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, pages 789–800, 2013.
- [36] Hong Zhong, Wenlong Zhu, Yan Xu, and Jie Cui. Multi-authority attribute-based encryption access control scheme with policy hidden for cloud storage. Soft Computing, 22:243–251, 2016.
- [37] Sana Belguith, Nesrine Kaaniche, Maryline Laurent, Abderrazak Jemai, and Rabah Attia. Phoabe: Securely outsourcing multi-authority attribute based encryption with policy hidden for cloud assisted iot. Computer Networks, 133:141–156, 2018.
- [38] Lucas Ballard, Matthew Green, Breno De Medeiros, and Fabian Monrose. Correlation-resistant storage via keyword-searchable encryption. Cryptology ePrint Archive, 2005.
- [39] Dan Boneh, Xavier Boyen, and Hovav Shacham. Short group signatures. In Annual international cryptology conference, pages 41–55. Springer, 2004.
- [40] Mike Burmester, Yvo Desmedt, and Jennifer Seberry. Equitable key escrow with limited time span (or, how to enforce time expiration cryptographically) extended abstract. In International Conference on the Theory and Application of Cryptology and Information Security, pages 380–391. Springer, 1998.
- [41] Sean Bowe, Ariel Gabizon, and Ian Miers. Scalable multi-party computation for zk-snark parameters in the random beacon model. Cryptology ePrint Archive, 2017.
- [42] Claus-Peter Schnorr. Efficient identification and signatures for smart cards. In Advances in Cryptology—CRYPTO’89 Proceedings 9, pages 239–252. Springer, 1990.
- [43] Dario Catalano and Dario Fiore. Vector commitments and their applications. In International Workshop on Public Key Cryptography, pages 55–72. Springer, 2013.
- [44] Xinlei Wang, Jianqing Zhang, Eve M Schooler, and Mihaela Ion. Performance evaluation of attribute-based encryption: Toward data privacy in the iot. In 2014 IEEE International Conference on Communications (ICC), pages 725–730. IEEE, 2014.
- [45] Thomas Ristenpart and Scott Yilek. The power of proofs-of-possession: Securing multiparty signatures against rogue-key attacks. In Annual International Conference on the Theory and Applications of Cryptographic Techniques, pages 228–245. Springer, 2007.
- [46] Ivan Damgård. Towards practical public key systems secure against chosen ciphertext attacks. In Annual International Cryptology Conference, pages 445–456. Springer, 1991.
- [47] Zooko Wilcox. The Design of the Ceremony, October 2016.
VIII Appendix
VIII-A System Contracts
The function Commit and the state variable are defined as follows:
1. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its commitment
The function Reveal and the state variables it uses are defined as follows:
1. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its commitment
2. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its unverified list of s-pairs in both and
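The interplay between Commit and Reveal can be modelled off-chain with two mappings keyed by address. The sketch below is a plain Python approximation, not the deployed contract code: the class `CommitReveal`, the helper `digest`, the use of SHA-256 as the commitment function, and the sample address are all assumptions.

```python
import hashlib


def digest(elements):
    """Commitment to a list of elements (SHA-256 is an assumed choice)."""
    data = "|".join(str(e) for e in elements).encode()
    return hashlib.sha256(data).hexdigest()


class CommitReveal:
    """In-memory model of the two state mappings described above."""

    def __init__(self):
        self.commitments = {}   # address -> commitment
        self.unverified = {}    # address -> revealed, not yet verified, elements

    def commit(self, address, commitment):
        if address in self.commitments:
            raise ValueError("address has already committed")
        self.commitments[address] = commitment

    def reveal(self, address, elements):
        """Store the elements only if they match the earlier commitment."""
        expected = self.commitments.get(address)
        if expected is None or digest(elements) != expected:
            return False
        self.unverified[address] = list(elements)
        return True


contract = CommitReveal()
elements = [5, 25, 125]                       # placeholder group elements
contract.commit("0xAA01", digest(elements))   # hypothetical authority address

bad_reveal = contract.reveal("0xAA01", [5, 25, 126])
good_reveal = contract.reveal("0xAA01", elements)
```

Committing before revealing prevents an authority from choosing its elements after seeing those of the others, which is the same ordering concern that motivates the rogue-key analysis.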
The function Prove and the state variables it uses are defined as follows:
1. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its verified list of s-pairs in both and
The function Compute and the state variables it uses are defined as follows:
1. (State Variable): The most recent published value
2. (State Variable): The most recent published value
3. (State Variable): The most recent published scalar
The function Generate and the state variables it uses are defined as follows:
1. (State Variable): The most recent published value
2. (State Variable): The most recent published value
3. (State Variable): The most recent published scalar
VIII-B Authority Setup Contracts
The function Commit and the state variable are defined as follows:
1. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its commitment
The function Reveal and the state variables it uses are defined as follows:
1. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its commitment
2. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its list of unverified elements
3. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its list of unverified group elements in both and
The function Prove and the related state variables are defined as follows:
1. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its list of verified group elements in
2. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its list of verified elements in
3. (State Variable): An integer counter initialized to . It is used to track the number of valid attribute authorities.
4. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its index
The function Setup and the state variables it uses are defined as follows:
1. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its set of elements in the public parameter of vector commitment
2. (State Variable): A mapping from the blockchain address belonging to an attribute authority to its number of supported attributes
VIII-C User Registration Contracts
The function user_register and the state variable are defined as follows:
1. (State Variable): A mapping from the of to the value of global identifier
VIII-D Utility Function Contracts
The three main functions Hash, SameRatio and CheckPoK are defined as follows: