Abstract

Confidentiality of information should be preserved despite the emergence of data outsourcing. An existing approach aims to achieve confidentiality by vertical fragmentation, without relying on encryption. Although it prohibits unauthorised (direct) access to confidential information, this approach has so far ignored the fact that attackers might infer sensitive information by logical deduction. In this article, vertical fragmentation is modelled within the framework of Controlled Query Evaluation (CQE), which allows for inference-proof answering of queries. Within this modelling, the inference-proofness of fragmentation is proved formally, even if an attacker has some a priori knowledge in terms of a rather general class of semantic database constraints.


Extended Abstract

Nowadays information has become one of the most important resources and has to be protected. In order to protect information from undesired disclosure, confidentiality requirements are declared by setting up a confidentiality policy. A system should then enforce the declared confidentiality requirements autonomously, according to this policy.

Moreover, there is an increasing need to store data cost-efficiently in our economy-driven society. One approach to achieving this goal is the “database as a service” paradigm, in which third-party service providers specialise in hosting database systems and offer the use of these systems to their customers via the Internet in return for a rental fee. These customers may save money because they no longer have to purchase expensive hardware and software or deal with difficult administrative and maintenance tasks such as upgrading hardware and software or eliminating technical malfunctions.

Obviously, the “database as a service” paradigm conflicts with confidentiality requirements, because the service provider cannot be prevented from reading all cleartext information stored in its systems. One natural way to cope with this conflict is to encrypt all outsourced data on the user side. Unfortunately, such an approach often makes the efficient evaluation of queries on the server side impossible.

The benefit of encrypting data lies in making these data, and hence the information they contain, illegible. Often, however, single pieces of information in a relational database system are not confidential per se. Because data are stored according to some (static) relational schema, semantic associations between different pieces of information are represented, and often only these associations are confidential. For example, in a hospital neither the list of illnesses cured nor the list of patients is particularly sensitive per se. In contrast, an association between a patient's name and a specific illness is very sensitive and has to be protected.

To achieve this protection, some authors suggest breaking sensitive associations by splitting relational instances vertically, which is referred to as vertical fragmentation. There are several different approaches to achieving confidentiality based on vertical fragmentation, and for each of them the corresponding authors describe how fragments of an original relational instance can be outsourced so that unauthorised (direct) access to confidential information is prohibited. However, it is not shown that confidential information cannot be inferred from knowledge of non-confidential information. Moreover, it is not considered that an attacker often has some a priori knowledge, which might enable him to infer confidential information.
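The idea of breaking a sensitive association can be illustrated with a minimal sketch based on the hospital example. It is not the formalisation developed in the article: the column names, the sample rows, the helper functions `fragment_vertically` and `reconstruct`, and the use of a shared tuple identifier `tid` to link the two fragments are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def fragment_vertically(
    rows: Sequence[Row],
    local_cols: Sequence[str],
    external_cols: Sequence[str],
) -> Tuple[List[Row], List[Row]]:
    """Split each row into a locally held part and an externally stored part.

    A hypothetical tuple identifier "tid" is added to both parts so that
    the owner, who holds both fragments, can rejoin them later.
    """
    local: List[Row] = []
    external: List[Row] = []
    for tid, row in enumerate(rows):
        local.append({"tid": tid, **{c: row[c] for c in local_cols}})
        external.append({"tid": tid, **{c: row[c] for c in external_cols}})
    return local, external


def reconstruct(local: Sequence[Row], external: Sequence[Row]) -> List[Row]:
    """Rejoin the two fragments on "tid" and drop the identifier again."""
    external_by_tid = {row["tid"]: row for row in external}
    result: List[Row] = []
    for row in local:
        merged = {**row, **external_by_tid[row["tid"]]}
        merged.pop("tid")
        result.append(merged)
    return result


# Illustrative data only: the association (name, illness) is the secret.
patients: List[Row] = [
    {"name": "Smith", "illness": "Flu", "ward": "A"},
    {"name": "Jones", "illness": "Asthma", "ward": "B"},
]

# The names stay with the owner; illness and ward are outsourced.
local, external = fragment_vertically(
    patients, local_cols=["name"], external_cols=["illness", "ward"]
)
```

In this sketch the externally stored fragment contains no patient names, so the association between a name and an illness cannot be read off directly. Whether it can still be inferred, for instance from a priori knowledge that links wards to particular patients, is exactly the kind of question that direct access control does not answer.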

In contrast, there are several approaches to so-called Controlled Query Evaluation (CQE), and for each of them it is proven that a declared confidentiality policy is enforced in such a way that any harmful inferences are avoided. “Inference-proofness” is achieved by limiting a user's information gain so that this user cannot reliably infer protected information from his a priori knowledge and the (possibly distorted) answers to his queries.

The main novel contribution of this article is a formal analysis of the inference-proofness of a specific approach to vertical fragmentation, which splits a relational instance into one externally stored part and one locally held part. More specifically, a formalisation of this approach to vertical fragmentation is developed. Then, after a brief introduction to the framework of CQE, a logic-oriented modelling of this approach within the framework of CQE is introduced and subsequently analysed with respect to its inference-proofness. This analysis takes into account an attacker's a priori knowledge in terms of a rather general class of semantic database constraints.
