We conceptually design, formally verify and experimentally evaluate a sophisticated information control mechanism for a relational database instance. The mechanism reacts to access requests for data publishing or query answering with a granularity of either the whole instance or individual tuples. The reaction is based on a general read access permission for the instance combined with user-specific exceptions expressed as prohibitions regarding particular pieces of information declared in a confidentiality policy. These prohibitions are to be enforced in the sense that the user should be able to obtain those pieces neither directly nor by rational reasoning that exploits the interaction history and background knowledge about both the database and the control mechanism. In an initial offline phase, the control mechanism determines instance-independent weakening templates for individual tuples and generates a policy-compliant weakened view on the stored instance. During the system-user interaction phase, each request to receive data of the database instance is fully accepted but redirected to the weakened view.
Early versions of access control deal with objects as containers on the layer of an operating system. Basically, the control intercepts any request issued by a process to read, write or execute the content of a container and then either accepts or denies the request. The decision is taken according to previously granted access rights, but without inspecting the actual content of the container. Access control primarily aims at enforcing requirements of confidentiality, integrity and availability. In this article, we focus on confidentiality regarding processes of a single user or a group of potentially colluding users. Accordingly, requests to read or, more generally, to receive data are our main concern.
Since the early days, many refinements of access control have been proposed and have come into operation. In particular, the concepts of granularity, history-awareness and content-sensitivity are important for access control on the layer of a database management system. Going even further, managing data can be seen as the foundation for providing knowledge or some kind of belief, by assigning some well-defined meaning to raw data. Typically, such semantics are defined for the syntax of a formal logic. For example, first-order logic is employed for query answering in a relational database management system. Dealing with sophisticated notions of information, whether seen as knowledge or as belief, rather than with raw data might be even more ambitious, leading to a further layer of a knowledge-and-belief management system. Accordingly, access control for such a system demands further concepts, namely those of information control and entailment.
If a process running on behalf of an intelligent agent issues access requests, the results of an accepted access might be further exploited by computational rational reasoning, in order to determine the information actually gained. Roughly described, this gain is the new information inferred by reasoning about the newly received data together with the information already held. Hence, the control has to confine the information content of the data delivered such that any information gain by a “too curious” receiver does not comprise information to be kept confidential.
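The notion of information gain described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the receiver's reasoning is propositional forward chaining over Horn rules, which is a much weaker stand-in for the first-order reasoning considered in this article. All names (`closure`, `information_gain`, the example facts and the rule) are hypothetical and not taken from the mechanism itself.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is (premises, conclusion); this encoding is an assumption of the sketch.
Rule = Tuple[FrozenSet[str], str]


def closure(facts: Iterable[str], rules: Iterable[Rule]) -> Set[str]:
    """Return all facts derivable from `facts` by forward chaining over `rules`."""
    known = set(facts)
    rule_list: List[Rule] = list(rules)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rule_list:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def information_gain(prior: Set[str], received: Set[str],
                     rules: Iterable[Rule]) -> Set[str]:
    """New facts inferable after receiving data, beyond what was inferable before."""
    rule_list = list(rules)
    return closure(prior | received, rule_list) - closure(prior, rule_list)


# Hypothetical scenario: the receiver knows a rule linking medication to illness.
RULES: List[Rule] = [(frozenset({"takes_insulin"}), "has_diabetes")]
PRIOR = {"is_patient"}
RECEIVED = {"takes_insulin"}
CONFIDENTIAL = {"has_diabetes"}

GAIN = information_gain(PRIOR, RECEIVED, RULES)
VIOLATION = bool(GAIN & CONFIDENTIAL)  # True when the gain reveals a protected fact
```

In this toy scenario the delivered fact is not itself confidential, yet the gain contains the protected fact, which is exactly the situation the control has to prevent.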
To still achieve the best possible availability of information, the control should then be further enhanced by more sophisticated reactions to a request: rather than simply either accepting or denying a request, the control can react with a larger range of options, including the mediation of distorted data. However, distortions might lead to new vulnerabilities through so-called meta-inferences. Accordingly, on the layer of a multi-(intelligent-)agent system, it is necessary to also deal with adversarial reasoning including meta-inferences based on advanced background knowledge about the protection mechanism.
During this development, rather straightforward access control gradually matured into highly sophisticated inference control. Unfortunately, the increase in functionality comes along with a decline in efficiency and scalability. One line of answers to this challenge is known as confidentiality/privacy-preserving data publishing, which in particular includes the technique of value generalization by k-anonymization as a special case of information weakening. In a first offline precomputation phase, the control system generates a sanitized view such that all concerns regarding inferences are already provably captured. In a second system-user interaction phase, access to the original data is completely prohibited, but full read access rights on the view are granted.
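The two-phase scheme can be sketched as follows. This is a generic illustration of value generalization, not the mechanism developed in this article: the schema (name, age, diagnosis), the bucket width and all function names are assumptions made for the example, and the sketch only measures the smallest group size rather than proving any inference-proofness.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical schema: (name, age, diagnosis).
Row = Tuple[str, int, str]
ViewRow = Tuple[str, str]  # (age range, diagnosis)


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age by the enclosing range of the given width."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def build_view(instance: List[Row], width: int = 10) -> List[ViewRow]:
    """Offline phase: drop the identifier and coarsen the quasi-identifier."""
    return [(generalize_age(age, width), diagnosis) for _, age, diagnosis in instance]


def smallest_group(view: List[ViewRow]) -> int:
    """Size of the smallest group of rows sharing an age range (the k in k-anonymity)."""
    return min(Counter(age_range for age_range, _ in view).values())


def answer(view: List[ViewRow], diagnosis: str) -> List[ViewRow]:
    """Interaction phase: requests are evaluated on the view, never on the instance."""
    return [row for row in view if row[1] == diagnosis]


INSTANCE: List[Row] = [
    ("Alice", 34, "flu"),
    ("Bob", 37, "flu"),
    ("Carol", 52, "asthma"),
    ("Dave", 58, "asthma"),
]
VIEW = build_view(INSTANCE)   # computed once, before any user interaction
K = smallest_group(VIEW)      # 2 for this toy instance
FLU_ROWS = answer(VIEW, "flu")
```

Because all protection work happens in `build_view`, answering a request costs no more than an ordinary lookup on the view, which is the efficiency argument made above.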
A particular instantiation of this approach applied to relational databases even goes a step further. In this instantiation access rights for receiving data are expressed by the combination of (i) a general permission to see the tuples of a fixed database instance and (ii) exceptions in the form of user-specific prohibitions to acquire specific pieces of information. These forbidden pieces are expressed as queries in terms of the database schema and declaratively stated in a confidentiality policy. Notably, a security officer should declare such prohibitions independently of the actual instance. Given a confidentiality policy and the database instance, the control system splits the offline phase into two stages, which can be roughly rephrased as follows:
Our contributions generalize and substantially extend that particular instantiation of confidentiality/privacy-preserving data publishing: