Jan 22, 2011

Database with Integrity - Study


Some time ago we have written an informal study for implementation of a data management system with integrity protection. It has been lying on a disk for a while and when we came across it recently, we decided to put it online as it shows some basic cryptographic mechanisms.
Category: General
Posted by: dc352

Definition of the Problem

You desire to keep arbitrary amounts of security-critical application data (e.g. user and rights data in an access management system) in a database. Database storage is chosen for its capacity, speed of access, querying facilities and high-availability characteristics. Assuming the database administrators and system administrators are themselves not trusted with this data, how would you protect it? The constraints are:
  • You may modify the schema to accommodate the integrity mechanism.
  • You may assume that there is a single entity which can process and produce data elements which is trusted and inside a security boundary. However the data itself is stored outside this boundary.
  • It is the integrity of the data, not confidentiality, which is important here.
  • Most of the data should be readable by anyone, but it should not be possible to change, obscure, or re-order data except from within an authorised and trusted boundary.
  • The design should aim to have as small an effect on search performance as possible.
Any aspect of data modification is considered a risk and it is important to maintain the relationships between data records, not just individual records.

Summary of the Problem

Clearly Defined Points

  1. Arbitrary amount of data.
  2. Storage is untrusted.
  3. Security problem is integrity.
  4. Trusted entity - produces and processes data.
  5. Data readable by anyone.
  6. Change of data only within the trusted boundary (trusted computing base).
  7. Small effect on search performance.
  8. Integrity of data and relations is important.
The security property we are interested in is integrity, not confidentiality or availability of the system.

Issues not Defined / Unclear

  1. Who can change/update data - our assumption is that it happens within authorised and trusted boundary.
  2. Who is auditing breaches of integrity - our assumption is there are no auditors and breaches are reported to users.
  3. What is the attacker model - our assumption is that attackers can change any data stored and communicated outside the trusted part of the system.
  4. Whether the requirement is only to protect integrity or to ensure correct reporting of corrupted data as well - we assume the latter is correct but we simplify the solution by assuming the relevant communication is within the trusted part of the system.

System Model

There are several entities important for dataflows within the system:
  • Source of the data (SRC) - trusted entity, within authorised and trusted boundary (ATB).
  • Storage of the data (DB) - untrusted entity.
  • Users (USR) - querying the data storage, it is in their interest to obtain correct data, but they may also try to corrupt the data.
  • Administrators of the untrusted system (ADM) - they are untrusted with respect to the data, but they are supposed to ensure availability and correct functionality of the system (its parts).
There are no auditors required or in existence. We may assume that any security related problems will be reported to users, as it is in their interest to receive correct data.

Data Flows

The data flows are relatively simple: SRC <--> DB --> USR It's not clear who can request changes of the data so let us assume that it can be done and there are no restriction on the data manipulations except that it must be done within ATB.

Attacker / Threat Model

It is not clear, whether the attacker can influence communication between parts / entities of the system. Let us assume that it is indeed possible. Then, the attacker may have unlimited access rights to untrusted part of the system and any communication outside ATB, with a possible (but not necessary) exception of communication between USR and ATB. As such, we can not avoid tampering of the data stored in DB, or transmitted between system's elements. The goal of the protection will be to detect unauthorised modifications and react. The unauthorised actions include:
  • changes of the data in the database;
  • deletion of the data in the database;
  • changes of the data en-route between ATB and DB;
  • changes of the data en-route between DB and USR.
The administrators are not fully trusted, as well as other internal staff. However, we still assume that administrators will keep the system running as their performance, in this respect, is easily verifiable. The staff is assumed to be able to perform various passive and active attacks on the system to thwart integrity of the data secured by the system. These attacks may target the system and its parts locally or re-direct data flows to other untrusted places. Users are not trusted either and they may try to exploit accessible API to attack the system. We also assume that although the staff are not trusted, it would be very difficult for them to collude if at least X=2 persons need to co-operate.

Relevant Threats

I skip threats related to force majeure, organisational shortcomings, human / user failure, and technical failure as these would require substantial space. Also, the problem to be solved relates particularly to malicious tampering with the integrity of the stored data. Here is a list of selected threats, according to IT Baseline Manual of BSI (Bundesamt fur der Sicherheit in der Informationstechnik), that may be used to attack the data's integrity, with short reasons justifying the selection:
  1. manipulation of lines - with a goal to perform MITM on a trusted channel;
  2. manipulation of data or software - to extract secrets used to provide integrity protection mechanisms, or introduced unauthorised changes to the data processing functions;
  3. abuse of remote maintenance ports - to get unauthorised access to trusted code, or secrets used;
  4. eavesdropping of rooms (terminals, ...) - to get unauthorised access to secrets used for integrity;
  5. threat posed by internal staff during maintenance / administration work - to get unauthorised access to secrets used for integrity;
  6. threat posed by external staff during maintenance work - to get unauthorised access to secrets used for integrity;
  7. systematic trying-out of passwords - attacking access controls of trusted code;
  8. abuse of user / administrator rights - to perform unauthorised operations, access secrets;
  9. trojan horses - attacks to penetrate boundaries of trusted part of the system;
  10. replay of messages - to introduce unauthorised changes of data;
  11. social engineering - to learn about security mechanisms and discover any vulnerabilities;
  12. unauthorised use of a cryptomodule - to attack integrity of processed data;
  13. manipulation of a cryptomodule - to change its functionality and change either content of the database or messages signalling security problems;
  14. compromising cryptographic keys - to get hands on the keys allows unauthorised changes of data;
  15. sabotage - to trigger emergency procedures and exploit vulnerabilities there.

Overall System Architecture

The system's security architecture will comprise of at least two basic elements that would provide:
  • integrity protection of the data elements;
  • functionality to verify integrity of the data; and optionally;
  • data redundancy elements (decreasing the data entropy) allowing easier detection of unauthorised modifications.
According to the threat model, the system has to provide trusted computational base and trusted path between itself and its users. The following phases of the system's life-cycle must be supported to deliver the required security properties:
  1. system setup and initialisation;
  2. production phase of the system:
    1. normal system tasks (providing data integrity, verification of data integrity);
    2. reporting of breaches of security requirements; and
    3. lost of sensitive security data.
  3. termination of the system.
Let us use the following few acronyms from now on: ICM for an integrity computation mechanism, IVM for an integrity verification mechanism, DMM for a data manipulation mechanism, QMM for query manipulation mechanism, TP for a trusted path. ICM, IVM, DMM, and QMM are within ATB - the trusted part of the system. Symbols [X] ... [/X] denote beginning and end of X.

Changes to Data Flows

All paths in the data flow diagram must incorporate ATB, and particularly a subsystem verifying integrity of the data.
  • Data creation: [ATB] SRC --> ICM [/ATB] --> DB
  • Data modification: DB --> [ATB] IVM --> DMM --> ICM [/ATB] --> DB
  • Data retrieval: [TP] USR --> [ATB] [/TP] QMM [/ATB] --> DB --> [ATB] IVM --> DMM --> [TP] [/ATB] --> USR [/TP]
The IVM component of the paths adds extra computational overhead and delay to the communication, an effect we want to keep on the minimum possible level, especially for the data retrieval according to an initial requirement. The error resolution procedures for IVM must be elaborated.

Security Mechanisms

We have defined several components of the system. Particularly trusted path (TP), integrity computation mechanism (ICM), and integrity verification (IVM) are crucial for satisfying security requirements. Data is stored in an untrusted environment and we therefore need to use some sort of cryptographic mechanisms as data can be manipulated by the attacker at their will.

All data-flow paths go through ATB, and we assume that it is a single physical component. Also, ATB is is as close to users (USR) as possible to minimise the amount of data to undergo integrity related computations and minimise the length of trusted paths. IVM and ICM (integrity verification and computation) are implemented within this boundary and we can therefore use symmetric cryptography as it is easier to administer, more computationally efficient, and there will be no problems with the key transport.

Note: IVM can be even moved to users (USR). It would require users to possess computational platform and key material required for integrity verification. The advantage would be minimum computational overhead on the system's (centralised) side.

Trusted path (TP) between authorised and trusted base (ATB) and users would be best facilitated with SSL or SSH protocol that would require ATB to possess a pair of keys for an asymmetric algorithm, with a public key certificate. This certificate can be issued by a trusted third party or it can be e.g. self-signed - assuming that users can obtain hash of the certificate via a trusted out-of-band communication (i.e., original PGP approach). TP would require only one-way authentication of a server to users (unless there are other requirements for accountability).

Database Schema

Integrity protection of data records in the database will deploy the fastest possible implementation and it should therefore use a combination of hash functions (e.g., SHA-X, RIPEMD-160) and a symmetric encryption algorithm (AES, ...). Alternatively, the symmetric encryption algorithm can be replaced with MAC implemented with the hash function (it would be more efficient from the computational point of view). We will use the former option, as it combines two different primitives and adds extra diversity (redundancy) to the system. The database schema will have the following new data elements:
  • Timestamp - any form of a timestamp. The size would largely surpass lifetime and size of the system / data. We would suggest 128bit value as it could be overloaded, or provide a space for sparsely distributed elements - if there was such a need;
  • Algorithm identification - containing IDs of algorithms used for cryptographic computations and a version of the record processing method. It will allow update of the method, changes of keys, or changes of cryptographic primitives;
  • Cryptographic signature of the record - to store encryption of the record's hash value, the size would be twice the current need.
  • hash value - hash value of the first (original) content of the database record;
  • (OPTIONAL) Record index - (it might be already part of the record), it would be needed if timestamp were pdate-able'';
  • (OPTIONAL) Hash - of the previous record inserted to the database. The value would be part of the state of the ICM;
  • (OPTIONAL) Links - sufficiently sized element allowing verification of other records' existence. we assume, for the sake of this exercise, that there are two records linked to the actual record - creating a tree-like structure
The complete security information in each record would consist of:
  1. timestamp (16B);
  2. record index (8B);
  3. hash (32B);
  4. algorithm ID (4B);
  5. cryptographic signature (32B); and
  6. links (80B).
It is 172B per record and the data items provide more than sufficient space for updates of algorithms / key sizes. The size could be optimised (e.g., hash values can be truncated) but as there are no requirements with respect to the data size, we will not elaborate this aspect of implementation.

Cryptographic Computations

The cryptographic signature will be computed in two steps:
  1. hash of the whole database record (all its elements): data || timestamp || algorithm identification || hash || links.
  2. encryption of hash from the previous step with a key secret to ATB.

Hash would provide linear chaining of the records, as they are inserted to the database. If a record is deleted, it can be detected. Links would serve the same purpose as the hash, if there are more complex relations between records in the database. Verification of links and Hash can be outside the record's signature verification to minimise computational overhead of database queries.

We assume that records' chaining checks would be implemented and carried out in two ways: when a record is being accessed (just a re-computation of cryptographic protection within the accessed record) and also independently when returning data to users' queries.

The reason for this approach is that importance of links between database records has not been specified. It may turn out that these are even more important than integrity of particular records. We therefore allow for an efficient implementation of a mechanism that would check only inter-record dependencies. IVM would recompute the hash of the data and compare it with the value stored in the database record. When the data is updated, ICM would recreate cryptographic signature of the record. The hash value element will remain unchanged, as well as links.

Implementation of Cryptography

Let us describe cryptographic protection in a bit more detail. As mentioned above, we need two (possibly three) main functions implementing computation and verification of integrity protection datagrams. There are also necessary two more functions if we are to support initialisation and recovery of the system. Any database record will consist of: [Data || Hash || Timestamp || algorithmId || RecordIndex || Signature || Links]

Integrity Computation

The first computation will set algorithmId according to the current settings of the system, create Timestamp value, and RecordIndex for a new record and read Data and Links. A record's hash will be computed from all the above mentioned elements. This hash will then be encrypted to obtain the records' Signature. A hash from the whole record will be computed and it will replace the hash in the internal state of the ICM.

The obvious problem here seems to be the combination of trusted computations and database operations over untrusted part of the system. The following operations: gather data, compute pieces of cryptographic information, store the record in the database, update internal state of the trusted code - their order and commitments must be properly designed to minimise denial of service (DoS) attacks on the system's integrity.

The balance between reliability (with respect to random communication errors between trusted and untrusted part of the system) and security of the system must be found. All these atomic operations are security sensitive - correctness of the code and confidentiality of used keys must be ensured.

Integrity Verification

There are two parts of IVM - intra-record verification and inter-record verification. The first will recompute record's signature and the integrity of links' elements. The latter does not include database queries for the linked records. The second verification can be performed asynchronously on user's queries and possibly run continuously.

The computation itself can be analogous to ICM - computing hash and encrypting it. This way, we do not require the decryption function for normal operation of the system - i.e. the internal API is simpler. The decryption may make sense only for investigative purposes - this would required a more thorough analysis. The problem here is to implement proper error propagation if any inconsistencies are detected. There are four approaches:

  1. Signal a problem to system administrators (more than one) - assuming that collusion is limited and/or administrators are trusted to a certain degree.
  2. Signal a problem to any user who is the first to query data after the inconsistency is detected - the assumption here being that it would initiate some sort of out-of-band problem escalation.
  3. Signal aproblem to the first user who queries the records with problem - similar to the previous one, but we also assume that users will be more motivated to uncover the problem.
  4. Signal a problem to an independent entities within the system - there may be auditor roles among internal staff that would be authorised and entrusted to react to problems.

Investigative/Forensic Functionality

Out of scope now. It would probably be an extra API function using IVM. I'm not sure whether any extra logging facility would make sense in the system as a lot of unauthorised actions can happen outside the trusted part of the system. However, extensive logging makes any attacks more difficult so further logging of operations performed by the trusted, as well asuntrusted part of the system would be recommended. It would need more elaboration though.

Secrets' Back-up

This is preceded by a system initialisation, when a key is generated - this would require a secure random number generator ( (P)RNG ). When the trusted part of the system is initiated, a function allowing back-up of the generated secrets should provided. There are two pieces of information that are relevant here. The key used for encryption of the record's hashes. The back-up will be implemented by splitting the key into several parts. We can use either XOR operation, or a hash operation - parts are hashed together to create the key.

The interface will implement export of the key's parts supported by a proper access control or operational procedures. Subject to further analysis. The second issue is that of the algorithm selection. It is the only configurable parameter (in this stage of the system's design). Selection and any change should be linked to the key generation / authentication, possibly with the main key. It needs a bit more thinking here as well - to realise all implications of the just stated requirement.

Secrets' Recovery

The internal state is lost so the recovery will consist of re-entering parts of the key, and setting the algorithm used for cryptographic operations (this could be verified by signature verification on a randomly selected record from the database). Note: any random selections from the DB, non-deterministic use of untrusted parts should be relied upon only in the minimum necessary extent.

Hardware Implementation of Trusted Part (ATB)

Hardware implementation depends on expected load of the system and on physical location and accessibility of the trusted part of the system. It may be implemented as a PC, or an embedded hardware system in a locked room, protected by physical and administrative procedures, it could be a dedicated HSM / tamper-proof cryptographic accelerator, or a more complex system able to handle high expected load of the system. We assume that the cheapest solution for small / mid-sized systems would be to use a HSM for cryptographic operations and implementation of the trusted part of the system.

System Life-cycle

The production phase of the life-cycle makes use of several sensitive pieces of information: a secret symmetric key for ICM/IVM and a state containing data for chaining database records. Setup and initialisation of the system is relatively straightforward: a key is generated, initial internal state is initialised.

We may require back-up of the key that can be best done by exporting keys in several parts, or a secret-key sharing. Appropriate procedures, and functionality for the key recovery must be defined and implemented within ATB. Reporting of security breaches / inconsistencies will be provided to users querying the database (broken integrity of database records) and to administrators (or users) if there is a problem in chaining.

IVM may be deployed synchronously - users receive either correct data, or an error message, or asynchronously - user will query results of IVM separately from the data. The latter approach will improve response to data search requests, but it will also increase probability that users will skip necessary security checks.

Recovery from the lost of the key and state of ICM can be only partial at best. There will always be a window of opportunity for the attacker. We cannot protect the database from deleting most recent data records, unless forward chaining is provided (record n-1 contains link to record n). This may significantly increase cost of database INSERT operations and make the system more complex.

It may be also possible to implement hot-backup of the trusted part of the system, when two copies of the trusted part would hold a copy of the same key(s) and they would share up-to-date processing state. If there is a problem with the active trusted component, the backup system will take over.

Threats - Mitigation

Let us have a look at the list of threats above and briefly describe how they are covered by the system's design.
  1. Manipulation of lines - we assume one-way authentication (as the simplest approach). This seems to be a potentially weak point of the design as it depends on user's responsibility and correct application of security measures (use of the correct certificate and configuration of the client's application). The attacks, however, will target particular users and not the system as a whole.
  2. Manipulation of data or software - it cannot be prevented. We try to detect any such unauthorised manipulation by verifying integrity of the data and providing trusted part between the verification and the consumer of the data. The trusted verification is implemented and operated in a secure environment. The assurance here is provided by this protection and validation/verification of the verification code.
  3. Abuse of remote maintenance ports - This is prevented by encapsulating the sensitive part of the system within an HSM with secure API.
  4. Eavesdropping of rooms (terminals, ...) - Technical means (secure hardware / HSM being used), combined with organisational and physical measures limiting access to the trusted part of the system.
  5. Threat posed by internal staff during maintenance / administration work - organisational procedures and reliance on secure hardware.
  6. Threat posed by external staff during maintenance work - see the previous one.
  7. Systematic trying-out of passwords - no weak passwords are being used.
  8. Abuse of user / administrator rights - There are no administration tasks necessary for the trusted part of the system. Initialisation and recovery implements dual control.
  9. Trojan horses - Use of non-standard (non PC) hardware for trusted hardware, use of simple communication interface preventing communication of code.
  10. Replay of messages - Use of integrity protection mechanisms, and a trusted path with strong one-way authentication (including freshness verification).
  11. Social engineering - We limit human interference / administration with the system, proper design implementing several layers of protection (cryptography, logging, proper verification).
  12. Unauthorised use of a cryptomodule - elimination of unused API functions and physical and operational procedures (not covered here).
  13. Manipulation of a cryptomodule - Mainly physical and operational procedures (not covered here).
  14. Compromising cryptographic keys - live keys are in a secure environment, back-up keys protected splitting the secret, and by physical and procedural measures.
  15. Sabotage - proper design of recovery procedures.
As the list goes, the weakest points currently seem to be: users, key back-up, and possibly recovery procedures. We can add one extra point that is related to dependency of trusted code on the untrusted part of the system. All operations that incorporate communication with untrusted code should be resistant to improper behaviour of the untrusted code. This design element requires particular attention during the design and implementation.

System Decomposition - Sketch

This section builds on the previous text and features a slightly more elaborate overview of the system's architecture. Figure 1 shows that the whole system composes of two parts: the original (possibly) database system, and a new trusted part of the system that will ensure security properties defined for the system.

Figure 2 elaborates more on the architecture of the module C1. Although the diagram's title is Trusted computing base, the scope of the trusted code can be reduced if required. This would also make security verification (or certification if to be provided) somewhat easier and cheaper. The minimum set of the trusted code would not contain the database interface and module C1.7, as we cannot prevent unauthorised changes of the data communicated with the database system - D1 and D2 - the latter in the sense of removing/adding records from the data flow.

Figure 1: Overall System Structure

Figure 2: The high-level structure of module C1.

The diagrams depict data flows, not control flows - the reason for the note in fig. 2. User connection manager receives/opens a connection with a user. The query is sent to C1.7 that will change the query such that not only data, but also new security-related records (see Security Mechanisms above) will be fetched.

Module C1.7 may call API of the original system, if there is such. D1 represents a final query to the database. The result (D2) is verified for integrity (module C1.5 calling C1.4) and the result is passed on to the connection manager C1.1 and back to the user. C1.4 - IVM - makes use of the cryptographic subsystem providing cryptographic primitives (MAC, encryption) as needed and it also ensures that the appropriate cryptographic mechanism is being invoked.

Flow D5 and the depending module C1.9 might be changed when the supported data update mechanisms are made concrete. D11 can go through C1.7 or directly to the database interface. The actual variant makes database interfacing easier to manage, as the final database query is produced in a single module. The data flow for new data starts at the data source (C1.8), C1.9 will facilitate ICM functions and the new INSERT query specification will be sent to C1.7.

Figure 3: The structure of module C1.9 - record manipulation.

Figure 3 is an example how modules from C1 would be broken down into smaller sub-modules. I imagine that diagrams on this level of detail would be accompanied by functional descriptions and only some of the modules would require another level of decomposition.