Reverse engineering of database security policies

Tweet about this on TwitterShare on FacebookBuffer this pageShare on RedditShare on LinkedInShare on Google+Email this to someone

With the excuse of collaborating to the book Ingénierie et management des systèmes d’information, a book to honor the career of Jacky Akoka we revisited and extended our previous work on reverse engineering for database security policies (presented at Dexa 2013). Keep reading for this new version of our work on getting a database security model out of your current production database.

Why do we need to care about database security ?

Relational databases are at the core of most companies information systems, hosting critical information for the day to day operation of the company. Securing this information is therefore a critical concern. For this purpose, both, researchers and tool vendors have proposed and developed security mechanisms to ensure the data within the database system is safe from possible threats. Due to its relative conceptual simplicity, one of the most used mechanisms within DataBase Management Systems (DBMSs) are access control(AC) policies where Role-based access control (RBAC) (Ferraiolo, Sandhu, Gavrila, Kuhn, & Chandramouli, 2001) is the current trend.

However, and despite the few methods that attempt to automatically derive these policy implementations from high-level security specifications (Basin, Doser, & Lodderstedt, 2006)(Oh & Park, 2001) (Barker & Douglas, 2003), the task of implementing an access control security policy remains in the vast majority of cases a manual process which is time-consuming and error-prone. Besides, several database mechanisms may be needed in the implementation of the policy, e.g., triggers can be used to add fine-grained control on privileges, scattering the policy and increasing the complexity of the definition process. Furthermore, as security requirements are rarely static (new application scenarios, new users, etc), frequent modifications of the security policy implementation are required, what increases the chances of introducing new errors and inconsistencies.

In this context, discovering and understanding which security policies are actually being enforced by the database system comes out as a critical necessity. This is a necessary condition for the reenginering of the current policies to adapt them to evolving needs of the company and also to detect inconsistencies between the enforced and the desired policies. The main challenge for this discovery process is bridging the gap between the vendor-dependent policy representation and a more logical model that 1) express these policies in a way that abstracts them from the specificities of particular database systems and, 2) can be understood by security experts with no deep knowledge of the particularities of each vendor. Representing and processing the security policies at this logical level is much easier than a direct exploration of the database dictionary where the security information is scattered among different dictionary tables whose structure is unfortunately not standardised. Additionally, this logical model would also allow us to implement all analysis/evolution/refactoring/manipulation operations on the security policies in a vendor-independent and reusable way.

The goal of this paper is then two-fold.

  1. First we provide a means to represent such logical models for security concerns in relational database systems.
  2. Secondly, we describe a reverse engineering approach that can automatically create this database security model out of an existing database. Reverse engineering, as a process aimed to represent running systems at higher abstraction level, has been proved useful in many domains (Canfora Harman & Di Penta, 2007), including database systems (Hainaut et al., 1993), (Chiang, Barron, & Storey, 1994). However, these works have focused on the database structure and ignored the security aspects. We intend to cover this gap.

Additionally, we discuss possible applications and benefits of using a model-based representation given the fact that this enables to reuse in the security domain the large number of model-driven techniques and tools. Model manipulation techniques can then be applied to visualize, analyze, evolve, etc the model. Then, a forward engineering process could be launched in order to generate the new security policy implementation ready to import in the target database system.

Basic concepts: DBMSs Access Control Mechanisms

Security in databases heavily relies on the implementation of access control mechanisms that ensure that only authorised users have access to read/modify a given piece of data.

Access control manages the assignment of privileges or permissions, i.e., the execution of operations, on system objects to subjects. It has been largely studied by the research community and adopted by database tool vendors (Bertino & Sandhu, 2005). Mandatory Access Control (MAC), Discretionary Access Control(DAC) and Role-based Access Control (RBAC) are the most successful models. RBAC is currently the most popular one and will be the focus of this paper. In RBAC privileges are granted to roles which are, in turn, assigned to users.

Such a RBAC-like access control security policy is enforced in a database system by using a plethora of different mechanisms. Moreover, the information of the implemented policies is scattered around several tables and columns in the internal database dictionary, making it very difficult to quickly grasp the security constraints of a database. This section provides an introduction to these security mechanisms while next one presents the security model as a way to provide a more homogeneous representation of all of them.


We can divide the privileges that can be granted in a DBMSs in five categories: Database-level privileges, for those privileges that imply creation of database objects including users and roles. Table-level privileges for those that implies table, columns and index access. Permission-delegation privileges to delegate permission administration to users and roles. Execute privileges for the executable elements (stored procedures and functions) and Session privileges. The corresponding privileges for each category are listed below:

• The table level privileges are the following: SELECT, UPDATE, INSERT and DELETE.
• The database level privileges include CREATE, ALTER and DROP.
• GRANT and REVOKE are the privileges in the permission-delegation category. They can be granted to users or roles so that the permission administration is delegated.
• The EXECUTE privilege is the only one in the execute category. It is meant to be granted on procedural code elements.
• SET and CONNECT are the privileges in the session category. They manage the ability of a user to access a database and to act as a given role.

The list of privileges that exist within concrete DBMSs is longer than the one presented here (e.g., Oracle defines more than one hundred privileges to be granted). However, they are either vendor specific or are meant to be used only in very specific cases.

Creating and assigning roles.

A basic role-based access control policy can be set by using standard SQL commands to create and assign roles (and permissions), although not all commercial DBMSs fully support the SQL standard (e.g., MySQL does not give support to roles so privileges are directly assigned to users). In the following the relevant SQL commands are showed and explained.

• CREATE / DROP for roles: Those commands enable the creation and deletion of roles from the database.
• GRANT / REVOKE: The action of these commands is twofold, as they can be used on privileges but also on roles. When used on privileges they allow to grant/revoke a given privilege to a user or a role on a specific database object. When used on roles, it allows to grant/revoke a role to a user or to another role in DBMSs that supports role hierarchies.

Some of the explained commands admit the use of modifiers that enhance their behaviour and their semantics (e.g., granting a role with the GRANT OPTION allows the grantee to, in turn, grant the role to other users). Again, as the basic commands, they are supported only by certain vendors and just until some extent.

View definitions.

Views can be used as a fine-grained access control mechanism. A view filters table rows and columns with respect to a desired criterion so when a subject is granted privileges over a view, she is actually being granted privileges over a table or a set of tables but with certain constraints. This actually represents column-level and content-based (row level) access control.

Functions, stored procedures and triggers.

Stored procedures, triggers and functions may also participate in the implementation of an access control policy.

A stored procedure can be executed with two different sets of permissions, definer’s rights or invoker’s rights. If it is executed with definer’s right this, in fact, implies the invoker obtains transitive rights on the objects used within the code of the stored procedure. This constitutes a security feature in several DB systems as it allows to give very little permissions to application roles and to encapsulate functionality in the procedures. However, it is important to be able to “see” this information as it could be a security threat (some user not supposed to insert in a given table may do it by transitivity). Moreover, if the definer’s set of rights is important, potentially including administration permissions, a user or role winning permission on the modification of the procedure could modify it to gain access to critical information. Making this information easy to grasp, will allow for a better organization of the policy.

Triggers can also be used to set constraints on granted privileges. This way, generic privileges granted to users will be further filtered depending on the context of the access request. Typically, triggers will be defined to monitor the modification actions on certain tables. Once the trigger is fired, a condition is evaluated to decide whether to go ahead or not with the action based on contextual information (e.g. time of the day, values of the modification action…). As part of the condition a stored procedure could be called.


Access-control Metamodel

This section proposes our specific relational database access control metamodel (RBAC-inspired). Next sections describe how to automatically populate it based on the actual policies enforced in a particular database and how it can be used to analyze those policies.

The SQL standard predefines a set of privileges and object types that can be used in the definition of a relational database. Our metamodel provides a direct representation of these concepts to facilitate the understanding of the security policies linking the security elements with the schema objects constrained by them as depicted in the Figure . In the following, we describe the main elements of the metamodel.

Database objects: The main objects users can be granted privileges on are the following: tables, columns, views, procedures and triggers. Each one of these elements is represented in the metamodel by a metaclass inheriting from the SchemaObject metaclass. We remark that views, as can be seen in metaclass View, can have not only columns corresponding to table columns but also derived columns. Databases include a set of schemas (Schema metaclass) which in turn contain the SchemaObjects.

Privileges: Privileges, divided in five categories as presented in the previous section, are depicted in the bottom part of Figure F1. If needed, an extension of this metamodel (by inheriting from the Operation metaclass) could be provided to deal with vendor-specific privileges.

Subjects: Subjects executing actions on the DBMS are users and roles and they are, as such, represented in the metamodel with the corresponding metaclasses User and Role. The metamodel has to be also able to represent role hierarchies, normally allowed in DBMSs supporting roles. In the metamodel, the metaclass Role inherits from the metaclass Subject which enables it to become grantee. The User metaclass also inherits from Subject what means that, privileges, in contrast to pure RBAC, can be granted to both users and roles.

Other elements and constraints: There are several other aspects that have to be taken into account when developing a domain specific metamodel for access control in DBMSs.


Database Security Metamodel

Database Security Metamodel

The execution of privileges may need to be constrained. In DBMSs this is normally achieved by using triggers and procedural code. The metamodel should permit the representation of the existence of such constraints. The metaclass Constraint meets that purpose. Typically it points to a Trigger linked to the object and the operation on it that is constrained by the trigger. The Trigger metaclass also includes the attributes the trigger body and the condition and error message to be displayed when the trigger execution fails due to any exception.

Another aspect to be taken into account is object ownership as it is the basis for permission delegation. An association between objects and subjects records this ownership relationship.

Finally, global permission, i.e., permissions that are granted on all the corresponding elements (e.g., granting SELECT permission on ANY TABLE gives permission to select over all the tables in the schema) also exist in DBMSs and must be represented. For that purpose, in the metamodel, permissions can be granted on any object, including the metaclasses Schema and Database. A select permission granted on Schema or Database represents that the grantee can select all the tables and views contained in those objects.


Reverse Engineering Process to obtain the database security model

The previous metamodel allows to represent database access control policies at the logical level. Models conforming to this metamodel express the security policies in place in a given database in a vendor-independent manner. These models can be manually created but ideally they should be automatically created as part of an automatic reverse engineering process that instantiates the model elements based on the information extracted out of the database. Note that, the extracted models are vendor-independent but the extraction process is not since each vendor uses different internal structures (i.e. a different set of tables/columns in the data dictionary to express this information). In fact, we could regard this “injection” process as the one that abstracts out the specific product details.

Schema for the Extraction of database security policies

Extraction of database security policies

Our reverse-engineering approach is summarized in Figure \ref{fig:process}. It starts by populating our security model with the basic schema information (tables, views, etc). Then this model is refined to add user, roles and privileges information whereas in a last step, the bodies of triggers and stored procedures are analyzed to complement the access control policies in the security model. In the following we will detail the process for the reverse engineering of security policies over an Oracle 10g DBMS. This same process could be reused for other DBMSs changing the references to the specific data dictionary tables with the corresponding ones in that system.

Extracting general schema information.

The first step of the reverse engineering process consists in populating the part of the model that describes the structure of the database itself e.g., schemas, tables (and columns), views, procedures, etc. In order to do this, an injector (in our terminology, an injector is a software component that bridges technical spaces (Kurtev, Bézivin, & Aksit, 2002), in this case moving information from the database technical space to the modeling technical space) is needed.  This injector connects to the database and queries the dictionary tables to retrieve all the necessary information. The selected objects are inserted as model elements in the security model.


View extraction

View Extraction is more challenging since we need to parse the view definition to get the list of tables (or other views), columns and conditions the view selects. Since the tables and columns have already been created in the model in the previous step, the result of the parsing is a set of links between the view object and the other objects filtered by the view. The where clause will be copied as it is into the condition attribute of the view metaclass.  Moreover, a view can contain derived columns, e.g., columns that do not exist in any table like sums, averages, etc. To retrieve these columns, the database dictionary has to be queried to extract the list of columns of the view and then intersect it with the set of previously obtained columns (i.e., view columns pointing to table columns). Each column that is not in the intersection set will be created as a view column in the model.


Extracting users, roles and privileges

Once the schema information has been already inserted in the database security model, it is time to add the access control information. As in the precious step, an injector is needed to query the database dictionary tables and populate the model with the results. After this step, the general access control information of the database, i.e., subjects, objects and permissions are already represented in the security model.


Extracting stored procedures and triggers information

Triggers. Complex security checks can be implemented by means of triggers that fire to prevent certain actions when performed in a certain context. However, triggers are used for a huge variety of tasks beyond security purposes. In our approach, all triggers are retrieved and parsed, then, they are analyzed with respect to a number of heuristic conditions in order to select which of them are implementing security checks and constraints and discard the rest.

The heuristic conditions analyze the following aspects:

  • Trigger kind: Triggers are associated with statements (e.g., a select statement) and fired when the statements are invoked. However, it is possible to select if the trigger will be executed before or after the statement .BEFORE STATEMENT triggers are executed before the statement is completed, enabling the possibility of evaluating security concerns (e.g., the possibility to make inserts in certain tables could be enabled only to working days). Conversely, AFTER STATEMENT triggers are executed once the action of the statement is performed, so, when involved in security, they are normally used for logging purposes. Clearly, our focus should be in the BEFORE STATEMENT
  • Trigger contents: Although the kind of the trigger is an important hint, it is not enough to decide if it implements a security check or constraint or any other operation (For example, for each table with an auto-generated primary key, this will be implemented as a BEFORE STATEMENT insert trigger. Similarly, a number of derived attributes or inter-table check constraints may be implemented with this kind of trigger). To tackle this problem, we analyze the contents of the trigger in order to find operations that are likely to be used when performing security checks like system context information checks, user information checks, exception raising’s, etc.

As a summary, a trigger will qualify as a security trigger if it fulfils all the following heuristic conditions:

  • The trigger is a before statement trigger.
  • The trigger contains an exception section that raises an exception.
  • The trigger evaluates conditions on the system (IP address, host, time) or use information (name, assigned privileges). Note that checking this part is strongly vendor dependant.

As an example, Table 1 shows two triggers from the well-know Human Resources (HR) schema example provided by Oracle: Secure_employees and Update_job_history. The latter one is an AFTER STATEMENT trigger which directly disqualifies it as a security enforcement trigger. Conversely, the former fulfils all the heuristic conditions. It is a BEFORE STATEMENT trigger, it raises and exception and checks system information (the time) and thus it qualifies as a security trigger.  Once a trigger is identified as a security trigger, the constraints imposed by the trigger need to be extracted and added to the model.

Stored Procedures: As introduced before, stored procedures can be executed with invoker or definer’s rights. In the latter, the invoker role gets transitive access to certain database objects. The source code of the procedure must be analysed to obtain such set of transitive permissions. Our approach parses the store procedures and extracts the accessed database objects. These are then linked to the procedure in the database security model in order to easily retrieve them later on during the analysis phase. As an example, the add_job_history procedure in Table 1 is declared to be invoked with definer rights. The table it accesses, Job_history, would appear linked to this procedure in the security model.

Table 1. Procedural code excerpts

Table 1. Procedural code excerpts


The database security model obtained at the end of the previous step can be used in many different scenarios. In the following we describe some relevant ones. Note that the implementation of these scenarios benefits from the fact that at this stage security aspects are expressed at the model level and thus all existing model-driven techniques can be reused when manipulating them. This model-based representation also allows us to get rid of all vendor specific knowledge during the abstraction process. Therefore, these operations can be applied no matter the database system in place.

In order to illustrate the model analysis and manipulation that follows in this section, we will use the Human Resources (HR) schema. To that example we have added two roles, HR_MANAGER and HR_TRAINEE so that we can show the capabilities of our metamodel and reverse-engineering process to obtain and represent database RBAC security policies. Then, we have applied our tool to obtain the corresponding database security model, hereinafter called HR security model.


Visual data is often easier and faster to analyze than textual or tabular data. Using MDE tools we can easily provide a visualization of our database security model so that the relation between subjects, objects and permissions can be easily grasped. We have used Portolan , a model visualization tool, in order to generate such visualization.

Graphic Visualizing database permissions

Visualizing database permissions

In Figure 2 we show the visual representation we obtained for the HR security model, where each kind of element and relation is shown is a different color. Thanks to this visualization we can quickly see that HR_MANAGER has privileges in all tables whereas HR_TRAINEE has only privileges on the EMPLOYEES and DEPARTMENTS tables and on the ADD_JOB_HISTORY stored procedure. Moreover, the table EMPLOYEES has a constraint implemented by the trigger SECURE_EMPLOYEES (that calls a stored procedure). Finally, we used the Portolan path-discovery feature to see if there is a path between the database user JOHN and the table JOB_HISTORY. This path exists, as the table is reachable through the stored procedure ADD_JOB_HISTORY (note that for simplicity and readability, some information like ownership or system roles is not shown in the diagram).


Complex security queries and metrics.

The most basic thing we may want to do with a security model is to query it to learn more about specific details of the security policies currently enforced in the database. This is very complex to do directly on the database itself since this information is scattered among a number of database dictionary tables which are completely vendor-specific. Instead, when using our model we can just use a standard model query language to traverse the information in the extracted model classes. The model query is defined just once and can be executed on security models extracted from any relational vendor.

As an example, we have used the Eclipse environment to query the HR security model. In Table 2 we show a query written in OCL (Cabot & Gogolla, 2012), the standard query language for models. It extracts, for a given role (in this case HR_MANAGER), all the granted privileges that are subject to constraints (e.g. there is a trigger that restricts the execution of that privilege). As a result, the query shows the privilege, the object and the trigger enforcing the constraint.



let constrainedPrivileges : Sequence(DBSecurity::Permission) =

self.subjects->select(e | = ‘HR_MANAGER’)

->collect(e | e.grantedPrivileges)->select(e | not e.constraints->isEmpty()) in

constrainedPrivileges->collect(e | Tuple{object = e.object,

operation = e.operation, type = e.constraints.executes->first()})



Tuple{object=Table EMPLOYEES, operation=Update, type=Trigger SECURE_EMPLOYEES}

Tuple{object=Table EMPLOYEES, operation=Insert, type=Trigger SECURE_EMPLOYEES}

Tuple{object=Table EMPLOYEES, operation=Delete, type=Trigger SECURE_EMPLOYEES}

In the same way, we can also calculate metrics to learn more about the implemented security policy. A software metric is a measure of some property of a piece of software or its specifications. Quantitative measurements according to these metrics could help in several activities, like quality assurance and feature analysis. Clearly, there are many metrics that may be interesting to evaluate the security policies of an organization. Measures like counting the average of objects reachable per subject or how many users are assigned per role can give as one idea of how the policy is defined. We can discover, by example, that some roles accumulate too many privileges what could put the system into risk.

In the MDE world, the Object Management Group (OMG) has proposed a standard metamodel, Software Metrics Metamodel (SMM) to express the metrics and measurements on a model. SMM is integrated in the model-driven reverse engineering framework MoDisco (Bruneliere, Cabot, Jouault, & Madiot, 2010). Therefore, we can easily define SMM metrics and evaluate them on a given security model to quickly get some summary information.


Refactoring of security policies.

The complexity and frequent evolution of access control security policies may lead to correct but non-optimal definitions. This problem can be tackled by using a technique, refactoring, proved useful for quality improvements in several domains (source code, models, etc).

A refactoring on the database security model will keep the semantics of the policy, i.e., the same objects will be accessible by the same subjects, while improving one or more quality attributes of the model. Attributes like brevity, clarity and breadth (how comprehensive a policy is) of security policies (Goel & Chengalur-Smith, 2010) can significantly impact the quality of the policy as they will ease the task of managing it.

With our approach we could build a library of security refactoring patterns for databases that would allow designers to improve the security implementation of any database. As an example, we could envision an Introduce Role refactoring that creates a new role grouping a set of users with the same permissions. This refactoring does not change the behaviour of the policy (i.e. users do not have more nor less permissions than before) but will improve its maintainability since now we can work at the role-level instead of managing every time many individual users.

Refactoring can be implemented with model transformations whereas refactorings opportunities (“smells”) can be specified by means of model queries. In Table 3 we show a Model-to-Model transformation excerpt implemented in the ATL (Jouault, Allilaire, Bézivin, & Kurtev, 2008) transformation language that introduces a new role, assigns to it the permissions of a given user, remove the permissions from that user and grant her the new role.

module refactoring;

create OUT : SecurityDB refining IN : SecurityDB;

rule IntroduceRole{

from user: SecurityDB!User to

modifiedUser: SecurityDB!User (

permissions <- Sequence {},

roles <- user.grantedRoles->append(newRole) ),

newRole: SecurityDB!Role (

name <- ‘newRole’, permissions <- user.permissions )


Security policy deployment (reengineering).

New requirements to be met by the database security policy may appear. Instead of directly modifying the database security implementation, what will require vendor-specific knowledge and could lead to introduce errors as stated in the motivation of this work, the extracted model can be used to start a model-driven forward engineering process.

First, our reverse engineering process would create the security model out of the current database implementation. Then, the required changes and validations will be performed on the security model. Finally, model-to-model transformation and model-to-text transformations will be used to generate the SQL sentences and procedural code corresponding to the evolved model.

Our framework can also be used as part of existing forward engineering methods (Oh & Park, 2001), (Basin et al., 2006), (Soler, Villarroel, Trujillo, Fernandez-Medina, & Piattini, 2006) able to derive policy implementations from high-level security specifications. All these approaches could use our metamodel as a pivot/intermediate representation in their “code-generation” process to simplify bridging the gap between the high-level specifications and the vendor specific notation.



Barker, S., & Douglas, P. (2003). RBAC Policy Implementation for SQL Databases. In DBSec (pp. 288–301).

Basin, D., Doser, J., & Lodderstedt, T. (2006). Model driven security: From {UML} models to access control infrastructures. ACM Trans. Softw. Eng. Methodol., 15(1), 39–91.

Bertino, E., & Sandhu, R. (2005). Database security – concepts, approaches, and challenges. Ieee Transactions On Dependable And Secure Computing, 2(1), 2–19.

Bruneliere, H., Cabot, J., Jouault, F., & Madiot, F. (2010). MoDisco: a generic and extensible framework for model driven reverse engineering. In IEEE/ACM international conference on Automated software engineering (pp. 173–174).

Cabot, J., & Gogolla, M. (2012). Object Constraint Language (OCL): A definitive guide. In Formal Methods for Model-Driven Engineering (pp. 58–90).

Canfora Harman, G., & Di Penta, M. (2007). New Frontiers of Reverse Engineering (pp. 326–341). IEEE Computer Society.

Chiang, R. H. L., Barron, T. M., & Storey, V. C. (1994). Reverse Engineering of Relational Databases: Extraction of an EER Model from a Relational Database. Data & Knowledge Engineering, 12(2), 107–142.

Ferraiolo, D. F., Sandhu, R., Gavrila, S., Kuhn, D. R., & Chandramouli, R. (2001). Proposed NIST standard for role-based access control. ACM Transactions on Information and System Security, 4(3), 224–274.

Goel, S., & Chengalur-Smith, I. N. (2010). Metrics for characterizing the form of security policies. The Journal of Strategic Information Systems, 19(4), 281–295.

Hainaut, J., Chandelon, M., Ch, M., Tonneau, C., Joris, M., Hainaut, J., … Joris, M. (1993). Contribution to a Theory of Database Reverse Engineering. In in Proc. of the IEEE Working Conf. on Reverse Engineering (pp. 161–170). IEEE Computer Society.

Jouault, F., Allilaire, F., Bézivin, J., & Kurtev, I. (2008). ATL: A model transformation tool. Science of Computer Programming, 72(1-2), 31–39.

Kurtev, I., Bézivin, J., & Aksit, M. (2002). Technological spaces: An initial appraisal. In CoopIS, DOA’2002 Federated Conferences, Industrial track.

Oh, S., & Park, S. (2001). Enterprise Model as a Basis of Administration on Role-Based Access Control. In CODAS’01 (pp. 165–174).

Soler, E., Villarroel, R., Trujillo, J., Fernandez-Medina, E., & Piattini, M. (2006). Representing Security and Audit Rules for Data Warehouses at the Logical Level by Using the Common Warehouse Metamodel. In Proceedings of the First International Conference on Availability, Reliability and Security (pp. 914–921). IEEE Computer Society.


Tweet about this on TwitterShare on FacebookBuffer this pageShare on RedditShare on LinkedInShare on Google+Email this to someone


Your email address will not be published. Required fields are marked *