We propose a model transformation process for reverse engineering web-based PHP applications. Our PHP-to-UML approach follows an Architecture-Driven Modernization (ADM) process, and the model transformation rules are expressed in ATL.

The first step was to parse the PHP code and extract the corresponding AST (Abstract Syntax Tree) model. For this task we turned to MoDisco, an Eclipse GMT project for model discovery. It is intended to facilitate the design and development of model-based solutions for reverse engineering legacy systems.

However, MoDisco natively supports only a small number of technologies. For instance, it offers no way to handle web-based PHP applications, despite the importance of this language in web development. So we extended MoDisco with PHP support. Taking advantage of MoDisco's open and extensible architecture, and following the OMG recommendations for ADM, we built PHPDiscoverer, a new discoverer for the PHP language. The model discovery process used by PHPDiscoverer is illustrated in the following figure.

Fig. 1. The PHP model discovery process

For the PHP-specific ASTM, we used the Eclipse PHP Development Tools (PDT) library, which lets the Eclipse IDE assist developers editing PHP code with features such as syntax highlighting, code auto-completion and syntax error detection. This library contains a package of base classes called org.eclipse.php.core.ast.nodes, on which our PHP-specific ASTM is based. Figure 2 gives an overview of how these base classes are organized. The ASTM is then saved in an Ecore file. This article gives more detail on how the PHP discoverer was built.

Fig. 2. The PHP metamodel: the specific ASTM for PHP

The following example shows a simple PHP Math class, part of a PHP project, with a static member and a method that adds two values:

class Math {
  // Static member holding the constant
  public static $PI = 3.14159265359;
  // Returns the sum of its two arguments
  public function add($a, $b) {
    return $a + $b;
  }
}

Applying the model discovery process to the example above with our PHP discoverer yields the AST model (conforming to the PHP-specific ASTM), serialized in XMI:

<?xml version="1.0" encoding="ASCII"?>
<php:AST xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:php="http://eclipse.org/gmt/modisco/php/incubation/beta">
  <statement xsi:type="php:ClassDeclaration" modifier="none">
    <identifier name="Math"/>
    <body isCurly="true">
      <statement xsi:type="php:FieldsDeclaration" modifier="public static">
        <field xsi:type="php:SingleFieldDeclaration">
          <variableName xsi:type="php:Variable" isDollared="true">
            <name xsi:type="php:Identifier" name="PI"/>
          </variableName>
          <value xsi:type="php:Scalar" value="3.14159265359"/>
        </field>
      </statement>
      <statement xsi:type="php:MethodDeclaration" modifier="public">
        <identifier name="add"/>
        <body isCurly="true">
          <statement xsi:type="php:ReturnStatement">
            <expression xsi:type="php:InfixExpression" operator="+">
              <left xsi:type="php:Variable" isDollared="true">
                <name xsi:type="php:Identifier" name="a"/>
              </left>
              <right xsi:type="php:Variable" isDollared="true">
                <name xsi:type="php:Identifier" name="b"/>
              </right>
            </expression>
          </statement>
        </body>
        <parameterName xsi:type="php:Variable" isDollared="true">
          <name xsi:type="php:Identifier" name="a"/>
        </parameterName>
        <parameterName xsi:type="php:Variable" isDollared="true">
          <name xsi:type="php:Identifier" name="b"/>
        </parameterName>
      </statement>
    </body>
  </statement>
</php:AST>
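
A quick way to sanity-check a discovered model outside Eclipse is to read the XMI with a generic XML parser. The sketch below is only an illustration: it uses Python's standard xml.etree.ElementTree rather than EMF or any MoDisco API, the helper names statements_of_type and summarize are hypothetical, and the embedded document is a trimmed copy of the listing above with the element nesting assumed as shown.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the discovered model (assumed nesting; namespace URIs
# copied from the listing above). Not produced by running the discoverer.
XMI = """<php:AST xmi:version="2.0"
    xmlns:xmi="http://www.omg.org/XMI"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:php="http://eclipse.org/gmt/modisco/php/incubation/beta">
  <statement xsi:type="php:ClassDeclaration" modifier="none">
    <identifier name="Math"/>
    <body isCurly="true">
      <statement xsi:type="php:FieldsDeclaration" modifier="public static">
        <field xsi:type="php:SingleFieldDeclaration">
          <variableName xsi:type="php:Variable" isDollared="true">
            <name xsi:type="php:Identifier" name="PI"/>
          </variableName>
          <value xsi:type="php:Scalar" value="3.14159265359"/>
        </field>
      </statement>
      <statement xsi:type="php:MethodDeclaration" modifier="public">
        <identifier name="add"/>
      </statement>
    </body>
  </statement>
</php:AST>"""

# ElementTree exposes namespaced attributes as "{namespace-uri}local-name".
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def statements_of_type(parent, type_name):
    """Yield direct <statement> children whose xsi:type equals type_name."""
    for stmt in parent.findall("statement"):
        if stmt.get(XSI_TYPE) == type_name:
            yield stmt


def summarize(xmi_text):
    """Map each class name to the field and method names it declares."""
    root = ET.fromstring(xmi_text)
    summary = {}
    for cls in statements_of_type(root, "php:ClassDeclaration"):
        body = cls.find("body")
        fields = [
            name.get("name")
            for decl in statements_of_type(body, "php:FieldsDeclaration")
            for name in decl.findall("field/variableName/name")
        ]
        methods = [
            method.find("identifier").get("name")
            for method in statements_of_type(body, "php:MethodDeclaration")
        ]
        summary[cls.find("identifier").get("name")] = {
            "fields": fields,
            "methods": methods,
        }
    return summary


if __name__ == "__main__":
    print(summarize(XMI))
```

The xsi:type value is compared as a literal string ("php:ClassDeclaration"), which is enough for a quick inspection but would not resolve a differently named prefix the way an EMF loader does.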

The next step is to generate the UML model corresponding to this PHP AST model. To perform this mapping, we defined a set of model-to-model transformations written in ATL, as depicted in the next figure.

Fig. 3. The php2uml model transformation process

As an example, here are two of the transformation rules: the first initializes the UML model, and the second generates a UML class for each PHP class declaration.

-- Creates the root UML model from the PHP AST.
rule PHPAST2UMLModel {
  from a : PHP!AST
  to m : UML!Model (
    name <- 'default model',
    visibility <- #public,
    packagedElement <- a.program->collect(p | p.statement),
    ownedComment <- a.program->collect(p | p.comment)
  )
}

-- Creates one UML class per PHP class declaration.
rule PHPClassDeclaration2UMLClass {
  from p : PHP!ClassDeclaration
  to u : UML!Class (
    name <- p.identifier.name,
    visibility <- #public,
    isAbstract <- (p.modifier = 'abstract'),
    isFinalSpecialization <- (p.modifier = 'final'),
    ownedAttribute <- p.body.statement->
      select(o | o.oclIsTypeOf(PHP!FieldsDeclaration))->
      collect(s | s.field),
    ownedOperation <- p.body.statement->
      select(o | o.oclIsTypeOf(PHP!MethodDeclaration)),
    generalization <- p.superClass
  )
}

These are only two examples; many other transformation rules are involved in the process.
Applying the full set of ATL transformation rules to the AST model presented above yields the corresponding UML model:

Fig. 4. Generated UML class
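
To make the mapping more concrete, here is a rough plain-Python analogue of the PHPClassDeclaration2UMLClass rule. It is a sketch under several assumptions, not the ATL implementation: the AST is represented as nested dictionaries, the UMLClass, UMLProperty and UMLOperation dataclasses are hypothetical stand-ins for the UML metamodel, and the handling of visibility and static modifiers on members is a guess, since the rules for fields and methods are not shown in this article.

```python
from dataclasses import dataclass, field


@dataclass
class UMLProperty:
    name: str
    visibility: str
    is_static: bool


@dataclass
class UMLOperation:
    name: str
    visibility: str
    parameters: list


@dataclass
class UMLClass:
    name: str
    is_abstract: bool
    is_final_specialization: bool
    owned_attributes: list = field(default_factory=list)
    owned_operations: list = field(default_factory=list)


# Dictionary rendering of the Math class AST (a simplification of the XMI model).
MATH_AST = {
    "type": "ClassDeclaration",
    "modifier": "none",
    "name": "Math",
    "body": [
        {"type": "FieldsDeclaration", "modifier": "public static", "fields": ["PI"]},
        {"type": "MethodDeclaration", "modifier": "public", "name": "add",
         "parameters": ["a", "b"]},
    ],
}


def visibility_of(modifier):
    """Extract the visibility keyword from a modifier string (assumed mapping)."""
    words = modifier.split()
    for keyword in ("public", "protected", "private"):
        if keyword in words:
            return keyword
    return "public"  # PHP members without an explicit visibility are public


def class_to_uml(decl):
    """Mirror the ATL rule: one UML class per PHP class declaration."""
    uml = UMLClass(
        name=decl["name"],
        is_abstract=decl["modifier"] == "abstract",
        is_final_specialization=decl["modifier"] == "final",
    )
    for stmt in decl["body"]:
        if stmt["type"] == "FieldsDeclaration":
            # Equivalent of select(FieldsDeclaration)->collect(s | s.field)
            for field_name in stmt["fields"]:
                uml.owned_attributes.append(UMLProperty(
                    name=field_name,
                    visibility=visibility_of(stmt["modifier"]),
                    is_static="static" in stmt["modifier"].split(),
                ))
        elif stmt["type"] == "MethodDeclaration":
            # Equivalent of select(MethodDeclaration)
            uml.owned_operations.append(UMLOperation(
                name=stmt["name"],
                visibility=visibility_of(stmt["modifier"]),
                parameters=list(stmt["parameters"]),
            ))
    return uml


if __name__ == "__main__":
    print(class_to_uml(MATH_AST))
```

Unlike this imperative loop, ATL rules are declarative: each binding such as ownedOperation refers to source elements, and the engine resolves them to the targets created by other matched rules.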

For now, the transformation covers only the structural aspects of PHP code, but we are also working on discovering behavioural aspects and generating activity diagrams from them.
