
An easy and fast way to analyze (reverse-engineer) any PHP CMS or framework

23 Jul 2013, CPOL

Introduction

I would like to share my approach to analyzing the code of an unfamiliar CMS or framework, hear your opinion on it, and find out how, and with what tools, others do the same.

Experiment 

First of all, the proposed method and tools help me solve the problems that come up when analyzing an unfamiliar CMS from scratch. These include:

  1. Locating the HTML template file(s) that define the layout of the output sent to the browser.
  2. Identifying the URL routing mechanism, that is, how the system decides from the URL which content to display and in what form. In short, the controller in the MVC pattern.
  3. Identifying how the main content is assembled from modules and how the database is used, that is, everything the model is usually responsible for in the MVC pattern.
  4. Determining how the entire HTML page is assembled before it is output, that is, the view in the MVC pattern.
  5. Documentation and API creation.

These are, in my opinion, the basic problems that arise when creating or reworking PHP projects. Of course, this is a generalization of more specific tasks, and item 5 probably comes up less often than the others.
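As an illustration of task 1, here is a minimal Python sketch that lists the template files pulled in during a request. It is not part of the original toolchain. It assumes the trace was written in Xdebug's tab-separated "computerized" format (xdebug.trace_format=1), where an entry record carries the function name in the sixth field and the included file name in the eighth. The function name included_files, the ".tpl.php" suffix and all paths in the sample are invented for the example.

```python
INCLUDE_CALLS = {"include", "include_once", "require", "require_once"}


def included_files(trace_lines, suffix=".tpl.php"):
    """Return files loaded via include/require whose path ends with suffix.

    Assumes Xdebug's tab-separated trace format: entry records have '0'
    in the third field, the function name in the sixth field and the
    included file name in the eighth field.
    """
    found = []
    for line in trace_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip header lines, exit records and return records.
        if len(fields) < 10 or fields[2] != "0":
            continue
        function_name, included = fields[5], fields[7]
        if function_name in INCLUDE_CALLS and included.endswith(suffix):
            if included not in found:
                found.append(included)
    return found


# Hypothetical trace excerpt; the paths and numbers are made up.
SAMPLE_TRACE = [
    "Version: 2.2.3",
    "3\t10\t0\t0.0100\t500000\trequire_once\t1\t/var/www/includes/bootstrap.inc\t/var/www/index.php\t19\t0",
    "3\t10\t1\t0.0120\t510000",
    "7\t250\t0\t0.2000\t900000\tinclude\t1\t/var/www/modules/node/node.tpl.php\t/var/www/includes/theme.inc\t1525\t0",
    "7\t250\t1\t0.2010\t900500",
    "6\t260\t0\t0.2100\t910000\tinclude\t1\t/var/www/themes/example/page.tpl.php\t/var/www/includes/theme.inc\t1525\t0",
    "6\t260\t1\t0.2110\t910500",
]

for path in included_files(SAMPLE_TRACE):
    print(path)
```

With a real trace, the lines would come from the trace file instead of SAMPLE_TRACE, and the suffix would be whatever extension the CMS under study uses for its templates.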

To solve these problems I usually use a PHP IDE (Eclipse, although in principle any will do), the Xdebug debugger, and Recognizer, a dedicated service for visualizing Xdebug trace files. Recognizer is useful because it builds a convenient, workable sequence diagram from a trace log.

I find it easier to work out how code operates by going from the general to the specific, so the general procedure is as follows:

  • I go through the call tree in the diagram from top to bottom (bottom-up also works, of course);
  • I look at the arguments and return values of the large calls (those that contain many other calls) at the upper levels, closer to {main};
  • If the arguments or return values contain something important for understanding how the system works, I go deeper into the tree and study the arguments and return values at those levels. They usually carry enough information to understand what a given class or function is responsible for;
  • I check my guesses simply by commenting out the line(s) where a particular call is made and looking at the resulting output or PHP error output;
  • Then, for a more detailed analysis, I study the code or run it through a debugger and look at the values of the variables.
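The first two steps of this procedure can also be approximated without a diagram. The following Python sketch is only an illustration under stated assumptions, not a description of how Recognizer works. It assumes a trace in Xdebug's tab-separated format (xdebug.trace_format=1), with the nesting level in the first field, '0' in the third field for entry records and the function name in the sixth. The names Call, build_call_tree and largest_calls are hypothetical, and the sample trace is invented.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    name: str
    depth: int
    children: list = field(default_factory=list)

    def size(self):
        """Number of calls nested anywhere below this one."""
        return sum(1 + child.size() for child in self.children)


def build_call_tree(trace_lines):
    """Rebuild the call tree from the entry records of a trace."""
    root = Call("(trace)", 0)
    stack = [root]
    for line in trace_lines:
        fields = line.rstrip("\n").split("\t")
        # Keep only entry records: numeric level, '0' marker, function name.
        if len(fields) < 6 or fields[2] != "0" or not fields[0].isdigit():
            continue
        depth = int(fields[0])
        # Drop finished calls until the top of the stack is the parent.
        while len(stack) > 1 and stack[-1].depth >= depth:
            stack.pop()
        call = Call(fields[5], depth)
        stack[-1].children.append(call)
        stack.append(call)
    return root


def largest_calls(root, max_depth=2, top=5):
    """Return (size, name) for the biggest calls close to {main}."""
    found = []

    def walk(call):
        for child in call.children:
            if child.depth <= max_depth:
                found.append((child.size(), child.name))
                walk(child)

    walk(root)
    return sorted(found, reverse=True)[:top]


# Hypothetical trace excerpt; the paths and numbers are made up.
SAMPLE_TRACE = [
    "1\t0\t0\t0.0003\t240000\t{main}\t1\t\t/var/www/index.php\t0\t0",
    "2\t1\t0\t0.0100\t400000\tmodule_load_all\t1\t\t/var/www/includes/bootstrap.inc\t100\t0",
    "3\t2\t0\t0.0110\t410000\tsystem_list\t1\t\t/var/www/includes/module.inc\t20\t1\t'module_enabled'",
    "3\t2\t1\t0.0120\t420000",
    "3\t3\t0\t0.0130\t430000\tdrupal_load\t1\t\t/var/www/includes/module.inc\t25\t2\t'module'\t'node'",
    "3\t3\t1\t0.0140\t440000",
    "2\t1\t1\t0.0150\t450000",
    "2\t4\t0\t0.0200\t500000\tmenu_execute_active_handler\t1\t\t/var/www/index.php\t21\t0",
    "3\t5\t0\t0.0210\t510000\tnode_load_multiple\t1\t\t/var/www/modules/node/node.module\t30\t0",
    "3\t5\t1\t0.0220\t520000",
    "2\t4\t1\t0.0230\t530000",
    "1\t0\t1\t0.0300\t540000",
]

tree = build_call_tree(SAMPLE_TRACE)
for size, name in largest_calls(tree):
    print(f"{name}: {size} nested calls")
```

Sorting by the number of nested calls is just one possible reading of "large calls". Sorting by elapsed time from the fourth field would be another reasonable choice.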

For some people, reading the source code or stepping through it in a debugger may be enough, but for me the trace diagram is what helps most to form a picture of the system in my head, and it has become a really convenient way of analysis.

As an example, here are the results of a cursory analysis (15-20 minutes) of the Drupal CMS, which was completely unfamiliar to me. For clarity, I added one article to the site. By default Drupal displays its preview on the main page and creates a separate menu item for viewing it in full. I did nothing else through the admin interface.

I will list the names of the functions and methods and what they are responsible for. If you wish, you can repeat my experiment to evaluate my technique. All the files are attached at the end of the article.

  • drupal_valid_http_host - determines and returns the host address for later substitution into the template and elsewhere;
  • request_path - determines the URL of the requested page for later use in routing and in choosing the page to output;
  • drupal_settings_initialize - locates the main site configuration file, as can be seen from the returned path;
  • _drupal_bootstrap_variable - loads the site's main variables;
  • variable_initialize - loads the site configuration (for example, the theme and other site settings). Part of the settings comes from the database, as can be seen from the database queries passed as arguments to the internal calls;
  • drupal_session_initialize - initializes the session;
  • drupal_page_headers - is responsible for working with headers in general, as is evident from its internal calls in the diagram;
  • drupal_send_headers - sends some of the headers, as is evident from the arguments;
  • module_load_all - loads all the required modules;
  • system_list - obtains the paths to the modules for subsequent loading, as can be seen from the returned paths, and returns an array with the list of modules to load;
  • drupal_load - loads the modules on the list, including the system and node modules, which are the essential ones: if their loading is commented out, the site stops working;
  • menu_execute_active_handler - the controller and model logic runs inside it;
  • call_user_func_array - determines which content to include and returns the theme template configuration in its return value;
  • node_load_multiple - determines which article(s) to fetch from the database and returns their configuration;
  • node_view_multiple - determines how these items should be ordered on the page;
  • drupal_deliver_page - is responsible for building all the content;
  • drupal_add_http_header - adds a header to the response;
  • element_info - determines the configuration (name and other parameters) of the article output and returns it as an array;
  • block_page_build - defines the configuration of the elements, that is, the variables used by the template;
  • theme - gathers everything together, and the path to the template is used here.
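Most of the observations in this list come from looking at arguments and return values. As a rough illustration, the Python sketch below collects both for a single function from a trace. It assumes Xdebug's tab-separated format (xdebug.trace_format=1) with parameter and return value collection switched on: entry records ('0' in the third field) list their arguments from the twelfth field onward, and return records ('R' in the third field) carry the value in the sixth field. The helper name calls_with_values and every value in the sample are invented for the example.

```python
def calls_with_values(trace_lines, function_name):
    """Collect arguments and return values for every call of one function.

    Assumes Xdebug's tab-separated trace format: the second field is the
    function number, which links an entry record to its return record.
    """
    pending = {}   # function number -> record still waiting for its return
    results = []
    for line in trace_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # header or footer line
        number, kind = fields[1], fields[2]
        if kind == "0" and len(fields) > 5 and fields[5] == function_name:
            record = {"args": fields[11:], "return": None}
            pending[number] = record
            results.append(record)
        elif kind == "R" and number in pending:
            value = fields[5] if len(fields) > 5 else ""
            pending.pop(number)["return"] = value
    return results


# Hypothetical trace excerpt; the paths and values are made up.
SAMPLE_TRACE = [
    "3\t119\t0\t0.0400\t690000\trequest_path\t1\t\t/var/www/includes/bootstrap.inc\t50\t0",
    "3\t119\t1\t0.0410\t691000",
    "3\t119\tR\t\t\t'node/1'",
    "4\t120\t0\t0.0500\t700000\tsystem_list\t1\t\t/var/www/includes/module.inc\t20\t1\t'module_enabled'",
    "4\t120\t1\t0.0510\t705000",
    "4\t120\tR\t\t\tarray ('node' => 'modules/node/node.module')",
]

for call in calls_with_values(SAMPLE_TRACE, "system_list"):
    print("args:", call["args"])
    print("return:", call["return"])
```

Matching records by function number rather than by position keeps the pairing correct even when other calls are nested between the entry and the return.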

This is the result of just a quick look at the diagram. You can go on studying it in more detail and discover a lot more that is useful, and for the most detailed analysis read the source code and use the debugger.

I attach the trace log file that I analyzed, so that you can repeat my experiment and evaluate the method, as well as an archive with the CMS (installation instructions are inside).

P.S. A reminder: the trace log must be generated with the php.ini parameters described here

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Кирилл Терентьев

Russian Federation

Article Copyright 2013 by Кирилл Терентьев