14 Oct 2014, GPL3
An introduction to my GENOME-IN-CODE project: virtual cell modelling of the bacterium Xcc8004

Download GCModeller IDE UI Framework

A new fashion in molecular biology research

A new fashion has emerged in molecular research in recent years: many molecular biology researchers now include a computational model in their research articles to explain their experimental data. One example is the articles that use cell system modelling to explain high-throughput experimental data, where the flux balance analysis (FBA) model is the most popular way to represent the metabolic system. This trend comes from the development of computer technology: more and more biology researchers are trained in programming with Perl, R, MATLAB, the Linux shell, or VisualBasic 6, so that they can use existing utilities or write new tools to explain their experimental data.

There is a second phenomenon. Genetic engineering is a way to reprogram a bacterial genome, but that is not the point. The point is: which gene must be modified to actually change the cell phenotype, or to create the phenotype we want? Creating a mutant generally takes several weeks of laboratory work, and the mutant may not be the one we want. So we want a tool that predicts, before we create a mutant, what changes it will make to the cell's functions.

Here is the view of Markus W. Covert, whose group created a platform for simulating a cell in MATLAB:

“Understanding how complex phenotypes arise from individual molecules and their interactions is a primary challenge in biology that computational approaches are poised to tackle.” [^]

This means that biology now faces a big challenge: we need a powerful tool to explain what life is. I think virtual cell technology is one of the best answers to this challenge.

[^] Karr, J. R., et al. (2012). "A whole-cell computational model predicts phenotype from genotype." Cell 150(2): 389-401.


Can we reprogram the genome?

Picture1. What does a cell system really look like?

If we compare cell processes with a .NET program assembly and the way it runs, we find that:
Gene expression is like creating an object instance from specific type information in programming, so a protein enzyme is a class instance of a gene. The expressed proteins then implement a phenotype function by catalyzing metabolic pathways, which is like a method invocation.
So if the architecture of a cell system can be treated as a program assembly, and the cell components are equivalent to object instances in a .NET program, then we can modify the cell's functional processes by modifying the genome information. In fact, traditional genetic engineering is already a genome reprogramming method: we modify the genome to create a mutant, and its cell function changes. This is like modifying the source code and compiling a new assembly. So the DNA sequence is like the binary sequence of the compiled assembly, and molecular experiments in the laboratory are the work of disassembly.
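
The analogy above can be sketched in a few lines of code. The example below is only an illustration, written in Python for brevity rather than in the VisualBasic.NET that GCModeller uses. The names `Gene`, `Enzyme`, and `run_cell`, and the two-step pathway A -> B -> C, are all hypothetical and are not part of GCModeller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Type information stored in the genome (the 'class definition')."""
    locus_tag: str
    substrate: str
    product: str

    def express(self) -> "Enzyme":
        # Gene expression: create an object instance from the type information.
        return Enzyme(gene=self)


@dataclass
class Enzyme:
    """A protein instance created by expressing a gene."""
    gene: Gene

    def catalyze(self, pool: dict) -> None:
        # The 'method invocation': convert one unit of substrate into product.
        if pool.get(self.gene.substrate, 0) > 0:
            pool[self.gene.substrate] -= 1
            pool[self.gene.product] = pool.get(self.gene.product, 0) + 1


def run_cell(genome: list, pool: dict, steps: int) -> dict:
    """Express every gene once, then let the enzymes act for `steps` rounds."""
    enzymes = [gene.express() for gene in genome]
    pool = dict(pool)  # work on a copy so the caller's pool is untouched
    for _ in range(steps):
        for enzyme in enzymes:
            enzyme.catalyze(pool)
    return pool


# Hypothetical two-step pathway A -> B -> C.
wild_type = [Gene("geneA", "A", "B"), Gene("geneB", "B", "C")]
# "Reprogramming the genome": a mutant that lacks geneB.
mutant = [g for g in wild_type if g.locus_tag != "geneB"]

wild_type_pool = run_cell(wild_type, {"A": 5}, steps=10)
mutant_pool = run_cell(mutant, {"A": 5}, steps=10)
```

Under these assumptions the wild type converts all of A into C, while the mutant without geneB accumulates B. The phenotype changes because the "source code" changed.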

One conclusion about the cell system: compared with the way a program runs, the processes in a cell system are most like the threads in a program.

So can we solve a biological problem from a programmer's point of view? For example, I am currently studying signal transduction networks in the laboratory, and I am trying to answer three questions from that research:
  • 1. A disassembly-like problem: how do the HrpG/HrpX and DSF/Rpf systems in Xanthomonas interact with each other to produce the pathogenicity phenotype? The molecular mechanism is still unclear, and even if it were clear, it would be hard to describe because both systems are complex networks.
  • 2. A debugging-like problem: does a mutation in the target gene really affect the bacterial phenotype? And if it does, how does it affect the phenotype?
  • 3. A system-architecture-like problem: there are many two-component systems (TCS) in the signal transduction network, so why was the TCS chosen during evolution? This is interesting because the genome annotation of Xcc8004 shows only one STK (serine/threonine kinase) protein (Gene Id: XC_3631) but many HKs (histidine kinases). Eukaryotes chose the STK and prokaryotes chose the HK, which is an interesting evolutionary outcome.

Picture2. Typical system architecture of a TCS

These questions can be answered both by traditional molecular experiments and by the newly arising computational methods in bioinformatics. I think I may be able to answer them much better in the computational way, so I am trying to build a new tool for my scientific research: GCModeller.
Picture3. Xcc8004 genome circle diagram
Picture4. Real-time visualization of the Xcc8004 virtual cell pathway network, drawn by GCModeller

Introducing GCModeller

With today's technology, answering the questions above with traditional molecular biology procedures may be too difficult and take too much time. Because they are questions about a huge interaction network, computation may be a better choice: we simulate the network and read the answer from the calculation results.

I have created a platform for my whole career

An article was published in Nature Reviews Microbiology in 2011: “Pathogenomics of Xanthomonas: understanding bacterium-plant interactions”. I started my further study after graduating from university in 2012, and when I returned to university an idea came to mind: could we develop a simulation platform to apply to these research problems? I then spent nearly a year coding, from 2013 to 2014, to build the GCModeller platform.

I may, and I am willing to, devote my whole career to research on the bacterium Xcc8004 and the computational analysis of its interaction with its plant hosts. GCModeller is the first step in this scientific career.

Xcc8004 is a plant pathogen, and the simulation platform I am building is meant to represent the whole course of events in its interaction with plant hosts such as Arabidopsis or radish. Currently I can only simulate a prokaryote in GCModeller, because a eukaryotic cell model is harder to build: the cell structure and genome of eukaryotes are more complex than those of prokaryotes. Maybe I can finish this work for my doctoral degree in future years.
Such a huge project will take me years to finish, but it is worth the effort. We all want an easy way to model a cell through programming, and a cell system is like a .NET program assembly. So in the future GCModeller could perhaps offer genetics researchers a programming language with VisualBasic-like syntax for building a cell model. That would be awesome!


Currently available virtual cell simulation platforms

This table lists the virtual cell simulation platforms that I know of:


Platform | Bacterial species in first published article | Bacterial type | Language | Home page
vcell | | | C++ & Java |
simtk | Mycoplasma genitalium | Human pathogen | Matlab |
GCModeller | Xanthomonas campestris pv. campestris str. 8004 | Plant pathogen | VisualBasic.NET | (Unpublished)
“GCModeller” is short for Genome-in-Code Modeller or Genetic Clock Modeller. The goal of the “genome-in-code” project is to create a virtual cell simulation platform that runs on your desktop or server. GCModeller currently supports only bacterial simulation.
All GCModeller components are developed in Visual Studio 2013 using the VisualBasic.NET language, but the source code of every component can easily be converted to C# using SharpDevelop.
Here are some of the reasons why I chose the VisualBasic language:


  1. Because of its English-like syntax and keywords, VisualBasic and its IDE are friendlier to biological researchers. I am also a big fan of the BASIC family of languages: QBASIC, VisualBasic 6, and VisualBasic.NET. I have nearly 7 years’ experience programming in BASIC.
  2. It fully supports object-oriented programming, so we can model the whole cell.
  3. The LINQ syntax in VisualBasic makes object queries easier, instead of so many For loops that bloat the code.
  4. VisualBasic works with the free MySQL database server.
  5. VisualBasic.NET is cross-platform: it can run on Windows, Linux, and Mac using the Mono runtime, not Wine.
  6. Creating a GUI in VisualBasic is easy, and the result looks nice.
  7. Reflection makes it easy to extend the program and to write dynamic code.
  8. Parallel computing is easy in VisualBasic.
  9. Development in VisualBasic is quick and convenient.
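
Reason 3 in the list above, a declarative query instead of an explicit loop, can be illustrated with a small comparison. The sketch below uses Python comprehensions as a rough stand-in for LINQ, not VisualBasic itself, and the `GeneRecord` type and its records are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    locus_tag: str
    family: str   # e.g. "HK" or "STK"
    length: int   # protein length in residues


# Hypothetical annotation records, invented for this example.
records = [
    GeneRecord("g001", "HK", 450),
    GeneRecord("g002", "HK", 320),
    GeneRecord("g003", "STK", 610),
    GeneRecord("g004", "HK", 520),
]

# Loop style: the filtering logic is spread over several statements.
long_kinases_loop = []
for record in records:
    if record.family == "HK" and record.length > 400:
        long_kinases_loop.append(record.locus_tag)
long_kinases_loop.sort()

# Query style: one expression states what is selected, filtered, and ordered.
long_kinases_query = sorted(
    record.locus_tag
    for record in records
    if record.family == "HK" and record.length > 400
)
```

Both versions produce the same list. The query form keeps the selection, the filter, and the ordering in one place, which is the readability benefit the list item describes.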


GCModeller Component List

Here is a list of articles, published on CodeProject, about the implementation details of the GCModeller components. The list will be updated continuously as GCModeller develops:


  • MySQL database server adapter wrapper

This project implements reading and writing of model data on a MySQL database server, and it makes developing the model compiler easier.

  • LINQ Script for querying biological databases

This query script originally derives from the LINQ syntax in the VisualBasic language. You can use the script language to query a local biological database from the GCModeller IDE.


  • * PLAS Metabolism simulation system core

Although we finally use a modified simulation algorithm based on the popular FBA model rather than the PLAS model, I still want to introduce the PLAS model to you. The book about it, “Computational Analysis of Biochemical Systems” by E. O. Voit, was my first introduction to this area of scientific research. All of the ideas in the genome-in-code project come from this book, and it is my favorite book.
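
For readers who have not met power-law models, here is a minimal sketch of the S-system form used in Biochemical Systems Theory, where each variable follows dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. It is written in Python and integrated with the forward Euler method purely for brevity. It is not the GCModeller simulation core, and the two-variable system and all parameter values are hypothetical.

```python
import math


def s_system_rates(x, alpha, g, beta, h):
    """Return dX_i/dt = alpha_i * prod_j X_j**g_ij - beta_i * prod_j X_j**h_ij."""
    rates = []
    for i in range(len(x)):
        production = alpha[i] * math.prod(xj ** gij for xj, gij in zip(x, g[i]))
        degradation = beta[i] * math.prod(xj ** hij for xj, hij in zip(x, h[i]))
        rates.append(production - degradation)
    return rates


def simulate(x0, alpha, g, beta, h, dt=0.01, steps=5000):
    """Integrate with the forward Euler method (chosen only for brevity)."""
    x = list(x0)
    for _ in range(steps):
        dx = s_system_rates(x, alpha, g, beta, h)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


# Hypothetical system:
#   dX1/dt = 2            - X1**0.5
#   dX2/dt = X1**0.5      - X2
alpha = [2.0, 1.0]
g = [[0.0, 0.0], [0.5, 0.0]]
beta = [1.0, 1.0]
h = [[0.5, 0.0], [0.0, 1.0]]

final = simulate([1.0, 1.0], alpha, g, beta, h)
```

With these parameters the steady state can be found by hand: 2 = X1^0.5 gives X1 = 4, and X2 = X1^0.5 = 2. The simulation approaches those values, which makes a quick sanity check for the integrator.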






  • Mathematics calculation engine for the PLAS script

This module supports the evaluation of complex mathematical expressions in the PLAS model.
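
To give an idea of what such an engine does, the sketch below evaluates arithmetic expressions that contain named variables. It is an assumption-laden illustration in Python: it reuses Python's own parser through the standard `ast` module instead of a hand-written parser, it uses `**` for powers, and the whitelist of functions is arbitrary. None of the names come from GCModeller.

```python
import ast
import math
import operator

# Whitelisted operators and functions; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}
_FUNCTIONS = {"sqrt": math.sqrt, "exp": math.exp, "log": math.log}


def evaluate(expression, variables=None):
    """Evaluate an arithmetic expression, looking names up in `variables`."""
    variables = variables or {}

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCTIONS and not node.keywords):
            return _FUNCTIONS[node.func.id](*[walk(arg) for arg in node.args])
        raise ValueError("unsupported expression element: " + ast.dump(node))

    return walk(ast.parse(expression, mode="eval"))


# A rate expression in the power-law style, with hypothetical concentrations.
rate = evaluate("2 * X1 ** 0.5 - X2", {"X1": 4.0, "X2": 1.0})
```

Walking the syntax tree with a whitelist, rather than calling `eval`, means a model script can only use the operators and functions the engine explicitly allows.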


  • GCModeller IDE plugin system

This is a plugin system for the GCModeller IDE. It uses reflection to dynamically load commands into the IDE menu.
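
The general idea of reflection-based command loading can be sketched as follows. This is not the GCModeller plugin API: it is a Python illustration in which the `menu_command` marker, the `load_commands` function, and the demo plugin are all hypothetical, and Python's `inspect` module plays the role that .NET reflection plays in the IDE.

```python
import inspect
import types


def menu_command(path):
    """Mark a function as an IDE menu command (hypothetical marker)."""
    def decorate(func):
        func.menu_path = path
        return func
    return decorate


def load_commands(plugin_module):
    """Use reflection to find every marked function in a plugin module."""
    menu = {}
    for _name, member in inspect.getmembers(plugin_module, inspect.isfunction):
        path = getattr(member, "menu_path", None)
        if path is not None:
            menu[path] = member
    return menu


# A stand-in for a plugin assembly; a real host would load it from disk.
plugin = types.ModuleType("demo_plugin")


@menu_command("Tools/Count Genes")
def count_genes(genes):
    return len(genes)


def helper():
    return "not a command"


plugin.count_genes = count_genes
plugin.helper = helper

menu = load_commands(plugin)
```

The host never names `count_genes` directly. It discovers the command from the marker at load time, so new plugins can add menu entries without changes to the host program.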

  • The data exchange library

CSV is the most commonly used file format in bioinformatics programming with R. Here is the CSV file format wrapper from my project, which is used to exchange data between the genome-in-code program and the R server. The extension methods in this library make my coding job much easier!
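
A typed CSV round trip of the kind described above might look like the sketch below. It uses Python's standard `csv` module rather than the VisualBasic.NET wrapper from the project, and the `ExpressionRow` type and its values are invented for the example. A file written this way can be read on the R side with `read.csv`.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ExpressionRow:
    locus_tag: str
    time: float
    level: float


def write_csv(rows, stream):
    """Write typed rows as CSV with a header line."""
    names = [f.name for f in fields(ExpressionRow)]
    writer = csv.DictWriter(stream, fieldnames=names)
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))


def read_csv(stream):
    """Read the CSV back into typed rows, converting the numeric columns."""
    return [
        ExpressionRow(r["locus_tag"], float(r["time"]), float(r["level"]))
        for r in csv.DictReader(stream)
    ]


# Hypothetical simulation output.
rows = [ExpressionRow("g001", 0.0, 1.5), ExpressionRow("g002", 0.5, 2.25)]

buffer = io.StringIO(newline="")
write_csv(rows, buffer)
header = buffer.getvalue().splitlines()[0]

buffer.seek(0)
restored = read_csv(buffer)
```

Mapping each row to a typed record keeps column names and value types in one place, so the simulation code and the analysis code agree on the same schema.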

  • VisualBasic ShellScript for systems biology

Here is a new kind of script language, originally developed in VisualBasic.NET, for systems biology research. Many functions are included in the "genome-in-code" project API library, such as experiment data analysis and data visualization.


Xcc8004 genetic clock diagram

The genetic clock is the most interesting phenomenon in genome-wide regulation in bacteria. GCModeller is short for Genetic Clock Modeller or Genome-in-Code Modeller, and this virtual cell platform originally aimed at studying this phenomenon.




Article about genetic clock research in Nature


GCModeller IDE

We have finished developing the GCModeller simulation engine kernel, and the engine is now under laboratory testing. The compiled assemblies of the whole project will be released on Google Code and CodeProject before our research article is published. GCModeller has a GUI designed for genetics and molecular biology researchers; I finished developing the GUI framework in February this year and have uploaded it here to share with you.

Picture6. GCModeller IDE screenshot

The IDE has a Visual Studio 2010-like GUI. It is based on WinForms rather than WPF so that it can run on Linux: in my testing, WinForms is more compatible than WPF with the Linux desktop environment under the Mono runtime. Although the IDE can run on Linux, Mono's WinForms implementation has many bugs, so the IDE is currently unstable there and you will not get the best experience. Maybe these problems will be solved in a future version of the Mono runtime.

Special thanks

The base Form of the GCModeller IDE originally comes from here:

The interaction between the genome-in-code program and the R server is handled by RDotNET.


This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)


About the Author

Mr. xieguigang 谢桂纲
Student, Guangxi University
United States
A student majoring in Genetics, doing bioinformatics programming, molecular biology, and microbial genetics research. Interested in data mining of bioinformatics data, and would like to work at Google. He is now working hard on laboratory experiments for his first research article, about the analysis of the signal transduction network in the bacterium Xanthomonas campestris pathovar campestris 8004.

Last Updated 14 Oct 2014
Article Copyright 2014 by Mr. xieguigang 谢桂纲