A fractal is "a rough or fragmented geometric shape that can be split into parts, each of which is (at least approximately) a reduced-size copy of the whole," a property called self-similarity. The roots of the mathematically rigorous treatment of fractals can be traced back to functions studied by Karl Weierstrass, Georg Cantor and Felix Hausdorff that were continuous but not differentiable; however, the term fractal was coined by Benoît Mandelbrot in 1975 and was derived from the Latin fractus, meaning "broken" or "fractured." A mathematical fractal is based on an equation that undergoes iteration, a form of feedback based on recursion.
A fractal often has the following features:
- It has a fine structure at arbitrarily small scales.
- It is too irregular to be easily described in traditional Euclidean geometric language.
- It is self-similar (at least approximately or stochastically).
- It has a Hausdorff dimension which is greater than its topological dimension (although this requirement is not met by space-filling curves such as the Hilbert curve).
- It has a simple and recursive definition.
Intuitively, nonrandomness means that a sequence has "structure." More mathematically, it means a nonuniformity of subsequences. If a sequence of numbers is used to produce an attractor, and that attractor has visually observable structure, then we have revealed some underlying structure in the sequence of numbers, or some nonuniformity of subsequences. The Sierpinski triangle algorithm is driven by random choices, but if it is controlled by a poor (not very random) random number generator, the attractor displays nonuniformity (i.e., visually observable subsets or features).
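To make this concrete, here is a minimal sketch of the chaos game for the Sierpinski triangle, run once with Python's standard generator and once with a deliberately bad one. The vertex coordinates, the starting point, the cyclic "poor generator," and the helper names `chaos_game` and `occupied_cells` are all my own assumptions for illustration; they do not come from the text above. The cyclic generator is an extreme case: it is so nonrandom that the points collapse onto a tiny subset instead of filling the triangle.

```python
import random

# Assumed triangle vertices (any non-degenerate triangle would do).
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]

def chaos_game(n_points, choose_vertex, start=(0.25, 0.25)):
    """Repeatedly jump halfway toward the vertex picked by choose_vertex()."""
    x, y = start
    points = []
    for _ in range(n_points):
        vx, vy = VERTICES[choose_vertex()]
        x, y = (x + vx) / 2.0, (y + vy) / 2.0
        points.append((x, y))
    return points

def occupied_cells(points, grid=32):
    """Count how many cells of a grid x grid raster contain at least one point."""
    cells = set()
    for px, py in points:
        col = min(int(px * grid), grid - 1)
        row = min(int(py * grid), grid - 1)
        cells.add((col, row))
    return len(cells)

# Reasonable generator: seeded Mersenne Twister from the standard library.
rng = random.Random(0)
good = chaos_game(5000, lambda: rng.randrange(3))

# Hypothetical "poor" generator: it just cycles 0, 1, 2, 0, 1, 2, ...
state = {"i": -1}
def cyclic_choice():
    state["i"] = (state["i"] + 1) % 3
    return state["i"]

poor = chaos_game(5000, cyclic_choice)

print("cells hit with a good generator:", occupied_cells(good))
print("cells hit with a poor generator:", occupied_cells(poor))
```

Counting occupied raster cells is only a crude stand-in for looking at the picture, but it shows the same thing: structure in the number sequence shows up as structure in the attractor.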
Since a DNA sequence can be treated formally as a string composed of the four letters "a," "c," "g," and "t" (or "u"), it is an obvious candidate for testing the chaos game representation (CGR) to see whether visually interesting features are in fact present.
Specifically, we experimented with several DNA sequences, as follows: instead of "rolling a 4-sided die" (the random choice we used when creating the Sierpinski triangle), use the next base (a, c, g, t/u) to pick the next point. Each of the four corners of the square is labelled "a," "c," "g," or "t/u"; if a "c," for example, is the next base, then a point is plotted halfway between the previous point and the "c" corner.
Example: The first 6 bases of the GenBank sequence HUMHBB (human beta globin region, chromosome 11) are "gaattc":
- The first "g" is plotted halfway between the center of the square and the "g" corner.
- The next base, "a," is plotted halfway between the point just plotted and the "a" corner.
- The second "a" is plotted halfway between the previous point and the "a" corner.
- Next, "t" is plotted halfway between the previous point and the "t" corner, etc.
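The steps above can be sketched in a few lines of Python. The corner layout (a bottom-left, c top-left, g top-right, t/u bottom-right on the unit square) is an assumption; the text only says that each corner carries one label. The function name `cgr_points` and the choice to skip unknown symbols are also mine.

```python
# Assumed corner layout on the unit square; the text does not fix it.
CORNERS = {
    "a": (0.0, 0.0),
    "c": (0.0, 1.0),
    "g": (1.0, 1.0),
    "t": (1.0, 0.0),
}
CORNERS["u"] = CORNERS["t"]  # RNA sequences use "u" in place of "t"

def cgr_points(sequence, start=(0.5, 0.5)):
    """Return the CGR point for each base, starting from the square's center."""
    x, y = start
    points = []
    for base in sequence.lower():
        if base not in CORNERS:
            continue  # skip symbols other than a/c/g/t/u (a choice made here)
        cx, cy = CORNERS[base]
        # Move halfway from the previous point to the corner of this base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points

# First six bases of HUMHBB, as given in the example.
for base, (x, y) in zip("gaattc", cgr_points("gaattc")):
    print(base, x, y)
```

With this layout the first "g" lands at (0.75, 0.75), halfway between the center and the "g" corner, and each later point depends on every base that came before it.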
Plotting these six bases, we obtain Fig. 1.
Fig. 1. CGR of the first six bases of HUMHBB.
As with the initial points of the Sierpinski triangle, little significance is visible. However, if we continue for the entire 73,357 bases of HUMHBB, we obtain Fig. 2.
Fig. 2. CGR of human beta globin region on chromosome 11 (HUMHBB) (73,357 bases).
The CGR of HUMHBB is a good example of the point of this visualization technique, for it illustrates a number of the characteristics of CGRs in general.
Maybe you are asking yourself what this algorithm is for. I am going to use it for fingerprint recognition, and I will post that here soon. ;)
Chaos and Fractals - A Computer Graphical Journey. Ten Year Compilation of Advanced Research