Click here to Skip to main content
15,885,366 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
I'm writing a code that identifies the highest-scoring local alignment between a substring X' of X and a substring Y' of Y that has at least one column in which a C from X' is aligned to a W from Y'.

Example: if X=:HCEA and Y=:HWEA, then some of the local alignments with a column that aligns a C to a W are :

CE
WE

HCE
HWE

H-CE
-HW-

I need to compute all these possibilities and find the one which has a maximum alignment score.

My Solution:

I've taken the above 2 sequences as inputs and tried to solve the problem. Here is my code. I myself know that I haven't fulfilled the question because I've identified the matching substring alone. Can anyone tell me how should I improve my code to get the desired result.

public class Q1 {

    public static void main(String[] args) {
        //  Input Protein Sequences
        String seq1 = "HCEA";
        String seq2 = "HWEA";

        //  Array to store the score
        int[][] T = new int[seq1.length() + 1][seq2.length() + 1];

        //  initialize seq1
        for (int i = 0; i <= seq1.length(); i++) {
            T[i][0] = i;
        }

        //  Initialize seq2
        for (int i = 0; i <= seq2.length(); i++) {
            T[0][i] = i;
        }

        //  Compute the matrix score
        for (int i = 1; i <= seq1.length(); i++) {
            for (int j = 1; j <= seq2.length(); j++) {
                if ((seq1.charAt(i - 1) == seq2.charAt(j - 1))
                        || (seq1.charAt(i - 1) == 'C') && (seq2.charAt(j - 1) == 'W')) {
                    T[i][j] = T[i - 1][j - 1];
                } else {
                    T[i][j] = Math.min(T[i - 1][j], T[i][j - 1]) + 1;
                }
            }
        }

        //  Strings to store the aligned sequences
        StringBuilder alignedSeq1 = new StringBuilder();
        StringBuilder alignedSeq2 = new StringBuilder();

        //  Build for sequences 1 & 2 from the matrix score
        for (int i = seq1.length(), j = seq2.length(); i > 0 || j > 0;) {
            if (i > 0 && T[i][j] == T[i - 1][j] + 1) {
                alignedSeq1.append(seq1.charAt(--i));
                alignedSeq2.append("-");
            } else if (j > 0 && T[i][j] == T[i][j - 1] + 1) {
                alignedSeq2.append(seq2.charAt(--j));
                alignedSeq1.append("-");
            } else if (i > 0 && j > 0 && T[i][j] == T[i - 1][j - 1]) {
                alignedSeq1.append(seq1.charAt(--i));
                alignedSeq2.append(seq2.charAt(--j));
            }
        }

        //  Display the aligned sequence
        System.out.println(alignedSeq1.reverse().toString());
        System.out.println(alignedSeq2.reverse().toString());
    }
}

A post states that, dynamic programming could solve this question in a better way. Also is there any other way to solve the question optimally? If its available can anyone post a pseudo code or an algorithm, so that I'll be able to code for it.
Posted
Updated 28-Feb-14 23:37pm
v2

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900