Question
Consider the following portion of a multiple sequence alignment.
a) What is the consensus sequence for this alignment? Use the following algorithm.
If more than 50% of the sequences have a gap character, the consensus character is '-'. Otherwise …
If two or more non-gap characters are tied in terms of being most frequent, the consensus character is 'n'. Otherwise …
The most frequently occurring non-gap character will be the consensus character. If it occurs in at least 50% of the sequences, it appears as a capital letter in the consensus sequence.
If the most frequently occurring character occurs in less than 50% of the sequences, it appears as a lowercase letter in the consensus sequence.
Note that this differs from the algorithm used by the EMBOSS showalign program.
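The consensus rules above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names `consensus_char` and `consensus` are made up for this sketch, and the `toy` alignment is a hypothetical example, not the alignment from the question.

```python
from collections import Counter
from typing import Sequence

def consensus_char(column: str, gap: str = "-") -> str:
    """Consensus character for one alignment column, following the rules above."""
    n = len(column)
    # Rule 1: more than 50% of the sequences have a gap here.
    if column.count(gap) * 2 > n:
        return gap
    # Count non-gap characters (upper-cased, assuming case is not significant).
    ranked = Counter(c.upper() for c in column if c != gap).most_common()
    top_char, top_count = ranked[0]
    # Rule 2: two or more non-gap characters tie for most frequent.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return "n"
    # Rules 3 and 4: capital if present in at least 50% of sequences, else lowercase.
    return top_char if top_count * 2 >= n else top_char.lower()

def consensus(alignment: Sequence[str]) -> str:
    """Consensus sequence for equal-length aligned sequences."""
    return "".join(consensus_char("".join(col)) for col in zip(*alignment))

# Hypothetical toy alignment (NOT the one from the question).
toy = ["-TAC",
       "ATAG",
       "TTC-",
       "GTC-",
       "GAG-"]
print(consensus(toy))  # gTn-
```

In the toy alignment, column 1 has G in only 2 of 5 sequences (lowercase g), column 2 has T in 4 of 5 (capital T), column 3 is a tie between A and C (n), and column 4 is 60% gaps (-).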
b) What is the sum-of-pairs score for the alignment? Use the scoring function:
score(a,b) = +1 if a=b
score(a,b) = -1 if a≠b (including when exactly one of a or b is a gap)
score(-,-) = 0
For this question give your intermediate results as well as your final answer.
Explanation / Answer
a) The consensus sequence for the given alignment (as per the algorithm provided) will be:
gTCnCCnTT--
b) The sum-of-pairs (SP) score for the given alignment is calculated column by column. Each column of the five aligned sequences contributes 10 pairwise scores:
SP for column 1:
score(-,A) + score(-,T) + score(-,G) + score(-,G) + score(A,T) + score(A,G) + score(A,G) + score(T,G) + score(T,G) + score(G,G)
= (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + 1
= -9 + 1
= -8
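As a cross-check on the pair-by-pair sum, a column score can also be obtained by counting rather than enumerating pairs: a character that occurs c times contributes c(c-1)/2 matching pairs, gap/gap pairs score 0, and every remaining pair scores -1. The sketch below assumes this counting shortcut; the function name `column_sp_by_counts` is hypothetical. The column string "-ATGG" is the set of column 1 characters taken from the pair list above.

```python
from collections import Counter
from math import comb

def column_sp_by_counts(column: str, gap: str = "-") -> int:
    """Column sum-of-pairs score computed from character counts."""
    counts = Counter(column)
    total_pairs = comb(len(column), 2)
    gap_pairs = comb(counts.get(gap, 0), 2)            # gap/gap pairs score 0
    matches = sum(comb(c, 2) for ch, c in counts.items() if ch != gap)
    mismatches = total_pairs - gap_pairs - matches     # each scores -1
    return matches - mismatches

# Column 1: one gap, A, T, and two Gs.
print(column_sp_by_counts("-ATGG"))  # 1 match - 9 mismatches = -8
```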
Similarly, we calculate the SP score for the remaining columns:
SP for column 2: 1 + 1 + 1 + (-1) + 1 + 1 + (-1) + 1 + (-1) + (-1) = 6 - 4 = 2
SP for column 3: (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + 1 + 1 + 1 = -7 + 3 = -4
SP for column 4: 1 + (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + (-1) + 1 = -8 + 2 = -6
SP for column 5: 1 + 1 + (-1) + 1 + 1 + (-1) + 1 + (-1) + 1 + (-1) = 6 - 4 = 2
SP for column 6: 1 + (-1) + 1 + (-1) + (-1) + 1 + (-1) + (-1) + 1 + (-1) = -6 + 4 = -2
SP for column 7: (-1) + 1 + (-1) + (-1) + (-1) + (-1) + 1 + (-1) + (-1) + (-1) = -8 + 2 = -6
SP for column 8: 1 + 1 + 1 + (-1) + 1 + 1 + (-1) + 1 + (-1) + (-1) = 6 - 4 = 2
SP for column 9: 1 + 1 + 1 + (-1) + 1 + 1 + (-1) + 1 + (-1) + (-1) = 6 - 4 = 2
SP for column 10: 0 + (-1) + 0 + (-1) + (-1) + 0 + (-1) + (-1) + 1 + (-1) = -6 + 1 = -5
SP for column 11: 0 + (-1) + 0 + 0 + (-1) + 0 + 0 + (-1) + (-1) + 0 = -4
So, the final SP score for the alignment is:
(-8) + 2 + (-4) + (-6) + 2 + (-2) + (-6) + 2 + 2 + (-5) + (-4) = -35 + 8 = -27
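The final addition can be double-checked with a few lines of Python. The list below simply restates the eleven column scores from the working above; the variable names are arbitrary.

```python
# Column scores for columns 1 to 11, as computed above.
column_scores = [-8, 2, -4, -6, 2, -2, -6, 2, 2, -5, -4]

negatives = sum(s for s in column_scores if s < 0)
positives = sum(s for s in column_scores if s > 0)
total = sum(column_scores)

print(negatives, positives, total)  # -35 8 -27
```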