src/clustal/hhalign_wrapper.c File Reference

#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include "seq.h"
#include "squid/squid.h"
#include <limits.h>
#include <strings.h>
#include <stdarg.h>
#include <stdio.h>
#include "util.h"
#include "seq.h"
#include "squid/stopwatch.h"
#include "hhalign/general.h"
#include "hhalign/hhfunc.h"
#include "hhalign/hhalign.h"

Defines

#define APPLY_BG_HMM_UP_TO_TREE_DEPTH   10
#define TIMING   0
#define TRACE   0
#define NOTX   'N'

Functions

void SetDefaultHhalignPara (hhalign_para *prHhalignPara)
 FIXME.
void SanitiseUnknown (mseq_t *mseq)
 get rid of unknown residues
void TranslateUnknown2Ambiguity (mseq_t *mseq)
 translate unknown residues back to ambiguity codes. hhalign translates ambiguity codes (B, Z) into the unknown residue (X). Because the original (un-aligned) residue information is still available, iterating along the original and aligned sequences in parallel shows where codes have been changed, so they can be restored to their original values
void ReAttachLeadingGaps (mseq_t *prMSeq, int iProfProfSeparator)
 re-attach leading and trailing gaps to alignment
void PrepareAlignment (mseq_t *mseq, char **ppcProfile1, char **ppcProfile2, double *pdWeightsL, double *pdWeightsR, double *pdSeqWeights, int iLeafCountL, int *piLeafListL, int iLeafCountR, int *piLeafListR)
 reallocate enough memory for alignment and attach sequence pointers to profiles
double HHalignWrapper (mseq_t *prMSeq, int *piOrderLR, double *pdSeqWeights, int iNodeCount, hmm_light *prHMMList, int iHMMCount, int iProfProfSeparator, hhalign_para rHhalignPara)
 wrapper for hhalign. This is a frontend function to the ported hhalign code.

Define Documentation

#define APPLY_BG_HMM_UP_TO_TREE_DEPTH   10
#define NOTX   'N'
#define TIMING   0
#define TRACE   0

Function Documentation

double HHalignWrapper ( mseq_t *  prMSeq,
int *  piOrderLR,
double *  pdSeqWeights,
int  iNodeCount,
hmm_light *  prHMMList,
int  iHMMCount,
int  iProfProfSeparator,
hhalign_para  rHhalignPara 
)

wrapper for hhalign. This is a frontend function to the ported hhalign code.

Parameters:
[in,out] prMSeq holds the unaligned sequences [in] and the final alignment [out]
[in] piOrderLR holds order in which sequences/profiles are to be aligned, even elements specify left nodes, odd elements right nodes, if even and odd are same then it is a leaf
[in] pdSeqWeights Weight per sequence. No weights used if NULL
[in] iNodeCount number of nodes in tree, piOrderLR has 2*iNodeCount elements
[in] prHMMList List of background HMMs (transition/emission probabilities)
[in] iHMMCount Number of input background HMMs
[in] iProfProfSeparator Gives the number of sequences in the first profile, if in profile/profile alignment mode (iNodeCount==3). This assumes that prMSeq holds the sequences of profile 1 and profile 2.
[in] rHhalignPara various parameters read from commandline
Returns:
score of the alignment. FIXME: what is this?
Note:
Complex function. It could use some simplification, more documentation, and a struct'uring of piOrderLR.
HHalignWrapper can be entered in two different ways: (i) all sequences are un-aligned, or (ii) there are two (aligned) profiles. In the un-aligned case (i) the sequences come straight from Squid, that is, they have been sanitised: all non-alphabetic residues have been rendered as X's. In profile mode (ii) one profile may have been produced internally. In that case residues may have been translated back into their 'native' form, that is, they may contain un-sanitised residues. These will cause trouble during alignment.
: introduced argument hhalign_para rHhalignPara; FS, r240 -> r241
: if hhalign() fails then try with Viterbi by setting MAC-RAM=0; FS, r241 -> r243

Translate back ambiguity residues: hhalign translates ambiguity codes (B, Z) into unknown residues (X). As we still have the original input, we can substitute them back.
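
The piOrderLR layout described in the parameter list above (2*iNodeCount elements, even index = left node, odd index = right node, equal values = leaf) can be sketched as follows. This is only an illustration of the documented layout: IsLeafNode and CountLeaves are hypothetical helpers and are not part of hhalign_wrapper.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not in hhalign_wrapper.c): a node is a leaf when
 * its left entry (even index) equals its right entry (odd index). */
static bool IsLeafNode(const int *piOrderLR, int iNode)
{
    assert(piOrderLR != NULL && iNode >= 0);
    return piOrderLR[2 * iNode] == piOrderLR[2 * iNode + 1];
}

/* Hypothetical helper: count the leaves among iNodeCount nodes.
 * piOrderLR is expected to hold 2*iNodeCount elements. */
static int CountLeaves(const int *piOrderLR, int iNodeCount)
{
    int iLeaves = 0;
    for (int i = 0; i < iNodeCount; i++) {
        if (IsLeafNode(piOrderLR, i)) {
            iLeaves++;
        }
    }
    return iLeaves;
}
```

For a three-node array such as {0,0, 1,1, 0,1} (illustrative values only), the first two nodes are leaves and the third is an internal node that joins a left and a right child.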

void PrepareAlignment ( mseq_t *  mseq,
char **  ppcProfile1,
char **  ppcProfile2,
double *  pdWeightsL,
double *  pdWeightsR,
double *  pdSeqWeights,
int  iLeafCountL,
int *  piLeafListL,
int  iLeafCountR,
int *  piLeafListR 
)

reallocate enough memory for alignment and attach sequence pointers to profiles

Parameters:
[in,out] mseq sequence/profile data, increase memory for sequences in profiles
[out] ppcProfile1 pointers to sequences in 1st profile
[out] ppcProfile2 pointers to sequences in 2nd profile
[out] pdWeightsL weights (normalised to 1.0) for sequences in left profile
[out] pdWeightsR weights (normalised to 1.0) for sequences in right profile
[in] pdSeqWeights weights for _all_ sequences in alignment
[in] iLeafCountL number of sequences in 1st profile
[in] piLeafListL array of integer IDs of sequences in 1st profile
[in] iLeafCountR number of sequences in 2nd profile
[in] piLeafListR array of integer IDs of sequences in 2nd profile
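
The relationship between pdSeqWeights (weights for all sequences) and pdWeightsL / pdWeightsR (per-profile weights normalised to 1.0) can be sketched as below. NormaliseLeafWeights is a hypothetical helper written for illustration; the actual implementation in PrepareAlignment may differ, and the sketch assumes that pdSeqWeights is non-NULL and that the selected weights have a positive sum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in hhalign_wrapper.c): pick the weights of the
 * sequences listed in piLeafList out of pdSeqWeights, store them in pdOut,
 * and rescale them so that they sum to 1.0. */
static void NormaliseLeafWeights(double *pdOut, const double *pdSeqWeights,
                                 const int *piLeafList, int iLeafCount)
{
    assert(pdOut != NULL && pdSeqWeights != NULL && piLeafList != NULL);
    assert(iLeafCount > 0);

    double dSum = 0.0;
    for (int i = 0; i < iLeafCount; i++) {
        pdOut[i] = pdSeqWeights[piLeafList[i]];
        dSum += pdOut[i];
    }
    assert(dSum > 0.0); /* assumption: weights are positive */

    for (int i = 0; i < iLeafCount; i++) {
        pdOut[i] /= dSum;
    }
}
```

Calling the helper once with piLeafListL and once with piLeafListR would give two weight vectors that each sum to 1.0, independent of how many sequences the other profile holds.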
void ReAttachLeadingGaps ( mseq_t *  prMSeq,
int  iProfProfSeparator 
)

re-attach leading and trailing gaps to alignment

Parameters:
[in,out] prMSeq alignment structure (at this stage there should be no un-aligned sequences)
[in] iProfProfSeparator gives sizes of input profiles, -1 if no input-profiles but un-aligned sequences
Note:
Leading and trailing profile columns that contain only gaps have no effect on the alignment and are removed during the alignment. If they are encountered, a warning message is printed to screen. Some users like to preserve these gap columns. FS, r213->214
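
Detecting the gap-only columns mentioned in the note can be sketched as follows. Both functions are hypothetical helpers, not the code used by ReAttachLeadingGaps; the sketch assumes that '-' is the gap character and that all sequences of the profile have the same length.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: number of leading columns that contain only gaps.
 * Assumes '-' as gap character and equal-length sequences. */
static int CountLeadingGapColumns(const char *const *ppcSeq, int iSeqCount)
{
    assert(ppcSeq != NULL && iSeqCount > 0);
    size_t uLen = strlen(ppcSeq[0]);
    int iCols = 0;
    for (size_t c = 0; c < uLen; c++) {
        for (int s = 0; s < iSeqCount; s++) {
            if (ppcSeq[s][c] != '-') {
                return iCols; /* first column holding a residue */
            }
        }
        iCols++;
    }
    return iCols;
}

/* Hypothetical helper: number of trailing columns that contain only gaps. */
static int CountTrailingGapColumns(const char *const *ppcSeq, int iSeqCount)
{
    assert(ppcSeq != NULL && iSeqCount > 0);
    size_t uLen = strlen(ppcSeq[0]);
    int iCols = 0;
    for (size_t c = uLen; c > 0; c--) {
        for (int s = 0; s < iSeqCount; s++) {
            if (ppcSeq[s][c - 1] != '-') {
                return iCols; /* last column holding a residue */
            }
        }
        iCols++;
    }
    return iCols;
}
```

Recording these two counts before the alignment is one way the stripped columns could be re-attached afterwards.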
void SanitiseUnknown ( mseq_t *  mseq  ) 

get rid of unknown residues

Note:
HHalignWrapper can be entered in two different ways: (i) all sequences are un-aligned, or (ii) there are two (aligned) profiles. In the un-aligned case (i) the sequences come straight from Squid, that is, they have been sanitised: all non-alphabetic residues have been rendered as X's. In profile mode (ii) one profile may have been produced internally. In that case residues may have been translated back into their 'native' form, that is, they may contain un-sanitised residues. These will cause trouble during alignment. FS, r213->214
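
The kind of sanitising described in the note (non-alphabetic residues rendered as X) can be sketched on a single sequence string. SanitiseResidues is a hypothetical helper and not the body of SanitiseUnknown; leaving the gap character '-' untouched is an assumption made here so that the sketch also works on aligned profiles.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: replace every character that is neither a letter
 * nor a gap ('-', an assumption) with 'X'. Returns the number of
 * characters that were changed. */
static int SanitiseResidues(char *pcSeq)
{
    assert(pcSeq != NULL);
    int iChanged = 0;
    for (char *p = pcSeq; *p != '\0'; p++) {
        if (*p == '-') {
            continue; /* keep gaps of aligned sequences */
        }
        if (!isalpha((unsigned char)*p)) {
            *p = 'X';
            iChanged++;
        }
    }
    return iChanged;
}
```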
void SetDefaultHhalignPara ( hhalign_para *  prHhalignPara  ) 

FIXME.

Note:
prHhalignPara has to point to an already allocated instance
void TranslateUnknown2Ambiguity ( mseq_t *  mseq  ) 

translate unknown residues back to ambiguity codes. hhalign translates ambiguity codes (B, Z) into the unknown residue (X). Because the original (un-aligned) residue information is still available, iterating along the original and aligned sequences in parallel shows where codes have been changed, so they can be restored to their original values

Parameters:
[in,out] mseq sequence/profile data, mseq->seq [in,out] is changed to conform with mseq->orig_seq [in]
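
The parallel walk over the original and the aligned sequence described above can be sketched for one sequence pair. RestoreAmbiguityCodes is a hypothetical helper, not the actual implementation; it assumes '-' as the gap character and restores only B and Z, the two codes named in the description.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: pcAligned is the gapped sequence (cf. mseq->seq),
 * pcOrig the un-aligned original (cf. mseq->orig_seq). Wherever the
 * aligned sequence holds X but the original held B or Z, the original
 * code is written back. Returns the number of restored residues. */
static int RestoreAmbiguityCodes(char *pcAligned, const char *pcOrig)
{
    assert(pcAligned != NULL && pcOrig != NULL);
    int iRestored = 0;
    size_t j = 0; /* position in the un-gapped original */

    for (size_t i = 0; pcAligned[i] != '\0'; i++) {
        if (pcAligned[i] == '-') {
            continue; /* gaps have no counterpart in the original */
        }
        if (pcOrig[j] == '\0') {
            break; /* original exhausted, nothing left to compare */
        }
        int iOrig = toupper((unsigned char)pcOrig[j]);
        int iAln = toupper((unsigned char)pcAligned[i]);
        if (iAln == 'X' && (iOrig == 'B' || iOrig == 'Z')) {
            pcAligned[i] = pcOrig[j];
            iRestored++;
        }
        j++;
    }
    return iRestored;
}
```

With the aligned string "A-XC-X" and the original "ABCZ", the two X's line up with B and Z, so the aligned string becomes "A-BC-Z".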
Generated on Fri Aug 31 05:32:52 2012 for Clustal Omega by  doxygen 1.6.3