pfamhmmread

Read data from PFAM HMM-formatted file

Syntax

HMMStruct = pfamhmmread(File)
HMMStruct = pfamhmmread(File,'TimeOut',TimeOutValue)

Input Arguments

File

Character vector or string specifying a file name, a path and file name, a URL pointing to a file, or the text of a PFAM HMM-formatted file. The file must be in PFAM HMM format. If you specify only a file name, the file must be on the MATLAB® search path or in the current folder.

Tip

You can use the gethmmprof function with the 'ToFile' property to retrieve HMM profile information from the PFAM database and create a PFAM HMM-formatted file.
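
As a rough sketch of the workflow in this tip, the following saves a profile to disk with gethmmprof and then reads it back with pfamhmmread. It assumes that the PFAM database is reachable from your machine and that '7tm_2' (the family name shown in the example on this page) is still a valid identifier. The file name pf00002.ls is only an illustrative choice.

```matlab
% Assumption: network access to the PFAM database is available.
% Retrieve the profile for the '7tm_2' family and save it to a local file.
gethmmprof('7tm_2', 'ToFile', 'pf00002.ls');

% Read the saved PFAM HMM-formatted file back into a MATLAB structure.
hmm = pfamhmmread('pf00002.ls');

% Display the family name and the number of MATCH states.
fprintf('%s has %d MATCH states\n', hmm.Name, hmm.ModelLength);
```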

TimeOutValue

Connection timeout in seconds, specified as a positive scalar. The default value is 5.

Output Arguments

HMMStruct

MATLAB structure containing information from a PFAM HMM-formatted file.

Description

Note

pfamhmmread reads PFAM HMM-formatted files, from file format version HMMER2.0 through HMMER3/f.

HMMStruct = pfamhmmread(File) reads File, a PFAM HMM-formatted file, and converts it to HMMStruct, a MATLAB structure containing the following fields corresponding to parameters of an HMM profile:

Field

Description

Name

The protein family name (unique identifier) of the HMM profile record in the PFAM database.

PfamAccessionNumber

The protein family accession number of the HMM profile record in the PFAM database.

ModelDescription

Description of the HMM profile.

ModelLength

The length of the profile (number of MATCH states).

Alphabet

The alphabet used in the model, 'AA' or 'NT'.

Note

AlphaLength is 20 for 'AA' and 4 for 'NT'.

MatchEmission

Symbol emission probabilities in the MATCH states.

The format is a matrix of size ModelLength-by-AlphaLength, where each row corresponds to the emission distribution for a specific MATCH state.

InsertEmission

Symbol emission probabilities in the INSERT states.

The format is a matrix of size ModelLength-by-AlphaLength, where each row corresponds to the emission distribution for a specific INSERT state.

NullEmission

Symbol emission probabilities in the MATCH and INSERT states for the NULL model.

The format is a 1-by-AlphaLength row vector.

Note

NULL probabilities are also known as the background probabilities.
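
One common use of the background probabilities is to turn emission probabilities into log-odds scores. The sketch below illustrates the idea on a small hand-made structure rather than on real pfamhmmread output. The variable hmm and its values are hypothetical, and only the field names and shapes follow the descriptions above.

```matlab
% Hypothetical stand-in for a structure returned by pfamhmmread, using a
% 4-letter 'NT' alphabet and 3 MATCH states so the numbers are easy to check.
hmm.Alphabet      = 'NT';
hmm.ModelLength   = 3;
hmm.MatchEmission = [0.70 0.10 0.10 0.10;
                     0.25 0.25 0.25 0.25;
                     0.10 0.10 0.40 0.40];
hmm.NullEmission  = [0.25 0.25 0.25 0.25];

% Replicate the 1-by-AlphaLength background row once per MATCH state,
% then take the base-2 log ratio to get per-symbol log-odds scores.
background = repmat(hmm.NullEmission, hmm.ModelLength, 1);
logOdds    = log2(hmm.MatchEmission ./ background);
```

A positive entry in logOdds means the MATCH state emits that symbol more often than the background does. A row that equals the background, such as the second row here, scores zero for every symbol.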

BeginX

BEGIN state transition probabilities.

Format is a 1-by-(ModelLength + 1) row vector:

[B->D1 B->M1 B->M2 B->M3 .... B->Mend]
MatchX

MATCH state transition probabilities.

Format is a 4-by-(ModelLength - 1) matrix:

[M1->M2 M2->M3 ... M[end-1]->Mend;
 M1->I1 M2->I2 ... M[end-1]->I[end-1];
 M1->D2 M2->D3 ... M[end-1]->Dend;
 M1->E  M2->E  ... M[end-1]->E  ]
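
The layout above can be sanity-checked with a short sketch like the following. It rests on the assumption that the four outgoing transitions of each MATCH state (to M, I, D, and E) sum to 1, and it uses a hypothetical matchX matrix rather than real output. Because the sample output on this page lists MatchX as 292x4, the sketch accepts either orientation.

```matlab
% Hypothetical MatchX for a model with 3 MATCH states (ModelLength - 1 = 2
% columns). Rows follow the layout above: M->M, M->I, M->D, M->E.
matchX = [0.90 0.80;
          0.05 0.10;
          0.04 0.05;
          0.01 0.05];

% Put the four transition types along the rows, whichever way the
% matrix was stored.
if size(matchX, 1) ~= 4 && size(matchX, 2) == 4
    matchX = matchX.';
end

% Sum over the four outgoing transitions of each MATCH state and compare
% with 1, allowing for rounding in the stored values.
totals       = sum(matchX, 1);
isNormalized = all(abs(totals - 1) < 1e-6);
```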
InsertX

INSERT state transition probabilities.

Format is a 2-by-(ModelLength - 1) matrix:

[ I1->M2 I2->M3 ... I[end-1]->Mend;
  I1->I1 I2->I2 ... I[end-1]->I[end-1] ]
DeleteX

DELETE state transition probabilities.

Format is a 2-by-(ModelLength - 1) matrix:

[ D1->M2 D2->M3 ... D[end-1]->Mend ;
  D1->D2 D2->D3 ... D[end-1]->Dend ]
FlankingInsertX

Transition probabilities for the flanking insert states (N and C), used for LOCAL profile alignment.

Format is a 2-by-2 matrix:

[N->B  C->T ;
 N->N  C->C]
LoopX

Loop state transition probabilities, used for multiple-hit alignment.

Format is a 2-by-2 matrix:

[E->C  J->B ;
 E->J  J->J]
NullX

Null transition probabilities, used so that state transitions can also be scored as log-odds values.

Format is a 2-by-1 column vector:

[G->F ; G->G]

HMMStruct = pfamhmmread(File,'TimeOut',TimeOutValue) sets the connection timeout (in seconds) for retrieving data from the PFAM database.

For more information on HMM profile models, see HMM Profile Model.

Examples

Read a locally saved PFAM HMM-formatted file into a MATLAB structure.

pfamhmmread('pf00002.ls')

ans = 

                   Name: '7tm_2'
    PfamAccessionNumber: 'PF00002.15'
       ModelDescription: '7 transmembrane receptor (Secretin family)'
            ModelLength: 293
               Alphabet: 'AA'
          MatchEmission: [293x20 double]
         InsertEmission: [293x20 double]
           NullEmission: [1x20 double]
                 BeginX: [294x1 double]
                 MatchX: [292x4 double]
                InsertX: [292x2 double]
                DeleteX: [292x2 double]
        FlankingInsertX: [2x2 double]
                  LoopX: [2x2 double]
                  NullX: [2x1 double]

Version History

Introduced before R2006a