Mark Teese edited this page Apr 26, 2017 · 7 revisions

TMSEG

======

Stable version: 2.2.1

TMSEG predicts transmembrane proteins, together with their transmembrane helices and topology.

Compatible with Java 8 (current release) and Java 7.

HOWTO Install

Install git large-file-storage (git-lfs)

  • Download and install git lfs
    • Linux : Unzip, double click install.sh.
    • Windows : Install from .exe
    • See git lfs wiki for help during installation
  • Complete installation
    • Linux: $ git lfs install (older Git LFS releases use git lfs init) --> should return "git lfs initialized"
    • Windows: > git lfs install --> should return "git lfs initialized"; > git-lfs --help --> should return the help documentation
  • Clone the repository
  • Installation is complete.

HOWTO Run, Basics

TMSEG requires evolutionary information as input, in the form of a PSSM (position-specific scoring matrix), typically generated with PSI-BLAST.

Detailed instructions for PSSM generation are currently available on the RostLab TMSEG Wiki.

To run the project, please use the following command:

java -jar tmseg.jar -i examples/query.fasta -p examples/query.pssm -o query.out

Input

  • -i Input FASTA file/folder
  • -p Input PSSM file/folder

Output

  • -o Output file/folder (human readable)
  • -r Output file/folder (raw prediction scores)

Options

  • -m FLAG if set, do multi-job (interpret input/output paths as folders)
  • -x FLAG if set, a previous prediction is processed (must be supplied in FASTA file)
  • -t FLAG if set, only the topology prediction is performed (-x must be set)
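The flags listed above can be assembled programmatically, which helps when TMSEG is called from a pipeline. The following is a minimal sketch, not part of TMSEG itself: `build_tmseg_command` is a hypothetical helper, and it assumes `tmseg.jar` sits in the working directory as in the command shown earlier. It only builds the argument list from the documented flags (-i, -p, -o, -r, -m) and does not run Java.

```python
from typing import List, Optional


def build_tmseg_command(fasta: str, pssm: str, out: str,
                        raw_out: Optional[str] = None,
                        multi_job: bool = False,
                        jar: str = "tmseg.jar") -> List[str]:
    """Assemble the argument list for a TMSEG run (hypothetical helper)."""
    cmd = ["java", "-jar", jar, "-i", fasta, "-p", pssm, "-o", out]
    if raw_out is not None:
        cmd += ["-r", raw_out]  # also write the raw prediction scores
    if multi_job:
        cmd.append("-m")        # interpret input/output paths as folders
    return cmd


if __name__ == "__main__":
    cmd = build_tmseg_command("examples/query.fasta",
                              "examples/query.pssm",
                              "query.out")
    print(" ".join(cmd))
    # To actually run the prediction (needs Java and tmseg.jar):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems when paths contain spaces.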

Expected Results

For the preceding query, your query.out file should contain:

# SEGMENT START END RI

##

# OUTSIDE 1 9

# TRANSMEM 10 34 8

# INSIDE 35 79

# TRANSMEM 80 102 8

# OUTSIDE 103 121

[...]

>sp|Q60019|NQO8_THET8 NADH-quinone oxidoreductase subunit 8 OS=Thermus thermophilus (strain HB8 / ATCC 27634 / DSM 579) GN=nqo8 PE=1 SV=1

MTWSYPVDPYWMVALKALLVVVGLLTAFAFMTLIERRLLARFQVRMGPNRVGPFGLLQPLADAIKSIFKEDIVVAQADRFLFVLAPLISVVFALLAFGLIPFGPPGSFFGYQPWVINLDLGILYLFAVSELAVYGIFLSGWASGSKYSLLGSLRSSASLISYELGLGLALLAPVLLVGSLNLNDIVNWQKEHGWLFLYAFPAFLVYLIASMAEAARTPFDLPEAEQELVGGYHTEYSSIKWALFQMAEYIHFITASALIPTLFLGGWTMPVLEVPYLWMFLKIAFFLFFFIWIRATWFRLRYDQLLRFGWGFLFPLALLWFLVTALVVALDLPRTYLLYLSALSFLVLLGAVLYTPKPARKGGGA

222222222HHHHHHHHHHHHHHHHHHHHHHHHH111111111111111111111111111111111111111111111HHHHHHHHHHHHHHHHHHHHHHH2222222222222222222HHHHHHHHHHHHHHHHHHHHH111111111111HHHHHHHHHHHHHHHHHHHHHHH222222222222222HHHHHHHHHHHHHHHHHHHHHH1111111111111111111111111111HHHHHHHHHHHHHHHHHHHH22222222HHHHHHHHHHHHHHHHHHHHHHH1111111111111111HHHHHHHHHHHHHHHHHHHHHH2HHHHHHHHHHHHHHHHHHHHHH11111111111

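The first part shows how the protein is segmented (segment type, start, end and, for transmembrane segments, the RI value). The second part gives the protein's FASTA header and sequence, followed by the per-residue annotation string.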
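For downstream analysis, the segment table can be read into a simple data structure. The sketch below is an illustration based only on the example output shown here: `Segment` and `parse_segments` are hypothetical names, rows are assumed to look like `# KIND START END [RI]`, and the coordinates are assumed to be 1-based and inclusive (which matches the nine leading `2` characters for `OUTSIDE 1 9`). The RI column is kept as an optional integer because only the TRANSMEM rows carry it in the example.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    kind: str           # e.g. OUTSIDE, TRANSMEM, INSIDE
    start: int          # first residue of the segment (assumed 1-based)
    end: int            # last residue of the segment (assumed inclusive)
    ri: Optional[int]   # RI column, present only on some rows


def parse_segments(text: str) -> List[Segment]:
    """Extract segment rows from the human-readable TMSEG output."""
    segments = []
    for line in text.splitlines():
        if not line.startswith("#"):
            continue  # FASTA header, sequence, or annotation string
        fields = line.lstrip("#").split()
        # Keep only rows shaped like "KIND START END [RI]";
        # this skips the column header and the "##" separator.
        if len(fields) not in (3, 4):
            continue
        if not all(f.isdigit() for f in fields[1:]):
            continue
        ri = int(fields[3]) if len(fields) == 4 else None
        segments.append(Segment(fields[0], int(fields[1]), int(fields[2]), ri))
    return segments


if __name__ == "__main__":
    sample = "\n".join([
        "# SEGMENT START END RI",
        "##",
        "# OUTSIDE 1 9",
        "# TRANSMEM 10 34 8",
        "# INSIDE 35 79",
    ])
    for seg in parse_segments(sample):
        print(seg)
```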

Residue Annotations in TMSEG output

  • N : not a membrane protein
  • S : signal peptide
  • 1 : intracellular
  • H : transmembrane helix
  • 2 : extracellular
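The per-residue annotation string can be turned back into labelled segments by grouping runs of identical symbols. This is a minimal sketch using the symbol meanings listed above; `LABELS` and `annotation_to_segments` are hypothetical names, and positions are assumed to be 1-based and inclusive. Note that two helices directly adjacent to each other would be merged into one run by this approach.

```python
from itertools import groupby
from typing import List, Tuple

# Symbol meanings as listed in the residue annotation table.
LABELS = {
    "N": "not a membrane protein",
    "S": "signal peptide",
    "1": "intracellular",
    "H": "transmembrane helix",
    "2": "extracellular",
}


def annotation_to_segments(annotation: str) -> List[Tuple[str, int, int]]:
    """Collapse an annotation string into (label, start, end) runs."""
    segments = []
    position = 1  # assumed 1-based residue numbering
    for symbol, run in groupby(annotation):
        length = len(list(run))
        label = LABELS.get(symbol, "unknown")
        segments.append((label, position, position + length - 1))
        position += length
    return segments


if __name__ == "__main__":
    # First residues of the example annotation string, shortened.
    example = "2" * 9 + "H" * 25 + "1" * 3
    for label, start, end in annotation_to_segments(example):
        print(f"{label}\t{start}\t{end}")
```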

Application flow

See [this chart](Application Flow.pdf).