Predict the Topology of a Membrane Protein

Richel J.C. Bilderbeek

2020-11-11

This demo shows how to estimate where the amino acids of a membrane protein are located. The tmhmm package uses the tool TMHMM to do so. Each amino acid of a membrane protein is estimated to be either inside the cell (the cytosol side), outside the cell (the cell's surroundings) or in the transmembrane part.

Load the library:

library(tmhmm)

For this vignette to work, TMHMM must be installed. TMHMM can be installed using install_tmhmm, which requires a download link that must be requested from https://services.healthtech.dtu.dk/service.php?TMHMM-2.0. Such a link can expire, so the one shown here may no longer work.

install_tmhmm("https://services.healthtech.dtu.dk/download/28c408dc-ef5e-47ad-a284-66754bcd27f7")

check_tmhmm_installation checks the TMHMM installation and, if TMHMM is not found, stops with a helpful error message:

check_tmhmm_installation()
#> Error in check_tmhmm_installation(): TMHMM binary not found at location '
#> /home/richel/.local/share/tmhmm-2.0c/bin/decodeanhmm.Linux_x86_64'
#> 
#> Tip 1: from R, run 'tmhmm::install_tmhmm()'
#>   with a (non-expired) download URL
#> Tip 2: request a download URL at the TMHMM request page at
#> 
#> https://services.healthtech.dtu.dk/service.php?TMHMM-2.0

We need a FASTA file to work on:

fasta_filename <- system.file("extdata", "tmhmm.fasta", package = "tmhmm")
cat(readLines(fasta_filename), sep = "\n")
#> >5H2A_CRIGR you can have comments after the ID
#> MEILCEDNTSLSSIPNSLMQVDGDSGLYRNDFNSRDANSSDASNWTIDGENRTNLSFEGYLPPTCLSILHL
#> QEKNWSALLTAVVIILTIAGNILVIMAVSLEKKLQNATNYFLMSLAIADMLLGFLVMPVSMLTILYGYRWP
#> LPSKLCAVWIYLDVLFSTASIMHLCAISLDRYVAIQNPIHHSRFNSRTKAFLKIIAVWTISVGVSMPIPVF
#> GLQDDSKVFKQGSCLLADDNFVLIGSFVAFFIPLTIMVITYFLTIKSLQKEATLCVSDLSTRAKLASFSFL
#> PQSSLSSEKLFQRSIHREPGSYTGRRTMQSISNEQKACKVLGIVFFLFVVMWCPFFITNIMAVICKESCNE
#> HVIGALLNVFVWIGYLSSAVNPLVYTLFNKTYRSAFSRYIQCQYKENRKPLQLILVNTIPALAYKSSQLQA
#> GQNKDSKEDAEPTDNDCSMVTLGKQQSEETCTDNINTVNEKVSCV
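
To work on a different protein, a FASTA file can be created with base R. The sketch below is only an illustration: the identifier my_protein and the variable names are hypothetical, and the sequence is a short fragment of the sequence above, used as a placeholder.

```r
# Hypothetical identifier and placeholder sequence, for illustration only
fasta_lines <- c(
  ">my_protein",
  "MEILCEDNTSLSSIPNSLMQVDGDSGLYRNDF"
)

# Write to a temporary file, so the working directory stays clean
my_fasta_filename <- tempfile(fileext = ".fasta")
writeLines(fasta_lines, my_fasta_filename)

# Show the file's contents, in the same way as above
cat(readLines(my_fasta_filename), sep = "\n")
```

Such a file could then be passed to run_tmhmm in the same way as fasta_filename is used in the next step.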

Estimate the locations of the amino acids, if TMHMM is installed:

if (is_tmhmm_installed()) {
  locatome <- run_tmhmm(fasta_filename)
  cat(locatome, sep = "\n")
}

The legend of these locations:

Character  Location
i          Inside or cytosol-side
o          Outside or surroundings-side
M          Transmembrane
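
With this legend, a topology can be summarized using base R. The sketch below assumes the topology is available as a single string of the characters i, o and M. The string used here is made up for illustration and is not the TMHMM result for the protein above. All variable names are hypothetical.

```r
# Made-up topology string, for illustration only:
# 5 inside, 10 transmembrane, 6 outside, 10 transmembrane, 4 inside
topology <- paste0(
  strrep("i", 5), strrep("M", 10), strrep("o", 6),
  strrep("M", 10), strrep("i", 4)
)

# Split the string into single characters and count each location
locations <- strsplit(topology, split = "")[[1]]
location_counts <- table(locations)
print(location_counts)

# Run-length encoding groups consecutive identical characters,
# so each run of 'M' is one transmembrane segment
segments <- rle(locations)
n_transmembrane_segments <- sum(segments$values == "M")
print(n_transmembrane_segments)
```

Using rle keeps the order of the segments, so segments$values also gives the sequence of regions along the protein.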