• Pattern searches for the identification of putative lipoprotein genes in Gram-positive bacterial genomes

      Harrington, Dean J.; Sutcliffe, I.C. (2002)
      N-terminal lipidation is a major mechanism by which bacteria tether proteins to membranes, and it is of particular importance to Gram-positive bacteria because they lack a retentive outer membrane. Lipidation is directed by a cysteine-containing 'lipobox' within the lipoprotein signal peptide sequence, a feature that has greatly facilitated the identification of putative lipoproteins by gene sequence analysis. The properties of lipoprotein signal peptides have been described previously by the Prosite pattern PS00013. Here, a dataset of 33 experimentally verified Gram-positive bacterial lipoproteins (excluding those from Mollicutes) has been identified by an extensive literature review. The signal peptide features of these lipoproteins have been analysed to create a refined pattern, G+LPP, which is more specific for the identification of Gram-positive bacterial lipoproteins. The ability of this pattern to identify probable lipoprotein sequences is demonstrated by a search of the genome of Streptococcus pyogenes, in comparison with sequences identified using PS00013. G+LPP discriminated against likely false positives more effectively than PS00013. These data confirm the likely abundance of lipoproteins in Gram-positive bacterial genomes, with at least 25 probable lipoproteins identified in S. pyogenes.
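      To illustrate the kind of signal-peptide pattern search the abstract describes, here is a minimal Python sketch. The regular expression, the K/R requirement, and the cysteine position window are simplified assumptions chosen for illustration; they are not the published G+LPP or PS00013 patterns, and the function name `find_lipobox_cys` and the example sequences are hypothetical.

```python
import re

# Hypothetical, simplified lipobox-style motif (NOT the published G+LPP or
# PS00013 pattern): six residues that are not D, E, R or K, followed by
# [LVI][ASTVI][GAS] and the lipid-modified cysteine.
# Wrapped in a lookahead so that overlapping candidates are all examined.
LIPOBOX = re.compile(r"(?=[^DERK]{6}[LVI][ASTVI][GAS]C)")
MOTIF_LENGTH = 10  # 6 uncharged residues + 3 lipobox residues + Cys


def find_lipobox_cys(seq, min_cys_pos=15, max_cys_pos=35):
    """Return the 1-based position of a candidate lipobox cysteine, or None.

    Assumed rules for this sketch: the cysteine must lie within
    [min_cys_pos, max_cys_pos], and at least one K or R must occur
    in the first seven residues of the sequence.
    """
    seq = seq.upper()
    # Require a positively charged residue near the N-terminus.
    if not any(residue in "KR" for residue in seq[:7]):
        return None
    # Only the N-terminal region can hold the signal peptide.
    head = seq[:max_cys_pos]
    for match in LIPOBOX.finditer(head):
        cys_pos = match.start() + MOTIF_LENGTH  # 1-based position of Cys
        if cys_pos >= min_cys_pos:
            return cys_pos
    return None


if __name__ == "__main__":
    # Made-up toy sequences, not real proteins.
    candidates = {
        "toy_lipoprotein": "MKKTLLAGAVALLSLAGCSNDK",
        "toy_charged_core": "MKKTLLAGEVALLSLAGCSNDK",
    }
    for name, sequence in candidates.items():
        print(name, find_lipobox_cys(sequence))
```

      In a genome-wide search, a function like this would be applied to every predicted protein sequence, and hits would then be reviewed manually, since any pattern of this kind can return false positives.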