What is TEMSP?

TEMSP (3D TEmplate-based Metal Site Prediction) predicts metal-binding sites from protein structures. It uses the relative positions of the Cα and Cβ atoms of potential ligand residues in an input structure to search a template library of ligand-residue pairs derived from known metal-binding sites in protein structures. The templates retrieved by the search are combined and scored. If a potential site is predicted, a structural model of it can be built.
    For the prediction of zinc-binding sites, Cys, His, Glu and Asp are considered as potential ligand residues. The parameters used by TEMSP were optimized on the training dataset listed below. TEMSP was then tested on the independent test dataset listed below, yielding 86.0% sensitivity and 95.9% selectivity, with an average IoUR (intersection over union ratio) of 0.96.
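
The pair-based template search described above can be pictured with a small sketch. It is only an illustration under stated assumptions: describing a residue pair by its four Cα/Cβ inter-atomic distances, the 1.0 Å tolerance, the toy coordinates, the toy template and all function names are hypothetical and are not taken from TEMSP itself. Only the four candidate residue types come from the text.

```python
import math
from itertools import combinations

# Candidate ligand residue types for zinc sites (from the text above).
LIGAND_TYPES = {"CYS", "HIS", "GLU", "ASP"}


def pair_descriptor(res_a, res_b):
    """Assumed descriptor: the CA-CA, CA-CB, CB-CA and CB-CB distances."""
    return tuple(
        math.dist(res_a[x], res_b[y]) for x in ("CA", "CB") for y in ("CA", "CB")
    )


def matches(descriptor, template_distances, tol=1.0):
    """True if every distance lies within tol (hypothetical value) of the template."""
    return all(abs(d - t) <= tol for d, t in zip(descriptor, template_distances))


def search(residues, templates, tol=1.0):
    """Return (residue id, residue id, template label) for each matching pair."""
    candidates = [r for r in residues if r["name"] in LIGAND_TYPES]
    hits = []
    for a, b in combinations(candidates, 2):
        for tpl in templates:
            # Try both orderings so a CYS-HIS template also matches a HIS, CYS pair.
            for first, second in ((a, b), (b, a)):
                if (first["name"], second["name"]) != tpl["types"]:
                    continue
                if matches(pair_descriptor(first, second), tpl["distances"], tol):
                    hits.append((first["id"], second["id"], tpl["label"]))
                    break
    return hits


# Toy input: coordinates and template values are made up for illustration.
residues = [
    {"id": "C10", "name": "CYS", "CA": (0.0, 0.0, 0.0), "CB": (1.5, 0.0, 0.0)},
    {"id": "H20", "name": "HIS", "CA": (6.0, 0.0, 0.0), "CB": (4.5, 0.0, 0.0)},
    {"id": "A30", "name": "ALA", "CA": (3.0, 5.0, 0.0), "CB": (3.0, 6.5, 0.0)},
    {"id": "D40", "name": "ASP", "CA": (30.0, 0.0, 0.0), "CB": (31.5, 0.0, 0.0)},
]
templates = [
    {"label": "toy-CH", "types": ("CYS", "HIS"), "distances": (6.0, 4.5, 4.5, 3.0)},
]

print(search(residues, templates))
```

In this toy input only the Cys/His pair is reported: the alanine is not a candidate residue type, and the aspartate does not form a pair covered by the single toy template. Combining and scoring the retrieved templates, as TEMSP does, is not covered by the sketch.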

Training dataset (338 chains, each denoted by its PDB ID followed by the chain identifier):
  1A5T_A, 1ADT_A, 1AJY_A, 1AK0_A, 1B66_A, 1BBO_A, 1BF6_A, 1BI0_A, 1BOR_A, 1C3R_A, 1CO4_A, 1CTT_A, 1CU1_A, 1D4U_A, 1DGZ_A, 1DO5_A, 1DX8_A, 1DY1_A, 1DYQ_A, 1E4U_A, 1EH6_A, 1EU4_A, 1F6U_A, 1FAQ_A, 1FBV_A, 1G25_A, 1GL4_A, 1H1Z_A, 1HP7_A, 1HR6_B, 1I3J_A, 1I6N_A, 1IA9_A, 1J3G_A, 1J7N_A, 1JJD_A, 1JK0_A, 1JM7_A, 1JM7_B, 1JN7_A, 1JOC_A, 1JQG_A, 1JW9_B, 1K24_A, 1K6Y_A, 1L1T_A, 1LBU_A, 1LLM_C, 1LPV_A, 1M9O_A, 1NCS_A, 1NZJ_A, 1ODH_A, 1OHT_A, 1OZJ_A, 1P4Q_B, 1P5D_X, 1P7M_A, 1PCX_A, 1PL8_A, 1PY0_A, 1Q14_A, 1Q2L_A, 1QWR_A, 1QX0_A, 1R1H_A, 1R79_A, 1RLY_A, 1RQG_A, 1RUT_X, 1RYQ_A, 1S0U_A, 1SE0_A, 1SRK_A, 1SU0_B, 1T4W_A, 1T8H_A, 1TAQ_A, 1U5K_A, 1UL4_A, 1UWY_A, 1V33_A, 1V5R_A, 1V87_A, 1VD4_A, 1VK6_A, 1VQ0_A, 1VQ2_A, 1VSR_A, 1VYX_A, 1W57_A, 1WD2_A, 1WEP_A, 1WEQ_A, 1WEV_A, 1WFE_A, 1WFF_A, 1WII_A, 1WIM_A, 1WIR_A, 1WJ2_A, 1WJP_A, 1WJV_A, 1WKQ_A, 1WPK_A, 1WWF_A, 1X31_D, 1X3C_A, 1X3Z_A, 1X6M_A, 1XA6_A, 1XER_A, 1XJH_A, 1XRT_A, 1XTO_A, 1Y02_A, 1Y0J_B, 1Y79_1, 1YG9_A, 1YLK_A, 1YUJ_A, 1Z05_A, 1Z3I_X, 1Z8R_A, 1ZE9_A, 1ZR9_A, 1ZU1_A, 1ZY7_A, 2A0B_A, 2A25_A, 2AKL_A, 2AQP_A, 2AU3_A, 2AYJ_A, 2B5L_C, 2B5W_A, 2BAI_A, 2BJR_A, 2BNM_A, 2BZ1_A, 2C1I_A, 2CJS_C, 2CKL_A, 2CKL_B, 2CON_A, 2CRW_A, 2CS2_A, 2CS8_A, 2CSH_A, 2CT7_A, 2CTT_A, 2CUP_A, 2CZR_A, 2D5B_A, 2D8R_A, 2D8S_A, 2D9H_A, 2DIP_A, 2DJR_A, 2DKT_A, 2DMI_A, 2DPH_A, 2DRP_A, 2E2Z_A, 2E5R_A, 2E61_A, 2E6I_A, 2EBV_A, 2ECG_A, 2ECT_A, 2ECW_A, 2EGM_A, 2EK8_A, 2ELI_A, 2ELM_A, 2ELN_A, 2ELP_A, 2EO4_A, 2EQE_A, 2EXU_A, 2F8B_A, 2F9Y_B, 2FE3_A, 2FGY_A, 2FK4_A, 2FNF_X, 2FR5_A, 2G0D_A, 2GHF_A, 2GTQ_A, 2GVI_A, 2H1N_A, 2H6L_A, 2HF1_A, 2HJN_A, 2HVY_C, 2HZ8_A, 2I0M_A, 2I50_A, 2I5O_A, 2IDA_A, 2IHX_A, 2ISW_A, 2IXD_A, 2J6A_A, 2JIG_A, 2JM1_A, 2JM3_A, 2JMO_A, 2JOX_A, 2JR7_A, 2JUN_A, 2JVX_A, 2K0A_A, 2K16_A, 2K1P_A, 2K2D_A, 2K4X_A, 2K5C_A, 2K7R_A, 2K9H_A, 2KAE_A, 2KAK_A, 2KDP_A, 2KDX_A, 2KI7_B, 2KKH_A, 2KN9_A, 2KQ9_A, 2KQB_A, 2KR1_A, 2NUT_A, 2O3E_A, 2ODD_A, 2ODX_A, 2OH3_A, 2OLM_A, 2OSO_A, 2OSV_A, 2OWA_A, 2OWO_A, 2OZU_A, 2P09_A, 2PEB_A, 2PG3_A, 2PGF_A, 2PPT_A, 2Q7S_A, 2QFA_A, 
2QGP_A, 2QKD_A, 2QSW_A, 2R6M_A, 2RI7_A, 2RIQ_A, 2RMN_A, 2ROW_A, 2RPR_A, 2RPZ_A, 2UZ9_A, 2UZG_A, 2V0C_A, 2V9K_A, 2VL6_A, 2VR2_A, 2VRD_A, 2W0T_A, 2W3Q_A, 2WAD_A, 2WJY_A, 2WKX_A, 2WWY_A, 2YQP_A, 2YRE_A, 2YRK_A, 2YSA_A, 2YSM_A, 2YU4_A, 2YV5_A, 2YVR_A, 2ZE7_A, 2ZNR_A, 2ZZE_A, 3ALC_A, 3BAL_A, 3BK2_A, 3BO5_A, 3BOC_A, 3BOF_A, 3BQ5_A, 3BVU_A, 3C10_A, 3C37_A, 3C5K_A, 3CG7_A, 3CSQ_A, 3D00_A, 3D68_A, 3DI4_A, 3DPL_R, 3DRA_B, 3DXT_A, 3DZY_A, 3DZY_D, 3E1Z_A, 3E6U_A, 3EH2_A, 3EO3_A, 3EPQ_A, 3F6Q_B, 3FEH_A, 3FW3_A, 3GI1_A, 3GIP_A, 3H1W_A, 3H3E_A, 3H6L_A, 3HTK_C, 3I2D_A, 3ICJ_A, 3IEH_A, 3IO2_A, 3IT7_A, 3IUF_A, 3IUU_A, 3JU2_A, 3JWP_A, 3K9T_A, 3KDE_C, 3KHI_A, 3KNV_A, 3KSV_A, 3L0A_A, 3L11_A, 3LGD_A, 3LQ0_A, 3LY0_A, 5GAT_A

Independent test dataset (100 chains):
  1B8T_A, 1CL4_A, 1DVP_A, 1E7L_A, 1EPW_A, 1KWG_A, 1LBA_A, 1LFW_A, 1LI5_A, 1LML_A, 1LV3_A, 1ML9_A, 1N0Z_A, 1NEE_A, 1OHL_A, 1P9R_A, 1PG5_B, 1PXE_A, 1PZW_A, 1QWY_A, 1QYP_A, 1R42_A, 1R44_A, 1R6O_A, 1RMD_A, 1S3G_A, 1T3K_A, 1TON_A, 1TOT_A, 1TXL_A, 1UUF_A, 1UW0_A, 1V5N_A, 1VJE_B, 1WEO_A, 1WEU_A, 1WFK_A, 1WIL_A, 1XB8_A, 1Y13_A, 1ZFD_A, 1ZW8_A, 258L_A, 2C6A_A, 2CQE_A, 2CS3_A, 2CSY_A, 2D6F_C, 2DAR_A, 2E5S_A, 2E9H_A, 2ELO_A, 2ELU_A, 2EN8_A, 2EOD_A, 2F3B_A, 2FC6_A, 2FU5_A, 2GFO_A, 2GQJ_A, 2GX8_A, 2HJH_A, 2HSI_A, 2I1O_A, 2I9W_A, 2IJD_1, 2IMR_A, 2J2S_A, 2J4X_A, 2J9U_B, 2JKS_A, 2JZ8_A, 2KE1_A, 2NLY_A, 2P53_A, 2PW6_A, 2QWX_A, 2R3A_A, 2RF5_A, 2RO1_A, 2VY5_A, 2WFQ_A, 2YRT_A, 2ZKL_A, 3A32_A, 3BKF_A, 3BLD_A, 3BYR_A, 3CE2_A, 3COQ_A, 3CWW_A, 3GA8_A, 3GL6_A, 3HRU_A, 3I9F_A, 3IFU_A, 3IR9_A, 3IRB_A, 3ISZ_A, 3L2Q_A

  In both training and testing, the 'intersection over union ratio' (IoUR) was used to quantify the accuracy of results:

      IoUR = |P ∩ T| / |P ∪ T|

    where P is the set of predicted ligand residues and T is the set of true ligand residues of a binding site. This ratio balances the numbers of correctly and wrongly predicted ligand residues for a particular binding site. Unless otherwise specified, we considered prediction results with IoUR > 0.5 to be true positives.
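
The intersection-over-union idea behind IoUR, together with the IoUR > 0.5 cutoff for true positives, can be illustrated with a minimal sketch. It assumes that a binding site is represented as a set of residue labels; the function names and the example labels are hypothetical and do not come from TEMSP.

```python
def iour(predicted, true):
    """Intersection over union of two collections of ligand residue labels."""
    predicted, true = set(predicted), set(true)
    union = predicted | true
    if not union:
        # Two empty sites: define the ratio as 0 to avoid dividing by zero.
        return 0.0
    return len(predicted & true) / len(union)


def is_true_positive(predicted, true, threshold=0.5):
    """Apply the IoUR > 0.5 cutoff stated in the text."""
    return iour(predicted, true) > threshold


# Hypothetical site: three of four true ligands found, plus one wrong residue.
true_site = {"C12", "C15", "H32", "H36"}
predicted_site = {"C12", "C15", "H32", "E40"}

print(iour(predicted_site, true_site))              # 3 shared / 5 in the union = 0.6
print(is_true_positive(predicted_site, true_site))  # True
```

Both a missed ligand and a wrongly added residue enlarge the union without enlarging the intersection, so either kind of error lowers the ratio.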

Wei Zhao, Meng Xu, Zhi Liang, Bo Ding, Jian Zhan, Liwen Niu, Haiyan Liu*, Maikun Teng*
Structure-based de novo prediction of zinc-binding sites in proteins of unknown function.
Bioinformatics. 2011 May 1;27(9):1262-8.

COPYRIGHT(C) 2010, Lab of Protein Crystallography (LPC), School of Life Sciences, USTC. ALL RIGHTS RESERVED.
:: CONTACT US :: zhaowei@ustc.edu.cn or ameng@ustc.edu