Specific DNA binding requires that the three-dimensional structure of the protein be complementary to that of the DNA at the binding site (Schneider et al. 2014). Moreover, not all of the amino acids at the interaction site form biochemical contacts with the DNA molecule.
Some of the amino acids have an especially important role in binding, while others have a limited role or none. Based on information from a database of binding pairs in protein-nucleic acid interactions and on log-likelihood analysis, the tendency of each amino acid to form or not form a bond, and the probability of its participation in interaction sites with DNA, have been shown ().
Proteins possess many DNA-binding motifs that mediate important specific and non-specific interactions. Based on the classification of the SCOP database, DNA-binding proteins are grouped into more than 70 superfamilies (Rohs et al. 2010). According to the secondary structure of their DNA-binding consensus regions, these proteins are divided into groups with α-helix structure, β-sheet structure, mixed α/β structure, or a combination of several regions (a mixture of more than one type of region and their binding structures). To date, many DNA-binding protein motifs have been identified by X-ray crystallography and deposited in the Protein Data Bank (PDB).
Classification by DNA-binding motif places proteins that share a motif in the same cluster, although such motif-based classes do not necessarily correspond to the structural and functional properties that characterize protein-DNA recognition (Tripathi and Gupta 2013). In this section, we focus on the major groups of DNA-binding motifs whose members share the same structural features in binding the DNA molecule (for more details see Luscombe et al. 2000; Rohs et al. 2012).
Helix-Turn-Helix (HTH) motif: The helix-turn-helix (HTH) DNA-binding motif is one of the predominant structural DNA-binding motifs and is found in many DNA-binding proteins belonging to different SCOP superfamilies (Posinski et al. 1999). For example, the HTH DNA-binding domain has been identified in transcription factors and enzymes from prokaryotes to eukaryotes (Luscombe et al. 2000). Although the structure of the HTH motif itself is conserved among the proteins that contain it, the structure outside the HTH region differs greatly among HTH-containing proteins (Jones et al. 2003).
The helix-turn-helix motif comprises a 20-amino-acid segment of two almost perpendicular alpha helices connected by a four-amino-acid beta turn. However, longer linkers such as loops have been observed in different HTH-containing proteins. Sequence variation between HTH structural motifs in functionally diverse proteins allows them to recognize distinct sets of DNA sequences (Luscombe et al. 2000).
In the HTH structural motif, the first alpha helix plays a stabilizing role in the interaction between the protein and the DNA. The second alpha helix recognizes the DNA-binding site and is referred to as the recognition or probe helix (Pellegrini-Calace et al. 2005). It binds the DNA-binding site through a series of hydrogen bonds and hydrophobic interactions with the exposed bases in the DNA major groove. The NH2-terminal end of the recognition helix has been observed to point into the major groove, and no base-specific contacts have been shown within the minor groove (Schleif 1988).
However, in some exceptional cases such as O6-alkylguanine-DNA alkyltransferase, the HTH structural motif interacts with the minor groove (Daniels et al. 2004). The recognition helix also supports contact with the DNA backbone, as does the linker (Propper et al. 2014). The HTH motif is only part of a DNA-binding protein, and amino acid residues outside the HTH motif play an important role in regulating DNA recognition and binding (Pellegrini-Calace et al. 2005; Yesudhas et al. 2017). In addition, an extra alpha helix and its adjacent beta sheet extend the HTH structural motif into the Winged Helix-Turn-Helix (wHTH) motif, which is considered part of the HTH class of DNA-binding motifs (Teichmann et al. 2012). The extra secondary structure elements have been observed not only to interact with the minor groove but also, as seen in Regulatory Factor X1 (RFX1), to contact the major groove (Gajiwala et al. 2000).
Helix-Loop-Helix motif (HLH): The helix-loop-helix motif contains two alpha helices of different lengths connected by a loop. This motif can dimerize with other HLH-containing proteins, interacting in a coiled-coil arrangement to form a homodimer (binding to an identical partner) or a heterodimer (binding to a different HLH motif; Nikanta and Angshmman 2015). DNA-binding affinity and specificity are enhanced by binding of the dimerization partner (Jones 2004), since two alpha helices rather than one can then interact with the target DNA.
Leucine-Zipper motif: Like the HLH motif, the leucine-zipper motif has a bipartite structure (Schindler et al. 1992; Wang et al. 2015).
Each part of the motif consists of a single alpha helix about 60 amino acids long. The two helices are joined through their α-helical leucine-zipper regions (a 30-amino-acid section at the carboxyl-terminal end of each part) to form a short left-handed coiled coil resembling an inverted Y (Luscombe et al. 2000). Where the two α-helices of the coiled coil separate from each other, their side chains can contact the major groove of the DNA (Hu et al. 1992). In the zipper region, every two turns of the alpha helix (every seventh amino acid position) carry a leucine or a similar hydrophobic amino acid, and these residues pack side by side through hydrophobic contacts (Luscombe et al. 2000). Dimerization juxtaposes the DNA-binding regions of the two subunits, which form the arms of the Y-shaped structure (the amino terminus of each helix), so that they contact the major groove on opposite sides of the DNA (Pogenberg et al. 2012).
Zinc-coordinating DNA-binding proteins: Zinc-coordinating DNA-binding domains are among the most predominant motifs in DNA-binding proteins (Rohs et al. 2010). A zinc-coordinating motif consists of about 25-30 amino acid residues, including the cysteine and histidine residues that coordinate a zinc ion. The motif can coordinate one or more zinc atoms (Ebent and Altman 2008; Mc-Ewan et al. 2011; Kochauczyk et al. 2015).
Proteins containing zinc-coordinating motifs are diverse in overall fold and in DNA-binding role. The type and order of the zinc-coordinating residues determine the class of zinc-coordinating motif (Bagliro 2009). Important classes include C2H2, C3HC4, C4, C3H, C4HC3 and C2HC5 (where C refers to cysteine and H to histidine residues). The C2H2 class is the predominant DNA-binding motif identified in transcription factors (Krishna et al. 2003). The three-dimensional structure of this motif consists of a two-stranded antiparallel β-sheet and an α-helix; two conserved histidines and two conserved cysteines, located in the alpha helix and the second beta strand, coordinate a single zinc ion (Michalek et al. 2011).
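The class labels listed above can be read as a run-length encoding of the coordinating residues in sequence order. The following minimal Python sketch illustrates that reading only; the function name `zinc_motif_class` and the one-letter ligand strings are hypothetical, and the assumption that every label is a plain run-length encoding is ours, not a rule taken from the cited sources.

```python
from itertools import groupby


def zinc_motif_class(ligands: str) -> str:
    """Return a class label such as 'C2H2' from ordered ligand residues.

    `ligands` lists the zinc-coordinating residues in sequence order,
    using 'C' for cysteine and 'H' for histidine (hypothetical input format).
    """
    ligands = ligands.upper()
    if not ligands or set(ligands) - {"C", "H"}:
        raise ValueError("ligands must be a non-empty string of 'C' and 'H'")
    parts = []
    # Collapse each run of identical residues into residue + run length;
    # a run of length one is written without a number (e.g. the H in C3HC4).
    for residue, run in groupby(ligands):
        count = len(list(run))
        parts.append(residue if count == 1 else f"{residue}{count}")
    return "".join(parts)


if __name__ == "__main__":
    for ordered in ("CCHH", "CCCHCCCC", "CCCC", "CCCH", "CCCCHCCC", "CCHCCCCC"):
        print(ordered, "->", zinc_motif_class(ordered))
```

Under this reading, two cysteines followed by two histidines give C2H2, and four cysteines give C4, which matches the labels used in the text.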
The interspersed cysteine and histidine residues coordinate the zinc atoms, folding the amino acids into loops known as zinc fingers. The zinc ion (Zn2+) has a structural role in maintaining the protein fold (Cohen et al. 2002) and in the affinity of binding to DNA (Mc-Ewan et al. 2011); however, it plays no direct role in the interaction of the protein with the DNA.
Zinc-coordinating motifs possess stable structures (Krezel et al. 2014; Pace et al. 2014) and rarely undergo conformational changes upon binding their target. The most commonly observed consensus sequence of a single finger is Cys-X2-4-Cys-X3-Phe-X3-Leu-X2-His-X3-His, where X denotes any amino acid (Newton et al. 2000).
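A consensus of this kind translates directly into a pattern search over a one-letter protein sequence. The sketch below is a minimal illustration that transcribes the consensus exactly as quoted in the text into a Python regular expression; the function name `find_fingers` and the test peptide are hypothetical, and the peptide is synthetic, not a real protein sequence.

```python
import re

# Cys-X2-4-Cys-X3-Phe-X3-Leu-X2-His-X3-His in one-letter code,
# with '.' standing for any residue (X).
C2H2_CONSENSUS = re.compile(r"C.{2,4}C.{3}F.{3}L.{2}H.{3}H")


def find_fingers(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched segment) for each non-overlapping match."""
    return [(m.start(), m.group())
            for m in C2H2_CONSENSUS.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic peptide built to satisfy the pattern (illustration only).
    peptide = "MKCAACAAAFAAALAAHAAAHGG"
    print(find_fingers(peptide))
```

Because `finditer` reports non-overlapping matches only, closely spaced candidate fingers that share residues would need a lookahead-based search instead.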
Proteins containing zinc-coordinating motifs have more than one such motif, arranged in tandem and connected by short oligopeptides; each motif is specific for a certain nucleotide sequence in the target DNA molecule, and together they wrap around the DNA in a spiral manner. Each motif binds three adjacent nucleotides by inserting its alpha helix into the major groove (Krishna et al. 2003). Hormone receptor proteins, which bind their appropriate ligand and translocate from the cytoplasm to the nucleus to bind DNA and regulate transcription, contain one of the important classes of zinc-coordinating motif (Leon et al. 2000; Krishna et al. 2003). This class is characterized by two antiparallel alpha helices capped by loops at their amino-terminal ends. A single zinc ion is coordinated in each helix-loop pair by four conserved cysteine (C4) residues (Krishna et al. 2003). This class of motif interacts with DNA through one alpha helix, which contacts the DNA major groove at the binding site, while the other alpha helix and the loops contact the DNA backbone (Luscombe et al. 2000). The DNA-binding motif can also form homo- or heterodimers through the loops leading into the second alpha helix.
Galactose-induced gene transcription factor proteins are among the zinc-coordinating proteins that contain two coordinated zinc ions in each DNA-binding domain (Mc-Pherson et al. 2006). This zinc-coordinating motif has a pair of alpha helices, one of which contacts the DNA major groove at the binding site (Mangelsdorf and Evans 1995) while the other makes the backbone interaction. The two zinc ions are coordinated by six cysteine residues, two of which are shared by both zinc ions (Chung et al. 2013). The loop-sheet-helix class of zinc-coordinating DNA-binding motif consists of a loop leading out of the main body of the protein, connected to a small sheet and an alpha helix. Another loop, which leads back into the protein, is also connected to the alpha helix.
The motif binds in the DNA major groove, and the loop in the minor groove, at the DNA-binding site (Joerger et al. 2007). However, binding to the minor groove does not confer specificity (Luscombe et al. 2000). The zinc ion is coordinated by three cysteine residues and a histidine residue in the two loop regions (Luscombe et al. 2000). This motif is characterized in the DNA-binding domain of the p53 transcription activator protein (Joerger et al. 2005; Modhumalav et al. 2009).
Beta sheet mediated DNA interacting motifs: Beta-sheet-mediated DNA-interacting motifs use beta-strand structures to recognize and bind the DNA-binding site, although they are less common than alpha-helix-containing motifs. Within this group, the TATA-box-binding protein family is characterized by the use of a large beta sheet to recognize the DNA sequence and bind the minor groove at the binding site (Patikoglou et al. 1999). For the ten-stranded antiparallel beta sheet to bind the minor groove, the DNA at the binding site undergoes conformational changes such as unwinding and bending, which make contact between the protein and the minor groove possible (Lebrun et al. 1997). Smaller two- or three-stranded beta sheets or hairpin motifs have also been observed to bind either the major or the minor groove of the DNA (). Proteins with this class of beta-sheet-containing DNA-interacting motif, such as the MetJ repressor, Arc repressor and T-domain families, are very diverse in function.
Although the overall structures of these proteins differ, there are common themes in the use of the binding motif (Luscombe et al. 2000).
Loop mediated DNA interacting motifs: In loop-mediated DNA-interacting motifs, DNA recognition and binding are carried out by intervening loops (Rohs et al. 2010; Luscombe et al. 2000). Proteins containing loop-mediated DNA-interacting motifs are structurally very diverse outside the DNA-binding motif. The immunoglobulin-like loop-mediated DNA-interacting domain is one of the common structural domains in transcription factor families such as the NFkB, STAT, p53, RUNX and TBX families, which participate in various processes such as immunity, cell cycle and apoptosis, and development and pattern formation. The immunoglobulin-like domain consists of four beta strands embedded in an antiparallel curled beta-sheet sandwich, with a total of three to five additional strands, and binds DNA in the major groove (Kuriyan et al. 1995; Rohs et al. 2010; Pourhassan-Moghaddam et al.