As increase in AT% in mtDNA and/or recognition

As the TAR10 triplet (principally TAG) is present in more than 50% of the trn genes for all taxa and genomic systems combined, it could already have been present in the trn genes of the Last Universal Common Ancestor (LUCA), which is estimated to have lived some 3.5 to 3.8 billion years ago (Dodd et al., 2017). It is probably an ancestral character which was present in proto-trn sequences. As the percentage of TAA10 strongly increases in trn genes of organelles, one can ask whether this character was already present in their bacterial ancestor. It is now assumed that, despite their diversity, all mitochondria derive from an endosymbiotic α-proteobacterium which was integrated into a host cell related to Asgard Archaea approximately 1.5-2 billion years ago (Roger et al., 2017). However, the earliest fossils possessing features typical of fungi date to 2.4 billion years ago (Bengtson et al., 2017). Moreover, eukaryotic cells would be chimeras constituted of an archaebacterium and one or more Eubacteria (Margulis et al., 2000). In addition, all current models for the origin of eukaryotes have mitochondria in the eukaryote common ancestor.


So, as the level of TAA10 is very low in trn genes of α-proteobacteria, it could therefore be a derived trait that may be related to the increase in AT% in mtDNA and/or to recognition constraints by mt-aaRSs. Similarly, it is generally accepted that all chloroplasts and their derivatives are derived from a single cyanobacterial ancestor (Lewis, 2017), and in current cyanobacteria the percentages of TAG10 and TAA10 triplets are 62.5 and 3.6, respectively. The increase in the percentage of TAA10 therefore appears to be a characteristic of organelles.
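As a rough illustration of how such percentages can be tallied, the following Python sketch counts TAG10 and TAA10 in a set of trn gene sequences. It assumes that TAR10 denotes a TAA or TAG triplet occupying tRNA positions 8 to 10, and that each sequence starts at tRNA position 1 with no indel in the first ten nts. The function names and the four toy sequences are hypothetical and are not data from this study.

```python
# Triplets covered by the "TAR" pattern (R = purine, i.e. A or G).
TAR = {"TAA", "TAG"}


def tar10_triplet(trn_gene):
    """Return the triplet at positions 8-10 if it is TAA or TAG, else None.

    Assumption: index 0 of the string is tRNA position 1.
    """
    triplet = trn_gene[7:10].upper().replace("U", "T")
    return triplet if triplet in TAR else None


def tar10_percentages(genes):
    """Percentage of genes carrying TAG10 and TAA10, returned as a dict."""
    counts = {"TAG": 0, "TAA": 0}
    for gene in genes:
        triplet = tar10_triplet(gene)
        if triplet is not None:
            counts[triplet] += 1
    total = len(genes)
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: 100.0 * value / total for key, value in counts.items()}


# Hypothetical 5'-ends of four trn genes (illustrative only).
toy_genes = [
    "GGGCTTATAGCTCAG",  # TAG at positions 8-10
    "GCCGATTTAGCTCAA",  # TAG at positions 8-10
    "GTCTAAATAAGTTTA",  # TAA at positions 8-10
    "GGAGAGATGGCTGAG",  # TGG at positions 8-10, not counted
]
print(tar10_percentages(toy_genes))  # {'TAG': 50.0, 'TAA': 25.0}
```

In real data the position of nt 10 would have to come from a structural alignment of each tRNA rather than from a fixed string offset.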

In all the taxa, for all tRNA types combined, the ATR49 triplets are always present in smaller numbers than TAR10. Moreover, their numbers are negligible except in organelle genomes, mainly those of mitochondria. The low level of ATR49 triplets in Pseudocoelomata is due to the frequent absence of the T-arm in their mt-tRNAs. In mitochondria, the frequency of ATG49 is higher than that of ATA49 in some taxa, while in others it is the other way around. This variability is not surprising, given the roughly 2 billion years of mtDNA evolution (Lang et al., 1999). It must be noted that the nt G is overrepresented at the 5′-end of the 5′-acceptor- and D-stems, quite often at the 5′-end of the T-stem but rarely at the equivalent position of the acceptor-stem.

In taxa where the percentage of ATA49 is higher than that of ATG49, G is most often not the majority nt at the 5′-end of the T-stem. Moreover, differences between the relative percentages of ATG49 and ATA49 could be due, at least in part, to variations in the AT% of organelle DNAs. The percentage of ATR49 is very low in α-proteobacteria, lower in this taxon than in Proteobacteria or Eubacteria as a whole, and it is also very low in cyanobacteria; the significant rate of ATR49 triplets would therefore seem to be a derived condition of organelle DNAs rather than a conserved primitive state lost in current prokaryotes. This trait probably appeared during the transition from endosymbiotic bacterium to permanent organelle, which entailed a massive number of evolutionary changes including genome reduction, endosymbiotic and lateral gene transfers, emergence of new genes and the retargeting of proteins (Roger et al., 2017). The timing of the mt-endosymbiosis and of the proto-mitochondria to mitochondria transition is uncertain, but one might trace the origin of the ATR49 triplets to between the first eukaryote common ancestor (FECA) and the last eukaryotic common ancestor (LECA). A second event, which occurred at least in the mitochondria of the ancestors of Opisthokonta (i.e., Metazoa and Fungi), would have led to a net increase in the number of ATR49 triplets. ATR49 means that the last two nts of the V-R are AT; it turns out that this mainly concerns the mt-trn genes whose V-R size is only 4 nts, which are almost exclusively present in the Fungi/Metazoa clade.

There are large differences in the frequencies of the TAR10 and ATR49 triplets depending on the type of trn gene (Table 2) and on the taxa (data not shown). The selective variations in some taxa suggest that the increase in frequency for some types of triplets would be much more recent than mentioned above; in addition, decreases are also observed. There are, however, very conservative trends such as the presence of ATR10 triplets in genes specifying tRNA-Ala. Analyses of mt-trn genes of Deuterostomia, for which a great number of sequences of each type are available (from 1085 to 1382), showed that only the tRNA-Cys and tRNA-Glu types have intermediate percentages of TAR10; in all other cases the values are extreme, with 9 and 10 types having values ranging from 0.4% to 9.8% or greater than 82.4%, respectively (Table 3). In contrast, half of the tRNA types have low percentages of the ATR49 triplet, and for only four types the percentages are ≥ 77.8%. There also seems to be a tendency for tRNA types with very high or very low percentages of TAR10 to most often have low levels of ATR49.

3.2 Examples of putative implications of TAR10 and ATR49 as stop and start codons

In order to investigate possible implications of TAR10 and ATR49 triplets in translation, searches were performed in GenBank using as keywords: “TAA stop codon is completed by the addition of 3′ A residues to the mRNA”, “alternative start codon” or “start codon not determined”, and “mitochondrion complete”. It was then checked whether, depending on the case, there was a trn gene downstream or upstream. If so, the TAR10 or ATR49 triplets were searched for, and the same investigation was then made in conspecific mt-genomes. Using this strategy, these triplets were only found in metazoan mtDNAs, in which overlapping mt-trn genes have long been known.

An example of the putative use of the TAR10 triplet as a stop codon is presented in Table 4; it concerns a subclass of parasitic flatworms (Eucestoda, Platyhelminthes). In their mt-genetic code only TAG and TAA are stop codons, which avoids possible bias due to the use of other types of termination codon. In 49 out of 66 complete mt-genomes, the first in-frame potential stop codon of the cox1 gene is in the downstream trnT gene, and in 21 cases this triplet is TAG10, which induces a 10 nt overlap between the cox1 and trnT genes. Authors who considered such a long overlap impossible have proposed a number of alternative options, favoring those that avoid any overlap (e.g., Le et al., 2002):

1/ cox1 might use an earlier, atypical stop codon.
2/ The 3′-end of the cox1 mRNA could have an abbreviated stop codon (U or UA instead of UAG10) upstream of the trnT gene, which is completed by polyadenylation.
3/ If, in the potential long transcript, cleavage occurred just after G10, the cox1 mRNA would end with the complete UAG10 as stop codon and the first ten nts of the trnT gene would be added by an editing process.
4/ The trnT gene would be shorter at its 5′-end, lacking nts 1 to 8 or 9; this has been proposed, e.g., for the mt-trnT of Cyclophyllidea (Echinococcus granulosus, Hymenolepis diminuta and Taenia crassiceps).

In that case, if the full stop codon is used, there is only a single nt (G10) overlap between cox1 and trnT. Moreover, if the end of the cox1 gene is at the level of T9, the stop codon would be completed by polyadenylation, whereas if the protein gene has a complete stop codon, the nt G (at position 10) would be added by editing. In the alternative structures, the D-arm is absent, whereas it is typical for this tRNA in digeneans and in other phyla.
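The 10 nt overlap discussed above follows directly from where the first in-frame stop codon falls relative to position 1 of the trnT gene. A minimal Python sketch of that arithmetic is given below, assuming TAA and TAG as the only stop codons as stated in the text. The function names and the toy sequence are hypothetical; the toy trnT 5′-end merely reuses the nts G1, G2, T7, T8, A9, G10, T11, T12 and A14 and is not a real sequence.

```python
# Stop codons of the mt-genetic code considered in the text.
STOPS = {"TAA", "TAG"}


def first_inframe_stop(seq, orf_start=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(orf_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            return i
    return None


def overlap_with_trn(seq, trn_start, orf_start=0):
    """Number of trn nts covered by the ORF, stop codon included.

    trn_start is the 0-based index of tRNA position 1 in seq.
    Returns 0 if the stop codon ends before the trn gene and
    None if no in-frame stop codon is found.
    """
    stop = first_inframe_stop(seq, orf_start)
    if stop is None:
        return None
    orf_end = stop + 3  # exclusive end of the stop codon
    return max(0, orf_end - trn_start)


# Hypothetical 3'-end of a cox1 ORF (11 nts, in frame from index 0)
# followed by a hypothetical trnT 5'-end with TAG at positions 8-10.
cox1_tail = "ATGGCTGGTTT"
trnT_head = "GGACTCTTAGTTCA"
toy = cox1_tail + trnT_head
trn_start = len(cox1_tail)

print(first_inframe_stop(toy))           # 18
print(overlap_with_trn(toy, trn_start))  # 10
```

With a stop codon ending at tRNA position 10, the overlap is 10 nts whatever the length of the upstream ORF; only the reading frame determines whether TAG10 is the codon that is actually met first.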

Moreover, mt-trnT genes from species of the order Cyclophyllidea for which the first potential stop codon is at different positions (upstream or downstream of the trnT gene, or within this gene, whether upstream of, downstream of or at TAG10) exhibit a similar secondary structure with a D-arm. In addition, the high level of nt conservation in the 5′-end of the trnT genes of Cestoda (i.e., G1, G2, T7, T8, A9, G10, T11, T12 and A14) strongly suggests that the 5′-acceptor-stem and the T-stem are under positive selection pressure. All this implies that the hypothesis of D-armless tRNAs is, in our view, not very probable.

Concerning the putative ATR49 start codon, the number of complete mt-genomes found in GenBank using the keywords previously mentioned was relatively low; moreover, in some cases the upstream gene encoded a protein or specified an rRNA, and/or there was only one mention for a given taxon. A significant example within Deuterostomia (frogs) is presented in Table 5. In the superfamily Hyloidea, the ATA49 triplet is frequently the first potential complete start codon at the level of the gene pair encoding NAD1 and specifying tRNA-Leu2, respectively. In two families (Bufonidae, Hylidae), for all the sequences (16 belonging to 14 different species), the first ATR triplet found in frame in the ORF of the nd1 gene is ATA49.
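The in-frame criterion used for the start codon can be sketched in the same way. The Python fragment below, given only as an illustration, returns the first ATR triplet (ATA or ATG) that shares the reading frame of a codon known to belong to the downstream gene; an ATR triplet lying in another frame is skipped. The function name, the `frame_anchor` parameter and both toy sequences are hypothetical.

```python
# Triplets covered by the "ATR" pattern (R = purine, i.e. A or G).
ATR = {"ATA", "ATG"}


def first_inframe_atr(seq, frame_anchor, search_from=0):
    """Return the 0-based index of the first in-frame ATR triplet, or None.

    frame_anchor is the index of the first nt of any codon known to be
    in the reading frame of the downstream gene; search_from is where
    the upstream scan begins (e.g. inside the trn gene).
    """
    # Smallest index >= search_from lying in the same frame as frame_anchor.
    first = search_from + (frame_anchor - search_from) % 3
    for i in range(first, len(seq) - 2, 3):
        if seq[i:i + 3] in ATR:
            return i
    return None


# Hypothetical case 1: ATA at index 2 is in frame with the codon at index 20.
in_frame = "GCATAGGTTCGACTCCTTCACTGCTT"
print(first_inframe_atr(in_frame, frame_anchor=20))   # 2

# Hypothetical case 2: ATA at index 3 is out of frame, so the scan
# continues until the ATG at index 20.
out_frame = "GGCATAGGTTCGACTCCTTCATGCTT"
print(first_inframe_atr(out_frame, frame_anchor=20))  # 20
```

This only tests frame compatibility; it says nothing about whether the triplet is actually used for initiation, which would require transcript or protein evidence.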

For four sequences belonging to three other frog families, the ATR49 triplet is missing from the trnL2 gene; moreover, an ATA triplet is entirely contained in the V-R of the trnL2 gene of Heleophryne regis, but it is not in frame with the following gene. For these last four cases, the authors of the sequences proposed alternative start codons; an alternative start codon seems obligatory there, but this has not been experimentally verified. According to several authors who have sequenced parts of mtDNAs of Hylidae, the nd1 gene would start at ATA49 in about 140 sequences (e.g., Roelants et al., 2005).

In the two studied taxa, BLAST analyses of the NCBI EST and SRA (Sequence Read Archive) databases were performed, but no result supporting the proposed hypotheses, that is to say a transcript starting at an ATR49 or terminating at a TAR10, could be found.

However, for each taxon, the percentage of transcripts of mt-origin is relatively low; moreover, when such transcripts are present, the number of fully matured ones is even lower.