Biochemistry Department, School of Life Sciences, University of Sussex, Falmer, Brighton, Sussex BN1 9QG, UK
(Correspondence should be addressed to M Wallis; Email: m.wallis{at}sussex.ac.uk)
| Abstract |
|---|
| Introduction |
|---|
The recently available genome sequences for a variety of mammals allow the sequences of specific genes to be derived for various species. This approach has recently been used to derive the sequence of elephant GH, confirming that this is strongly conserved (Wallis 2008). Here, this comparative genomics approach is extended to the study of elephant prolactin and that of two other members of the mammalian group Afrotheria (to which elephant belongs) for which genomic data are available (hyrax, Procavia capensis and tenrec, Echinops telfairi). Molecular studies on mammalian phylogeny suggest that the Afrotheria diverged early during evolution of Eutheria (Murphy et al. 2004). The approach taken has established that: 1) these Afrotherian species do not appear to have families of prolactin-like genes – if such additional genes do exist in any of these species they must be considerably more different from pituitary prolactin than is the case for the prolactin-like genes seen in rodents or ruminants; 2) tenrec prolactin is strongly conserved but the evolution of elephant and hyrax prolactins has been very rapid, probably driven by positive selection. The evolution of prolactin in Afrotheria thus shows a pattern of episodes of rapid change imposed on a slow basal evolutionary rate, as has previously been seen for other mammalian prolactins, and indeed many other mammalian protein hormones (Wallis 2000, 2001). However, unlike the episodes of accelerated evolution of prolactin seen in rodents, ruminants and primates, the bursts of rapid change seen in Afrotheria are not followed by duplications of the prolactin or GH genes leading to families of prolactin- or GH-like genes expressed in the placenta.
| Materials and Methods |
|---|
Genomic sequence data were obtained from the Ensembl (release 50; http://www.ensembl.org; Hubbard et al. 2007) or NCBI (http://www.ncbi.nlm.nih.gov/traces) websites, separately for each of the species studied, by searching Ensembl assemblies and/or the NCBI Trace Archive, using the BLAST or BLAT search methods (Altschul et al. 1990). Table 1 summarizes the data extracted for Afrotherian species for which a substantial amount of genomic sequence is available, together with dog and opossum, used as out-groups. The prolactin gene sequences derived for the species studied here are given at the following website: http://www.lifesci.sussex.ac.uk/home/Mike_Wallis/Prolactin/Afro-PRL/.
The reliability of the available sequence data was assessed in two ways, using the approach taken previously (Wallis 2008). First, examination of traces allowed direct identification of difficult regions; in a few cases, this led to changes being made in the sequence derived from the ensembl database. In many cases, regions of sequence corresponding to the prolactin gene were covered by two or more sequence runs, often on both strands, but this was not always the case. In general, the traces indicated that most of the data are of high quality – exceptions are noted below.
The second criterion used to judge the reliability of the data was comparison of sequences obtained from the genomic projects with those available previously (usually protein or cDNA sequence). This was possible for elephant, dog and opossum prolactin sequences. Discrepancies were found in each of these cases and are discussed in detail below.
Sequence analysis
Prolactin gene and protein sequences were aligned with those available for other mammalian prolactins using the ClustalW programme (Higgins & Sharp 1988), followed by manual adjustment where necessary. A putative sequence for the prolactin of the ancestral placental mammal (a.p.m.) was derived from this (Wallis 2000). Since all orders of placental mammals arose at approximately the same point in evolutionary time, this was constructed as a consensus sequence, with each eutherian order being given equal weight, and the marsupial sequences being used as out-groups to resolve ambiguities. Nonsynonymous (dN) and synonymous (dS) substitution rates in coding sequences were determined using the codeml programme in the PAML (phylogenetic analysis by maximum likelihood) package of Yang (2007) with a defined tree based on that in Murphy et al. (2004). Evolutionary rates for DNA sequences (corrected substitution rates) were determined using the baseml programme in the same package, with the HKY85 model. Phylogenetic trees were constructed using data from codeml and the programme TREEVIEW (Page 1996). 3D structures were assessed using the molecular modelling programme RasMol. Accessibility of residues within the 3D structure was determined using the programme NACCESS (Hubbard & Thornton 1993).
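The equal-weight consensus described above can be illustrated with a minimal Python sketch. This is not the procedure actually used to derive the a.p.m. sequence; the function name, the tie-breaking rule and the toy two-residue alignment are assumptions introduced purely for illustration. Each order contributes a total weight of 1 at every alignment column, however many species it contains, and the out-group is consulted only when two residues are equally supported.

```python
from collections import Counter


def order_weighted_consensus(alignment_by_order, outgroup=(), ambiguous="X"):
    """Consensus of aligned sequences in which every order has total weight 1.

    alignment_by_order maps an order name to a list of aligned sequences.
    outgroup is an optional list of aligned sequences used only to break
    ties between equally supported residues (hypothetical rule).
    """
    ingroup = [s for seqs in alignment_by_order.values() for s in seqs]
    outgroup = list(outgroup)
    if not ingroup:
        raise ValueError("at least one in-group sequence is required")
    if len({len(s) for s in ingroup + outgroup}) != 1:
        raise ValueError("all sequences must have the same aligned length")

    consensus = []
    for i in range(len(ingroup[0])):
        votes = Counter()
        for seqs in alignment_by_order.values():
            for s in seqs:
                votes[s[i]] += 1.0 / len(seqs)  # votes within an order sum to 1
        best = max(votes.values())
        tied = sorted(r for r, v in votes.items() if best - v < 1e-9)
        if len(tied) > 1 and outgroup:
            out_votes = Counter(s[i] for s in outgroup)
            top = max(out_votes[r] for r in tied)
            if top > 0:
                tied = [r for r in tied if out_votes[r] == top]
        # Unresolved ties are reported as ambiguous rather than guessed.
        consensus.append(tied[0] if len(tied) == 1 else ambiguous)
    return "".join(consensus)


# Toy alignment: order A has three species, orders B and C one each.
example = {"Order A": ["MK", "MK", "MR"], "Order B": ["LR"], "Order C": ["LK"]}
print(order_weighted_consensus(example))  # LK
```

In the toy example a simple majority over species would give M at the first position (three of five sequences), whereas weighting by order gives L (two of three orders), which is the point of giving each order equal weight.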
| Results and discussion |
|---|
The genomic data analysed are summarized in Table 1, and the prolactin gene sequences derived here are available on the website referred to above. Prolactin gene sequences were derived for two Afrotherian species for which no data were previously available: P. capensis (cape rock hyrax) and E. telfairi (tenrec; small Madagascar hedgehog). The prolactin gene sequence was also obtained for Loxodonta africana (African elephant), for which the protein sequence has been described previously (Li et al. 1989). The prolactin gene sequences of dog (Canis familiaris) and a marsupial, Monodelphis domestica (grey short-tailed opossum) were also obtained and used as out-groups. In each case, the gene sequence was assessed on the basis of the known organization of the human prolactin gene (Truong et al. 1984), and shown to accord with the five-exon structure of this gene. The alternative 5' untranslated exon found about 5800 nucleotides upstream of the pituitary start site in the human prolactin gene (DiMattia et al. 1990, Berwaer et al. 1994) was not detected in the species studied here.
The elephant, dog and opossum prolactin gene sequences can be checked against data previously available, providing a basis for assessing the reliability of the approach used.
The sequence of elephant prolactin derived from the genomic data differed significantly from the previously published protein sequence (Li et al. 1989). These differences are discussed in detail below; there seems to be little discrepancy in terms of the original data. A cDNA sequence for dog prolactin is available in the NCBI/EMBL/DDBJ database (AY741405), but this sequence differs substantially from that derived from the genomic sequence. The prolactin gene sequence derived from the genome is of high quality, with all regions covered by multiple traces, covering both DNA strands. It has been used previously (Wallis et al. 2005, Alam et al. 2006); dog prolactin is very similar to pig prolactin. No indication was found of more than one prolactin-like gene in the dog genome, on the basis of extensive BLAT searches using dog, human and pig prolactin coding sequence. This does not completely exclude the possibility of additional prolactin-like genes, but does imply that if they are present they must differ very substantially from the identified gene. The approach taken provides no information about possible splice variants. A cDNA sequence for opossum prolactin is available in the NCBI/EMBL/DDBJ database (AF067726). This differs from the sequence derived from the genome at 12 nt, giving seven differences at the protein level. The genome-derived sequence was of high quality, with all these sites covered by 8–12 traces, on both DNA strands. No indication was found of additional prolactin-like genes in the opossum genome. The derived coding sequence was similar to that reported for prolactin from two other marsupials – brushtail possum (Trichosurus vulpecula) and brown bandicoot (Isoodon macrourus; Curlewis et al. 1998, Veitch et al. 2006).
For each of the Afrotherian prolactin gene sequences, analysis of traces indicated some potential polymorphism, which is indicated on the sequences on the website. For elephant and tenrec, this would not affect the derived amino acid sequence, but for hyrax two potential polymorphisms could alter the protein sequence. Although polymorphism seems the most likely explanation for the variation observed, occurrence of more than one very similar gene following gene duplication cannot be ruled out. But for none of these species was there any evidence for additional genes encoding distinctly different prolactin-like proteins.
Novel Afrotherian prolactin sequences
The derived amino acid sequences of elephant, hyrax and tenrec prolactins are included in the alignment given in Fig. 1.
There is thus little disagreement between the data provided by Li et al. (1989) and the elephant prolactin sequence derived from the genome. Detailed examination of the sequence traces indicates that the data for the elephant prolactin coding sequence (CDS) are of high quality, with several sequence runs covering each exon, on both strands, with no discrepancies. The genomic sequence also provides the signal peptide sequence. Despite the differences between the sequence described previously and that obtained from the genome, the conclusion that the elephant prolactin sequence diverges considerably from that derived for the ancestral placental mammal (a.p.m.) is unchanged – elephant and a.p.m. sequences differ at 49 residues (Fig. 1).
The sequence derived for hyrax prolactin is even more different from that of the a.p.m. than is that of elephant – differing at 94 residues, 47% of all sites. On the other hand, the sequence derived for tenrec prolactin is conserved, differing from that of the a.p.m. at only eight sites (Fig. 1). Thus, the very variable evolutionary rates previously identified for prolactin across the mammals (Wallis 1981, 2000) also apply within the Afrotheria.
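Counts of this kind are obtained by comparing two aligned sequences column by column. The short Python sketch below shows one way to do so; the function name, the treatment of gaps (columns with a gap in either sequence are skipped) and the figure of 199 aligned residues used in the arithmetic check are assumptions made for illustration, not values taken from the analysis itself.

```python
def count_differences(seq_a, seq_b, gap="-"):
    """Return (differing, compared) for two aligned sequences.

    Columns where either sequence has a gap are ignored, which is an
    assumption of this sketch rather than a documented rule.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same aligned length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing, compared


# Toy example: one difference over four comparable columns.
print(count_differences("ACDE-", "ACEEF"))  # (1, 4)

# Arithmetic check of the hyrax figure, assuming 199 comparable residues.
print(round(100 * 94 / 199))  # 47
```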
The sequence of dog prolactin is very similar to that predicted for the a.p.m., and to other conserved prolactin sequences, such as those of pig and horse (Fig. 1). It was used here as a reference/out-group sequence for evaluation of the Afrotherian sequences, as a conserved prolactin from a mammal with a well-characterized genomic sequence. The sequence of opossum prolactin is also quite strongly conserved, differing little from the a.p.m., providing a suitable out-group for eutherian prolactins.
A phylogenetic tree based on the sequences given in Fig. 1 (not shown, but similar to the tree derived for nonsynonymous sites in the coding sequence, see below) confirmed the slow basal rate of evolution for mammalian prolactins, and emphasized the remarkable variability within Afrotheria.
Adaptive evolution of prolactin in Afrotheria
Episodes of rapid change of the sort identified in lineages leading to elephant and hyrax prolactins can result from two processes: 1) loss of function, so that neutral substitutions accumulate rapidly because the normal purifying selection is relaxed and 2) functional change with the rapid accumulation of substitutions due to adaptive selection. An effective way of distinguishing between these is to compare rates of nonsynonymous (dN) and synonymous (dS) substitutions in coding sequences – substitutions that respectively do and do not change the amino acid sequence (Yang & Bielawski 2000). If the sequence of a protein is being maintained by purifying selection then dN/dS will be <<1.0. In cases where function is lost completely, so that the sequence is evolving by accumulation of neutral mutations, dN/dS=1.0. Where a change in rate is due to adaptive change dN/dS will increase; if dN/dS is significantly greater than 1.0, then adaptive selection is confirmed, though a lower value does not necessarily exclude this explanation.
dN and dS values were determined using an alignment of mammalian prolactin coding sequences (excluding signal sequences) and the programme codeml, model 1 (separate ratio for each branch). The values obtained were used to construct the phylogenetic trees shown in Fig. 2. As expected, rates of evolution based on dS are relatively uniform, but those based on dN values are very variable, reflecting particularly the episodes of rapid change identified for protein sequences. The episodes of rapid change on the lineages leading to elephant and hyrax contrast with the slow evolution seen for tenrec. dN/dS ratios for branches leading to elephant (0.99) and hyrax (1.13) are elevated substantially compared with that on the branch leading to tenrec (0.046). However, use of model 2 in codeml (2 dN/dS ratios for branches) showed that the elevated ratio for hyrax was not significantly greater than 1.0.
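Whether a branch-specific dN/dS ratio is significantly greater than 1.0 is commonly judged with a likelihood ratio test that compares a model in which the ratio for the branch of interest is estimated freely with one in which it is fixed at 1.0. The text does not give the details of the comparison made here, so the Python sketch below is only an assumed illustration of such a test; the function name and the log-likelihood values are hypothetical and are not output from codeml. For one degree of freedom the chi-square tail probability has the closed form erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    lnl_null: log-likelihood with the branch ratio fixed (e.g. at 1.0).
    lnl_alt:  log-likelihood with the branch ratio estimated freely.
    Returns (statistic, p_value) using the chi-square distribution, 1 d.f.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, chosen only to show the calculation.
stat, p = lrt_one_df(lnl_null=-2450.80, lnl_alt=-2450.10)
print(f"2*delta(lnL) = {stat:.2f}, P = {p:.3f}")
```

A small difference in log-likelihood, as in this made-up example, gives a non-significant result, which corresponds to the situation in which an estimated ratio above 1.0 cannot be distinguished statistically from 1.0.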
Varying evolutionary rates within Afrotherian prolactin genes
A disadvantage of the use of dN/dS ratios is that dS may be correlated with dN (Mouchiroud et al. 1995). In this case, the elevation of dN may be accompanied by elevation of dS, so that dS is not really a suitable reference. An alternative approach is to compare the rate of evolution of exons and introns within a gene. Normally, exons (constrained by purifying selection) evolve more slowly than introns (little if at all constrained); if the exon rate increases to a level above that of adjacent introns, then adaptive change is indicated. Accordingly, rates of evolution for exons and 300 bases of adjacent introns (a limitation necessary because not all intron sequences were complete) were determined using the programme baseml (Yang 1994) and alignments of human, dog, tenrec, elephant and hyrax DNA (opossum could not be used as an out-group because of problems aligning introns). Results are shown in Fig. 3. The pattern for dog and tenrec prolactin genes is characteristic of fairly slowly evolving genes (Graur & Li 2000), with in every case the evolutionary rate for an exon being much less than that of the adjacent intron. For hyrax the situation is reversed; the mean rate for exons is 2.3 times that for introns (P=0.002, Student's t-test), again providing strong support for adaptive evolution. For elephant the mean rate for exons is 1.49 times that for introns, but the difference is not significant.
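The exon versus intron comparison can be sketched in Python as follows. The rate values are invented for illustration and are not the baseml estimates plotted in Fig. 3, and an unpaired, pooled-variance form of Student's t-test is assumed because the text does not say which form was applied. The sketch returns the t statistic and its degrees of freedom; converting these to a P value requires the t distribution, which is not in the Python standard library.

```python
import math
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Equal variances are assumed.
    """
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(
        pooled_var * (1.0 / na + 1.0 / nb))
    return t, df


# Hypothetical substitution rates (arbitrary units) for one lineage.
exon_rates = [0.30, 0.42, 0.35, 0.38, 0.33]
intron_rates = [0.15, 0.18, 0.14, 0.17]

ratio = mean(exon_rates) / mean(intron_rates)
t, df = pooled_t(exon_rates, intron_rates)
print(f"exon/intron rate ratio = {ratio:.2f}, t = {t:.2f}, d.f. = {df}")
```

A ratio above 1 together with a large positive t would correspond to the reversed pattern described for hyrax; a ratio well below 1 corresponds to the usual pattern of constrained exons.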
The 3D structure of human prolactin has been determined using NMR spectroscopy (Keeler et al. 2003, Teilum et al. 2005). The distribution of the sites that change along the branches leading to hyrax, elephant and tenrec was explored (Table 2). There is a tendency for residues in the central core (identified as residues with solvent accessibility less than 10%) and receptor binding sites (as given by Teilum et al. 2005) to be relatively conserved, but only in the case of elephant is this significant (P=0.015, Fisher's exact test, Zhang et al. 1998, with application of Bonferroni correction). Overall, there is no strikingly non-random distribution of substitutions of the sort observed for GH (Wallis 2008).
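A test of this kind operates on a 2×2 table: residues in a structural class (for example core or receptor-binding) versus all other residues, cross-classified by whether or not they changed on a given branch. The Python sketch below implements a two-sided Fisher's exact test and a Bonferroni adjustment using only the standard library. The counts in the example and the number of tests used for the correction are hypothetical and are not the values in Table 2.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts: rows = (core/binding, other), columns = (changed, unchanged).
p_raw = fisher_exact_two_sided(5, 55, 44, 95)
print(f"raw P = {p_raw:.4f}, adjusted for 3 tests = {bonferroni(p_raw, 3):.4f}")
```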
Previous studies have established that prolactin, like several other mammalian polypeptide hormones, shows an episodic pattern of evolution. This work has established that this is true also of three orders of Afrotherian mammals. The rapid evolution on the lineage leading to elephant has been confirmed, and prolactin of the tenrec is strongly conserved, but prolactin of the hyrax has evolved at a remarkably accelerated rate. In the case of the hyrax, study of dN/dS ratios, and the observation that exons are evolving faster than introns, strongly indicate that the accelerated evolution was driven by adaptive selection. For the elephant the evidence is less clear-cut, but the fact that elephant prolactin is produced in high yield in the pituitary gland, and retains biological activity (Li et al. 1987), militates against loss of function as the explanation of accelerated evolution.
If the bursts of rapid evolution seen in these hormones are driven by adaptive selection, what is the nature of the functional changes involved? Too little is known about the biological properties and functions of prolactin in Afrotheria to allow firm conclusions to be drawn, but it could result from the process of function switching (Wallis 1997, Forsyth & Wallis 2002). If prolactin acquired a second function, the importance of which fluctuated over time, each switch (fluctuation) would lead to adaptation and additional substitutions. Repeated fluctuations could lead to substantial sequence change with relatively little change in function. Another explanation for rapid change could be maternal-foetal competition (Haig 2008), although for this to be fully effective gene imprinting may be necessary, about which no information is available for Afrotherian prolactins.
In other groups where prolactin evolution shows marked acceleration, gene duplication is also found. In ruminants and rodents, multiple prolactin gene duplications have given rise to extensive families of prolactin-related genes, expressed mainly in the placenta, and with diverse functions (Wallis 1992, Takahashi 2006, Ushizawa & Hashizume 2006, Soares et al. 2007). In primates, the prolactin gene is not duplicated, but there is duplication of the GH gene, and GH in at least some primates possesses lactogenic activity. In the case of Afrotheria, careful searching of the genomic databases and Trace Archives has revealed no evidence for additional genes encoding prolactin-like proteins. Similarly, no evidence for duplication of the GH gene in elephant or hyrax was obtained (Wallis 2008). This may be because such genes are too diverged from prolactin (or GH) itself to allow detection, but if this is the case the divergence must be very substantial. Searching bovine, rat and mouse Trace Archives using homologous or heterologous prolactin CDS sequences revealed multiple prolactin-like genes, clearly reflecting the additional prolactin-like genes in these species, so if additional prolactin-like genes do exist in tenrec, elephant or hyrax they must be more divergent than is the case for the ruminant or rodent prolactin gene families. It thus seems likely that the rapid evolution of prolactin in lineages leading to elephant and hyrax is not associated with duplication of prolactin or GH genes in these species. It should be noted, however, that the duplications that gave rise to families of prolactin-like genes in ruminants and rodents appear to have occurred after the episodes of rapid evolution in these groups. In Afrotheria, therefore, the mechanisms leading to rapid evolution of prolactin may have been similar to those occurring in ruminants and rodents, but without subsequent duplications of the prolactin gene, at least in the species considered here.
| Declaration of interest |
|---|
| Funding |
|---|
| References |
|---|
Altschul SF, Gish W, Miller W, Myers EW & Lipman DJ 1990 Basic local alignment search tool. Journal of Molecular Biology 215 403–410.
Berwaer M, Martial JA & Davis JRE 1994 Characterization of an up-stream promoter directing extrapituitary expression of the human prolactin gene. Molecular Endocrinology 8 635–642.
Chen EY, Liao Y-C, Smith DH, Barrera-Saldaña HA, Gelinas RE & Seeburg PH 1989 The human growth hormone locus: nucleotide sequence, biology, and evolution. Genomics 4 479–497.
Curlewis JD, Saunders MC, Kuang J, Harrison GA & Cooper DW 1998 Cloning and sequence analysis of a pituitary prolactin cDNA from the brushtail possum (Trichosurus vulpecula). General and Comparative Endocrinology 111 61–67.
DiMattia GE, Gellersen B, Duckworth ML & Friesen HG 1990 Human prolactin gene expression. The use of an alternative noncoding exon in decidua and the IM-9-P3 lymphoblast cell line. Journal of Biological Chemistry 265 16412–16421.
Forsyth IA & Wallis M 2002 Growth hormone and prolactin – molecular and functional evolution. Journal of Mammary Gland Biology and Neoplasia 7 291–312.
González Alvarez R, Revol de Mendoza A, Esquivel Escobedo D, Corrales Félix G, Rodríguez Sánchez I, González V, Dávila G, Cao Q, de Jong P, Fu Y-X et al. 2006 Growth hormone locus expands and diverges after the separation of new and old world monkeys. Gene 380 38–45.
Graur D & Li W-H 2000 Fundamentals of Molecular Evolution. Sunderland, MA: Sinauer.
Haig D 2008 Placental growth hormone-related proteins and prolactin-related proteins. Placenta 22 S36–S41.
Higgins DG & Sharp PM 1988 CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73 237–244.
Hubbard SJ & Thornton JM 1993 NACCESS, Computer Program, Department of Biochemistry and Molecular Biology, University College London.
Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F & Cutts T 2007 Ensembl 2007. Nucleic Acids Research 35 D610–D617.
Hulmes JD, Miedel MC, Li CH & Pan YCE 1989 Primary structure of elephant growth hormone. International Journal of Peptide and Protein Research 33 368–372.
Keeler C, Dannies PS & Hodsdon ME 2003 The tertiary structure and backbone dynamics of human prolactin. Journal of Molecular Biology 328 1105–1121.
Li CH, Chung D, Bewley TA & Cabrera CM 1987 Elephant prolactin: isolation and characterization. International Journal of Peptide and Protein Research 29 472–477.
Li CH, Oosthuizen MMJ & Chung D 1989 Primary structure of elephant pituitary prolactin. International Journal of Peptide and Protein Research 33 67–69.
Li Y, Wallis M & Zhang Y-P 2005a Episodic evolution of prolactin receptor gene in mammals: coevolution with its ligand. Journal of Molecular Endocrinology 35 411–419.
Li Y, Ye C, Shi P, Zou X-J, Xiao R, Gong Y-Y & Zhang Y-P 2005b Independent origin of the growth hormone gene family in New World monkeys and Old World monkeys and hominoids. Journal of Molecular Endocrinology 35 399–409.
Mouchiroud D, Gautier C & Bernardi G 1995 Frequencies of synonymous substitutions in mammals are gene-specific and correlated with frequencies of nonsynonymous substitutions. Journal of Molecular Evolution 40 107–113.
Murphy WJ, Pevzner PA & O'Brien SJ 2004 Mammalian phylogenomics comes of age. Trends in Genetics 20 631–639.
Ohta T 1993 Pattern of nucleotide substitutions in growth hormone-prolactin gene family: a paradigm for evolution by gene duplication. Genetics 134 1271–1276.
Page RDM 1996 TREEVIEW: an application to display phylogenetic trees on personal computers. Computer Applications in the Biosciences 12 357–358.
Soares MJ, Konno T & Alam SMK 2007 The prolactin family: effectors of pregnancy-dependent adaptations. Trends in Endocrinology and Metabolism 18 114–121.
Takahashi T 2006 Biology of the prolactin family in bovine placenta. I. Bovine placental lactogen: expression, structure and proposed roles. Animal Science Journal 77 10–17.
Teilum K, Hoch JC, Goffin V, Kinet S, Martial JA & Kragelund BB 2005 Solution structure of human prolactin. Journal of Molecular Biology 351 810–823.
Truong AT, Duez C, Belayew A, Renard A, Pictet R, Bell GI & Martial JA 1984 Isolation and characterization of the human prolactin gene. EMBO Journal 3 429–437.
Ushizawa K & Hashizume K 2006 Biology of the prolactin family in bovine placenta. II. Bovine prolactin-related proteins: their expression, structure and proposed roles. Animal Science Journal 77 18–27.
Veitch C, Gemmell RT & Curlewis JD 2006 Cloning and sequence analysis of a pituitary prolactin cDNA from the northern brown bandicoot (Isoodon macrourus). General and Comparative Endocrinology 146 304–309.
Wallis M 1981 The molecular evolution of pituitary growth hormone, prolactin and placental lactogen: a protein family showing variable rates of evolution. Journal of Molecular Evolution 17 10–18.
Wallis M 1992 The expanding growth hormone/prolactin family. Journal of Molecular Endocrinology 9 185–188.
Wallis M 1997 Function switching as a basis for bursts of rapid change during the evolution of pituitary growth hormone. Journal of Molecular Evolution 44 348–350.
Wallis M 2000 Episodic evolution of protein hormones: molecular evolution of pituitary prolactin. Journal of Molecular Evolution 50 465–473.
Wallis M 2001 Episodic evolution of protein hormones in mammals. Journal of Molecular Evolution 53 10–18.
Wallis M 2008 Mammalian genome projects reveal new growth hormone (GH) sequences. Characterization of the GH-encoding genes of armadillo (Dasypus novemcinctus), hedgehog (Erinaceus europaeus), bat (Myotis lucifugus), hyrax (Procavia capensis), shrew (Sorex araneus), ground squirrel (Spermophilus tridecemlineatus), elephant (Loxodonta africana), cat (Felis catus) and opossum (Monodelphis domestica). General and Comparative Endocrinology 155 271–279.
Wallis OC & Wallis M 2002 Characterisation of the GH gene cluster in a new-world monkey, the marmoset (Callithrix jacchus). Journal of Molecular Endocrinology 29 89–97.
Wallis OC & Wallis M 2006 Evolution of growth hormone in primates: the GH gene clusters of the New World monkeys marmoset (Callithrix jacchus) and white-fronted capuchin (Cebus albifrons). Journal of Molecular Evolution 63 591–601.
Wallis OC, Mac-Kwashie AO, Makri G & Wallis M 2005 Molecular evolution of prolactin in primates. Journal of Molecular Evolution 60 606–614.
Yang Z 1994 Estimating the pattern of nucleotide substitution. Journal of Molecular Evolution 39 105–111.
Yang Z 2007 PAML4: phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24 1586–1591.
Yang Z & Bielawski JP 2000 Statistical methods for detecting molecular adaptation. Trends in Ecology and Evolution 15 496–503.
Yang Z & Nielsen R 2002 Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Molecular Biology and Evolution 19 908–917.
Zhang J, Rosenberg HF & Nei M 1998 Positive Darwinian selection after gene duplication in primate ribonuclease genes. PNAS 95 3708–3713.
Received in final form 12 November 2008
Accepted 14 November 2008
Made available online as an Accepted Preprint 17 November 2008