Population structure of Citrus tristeza virus from field Argentinean isolates

We studied the genetic variability of three genomic regions (p23, p25 and p27 genes) from 11 field Citrus tristeza virus isolates from the two main citrus-growing areas of Argentina, a country where the most efficient vector of the virus, Toxoptera citricida, has been present for decades. The pathogenicity of the isolates was determined by biological indexing. Single-strand conformation polymorphism analysis showed that most isolates had high intra-isolate variability. Divergent sequence variants were detected in some isolates, suggesting re-infections of the field plants. Phylogenetic analysis of the predominant sequence variants of each isolate revealed similar grouping of isolates for genes p25 and p27. The analysis of p23 showed two groups that contained the severe isolates. Our results showed high intra-isolate sequence variability, suggesting that re-infections could contribute to the observed variability and that the host can play an important role in the selection of the sequence variants present in these isolates.


Introduction
Citrus tristeza virus (CTV) (Genus: Closterovirus, Family: Closteroviridae), one of the most important citrus pathogens worldwide, is endemic in Argentina and represents a major threat to the citrus production of the country. CTV virions are filamentous, flexuous particles with a helical nucleocapsid architecture consisting of two coat proteins [1,2]. One of these proteins is a 25 kDa major capsid protein (CP), which forms a long virion body coating 95% of the particle length. The other protein is a 27 kDa minor capsid protein (CPm), homologous to the CP, which is assembled at one end of the virion [3,4]. The p25 and p27 genes encode the CP and CPm proteins, respectively [5,6]. The virus has a positive-sense RNA genome of about 20 kb, and approximately half of the genome, at the 5′ end, codes for the replicase-associated domains. The rest of the genome encodes genes that are not required for RNA replication and are involved in virion assembly, host interaction, vector specificity, movement within the plant and RNA silencing. These genes are expressed from 3′ co-terminal subgenomic RNAs [7,8]. A protein of 23 kDa (P23), encoded at the 3′ end of the CTV genome, possesses an RNA-binding motif rich in cysteine and histidine residues located in the core of a putative zinc-finger domain [9]. This protein controls the asymmetrical accumulation of plus and minus RNA strands during replication [10]. Over-expression of the P23 protein in transgenic plants induced phenotypic aberrations that resembled the symptoms caused by CTV infection in lime plants, whereas transgenic plants expressing a truncated version of P23 were normal, indicating that this gene could be involved in symptom development [11]. Recently, the proteins P25, P20 and P23 have been found to act as RNA silencing suppressors in Nicotiana tabacum and N. benthamiana plants [12].
CTV isolates differing in symptoms, affecting various citrus species and transmitted by different aphid vectors have been described worldwide [13,14], suggesting wide genetic variation in CTV populations. This variation might be increased by the fact that citrus trees often remain in the field for 30 or more years, which provides many opportunities for changes in the viral population within individual trees, due to repeated aphid inoculations or to cultural practices such as topworking to new varieties [15]. As in other RNA viruses, CTV-infected tissues contain a population of sequence variants that may partially result from the error-prone nature of the RNA-dependent RNA polymerase [16,17]. A virus isolate does not consist of a single RNA sequence but contains a population of related sequence variants, often referred to as a quasispecies [18-20]. Furthermore, divergent sequence variants and recombination events have also been detected in some CTV isolates [21]. Therefore, studies of genetic structure and diversity are important for understanding the mechanisms of CTV evolution that generate and/or maintain genetic diversity within a virus population. This knowledge may be highly relevant for the development of viral control strategies.
CTV is propagated by infected buds and locally spread by several aphid species [22]. The most efficient vector is T. citricida, followed by Aphis gossypii [23-27]. Since the introduction of T. citricida into South America early last century, damaging CTV isolates have caused major citrus declines [28]. The aphid T. citricida is the main CTV vector in Argentina, where the CTV isolates analyzed in this study were collected. Since 1990, cultivars of pigmented grapefruit in the Northwest region of Argentina have been seriously affected by CTV, showing severe symptoms such as stem pitting and decline, and a large number of trees have died. In this study, we analyzed the structure and genetic diversity of 11 CTV isolates from naturally infected citrus plants located in the main citrus-growing areas of Argentina. The powerful resolving properties of single-strand conformation polymorphism (SSCP) analysis applied to the p27, p25 and p23 genes allowed us to determine the genetic variability present within each individual isolate. Analysis of the nucleotide sequences was used to estimate the genetic diversity of the isolates and to infer the phylogenetic relationships among them. To our knowledge, this is the first study that analyzes the variability and population structure of CTV in naturally infected trees in a citrus-growing area where T. citricida is present.

CTV isolates
Eleven CTV isolates were collected from naturally infected plants in the Northeast (Entre Ríos and Corrientes provinces) and Northwest (Jujuy, Salta, Tucumán and Catamarca provinces) citrus-growing regions of Argentina (Table 1). These trees are located in areas where CTV has been spreading naturally for more than half a century and where its present incidence is close to 100%. The plants have probably been infected since their introduction into the field. Samples were multiplied by graft-inoculation on citrus seedlings of the same species as the original hosts. Isolates were biologically characterized by inoculation in citrus hosts, as previously described [29] (Table 1). The 11 CTV isolates studied belong to the collection kept at the EEA Concordia-INTA.

Nucleotide sequence and statistical analysis
Complete sequences were analyzed using the PileUp, Assemble, Gap and BestFit programs of the Genetics Computer Group (GCG) package version 10.0 [31]. Multiple alignments of nucleotide sequences were obtained using PileUp, Pretty (GCG) or CLUSTAL W software [32].
Heterozygosity was estimated according to Nei's equation 8.5 [33], using the number and frequency of SSCP haplotypes. Genetic diversity was determined using the software DnaSP version 3 (DNA Sequence Polymorphism) [34], and nonsynonymous and synonymous substitutions were calculated by the Pamilo-Bianchi-Li method [35,36] using the software MEGA version 2.1 [37].
Phylogenetic relationships were inferred with version 4.0b6 of PAUP* [Phylogenetic Analysis Using Parsimony (*and other methods)] [38] using parsimony and maximum likelihood methods, and phylogenetic trees were visualized using the TreeView 1.6.1 program [39].

Nucleotide sequence accession numbers
The nucleotide sequence data reported in this article have been deposited in the GenBank database under accession numbers AY750732 to AY750845. Other CTV nucleotide sequences used in our analysis were obtained from the indicated GenBank entries: AF260651 (T30), AF001623 (SY568), Y18420 (T385), U16304 (T36) and U56902 (VT).

Genetic variation within CTV isolates
The origin and pathogenicity characteristics of the CTV isolates used in this work are listed in Table 1. Isolates were biologically characterized by inoculation in indicator citrus hosts. SSCP analysis was employed to assess the population structure of three genomic regions of 11 Argentinean CTV field isolates. The p27, p25 and p23 genes of each isolate were reverse transcribed and PCR amplified with specific primers, using as template more than 100 ng of genomic dsRNA per reaction, to minimize the possibility of nucleotide misincorporation in the early phases of the PCR. For each gene and isolate, 10 clones were PCR-amplified and analyzed by SSCP. To increase the sensitivity of sequence variant detection, each PCR product was cleaved into two small DNA fragments by digestion with a suitable endonuclease prior to SSCP analysis. The SSCP analysis showed a variable number of haplotypes (Fig. 1a and Table 2). SSCP analysis of the initial PCR product from individual isolates clearly showed DNA patterns that coincided with the prevalent haplotype. This method was also useful for demonstrating the presence of different sequence variants within the same isolate. The representative variant for the p27 gene from isolate C257-7 is shown in Fig. 1b. In the same figure, we show two clones depicting the DNA patterns of the most frequent variants, corresponding to prevalences of 40% and 30%, respectively (lanes 1 and 2). Results for p23 from the C268-2 isolate are also shown in Fig. 1b. In this case, variants corresponding to frequencies of 50% (lanes 1, 3, 5 and 7) and 20% (lanes 4 and 9) are clearly identified within the same isolate (Fig. 1b).
Our results showed that 6 of 11 isolates had an intrinsic structure consisting of one major variant for the three genomic regions (Table 2). Nevertheless, there were 5 isolates in which the predominant haplotype had a frequency of less than 50%. These results differ from previously reported data for California and Spain, where, with only a few exceptions, one haplotype was clearly predominant for the different genomic regions analyzed [40]. In the citrus-growing areas of these countries the most efficient CTV vector, the brown citrus aphid T. citricida, is absent and the main vector of the virus is Aphis gossypii, the second most efficient vector [26].
For the p25 and p27 genes, the predominant haplotype of 9 of the 11 isolates showed the same SSCP profile (data not shown). The exceptions were the C268-2 and C257-2 isolates (both genes). In contrast, similar SSCP profiles among isolates were not detected for the p23 gene (data not shown).
The variability of each genomic region and CTV isolate was calculated from the haplotype frequencies by using the heterozygosity (h) and mean heterozygosity (H) parameters (Table 3) [33]. The results show differences in h. H was higher than 0.55 in seven isolates. A value lower than 0.55 was observed for isolates C268-2, C270-3, C271-2 and C271-8 (Table 3). The p23 gene showed the highest values of mean heterozygosity. These results showed great intra-isolate sequence variability among these isolates, despite the presence of a major haplotype. This could be explained by the high number of aphids visiting each individual citrus tree and causing re-infections with different variants. Two important factors support this hypothesis: (1) the endemic nature of CTV in Argentina and (2) the predominant presence of the most efficient vector, T. citricida, in the same region.
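The calculation of h and H from SSCP haplotype counts can be illustrated with a short sketch. It assumes that the equation cited from Nei [33] is the unbiased estimator h = n(1 − Σx_i²)/(n − 1), where x_i is the frequency of haplotype i among the n clones analyzed, and that H is the arithmetic mean of h over the three genes. The haplotype labels and counts below are hypothetical and are not taken from Table 2.

```python
from collections import Counter

def heterozygosity(haplotypes):
    """Unbiased haplotype diversity h = n * (1 - sum(x_i^2)) / (n - 1).

    `haplotypes` is a list with one SSCP haplotype label per clone.
    Assumption: this is the estimator meant by the cited equation.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two clones are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n * (1.0 - sum_sq) / (n - 1)

def mean_heterozygosity(clones_by_gene):
    """Arithmetic mean of h over all genes of one isolate (assumed definition of H)."""
    values = [heterozygosity(h) for h in clones_by_gene.values()]
    return sum(values) / len(values)

# Hypothetical isolate: 10 clones per gene, labels are arbitrary.
clones = {
    "p23": ["A"] * 5 + ["B"] * 2 + ["C", "D", "E"],  # major haplotype at 50%
    "p25": ["A"] * 8 + ["B"] * 2,                    # major haplotype at 80%
    "p27": ["A"] * 10,                               # a single haplotype
}

for gene, labels in clones.items():
    print(gene, round(heterozygosity(labels), 3))
print("H", round(mean_heterozygosity(clones), 3))
```

With these invented counts, h is about 0.756 for p23, 0.356 for p25 and 0 for p27, giving H of about 0.370. A gene with a single haplotype contributes zero, and h rises as the clones are spread over more haplotypes.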

Genetic diversity of CTV population: nucleotide diversities
Sequences of the most frequent haplotypes of each CTV gene and isolate were used to estimate the nucleotide diversity (Fig. 2). Total genetic diversity (Dt) was higher for the p23 gene than for the p25 and p27 genes (P = 0.000). These last two genomic regions exhibited similar Dt values (Fig. 2). The nucleotide diversity obtained agrees with values previously reported for different genome regions of CTV [17,40] and correlates with the mean heterozygosity obtained in this study. In a previous work, we analyzed the genome region corresponding to the p20 and p23 genes in Argentinean grapefruit field isolates and also observed high intra-isolate variability [41,42]. Although the total diversity was similar to that found in other citrus areas, the intra-isolate variability was higher in the Argentinean isolates. This variability constitutes a serious problem for the control of CTV, especially in grapefruits in Argentina. Nucleotide substitutions were computed to estimate the degree of selective constraint acting on each gene. Values of synonymous substitutions per synonymous site (dS) were similar for the three genes, with the lowest value corresponding to p27, although the difference was not statistically significant (Fig. 2). The rate of synonymous substitutions is similar for many genes, unless the gene is affected by codon usage bias or RNA structure. These factors could explain the slightly lower dS value found for the p27 gene in the Argentinean CTV isolates analyzed. On the other hand, values of nonsynonymous substitutions per nonsynonymous site (dN) were significantly higher for the p23 gene. The dN/dS ratio indicates the degree of selective constraint on each gene. The dN/dS ratio for the p25 and p27 genes was similar (0.08) and lower than that for the p23 gene (0.26), indicating negative selective pressure acting on both coat protein genes. The dN/dS value found in this work for both CTV CP genes agrees with values previously found for other plant virus coat proteins, such as those of Yam mosaic virus (YMV) [43], Cucumber mosaic virus (CMV) [44] and Barley yellow dwarf virus (BYDV) [45], but was higher than the dN/dS ratio estimated for the CTV CP of California and Spain isolates [17,40].
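As an illustration of what a nucleotide diversity estimate measures, the sketch below computes the mean proportion of differing sites over all pairs of aligned sequences. It is a simplified stand-in: it uses the uncorrected p-distance, assumes gap-free sequences of equal length, and does not reproduce the exact estimators or corrections applied by the software used in this study. The toy sequences are invented for the example.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)

def nucleotide_diversity(sequences):
    """Mean pairwise p-distance over all pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

# Hypothetical 10-nt alignment of four predominant haplotypes.
toy_alignment = [
    "ATGGCTAAGC",
    "ATGGCTAAGC",
    "ATGACTAAGT",
    "ATGACTCAGT",
]

print(round(nucleotide_diversity(toy_alignment), 4))
```

For the toy alignment the six pairwise distances are 0, 0.2, 0.3, 0.2, 0.3 and 0.1, so the mean is about 0.183. The same pairwise logic, applied separately to synonymous and nonsynonymous sites, underlies dS and dN, although methods such as Pamilo-Bianchi-Li add site classification and multiple-hit corrections that are not shown here.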

Phylogenetic relationships among the isolates
Unrooted phylogenetic trees were obtained, for each gene, from the nucleotide sequences of the major haplotype of each CTV isolate. In addition, sequences of five completely sequenced CTV isolates from around the world were included in the analysis. The phylogenetic relationships among CTV isolates were inferred using maximum parsimony (MP) and maximum likelihood (ML) methods. The likelihood ratio test was employed to determine the substitution model to use with each gene in the maximum likelihood analysis. For the p25 and p27 genes, the selected model was HKY, and for the p23 gene the model used was GTR + Γ. Trees obtained using both methods, MP and ML, showed similar topologies. In addition, trees from deduced amino acid sequences, using the maximum parsimony method, also exhibited topologies similar to those of the trees obtained using nucleotide sequences (data not shown). Figure 3 shows the unrooted trees obtained for each gene. Nucleotide changes among sequences are represented as branch lengths in the trees. In the p25 and p27 gene trees, most of the Argentinian isolates grouped in one clade, except isolate C268-2, which was the most distantly related to the remaining isolates (92% nucleotide similarity). The C257-2 isolate, which grouped within the Argentinean isolates, was more distant from them (98% nucleotide similarity). These results agree with those observed using SSCP analysis.
Fig. 3 Unrooted parsimony trees of the predominant haplotypes of each gene (p27, p25 and p23) from Argentinian isolates. Sequences of T36, T385, T30, VT and SY568 were included as reference standards. Bootstrap values for 1,000 replicates are indicated. Branch lengths are proportional to the number of nucleotide changes
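The model selection step can be sketched as a likelihood ratio test between two nested substitution models. The statistic 2(lnL_complex − lnL_simple) is compared with a chi-square distribution whose degrees of freedom equal the number of additional free parameters. The log-likelihood values below are hypothetical, the function names are introduced only for this example, and the chi-square approximation is an assumption. The text does not state which program performed the test.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    if df < 1 or x < 0:
        raise ValueError("df must be >= 1 and x must be >= 0")
    y = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-y)              # Q(1, y)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(y))   # Q(1/2, y)
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(lnl_simple, lnl_complex, extra_params):
    """Return (statistic, p-value) for two nested models."""
    stat = max(2.0 * (lnl_complex - lnl_simple), 0.0)
    return stat, chi2_sf(stat, extra_params)

# Hypothetical log-likelihoods; GTR has four more free rate parameters than HKY.
stat, p_value = likelihood_ratio_test(-1520.4, -1518.9, extra_params=4)
print(round(stat, 2), round(p_value, 4))
```

In this invented case the statistic is 3.0 and the p-value is about 0.558, so the richer model does not fit significantly better and the simpler one would be kept. A small p-value would instead favor the more complex model.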
Phylogenetic analysis of the p23 gene revealed a tree with a higher number of groups than the trees of the p25 and p27 genes. One of these groups, with 99% bootstrap support, was composed of the most severe isolates, except isolates C270-3 and C269-1. These two severe isolates, which grouped together with a bootstrap value of 96%, are the only two of the Argentinian isolates analyzed that caused seedling yellows (SY) symptoms on Duncan grapefruit hosts. Isolate C268-2 grouped with the mild isolates T30 and T385 from Spain, and was again the most divergent of the Argentinean CTV isolates studied for the p23 gene. The nucleotide changes in the p23 gene sequences that support the cluster of the severe isolates C257-7, C269-6, C278-1 and SY568 were mainly synonymous. The nonsynonymous changes were V25I, a conservative change, and T53N, which maps to a position inside the proposed RNA-binding domain. Interestingly, the T53N change was also found in the T36 isolate. This is a non-conservative change that might have some effect on the biological function of P23. The nucleotide substitutions that support the grouping of isolates C270-3 and C269-1 were all synonymous except those producing the changes R111G, D138N and G199D. The amino acid changes found in the p23-encoded protein are scattered along the protein sequence. However, the region corresponding to the RNA-binding domain is conserved (Fig. 4). The polymorphism of a specific region of p23 allows discrimination between mild and severe isolates [46]. The amino acid differences found by Sambade et al. [46] in p23 were also found in the isolates analyzed in this study. The severe isolates showed the amino acids A78 S79 R80, except C269-1 and C270-3 from Argentina, which showed the amino acids G78 L79 K80, like the T36 isolate from Florida (biogroup III). The isolates from biogroups I, II and III showed, in these positions, the amino acids ALK (isolates C268-2, T30 and T385), GSR (isolates C257-9, C271-2, C271-8 and C257-10) and ALR (isolate C257-2).
Field plants are exposed to continuous environmental changes and successive re-infections by aphids. The coexistence of variants with high nucleotide diversity within the CTV populations of Argentinean isolates might reflect changing environmental conditions and/or the high frequency of re-infections by aphids in the field. In this study, all the severe isolates were obtained from grapefruits, since we could not identify severe CTV isolates in other hosts in Argentina. The adaptation of CTV populations to a new host could be an important factor in selecting the haplotypes present in the viral isolates. The sequence of the p23 gene correlates with the pathogenicity of the isolates. The amino acid differences previously found in this region were also found in the isolates analyzed in this study, indicating that the phenotype observed in the severe Argentinean isolates was not caused by new variants of the virus.

Evidence of mixed infections in Argentinean isolates
In order to confirm the occurrence of co-infections, suggested by the high intra-isolate variability, randomly selected minority SSCP haplotypes were sequenced. Phylogenetic trees were obtained using these sequences (Fig. 5). Minority haplotypes of some isolates did not cluster with the corresponding predominant haplotype, indicating the presence of divergent sequence variants within individual isolates. The sequence divergence between the variants was 6-8% for p25 and 6-11% for p23. These results confirm that re-infections with divergent sequence variants could be frequent in these isolates, despite the low number of minority haplotypes analyzed.
In summary, in this study we analyzed the variability of CTV isolates in a citrus-growing area where the most efficient vector is the main vector. This is also the first study that analyzes naturally infected trees in a region where grapefruit plants are seriously affected. High intra-isolate variability was a characteristic of these isolates, indicating that the presence of the vector could contribute to the transmission of the most severe variants of the virus and that, under certain conditions involving environmental and host factors, these variants could produce a severe phenotype in inoculated plants.
Fig. 5 NJ trees including minority sequence variants. The predominant sequences from each isolate are indicated in bold and as "M"; selected minority variants are indicated as "min". Minority haplotypes that did not cluster with the corresponding predominant haplotype are underlined, and the genetic distance of the variant with respect to the predominant sequence is given in brackets

Fig. 4 Alignment of deduced amino acid sequences of the predominant haplotypes of the p23 gene. Sequences from GenBank were included. For each position, identical residues are indicated by dots.

Table 1
Pathogenicity characteristics of Argentinian field CTV isolates

Table 3
Heterozygosity of three genomic regions of 11 Argentinian CTV isolates