Interallelic recombination is probably responsible for the occurrence of a new αs1-casein variant found in the goat species

Claudia Bevilacqua1,2,*, Pasquale Ferranti3,4, Giuseppina Garro3,4, Cristina Veltri1, Raffaella Lagonigro1, Christine Leroux2, Emilio Pietrolà1, Francesco Addeo3,4, Fabio Pilla1, Lina Chianese3 and Patrice Martin2

1 Dipartimento di Scienze Animali, Vegetali e dell’Ambiente, Facoltà di Agraria dell’Università del Molise, Campobasso, Italy;
2 Laboratoire de Génétique biochimique et de Cytogénétique, INRA, Domaine de Vilvert, Jouy-en-Josas, France;
3 Dipartimento di Scienza degli Alimenti, Facoltà di Agraria, Università di Napoli ‘Federico II’, Portici, Italy;
4 Istituto di Scienze dell’Alimentazione del CNR, Avellino, Italy

The αs1-casein (αs1-Cas) locus in the goat is characterized by a polymorphism, the main feature of which is to be qualitative as well as quantitative. A systematic analysis performed in an autochthonous southern Italian breed identified a new rare allele (M), which was characterized at both the protein and genomic level. The M protein displays the slowest electrophoretic mobility of the αs1-Cas variants described so far. MS and automated Edman degradation experiments showed that this behavior was due to the loss of two phosphate residues in the multiple phosphorylation site (64SP-SP-SP-SP-SP-E-70E) as a consequence of a Ser→Leu substitution at position 66 of the peptide chain (64S-SP-L-SP-SP-E-70E). This was confirmed by sequencing a genomic DNA fragment encompassing exon 9, where the 8th codon (TCG) was shown to be mutated to TTG. Sequencing of amplified genomic DNA segments spanning the 5′ and 3′ flanking regions of each exon allowed us to identify 23 single nucleotide polymorphisms and two insertion/deletion events in the coding as well as the noncoding regions. A comparison of specific haplotypes defined for each of the αs1-CasF, A and M alleles indicates that the M allele probably arises from interallelic recombination between alleles A and B2, followed by a C→T transition at nucleotide 23 of the ninth exon. The region encompassing the recombination break point was putatively located between nucleotide 86 upstream and nucleotide 40 downstream of exon 8. Interallelic recombination therefore appears to be a possible means of generating allelic diversity at the αs1-Cas locus, at least in the goat. The previously proposed molecular phylogeny must now be revised, possibly starting from two ancestral allelic lineages.

Keywords: αs1-casein gene; allelic recombination; genetic polymorphism; goat milk

Caseins comprise the main protein fraction of ruminant milk. They are encoded by four tightly linked genes [1], clustered in a 250-kb genomic DNA segment [2] in the following order: αs1, β, αs2 and κ [3]. They have been mapped on chromosome 6 in cattle and goats [4,5]. The αs1-casein locus (αs1-Cas) is characterized in the goat by a polymorphism, the main feature of which is to be qualitative as well as quantitative. Indeed, more than 11 alleles have so far been characterized [6], distributed among seven different classes of protein variants (αs1-CasA to αs1-CasG), associated with four levels of expression ranging between 0 (αs1-Cas0) and 3.5 g·L⁻¹ (αs1-CasA, B, and C) per allele.

Whereas the αs1-CasE variant, which is 199 amino-acid residues in length, only differs from variants A, B and C by single amino-acid substitutions [7], the F variant displays an internal deletion of 37 residues [8], leading to the loss of a hydrophilic cluster of five contiguous phosphoseryl residues: 64SerP-SerP-SerP-SerP-SerP-Glu-70Glu. This deletion arises from the outsplicing of three exons (9, 10 and 11) during the processing of primary transcripts, probably because of a single nucleotide deletion occurring within the first (exon 9) unspliced exon [9]. More recently, the B allele has been split up into four alleles giving rise to the synthesis of four protein variants B1, B2, B3, and B4, which differ as a result of amino-acid substitutions [6]. These substitutions have no effect on the net charge of the protein, which therefore makes the relevant variants indistinguishable on PAGE. Variant B1 is considered to be the original type in goat because it shows the closest homology to its bovine and ovine counterparts [6].

The distribution of these different alleles or variants has been investigated in a great variety of breeds and populations [6,10–13]. Breeds from the Mediterranean area usually display a high frequency of ‘strong’ alleles (mainly A and B). However, local and now rare breeds generally do not follow this rule and are often the source of rare ‘germoplasms’. Three novel αs1-Cas variants (H, I and L) have been identified by Chianese et al. [14] in southern Italian goat populations. More recently, a further novel and rare variant, named M, was detected in the Molisane Montefalcone goat breed [15], which was shown, in addition, to display a rather high frequency of the F allele [16].

Correspondence to P. Martin, Laboratoire de Génétique biochimique et de Cytogénétique, INRA, Domaine de Vilvert, 78 352 Jouy-en-Josas, France. Fax: + 33 1 34 65 24 78, Tel.: + 33 1 34 65 25 82, E-mail: martin@jouy.inra.fr

Abbreviations: αs1-Cas, αs1-casein; UTLIEF, ultra-thin-layer isoelectric focusing; LC/ES/MS, liquid chromatography/electrospray/mass spectrometry; ACRS-PCR, amplified created restriction site-PCR.

*Present address: INSERM E9925, Interactions de l’épithélium intestinal avec le système immunitaire, Faculté Necker-Enfants Malades, 156, rue de Vaugirard, 75 743 Paris Cedex 15, France.

(Received 29 August 2001, revised 17 December 2001, accepted 9 January 2002)

In this paper, we report the characterization of this new variant at both the protein and genomic level. The complete amino-acid sequence of the M variant has been determined. Starting from genomic DNA, we amplified, by PCR, the coding regions (exons) and their intron flanking regions, which were subsequently sequenced. Such a dual approach has made it possible to identify the mutation specific for the αs1-CasM allele. Extensive comparisons of these sequences with those of previously characterized alleles have allowed the identification of additional polymorphic sites, the arrangements (haplotypes) of which strongly suggest an interallelic recombination (or a gene conversion) event at the origin of the αs1-CasM allele. This is, to our knowledge, the first hypothesis of a genomic recombination event to account for genetic polymorphism at a locus encoding a milk protein.

MATERIALS AND METHODS

Animals

A total of 147 individual milk samples were analysed from Montefalcone goats, which are localized in southern Italy (Molise region). Peripheral blood (15–30 mL) was collected from eight goats and two bucks and subsequently used for DNA extraction.

Casein preparation

Whole casein was prepared by acid precipitation of individual skimmed milk samples as described by Aschaffenburg & Drewry [17].

Gel electrophoresis

Vertical disc PAGE at pH 8.6, preparation of casein samples and polyclonal antibodies against αs1-Cas, and immunoblotting experiments were performed as described elsewhere [18].

Preparation of polyacrylamide gel ultra-thin layers (0.25 mm) and isoelectric focusing (UTLIEF) were carried out as recommended by EEC Regulation no. 690/92 [19]. The pH gradient in the range 2.5–6.5 was obtained by mixing Ampholine (Pharmacia LKB) 2.5–5, 4.5–5.4, and 4–6.5 in the volume ratio 1.6 : 1.4 : 1.

2D gel electrophoresis (PAGE in the first dimension followed by UTLIEF in the second) has been described elsewhere [18].

Enzymatic hydrolyses

Trypsin (Boehringer Mannheim) hydrolysis was carried out in 0.4% NH4HCO3, pH 8.5, at 37 °C for 4 h, at a substrate/enzyme ratio of 50 : 1 (w/w). Dephosphorylation with calf intestine alkaline phosphatase (Boehringer Mannheim) was performed in the same buffer by using 1 mU enzyme/mg casein at 37 °C for 18 h; these conditions have been previously shown to produce complete dephosphorylation of the sample [20]. Reactions were stopped by freeze-drying.

Liquid chromatography/mass spectrometry analysis of proteins and peptides

The whole caprine casein samples were fractionated by the procedure of Jaubert and Martin [21], modified by Ferranti et al. [22].

Liquid chromatography/electrospray/mass spectrometry (LC/ES/MS) was performed using a HP1100 modular system connected on-line to a Platform (Micromass) single quadrupole mass spectrometer. The selectively precipitated casein phosphopeptides were fractionated by RP-HPLC on a 214TP54, 5 µm Vydac C18, 250 × 2.1 mm internal diameter column (Vydac, Hesperia, CA, USA). Solvent A was 0.3 mL trifluoroacetic acid per L water. Solvent B was 0.2 mL trifluoroacetic acid per L acetonitrile. Samples (500 µg) were dissolved in 200 µL water and injected on to the HPLC column equilibrated in solvent A. A linear gradient from 0% to 37% B was applied at a flow rate of 0.5 mL·min⁻¹ over 60 min. The column effluent was split 1 : 25 to give a flow rate of 4 µL·min⁻¹ into the electrospray nebulizer. The bulk of the flow was run through the detector for peak collection as measured by following A220. The ES-mass spectra were scanned from m/z 1800 to 400 at a scan cycle of 5 s per scan. The source temperature was 120 °C and the orifice voltage 40 V. Mass values were reported as average masses. Signals recorded in the mass spectra of peptides were associated with the corresponding tryptic peptides on the basis of the molecular mass, taking into account the enzyme specificity and the reported amino-acid sequence of αs1-Cas from different species. Quantitative analysis of components was performed by integration of the multiply charged ions of the single species [22].

Sequence analysis

Automated Edman degradation was performed using an Applied Biosystems model 477A Protein Sequencer with on-line phenylthiohydantoinyl amino acid (Pth-Xaa)-HPLC analyzer. Phosphorylated peptides were modified by the procedure of Ferranti et al. [20].

Genomic DNA preparation

Goat genomic DNA was prepared from leucocytes isolated from the plasma fraction of EDTA-anticoagulated peripheral blood samples, as described previously [23,24].

Oligonucleotides

Intronic primers used either for amplification from genomic DNA or for sequencing of amplified DNA fragments were provided by Genosys Biotechnologies Inc. (Cambridge, UK) and Primm (Milano, Italy). Their sequences are given in Table 1, together with those used for genotyping.

PCR conditions

In vitro amplification was performed with the thermostable DNA polymerase of Thermus aquaticus (Taq polymerase) using either a 480 or a 2400 thermal cycler (PerkinElmer), essentially as described [25]. A typical 50-µL reaction mixture consisted of 5 µL 10× PCR buffer (500 mM KCl, 100 mM Tris/HCl, pH 9.0, 1% Triton X-100), 3 µL 25 mM MgCl2, 2.5 µL 5 mM dNTPs mixture, 0.5 µL (25 pmol) each primer, 2 µL template DNA, and 0.25 µL (1.25 U) Taq polymerase (Promega). To avoid evaporation (with the 480 thermal cycler), the mixture was covered with 70 µL mineral oil. After an initial denaturing step of 5 min (or 10 min) at 94 °C, the reaction mixture was subjected to the following three-step cycle, which was repeated 35 times: denaturation for 30 s (or 1 min) at 94 °C, annealing for 30 s (or 2 min) at 47–60 °C, and extension for 30–60 s (or 3 min) at 72 °C, using the 2400 (or 480) thermal cycler. To estimate the concentration of PCR products, 5 µL of each reaction mixture was analysed by electrophoresis, in the presence of ethidium bromide (0.5 µL·mL⁻¹), in a 2% SeaKem (FMC) or Gibco BRL Life Technologies agarose slab gel in Tris/borate/EDTA (8.9 mM Tris, 8.9 mM boric acid, 0.2 mM EDTA, pH 8.0) buffer.

For genotyping αs1-CasM using the amplified created restriction site (ACRS)-PCR procedure [26], experimental conditions were essentially the same as those mentioned above, except for the primer concentration (50 pmol per 50-µL reaction mix) and the concentration of the agarose slab gel used to visualize the PCR products, which was 4% (2% Gibco-BRL and 2% high-resolution agarose FMC).
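The final concentrations implied by the typical reaction mixture follow from the dilution rule C_final = C_stock × V_added / V_total. The short Python sketch below is only an illustrative check of that arithmetic using the volumes and stock concentrations stated in the text; the function and variable names are assumptions introduced here, not part of the original protocol.

```python
def final_concentration(stock_conc, added_volume_ul, total_volume_ul=50.0):
    """Dilution rule: C_final = C_stock * V_added / V_total (units of the stock)."""
    return stock_conc * added_volume_ul / total_volume_ul

# Values taken from the 50-µL reaction described in the text.
mgcl2_mM = final_concentration(25.0, 3.0)     # 25 mM MgCl2 stock, 3 µL added
dntp_mM = final_concentration(5.0, 2.5)       # 5 mM dNTP mixture, 2.5 µL added
buffer_fold = final_concentration(10.0, 5.0)  # 10x PCR buffer, 5 µL added

# 25 pmol of primer in 50 µL: pmol/µL is numerically equal to µM.
primer_uM = 25.0 / 50.0
# ACRS-PCR genotyping uses 50 pmol per 50-µL reaction.
primer_acrs_uM = 50.0 / 50.0

print(f"MgCl2: {mgcl2_mM} mM, dNTPs: {dntp_mM} mM, buffer: {buffer_fold}x")
print(f"primer: {primer_uM} µM (standard), {primer_acrs_uM} µM (ACRS-PCR)")
```

Under these assumptions the standard reaction contains 1.5 mM MgCl2 and each primer at 0.5 µM, and the ACRS-PCR genotyping reaction doubles the primer concentration.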

Sequencing of amplified genomic DNA fragments

PCR products were either directly sequenced or sequenced after cloning (fragments amplified between primers C9U and C9L) into SmaI-digested pUC18 plasmid vector, using fluorescent Cycle Sequencing (AmpliTaq FS, Dye Terminator Cycle Sequencing Kit; PerkinElmer) with an ABI 377A or an ABI 310 DNA sequencer.

RESULTS

PAGE analysis and immunoblotting of whole casein

Figure 1A shows the typical electrophoretic patterns yielded, in polyacrylamide gel at pH 8.6, by the new αs1-Cas phenotype, subsequently shown to be heterozygous M/F (M being the new variant), in comparison with two reference phenotypes, AA (lane 1) and FF (lane 2). This new phenotype is characterized, under these conditions, by the presence of a protein band with a slower mobility (lane 3, *) occurring within the αs complex. As αs1-Cas and αs2-Cas overlap in the same zone of the gel, the αs1-Cas composition of each phenotype was analysed by immunostaining, after transfer to nitrocellulose (NC) paper, with specific polyclonal antibodies raised against αs1-Cas; the result is shown in Fig. 1B. The new αs1-Cas phenotype (M/F) comprises at least five components (a, b, c, d, and e). Two of these (a and c) appear to be shared with variants A and F, while components e and d seem to be in common with the A variant. Therefore, band b represents the only component specific to the M variant. The intensities of the bands in the MF pattern indicate that variant M is a ‘strong’ variant like variants A, B and C, i.e. it has a high level of expression. However, as the intensities of three apparently homologous components (a, c, and e) in the AA and MF profiles were different, further heterogeneity of the PAGE components may be suspected.

Table 1. Primers used in the present study. Each pair of primers amplifies the target exon and its flanking regions (from 60 to 200 nucleotides upstream and downstream). Primers ending with U (upper) and L (lower) are positioned 5′ and 3′ from the target exon, respectively. Given the small size of introns 4 and 10, primers C45U/C45L and C1011U/C1011L were designed to amplify together exons 4 and 5 and exons 10 and 11, respectively. Sequencing of exon 7 was performed starting from a genomic DNA fragment produced by amplification between C7U and C8L. Primers in italics were used in the genotyping of allele M.

Primer    Sequence
C2U       5′ AAT CAA ATT TTA TTA TAA GAC C 3′
C3U       5′ GGT GTC AAA TTT AGC TGT TAA A 3′
C3L       5′ GCC CTC TTC TCT AAA AAG GTT T 3′
C45U      5′ TGA CTG TGT TTT TCA CTT CT 3′
C45L      5′ GCT TTG TTA ATT CTG CAG TA 3′
C7U       5′ CAT GAA GCA ATA TAT CTG CTC C 3′
C7L       5′ TGG TCA ACA TAC ATG TTG CAT C 3′
C8L       5′ TGG CAC AAC ATT GTA CAT TCT TGG G 3′
C9U       5′ GTA TGG AAG TGT GGA ATA GTT T 3′
C9L       5′ GGA CAC CAC AGA TAT CCA ATA G 3′
C1011U    5′ CAT AAA ACT AAC AAT ACA TGT 3′
C1011L    5′ TAG CAG ATA TTG AAA AGG AG 3′
C12U      5′ CCA GTG AAT ATT CAG GAC TGA T 3′
C12L      5′ AGG CTC TAG CAT GAT TTG ATG T 3′
C13U      5′ GCA TTT TTA TTT TGA ATG TAA A 3′
C13L      5′ TAG TTC AAA TGC ACA TCT TAT 3′
C14U      5′ GGC AGA GAA TAC GTT TAT ACT AA 3′
C14L      5′ TCT CAG ATT GAC TAC TAC AAC TT 3′
C15U      5′ CAT GAA AAG CAT TTC AAA AA 3′
C15L      5′ TAA AAA ACA GTG GTT ACC AA 3′
C16U      5′ CTA AAG AGT ACA CTA TCC TCA C 3′
C16L      5′ TTG CTG TGG TTG CCT ATC CTA 3′
C17U      5′ TGA TTT CTC ATA CAC TGT TG 3′
C17L      5′ TTG ATA AGG CAA CAA TAT GC 3′
C18U      5′ GTC CCA ACT TGA AAT CCT GAT C 3′
C18L      5′ CAA GTT TAT AGT CTA CAC GTT GTA C 3′
C19U      5′ CTT AGC ATC TTC CAT GGC TTG ATC 3′
C19L      5′ ATA CAC ACA AAC TCA CAA GG 3′
MWU       5′ CAA CAT ATT TTA AAT AAA ATT GAC AAT 3′
C9LM*     5′ ATA AAA ATG GTA TAC CTC ACT TGT*C 3′
C9UM1     5′ TAA CAA TGA TTC TCT TTC TTT TAG 3′
C9LM1     5′ AAT CTT TAT TTT GTC TCT GAC AA 3′

Fig. 1. Disc-PAGE at pH 8.6 of individual whole caprine casein samples containing different αs1-Cas variants AA, FF and MF. Phenotypes are indicated at the top of each lane. Staining was with (A) Coomassie Brilliant Blue and (B) polyclonal antibodies against αs1-Cas. a–e identify αs1-Cas bands of the MF sample in order of increasing mobility towards the anode.

To understand the high degree of heterogeneity observed with goat αs1-Cas and to try to explain the difference in band intensities, further electrophoretic experiments were carried out, including UTLIEF analysis and 2D electrophoresis followed by staining with polyclonal antibodies. In UTLIEF (results not shown) the αs1-Cas M/F phenotype comprised at least seven major components, two of which were in common with variant F. Using 2D electrophoresis (Fig. 2), at least two main spots surrounded by a number of minor components differing in their pI were found in each PAGE band. This large microheterogeneity, which also occurs for other casein phenotypes (results not shown), may be attributable to nonallelic αs1-Cas forms generated by defective mRNA splicing and to differently phosphorylated αs1-Cas chains, as reported by Ferranti et al. [20].

MS and sequence analyses

To determine the molecular mass of the new variant (M), whole caseins of individual milks of the phenotypes A/A, F/F, and M/F were subjected to HPLC separation (Fig. 3). The retention time of variant M was shorter than that of the A variant while the relative percentage was the same. The HPLC fractions were analysed by ES/MS, and the molecular masses of αs1-CasA, B, and F were in agreement with the expected masses [7,9]. The molecular mass determined by ES/MS of the αs1-Cas components occurring in the sample containing M/F variants was 23 134/23 214/23 294 Da (Fig. 4). After alkaline phosphatase hydrolysis, the molecular mass of the three main peaks shifted to the single value of 22 734 Da, indicating the occurrence of three αs1-Cas species carrying five, six, and seven phosphate groups, respectively. A set of small HPLC peaks eluted before the main αs1-Cas peak gave a molecular mass of 18 715/18 795/18 875 Da (18 555 Da after alkaline phosphatase hydrolysis), corresponding to that expected for the F variant. This result is the first evidence for the heterozygous status (M/F) of the individual goat milk analysed. In addition to this, the HPLC profile confirms that the M variant is abundantly expressed. Thus, as previously mentioned, we were working with a mixture of two unresolvable variants, one of which (M) accounts for more than 80% of the whole αs1-Cas. This overrepresentation of the M variant allowed us to continue the molecular characterization with this material.

Fig. 2. 2D electrophoretic analysis of a whole casein sample prepared from the milk of a single goat, heterozygous M/F at the αs1-Cas locus. Disc-PAGE was performed in the first dimension followed by UTLIEF in the second dimension. The UTLIEF pattern in the pH range 2.5–6.5 is shown on the left. Staining was with polyclonal antibodies raised against αs1-Cas.

Fig. 3. RP-HPLC analysis of casein fractions from goats of different genotypes F/F (A), M/F (B) and A/A (C) at the αs1-Cas locus.
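The phosphate counts quoted for the M/F sample can be recovered from the mass differences before and after alkaline phosphatase treatment, because each phosphate group adds about 80 Da (HPO3, average mass 79.98 Da). The following Python sketch illustrates that arithmetic with the masses reported above; the function name and the rounding approach are assumptions made here for illustration and do not describe the software used in the study.

```python
PHOSPHATE_SHIFT_DA = 79.98  # average mass added per phosphate group (HPO3)

def phosphate_count(native_mass_da, dephosphorylated_mass_da):
    """Estimate the number of phosphate groups from the mass difference."""
    return round((native_mass_da - dephosphorylated_mass_da) / PHOSPHATE_SHIFT_DA)

# Masses (Da) reported in the text for the main (M) and minor (F) components.
m_counts = [phosphate_count(mass, 22734) for mass in (23134, 23214, 23294)]
f_counts = [phosphate_count(mass, 18555) for mass in (18715, 18795, 18875)]

print("M variant phosphates:", m_counts)
print("F variant phosphates:", f_counts)
```

For the main component this gives five, six and seven phosphate groups, in line with the interpretation in the text; the same arithmetic applied to the minor component gives the counts implied by the reported F masses.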

The αs1-Cas fraction containing the M variant was digested with trypsin, and the resulting peptide mixture analysed by LC/ES/MS (Fig. 5). The peptide sequence determined for the M variant was identical with that yielded by variant B2 (from the published sequence [6,7]) except for two substitutions located in peptide 62–79. MS and automated sequence analysis demonstrated that peptide 62–79 (molecular mass 1833 Da and sequence AGSSLSSEEIVPNSAQQK, where S indicates a phosphorylated serine residue) contains the two substitutions Ser66→Leu and Glu77→Gln, as compared with the B2 variant. The Ser→Leu substitution at position 66 first makes this site unphosphorylatable and secondly impairs the phosphorylation of Ser64 in the M variant. The sequence determined is consistent with the molecular mass measured for the native protein. The phosphorylated residues are therefore Ser46, 48, 65, 67, 68 (fully), and Ser41 and Ser115 (partly), which give rise to proteins with five, six and seven phosphates/mol, explaining the heterogeneity of phosphorylation observed for the native protein by ES/MS analysis (Fig. 4). Finally, peptide E96QLLR100, diagnostic of the F variant, was present among the peptides identified by Edman degradation after tryptic digestion and RP-HPLC fractionation, confirming the heterozygous status (M/F) of the sample analysed.

Experimental strategy designed to analyse the new αs1-Cas variant at the nucleotide level

To determine the coding sequence of a gene, there are at least two possible strategies: it can be analysed at either the genomic level or the messenger level. The most straightforward option is undoubtedly mRNA extraction to construct a cDNA molecule. The structure of the coding region is then readily obtained by sequencing the cDNA.

In our situation, however, such a strategy was not possible. Given the low number of animals in the population, it was not possible to slaughter individuals of interest. In addition, as the animals were from private flocks bred in mountain meadows, it was not possible to take mammary tissue biopsy samples under appropriate hygienic conditions.

To overcome this, we tried to extract mammary mRNA from milk somatic cells, using the technique first described by Martin et al. [27]. Unfortunately, we could not obtain enough material to synthesize cDNA. Moreover, as expected from the phenotypic analysis (at the protein level), the few animals yielding the αs1-CasM variant in their milk were exclusively heterozygous M/F at the αs1-Cas locus. Therefore, analysis of their transcripts could have been rather difficult because of the occurrence of at least nine different forms of messenger arising from the F allele [9]. Finally, to integrate this new allele into the phylogenetic tree proposed by Grosclaude & Martin [6], we also needed to obtain information about relevant noncoding regions in which specific and informative mutations are localized.

For these reasons, we decided to analyse the sequence of the M allele at the genomic level. After amplification of each exon and its intron-flanking regions, amplified genomic DNA fragments were sequenced. The knowledge of the structural organization of the goat gene encoding αs1-Cas [9] made this strategy possible. In addition, the complete sequence of the bovine gene [28] was also available and showed that the two genes display the same organization (number and size of exons) and 95% similarity at the exon sequence level. As goats and cattle are phylogenetically close and known intron sequences in the goat show strong similarity to their bovine counterparts, we designed primers upstream and downstream of each exon to amplify and analyse genomic regions including flanking intron sequences, starting from both the bovine and the goat sequences.

Fig. 4. Deconvoluted electrospray mass spectrum of caprine αs1-Cas M variant.

Fig. 5. LC/ES/MS analysis of the tryptic digest of the αs1-CasM variant. The purified protein was digested with appropriate concentrations of trypsin (see Materials and methods). The peptide mixture was analyzed using a Vydac C18 column (250 × 2.1 mm, 5 µm), on-line with a Platform mass spectrometer, as described in Materials and methods. The peak of the variant peptide is indicated by an arrow.

Analysis of the exon sequences at the genomic level

As the samples analysed were from goats that were heterozygous (M/F) at the αs1-Cas locus, to discriminate between the two alleles and therefore determine the 19 exon sequences coming from the M allele, sequence data were compared with those from a homozygous F/F goat genomic DNA sample. All the sequences yielded were unambiguously determined except that corresponding to the PCR fragment encompassing the ninth exon, in which a single nucleotide deletion has been shown to occur in the F allele [9]. This makes the sequence chromatogram unreadable from that point for the DNA template amplified from the heterozygous M/F sample. To overcome this problem, the amplified fragment was cloned. Of the 10 clones sequenced, four displayed a typical F exon-9 sequence, and five showed the same sequence, which was different from that of the F allele, with a 33-nucleotide exon 9. Taken together, the exonic sequence data allowed us to construct a sequence corresponding to the complete cDNA of the M allele. This sequence is given in Fig. 6, where it is compared with that of alleles F and A.
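The reason why direct sequencing of the heterozygous M/F template becomes unreadable downstream of the single-nucleotide deletion can be illustrated with a toy example: a substitution makes the two superimposed reads disagree at one position only, whereas a deletion shifts one read relative to the other so that most downstream positions disagree. The Python sketch below uses an invented 20-nucleotide sequence, not the real exon 9 sequence, and hypothetical function names.

```python
def ambiguous_positions(read_1, read_2):
    """0-based positions at which two superimposed reads disagree."""
    length = min(len(read_1), len(read_2))
    return [i for i in range(length) if read_1[i] != read_2[i]]

# Hypothetical template, used only to illustrate the effect.
reference = "ATGGCTAGCTTCGAAGGTCA"
substituted = reference[:10] + "G" + reference[11:]  # one substitution at index 10
deleted = reference[:10] + reference[11:]            # one deletion at index 10

print("substitution:", ambiguous_positions(reference, substituted))
print("deletion:    ", ambiguous_positions(reference, deleted))
```

In this toy case the substitution produces a single ambiguous position, while the deletion produces disagreements at most positions from the deletion point onwards, which is why cloning of the individual alleles was needed before sequencing.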

Only four polymorphic nucleotides were identified, three of which yielded amino-acid substitutions: (a) the transition T→C on the second nucleotide of the third codon within exon 4, leading to a Leu→Pro substitution at position 16 of the peptide chain, as compared with the A variant; (b) a transversion G→C on the first nucleotide of the last exon-10 codon, leading to a Glu→Gln substitution at position 77 of the peptide chain, as compared with the F variant; (c) the deoxycytidyl phosphate residue at position 23 in the 9th exon of the A allele, which is deleted in the F allele, is mutated to T in the M allele, giving rise to a Ser→Leu substitution.
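The link between nucleotide 23 of exon 9 and the TCG→TTG change in the 8th codon can be checked with a few lines of Python. This is a minimal sketch that assumes exon 9 starts on a codon boundary, which is consistent with the 33-nucleotide exon and with the codon numbering given in the text; the function names and the two-entry codon table are introduced here for illustration only.

```python
# Only the two codons needed here, from the standard genetic code.
CODON_TABLE = {"TCG": "Ser", "TTG": "Leu"}

def codon_location(nucleotide_position):
    """Return (codon number, position within codon), both 1-based,
    assuming the exon begins with the first base of a codon."""
    codon_number = (nucleotide_position - 1) // 3 + 1
    offset = (nucleotide_position - 1) % 3 + 1
    return codon_number, offset

def mutate(codon, offset, new_base):
    """Replace the base at the 1-based offset within a codon."""
    return codon[:offset - 1] + new_base + codon[offset:]

codon_number, offset = codon_location(23)
mutant_codon = mutate("TCG", offset, "T")

print(f"nucleotide 23 -> codon {codon_number}, base {offset}")
print(f"TCG ({CODON_TABLE['TCG']}) -> {mutant_codon} ({CODON_TABLE[mutant_codon]})")
```

Nucleotide 23 falls on the second base of the 8th codon, and replacing that C with T converts TCG (Ser) into TTG (Leu), as reported.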

Analysis of the intronic flanking sequences

The flanking intronic regions directly upstream and downstream of each exon were sequenced over 50–200 nucleotides and the complete sequences of introns 4, 7, and 10 were determined for alleles A, F, and M. In this way, 20 further polymorphic sites were identified besides the four polymorphic exon nucleotides (Fig. 7). In addition, an RsaI restriction site was found between exon 6 and exon 8 of alleles F and M, which is lacking in the A allele, giving a total of 25 polymorphic sites useful for phylogenetic allele comparisons. Taking into account these data, it is worth noting that in the 5′ part of the gene, up to exon 8, the nucleotide combination (haplotype) observed for the M allele is identical with that shown by the F allele. In contrast, in its 3′ part, beyond exon 8, the haplotype of the M allele is identical with that of the A allele, except at the polymorphic site located in exon 9.
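The haplotype comparison described above amounts to walking along the ordered polymorphic sites and locating the interval in which the M allele stops matching one parental type and starts matching the other. The Python sketch below illustrates this idea under stated assumptions: the three exonic states are read from the text, the intronic entries are invented placeholders (the real states are those of Fig. 7), and the function name is hypothetical.

```python
def recombination_interval(sites, hap_m, hap_left, hap_right):
    """Return (last site matching the 5' parent, first site matching the 3' parent),
    or None if hap_m is not a single-switch mosaic of the two parents."""
    calls = []
    for i in range(len(sites)):
        if hap_left[i] == hap_right[i]:
            continue                      # parents identical: uninformative site
        if hap_m[i] == hap_left[i]:
            calls.append((i, "L"))
        elif hap_m[i] == hap_right[i]:
            calls.append((i, "R"))
        # otherwise the state is specific to hap_m (e.g. a later mutation): skipped
    labels = "".join(label for _, label in calls)
    if "L" not in labels or "R" not in labels or "RL" in labels:
        return None
    last_left = max(i for i, label in calls if label == "L")
    first_right = min(i for i, label in calls if label == "R")
    return sites[last_left], sites[first_right]

# Sites ordered 5' to 3'. Entries marked "placeholder" are invented for illustration.
sites = ["exon 4", "intron 5 (placeholder)", "intron 7 (placeholder)",
         "exon 9 nt 23", "exon 10", "intron 12 (placeholder)"]
hap_F = ["C", "G", "T", "-", "G", "T"]
hap_A = ["T", "A", "C", "C", "C", "C"]
hap_M = ["C", "G", "T", "T", "C", "C"]

interval = recombination_interval(sites, hap_M, hap_F, hap_A)
print(interval)
```

With these illustrative data the switch from F-type to A-type states is bracketed by the last 5′ site and the first 3′ site, and the M-specific T at nucleotide 23 of exon 9 is ignored as a state belonging to neither parental type.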

In addition, intron 5 was completely sequenced starting from genomic DNA isolated from blood of two goats, genotyped as M/F and F/F at the αs1-Cas locus. Compared with the bovine sequence, a deletion spanning nucleotides 376 to 594 was observed for both goats. The deleted region in this intron did not match any known sequence in the EMBL databank. Subsequently, the existence of this deletion was confirmed by PCR for six goats of different αs1-Cas genotypes (A/A, F/F, M/F) from different Italian breeds (Montefalcone, Teramana, Garganica, Girgentana, and Sarda) and for six sheep of different Italian breeds (Comisana, Gentile di Puglia, and Valle del Belice). These results: (a) confirm the difference in size (≈200 bp) previously reported [23] between goat (≈450 bp) and cattle (641 bp) intron 5, and (b) show that the ovine intron 5 is also shorter than the cattle one. This could be expected, given the phylogenetic proximity between sheep and goats.

Genotyping of the M allele

The genotyping procedure designed consists of two steps. The first one is an ACRS-PCR technique [26], the principle of which is to create a TaqI restriction site (TCGA) by using a mismatching primer (C9LM*), which allows both the F and M alleles to be discriminated from all the others (Fig. 8, Step IA). These two alleles are subsequently distinguished after a second amplification, which allows discrimination between the alleles on the basis of the fragment sizes (Fig. 8, Step IIA).

In the first step, a 266-bp (265 bp for the F allele) DNA fragment is amplified between primers MWU and C9LM* with every allele. After digestion with TaqI, the 265/266-bp fragment is split into two fragments (240 and 26 bp) for each allele except the M and F alleles, for which no TaqI site has been created (Fig. 8, Step IB), because of mutations (deletion or substitution) occurring at position 23 in exon 9 (TTGA instead of TCGA).

To discriminate between the M and the F alleles, we took advantage of the presence of an 11-bp insertion in intron 9 of the F allele, which is lacking in the M allele. Thus, using two primers, C9UM1 (forward) and C9LM1 (reverse), located just upstream from exon 9 and 82 nucleotides downstream of the 11-bp insertion site, respectively, a 238-bp DNA fragment was yielded by PCR starting from the M allele, whereas the F allele gives a 248-bp fragment (Fig. 8, Step IIA).

Individuals analysed here, which allowed the M allele to be characterized, were heterozygous M/F. Consistent with our structural results, they gave the two fragments (238 and 248 bp), as shown for one of them in Fig. 8, Step IIB (lane 1). It is worth noting that the third band observed with this sample is due to the occurrence of a heteroduplex structure; this was confirmed by analysing an amplification product from the mix of samples F/F and X/X (Fig. 8, Step IIB, lane 4).
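The two-step genotyping logic can be summarised per allele as a small decision rule. The Python sketch below is a simplified illustration, assuming that each allele is scored separately from its TaqI digestion result (Step I) and, when undigested, from its Step II fragment size; the function name and return labels are hypothetical. It also checks that the 248-bp F fragment is consistent with the 238-bp M fragment once the 11-bp insertion and the single-nucleotide deletion in exon 9 are taken into account.

```python
def classify_allele(taqi_cuts, step2_fragment_bp=None):
    """Classify one allele from Step I (TaqI digestion) and Step II (fragment size)."""
    if taqi_cuts:
        # Step I: the created TaqI site (TCGA) is present, giving 240 + 26 bp.
        return "neither M nor F"
    # Step I: fragment left undigested, so the allele is M or F.
    if step2_fragment_bp == 238:
        return "M"
    if step2_fragment_bp == 248:
        return "F"
    raise ValueError("Step II fragment size not interpretable")

# Step I digestion products for alleles carrying the created TaqI site.
step1_cut_fragments = (240, 26)

# Step II: F carries an 11-bp insertion in intron 9 and lacks 1 nt in exon 9.
expected_f_fragment = 238 + 11 - 1

print(classify_allele(True))
print(classify_allele(False, 238), classify_allele(False, 248))
print(sum(step1_cut_fragments), expected_f_fragment)
```

A heterozygous M/F animal would show both undigested Step I products and both Step II fragments superimposed, which is the pattern described for the goats analysed here.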

DISCUSSION

We report the identification and the molecular characterization of a new allele, named M, occurring at the αs1-Cas locus in the goat. This novel allele, characterized by the transition C→T at position 23 in the 9th exon of the gene, was found in the Montefalcone breed, at very low frequency (< 2%) after phenotypic analysis of 147 individual milk samples. All goats bearing the M variant were shown to be heterozygous (M/F and M/B).

Interestingly, the mutation specific for the M allele affects the same nucleotide as that which is deleted in the F allele, and shown to be responsible for the internal deletion of 37 amino-acid residues in the F variant, as a consequence of the skipping of three successive exons during the course of the processing of the primary transcripts [9]. At the peptide level, the C→T transition, which leads to a Ser66→Leu66 substitution, gives rise to the loss of two of the five phosphate groups clustered in the multiple phosphorylation site of the αs1-Cas. This loss explains the lower electrophoretic mobility of the M protein compared with the other caprine αs1-Cas variants described so far. This situation is similar to that observed in sheep, with the αs1-CasD variant (previously called the Welsh variant). Actually, this ovine variant has only two phosphoserine residues instead of five in the homologous region of the αs1-CasA and C variants [22]. However, whereas the structural alteration in the D (Welsh) variant is associated with a reduction in milk casein content [29,30], the M variant, like the goat αs1-CasA and B variants, must be considered a ‘strong’ variant, given the intensity of the isoelectrofocusing bands and the surface of the relevant peak in RP-HPLC.

Fig. 6. Nucleotide sequence of the expected αs1-CasM cDNA obtained by genomic exon sequencing analysis: comparison with its A and F allele counterparts. Numbering begins with the first nucleotide of the first exon (up) and the first amino-acid residue of the mature M protein (down). Dashes indicate nucleotides identical with those of the M allele. The stop codon is symbolized by ***. Numbers in vertical framed arrows indicate the position of the introns. The boxes indicate amino-acid substitutions.

Unexpectedly, placing variant M in the phylogeny (Fig. 9) proposed by Grosclaude and Martin [6] turned out to be rather difficult. Indeed, a comparison of the different variants at the peptide sequence level suggests a hybrid structure for the M protein. Taking into account amino-acid combinations at the polymorphic residues (haplophenotypes), the M variant, with a proline and a glutamine residue at positions 16 and 77, respectively, could be placed in both lineages (A and B) arising from the putative ancestral protein B1. This possible dual membership strongly suggests the involvement of a recombination/gene conversion event between alleles from the two lineages. This hypothesis was strengthened by genomic sequence data. Although a mutation-driven convergence cannot be excluded, an interallelic recombination/gene conversion event seems to be the most plausible explanation. Intronic sequences of the A, F and M alleles (Fig. 7) strongly support such a notion. Indeed, a detailed comparative analysis at 25 polymorphic sites, including 23 single nucleotide polymorphisms, spanning a large part of the transcription unit provides a haplotype formula allowing each allele to be precisely characterized. Thus, the M allele unequivocally appears to be a hybrid structure made of F-type allele sequences in its 5′ part followed by A-type allele sequences in its 3′ part (except for the C→T transition at nucleotide 23 in the ninth exon). Following such a scheme, a recombination event would have occurred around exon 8 (Fig. 7). However, the genomic sequences analysed do not allow us to determine whether allele M originated from a double (gene conversion) or a single (recombination) cross over. Nevertheless, as there are no sequence clues indicating a second cross over in the 10 kb separating exon 8 from the end of the transcription unit, it seems most likely that the M allele originated in an interallelic recombination event. Gene conversion events, which usually account for exchanges over short sequence tracts [31], have been mainly described and intensively investigated as mechanisms generating allelic diversity in highly polymorphic genetic systems, such as the loci encoding the class-II cell surface antigens of the major histocompatibility complex in humans [32,33]. Both mechanisms have also been thought to account for genetic disorders in humans, such as sporadic Alzheimer disease cases [34] and diabetic pathology involving the gene encoding insulin [35].
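The haplotype reasoning used above (F-type sites in the 5′ part, A-type sites in the 3′ part, plus one site private to M) can be sketched in Python as follows. This is only an illustration of the logic, not the authors' procedure: the three toy haplotypes are hypothetical and do not reproduce the 25 sites of Fig. 7, and the function names are invented for this example.

```python
def classify_sites(recombinant, parent_5p, parent_3p):
    """Label each site of the putative recombinant by the parent it matches."""
    labels = []
    for r, p5, p3 in zip(recombinant, parent_5p, parent_3p):
        if r == p5 and r == p3:
            labels.append("both")      # parents identical here: uninformative
        elif r == p5:
            labels.append("5p")        # matches only the 5' donor
        elif r == p3:
            labels.append("3p")        # matches only the 3' donor
        else:
            labels.append("private")   # matches neither parent
    return labels

def breakpoint_interval(labels):
    """Return (index of last 5'-type site, index of first 3'-type site) when the
    informative sites fit exactly one 5'-to-3' switch; otherwise return None."""
    informative = [(i, lab) for i, lab in enumerate(labels) if lab in ("5p", "3p")]
    switches = sum(1 for (_, a), (_, b) in zip(informative, informative[1:]) if a != b)
    if switches != 1 or informative[0][1] != "5p":
        return None
    last_5p = max(i for i, lab in informative if lab == "5p")
    first_3p = min(i for i, lab in informative if lab == "3p")
    return last_5p, first_3p

# HYPOTHETICAL toy haplotypes, one character per polymorphic site.
f_like = "ACGTTAGC"   # stands in for the 5' donor
a_like = "GCATCCAT"   # stands in for the 3' donor
m_like = "ACGTCCTT"   # stands in for the recombinant, with one private site

labels = classify_sites(m_like, f_like, a_like)
print(labels)
print(breakpoint_interval(labels))
```

A single switch among the informative sites is what a single cross over predicts; a gene conversion tract would instead show a switch to the second parent and then a switch back, in which case `breakpoint_interval` returns `None`.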

Simplified haplotype formulae strongly suggest that the allele that provided the 3′ part of the recombinant allele (M) is the A allele (Fig. 10). In contrast, one can wonder whether the donor allele of the 5′ part is the F allele or another allele belonging to the same B allelic lineage (excluding B1 and C), as they share the same simplified haplotype formula up to exon 8. To reach a definite conclusion, the complete sequence of the 5′ region of each allele would be required, because no differences have been found in the available sequences (exons and intron-flanking regions).

If our recombination hypothesis is correct, the break point should be located between nucleotide 86 upstream and nucleotide 40 downstream from exon 8, and the cross over should have been accompanied by a reciprocal exchange. One can therefore expect to find the reciprocal recombinant allele among the alleles so far described. The structural features of such a recombinant allele should be an A-type sequence in the 5′ part followed by a B-type (B2/B3/B4 or F) sequence in the 3′ part. The only allele found so far gathering such characteristics is allele B1, with a B2 simplified haplotype formula in its 3′ part (Fig. 10). This supports our assumption and suggests that the M allele probably results from an interallelic recombination event involving alleles A and B2, whereas the reciprocal event might have given B1. However, with a leucine residue at position 66, it is clear that the M allele does not arise directly from the recombination event between alleles A and B2. It is probably derived from an intermediate hybrid allele (B2:A), putatively W, not yet identified, which was subsequently mutated at nucleotide 23 of the ninth exon.
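The reciprocal-exchange argument can be illustrated with the simplified haplotype formulae quoted in the Fig. 10 caption for alleles B2 and A. The sketch below rests on two assumptions that are not stated in the text. First, each formula is read as six symbols, with "SP" treated as a single phosphoserine symbol to match the six boxes of the figure. Second, the break point is placed after the third symbol. The function name `cross_over` is illustrative.

```python
# Simplified haplotype formulae from the Fig. 10 caption.
# ASSUMPTION: "SP" is one symbol (phosphoserine), giving six sites per allele.
B2 = ("H", "P", "SP", "E", "R", "T")
A = ("H", "L", "SP", "Q", "R", "T")

def cross_over(five_prime_donor, three_prime_donor, n_sites):
    """Single cross over: the first n_sites come from one allele,
    the remaining sites from the other."""
    return five_prime_donor[:n_sites] + three_prime_donor[n_sites:]

# ASSUMPTION: break point after the third site. Any split between the second
# and fourth sites gives the same products, because the third site is shared.
W = cross_over(B2, A, 3)            # B2-type 5' part, A-type 3' part
reciprocal = cross_over(A, B2, 3)   # A-type 5' part, B2-type 3' part

print("-".join(W))
print("-".join(reciprocal))
```

The first product carries the proline and glutamine combination that the text attributes to the M variant, and the second carries the A-type 5′ and B2-type 3′ pattern that the text attributes to B1. The later Ser→Leu change that distinguishes M from the putative W is a separate point mutation and is not modelled here.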

Because of its close similarity to its bovine and ovine αs1-Cas counterparts, allele B1 was considered to be the ancestral allele in the goat [6]. The results reported here indicate that B1 might result from an interallelic recombination between alleles A and B, which can therefore be considered representatives of two ancestral allelic lineages. The reciprocal proposal, i.e. that B1 and W are the parental alleles, the recombinant products of which are A and B2, cannot be ruled out (Fig. 10). The latter proposal is, however, less plausible, given the low frequencies at which alleles B1 and M have been found in the goat populations analysed so far. It is worth noting that both alleles are characteristic of local breeds, Poitevine (France) and Montefalcone (Italy), respectively. Nevertheless, whatever the hypothesis retained, the existence of two ancestral allelic lineages seems to be the most likely scenario. Thus, interallelic recombination between two alleles may be responsible for the generation of four possible allelic lineages (represented by A, B2, B1, and W), one of which (W/M) is revealed by this work. The high polymorphism of the goat αs1-Cas system provides further evidence that allelic diversity can arise from multiple pathways, including shuffling of polymorphic sequences generated by point mutations, through interallelic recombination events.

Fig. 7. Polymorphisms occurring at 25 sites in the goat αs1-CasA, F and M alleles. The position of each polymorphic site is identified and numbered relative to the nearest exon. Intronic nucleotides are preceded by a ‘–’ or ‘+’ when they are upstream or downstream, respectively (e.g. –11/1 corresponds to the nucleotide located 11 nucleotides upstream from the first exon). Polymorphic sites in an exon sequence are identified without a sign (e.g. 8/4 identifies the 8th nucleotide of the 4th exon). RsaI/6–8 indicates the loss (–) or gain (+) of an RsaI restriction site within the DNA fragment spanning exon 6 to exon 8. Mutations specific for alleles M and F at position 23 in the 9th exon are highlighted. The symbol Δ indicates the nucleotide deletion in allele F [6]. The hatched boxes, identified by i7-e8-i8, encompass the putative recombination region.

Fig. 8. Genotyping the M allele at the αs1-Cas locus. Step I: ACRS-PCR using the primer pair MWU and C9LM* yields a 265/266-bp fragment, whatever the allele. Amplicons are then submitted to restriction by TaqI (A). The TaqI restriction site (TCGA) created in exon 9 is indicated. Nucleotides C and A* correspond to the mutation characteristic for allele M and the substitution introduced within the primer C9LM*, respectively. Fragments generated are finally analysed by agarose gel (2% Metaphore + 2% agarose) electrophoresis (B). Lane 1, molecular mass marker (pBR322 digested by HaeIII); lane 2, nondigested PCR product; lanes 3–5, homozygous X/X, heterozygous M/F and heterozygous X/F samples, respectively, where X represents an allele different from F, B, E, and C. Sizes (in bp) of DNA fragments are given on the right of the gel. Step II: AS-PCR to discriminate between alleles F and M (A). Amplification between primers C9UM1 and C9LM1 generates DNA fragments of characteristic size for the allele. (B) Agarose gel (2% Metaphore + 2% agarose) analysis of amplicons from heterozygous M/F (lane 1), homozygous X/X (lane 2), homozygous F/F (lane 3), F/F + X/X mix (lane 4), with X different from F, B, E, and C. Lane 5 shows a molecular mass marker (pBR322, HaeIII digested). Sizes (in bp) of DNA fragments are given on the left of the gel.
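The first genotyping step described in the Fig. 8 caption relies on a TaqI site (TCGA) that is present or absent depending on the nucleotide at position 23 of exon 9. A minimal Python sketch of that readout is given below. The three amplicons are hypothetical toy sequences, not the real 265/266-bp products, and the sketch assumes that an allele carrying C at the site is cut whereas the M-type substitution and the F-type deletion both abolish the site. TaqI is known to cut TCGA after the first T; the function name is illustrative.

```python
TAQI_SITE = "TCGA"   # TaqI recognition sequence; the enzyme cuts T^CGA

def taqi_fragment_sizes(amplicon):
    """Return fragment lengths after complete TaqI digestion of a linear amplicon."""
    cuts = []
    start = amplicon.find(TAQI_SITE)
    while start != -1:
        cuts.append(start + 1)                        # cut falls just after the T
        start = amplicon.find(TAQI_SITE, start + 1)
    bounds = [0] + cuts + [len(amplicon)]
    return [right - left for left, right in zip(bounds, bounds[1:])]

# HYPOTHETICAL toy amplicons used only to show the principle.
with_site = "GGATCCAATCGATTGGCC"   # C at the diagnostic position: TCGA present
m_like = "GGATCCAATTGATTGGCC"      # C->T substitution: TTGA, site lost
f_like = "GGATCCAATGATTGGCC"       # one-nucleotide deletion: site lost

for name, seq in [("with_site", with_site), ("m_like", m_like), ("f_like", f_like)]:
    print(name, taqi_fragment_sizes(seq))
```

In this toy setting the uncut M-like and F-like products differ by a single nucleotide, which is too small a difference to resolve reliably by digestion alone; this is consistent with the caption's use of a second, allele-specific PCR step to discriminate F from M.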

Fig. 9. Phylogeny proposed by Grosclaude and Martin [6] for the αs1-Cas alleles and differences between the corresponding variants. The phylogenetic tree proposed is based on the existence of a single ancestral allele (B1), which was considered to be the original one given its close similarity to its ovine and bovine αs1-Cas counterparts.

Fig. 10. A new phylogenetic tree integrating the possible interallelic recombination between two allelic lineages. The four alleles (B2, A, B1, and W) putatively involved in the recombination event are schematically represented as a chain of six boxes (mimicking exons) on which are indicated polymorphic amino-acid residues and their position in the peptide chain. A simplified haplotype formula is thus provided (e.g. HPSPERT and HLSPQRT for alleles B2 and A, respectively). The RsaI polymorphic restriction site and insertions occurring, respectively, between exons 6 and 8 and within intron 9 are indicated. Alleles deriving from these four ‘potentially recombinant’ alleles (boxed) are circled. Arrows indicate a possible pathway of evolution to alleles associated with high (black) or reduced (red) amounts of casein synthesized. The M allele is derived from allele W by a single nucleotide transition C→T (nucleotide 23/exon 9) leading to the occurrence of a leucine residue (allele M) instead of the Ser (putative allele W) in the multiple phosphorylation site of αs1-Cas. The new phylogeny has been enriched with three novel variants (H, I and L) reported in 1997 by Chianese et al. [14].

Title	Interallelic recombination is probably responsible for the occurrence of a new αs1-casein variant found in the goat species
Authors	Claudia Bevilacqua, Pasquale Ferranti, Giuseppina Garro, Cristina Veltri, Raffaella Lagonigro, Christine Leroux, Emilio Pietrolà, Francesco Addeo, Fabio Pilla, Lina Chianese, Patrice Martin
Institution	Università degli Studi del Molise
Subject	Biochemistry
Type	Journal article
Year of publication	2002
City	Campobasso
