Legume lectin carbohydrate recognition: cloning of peanut (Arachis hypogaea) and winged bean (Psophocarpus tetragonolobus) agglutinins and redesign of carbohydrate specificity of PNA
Abstract
Recognition of cell-surface carbohydrates by lectins has wide implications in important biological processes such as protein targeting to cellular compartments, homing of lymphocytes, host-pathogen interactions, and fertilization. The ability of plant lectins to detect subtle variations in carbohydrate structures found on molecules, cells, and organisms has made them a paradigm for protein-carbohydrate recognition (Sharon, 1993; Rini, 1995). Lectins from legumes display a considerable repertoire of carbohydrate specificities, owing perhaps to the sequence hypervariability in the loops constituting their combining site. This, along with the commonly faced problems in cloning and expression of plant genes, has deterred attempts to re-engineer their specificities. These problems constitute the central theme of this thesis, which describes the cloning of two lectins from seeds of peanut (Arachis hypogaea) and winged bean (Psophocarpus tetragonolobus). Active recombinant peanut lectin is produced in Escherichia coli, which is further used as a system to carry out structure-based redesign of its carbohydrate specificity. The current understanding of protein-carbohydrate interactions, in particular involving legume lectins, has been reviewed in Chapter 1.
Peanut (Arachis hypogaea) agglutinin (PNA), a 110 kDa homo-tetrameric lectin, is specific for the tumor-associated T-antigenic determinant (Galβ1,3GalNAc; Thomsen-Friedenreich antigen). PNA is widely used as a marker for the expression of this O-linked glycan by poorly differentiated and transformed cells. The ability of PNA to specifically recognize immature thymocytes has likewise been exploited to separate them from mature thymocytes as a prelude to bone marrow transplantation. Although the lectin was first identified by G. W. G. Bird in 1964 as a haemagglutinating activity in peanut seeds, the protein was first purified by two groups (Lotan et al., 1975; Irimura et al., 1975). It was shown to recognize terminal α- and β-galactosides and to prefer the T-hapten (Galβ1,3GalNAc). Elucidation of the three-dimensional structure of PNA by X-ray crystallography demonstrated that, despite being very similar to other legume lectins in its tertiary structure, PNA has the most unusual quaternary structure reported thus far for any tetrameric protein (Banerjee et al., 1994). The complete primary structure of PNA has been determined using protein sequencing methods (Young et al., 1991), and the cDNA encoding the PNA precursor was subsequently cloned (Arango et al., 1992). The PNA precursor has an extra 23-aa signal peptide and a 14-aa C-terminal peptide, both of which need to be cleaved to produce the functional protein. Chapter 2 describes the cloning, sequencing, and production of the functionally processed form of PNA in E. coli. The coding sequence for PNA, amplified from peanut genomic DNA, was used to produce recombinant PNA in E. coli. PNA was purified by refolding inclusion bodies solubilized in the presence of a denaturant.
Lectins from legumes constitute one of the most well-studied families of proteins, yet the lack of a rigorous framework to explain their carbohydrate binding specificities has precluded a rational approach to altering their ligand-binding activity in a meaningful manner. Studies reported in Chapters 3 and 4 deal with the redesign of the recognition propensity of PNA. Although extensively used as a tool for the recognition of the tumor-associated Thomsen-Friedenreich antigen (T-antigen; Galβ1,3GalNAc) on the surfaces of malignant cells and immature thymocytes, PNA also recognizes N-acetyllactosamine (LacNAc; Galβ1,4GlcNAc) present at the termini of several cell-surface glycoproteins. The crystal structure of the PNA-lactose complex revealed the presence of leucine 212 at a position close enough to contact the acetamido group on LacNAc. Chapter 3 describes studies on two leucine mutants, L212N (Leucine → Asparagine) and L212A (Leucine → Alanine), which exhibit distinct preferences towards T-antigen and N-acetyllactosamine, respectively. The carbohydrate binding studies reveal that mutant L212N does not recognize LacNAc even at high concentrations, making it a far more specific cell-surface marker than its wild-type counterpart.
Certain legume lectins, like EcorL and SBA, bind both galactose and N-acetylgalactosamine (GalNAc), while others like PNA do not bind the latter. On examination of the three-dimensional structure of the EcorL-lactose complex, it was proposed that the presence of a hydrophobic cavity, which can accommodate bulky substituents such as acetamido or dansylamido (NDns) at C-2 of the lectin-bound galactose, is responsible for the preferences displayed. However, mutations in EcorL that altered such hydrophobic pockets did not completely abolish GalNAc binding, suggesting the contribution of additional stereochemical factors. This became evident on analysis of the crystal structure of the PNA-lactose complex, where the side chain of Glu129 is in a position close enough to sterically oppose any substitution at the C-2 hydroxyl of galactose. Chapter 4 describes two PNA mutants, E129D and E129A, which allow binding of GalNAc, unlike wild-type PNA, which does not recognize GalNAc at all. This observation underscores the subtlety of carbohydrate recognition by legume lectins, where a minor change at a single non-interacting amino acid can drastically influence the specificity at the primary binding site.
Chapter 5 describes the cloning and sequencing of the basic lectin from the seeds of winged bean, Psophocarpus tetragonolobus. The 58 kDa (subunit Mr 29,000) winged bean basic lectin, WBA I, has an isoelectric point of >9.5 and agglutinates trypsinized rabbit erythrocytes and trypsinized type A and B, but not trypsinized type O, human erythrocytes. Earlier thermodynamic and kinetic studies done in our laboratory using GalNDns as the fluorescent ligand (Khan et al., 1986) showed that the α-linked disaccharides of Gal and GalNAc are considerably better ligands than their β-linked counterparts. Further studies confirmed that the combining site of WBA I is extended and encompasses all the residues of the WBA I-specific blood group A-reactive trisaccharide (GalNAcα1,3Galβ1,4Glc). The primary structure of WBA I, determined by protein sequencing in the laboratory by K. D. Puri, showed that it belongs to the family of single-chain legume lectins. Sequence analysis suggested that one of the loops constituting the carbohydrate binding site of WBA I is longer than that in other legume lectins. Since protein sequencing methods are prone to a certain degree of error, we cloned WBA I by PCR amplification using oligonucleotides deduced from the protein sequence. The complete DNA sequence of WBA I confirmed that it exhibits considerable homology to other legume lectins and shares the highest identity with Erythrina corallodendron lectin. Furthermore, the DNA-deduced amino acid sequence of WBA I is quite similar to that obtained by protein sequencing, and the amino acid residues involved in carbohydrate binding, metal binding, and binding of hydrophobic ligands are highly conserved. The sequence also provides information about the two potential glycosylation sites of WBA I. Since natural WBA I is glycosylated whereas the recombinant protein is not, the latter displays extensive resistance to solubilization, as is evident from the attempts to express the protein in E. coli.
In conclusion, cloning, expression, and rational modification of the specificity of PNA have been achieved. The work on PNA shows that subtle manipulations of the binding site of a legume lectin can produce meaningful alterations in specificity. In this respect, residues other than the ones that make primary contact with the saccharide are more relevant in influencing lectin specificity. The data, as summarized in Chapter 6, provide a framework for understanding the molecular recognition of carbohydrates by PNA as well as by legume lectins in general. This work lays the groundwork for engineering lectins to further improve their specificities in an attempt to develop better tools for carbohydrate recognition.
Appendix A describes the sequence analysis of several legume lectins. The carbohydrate binding site of legume lectins has been shown to be constituted by hypervariable regions of the protein. Careful analysis of this hypervariability by aligning sequences of the four major carbohydrate binding sites, as shown in this section, reveals the existence of a uniform pattern in the variability of the sequences. The classification of legume lectins on the basis of the sizes of their binding loops proves that there is a common theme in the origin of the primary specificity of these proteins. These results contribute to a further understanding of the molecular basis of carbohydrate recognition by legume lectins.

