Expression and characterization of a new serine protease inhibitory protein in Escherichia coli

1Mientrung Institute for Scientific Research, Vietnam Academy of Science and Technology, 321 Huynh Thuc Khang, Thua Thien Hue 531600, Vietnam 2Graduate University of Science and Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Ha Noi 122300, Vietnam 3Institute of Marine Biochemistry, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Ha Noi 122300, Vietnam 4NNT Hi-Tech Institute, Nguyen Tat Thanh University, 300A Nguyen Tat Thanh, Ho Chi Minh City 748000, Vietnam


INTRODUCTION
Proteases are enzymes that catalyze the hydrolysis of peptide bonds and play an important role in almost all biological processes; however, their uncontrolled activity often leads to disease. Excessive proteolysis has been linked to several diseases, including cancer and cardiovascular, inflammatory, neurodegenerative, bacterial, viral, and parasitic diseases 1. In these cases, protease inhibitors (PIs) can serve as versatile tools for regulating the proteolytic activity of target proteases 2. Enzyme inhibitors have received increasing attention, not only for their structures and mechanisms of action but also for their potential applications in different fields 1,3. Marine sponges are known to harbor diverse microbial communities [4][5][6] and, through their associated microorganisms, represent a prolific source of natural products [7][8][9][10]. Recent studies have shown that many potential protease inhibitors have been isolated from sponge-associated microorganisms [11][12][13][14]; however, the exploration and exploitation of potential PIs from microorganisms remain a major challenge because almost all microorganisms, especially symbiotic microbes, resist cultivation under laboratory conditions. Fortunately, new approaches (e.g., metagenomics) provide powerful tools for predicting, detecting, and expressing novel bioactive genes from uncultured microorganisms [15][16][17][18][19]. This opens new avenues for discovering new bioactive compounds, including PIs. To meet the demand for novel PIs from the marine environment, especially PIs from sponge-associated microorganisms, we optimized the expression of a new serine protease inhibitory protein, PI-QT, in an E. coli expression system and characterized the protein.

Materials
The gene PI-QT is a new gene encoding a serine protease inhibitor from the metagenome of marine sponge QT collected from Quang Tri, Vietnam. It was synthesized and inserted into cloning vector pUC57 (GenScript, Piscataway, NJ, USA). The sequence of gene PI-QT was deposited in the National Center for Biotechnology Information (NCBI) database with the accession number MK359987.

Sequence analysis, multiple sequence alignment, and phylogeny analysis
The PI-QT sequence was compared with other known sequences in the NCBI database using the BLAST program. The open reading frame (ORF) of PI-QT was determined using the ORF Finder program (https://www.ncbi.nlm.nih.gov/orffinder/). The molecular weight and pI of the deduced protein PI-QT were calculated using the Compute pI/Mw tool of the Expasy server (http://web.expasy.org/compute_pi/). Multiple sequence alignments were performed using the ClustalW algorithm in MEGA 7.0 (http://megasoftware.net). A phylogenetic tree was constructed by the neighbor-joining (NJ) method, with bootstrap support from 1,000 replications, as implemented in MEGA 7.0. A structural model of PI-QT was predicted using the SWISS-MODEL program (https://swissmodel.expasy.org/) and the (PS)2-v2 protein structure prediction server (http://ps2.life.nctu.edu.tw/).
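To illustrate what an ORF search involves, the following is a minimal sketch that scans the three forward reading frames for the longest ATG-to-stop stretch. It is not the algorithm used by the NCBI ORF Finder, which also handles the reverse strand, alternative start codons, and other genetic codes; the function name and the toy sequence are hypothetical.

```python
# Minimal forward-strand ORF scan; the toy sequence below is hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_forward_orf(dna: str) -> str:
    """Return the longest ATG...stop ORF (stop codon included) in the three forward frames."""
    dna = dna.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop codon
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = dna[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best

toy = "CCATGAAAGGTTGACCATGGCTGCTAAATAGGA"
orf = longest_forward_orf(toy)
print(orf, len(orf) // 3 - 1, "codons before the stop")
```

An empty string is returned when no complete ORF is present, so callers can distinguish that case without handling an exception.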

Determination of protein in soluble and insoluble fractions
The cell culture was centrifuged at 10,000 rpm for 15 min and the supernatant was removed. The cells were resuspended in TE buffer (20 mM Tris, 10 mM EDTA, 0.05 mM PMSF), incubated at -75 °C for 1 h, thawed at 50 °C for 30 min, and sonicated on ice with a Misonix Ultrasonic Liquid Processor (company name & location). The solution was centrifuged at 13,000 rpm for 15 min and the supernatant (soluble fraction) was collected. The pellet was resuspended in TE buffer to the original volume (insoluble fraction). Next, SDS loading buffer was added to the pretreatment culture (total protein), the soluble protein fraction, and the insoluble protein fraction, and the proteins were denatured at 100 °C for 10 min. Protein expression was checked on a 12.6% SDS-PAGE gel.

Western blot analysis
After electrophoresis on an SDS-PAGE gel, the recombinant protein was transferred to a polyvinylidene difluoride (PVDF) membrane. The membrane was then blocked with TBS containing 5% skimmed milk and incubated with anti-Trx antibody (Sigma-Aldrich, St Louis, MO, USA) at a dilution of 1:1000 overnight at 4 °C. The membrane was washed 4 times with TBS washing buffer and incubated with anti-mouse IgG secondary antibody at a dilution of 1:5000 for 1 h at room temperature. The membrane was washed with TBS washing buffer and visualized by addition of p-nitro blue tetrazolium chloride/5-bromo-4-chloro-3-indolyl phosphate (NBT/BCIP) solution.

Purification of protein PI-QT
The recombinant protein PI-QT was purified on a Ni-NTA affinity chromatography column and eluted with imidazole at concentrations of 100, 300, and 500 mM. The purified protein was desalted by dialysis (SnakeSkin™ Dialysis Tubing, Thermo Fisher Scientific) in TBS buffer (50 mM Tris-HCl, 50 mM NaCl, pH 7.4) at 4 °C for 24 h. Next, the Trx-tag was cleaved from the recombinant protein using a thrombin kit (Novagen, Darmstadt, Hesse, Germany).

Protease inhibitory assay
Trypsin inhibition assay
The purified protein (50 µl of 0.05 mg/ml) was added to a mixture of 50 µl of trypsin (Sigma-Aldrich) solution (0.05 mg trypsin/ml of 0.05 M Tris-HCl) and 100 µl of 0.05 M Tris-HCl, pH 8.0, containing 0.03 M CaCl2. The mixture was incubated at 37 °C for 10 min, then 1.0 ml of 0.8 mM BAPNA (Sigma-Aldrich) solution was added and the mixture was incubated at 37 °C for a further 10 min. The reaction was stopped by adding 20 µl of 30% (v/v) glacial acetic acid. Subsequently, the solution was centrifuged at 10,000 rpm for 15 min and the absorbance was measured at 410 nm against appropriate blanks.

α-Chymotrypsin inhibition assay
The purified protein (50 µl of 0.05 mg/ml) was added to a mixture of 50 µl of α-chymotrypsin (Sigma-Aldrich) solution (0.05 mg α-chymotrypsin/ml of 0.05 M Tris-HCl) and 100 µl of 0.05 M Tris-HCl, pH 8.0, containing 0.03 M CaCl2. The mixture was incubated at 37 °C for 10 min, then 1.0 ml of 0.86 mM BTpNA (Sigma-Aldrich) solution was added and the mixture was incubated at 37 °C for a further 10 min. The reaction was stopped by adding 20 µl of 30% (v/v) glacial acetic acid. Subsequently, the solution was centrifuged at 10,000 rpm for 15 min and the absorbance was measured at 410 nm against appropriate blanks. Bowman-Birk Inhibitor (BBI) (Sigma-Aldrich) was used as a positive control for the trypsin and α-chymotrypsin inhibition assays.
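The text does not state how absorbance readings were converted into inhibitory activity. One common convention, shown here purely as an assumption, expresses inhibition as the percentage drop in A410 relative to an enzyme-only control. The function name and the absorbance values are hypothetical, and the sketch does not reproduce the U/mg unit definition used in the study.

```python
def percent_inhibition(a_control: float, a_sample: float) -> float:
    """Percent loss of protease activity relative to an uninhibited control.

    a_control: blank-corrected A410 of the enzyme + substrate reaction without inhibitor
    a_sample:  blank-corrected A410 of the same reaction with inhibitor added
    """
    if a_control <= 0:
        raise ValueError("control absorbance must be positive")
    return (a_control - a_sample) / a_control * 100.0

# Hypothetical readings, not data from the study.
print(round(percent_inhibition(0.80, 0.20), 1))
```

The same function would apply to both the trypsin and α-chymotrypsin assays, since both read p-nitroaniline release at 410 nm.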

Identification of the recombinant protein
Tryptic digestion in gel
The protein band from SDS-PAGE was excised, destained and dehydrated with 50% acetonitrile. The protein was reduced with DTT (65 mM) and subsequently alkylated with IAA (55 mM). The peptides were desalted using Zip-Tips (Millipore, Bedford, MA) according to the manufacturer's instructions. The purified peptides were collected for analysis by LC-MS/MS.

Identification of the recombinant protein by shotgun proteomics
NanoLC-nanoESI-MS/MS analysis was performed on a nanoAcquity system (Waters, Milford, MA, USA) connected to an Orbitrap Velos hybrid mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) equipped with a PicoView nanospray interface (New Objective, Woburn, MA, USA). Peptide mixtures were loaded onto a 75 µm ID, 25 cm length C18 BEH column (Waters, Milford, MA, USA), packed with 1.7 µm particles with a pore size of 130 Å, and were then separated using a segmented gradient in 60 min from 5% to 40% solvent B (acetonitrile with 0.1% formic acid) at a flow rate of 300 nl/min and a column temperature of 35 °C. Solvent A was 0.1% formic acid in water. The mass spectrometer was operated in the data-dependent mode. After acquisition of spectra, the proteins and peptides were identified by the PEAKS software (Bioinformatics Solutions Inc., Ontario, Canada).

Characterization of protease inhibitor PI-QT
The effects of pH and temperature on the activity of the protease inhibitor were determined by performing the protease inhibitor assay at pH 7 after incubating the purified protease inhibitor in buffers of different pH (2-12) for 24 h and at different temperatures (30 °C to 70 °C) for 1 h. The effects of metal ions, surfactants, and oxidizing agents on the activity of the protease inhibitor were evaluated after incubating the protease inhibitor with 1 mM metal salts (MgSO4, CuSO4, ZnSO4, CaCl2, MnCl2, FeCl2, NaCl), 1% (v/v) surfactants (Tween 20, Tween 80, and Triton X-100), and oxidants (H2O2 and DMSO) for 30 min at 37 °C.

Statistical Analysis
The assays were performed in triplicate and expressed as the mean ± standard error of the mean (SEM). The statistical analysis was performed by t-test and one-way analysis of variance (ANOVA) followed by Tukey's multiple comparison tests using the SPSS v.22 (SPSS Inc, Chicago, IL, USA). The results were considered to be significant at P < 0.05.
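As an illustration of the mean ± SEM summary described above, the sketch below computes both values for a set of triplicate readings using only the Python standard library. The replicate values are hypothetical, and the t-test and ANOVA steps performed in SPSS are not reproduced here.

```python
import math
import statistics

def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

# Hypothetical triplicate readings (U/mg), not data from the study.
replicates = [950.0, 975.0, 1000.0]
mean, sem = mean_sem(replicates)
print(f"{mean:.0f} ± {sem:.1f} U/mg")
```

The SEM is the sample standard deviation divided by the square root of the number of replicates, so with n = 3 it is roughly 0.58 times the standard deviation.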

Amino acid sequence and phylogeny analysis of protein PI-QT
The gene PI-QT was 1,287 bp in length and comprised an open reading frame encoding 429 amino acids, with a calculated molecular mass of about 50 kDa and a theoretical isoelectric point of 4.56. Comparison of the deduced amino acid sequence of PI-QT with sequences in the NCBI database showed that PI-QT was most similar to serpins, with similarities below 55%. Multiple alignment of the deduced amino acid sequence of PI-QT with the most homologous serpins in the NCBI database showed that PI-QT shared conserved active site residues with microbial serpin members (Figure 1). The phylogenetic tree based on the neighbor-joining method (Figure 2A) placed PI-QT between two microbial serpin clades: one from a candidate bacterial phylum (Poribacteria) originally identified in the microbiome of marine sponges, and another from the bacterial phyla Firmicutes and Cyanobacteria. Based on the database comparison, PI-QT was considered a new microbial serpin. The protein structure of PI-QT was predicted by SWISS-MODEL (Figure 2B) and the (PS)2-v2 model (Figure 2C).
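The relation between gene length, residue count, and approximate mass can be checked with simple arithmetic. The sketch below assumes an average residue mass of 110 Da, a common rule of thumb that is not taken from the paper. The reported value of about 50 kDa came from the Expasy Compute pI/Mw tool applied to the actual sequence, so only rough agreement should be expected.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption, not a value from the paper

def codons(length_bp: int) -> int:
    """Number of complete codons in a coding sequence of the given length."""
    if length_bp % 3 != 0:
        raise ValueError("length is not a multiple of 3")
    return length_bp // 3

def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass estimate from residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0

n = codons(1287)
print(n, round(approx_mass_kda(n), 1))
```

With these assumptions, 1,287 bp gives 429 codons and an estimate of roughly 47 kDa, in the same range as the calculated mass quoted in the text.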

Construction of the expression vector pET-32a(+)/PI-QT
The vector pUC57 containing gene PI-QT and the expression vector pET-32a(+) were digested with EcoRI and NotI (Supplementary 1A). The gene PI-QT was then ligated into the linearized expression vector pET-32a(+) using T4 ligase, transformed into E. coli strain TOP10F', and plated on LB plates supplemented with ampicillin (50 µg/mL). Agarose gel analysis of plasmids isolated from colonies and digested with EcoRI and NotI (Supplementary 1B) showed bands of ~1.3 kb (corresponding to the size of gene PI-QT) and 5.9 kb (corresponding to the size of vector pET-32a(+)). In addition, sequencing of the plasmid showed that the inserted gene sequence was identical to the designed PI-QT sequence (data not shown), demonstrating that gene PI-QT was inserted successfully into the expression vector pET-32a(+) to form the recombinant vector pET-32a(+)/PI-QT.

Expression of recombinant protein in E. coli strain BL21(DE3)
The recombinant vector pET-32a(+)/PI-QT was transformed into E. coli BL21(DE3) by heat shock and incubated on LB/amp medium overnight. White colonies (Figure 3A) were then grown in LB/amp broth until the OD600 reached about 0.8-1.0. IPTG was then added to the culture, which was incubated at 37 °C for 4 h. SDS-PAGE analysis of the protein expression profile showed an overexpressed protein band of 64 kDa (Figure 3B, lane 3), corresponding to the size of recombinant protein PI-QT fused to the Trx-tag. This band was absent from the negative control containing only vector pET-32a(+) (Figure 3B, lane 1) and from samples containing the recombinant vector but not induced with IPTG (Figure 3B, lane 2). However, the recombinant protein was expressed mainly in the insoluble fraction (Figure 3C, lane 2). Therefore, the expression conditions needed to be optimized to increase the amount of recombinant protein in the soluble fraction.

Determination of suitable IPTG concentration for recombinant protein expression
Expression of the recombinant protein PI-QT was induced with different IPTG concentrations (0, 0.5, 1.0, and 1.5 mM). The SDS-PAGE gel analysis showed the recombinant protein was expressed at 1 mM IPTG, but not at low or high concentration of IPTG (0.5 mM and 1.5 mM). However, the expressed protein was still mainly present in the insoluble fraction (data not shown).

Determination of suitable temperature for recombinant protein expression
The expression of the recombinant protein was also investigated at different temperatures (20, 25, 28, and 30 °C). The results showed that the amount of recombinant protein in the soluble fraction reached the highest value (423 mg/L with about 45% in the soluble fraction) when the recombinant protein was expressed at 25 °C (Figure 4), whereas expression of the recombinant protein was not observed at 20 °C (data not shown). At 28 °C and 30 °C, the amount of recombinant protein produced was higher; however, only a small amount of the recombinant protein was detected in the soluble fraction (only about 20% of total protein).

Determination of suitable cell density for recombinant protein expression
The expression of the recombinant protein at different pre-induction cell densities (OD600 = 0.4, 0.5, 0.6, and 0.7) was investigated (Figure 5). The results showed that expression of the protein was not observed at OD600 = 0.4. The total amount of protein produced and the protein content in the soluble fraction increased with increasing pre-induction cell density and reached the highest values at OD600 = 0.6-0.7 (409 mg/L with 90% in the soluble fraction).
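To compare the two optimization steps on a common footing, the reported figures can be converted to an approximate soluble yield. This sketch assumes that the mg/L values quoted in the text are total recombinant protein and that the percentages give the soluble share of that total, an interpretation the text does not state explicitly. The function name is hypothetical.

```python
def soluble_yield(total_mg_per_l: float, soluble_percent: float) -> float:
    """Soluble recombinant protein (mg/L) from total yield and soluble percentage."""
    if not 0.0 <= soluble_percent <= 100.0:
        raise ValueError("soluble_percent must be between 0 and 100")
    return total_mg_per_l * soluble_percent / 100.0

# Figures quoted in the text: (total mg/L, approximate % soluble).
conditions = {
    "25 °C induction": (423.0, 45.0),
    "OD600 0.6-0.7 induction": (409.0, 90.0),
}
for label, (total, pct) in conditions.items():
    print(f"{label}: ~{soluble_yield(total, pct):.0f} mg/L soluble")
```

Under this reading, adjusting the pre-induction cell density roughly doubles the soluble yield even though the total yield is slightly lower.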

Purification and identification of the recombinant protein
Purification on the Ni-NTA affinity chromatography column gave a high yield of the recombinant protein at imidazole concentrations of 100 and 300 mM; however, protein purity was low. At an imidazole concentration of 500 mM, the amount of protein obtained was lower but its purity was higher than in the other two cases (Figure 6). To confirm that the purified protein was PI-QT, we performed a Western blot assay using anti-Trx antibody. The Western blot analysis showed a protein band of 64 kDa on the membrane, similar to the size of PI-QT fused to the Trx-tag (Figure 7A). In addition, to remove the Trx-tag from the recombinant protein PI-QT, the fusion protein was cleaved with thrombin. SDS-PAGE analysis of the thrombin-treated protein showed 2 bands of 50 kDa and 14 kDa, corresponding to the sizes of PI-QT and the Trx-tag, respectively (Figure 7B). This showed that the Trx-tag was removed successfully from the recombinant protein.
To identify the recombinant protein by mass spectrometry, the recombinant protein band was excised, digested, separated by chromatography, and analyzed by mass spectrometry (see Methods). Analysis of the LC-MS/MS data for the recombinant protein (Supplementary 2 & 3) revealed that the sequences of the peptide fragments extracted from the digested recombinant protein were identical to the designed sequence. Because the protein was hydrolyzed into small peptides and only about 30% of the peptides were recovered (Supplementary 3), it was not possible to find similar polypeptides in available databases. However, analyses by the PEAKS software against a protein database built from the designed sequence, together with the recovered peptide sequences, confirmed that the expressed and purified recombinant protein was the desired protein.

Activity and characterization of the recombinant protease inhibitor
The recombinant protein was evaluated for its protease inhibitory activity against trypsin and α-chymotrypsin. The protease inhibitory assay showed that the recombinant protein inhibited trypsin and α-chymotrypsin with specific activities of 975 ± 26 U/mg and 417 ± 14 U/mg, respectively. Compared to PI-QT, BBI (positive control) showed stronger inhibitory effects against trypsin and α-chymotrypsin, with specific activities of 3303 ± 66 U/mg and 1340 ± 58 U/mg, respectively. The activity of the protease inhibitor peaked at pH 7 and remained above 60% within pH 4-9. The activity declined sharply under highly acidic (pH 3) and alkaline (pH 10) conditions (Figure 8A). The results also showed that the protease inhibitor was most active at temperatures of 20-35 °C and maintained more than 60% of its activity up to 50 °C. The protease inhibitor activity decreased rapidly at temperatures >60 °C (Figure 8B).
The presence of surfactants (Tween 20, Tween 80, and Triton X-100) decreased the activity of the protease inhibitor compared to the control (Figure 8C). The activity of the protease inhibitor was also reduced in the presence of the oxidizing agents H2O2 and DMSO (Figure 8C). The effects of metal ions on the protease inhibitor were also examined (Figure 8D). The presence of Zn2+, Mg2+, and Ca2+ ions enhanced the activity of the protease inhibitor, whereas Cu2+, Mn2+, Fe2+, and Na+ ions reduced its activity compared to the control.
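The comparison with the positive control can be made explicit by expressing the specific activities of PI-QT reported above as a percentage of those of BBI. The sketch below uses only the mean values quoted in the text and ignores the reported errors; the dictionary layout and function name are hypothetical.

```python
# Specific activities (U/mg) as reported in the text.
ACTIVITY = {
    "PI-QT": {"trypsin": 975.0, "chymotrypsin": 417.0},
    "BBI": {"trypsin": 3303.0, "chymotrypsin": 1340.0},
}

def relative_to_control(protease: str) -> float:
    """PI-QT specific activity as a percentage of the BBI positive control."""
    return ACTIVITY["PI-QT"][protease] / ACTIVITY["BBI"][protease] * 100.0

for protease in ("trypsin", "chymotrypsin"):
    print(f"{protease}: {relative_to_control(protease):.1f}% of BBI")
```

On these figures PI-QT reaches roughly 30% of the BBI specific activity against both proteases.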

DISCUSSION
Protease inhibitors play an important role in the regulation of protease activity and have been used as tools in different fields. Indeed, the discovery and exploitation of novel PIs from different sources have garnered increased attention across scientific research areas. In the present study, we expressed in E. coli and characterized a new serine protease inhibitor protein (PI-QT) from the metagenome of sponge-associated microorganisms. The serpins in the NCBI database that are homologous to PI-QT are from microorganisms, suggesting that PI-QT is a microbial serpin. Interestingly, several serpins homologous to PI-QT have been detected in sponge-associated bacteria (candidate phylum Poribacteria) based on metagenome data. The PI-QT open reading frame encoded 429 amino acids with a calculated molecular mass of about 50 kDa. These values are consistent with those of most serpin proteins, which range from 40 to 60 kDa and from 330 to 500 amino acids 20. The protein PI-QT exhibited inhibitory effects against trypsin and α-chymotrypsin with specific activities of 975 ± 26 and 417 ± 14 U/mg, respectively. The specific activity of PI-QT was comparable with that of novel protease inhibitors reported in recent studies [21][22][23][24]. For example, Jiang et al. 21 cloned and expressed a novel serpin (Spi1C) from a metagenomic library of uncultured marine microorganisms; the Spi1C protein inhibited trypsin and α-chymotrypsin with specific activities of 6940 and 3640 U/mg, respectively. Chan et al. 22 purified a thermostable trypsin inhibitor from small pinto beans and reported that the purified protein inhibited trypsin with a specific activity of 2398 U/mg. Shamsi et al. 23 isolated and purified a novel Kunitz trypsin inhibitor (ASPI) from garlic (Allium sativum); the ASPI protein inhibited trypsin with a specific activity of 30376 U/mg. In another study, Mohan et al. 24 purified and characterized a protease inhibitor from Capsicum frutescens with a specific activity against trypsin of 6749 U/mg.
The effects of IPTG concentration, pre-induction cell density, and temperature on the expression of the protease inhibitor PI-QT were evaluated to identify suitable conditions for its expression in E. coli. Previous studies have reported that the synthesis of recombinant protein from the expression vector pET-32a(+) is controlled by the T7 promoter and that this promoter is induced by the presence of IPTG in the culture medium. The concentration of IPTG in the culture medium can, therefore, influence the expression of recombinant protein.
Low concentrations of IPTG in the medium may hinder transcription and reduce the amount of foreign protein produced, whereas high concentrations of IPTG can cause cytotoxicity and inhibit cell growth 25. In addition, although the optimal temperature for growth of E. coli is 37 °C, this temperature has been reported to be unsuitable for producing foreign protein. High temperature may result in plasmid loss. Moreover, expression at high temperatures can produce proteins with an incorrect tertiary structure, resulting in the loss of biological activity, whereas a low expression temperature may reduce cleavage of the target protein by intracellular proteases and significantly increase the amount of recombinant protein 26,27.
The experiments also showed that the protease inhibitor PI-QT was stable and maintained its activity within a wide range of pH and temperature. This stability is a promising characteristic for applications in the biotechnological and pharmaceutical industries. Similar results were observed for many protease inhibitors of the Kunitz family in previous studies. The protease inhibitors in the Kunitz family are stable over wide pH ranges (pH 4-10) but sensitive to extreme pH conditions 21,28-31. Under strongly acidic or alkaline conditions, proteinaceous inhibitors are denatured and lose their activity partially or completely. In addition, high temperature can affect the intramolecular disulfide bridges that are presumably responsible for the functional stability of Kunitz-type protease inhibitors 32. Negative effects of surfactants and oxidizing agents on the activity of the protease inhibitor PI-QT were also observed in this study.
The effects of surfactants on the activity of protease inhibitors could be attributed to a reduction of hydrophobic interactions, whereas the effects of oxidizing agents may be attributed to the probable oxidation of the amino acid methionine at the reactive site of the inhibitors 33. Furthermore, most of the metal ions tested reduced the activity of the protease inhibitor PI-QT, with the exception of Mg2+, Ca2+, and Zn2+. Our study, together with previous studies, reveals that expressing genes identified by metagenome-based screening can be a versatile strategy for mining new natural products from the marine environment as well as other environments [15][16][17].

CONCLUSIONS
A new gene encoding a serine protease inhibitor, PI-QT, screened from the metagenome of sponge-associated microorganisms, was expressed successfully in E. coli under optimized conditions. The recombinant protein had a molecular mass of about 50 kDa. The Western blot assay and mass spectrometric data analyses confirmed expression of the recombinant protein PI-QT. The recombinant protein exhibited inhibitory activity against trypsin and α-chymotrypsin with specific activities of 975 ± 26 U/mg and 417 ± 14 U/mg, respectively. The protease inhibitor was stable within a wide range of pH and temperature. This study has shown that expression of recombinant protein from genes identified by metagenome-based screening is a promising tool for the discovery and potential use of new bioactive compounds.