Developing a high resolution melting method for genotyping and predicting association of SNP rs353291 with breast cancer in the Vietnamese population

Introduction: Breast cancer is one of the most common types of cancer and the second leading cause of cancer death in women worldwide. Recent studies have demonstrated that microRNAs (miRNAs) play a crucial role as potential new biomarkers for breast cancer. Single Nucleotide Polymorphisms (SNPs) located in specific miRNAs may contribute to breast cancer. Among these SNPs, rs353291 has been shown to be associated with breast cancer in individuals of Caucasian background. Furthermore, the mutant allele of this SNP is observed at a high frequency in the Vietnamese population. Thus, SNP rs353291 was selected as a candidate SNP for investigation in this study. The frequency of SNP rs353291 was evaluated by the High Resolution Melting (HRM) method, a powerful method for detecting variants in DNA sequences, especially for SNP genotyping. Methods: The association between this SNP and the risk of breast cancer in the Vietnamese population was evaluated in 90 cases and 96 healthy controls via genotyping using an optimized HRM protocol. Results: The genotyping results revealed that SNP rs353291 is polymorphic in the Vietnamese population. The frequencies of genotypes AA, AG and GG were 40%, 42.2% and 17.8%, respectively, which correspond to allele frequencies of 61.1% for the putative risk allele A and 38.9% for allele G. The association analysis showed a slightly lower, but not statistically significant, risk of breast cancer for the G allele compared with the A allele (G vs A: OR=0.92, 95% CI: 0.62-1.36, p=0.677); the dominant model showed a similar non-significant reduction (GA+GG vs AA: OR=0.94, 95% CI: 0.52-1.70, p=0.839). Conclusion: Since the p-values were >0.05, our results show only a non-significant trend rather than a significant association between SNP rs353291 and breast cancer risk in the Vietnamese population.
Biomed Res Ther 2017, 4(12): 1812-1831. DOI: 10.15419/bmrat.v4i12.387. ISSN: 2198-4093. www.bmrat.org


Introduction
Breast cancer is not only one of the most common cancers in women but also the leading cause of cancer death in women worldwide. In 2012, nearly 1.7 million new cases of breast cancer were reported, and breast cancer had become the second most common cancer overall, especially in developing countries. According to Globocan that year, among the seven countries of South-Eastern Asia, Vietnam had a high incidence rate comprising 23 percent. Moreover, according to the Association of Cancer in Ho Chi Minh City in 2012, 30 out of 100,000 women in Hanoi were affected by this deadly cancer, while the ratio in Ho Chi Minh City was 20 out of 100,000 women (Association of Cancer). These statistics indicate that breast cancer is increasing at an alarming rate; it has become not only a prominent cancer in women worldwide (next to cervical cancer) but also more prevalent among women in Vietnam.
MicroRNAs (miRNAs) have recently been studied for their potential role as biomarkers for various types of cancer, including breast cancer. MicroRNAs belong to a large family of small (approximately 20-22 nucleotides), non-coding RNAs that regulate the expression of target genes by targeting mRNAs to trigger either translational repression or mRNA degradation. Indeed, miRNAs take part in almost all biological processes, such as apoptosis, cell growth and differentiation, because of their direct effect on human mRNAs. It has been demonstrated that miRNAs play an important role in cell-to-cell communication and can serve as therapeutic and diagnostic markers.
Single Nucleotide Polymorphisms (SNPs) are the most abundant type of variant in the human genome. They can affect the function of various genes when they appear within a gene or in a regulatory region near a gene. Many SNPs within miRNA sequences or their target sites have been found to be associated with many kinds of cancers. There have been numerous investigations into the function and influence of SNPs located in certain miRNA regions and their effects on breast cancer pathways (Onay et al., 2006). Among these potential SNPs, SNP rs353291, located in the miRNA 143 host gene transcript (miR143HG), may have an association with breast cancer. SNP rs353291 is located 450 bp upstream from the miR145 gene, inside miR143HG (on the long arm of chromosome 5, region 32, at position 148,810,746). MiR143HG is the gene region which contains miR143 and miR145. SNP rs353291 may play an important role in the expression of miR145, leading to abnormal activity of miR145. MiR145 has been shown to take part in the TP53 pathway, which controls the regulation of estrogen receptor-α (ER-α) and death-promoting signals in breast cancer cells. The apoptotic process in wild-type TP53-expressing, ER-α-positive breast cancer cells is promoted by the tumor suppressor activity of miR145. The expression of miR145 not only triggers TP53 activation but also restrains ER-α, resulting in a positive regulatory death loop which leads to re-expression of miR145 in breast cancer patients. Moreover, miR145 can inhibit breast cancer cell growth through RTKN. RTKN is considered another target of miR145; it is a coding gene and a potential marker for the identification of breast cancer cells. It is expressed at low levels in normal cells but at high levels in cancer cell lines. In cells, resistance to apoptosis plays a significant role in tumorigenesis.
Thus, from all the evidence above, it can be assumed that miR145 could be a promising target for the treatment of breast cancer since it is a potent tumor suppressor that can regulate multiple cellular pathways, as described in Figure 1. Previously, there has been some evidence that SNP rs353291 is associated with breast cancer. This SNP was shown to be closely related to an increased risk of developing breast cancer in individuals of Australian Caucasian background in two independent case-control cohorts (p=0.041 and p=0.023). Since the Vietnamese population is distinct from other populations, our current study seeks to analyze the correlation between SNP rs353291 and breast cancer in Vietnamese women.
The High-Resolution Melting (HRM) method is a powerful method for SNP genotyping and was therefore applied in this study. It allows researchers to genotype genetic variants rapidly and accurately in a large number of samples in a short time. HRM measures the change in fluorescence that accompanies the melting of double-stranded DNA in the presence of a saturating DNA-intercalating dye, and it characterizes nucleic acid samples based on their melting behavior. Hence, different genotypes in PCR products are detectable because sequence differences change the shape of the DNA melting curves. The method is highly cost-effective, due to low-cost dyes and shorter optimization time, compared to other genotyping technologies such as sequencing or TaqMan SNP typing. In addition, since this is a closed-tube method, no processing is required between amplification and analysis; thus, errors and contamination are largely avoided (Premier Biosoft).
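The melting principle described above can be illustrated with a minimal sketch. It assumes a synthetic sigmoid melting curve (the function `synthetic_curve`, its `width` parameter and the Tm values of 80.0 °C and 80.8 °C are hypothetical and are not data from this study) and locates the apparent Tm as the peak of the negative derivative -dF/dT, which is how melting peaks are commonly derived from fluorescence data.

```python
import math

def melting_peak(temps, fluorescence):
    """Return the temperature at which -dF/dT is largest (apparent Tm)."""
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        # Central-difference approximation of -dF/dT at temps[i]
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t

def synthetic_curve(tm, temps, width=0.5):
    """Hypothetical sigmoid melting curve: fluorescence falls around tm."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]

# 75.0 to 85.0 °C in 0.1 °C steps (illustrative acquisition grid)
temps = [75.0 + 0.1 * i for i in range(101)]

# Two hypothetical amplicons whose Tm values differ by 0.8 °C
tm_a = melting_peak(temps, synthetic_curve(80.0, temps))
tm_b = melting_peak(temps, synthetic_curve(80.8, temps))
print(f"apparent Tm A: {tm_a:.1f} °C, apparent Tm B: {tm_b:.1f} °C")
```

Real instrument software additionally normalizes the curves and compares their shapes, which this sketch does not attempt.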
This study aimed to analyze the correlation between SNP rs353291 and the risk of breast cancer in Vietnamese women, using an optimized HRM method.

Sample preparation
The population of interest in this study was the Vietnamese population. Blood samples were collected from the Oncology Hospital, Ho Chi Minh City. The sample population included 100 cases and 100 healthy controls. The control group consisted of healthy individuals. All the cases were diagnosed with breast cancer by the Oncology Hospital, were prepared for surgery, and were eligible to participate in the study. Blood samples were collected only from female individuals who belonged to the Vietnamese-Kinh population. All patients were given consent forms to sign, and study approval was obtained from the Ethical Committee of the Oncology Hospital, HCMC, Vietnam (under decision number 177/HĐĐĐ-CĐT, granted on the 18th of November, 2014). The collected blood samples were stored in tubes containing EDTA at -20 °C until use.
Collected blood was stored in 2 mL tubes containing EDTA to prevent coagulation; samples were transported from the Oncology Hospital laboratory to our laboratory within 24 hours. The blood samples were frozen at -80 °C until DNA extraction was performed. Samples from cases were to be compared to those from healthy controls. DNA was extracted from blood samples by the salting-out method, following the protocol of Hue et al. In this protocol, white blood cells were isolated from whole blood by centrifugation, and then cell lysis buffer (Tris-HCl 10 mM, sucrose 11%, MgCl2 5 mM, and Triton X-100 1%) was added to lyse the cells and release cellular components. Next, pellets were treated with 300 µL of nuclei lysis buffer (Tris-HCl 10 mM, TmTS 1%, EDTA 10 mM, and sodium citrate 10 mM) to lyse the nuclei and release DNA. After that, the DNA was separated from other components and cell debris by adding 500 µL of chloroform. The upper aqueous phase containing DNA was then transferred to a new 1.5 mL micro-centrifuge tube. The DNA was precipitated out of the solution using absolute ethanol and 100 µL of 5 M NaCl, followed by 70% ethanol. All aforementioned reagents were obtained from Thermo Fisher Scientific (Waltham, MA, USA). The supernatant was discarded and the precipitated DNA was left to dry overnight. Finally, the dried, clear DNA was dissolved in molecular-grade water or RNase-free water and then stored at -20 °C for further use. After the extraction, DNA samples were measured by absorbance in a NanoDrop 1000 Spectrophotometer (Thermo Scientific, USA). DNA purity was determined by calculating the ratio of absorbance at 260 nm to absorbance at 280 nm. An A260/A280 ratio in the range of 1.7-2 was considered to indicate high-purity DNA for HRM analysis.
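The purity criterion above amounts to a simple ratio check. The following is a minimal sketch; the function names and the absorbance readings in the example are hypothetical, and only the 1.7-2 acceptance range is taken from the text.

```python
def a260_a280_ratio(a260, a280):
    """Ratio of absorbance at 260 nm to absorbance at 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280

def is_high_purity(a260, a280, low=1.7, high=2.0):
    """True if the A260/A280 ratio lies in the accepted range (1.7-2)."""
    return low <= a260_a280_ratio(a260, a280) <= high

# Hypothetical readings, for illustration only
accepted = is_high_purity(0.90, 0.50)   # ratio 1.8, inside the range
rejected = is_high_purity(0.75, 0.50)   # ratio 1.5, below the range
print(accepted, rejected)
```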
Primer design
SNP rs353291 is located 450 bp upstream from the MIR145 gene, in the intron region on the long arm of chromosome 5 at position 148,810,746. The sequence of SNP rs353291 was identified using the GenBank database. The sequence and other information related to this SNP were obtained from NCBI. The online tool Primer3plus was used to design a specific pair of primers (Primer3plus). One set of primers for HRM analysis was designed with Primer3plus according to the following criteria: product size of 80-150 bp for the HRM primers and 300-400 bp for the sequencing primers, primer size of 18-27 bp, melting temperature (Tm) of the primers of 60-70 °C (65 °C is optimal), and a difference in Tm between the forward and reverse primers not exceeding 3 °C. After that, the specificity of the primers was tested with NCBI BLAST (NCBI BLAST) and UCSC in silico PCR (UCSC In silico pcr) to limit any undesired PCR product. To eliminate secondary structure, the Oligo Analyzer online tool (Oligo Analyzer) was used. The primer pairs with the highest specificity were then used to predict the HRM melting curves of their amplicons using UmeltHet (UmeltHets). The best primer pair should yield products with 3 distinct curves and peaks for the 3 genotypes GG, AA and GA. Adjustments were made for components such as [Mg2+] (mM) and % DMSO. The "Very High -0.1°C" resolution setting was used to achieve the best-distinguished melting curves for each genotype. Besides the primers for HRM analysis, an extra pair of primers for sequencing (to confirm the genotypes of three positive control samples) was also designed.
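The design criteria listed above can be expressed as a small checklist. This is a sketch only: the class `PrimerPair`, the function `check_hrm_primer_pair`, and the example sequences and Tm values are hypothetical placeholders (they are not the primers used in this study), and the Tm values are assumed to come from the design tool rather than being computed here.

```python
from dataclasses import dataclass

@dataclass
class PrimerPair:
    forward: str        # 5'->3' sequence of the forward primer
    reverse: str        # 5'->3' sequence of the reverse primer
    tm_forward: float   # Tm in °C, as reported by the design tool
    tm_reverse: float   # Tm in °C, as reported by the design tool
    amplicon_bp: int    # expected product size in bp

def check_hrm_primer_pair(pair):
    """Return a list of criteria violated by an HRM primer pair."""
    problems = []
    for name, seq in (("forward", pair.forward), ("reverse", pair.reverse)):
        if not 18 <= len(seq) <= 27:
            problems.append(f"{name} primer length {len(seq)} outside 18-27")
    for name, tm in (("forward", pair.tm_forward), ("reverse", pair.tm_reverse)):
        if not 60.0 <= tm <= 70.0:
            problems.append(f"{name} primer Tm {tm} outside 60-70 °C")
    if abs(pair.tm_forward - pair.tm_reverse) > 3.0:
        problems.append("Tm difference between primers exceeds 3 °C")
    if not 80 <= pair.amplicon_bp <= 150:
        problems.append(f"amplicon size {pair.amplicon_bp} outside 80-150 bp")
    return problems

# Placeholder sequences, for illustration only
good = PrimerPair("ACGTACGTACGTACGTACGT", "TGCATGCATGCATGCATGCA", 64.5, 65.8, 120)
bad = PrimerPair("ACGTACGTACGT", "TGCATGCATGCATGCATGCA", 58.0, 65.8, 200)
good_issues = check_hrm_primer_pair(good)
bad_issues = check_hrm_primer_pair(bad)
print(good_issues)
print(bad_issues)
```

For sequencing primers the same check would apply with the product-size window changed to 300-400 bp.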

Genotyping method optimization
Three stages of optimization of the HRM analysis were conducted: initial optimization, identification of controls and final optimization. For initial optimization, the annealing temperature (Ta) was optimized in the range of 58 °C to 72 °C using an Eppendorf PCR thermal cycler (Eppendorf, Germany) and the TopTaq Master Mix Kit (Qiagen, Germany). The total volume of each PCR reaction was 25 µL, which included 1X TopTaq Master Mix, 0.2 µM of each primer, 10 ng of DNA, and molecular-grade water. A negative control was included in each assay. After that, the PCR products were run on a 2% electrophoresis gel for 20 minutes at 80 V.
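The pipetting volumes for such a 25 µL reaction follow from the dilution relation C1·V1 = C2·V2. The sketch below is illustrative: the stock concentrations (a 2X master mix, 10 µM primer stocks and a 10 ng/µL DNA working solution) are assumptions that are not stated in the text, and the function names are hypothetical.

```python
def stock_volume(final_conc, stock_conc, total_volume_ul):
    """Volume of stock needed so that final_conc is reached in total_volume_ul."""
    return final_conc * total_volume_ul / stock_conc

def reaction_mix(total_ul=25.0, master_mix_stock_x=2.0,
                 primer_stock_um=10.0, dna_stock_ng_per_ul=10.0, dna_ng=10.0):
    """Per-reaction volumes in µL; stock concentrations are assumed values."""
    mix = {
        "master_mix": stock_volume(1.0, master_mix_stock_x, total_ul),   # 1X final
        "forward_primer": stock_volume(0.2, primer_stock_um, total_ul),  # 0.2 µM final
        "reverse_primer": stock_volume(0.2, primer_stock_um, total_ul),  # 0.2 µM final
        "dna": dna_ng / dna_stock_ng_per_ul,                             # 10 ng per reaction
    }
    water = total_ul - sum(mix.values())
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    mix["water"] = water  # molecular-grade water fills the remainder
    return mix

mix = reaction_mix()
for component, volume in mix.items():
    print(f"{component}: {volume:.2f} µL")
```

Multiplying each volume by the number of reactions plus a small excess gives a master-mix recipe for a full run.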
For the positive control identification step, since this genotyping method requires positive controls, some samples were selected randomly for HRM reactions at the previously optimized Ta. Components of each reaction included: 1X PCR buffer, 200 µM of each deoxynucleotide triphosphate (dNTP), 2.0 mM MgCl2, 0.2 µM forward primer, 0.2 µM reverse primer, 2.5 units of HotStarTaq, 10 ng of DNA, and molecular-grade water. Samples falling into different melting curve groups were considered to have different genotypes. After that, three samples representing the possible genotypes were selected for sequencing. One positive control was expected for each of the three genotypes of SNP rs353291 (AA, AG, and GG).
For the final optimization of MgCl2, once the three controls had been determined, optimization was conducted again in order to obtain clustered and distinct melting curves for the three controls together. Because the curves predicted by the UmeltHets software may differ from those obtained in practice, this final MgCl2 optimization was required to establish the best conditions for the HRM reaction. A gradient of MgCl2 concentrations (1.5-3.5 mM) was evaluated.

Genotyping
Finally, the optimal HRM conditions were applied to the 100 breast cancer samples and 100 healthy controls for genotyping. For each run, three positive controls and one negative control were included in the plate. The results were analyzed using LightCycler® 96 SW 1.1 software (Roche, Switzerland). The samples with identified genotypes then proceeded to the next step (calculation of genotypic frequency).

Statistical analysis
The Hardy-Weinberg equilibrium (HWE) was analyzed in controls. P <0.05 was considered to indicate departure from HWE. To determine whether the SNP frequencies in cases and controls were significantly different, allele and genotype frequencies were compared using the Chi-square test. Furthermore, the odds ratio (OR) with 95% confidence interval (CI) was used to assess the strength of association between rs353291 and breast cancer risk in Vietnamese patients.
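A Hardy-Weinberg check of the kind described above can be sketched as a chi-square goodness-of-fit test with one degree of freedom. The function name is hypothetical. The example counts (36 AA, 38 AG, 16 GG) are an assumption: they are back-calculated from the genotype percentages reported for the 90 cases (40%, 42.2%, 17.8%) and serve only as an illustration, since the study tested HWE in controls, whose genotype counts are not given in the text.

```python
import math

def hwe_chi_square(n_aa, n_ag, n_gg):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic SNP.

    Returns (frequency of allele A, chi-square statistic, p-value with 1 df).
    """
    n = n_aa + n_ag + n_gg
    p = (2 * n_aa + n_ag) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele G
    if p == 0.0 or q == 0.0:
        raise ValueError("SNP is monomorphic; HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ag, n_gg)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value

# Back-calculated case genotype counts (assumption, see lead-in)
freq_a, chi2, p_value = hwe_chi_square(36, 38, 16)
print(f"freq(A)={freq_a:.3f}, chi2={chi2:.2f}, p={p_value:.3f}")
```

With these counts the frequency of allele A comes out at about 61.1%, matching the allele frequency reported for the cases.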

Results
Primer design
Four pairs of primers were selected based on strict criteria using Primer3Plus and the melting curves predicted by UmeltHets. As a result, the pair of HRM primers (HRM F1, HRM R1) and the sequencing primers (SEQ F1, SEQ R1) satisfied the requirements of the HRM analysis (Table 1, Fig. 2). The specificity of this primer pair was checked carefully with NCBI BLAST and found to be high; in Oligo Analyzer the primer pair showed very weak secondary structure formation (∆G > -1).
anneal, which would then result in more non-specific products. Moreover, it was noted that there may be a borderline between success and failure in amplifying the products of this SNP at 62 °C. In addition, there was no extra band on the gel, which confirms the high specificity of the primer sets.

Identification of positive controls by sequencing
The main purpose of this step was to identify 3 control samples for the 3 different genotypes: homozygous wild-type (AA), heterozygous mutant (GA) and homozygous mutant (GG). Eight random samples were chosen to be analyzed by HRM. The assay was performed with 3.0 mM MgCl2 at the optimal Ta of 60 °C. As a result, the 8 samples separated into three groups; one sample representing each curve pattern was then subjected to sequencing to determine its genotype (Fig. 4).

Optimization of Ta for the sequencing primers
The annealing temperature of the sequencing primers was investigated in order to obtain the best sequencing results. A Ta gradient from 61 °C to 71 °C was tested to find the optimal Ta for amplification, and the PCR products were visualized on a 1.5% agarose gel. As indicated in Figure 5, the brightness of the bands decreased from 61 °C to 71 °C. Therefore, the optimal Ta was selected as 69 °C (below the temperature of 71 °C).

Confirmation of 3 control samples by sequencing
The three suspected samples were sequenced to confirm their genotypes (Fig. 6).

Final HRM protocol
Before genotyping, the 3 positive controls were tested to find out which MgCl2 concentration allowed the control samples to have the most distinctive curves (i.e. the best melting curves in the HRM assay). An optimal MgCl2 concentration helps to reduce non-specific amplification, thus allowing good discrimination of the 3 control samples. In this study, the three positive controls were run with two MgCl2 concentrations: 3.0 mM and 3.5 mM.
Both 3.0 mM and 3.5 mM MgCl2 gave distinctive melting curves and shapes, with the 3 genotype curves separating clearly. However, the 3 amplification curves at 3.0 mM MgCl2 reached a Ct value of 24, whereas at 3.5 mM the amplification curve of the AA genotype reached a Ct value of 28 (Fig. 7). The latter may lead to instability; thus, 3.0 mM MgCl2 was considered to be the ideal concentration and was selected for the optimized HRM protocol for genotyping rs353291.

Genotyping results
After the optimal HRM conditions had been established, 100 samples from breast cancer patients and 100 controls (without cancer) were evaluated by the HRM assay. In each HRM run, we performed about 16 to 24 reactions (2-3 HRM strips). A typical genotyping result is shown in Figure 8; the main results are listed in Table 2 and

Evaluation of the HRM method
The success of this method was measured by the stability (Tm change) and sensitivity for 200 samples (100 cases/100 controls) run on the HRM assay without any repeats. The normalized melting curves are shown in Figure 8B. The samples of each genotype grouped together. With HRM, the different homozygotes had similar curve shapes but different Tm values, whereas heterozygotes had a different shape from homozygotes. Similarly, the melting peaks were distinguishable for each genotype (Fig. 8A). Moreover, in the HRM analysis, the three genotypes of rs353291 were most easily distinguished in the difference plots (Fig. 8C).
Stability was assessed as the variation in Tm among control samples and across different runs, using R statistical software (Rstudio 3.2.2). Table 4 shows the average Tm with its standard deviation (SD) for each genotype. The differences in Tm among samples of the same genotype are small; accordingly, the SD is small (0.11-0.15). In particular, the Tm difference among samples and across runs is not remarkable. According to Table 5, the average ΔTm between any two genotypes differs significantly (P <0.001), while the variance of ΔTm is small (from 0.06 to 0.14). High stability is the basis for accurate distinction of all three genotypes of this SNP.
The ANOVA test showed that Tm values differed significantly among the genotypes (p-values < 0.001; Table 4). For rs353291, the Tm values of the three genotypes did not overlap each other. The distances between the three yellow boxes are large (Fig. 9); even the smallest ΔTm value (AA vs. AG) reaches 0.8 °C, which is sufficient for a clear differentiation between the three genotypes. In addition, the AA peak is different from the AG peak. Consequently, the three genotypes of this SNP can be read easily with the HRM method, even with a large number of samples or many runs.
In most cases, it was possible to directly distinguish the three different genotypes of SNP rs353291 using HRM (Fig. 8). Nevertheless, there were some cases where the HRM curves did not cluster around any of the controls or had unusual melting patterns (Fig. 10). The samples that generated such poor results were called failed samples. Finally, the frequencies of each genotype of this SNP were calculated. Overall, of the 200 samples, 14 failed when genotyping rs353291. Therefore, we were able to optimize the HRM method successfully, with a sensitivity of 93% (186 of 200 samples).
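The 93% figure above follows directly from the number of failed samples. A minimal sketch of the arithmetic is given below; the function name `call_rate` is hypothetical, and the inputs (200 samples, 14 failures) are taken from the text.

```python
def call_rate(total_samples, failed_samples):
    """Fraction of samples that yielded an interpretable genotype."""
    if total_samples <= 0 or not 0 <= failed_samples <= total_samples:
        raise ValueError("invalid sample counts")
    return (total_samples - failed_samples) / total_samples

genotyped = 200 - 14          # samples with a called genotype
rate = call_rate(200, 14)
print(f"{genotyped} of 200 samples genotyped ({rate:.0%})")
```

The 186 successfully genotyped samples are consistent with the 90 cases and 96 controls used in the association analysis.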

Discussion
The main purpose of this study was to develop and optimize the High Resolution Melting (HRM) method for genotyping SNP rs353291 in order to estimate the frequency of this SNP, as well as to determine the potential association of this SNP with breast cancer in the Vietnamese population. In terms of the genotyping method, the genotypes/amplicons were significantly distinguishable from each other (p-value < 0.001) by ANOVA test (Tables 4, 5). Furthermore, the three genotypes could be discerned by the shape of their melting curves, melting peaks, and/or difference plots. The Hardy-Weinberg Equilibrium test (p-value=0.276; thus >0.05) indicated that our study population was in Hardy-Weinberg equilibrium. Thus, this population was considered able to represent the Vietnamese population. In our study, we successfully genotyped 186 of the 200 samples by HRM with high stability and accuracy.
Many studies of SNPs in miRNAs have been conducted recently. It has been shown that miRNAs play an essential role in various biological functions, including development, cell differentiation, and progression of human diseases. Among these SNPs, the mutation at SNP rs353291 (located near the miR145 gene) has been reported to be associated with an increased risk of developing breast cancer in two different groups of Australian Caucasians (p-value=0.041 and p=0.023, respectively). Moreover, it has been used to determine the therapeutic potency of microRNA-145 against breast cancer. Therefore, SNP rs353291 has been assumed to play a role in decreasing the risk of breast cancer by changing the structure and function of miR145. In contrast to the results in the two Australian populations, in which G was the risk allele thought to increase the risk of breast cancer (p = 0.041, p = 0.023, respectively), the results of this study showed no significant association between SNP rs353291 and decreased risk of breast cancer in Vietnamese women (p-value >0.05).
This case-control study of breast cancer among Vietnamese women highlights the potential of the HRM method, providing evidence that HRM is a powerful, simple, fast and effective technique. An optimal HRM analysis protocol was applied for genotyping. The study results indicate that A is the major allele, with a frequency of 61.1% (38.9% for allele G). Moreover, the frequency of the major allele A among the cases in this study (61.1%) is greater than the frequency of the A allele in the control population (58.9%). According to the association analysis, the A allele may act as a risk allele, since all the models containing A have the potential to increase the risk of breast cancer (OR > 1). However, the confidence intervals of all the models are wide and their lower bounds are smaller than 1 (the lowest is 0.55). From these calculations, it appears that allele A does not entirely account for the risk of disease and may at times even reduce it. This ambiguity suggests that allele A may have no effect on the disease. Moreover, all the p-values were greater than 0.05; thus, there is no statistical significance. We can conclude, therefore, that SNP rs353291 has no significant association with the risk of breast cancer, based on our study. These results suggest that the Vietnamese population is distinct from other populations with respect to the frequency of this SNP. However, due to the small sample size in this study (90 cases/96 controls), the results are not fully reliable. Moreover, the power of this study is very low (7%). Thus, a larger sample should be studied to understand the correlation between this SNP and breast cancer risk in the Vietnamese population.
In addition, further studies need to be adequately powered to increase the reliability of the results. According to our calculation, the power of this study could reach 80% if the sample size were expanded to 9,264 cases/controls.
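The allelic odds ratio discussed above can be approximately reproduced with a short sketch using the standard log-odds (Woolf) confidence interval. The function name is hypothetical, and the allele counts are assumptions rather than data taken from the study's tables: the case counts (110 A, 70 G) are back-calculated from the reported case genotype frequencies among 90 cases, and the control counts (113 A, 79 G) are rounded from the reported 58.9% frequency of allele A among 96 controls.

```python
import math

def odds_ratio_ci(case_exposed, case_ref, control_exposed, control_ref, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    'exposed' is the allele being tested (here G); 'ref' is the reference (here A).
    """
    odds_ratio = (case_exposed * control_ref) / (case_ref * control_exposed)
    se_log = math.sqrt(1 / case_exposed + 1 / case_ref
                       + 1 / control_exposed + 1 / control_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Back-calculated allele counts (assumptions, see lead-in): G vs A
or_g, ci_low, ci_high = odds_ratio_ci(case_exposed=70, case_ref=110,
                                      control_exposed=79, control_ref=113)
print(f"OR={or_g:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

Because the control counts are rounded approximations, the output is close to, but not necessarily identical with, the reported G vs A estimate (OR=0.92, 95% CI: 0.62-1.36); in both cases the interval spans 1, consistent with the absence of a significant association.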

Conclusion
The high resolution melting protocol was successful in genotyping SNP rs353291 near miR-145. This method is advantageous in that it is quick, simple and effective. Therefore, the optimized HRM protocol could be applied to further association studies of this SNP. The genotyping results showed that the chosen sample sets for genotyping this SNP are suitable for representing the Vietnamese population (HWE p-value > 0.05).
The association analysis indicates that there was no significant association between SNP rs353291 and breast cancer risk in 90 cases/96 controls. The present study should prompt further studies of breast cancer predisposition genes in larger sample sizes. This will help enhance the reliability of the results and help identify the role/function of this SNP before its potential use as a biomarker for early breast cancer diagnosis.