Abstract
Hypertrophic cardiomyopathy (HCM) is a highly heterogeneous disease. Although it is primarily caused by mutations in genes encoding sarcomeric proteins, other genes might explain that heterogeneity. Potential candidates are the genes encoding GATA transcription factors, which regulate the expression of proteins associated with HCM. Exons of the GATA2, GATA4, and GATA6 genes were sequenced in 212 unrelated patients with HCM previously analyzed for genes encoding the most frequently mutated sarcomeric proteins. Functional effects of variants were predicted by in silico analyses. Three potentially pathogenic variants were identified: c.-77G>A in GATA2, p.Ala343Thr (rs370588269) in GATA4, and p.Pro555Ala (rs146243018) in GATA6. Multivariate analyses showed that angina was more frequent in patients carrying both sarcomeric and GATA rare variants (55.5% vs 23.2% in non-carriers of GATA rare variants, OR (95% CI) 7.12 (1.23 to 41.27), p=0.029). Among patients without a known causal mutation, GATA rare variants were associated with a greater maximum posterior wall thickness (16±4 vs 14±3 mm in non-carriers, p=0.019). Thus, variants having a putative effect on GATA genes could alter the expression of their target genes and modify the hypertrophic response. Therefore, although relatively infrequent in patients with HCM, they may offer novel insight into the molecular mechanisms related to the pathogenesis of HCM.
Significance of this study
What is already known about this subject?
Hypertrophic cardiomyopathy (HCM) is primarily caused by mutations in genes encoding sarcomeric proteins.
Non-sarcomeric genes have been described as modifiers of HCM features.
GATA family of transcription factors participates in the hypertrophic response.
What are the new findings?
We describe 31 genetic variants, 4 of them not previously described, in the GATA2, GATA4, and GATA6 genes in a cohort of 212 patients with hypertrophic cardiomyopathy.
Three rare variants identified in patients (c.-77G>A in GATA2, p.Ala343Thr in GATA4, and p.Pro555Ala in GATA6) have putative damaging functional effects, as predicted by in silico analyses.
The presence of rare GATA variants is associated with greater maximum posterior wall thickness in patients without a known sarcomeric mutation, and with angina in carriers of sarcomeric mutations.
How might these results change the focus of research or clinical practice?
These variants could offer novel insight into the molecular mechanisms related to the pathogenesis of hypertrophic cardiomyopathy (HCM).
Replication of these findings in larger populations could provide new genetic variants for use in screening, diagnosis, and prognosis in patients with HCM.
Understanding the effect of these variants on the expression or function of cardiac proteins could help in the design of new therapeutic drugs.
Introduction
Hypertrophic cardiomyopathy (HCM, MIM #192600) is the most common form of inherited cardiomyopathy and is defined by the presence of increased left ventricular wall thickness that cannot be explained solely by abnormal loading conditions.1 The identification of a large number of mutations in genes encoding contractile proteins has led to the consideration of HCM as a ‘disease of the sarcomere’.2 However, mutations in these genes explain only about 60% of patients with HCM.1 Importantly, a wide phenotypic heterogeneity is commonly found among carriers of the same mutation, even within a family. Thus, additional genetic variants have been sought in recent years. Mutations in genes encoding z-disc proteins,3 calcium handling proteins,4 or proteins involved in protein degradation5 have also been related to the pathogenesis of the disease and can explain an additional, relatively small number of HCM cases. Furthermore, polymorphisms in genes of the renin–angiotensin system,6,7 in the resistin gene,8 and in the MEF2C gene,9 among others, have been described as modifiers of HCM features, with controversial results in some cases.10 These findings support the search for additional genes implicated in the pathophysiology of HCM or in its interindividual variability.11 Likely candidates are cardiac transcription factors that regulate the expression of cardiac genes involved in the hypertrophic response.12 Among these regulators is the GATA family, composed of six members in vertebrates.
GATA1, GATA2, and GATA3 are mainly expressed in hematopoietic cell lineages13 but also in endocardial and vascular endothelium, where GATA2 is one of the main regulators of endothelin-1,14,15 a recognized prohypertrophic factor.16 GATA4, GATA5, and GATA6 are widely expressed in tissues derived from mesoderm and endoderm, such as heart, gut, liver, and gonads.17 Interestingly, overexpression of GATA4 or GATA6 has been described as necessary and sufficient to induce a hypertrophic response, whereas GATA4 or GATA6 knockdown inhibits the development of cardiac hypertrophy.18,19 Despite the association between deregulation of GATA transcription factors and cardiac hypertrophy, no studies have examined whether their genetic variation is associated with HCM. Thus, the aim of this study was to screen the GATA2 (MIM #137295), GATA4 (MIM #600576), and GATA6 (MIM #601656) genes in a cohort of patients with HCM in order to find genetic variants in these transcription factors that might be associated with the pathophysiology or the phenotypic heterogeneity of this disorder.
Methods
Patients and controls
A total of 512 unrelated individuals were analyzed. Patients with HCM (n=212) were recruited at the Department of Cardiology at Hospital Universitario Central de Asturias (HUCA), in Oviedo, Spain. All of them were white and from Asturias, a region in the north of Spain with 1 million inhabitants. HCM was diagnosed according to the American College of Cardiology/European Society of Cardiology (ACC/ESC) guidelines.20 These patients had previously been screened for mutations in the most frequently mutated sarcomeric genes (MYBPC3, MYH7, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1), and their mutations have already been published.21,22 The main clinical and anthropometric characteristics of the patients are summarized in table 1.
A group of 300 healthy white individuals from Asturias, 76% men, aged 20–75 years (mean±SD: 43.8±12.5 years), were recruited through the Blood Bank and the Department of Cardiology at HUCA and constituted the control population. They had no history of cardiovascular or systemic diseases, but asymptomatic hypertrophy could not be ruled out.
All samples were obtained in accordance with the Declaration of Helsinki guidelines and with approval from the local Ethics Committee for Clinical Investigation of Asturias. All participants signed an informed consent form for the study.
Genotyping
Genomic DNA was extracted from peripheral blood samples as previously described.23 Coding exons, flanking introns, and 5′ untranslated region (UTR) of human GATA2, GATA4, and GATA6 (GenBank accession numbers NG_029334, NG_008177, and NG_032677, respectively) were amplified using PCR as previously described.24 Primers were designed to anneal in the flanking introns on the basis of the genomic sequence (primer sequences and specific conditions are available on request). To search for genetic variants, PCR amplicons were then sequenced on both strands using an ABI3130 system, with BigDye chemistry (Applied Biosystems, Foster City, California, USA). Sequences were analyzed with BioEdit Sequence Alignment Editor and compared with reference sequences in public databases from National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
Classification of identified variants
Identified variants were classified as rare if the minor allele frequency (MAF) was ≤0.5% in the control population. Variants were considered novel if they had not previously been reported in public databases (dbSNP, http://www.ncbi.nlm.nih.gov/snp/; 1000 Genomes Project, http://www.1000genomes.org/; or NHLBI Exome Sequencing Project (ESP) database, http://evs.gs.washington.edu/EVS/). Variants found in patients were considered potentially HCM-related when they were absent or rare in our control population, or when a statistically significant difference was found between frequencies in patients and controls, and, in addition, a putative functional effect was predicted by at least two in silico analysis tools. Cosegregation within the family was also examined for each potentially pathogenic GATA variant.
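The classification rules above can be written out as a short decision procedure. The following Python sketch is illustrative only: the `Variant` record, its field names, and the example values are hypothetical and are not part of the study's pipeline; only the thresholds (MAF ≤0.5% in controls, at least two in silico tools) come from the text.

```python
from dataclasses import dataclass

RARE_MAF_THRESHOLD = 0.005  # MAF <= 0.5% in controls, as stated in the text
MIN_TOOLS_WITH_EFFECT = 2   # "at least two in silico prediction analysis tools"


@dataclass
class Variant:
    """Hypothetical record; field names are assumptions made for this sketch."""
    name: str
    control_minor_alleles: int           # minor-allele count observed in controls
    control_chromosomes: int             # 2 x number of genotyped controls
    in_public_databases: bool            # dbSNP, 1000 Genomes, or ESP
    differs_patients_vs_controls: bool   # significant frequency difference
    tools_predicting_effect: int         # tools predicting a functional effect


def control_maf(v: Variant) -> float:
    """Minor allele frequency in the control population."""
    return v.control_minor_alleles / v.control_chromosomes


def is_rare(v: Variant) -> bool:
    """Rare (or absent) in controls; absence corresponds to a MAF of 0."""
    return control_maf(v) <= RARE_MAF_THRESHOLD


def is_novel(v: Variant) -> bool:
    """Novel means not reported in any of the public databases consulted."""
    return not v.in_public_databases


def is_potentially_hcm_related(v: Variant) -> bool:
    """Frequency criterion AND a predicted effect from at least two tools."""
    frequency_criterion = is_rare(v) or v.differs_patients_vs_controls
    return frequency_criterion and v.tools_predicting_effect >= MIN_TOOLS_WITH_EFFECT


# Illustrative values: one minor allele among 300 controls (600 chromosomes).
example = Variant("illustrative_variant", 1, 600, True, False, 2)
print(is_rare(example), is_novel(example), is_potentially_hcm_related(example))
```

With 300 controls (600 chromosomes), the 0.5% threshold corresponds to at most three minor alleles observed in the control group.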
In silico sequence analyses
Amino acid sequence conservation among species was analyzed by searching the HomoloGene database (http://www.ncbi.nlm.nih.gov/HomoloGene). MutationTaster (http://www.mutationtaster.org/) was used to study DNA sequence alterations, analyzing evolutionary conservation, splice-site changes, loss of protein features, and changes that affect the amount of mRNA. The potential pathogenicity of non-synonymous substitutions was predicted using two complementary programs: Sorting Intolerant From Tolerant (SIFT; http://sift-dna.org) and PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/).25 Possible transcription factor binding motifs affected by the genetic variants located in the 5′UTR were identified using the software MatInspector from Genomatix (http://www.genomatix.de/) with the default settings (matrix similarity 0.75). The Mfold server (http://mfold.rna.albany.edu/) was used to evaluate the most thermodynamically stable form of the mRNA, calculating free energies and secondary structures for each variant using the default settings. SNPfold (http://ribosnitch.bio.unc.edu/snpfold/SNPfold.html) was used to evaluate the structural consequences of the variation on the mRNA by computing a correlation coefficient between the nucleotide pairing probabilities of the two sequences. Both programs were used to analyze changes located in the 5′ and 3′UTRs. Non-synonymous variants affecting threonine, serine, or tyrosine residues were studied with NetPhosK 1.0 (http://www.cbs.dtu.dk/services/NetPhosK/) to predict potential kinase-specific phosphorylation sites. The microRNA.org server (http://www.microrna.org/) and TargetScanHuman 6.0 (http://www.targetscan.org/) were used to check whether the 3′UTR variants potentially altered microRNA binding sites.
Statistical analyses
The χ2 test, or Fisher's exact test when necessary, was used to compare genotypic and allelic frequencies between patients and controls and to test for deviation from the Hardy-Weinberg equilibrium. Linkage disequilibrium (LD) and haplotype frequencies were calculated with Haploview 4.2.26 LD blocks were defined based on published criteria.27 Multivariate linear and logistic regression was used to adjust for demographic characteristics (gender and age). A p value of <0.05 was considered statistically significant. The Bonferroni multiple-testing correction was used when appropriate to adjust p values (0.05 corrected for 22 potential comparisons). Statistical analyses were carried out with the Statistical Package for the Social Sciences (SPSS) V.17.0 for Windows.
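As an illustration of the Hardy-Weinberg check mentioned above, the following Python sketch computes the one-degree-of-freedom χ2 goodness-of-fit statistic from genotype counts. It is a minimal sketch, not the SPSS procedure used in the study; the function name and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts: homozygotes for allele A,
    heterozygotes, and homozygotes for allele B.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts for a biallelic polymorphism in 300 controls.
chi2, p_value = hwe_chi_square(200, 90, 10)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

When expected genotype counts are small, as is the case for rare variants, the χ2 approximation is unreliable and an exact test is the safer choice.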
Results
In the screening of GATA2, GATA4, and GATA6 genes, 18 rare (table 2) and 13 common (see online supplementary table S1) genetic variants were identified, with 4 of them never described previously. Nine of the variants were present only in patients with HCM. The genotype distribution of all variants did not deviate from the Hardy-Weinberg equilibrium when tested in controls, except for rs2713604, located in GATA2, and therefore, this polymorphism was excluded from further analyses. Individuals with rare variants were all heterozygous.
Genetic variants identified in GATA2
In the screening of GATA2, 11 genetic variants were identified, 1 of them novel (table 2 and see online supplementary table S1). Of these, 9 were found in both patients and healthy subjects, and there were no statistically significant differences in either allele or genotype frequencies between the two groups. LD patterns defined a block of 1 kb consisting of the rs1108606 and rs3803 polymorphisms, but the estimated haplotype frequencies did not differ between patients and controls (data not shown). Two rare variants were found in two patients (table 2), one variant in each: the novel c.-77G>A in the 5′UTR (figure 1A) and c.*508G>A (rs116559910) in the 3′UTR. Analysis of their potential functional effect (see online supplementary table S2) showed that only the c.-77G>A variant was predicted to be disease-causing by MutationTaster. A significant alteration of the 5′UTR secondary structure was predicted by SNPfold (p=0.006; correlation coefficient=0.74), and Mfold showed that the c.-77A allele changes the secondary structure of the 5′UTR, with only mild alterations in the free energy compared with the wild-type c.-77G allele (figure 1B). We also analyzed the putative effect of c.-77G>A on transcription factor binding sites with the MatInspector program, which predicted that this variant may create a potential binding site for the Gli-similar Krüppel-like family of transcription factors (GLIs) and abolish recognition motifs for Activator Protein 2 (AP2) and Krüppel-like factors (KLF). The consensus sequences for AP2 and KLF were located in a region of the promoter conserved across diverse mammals (figure 1C). The patient carrying this variant (a man aged 38 years, carrier of the p.Ala583Val mutation in the sarcomeric gene myosin heavy chain 7 (MYH7)) had asymmetric left ventricle hypertrophy with an interventricular septum thickness of 23 mm. He originally presented with syncope and palpitations, developed NYHA functional class III–IV dyspnea, and had episodes of angina.
The proband's mother, who also carried both variants (figure 1D), was diagnosed with concentric left ventricle hypertrophy (21 mm of posterior wall thickness) at the age of 50, and presented with atrial fibrillation.
Genetic variants identified in GATA4
The screening of GATA4 in our cohort revealed 11 variants (table 2 and see online supplementary table S1), 7 of them found in both patients and controls. The analysis of allele and genotype frequencies showed a statistically significant difference between the two groups only for p.Asn352= (rs3729855), a synonymous polymorphism in exon 6. The frequencies of the T allele and the CT genotype were significantly higher in patients with HCM than in controls (p=0.03) (table 2), an association that persisted after adjusting for sex and age (p=0.03; OR=10.78; 95% CI 1.21 to 96.28). However, this association did not survive the Bonferroni correction for multiple comparisons. The analysis of LD between the common variants identified in GATA4 indicated that they did not constitute haplotype blocks according to the defined criteria.27
In addition, four rare variants in GATA4 were found only in patients (table 2). In silico analyses predicted a functional effect only for the p.Ala343Thr variant (see online supplementary table S2), located in exon 6 of the GATA4 gene, in a residue conserved across mammalian species (figure 2). Specifically, SNPfold predicted a potential effect on the structure of the mRNA (p=0.05; correlation coefficient=0.09). In addition, NetPhosK predicted a change in the protein phosphorylation pattern, identifying a consensus site for protein kinases B and C that could be affected by this change. This variant was found in a sporadic case, a man aged 56 years without an identified sarcomeric mutation and with an interventricular septum thickness of 24 mm. This patient presented with atrial fibrillation detected by ECG and a dilated left atrium of 46 mm.
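The statement that the p.Asn352= association did not survive correction can be followed with simple arithmetic. The Python sketch below is illustrative and not part of the original analysis; it uses only the nominal p value (0.03) and the 22 comparisons stated in the Statistical analyses section, and the function names are hypothetical.

```python
ALPHA = 0.05
N_COMPARISONS = 22  # number of comparisons stated in the Statistical analyses


def bonferroni_threshold(alpha: float = ALPHA, n_tests: int = N_COMPARISONS) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def bonferroni_adjust(p_value: float, n_tests: int = N_COMPARISONS) -> float:
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


p_reported = 0.03  # nominal p value reported for p.Asn352= (rs3729855)
threshold = bonferroni_threshold()
p_adjusted = bonferroni_adjust(p_reported)
survives = p_reported < threshold

print(f"threshold = {threshold:.5f}, adjusted p = {p_adjusted:.2f}, survives = {survives}")
```

The nominal p value of 0.03 is well above the corrected threshold of roughly 0.0023, which is why the association is reported as not surviving the correction.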
Genetic variants identified in GATA6
Analysis of all seven exons and the flanking introns of GATA6 in our cohort revealed nine variants, including three not previously described (table 2 and see online supplementary table S1). Among the variants, eight were considered rare; therefore, LD and haplotype frequencies were not determined for this gene. Allele and genotype frequencies did not differ significantly between patients and controls. However, among the rare variants, the p.Pro555Ala change (figure 3A) was predicted to be pathogenic by the MutationTaster and PolyPhen-2 tools (see online supplementary table S2). In addition, SNPfold showed a marginally significant p value (p=0.06; correlation coefficient=0.90). The altered amino acid was conserved from Xenopus to humans (figure 3B). This change was found in two men carrying a sarcomeric mutation and with a family history of HCM and sudden cardiac death, and also in one control. One of these patients was an asymptomatic carrier of the p.Gly236del mutation in the sarcomeric gene cardiac myosin binding protein C (MYBPC3), with asymmetric left ventricle hypertrophy (17 mm of interventricular septum thickness). The other patient, who was diagnosed at the age of 51, carried the p.Arg663His mutation in MYH7 and presented with asymmetric left ventricle hypertrophy (29 mm of interventricular septum thickness), with left ventricle outflow tract obstruction of 80 mm Hg. This patient developed NYHA functional class I–II dyspnea, angina, and an aortic murmur. The family study of this patient (figure 3C) showed that his sister, who also carries both variants, was diagnosed at the age of 48 with left ventricle hypertrophy, with an interventricular septum thickness of 22 mm, and reported fatigue. The proband's son also carries both variants, but at the age of 28 he had normal echocardiographic parameters.
Three rare variants in GATA6 were found only in patients, and two of them involve changes in the amino acid sequence (p.Val99Ala, p.His324_His326del), although no putative functional effect was predicted by in silico analysis.
Effect of rare variants in GATA2, GATA4, and GATA6 on HCM phenotype
In order to determine whether the presence of rare variants in these genes could modify the disease phenotype, the clinical characteristics of patients with HCM carrying any of the 18 rare GATA variants (MAF≤0.5% in the control population) were analyzed separately in carriers of a sarcomeric mutation and in patients without a known causal mutation. Among the 70 patients with mutations in sarcomeric genes, 9 carried rare variants in at least 1 of the GATA genes examined (table 3). Multivariate analyses showed that angina was more frequent among these double carriers than among carriers of a sarcomeric mutation alone (55.5% vs 23.2%, OR (95% CI) 7.12 (1.23 to 41.27), p=0.029) (table 4). Among patients without a sarcomeric mutation, those carrying rare variants in GATA genes had greater maximum posterior wall thickness after adjusting for sex and age (16±4 vs 14±3 mm in non-carriers, standardized coefficient β=2.39, p=0.019) (table 4).
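The reported OR, CI, and p value for angina can be checked for internal consistency. The Python sketch below assumes that the interval is a Wald-type 95% CI, symmetric on the log-odds scale, which is an assumption here rather than something stated in the text; the function name is hypothetical. Under that assumption, the geometric mean of the CI bounds should recover the OR, and the implied standard error should reproduce the p value.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_summary(odds_ratio: float, lower: float, upper: float) -> tuple[float, float, float]:
    """Recover the point estimate, SE of the log OR, and two-sided p value
    from a reported OR and an (assumed) Wald-type 95% CI."""
    # On the log scale, a Wald interval is centered on the log odds ratio.
    recovered_or = math.exp((math.log(lower) + math.log(upper)) / 2.0)
    se_log_or = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    z = math.log(odds_ratio) / se_log_or
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return recovered_or, se_log_or, p_value


# Values reported for angina in double carriers vs sarcomeric-only carriers.
recovered_or, se_log_or, p_value = wald_summary(7.12, 1.23, 41.27)
print(f"recovered OR = {recovered_or:.2f}, SE(log OR) = {se_log_or:.3f}, p = {p_value:.3f}")
```

Under this assumption the recovered values agree closely with the reported OR of 7.12 and p value of 0.029. The wide interval reflects the small number of double carriers (9 patients).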
Discussion
HCM is a genetic cardiac disorder with a high risk of sudden death in undiagnosed patients, which highlights the importance of an accurate diagnosis. Cardiac transcription factors play a key role since they directly regulate a large number of genes whose expression is altered in cardiac hypertrophy.12 In fact, overexpression of several of these transcription factors alone, mainly GATA4 and GATA6,18,19 but also others,28,29 is sufficient to induce cardiac hypertrophy, highlighting their importance in the pathogenesis of HCM, which some authors have considered, at least in part, a transcriptional disorder.30 Therefore, any functional genetic variant affecting these genes could affect the development of HCM. To the best of our knowledge, this is the first analysis of the genetic variation of three members of the GATA family of transcription factors in HCM, although loss-of-function mutations in GATA4 and GATA6 have recently been linked to dilated cardiomyopathy.31,32
Herein, the screening of GATA2, GATA4, and GATA6 in our cohort identified 31 genetic variants, 18 of which had a very low frequency both in our population and in general databases and could therefore be considered pathological candidates.33 Of these, three had a predicted functional effect in in silico analyses: one in the 5′UTR of GATA2 (c.-77G>A) and two non-synonymous changes, one in GATA4 (p.Ala343Thr) and another in GATA6 (p.Pro555Ala).
The c.-77G>A variant in GATA2 was found in a carrier of a sarcomeric mutation, and both changes cosegregated with HCM in the family. Thus, assuming that the disease was caused by the sarcomeric mutation, this variant could act as a phenotype modifier. Bioinformatic analyses showed that this variant could affect gene transcription through the removal of recognition motifs for AP2 and KLF transcription factors. In fact, among KLF family members, KLF15 is an antihypertrophic factor that acts through the inhibition of GATA4 and MEF2 function.34 In addition, this variant could modify translation efficiency, as it could affect the mRNA secondary structure.35 Furthermore, variants in gene regulatory regions have previously been described as potential modifiers of the HCM phenotype in carriers of sarcomeric mutations, as was found for MEF2C.9
Regarding non-synonymous changes, the p.Ala343Thr variant was predicted to affect mRNA structure as well as GATA4 phosphorylation, which is essential to activate the hypertrophic response by increasing its transactivating activity.36 Furthermore, this change was located in the carboxy-terminal transactivation domain through which GATA4 interacts with other transcription factors such as GATA6 and MEF2 to synergistically activate downstream target promoters.37,38 In addition, the carrier of the p.Ala343Thr variant also had atrial fibrillation, which has been reported to be linked to GATA4 mutations.39
This study also identified a higher frequency of the T allele of p.Asn352=, located in exon 6 of GATA4, in patients. This polymorphism was first described in the Glasgow Heart Scan Study, in 95 subjects whose age-adjusted left ventricle mass was in the upper decile of the sex-specific distribution.40 Although this association did not survive the Bonferroni correction, the rare allele was over-represented in patients compared with our control subjects and with other populations previously described. Therefore, this association should be confirmed in larger populations and, if confirmed, could be taken into account in the analysis of other patients with HCM.
The other non-synonymous change, p.Pro555Ala in GATA6, could have a damaging functional effect, as predicted by in silico studies. This change could affect GATA6 protein stability because it is located in the carboxy-terminal domain identified as critical for GATA6 ubiquitination and proteasomal degradation.41 The p.Pro555Ala substitution was found in two patients, both carriers of a sarcomeric mutation. A family study in one of the two cases showed that p.Pro555Ala and the MYH7 mutation cosegregated with the disease. The fact that the proband's son was healthy at the age of 28 does not rule out the possibility of developing the disease later on, as his father and aunt were diagnosed at ages 51 and 48, respectively. Interestingly, this variant has been identified as damaging and associated with conotruncal heart defects.42
In summary, the rare variants found in the studied GATA genes have the potential to act as modifiers of the clinical phenotype in our cohort of patients, with different associations depending on the presence of sarcomeric mutations. Thus, among patients without known sarcomeric mutations, rare variants in GATA genes were associated with greater thickness of the left ventricle walls, although the difference reached statistical significance only for the posterior wall. In addition, patients carrying both rare variants in GATA genes and mutations in sarcomeric genes developed angina more frequently than carriers of sarcomeric mutations alone. This could be explained in two ways. First, angina is a consequence of insufficient coronary blood flow to the thickened myocardium, and wall thickening is itself associated with the presence of GATA variants. Second, GATA factors can be critical regulators of cardiac angiogenesis, as has been demonstrated for GATA4.43
Mutations in GATA genes have been described in relation to several human diseases, mainly different types of cancer and congenital heart diseases.44 All of our patients had left ventricular hypertrophy and none of them had symptoms of the diseases in which GATA variants have been implicated, although we cannot completely rule out that the GATA variants we have found are associated with undiagnosed diseases.
These findings suggest that rare functional GATA variants, although relatively infrequent in HCM, could be modifiers of the phenotypic expression of the disease, adding novel genetic factors to the list of genes associated with the phenotypic expression and the severity of hypertrophy, such as polymorphisms of the MEF2C gene,9 the renin–angiotensin system,6 or peroxisome proliferator-activated receptor-γ coactivator 1 α,45 among others. In addition, these results could improve our understanding of the mechanisms involved, provide novel insight into the etiology and epidemiology of HCM, and contribute to the diagnosis and therapy of this disease.
This study has some limitations. First, despite the ethnic homogeneity of our sample population, the small number of patients requires the replication in larger populations to confirm these findings. Second, the control group was not evaluated by echocardiography to confirm the absence of hypertrophy, and therefore, there could be an underestimation of the impact of our findings. Third, although in silico analyses showed putative functional effects, in vitro studies are necessary to delineate how these variants affect the expression and activity of these transcription factors.
Acknowledgments
The authors thank the patients and their families who gave consent to participate in this study.
Footnotes
Contributors EC and IR contributed to the conception and design of the work. CA-M, JR-R, MM, JG, MN-D, CM, JBC-A, and IR contributed to the acquisition, analysis, or interpretation of data. CA-M and IR drafted the manuscript. JR-R, MM, JG, EC, MN-D, CM, and JBC-A revised the manuscript critically for important intellectual content. All the authors approved the final version to be published. All the authors agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Funding This work was financed by Plan Nacional de I+D+i 2008–2011; Plan Estatal de I+D+i 2013–2016; Instituto de Salud Carlos III (ISCIII)–Fondo Europeo de Desarrollo Regional (FEDER) (PI07/0659, PI10/00173 and PI13/00497); Plan de Ciencia, Tecnología e Innovación 2013–2017 del Principado de Asturias (GRUPIN14–028); Instituto Reina Sofía de Investigación Nefrológica; Fundación Renal Íñigo Álvarez de Toledo; and Red de Investigación Renal-RedInRen from ISCIII (RD06/0016/1013 RD12/0021/1023 and RD16/0009/0017). CA-M and IR were financially supported by Fundación para el Fomento en Asturias de la Investigación Científica Aplicada y la Tecnología (FICYT).
Disclaimer The funders had no involvement in the study design; in the collection, analysis, and interpretation of the data; in the writing of the report; and in the decision to submit the paper for publication.
Competing interests None declared.
Ethics approval Ethics Committee for Clinical Investigation of Asturias.
Provenance and peer review Not commissioned; externally peer reviewed.