Figure 1: Identification of a new genetic disorder and its causative gene.
Figure 2: Disease mechanism of CHOPS syndrome.
Figure 3: Similar transcriptional profile for CHOPS syndrome and CdLS.
Figure 4: Immunoblot analysis of SEC components in CHOPS syndrome skin fibroblast cell lines.
Figure 5: Molecular interaction among SEC, cohesin and RNAP2.
Supplementary Fig. 1: Sanger sequence chromatograms of AFF4 missense mutations.
Supplementary Fig. 2: Gene expression levels of immediate early response genes upon serum stimulation.
Supplementary Fig. 3: Statistics of expression arrays using CHOPS syndrome skin fibroblasts and Affymetrix U133plus2 chips.
Supplementary Fig. 4: AFF4 target gene expression levels are higher in CHOPS syndrome samples compared to the control samples.
Supplementary Fig. 5: Total RNA-seq results.
Supplementary Fig. 6: AFF4 mRNA expression levels and AFF4 mutant allele expression levels.
Supplementary Fig. 7: Sanger sequencing result of CRISPR/Cas9 introduction of AFF4 deletion.
Supplementary Fig. 8: ChIP-seq analysis of AFF4, RAD21, RNAP2 and SPT5.
Supplementary Fig. 9: ChIP-seq analysis of RAD21 and AFF4.
Supplementary Fig. 10: Correlation between AFF4 peaks and gene transcription.
Supplementary Fig. 11: Proposed working models of transcriptional regulation governed by SEC and cohesin interaction.
Introduction
Transcriptional regulation during embryogenesis is of paramount importance in determining the precise temporospatial gene expression pattern. Recent evidence suggests that such transcriptional control is mainly achieved by regulation of transcriptional elongation (ref. 1). Approximately 50 bp after transcriptional initiation, RNAP2 pauses in genes whose expression is mainly controlled at the elongation phase. For this transcriptional pausing, NELF (negative elongation factor) and DSIF (DRB sensitivity–inducing factor, a heterodimer of SPT4 and SPT5) are required (ref. 2). Serine residues located within the C-terminal heptapeptide repeats of RNAP2 undergo phosphorylation modifications during gene transcription. Ser5-phosphorylated RNAP2 (RNAP2 Ser5ph) is enriched at the promoter-proximal region, and Ser2 phosphorylation occurs after transcriptional elongation starts. Therefore, Ser2-phosphorylated RNAP2 (RNAP2 Ser2ph) is regarded as actively elongating RNAP2 (ref. 3). Mobilization of the paused RNAP2 machinery is governed by the SEC, which comprises multiple proteins including AFF4, ELL2 and positive transcription elongation factor b (P-TEFb), which is a heterodimer of CDK9 and cyclin T1 (ref. 4). CDK9 phosphorylates the Ser2 residue of RNAP2 to initiate transcriptional elongation (ref. 3).
As many multiple-congenital-anomaly genetic conditions are due to disruption of proper transcriptional processes during embryogenesis, disruption of the transcriptional elongation process would be a likely pathogenic mechanism for developmental disorders. However, no causal links have been identified thus far. Through the characterization of a new genetic disorder caused by gain-of-function mutations in the AFF4 gene, encoding a key component of the SEC, we have established a link between the dysregulated transcription observed among disorders of cohesin function (collectively termed ‘cohesinopathies’) and disruption of RNAP2 elongation, caused by altered genome-wide binding of AFF4 and cohesin.
Results
A new genetic disorder caused by AFF4 missense mutations
Germline mutations of cohesin complex structural and regulatory components cause CdLS (MIM 122470, 300590, 610759, 614701 and 300882) (ref. 5). CdLS is a multisystem developmental disorder characterized by craniofacial dysmorphisms, intellectual disabilities, growth retardation, limb anomalies and several other systemic abnormalities (ref. 5). Mutations in NIPBL have been identified in nearly 60% of CdLS probands, and HDAC8, SMC1A, SMC3 and RAD21 mutations account for an additional small portion of patients with CdLS (refs. 6–9). The probands in the current study (CHOPS T254S, CHOPS T254A and CHOPS R258W) were originally suspected of having CdLS, owing to the presence of intellectual disability, short stature and craniofacial dysmorphisms (Fig. 1a). However, their physical features are clinically distinct from those of typical CdLS probands, which allowed us to classify this as a new clinical entity (Supplementary Table 1). We propose ‘CHOPS syndrome’ as an acronym to describe this new genetic disorder: C for cognitive impairment and coarse facies, H for heart defects, O for obesity, P for pulmonary involvement and S for short stature and skeletal dysplasia.
Figure 1: Identification of a new genetic disorder and its causative gene.
(a) CHOPS syndrome probands: CHOPS T254S (female with short stature, intellectual disability, chronic lung disease, obesity, brachydactyly, vertebral abnormalities, patent ductus arteriosus, horseshoe kidney and dysmorphic facial features), CHOPS T254A (male with short stature, intellectual disability, tracheomalacia, subglottic and tracheal stenosis, obesity, brachydactyly, cervical vertebrae abnormalities, ventricular septal defect and patent ductus arteriosus, cryptorchidism, hearing loss and dysmorphic facial features) and CHOPS R258W (female with short stature, intellectual disability, laryngomalacia, narrow oropharynx, brachydactyly, kyphoscoliosis, patent ductus arteriosus, ventricular septal defect, cataracts and dysmorphic facial features). Written permission to publish the photographs was obtained from the parents of the CHOPS syndrome probands. (b) AFF4 protein structure demonstrating the location of the de novo missense mutations identified in the three probands (highlighted by rectangles). Missense mutations altered highly conserved amino acid residues. NHD, N-terminal homology domain; TAD, transactivation domain; NLS, nuclear localization signal; NoLS, nucleolar localization signal; CHD, C-terminal homology domain.
The striking phenotypic similarities among the three unrelated probands, all born to unaffected, non-consanguineous parents, led us to hypothesize that their clinical features were likely the result of a detrimental germline de novo mutation in the same gene. To test this hypothesis, we performed exome sequencing on these three probands. Each proband had 414 (CHOPS T254A), 720 (CHOPS R258W) and 725 (CHOPS T254S) rare deleterious mutations (nonsynonymous, protein-truncating, deletion, duplication or splice-site mutations). Exome statistics are listed in Supplementary Table 2. Among these, we identified variants in 16 genes that were common to all probands. However, the variants in 14 genes were also present in in-house control samples, arguing against causality. We further scrutinized variants in the remaining two genes, AFF4 and ZFHX3. We confirmed the genetic variants of AFF4 and ZFHX3 by Sanger sequencing. All of the missense mutations found in AFF4 (c.760A>G (p.Thr254Ala), c.761C>G (p.Thr254Ser) and c.772C>T (p.Arg258Trp)) were de novo (not present in the six biological parents of the three probands) (Fig. 1b and Supplementary Fig. 1). All three missense mutations mapped to the ALF (AF4-LAF4-FMR2) homology domain of AFF4, and these missense mutations altered highly evolutionarily conserved amino acids (Fig. 1b). All of the identified ZFHX3 variants were inherited from one of the parents of each proband (all of whom were unaffected). We confirmed paternity using STS markers in all three probands. Screening of an additional 25 probands with atypical features of CdLS failed to identify mutations in the AFF4 gene. However, none of these probands exactly fit the CHOPS syndrome phenotype, indicating that mutations in this gene are highly correlated with the specific phenotype as characterized by the three patients described here. Collectively, these data support the hypothesis that missense mutations affecting the ALF homology domain of AFF4 cause CHOPS syndrome.
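The filtering logic described above (genes carrying a rare deleterious variant in every proband, minus genes also hit in in-house controls, followed by a de novo check against both parents) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy gene sets and every gene label other than AFF4 and ZFHX3 are hypothetical placeholders, not the authors' pipeline or data.

```python
from typing import Dict, Set


def shared_candidate_genes(per_proband: Dict[str, Set[str]],
                           control_genes: Set[str]) -> Set[str]:
    """Return genes hit in every proband but not seen in control samples."""
    if not per_proband:
        return set()
    # Genes with a rare deleterious variant in all probands.
    shared = set.intersection(*per_proband.values())
    # Drop genes whose variants also appear in in-house controls.
    return shared - control_genes


def is_de_novo(child_has: bool, mother_has: bool, father_has: bool) -> bool:
    """A variant is de novo if the child carries it and neither parent does."""
    return child_has and not mother_has and not father_has


# Toy input: GENE_A..GENE_D are hypothetical placeholders.
probands = {
    "CHOPS T254A": {"AFF4", "ZFHX3", "GENE_A", "GENE_B"},
    "CHOPS T254S": {"AFF4", "ZFHX3", "GENE_A", "GENE_C"},
    "CHOPS R258W": {"AFF4", "ZFHX3", "GENE_A", "GENE_D"},
}
in_house_controls = {"GENE_A"}

candidates = shared_candidate_genes(probands, in_house_controls)
print(sorted(candidates))
```

In this toy setup the two remaining candidates mirror the text: a gene whose variants are inherited from an unaffected parent would fail `is_de_novo`, whereas a variant absent from both parents would pass.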
Mechanism of AFF4 mutations leading to CHOPS syndrome
A missense mutation affecting the ALF homology domain of Aff1 (also known as Af4) was reported in the robotic mouse, an ataxia mouse model created by N-ethyl-N-nitrosourea (ENU) mutagenesis (ref. 10). The pathogenetic mechanism of this missense mutation is a gain-of-function effect due to decreased clearance of the protein by SIAH1 ubiquitin E3 ligase (ref. 11). Given the proximity of the missense mutations in our probands to that described in the robotic mouse, we hypothesized that the missense mutations found in the three probands would likewise disrupt ubiquitination-dependent proteasomal degradation of the AFF4 protein. To test this hypothesis, we created an AFF4 and SIAH1 overexpression model using HEK293T cells. When an expression vector for wild-type AFF4 was cotransfected with an expression vector for SIAH1, the amount of AFF4 protein was markedly decreased; however, constructs for AFF4 containing the missense substitutions found in the three CHOPS syndrome probands showed hindered degradation of AFF4 with SIAH1 overexpression (Fig. 2a). The addition of MG132, which is a proteasomal degradation inhibitor, resulted in the recovery of AFF4 bands.
Figure 2: Disease mechanism of CHOPS syndrome.
(a) Decreased proteasomal degradation of mutant AFF4 in HEK293T cells. The protein blot demonstrates the disappearance of wild-type (WT) AFF4 bands with the addition of vector encoding SIAH1; such disappearance was not observed in HEK293T cells overexpressing mutant AFF4. The numbers beneath the AFF4 bands indicate signal intensity normalized to the band intensity of the AFF4-only condition for each category and to α-tubulin levels. (b,c) Expression levels of the MYC and JUN genes in patient-derived skin fibroblasts and the HEK293T cell line with AFF4 overexpression. GM01652, GM02036 and GM08398 are control fibroblast cell lines. Elevation of MYC and JUN expression was observed in CHOPS syndrome skin fibroblasts (b) and the HEK293T AFF4 overexpression model (c). MYC and JUN expression was normalized against TBP levels. Data represent means ± 2 s.d. **P < 0.01, ***P < 0.001, two-tailed t test; n = 3 technical replicates per group. Empty, empty expression vector control.
Several genes including MYC and JUN have been identified as direct transcriptional targets of AFF4 (ref. 12). We confirmed upregulation of MYC and JUN expression in patient-derived skin fibroblast samples by quantitative RT-PCR, although one proband with an AFF4 p.Arg258Trp alteration had a normal level of JUN expression (Fig. 2b). We also evaluated the expression levels of MYC and JUN in HEK293T cells transfected with overexpression vectors for AFF4. With wild-type AFF4 overexpression, the expression levels of MYC and JUN increased, confirming that these genes are downstream targets of the AFF4 protein. Overexpression of the mutant AFF4 constructs resulted in stronger activation of MYC expression.
Without coexpression of SIAH1, the difference between wild-type and mutant AFF4 overexpression was minimal; however, with the addition of SIAH1 overexpression, the expression difference between the wild-type and mutant AFF4 constructs increased both for MYC and JUN (Fig. 2c).
Because the SEC has an important role in the precise regulation of the immediate gene responses to various stimuli such as heat shock, retinoic acid or serum (refs. 1,4) as well as in developmental regulation, we evaluated the effects of the AFF4 missense mutations on the immediate gene response using patient-derived skin fibroblast cell lines. These cell lines demonstrated upregulation of the FOS and EGR1 genes, which was comparable to that in control cell lines, upon serum stimulation (Supplementary Fig. 2). Therefore, the proband-derived cell lines did not demonstrate an impaired response to serum stimulation with regard to the gene expression levels of FOS and EGR1, and mutant cells still maintained the capacity to mount a response to various stimuli.
Transcriptome analysis of CHOPS syndrome
Because of the critical role of the SEC in transcriptional regulation, we looked for abnormalities in global gene expression. We performed genome-wide expression profiling using Affymetrix U133 Plus 2.0 arrays on patient-derived skin fibroblast cell lines. We used two patient-derived samples (CHOPS T254A and CHOPS T254S) and three age- and sex-matched control samples, with all samples run in duplicate. We did not use the CHOPS R258W cell line owing to the lack of an appropriate ancestry-matched control cell line. Using a false discovery rate (FDR) cutoff of 0.2, we identified 288 genes downregulated and 445 genes upregulated in patient-derived samples relative to control samples (Supplementary Fig. 3 and Supplementary Table 3).
Gene ontology term analysis using DAVID showed that the upregulated genes were enriched for homeobox proteins, skeletal system development/morphogenesis and anterior/posterior pattern formation, as well as embryonic organ development/morphogenesis, and that the downregulated genes were enriched for actin-binding proteins and extracellular matrix components (Supplementary Table 4).
Previously, Luo et al. reported the results of AFF4 ChIP-seq and RNA sequencing (RNA-seq) after knockdown by small interfering RNA (siRNA) of AFF4 (ref. 12). Combining the data reported by Luo et al., we selected seven genes as probable direct targets of AFF4, owing to the presence of an AFF4 peak near their promoter regions and downregulation of their expression levels upon AFF4 knockdown. All seven target genes were upregulated by 9 to 127% (mean = 48.9%) in skin fibroblasts from patients with AFF4 mutations in comparison to control samples (Table 1 and Supplementary Fig. 4). The upregulation of five of these genes was statistically significant: MYC, JUN, TMEM100, ZNF711 and FAM13C. These observations further support the notion that AFF4 mutations render gain-of-function effects.
Table 1: Expression levels of AFF4 target genes
Transcriptome similarities between CHOPS syndrome and CdLS
Given the phenotypic similarities between the three CHOPS syndrome probands with AFF4 mutations and individuals with CdLS, we compared the gene expression pattern of skin fibroblast samples with AFF4 mutations to those from CdLS probands with NIPBL frameshift mutations using the Affymetrix Human Gene 2.0 ST array. We observed a general positive correlation (Pearson's correlation coefficient = 0.34, P < 1 × 10^−300) for the gene expression patterns of CHOPS syndrome and CdLS (Fig. 3a). We identified the top 250 most dysregulated (overexpressed and underexpressed) genes for these diagnoses.
Expression levels of up- and downregulated genes were similar for the CdLS probands and the probands with AFF4 mutations (Fig. 3b,c). Likewise, genes whose expression was downregulated in CdLS were similarly underexpressed in the probands with AFF4 mutations. Dysregulated genes in CHOPS syndrome demonstrated a similar expression pattern as observed in the samples from CdLS probands (Fig. 3b,c). There were 29 common genes within the top 250 most upregulated genes in CdLS and CHOPS syndrome (odds ratio (OR) = 13.05, P = 4.9 × 10^−21, Fisher's test). The top 250 most downregulated genes in CdLS and CHOPS syndrome included 27 genes in common (OR = 11.04, P = 4.4 × 10^−18, Fisher's test).
For further analyses, we established hTERT-immortalized cell lines from fibroblasts derived from patients with CHOPS syndrome as well as CdLS and performed total RNA-seq analyses (Supplementary Table 5). We observed positive correlation between transcriptional changes in CHOPS syndrome and CdLS (Supplementary Fig. 5), as we did for primary cultured cells. We scored 302 upregulated genes and 216 downregulated genes in CHOPS syndrome (Supplementary Fig. 5b,c and Supplementary Table 6). Of the 78 upregulated genes and 220 downregulated genes in CdLS, 30 and 72 genes, respectively, showed significantly similar expression tendencies to those observed in CHOPS syndrome. Although the CHOPS R258W sample did not show overexpression of JUN in quantitative RT-PCR (Fig. 2b), the profile of dysregulated genes was very similar for CHOPS R258W in comparison to CHOPS T254A and CHOPS T254S (Supplementary Fig. 5d).
Figure 3: Similar transcriptional profile for CHOPS syndrome and CdLS.
(a) General comparison of expression changes in CHOPS syndrome and CdLS demonstrated positive correlation between the two syndromes. (b) Comparison of the top 250 most dysregulated genes in each syndrome.
Left, upregulated and downregulated genes in CdLS showed similar changes in CHOPS syndrome samples. Right, upregulated and downregulated genes in CHOPS syndrome demonstrated similar changes in CdLS samples. The bottom of each box represents the first quartile, and the top of each box represents the third quartile. The line in the middle of each box represents the median. The notches/whiskers demonstrate the confidence interval. (c) Volcano plots. Top left, volcano plot defining the top 250 genes whose expression levels were higher (purple dots) and lower (green dots) in the CHOPS syndrome samples. These plots represent the top 250 genes that were either up- or downregulated in the probands with the highest magnitude of difference and P < 0.05. Bottom left, distribution of these 250 genes, whose expression was up- or downregulated in CHOPS syndrome, in the CdLS samples. Top right, distribution of the top 250 genes whose expression was upregulated (purple dots) or downregulated (green dots) in the CdLS samples. Bottom right, distribution of these 250 genes, whose expression was up- or downregulated in CdLS, in the CHOPS syndrome samples. The genes from the plots above are labeled with the same colors in the bottom plots.
Chromatin accumulation of AFF4 in CHOPS syndrome
With the confirmation that the missense mutations found in the three CHOPS syndrome probands caused decreased clearance of mutated AFF4 protein due to acquired resistance to SIAH1-mediated proteasomal degradation, we analyzed the amount of chromatin-associated SEC component in the CHOPS syndrome cell lines. The amount of AFF4 protein in patient-derived skin fibroblasts was elevated, supporting the suggestion that the missense mutations found in these probands result in a more stable AFF4 protein (Fig. 4a). The mRNA levels of AFF4 were comparable between proband and control samples (Supplementary Fig. 6a).
In the samples from probands, the wild-type and mutant alleles were expressed in almost equal amounts (Supplementary Fig. 6b). We further analyzed the amount of chromatin-associated AFF4, ELL2 and CDK9 in CHOPS syndrome cell lines and found that accumulation of AFF4 mainly occurred on the chromatin fraction in the CHOPS syndrome cell lines (Fig. 4b). The increase in AFF4 accumulation in CHOPS syndrome proband CHOPS T254A was due to the mutant AFF4 allele, as the amount of AFF4 protein was clearly decreased in cell lines where the mutated AFF4 allele was deleted (Fig. 4b and Supplementary Fig. 7). The amount of other SEC components such as ELL2 and CDK9 did not show major alterations in the chromatin fraction of the CHOPS syndrome cell lines relative to control cells. The CdLS cell line with an NIPBL frameshift mutation (CDL006) did not show elevation in the amounts of any of these proteins (Fig. 4b).
Figure 4: Immunoblot analysis of SEC components in CHOPS syndrome skin fibroblast cell lines.
(a) CHOPS syndrome sample cell lysates demonstrate a stronger AFF4 band; however, the amounts of NIPBL, MAU2 and SMC1 are unchanged in comparison to control samples (GM02036 and GM03348). (b) Accumulation of AFF4 in the chromatin fraction was seen in the CHOPS syndrome samples; however, the amounts of ELL2 and CDK9 remained unchanged. Numbers beneath the bands represent the signal intensity normalized against the band intensity of the control GM02036 sample and also normalized against the α-tubulin level for total cellular lysates and the histone H3 level for chromatin fraction samples. The asterisk indicates a nonspecific band. The CdLS sample used was CDL006 with a frameshift mutation of NIPBL. The CHOPS T254A AFF4 null clone had biallelic AFF4 frameshift mutations. The CHOPS T254A AFF4 mut del clone had a 1-bp deletion in the mutant AFF4 allele with an intact wild-type AFF4 allele.
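The double normalization described in the Figure 4 legend (each band scaled to its loading control, then expressed relative to the reference sample) can be written as a short calculation. The sketch below is an assumption about how such numbers are typically derived, not the authors' ImageQuant workflow; the function name and all intensity values are hypothetical.

```python
def normalized_intensity(target: float, loading: float,
                         ref_target: float, ref_loading: float) -> float:
    """Relative band intensity after loading-control and reference normalization.

    target, loading:         raw intensities of the protein of interest and of the
                             loading control (e.g. alpha-tubulin or histone H3)
                             in the sample lane.
    ref_target, ref_loading: the same two measurements in the reference lane
                             (e.g. a control fibroblast line).
    """
    if loading <= 0 or ref_target <= 0 or ref_loading <= 0:
        raise ValueError("intensities used as denominators must be positive")
    # Step 1: correct each lane for the amount of material loaded.
    sample_ratio = target / loading
    reference_ratio = ref_target / ref_loading
    # Step 2: express the sample relative to the reference lane.
    return sample_ratio / reference_ratio


# Hypothetical raw intensities (arbitrary units), not values from the paper.
control = normalized_intensity(1000.0, 2000.0, 1000.0, 2000.0)
patient = normalized_intensity(3000.0, 2000.0, 1000.0, 2000.0)
print(control, patient)
```

By construction the reference lane evaluates to 1, so a value above 1 for a patient lane indicates more of the target protein per unit of loading control than in the reference sample.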
Genome-wide chromatin binding of AFF4, RNAP2 and cohesin
To better understand the molecular mechanism leading to the similar transcriptional profiles for CHOPS syndrome and CdLS, we evaluated the genome-wide binding patterns of AFF4, elongating RNAP2, paused RNAP2, SPT5 and the cohesin subunit RAD21 by performing ChIP-seq (Supplementary Table 7). To understand the shared characteristics of CdLS and CHOPS syndrome, we focused on differentially expressed genes in these two syndromes compared with a cell line from a healthy control. Overall, upregulation and downregulation of genes at the level of RNA correlated well with increasing and decreasing chromatin binding levels, respectively, of RNAP2, elongating RNAP2 (Ser2ph), paused RNAP2 (Ser5ph) and SPT5 in both CdLS and CHOPS syndrome (Supplementary Fig. 8). AFF4 binding around the transcription start site (TSS) was elevated both in the CHOPS syndrome and CdLS samples relative to the control sample among the transcriptionally upregulated genes (Supplementary Fig. 9). In contrast, among transcriptionally downregulated genes, we did not observe such enhanced TSS binding of AFF4 (Supplementary Fig. 9). However, in neither syndrome could we find a correlation between changes in AFF4 binding and transcription (Supplementary Fig. 10). Interestingly, we observed an increase in RAD21 binding in comparison to the control at the TSSs of upregulated genes and the opposite effect at the TSSs of downregulated genes (Supplementary Fig. 9).
Molecular interactions among the SEC, cohesin and RNAP2
These findings prompted us to examine the physical interaction of the SEC, cohesin and RNAP2 (Fig. 5). In HeLa cell lysate, immunoprecipitation with an antibody to STAG1 (SA1) yielded positive immunoblot bands for various SEC components, including AFF4, ELL2, cyclin T1 and CDK9; however, we did not observe such interactions with an antibody to STAG2 (SA2) (Fig. 5a).
Furthermore, we found that RNAP2 forms a complex with SMC1 and STAG1 but not with STAG2 in HeLa cell lysate (Fig. 5b). Phosphorylated forms of RNAP2 were preferentially enriched in the coimmunoprecipitated fraction with STAG1 but not STAG2. Addition of elongation inhibitors such as DRB (5,6-dichlorobenzimidazole 1-β-D-ribofuranoside) and flavopiridol, which is a CDK9 kinase inhibitor, decreased the amount of both RNAP2 Ser2ph and Ser5ph precipitated with STAG1. Therefore, the RNAP2 interacting with STAG1 is primarily the elongating form of RNAP2.
Figure 5: Molecular interaction among SEC, cohesin and RNAP2.
Results of immunoprecipitation and immunoblotting. (a) Protein interaction between the SEC and cohesin. STAG1 (SA1) interacts with various SEC components including AFF4, ELL2, cyclin T1 and CDK9. Immunoprecipitation with antibody to STAG2 (SA2) did not show such interaction with the SEC. Addition of DRB slightly increased the amount of SEC components interacting with STAG1. Goat serum was used for the mock condition. (b) Protein interaction between cohesin and RNAP2. STAG1 interacts with various forms of RNAP2. Addition of DRB (D) and flavopiridol (F) decreased the amount of RNAP2 Ser2ph and Ser5ph precipitated with STAG1. RNAP2 was not immunoprecipitated with antibody to STAG2. Un-ph, unphosphorylated; IP, immunoprecipitation; N, no treatment.
Discussion
Through the identification of gain-of-function mutations in AFF4 as the cause of a new syndrome consisting of a highly conserved pattern of features that we have termed CHOPS syndrome, we demonstrate that an altered RNAP2 distribution and resultant transcriptional elongation abnormalities underlie the transcriptional dysregulation of CHOPS syndrome as well as CdLS. CHOPS syndrome represents the first human developmental disorder caused by germline mutations in an SEC component, although synonymous mutations were previously suggested to correlate with autism in two individuals (refs. 13,14). AFF4 is a scaffold protein comprising the core component of the SEC. In the probands reported here, the missense mutations create a resistance to ubiquitination-dependent proteasomal degradation, resulting in excessive accumulation of mutant AFF4 protein. This persistence of AFF4 protein in turn results in an altered genomic distribution of AFF4 and leads to the unique transcriptome pattern of CHOPS syndrome. The transcriptomic effects of the AFF4 gain-of-function mutations seen in CHOPS syndrome were opposite to those observed in AFF4 knockdown experiments, confirming that the disease mechanism of these mutant AFF4 proteins is a gain of function rather than haploinsufficiency or a dominant-negative effect. In addition, the phenotype of the CHOPS syndrome probands is different from that of the Aff4 knockout mouse as well as human subjects with genomic deletions encompassing the AFF4 gene, lending further support to a gain-of-function effect (refs. 15,16).
When we compared the AFF4 binding patterns among the upregulated and downregulated genes, an alteration was seen only in the upregulated genes, further supporting the notion that the primary effect of the AFF4 mutations found in CHOPS syndrome is gain of function, leading to transcriptional activation (Supplementary Fig. 11). The fundamental role of transcriptional elongation in the control of gene expression during embryogenesis has recently begun to be better understood (ref. 1). Therefore, CHOPS syndrome can be regarded as a multiple-congenital-anomaly syndrome caused by disturbance of the transcriptional elongation process. Although accumulation of AFF4 was observed for the transcriptionally activated genes in CHOPS syndrome and CdLS, there was no correlation between AFF4 ChIP-seq and RNA-seq or RNAP2 Ser2ph ChIP-seq in either syndrome (Supplementary Fig. 10). This finding suggests that AFF4 controls gene expression for only a subset of AFF4-bound genes, which is consistent with previously reported data by Luo et al., demonstrating that the knockdown of AFF4 leads to downregulation of expression for only a very small subset of genes (seven genes) among those that harbor an AFF4 ChIP-seq peak around the TSS in the HEK293T cell line (ref. 12). In CHOPS syndrome and CdLS, transcriptional alterations are more likely seen for the genes where expression is mainly regulated by the transcriptional elongation step. Similar findings were reported for other transcription elongation controller molecules such as DSIF and NELF (refs. 2,17). In the CHOPS syndrome fibroblast cell lines, even with the massive accumulation of AFF4 in the chromatin fraction, the amount of other SEC components (ELL2 and CDK9) remained unchanged. However, the amount of these proteins forming the SEC is likely to be increased, given that the primary effect seen in CHOPS syndrome is activation of AFF4 target genes.
CDK9 is known to form a complex with BRD4 or 7SK RNA and HEXIM outside of the SEC, as well as existing as a free CDK9 molecule (refs. 18–21). Therefore, the amount of CDK9 forming the SEC is likely to be increased with the concurrent decrease in the amount of CDK9 forming complexes with BRD4 or 7SK RNA and HEXIM or free CDK9 in the nucleus. This speculation is supported by the fact that transfection with construct for AFF4 alone exerted transcriptional activation effects on SEC target genes such as MYC and JUN, even without concurrent overexpression of other SEC components (Fig. 2c). AFF4 is expressed in fetal brain, lung and heart and in adult brain, heart, skeletal muscle and pancreas (ref. 22). It is very intriguing that all CHOPS syndrome probands manifested with chronic lung disease, given the strong expression of AFF4 in fetal lung. In addition, AFF4 is implicated in regulation of appetite, as hypothalamic AFF4 expression is upregulated upon fasting (ref. 23). Therefore, it is possible that gain-of-function mutations in AFF4 lead to increased appetite, resulting in the obesity seen in CHOPS syndrome probands. Although CHOPS syndrome and CdLS are clinically recognizable as distinct entities, there is some phenotypic overlap between these two diagnoses suggesting a common underlying pathogenesis. Given the apparently divergent molecular etiologies of each, we were somewhat surprised to discover striking similarities in the transcriptomic profiles of CHOPS syndrome and CdLS. Effects on gene transcription by mutated cohesin components or regulators have been proposed as the pathogenic mechanism of CdLS, establishing a critical role for cohesin in transcriptional regulation (ref. 24). One mechanism by which cohesin controls transcription is through the creation of looping DNA structures that allow for physical interactions between distal regulatory elements and promoter regions (refs. 25,26). However, the exact mechanism by which cohesin regulates transcription remains poorly understood.
We discovered that both CHOPS syndrome and CdLS display a similar disruption in the behavior of AFF4 and RAD21. Interestingly, the genes whose expression is significantly increased in CHOPS syndrome and CdLS demonstrate elevated AFF4 binding. In combination with the finding that similar sets of genes are upregulated and downregulated in CdLS and CHOPS syndrome, we can conclude that the altered genome-wide distribution of AFF4 binding affects cohesin binding in CHOPS syndrome, with the inverse relationship being true in CdLS (where the altered genome-wide distribution of cohesin affects AFF4 binding). Furthermore, as a molecular link between CHOPS syndrome and CdLS, we demonstrated a physical interaction between the SEC, the STAG1-cohesin complex and RNAP2. This represents the first demonstration, to our knowledge, of such a physical interaction not only between the SEC and cohesin but also between cohesin and RNAP2. Recently, functional differences between the STAG1 and STAG2 components of cohesin have started to emerge. Previously, the Stag1 knockout mouse model demonstrated the importance of STAG1 in transcriptional regulation, as well as a critical function in chromosomal segregation (refs. 27,28). Our results point to an important role for STAG1 in the pathogenesis of CdLS. A role for cohesin in transcription elongation has previously been proposed. Cohesin selectively binds genes in which RNAP2 pauses just downstream of the TSS, and it was hypothesized that cohesin facilitates transition of paused RNAP2 to elongation (refs. 24,29). This role of cohesin in transcriptional elongation is likely facilitated by NIPBL-MAU2, with AFF4 and the SEC mediating its effect. Recently, cohesin loaders (NIPBL-MAU2) were shown to be important in maintaining nucleosome-free regions in yeast models, and it was hypothesized that derangements of nucleosome positioning cause transcriptional disturbance in cohesinopathies (ref. 30).
Our observation suggests that NIPBL and MAU2 also have a role in the transcriptional elongation process, which requires the presence of paused RNAP2 at gene promoters.
In summary, we report the identification of a new genetic disorder, CHOPS syndrome, caused by mutations in AFF4 and demonstrate that altered genome-wide binding of AFF4 leading to RNAP2 gene body accumulation underlies the transcriptional abnormalities observed. We also demonstrate a molecular link between CHOPS syndrome and CdLS and implicate cohesin in transcriptional elongation. These observations underscore the importance of proper proximal pausing of RNAP2 as a crucial regulator of gene expression during human embryogenesis.
Methods
Human subjects. All individuals enrolled in the study were evaluated by clinical geneticists experienced in the diagnosis of CdLS. All patients and family members were enrolled in the study under an institutional review board–approved protocol of informed consent at the Children's Hospital of Philadelphia.
Exome sequencing and Sanger sequencing. Genomic DNA was extracted from peripheral blood cells. Exome capture was performed with Agilent SureSelect v 4 (Agilent Technologies). Captured DNA was sequenced using the Illumina HiSeq 2000 platform with paired-end reads of 100 bp. Sequencing reads obtained in fastq format were subjected to the whole-exome sequence analysis pipeline. Reads were mapped to human genome GRCh37 (1000 Genomes Project version) using Novoalign version 2.08. Base quality recalibration was also performed as part of the mapping in Novoalign. Duplicate reads were marked using Picard version 1.79. The rest of the workflow was based on the Genome Analysis Toolkit (GATK) package (ref. 31) best practices through indel realignment and base recalibration steps.
Variant calling was performed using the UnifiedGenotyper with default parameters. The resulting vcf files were subjected to variant effect prediction using SnpEff tool v3.1 (ref. 32) with RefSeq transcripts. Genetic variants identified by exome sequencing were confirmed by Sanger sequencing. PCR primer sequences are available on request.
Reagents. MG132, DRB, mouse monoclonal antibody to Flag (F3165) and antibody to α-tubulin (T6074) were from Sigma-Aldrich. Flavopiridol was from Enzo Life Sciences. Antibody to HA (12CA5) was from Roche. Antibodies to AFF4 (ab57077; used for immunoblotting), CDK9 (ab75848), MAU2 (ab46906), STAG1 (ref. 33), STAG2 (ref. 33), SMC1 (ref. 33), STAG1 (ab4457), STAG2 (ab4463), SMC1A (ab21583) and histone H3 (ref. 33) were from Abcam. Antibody to RAD21 was custom-made and was used in our previous study9. Antibodies to ELL2 used for immunoblotting (A302-505A) and AFF4 used for ChIP-seq analysis (A302-538A) were purchased from Bethyl Lab. Antibodies to NIPBL (sc-374625) and cyclin T1 (sc-10750) were from Santa Cruz Biotechnology. Antibody to SPT5 was provided by Y. Yamaguchi (Tokyo Institute of Technology). Antibodies to RNAP2 were used in a previous study and were provided by H. Kimura (Osaka University)34. Plasmids encoding Myc-DDK-tagged AFF4 (RC217692) and Myc-DDK-tagged SIAH1a (RC206576) were obtained from OriGene Technologies. To generate HA-His-tagged SIAH1a, the SIAH1A cDNA was subcloned into pCMV6-AC-HA-His, using the OriGene RapidShuttling kit (SfgI-MluI, PS200008) according to the manufacturer's directions. Mutagenesis of the AFF4 expression plasmid to introduce each of the nonsynonymous variants was performed using the QuikChange Site-Directed Mutagenesis kit by Agilent Technologies following the manufacturer's protocol. Plasmids were sequenced after site-directed mutagenesis to confirm the change and to rule out the introduction of additional, nonspecific changes.
Cell culture.
HEK293T and HeLa cells were cultured in DMEM containing 10% FBS and 1% penicillin-streptomycin. HEK293T cells were obtained from the American Type Culture Collection (ATCC). HeLa cells were provided by T. Hirota (Japanese Foundation for Cancer Research). Cells were grown to 70% confluency and transiently transfected with plasmid DNA using Lipofectamine 2000 (Life Technologies), according to the manufacturer's instructions. At 24 h after transfection, media were changed to DMEM containing 10 μM MG132 for 7 h where indicated. Cells were lysed with SDS sample buffer for immunoblotting. Skin fibroblast cell lines from three CHOPS syndrome and two CdLS probands were used in this study. The control fibroblast cell lines were ordered from the Coriell Cell Repository. Fibroblast cell lines were uniformly cultured in RPMI 1640 or DMEM supplemented with 20% FBS, antibiotics (100 U/ml penicillin and 100 μg/ml streptomycin) and 1% L-glutamine. Skin fibroblast cell lines used for total RNA-seq and ChIP-seq were immortalized by exogenous expression of hTERT. Skin fibroblast cell lines were found to be negative for mycoplasma contamination.
Protein analyses. To obtain total cell extracts, cells were lysed with lysis buffer (20 mM HEPES, 10 mM KCl, 100 mM NaCl, 1.5 mM MgCl2, 0.34 M sucrose, 10% glycerol, 10 mM sodium fluoride, 10 mM β-glycerophosphate, 10 mM sodium butyrate, 1 M DTT, 20% Triton X and cOmplete protease inhibitor cocktail (Roche)). To obtain soluble cell extracts, cells were lysed with lysis buffer and centrifuged at 1,500g. DNA in the chromatin fraction was digested by treatment with benzonase (Novagen, Merck Millipore). Immunoblot band quantification was performed with ImageQuant TL (GE Healthcare Life Sciences).
Immunoprecipitation. HeLa cells were untreated or treated with 50 μM DRB or 1 μM flavopiridol for 1.5 h. Cells were collected by trypsinization and washed with PBS.
Chromatin-containing fractions were prepared from cell pellets by treatment with buffer A (10 mM Tris-HCl (pH 7.5), 10 mM NaCl, 3 mM MgCl2, 0.2% NP-40, cOmplete–EDTA free, PhosSTOP). Lysates were treated with buffer B (40 mM Tris-HCl (pH 7.5), 200 mM NaCl, 3 mM MgCl2, 10% glycerol, 0.2% NP-40, cOmplete-EDTA, PhosSTOP) containing benzonase. Chromatin-bound protein (input) and the insoluble fraction were separated by centrifugation at 20,000g for 15 min. Immunoprecipitation in the input fraction was performed using protein G magnetic beads conjugated with antibody to STAG1 or STAG2.
Expression arrays. Skin fibroblasts were plated in T75 flasks with approximately 700,000 cells for U133P2 and 200,000 to 350,000 cells for Human Gene 2.0 arrays. Cells were collected for RNA extraction at 70–80% confluency for U133P2 and 40–70% confluency for Human Gene 2.0 arrays. The samples used for expression array analysis included two CHOPS syndrome samples (CHOPS T254S (CDL160): a 6-year-old European-ancestry female with a p.Thr254Ser alteration; CHOPS T254A (CDL444): a 12-year-old European-ancestry male with a p.Thr254Ala alteration) and three age- and sex-matched controls (GM01652: an 11-year-old European-ancestry female; GM02036: an 11-year-old European-ancestry female; GM08398: an 8-year-old European-ancestry male) for U133P2 array analysis. The samples used for Human Gene 2.0 arrays included two CHOPS syndrome samples (CHOPS T254S (CDL160) and CHOPS T254A (CDL444)), two CdLS samples (CDL006: a 7-year-old European-ancestry female with NIPBL mutation (NIPBL c.742_743delCT; p.Leu248Tfs*6); CDL015: a 10-year-old European-ancestry male with NIPBL mutation (NIPBL c.2969delG; p.Gly990Aspfs*2)) and four age- and sex-matched control samples (GM01652, GM01864, GM02036 and GM03348) (GM03348: an 8-year-old European-ancestry female). RNA was extracted using the RNeasy mini kit (Qiagen). cRNA was purified, fragmented and labeled using the 3′ IVT Express kit (Affymetrix).
RNA (100 ng) was used from each sample. First-strand cDNA was synthesized from the extracted RNA, and second-strand cDNA was then synthesized from the first-strand cDNA. Subsequently, in vitro transcription and labeling of aRNA (amplified RNA) was performed. Labeled aRNA was hybridized to Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Purified sense-strand cDNA was generated from extracted RNA using the Ambion WT Expression kit. RNA (100 ng) was used from each sample. Purified sense-strand cDNA samples were fragmented and labeled using the Affymetrix GeneChip WT Terminal Labeling and Controls kit. Labeled cDNA was hybridized to Affymetrix GeneChip Human Gene 2.0 ST arrays for transcriptome analysis. Arrays were washed and stained according to standard Affymetrix protocols using an Affymetrix GeneChip Fluidics Workstation 450 and scanned using an Affymetrix GeneChip Scanner 3000 7G. The DAT files were converted into cell intensities (CEL files) by Affymetrix Command Console software, and these were used for subsequent analyses. The sequences from which these probes were derived were selected from GenBank, dbEST and RefSeq. Data were processed and analyzed within the R/Bioconductor environment. Original Affymetrix probes on both platforms were mapped to the current version of Entrez Gene using the library files provided by the BRAINARRAY database. Raw data were normalized and summarized with the RMA (robust multichip averaging) method to obtain gene expression levels. Pairwise comparison of sample groups was performed by the SAM (significance analysis of microarrays) method, which evaluated differential gene expression by fold change, P value and FDR. Fold changes and P values were used to create the volcano plots. The top 250 genes with the largest fold change and P < 0.05 were imported into the DAVID online tool for functional categorization. 
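The gene-selection step described above can be illustrated with a minimal Python sketch. It assumes that each gene carries a log2 fold change and a P value, and that "largest fold change" means largest absolute log2 fold change; neither assumption is stated in the text. The function name `select_top_genes`, the tuple layout and the toy gene labels are hypothetical, and the original analysis was carried out in R/Bioconductor rather than Python.

```python
from typing import List, Tuple

# Hypothetical record layout: (gene label, log2 fold change, P value)
GeneStat = Tuple[str, float, float]


def select_top_genes(stats: List[GeneStat], n: int = 250,
                     p_cutoff: float = 0.05) -> List[str]:
    """Return up to n gene labels with P < p_cutoff, ranked by |log2 fold change|."""
    # Keep only genes that pass the P value threshold.
    significant = [g for g in stats if g[2] < p_cutoff]
    # Rank by magnitude of change (assumption: direction is ignored).
    ranked = sorted(significant, key=lambda g: abs(g[1]), reverse=True)
    return [g[0] for g in ranked[:n]]


# Toy input with made-up values, for illustration only.
toy_stats = [
    ("geneA", 2.0, 0.01),
    ("geneB", -3.0, 0.04),
    ("geneC", 5.0, 0.20),   # large change but fails the P value filter
    ("geneD", 0.5, 0.001),
]
top_two = select_top_genes(toy_stats, n=2)
```

The resulting list of gene labels is the kind of input that would then be passed to a functional annotation tool.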
DAVID reported predefined gene sets or pathways whose members were significantly over-represented in the 250-gene list.
Quantitative RT-PCR. Extracted RNA samples were also used for quantitative RT-PCR. RNA was reverse transcribed with TaqMan Reverse Transcription Reagents (Life Technologies), and synthesized cDNA was used for quantitative PCR analysis using a droplet digital PCR system (QX100, Bio-Rad)35, 36. Gene expression was analyzed using TaqMan gene expression assay probes, including AFF4 Hs00232683_m1, MYC Hs00153408_m1, JUN Hs01103582_s1, FOS Hs04194186_s1 and EGR1 Hs00152928_m1 (Life Technologies). Internal control probes used were TBP 4326322E and CYC 4326316E (Life Technologies). For allele-specific mRNA detection of AFF4, Custom TaqMan SNP Genotyping Assays (Life Technologies) were used, and TaqMan probes were designed for three AFF4 mutations found in CHOPS syndrome probands. Probe sequences are available on request. For serum stimulation, cells were cultured in serum-free medium for 2 d and were then either left untreated or treated with serum for 30 min before RNA extraction. Gene expression levels for the FOS and EGR1 genes, both of which were induced by serum stimulation, were measured by quantitative RT-PCR1.
RNA-seq analyses. For total RNA-seq, we used immortalized skin fibroblast cell lines from CHOPS T254A (CDL444), CHOPS T254S (CDL160), CHOPS R258W, CDL006, GM01652, GM02036 and GM03348 (CHOPS R258W (CDL559): an 8-year-old African-American female with a p.Arg258Trp alteration). All samples were sequenced on the Illumina HiSeq 2500 platform as single-end 50-bp reads and mapped to the human genome (UCSC build hg19). Sequencing and mapping statistics are summarized in Supplementary Table 5. Total RNA was extracted using TRIzol (Life Technologies) and Nucleospin RNA (Macherey-Nagel) following the manufacturers' instructions. The RNA-seq analysis was performed using TopHat version 2.0.8b (ref. 37) and Cufflinks version 2.1.1 (ref.
38) with the default parameter set. We applied Ensembl gene annotation (GRCh37) for both TopHat and Cufflinks. We considered protein-coding genes only, and gene-level expression values were estimated by fragments per kilobase of exon per million fragments mapped (FPKM) score. We obtained a total of 519 and 299 differentially expressed genes for CHOPS syndrome and CdLS, respectively (Supplementary Table 6).
ChIP-seq analyses. For ChIP-seq, we used immortalized skin fibroblast cell lines of CHOPS T254A (CDL444), CDL006 and GM02036 samples. Chromatin immunoprecipitation was performed as previously described39. Briefly, cells were cross-linked with 1% formaldehyde for 10 min. Soluble cell lysates were prepared by sonication or both MNase treatment and sonication. Lysates were then incubated with protein A or protein G Dynabeads (Life Technologies) conjugated with antibodies. Beads were washed, and bound fractions were eluted. After reversal of cross-linking for eluates and input samples, DNA was purified using the PCR purification kit (Qiagen). DNA from input and immunoprecipitation fractions was further sheared to an average size of approximately 150 bp by ultrasonication (Covaris), end repaired, ligated to sequencing adaptors and amplified according to the manufacturer's instructions (NEBNext ChIP-seq Library Prep Master Mix Set for Illumina; New England BioLabs). Sequenced reads were mapped to the human genome using Bowtie version 1.1.0 (ref. 40), allowing two mismatches in the first 28 bases per read and outputting only uniquely mapped reads (-n 2 -m 1 options). Mapping statistics are summarized in Supplementary Table 7. ChIP-seq analysis and visualization were performed using DROMPA version 2.5.3 (ref. 41). To facilitate comparison of detected peaks between different ChIP-seq experiments, the number of mapped reads was normalized with the total number of mapped reads.
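The idea behind this read-depth normalization can be sketched in a few lines of Python. The sketch assumes a simple scaling of per-window read counts to a common library size of one million mapped reads; the exact procedure used by DROMPA may differ. The function name `normalize_counts`, the `scale` parameter and the toy counts are hypothetical.

```python
from typing import List


def normalize_counts(window_counts: List[int], total_mapped_reads: int,
                     scale: int = 1_000_000) -> List[float]:
    """Scale per-window read counts to a common library size.

    Assumption: counts are rescaled to reads per `scale` mapped reads,
    so that samples sequenced to different depths become comparable.
    """
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    factor = scale / total_mapped_reads
    return [count * factor for count in window_counts]


# Two toy samples with the same relative signal but different depths.
shallow = normalize_counts([10, 20], total_mapped_reads=2_000_000)
deep = normalize_counts([40, 80], total_mapped_reads=8_000_000)
```

In this toy case both samples yield the same normalized values, which is the property that allows peaks from different experiments to be compared.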
To eliminate uncertain sites, we ignored regions with low mappability (mappability <0.3 for a 1-kb window). Scatter plots for RNAP2, AFF4 and SPT5 are shown in Supplementary Figures 8 and 10. We applied the CG command to normalize and count the mapped reads within each gene body region. Plotting of averaged reads around TSSs (Supplementary Fig. 9) was performed using the PROFILE command. RefSeq gene annotation was obtained from the UCSC Genome Browser.
Genome editing. To introduce the AFF4 deletion, the CRISPR/Cas9 system was used42. gRNA and hCas9 vectors were purchased from Addgene. The AFF4 exon 3 target sequence was introduced into an empty gRNA vector. The gRNA target sequence was ATGGGCCGCACATAGGCAG. The AFF4 exon 3–targeting gRNA vector and hCas9 plasmids were transfected into skin fibroblast cell lines using the Neon electroporation system (Life Technologies). After electroporation, single-cell cloning was performed. For each clone, the presence or absence of genome editing was confirmed by Sanger sequencing.
URLs.