Structure-Based Approach Toward Identification of Inhibitory Fragments for Eleven-Nineteen-Leukemia Protein (ENL)
Abstract
Lysine acetylation is an epigenetic mark principally recognized by bromodomains, and recently, structurally diverse YEATS domains have also emerged as readers of lysine acetyl/acylations. Here, we present a crystallography-based strategy and the discovery of fragments binding to the ENL YEATS domain, a potential drug target. Crystal structures combined with synthetic efforts led to the identification of a submicromolar binder, providing the first starting points for the development of chemical probes for this reader domain family.
Introduction
Post-translational modification at lysines is one of the key mechanisms that regulate epigenetic signaling. Lysine acetylation is one of the most common epigenetic marks, specifically recognized by bromodomains and some double PHD finger (DPF) domains. Recently, YEATS domains have emerged as a third class of histone acetylation readers that are present in four human proteins: ENL (MLLT1), YEATS2, AF9 (MLLT3), and glioma amplified sequence 41 (GAS41 or YEATS4). Interaction with acetylated histones was first demonstrated for AF9 and subsequently confirmed for the remaining family members. Interestingly, YEATS domains display an expanded reader activity, also recognizing other types of lysine modifications, including propionylation, butyrylation, and crotonylation.
The YEATS domain consists of approximately 120 to 140 amino acids and is evolutionarily conserved from yeast to human. It exhibits an immunoglobulin-like topology with an elongated beta-sheet sandwich core capped by one or two short helices. The binding pocket is constructed by three loops emanating from the Ig fold. A number of conserved aromatic residues shape a flat, extended architecture of the binding groove that is capable of recognizing acyl-lysine containing sequences. Proteins harboring the YEATS reader domain are often associated with histone acetyl transferase (HAT) and chromatin-remodeling complexes, implicating their diverse roles in the regulation of chromatin structure, histone acetylation, gene transcription, stress signaling, mitotic progression, and DNA damage response. In addition, the ability to preferentially recognize other lysine acylation marks suggests that the YEATS family proteins might exert differential regulatory functions compared to the prototypical Kac reader families with their own cognate targets.
Dysfunction of YEATS proteins has been linked to diseases, notably cancer. For instance, the fusion of AF9 or ENL and human mixed lineage leukemia (MLL) proteins is frequently found in acute myeloid leukemia, and these fusions constitute oncogenes that are drivers of this highly aggressive cancer. In addition, GAS41, a common subunit of SRCAP (Snf2 related CREBBP activator protein) and Tip60 HAT complexes, is a growth-promoting protein. These roles suggest that YEATS proteins are potential targets for drug development. Indeed, two recent studies identified the ENL (MLLT1) YEATS domain as a compelling target in AML.
Development of chemical probes targeting the acetyl-lysine readers, such as the bromodomain family, has provided significant insight into the biological function of these proteins and their potential as drug targets. However, in contrast to bromodomains, no inhibitors of YEATS domains have been reported to date. This prompted us to identify chemical scaffolds that can interact with the binding pocket of this acetyl-lysine reader family, focusing on oncogenic ENL. As for bromodomains, we chose a structure-based approach to identify initial fragment binders. We predicted an essential chemical moiety that could mimic the beta-sheet type hydrogen bonding pattern of acylated lysine and established a small fragment-like library. Using structure-based approaches and biophysical characterization such as thermal shift assays and isothermal titration calorimetry (ITC), we identified potential inhibitory ligands for ENL, which may provide chemical starting points for further development of potent inhibitors for this protein and the other members of the histone acylation reader YEATS family.
Results and Discussion
To date, all available crystal structures of ENL and other YEATS proteins have been solved in complexes with peptides. We therefore determined the apo-structure of ENL to expand our knowledge on a non-liganded form of the protein. In the ENL apo-structure, all structural elements were well-defined by electron density, including loops 1, 4, and 6 that defined the recognition site for acylated lysine. This suggested that the binding pocket is well-structured prior to the binding of ligands. Surprisingly, the side chain of Y78, located on loop 6 and together with F28 and F59 forming part of the conserved aromatic acyl-lysine binding triad, exhibited two orientations not observed previously in the peptide-complexed structures. The “in” conformation, with the side chain positioned on top of the binding site, resembled the orientation observed in the ENL-Kac27H3 complex, while the “out” conformation exhibited a ninety-degree side chain rotation. This conformation was stabilized by a hydrogen bond to the backbone of loop 1 E26. The unexpected flexibility of this tyrosine suggested an intrinsic flexible nature of the binding site, interchangeable between the confined “in” conformation and a more open surface when adopting the “out” conformation.
Despite sharing acetyl-lysine recognition functions, comparative analyses revealed diverse features between canonical acetyl-lysine binding sites of ENL and bromodomains such as BRD4. First, the two families harbor distinct primary, secondary, and tertiary structures and therefore have diverse acetyl-lysine binding sites. In bromodomains, the acetyl moiety is anchored by hydrogen bonds to a conserved asparagine and via a water molecule to a tyrosine residue. In contrast, in ENL, the backbone of loop 6 and the aromatic triad mediate this recognition process. Additionally, the two readers exploit different structural elements toward formation of the pocket. Two helices and the ZA loop, acting as a selectivity filter for protein partners and small molecule inhibitors, contribute to a deep, groove-like construction in bromodomains, which contrasts with a surface-exposed, tunnel-like pocket formed through three loop elements and the side chains of the aromatic triad in ENL. These differences also result in diverse patterns of conserved water molecules present in the pockets. Typically, five water molecules conduct a complex hydrogen bond network in bromodomains compared to two structural waters present in ENL located adjacent to the Y78 and A79 backbones.
We next determined the Kac-complexed structure to investigate how the observed flexibility is influenced upon binding of small ligands. A comparison between this structure, the apo-structure, and the peptide complex revealed a conserved binding mode of the acetyl moieties and reorientations of H56 and D57 side chains.
Comprehensive structural analyses revealed that the distinct cave-like characteristic of the pocket in ENL, as well as in other YEATS members, may have evolved for compatibility with various lysine acylations. The front open end, adjacent to H56 and loop 6 A79, provides a more restrained entrance for the lysine backbone, while the rear open end, cradled by F28 and F59, adopts a flat, hydrophobic environment for accepting elongated acyl modifications such as crotonylation. The profound structural differences between the pockets of bromodomains and YEATS domains therefore suggest that current acetyl-lysine mimetic bromodomain ligands are unlikely to bind to YEATS domains. With a largely hydrophobic and aromatic surface area of approximately ninety-three square angstroms and a pocket volume of about thirty-four cubic angstroms, the ENL binding pocket is smaller than most bromodomain binding sites, yet could be sufficiently large for the development of potent inhibitors.
The unique features of the ENL pocket led us to presume that the narrow binding site with two open ends might accommodate linear molecules harboring a central carbonyl group, which we considered as initial fragment-like scaffolds. For fragment selection, we postulated that suitable ligands should contain a central amide functional group to mimic acetyl-lysine and at least one aromatic group at the carbonyl (R1) end or, for a flipped amide, at the nitrogen (R2) end to mimic pi-stacking with F28 and F59, an interaction observed for the crotonyl group. Based on this assumption, we first selected a small set of nineteen fragment-like compounds, classified into groups based on the positions of their aromatic rings. These compounds were initially tested using thermal shift assays; unfortunately, no detectable shifts in melting temperature were observed. We then sought to exploit an alternative crystallography-based approach to verify interactions and to determine binding modes. Ligand soaking was performed for all compounds; however, only ten crystals, those soaked with compounds 1, 2, 3, 9, 10, 11, 12, 14, 15, and 19, preserved diffraction quality. Examination of electron density maps revealed in all structures additional density in proximity to the Y78 backbone, where the carbonyl of bound acetyl-lysine was typically situated. This consistent observation supported our hypothesis regarding the use of a central amide group as an acetyl-lysine mimetic for ENL. However, an assessment of the ligand binding modes was only possible for compounds 12 and 19, where a complete trace of electron density enabled an accurate placement of the ligands. These compounds were from two different groups in our selection, yet their binding modes were similar. The amide core was observed to flip in comparison to that of acetyl-lysine.
While the carbonyl of the flipped amide maintained the beta-sheet type hydrogen bonding pattern of acylated lysine to the Y78 backbone, the nitrogen group further engaged a direct bond to the S58 side chain that swung slightly backward. At the front end of the ENL pocket, both aromatic benzene and benzimidazole groups of 12 and 19, respectively, attached to the amide carbonyl at R1, were sandwiched between loop 4 H56 and loop 6 A79, adopting a nearly planar orientation to the histidine imidazole. The decorations of the R1 aromatic groups of both compounds were highly solvent exposed, showing little or no interaction with the protein.
Surprisingly, accommodation of the R2 decoration at the rear pocket differed markedly between these two ligands. For compound 12, the R2 benzene ring directly attached to the amide nitrogen atom tucked in planar between F28 and F59 and potentially formed three-layer pi-stacking with these phenylalanine residues, an interaction expected to mimic lysine crotonylation. In contrast, a similar interaction was not observed for the chlorobenzene of compound 19. This could be due to the constrained geometry of the sp3 carbon spacing atom at R2, which in comparison to 12 forced a rotation of nearly one hundred twenty degrees, directing the chlorobenzene group outward from the rear pocket to the solvent-exposed region. The orientation of the ring system of 19 potentially also created steric constraints for Y78 and consequently induced the tyrosine to adopt the “out” conformation. Unlike in the complex with 12, this conformation of Y78 resulted in a rather exposed, less rigid pocket. Both co-crystal structures of ENL with 12 and 19 provided valuable structural insights into the binding mode, highlighting, for example, the importance of direct attachment of the aromatic moieties to the R1 and R2 positions of the amide core, which may be key not only for decreasing the flexibility of both the ligands and Y78 but also for establishing optimal conditions for strong pi-pi contacts. In parallel, development of a high-throughput assay and screening identified similar chemotypes, in essence a benzimidazoleamide hit with low micromolar affinity.
To test this chemical scaffold, we used our available chemistry to synthesize a small preliminary set of R2-benzimidazole-amide derivatives with modifications at R1, resulting in compounds 20 to 24. Using ITC, we observed that these derivatives bound ENL with good affinities in the low micromolar range, with compound 20 being the most potent, with a submicromolar Kd of about 0.8 micromolar. We then focused on characterizing the binding mode of compound 20, and successful soaking and structure determination provided insight into how this compound binds to ENL. As expected, the Kac mimetic amide core retained its flipped binding orientation, maintaining the interactions to Y78 and S58 as observed for 12 and 19. The R1 3-iodo-4-methylbenzene was situated in the front cavity, adopting a planar conformation to H56 for a pi-stacking contact at a distance of about four angstroms. The R2 benzimidazole, as expected, was located at the rear pocket, sandwiched between F28 and F59, with which it plausibly formed a three-layer pi-stacking arrangement. However, apart from these predictable key interactions, compound 20 was observed to engage in further hydrogen bonds to the protein, which were not present in the complexes with 12 and 19. These included a direct contact between the nitrogen of the benzimidazole ring and the S76 backbone carbonyl. In addition, the extended piperidine decoration was observed to protrude further along the protein surface at the rear end, where it adopted a conformation that enabled a direct contact between its nitrogen and the E75 carboxylic side chain at a distance of about three point one angstroms. We performed ITC to assess the thermodynamics of the binding of 12, 19, and 20 to ENL. Binding of 20 was characterized by a favorable binding enthalpy and a Kd of about eight hundred seven nanomolar. Binding enthalpy was also observed for 12 and 19, indicating interactions in the twenty to fifty micromolar Kd range.
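As a rough cross-check on the ITC values above, the dissociation constants can be converted into standard binding free energies using the relation delta G = RT ln(Kd). The sketch below is illustrative only and is not part of the reported analysis. It assumes a 1 M standard state, uses the twenty-five degrees Celsius measurement temperature given in the Experimental Section, and treats the twenty to fifty micromolar range for 12 and 19 as simple bounds. All function and variable names are hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)
T_KELVIN = 298.15                # 25 degrees Celsius, as stated for the ITC runs


def delta_g_from_kd(kd_molar, temperature_k=T_KELVIN):
    """Standard binding free energy in kJ/mol from a Kd in mol/L.

    Assumes a 1 M standard state, so delta G = R * T * ln(Kd / 1 M).
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


def fold_improvement(kd_reference_molar, kd_improved_molar):
    """Ratio of two dissociation constants (reference / improved)."""
    return kd_reference_molar / kd_improved_molar


# Kd values taken from the text; the 20 and 50 micromolar entries are
# the bounds of the range quoted for compounds 12 and 19.
KD_COMPOUND_20 = 807e-9
KD_FRAGMENT_LOW = 20e-6
KD_FRAGMENT_HIGH = 50e-6

if __name__ == "__main__":
    for label, kd in [
        ("compound 20", KD_COMPOUND_20),
        ("12/19, tighter bound", KD_FRAGMENT_LOW),
        ("12/19, weaker bound", KD_FRAGMENT_HIGH),
    ]:
        print(f"{label}: Kd = {kd:.2e} M, dG = {delta_g_from_kd(kd):.1f} kJ/mol")
    print(
        "fold improvement of 20 over 12/19: "
        f"{fold_improvement(KD_FRAGMENT_LOW, KD_COMPOUND_20):.0f} to "
        f"{fold_improvement(KD_FRAGMENT_HIGH, KD_COMPOUND_20):.0f}"
    )
```

Under these assumptions, the submicromolar Kd of compound 20 corresponds to roughly 8 to 10 kJ/mol of additional binding free energy relative to fragments 12 and 19, in line with the extra hydrogen bonds seen in its co-crystal structure.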
The presented data demonstrate that the acetyl-lysine binding site of ENL is druggable using a central aromatic scaffold decorated with an amide that acts as an acetyl-lysine mimetic moiety.
Conclusions
YEATS domains have a fundamentally different binding site than the well-explored acetyl-lysine readers of the bromodomain family. These differences enable YEATS domains to also recognize larger modifications at the epsilon-nitrogen of lysines, including crotonylation and other acyl-modifications not recognized by bromodomains, which typically have a closed, water-filled binding pocket. Thus, targeting YEATS domains will require a different design of acyl-lysine mimetic inhibitors. The information provided by this structure-based fragment study will facilitate the design and development of more potent inhibitors. In our laboratory, we used benzimidazole amides for the development of a potent chemical probe for ENL and AF9, which will facilitate further study of the role of this interesting reader domain in normal physiology and disease.
Experimental Section
For protein purification, crystallization, and structure determination, the ENL YEATS domain (residues 1 to 148) was subcloned into pNIC-CH, and the recombinant protein incorporating a noncleavable C-terminal His6-tag was expressed in E. coli Rosetta. Purification was performed using nickel affinity and size-exclusion chromatography. Pure protein in twenty-five millimolar Tris, pH 7.5, three hundred millimolar sodium chloride, and zero point two millimolar TCEP was concentrated to zero point four five millimolar and used for crystallization at twenty degrees Celsius with various reservoir solutions using PEG/PEG smears as precipitant. Ligands at five to forty millimolar concentration were soaked into crystals, which were subsequently cryoprotected with twenty-five percent ethylene glycol. Diffraction data collected at BESSY II, SLS, and Diamond Light Source were processed using XDS or iMOSFLM and scaled with AIMLESS. Molecular replacement was performed in PHASER using the structure with PDB code 5j9s as the search model. Model rebuilding alternating with structure refinement was performed in COOT and REFMAC. The models were verified using MolProbity. Data collection and refinement statistics are summarized in the Supporting Information.
Compounds used in this study were obtained from commercial sources except for compounds 11, 12, and 20 to 24, which were synthesized and checked for their purity of greater than ninety-five percent using Waters UHPLC or Varian ProStar HPLC. Detailed syntheses and characterizations are provided in the Supporting Information.
Isothermal titration calorimetry data were measured with a NanoITC instrument at twenty-five degrees Celsius. Protein at three hundred micromolar in twenty-five millimolar Tris, pH 7.5, five hundred millimolar sodium chloride, zero point five millimolar TCEP, and five percent glycerol was titrated into twenty to thirty micromolar compounds.
Associated Content
The Supporting Information is available free of charge on the ACS Publications website. It includes an overview of the YEATS domain structure, a list of fragments, electron density maps, ITC data, details on compound synthesis and analytical data, and crystallographic data collection and refinement statistics. Molecular formula strings are also provided.
Accession Codes
Coordinates and structure factors have been deposited with accession codes 6hq0, 6hpw, 6hpx, 6hpy, and 6hpz.