Ex vivo discovery of synergistic drug combinations for hematologic malignancies



Introduction
Acute myeloid leukemia (AML) is a heterogeneous and aggressive blood cancer that primarily affects older adults [1]. AML is relatively uncommon, accounting for 1 % of all cancers and 20,000 diagnosed cases in the United States each year [2]. However, the incidence of AML is projected to increase due to demographic aging [1]. Over the past decade, the Food and Drug Administration (FDA) has approved many targeted therapeutics for treating AML [3]. However, patient outcomes remain poor, as inter- and intra-patient tumor heterogeneity poses a challenge for traditional drug development strategies that focus on single-agent therapies [4,5].
Alternatively, drug combination therapies are an effective approach for tackling tumor heterogeneity. For example, the combination of venetoclax with azacitidine yielded 46 % higher overall survival than azacitidine alone in patients newly diagnosed with AML who are 75 years or older or who do not qualify for intensive chemotherapy. This combination treatment is now the standard of care in that population [6,7].

Abbreviations: AML, acute myeloid leukemia; AP, average precision; CDI, coefficient of drug interaction; DMSO, dimethyl sulfoxide; DNN, deep neural network; DSS, drug sensitivity screening; FDA, Food and Drug Administration; FLAG, fludarabine, cytarabine, and filgrastim; FMO, fluorescence-minus-one; HTS, high-throughput screening.
Despite the success of combination approaches, many fail in clinical trials. Also, the number of possible combinations of individual treatments greatly surpasses the number of patients who can enroll in clinical trials [8]. As a result, we need better methods to efficiently uncover effective drug combinations at the preclinical stage, predict combination therapeutic strategies for clinical development, and identify drug combinations that more effectively treat cancers [8,9].
To efficiently identify potential drug combinations for clinical development, we need an effective preclinical tool that is scalable and translatable. However, developing such a tool has challenges. One challenge is that drug combination discovery workflows are difficult to scale efficiently. Many tools used to find single-agent drugs, such as high-throughput screening (HTS) of small-molecule libraries in cell-based assays, scale poorly for screening massive numbers of drug combinations [10]. For example, with modest screening workflows, an HTS campaign of a small-compound library with ~10,000 drugs that have been tested for safety in clinical trials can be completed in 1 week. However, HTS of all pair-wise combinations of those same compounds to identify synergistic drug combinations would require testing nearly 50 million pairs. This magnitude is experimentally intractable and far beyond the scale of even the largest HTS systems. As a result, drug synergy screens in cell lines have been limited to relatively small drug libraries. Even so, these limited explorations of drug combinations are resource-intensive, which restricts screening to simple, low-cost assays that scale efficiently [11,12].
Another challenge is that preclinical tools are difficult to clinically translate. As of 2023, HTS of drug combinations has largely been limited to simple bulk readouts of viability and drug sensitivity in in vitro models with cancer cell lines [11,12]. Although such approaches are inexpensive and scalable, they have limited biological relevance and clinical translatability. These features are important because drug sensitivity and gene expression pathways for multidrug resistance are different between generated cells and primary samples [13][14][15].
One approach to bridging the preclinical gap in disease model translatability involves drug sensitivity assays with ex vivo models. Such models are valuable for screening primary AML cancer tissue, which is routinely collected in large amounts from patients' blood or bone marrow. These tissues can be cultured ex vivo and screened with drugs to identify potential therapeutics [16][17][18][19]. For example, we and others showed that ex vivo drug sensitivity screening (DSS) can accurately predict clinical therapeutic responses in patients with AML across multiple therapeutic classes [20][21][22][23]. Ex vivo DSS also supports "functional precision medicine," in which a patient's DSS data can be used to test many potential treatment options to find optimal, personalized therapies for that particular patient [18,24]. With this approach to personalized medicine, selected therapies guided by DSS improved survival of patients with hematologic malignancies [24]. These findings suggest that ex vivo DSS might be a promising approach to identify translatable, personalized therapeutics that transcend HTS approaches in cell lines.
Despite the promise of ex vivo DSS, this approach is not without limitations. First, unlike screens with immortalized cancer cell lines, ex vivo DSS is limited by the finite quantity of leukemic cells in a given sample. This limitation restricts the number of drugs or drug combinations that can be screened per patient [25]. Indeed, the largest reported DSS of drug combinations in an ex vivo AML model tested only 50 drug combinations per patient sample [26].
Second, ex vivo DSS cannot simply use the rapid, low-cost assays of bulk cell viability, such as CellTiter-Glo, that are commonly used to screen drug combinations in cell lines [27]. These bulk methods measure the aggregate sensitivity across all cells and do not distinguish the drug effects between healthy and cancer cell populations within patient tissue ex vivo [17]. As a result, ex vivo DSS of AML tissue can only predict patient responses when using complex, functional assays with single-cell resolution, such as HTS multicolor flow cytometry [20]. However, these single-cell assays are much more costly, slower, and less efficient than bulk viability assays [28][29][30].
The limitations of DSS restrict the number of drug conditions that can be tested ex vivo, especially for identifying optimal, personalized combination therapies. Thus, we need better methods for ex vivo drug screening that are scalable and clinically translatable. To this end, we aimed to develop a fully automated platform for identifying synergistic drug combinations directly in patient tissue. Herein, we describe our design of ComboFlow, which comprises (1) a miniaturized assay to measure drug sensitivity with flow cytometry, (2) combinatorial drug pooling, and (3) automated data analysis with a deep neural network (DNN). We then used this system to rapidly screen thousands of combinations of approved and investigational agents in primary AML tissue.

Patient samples
Remnant bone marrow and peripheral blood specimens from deidentified patients diagnosed with AML, high-risk myelodysplastic syndrome, and acute lymphocytic leukemia were obtained from various remnant providers. Bone marrow samples from healthy volunteers were supplied by AllCells, Inc. (Alameda, CA). This study was reviewed by the WCG Institutional Review Board and deemed exempt research under 45 CFR 46 and associated guidance.

Ex vivo drug sensitivity screening
Bone marrow and peripheral blood samples were red blood cell-lysed and then cultured in 384-well plates for DSS as previously described [21,31]. For each patient screen, a panel of 80 small molecules that were either FDA-approved or previously tested in clinical trials for treating hematologic malignancies was dispensed in triplicate with a Labcyte Echo Acoustic Liquid Handler (Beckman Coulter, San Jose, CA, no. 550) (Supplementary Table 1) [21]. Samples were then incubated in a robotic incubator at 37 °C, 95 % humidity, and 5 % carbon dioxide for 48 hours before being assessed for drug sensitivity.

MiniFlow assay characterization
To characterize the performance of acoustic dispensing with the Labcyte Echo Acoustic Liquid Handler (Beckman Coulter, San Jose, CA, USA, no. 550), antibody-fluorophore conjugates were dispensed across a range of volumes (10, 25, 35, 50, 75, and 100 nL) into 384-well plates containing 20 µL phosphate-buffered saline per well. Fluorescence intensities were then measured using the Tecan M200 Infinite Pro plate reader (Tecan, Männedorf, Switzerland) and compared to a manually prepared standard curve for each antibody-fluorophore conjugate (n = 28 replicates per dispense volume). To characterize the performance and detection range of MiniFlow based on counting of blasts, dilutions of blasts at known concentrations were plated in 384-well plates, stained, and analyzed by flow cytometry. The measured blast counts per well were then compared to the expected blast counts. To assess the accuracy of blast classification with the MiniFlow assay, blast percentages estimated with the MiniFlow assay (10,000 cells per well) were compared to those estimated with standard washed staining (50,000 cells per well) across 28 patient samples.
The performance of MiniFlow assay components was analyzed by plotting the expected vs observed values in the characterization experiments of (1) Echo-based dispensing of stains, (2) MiniFlow identification of blasts, and (3) the blast count detection range of the MiniFlow assay. Pearson's correlation coefficient was used to measure correlations between expected and observed values.

Drug sensitivity and combination scores
For each drug and patient sample, "blast score" was used as the metric for drug sensitivity. This score was calculated by normalizing the total viable blast count to that of the vehicle control [dimethyl sulfoxide (DMSO)]. Dose-response curves were generated by fitting the Hill equation to dose-response data, and the area under the curve was calculated with GraphPad Prism version 7.0 (GraphPad Software, San Diego, CA, USA). Hierarchical clustering of patient samples and drug sensitivity was implemented with the UPGMA algorithm and Euclidean distance metric using the SciPy (v.1.3.1) package and plotted with seaborn (v.0.9.0) to generate heatmaps [32].
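The blast score and dose-response summary described above can be illustrated with a short sketch. This is not the authors' pipeline (curve fitting was done in GraphPad Prism). The four-parameter form of the Hill equation, the trapezoidal AUC over log10(dose), the function names, and all numeric values are assumptions chosen for illustration.

```python
import math
from statistics import mean


def blast_score(viable_blasts, dmso_blast_counts):
    """Viable blast count in a drugged well, normalized to the mean DMSO well."""
    return viable_blasts / mean(dmso_blast_counts)


def hill(dose, top, bottom, ec50, slope):
    """Assumed four-parameter Hill curve: falls from `top` to `bottom` with dose."""
    return bottom + (top - bottom) / (1.0 + (dose / ec50) ** slope)


def auc_log_dose(doses, responses):
    """Trapezoidal area under the curve over log10(dose), scaled by dose range."""
    xs = [math.log10(d) for d in doses]
    area = sum(
        (xs[i + 1] - xs[i]) * (responses[i] + responses[i + 1]) / 2.0
        for i in range(len(xs) - 1)
    )
    return area / (xs[-1] - xs[0])


# Hypothetical counts: three DMSO wells and one drugged well.
dmso_wells = [980, 1010, 1010]
print("blast score:", blast_score(500, dmso_wells))

# Hypothetical dose series (nM) evaluated on an assumed, already-fitted curve.
doses = [1, 10, 100, 1000, 10000]
curve = [hill(d, top=1.0, bottom=0.0, ec50=100.0, slope=1.0) for d in doses]
print("normalized AUC:", auc_log_dose(doses, curve))
```

A lower blast score means fewer viable blasts relative to vehicle, so a lower normalized AUC corresponds to greater sensitivity under these assumptions.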
For both pooled and deconvolution synergy screens, we employed the coefficient of drug interaction (CDI) to discern drug combinations that display effects surpassing additivity, indicating potential synergy. Using the Bliss independence model as a reference, which assumes that drugs exhibit their effects independently, the expected blast score for a given combination is computed as the product of the blast scores of the individual drugs [33]. The CDI is then calculated by taking the ratio of the observed blast score to the expected one. Synergy is inferred when CDI < 1, antagonism when CDI > 1, and additivity when CDI = 1.
For a drug pair A and B with respective blast scores of $S_A$ and $S_B$, and combination blast score $S_{AB}$, the CDI is given by:

$$\mathrm{CDI}_{AB} = \frac{S_{AB}}{S_A \times S_B}$$

For a drug pool P composed of multiple agents A, B, C, D, and E, the methodology parallels the above. The blast score for the pool, $S_P$, is divided by the product of the blast scores of the individual agents:

$$\mathrm{CDI}_{P} = \frac{S_P}{S_A \times S_B \times S_C \times S_D \times S_E}$$

When no drug pools in a screen showed synergy based on the CDI, we adopted a modified CDI that incorporates a more lenient criterion for additivity. This modified CDI assumes that the expected blast score of a drug pool is approximately equivalent to the additive effects of the two most active agents in that pool, rather than all five. Hence, if drugs A and B possess the greatest individual activity among all agents in the pool P, the modified CDI is:

$$\mathrm{CDI}_{P}^{\mathrm{mod}} = \frac{S_P}{S_A \times S_B}$$

Differences between observed and expected combination effects across patient samples were statistically evaluated using the Wilcoxon matched-pairs signed rank test.
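As a worked illustration of the CDI definitions above, the following minimal sketch computes the standard and modified CDI from blast scores. The function names and example values are hypothetical, and "most active" is taken to mean the lowest single-agent blast score, which is an interpretation rather than something the text states explicitly.

```python
from math import prod


def cdi(observed_score, single_agent_scores):
    """Observed blast score divided by the Bliss-expected score (product of singles)."""
    return observed_score / prod(single_agent_scores)


def modified_cdi(observed_score, single_agent_scores):
    """Expected score uses only the two most active agents (lowest blast scores)."""
    most_active, second_most_active = sorted(single_agent_scores)[:2]
    return observed_score / (most_active * second_most_active)


def interpret(cdi_value, tolerance=1e-9):
    """Map a CDI value to the interaction classes defined in the text."""
    if cdi_value < 1.0 - tolerance:
        return "synergy"
    if cdi_value > 1.0 + tolerance:
        return "antagonism"
    return "additivity"


# Hypothetical drug pair: S_A = 0.8, S_B = 0.5, S_AB = 0.2.
pair_cdi = cdi(0.2, [0.8, 0.5])
print(pair_cdi, interpret(pair_cdi))

# Hypothetical five-drug pool with an observed pool blast score of 0.3.
pool_singles = [0.9, 0.8, 0.95, 1.0, 0.85]
print(cdi(0.3, pool_singles), modified_cdi(0.3, pool_singles))
```

Because the modified CDI divides by a larger expected score than the full five-agent product whenever the remaining agents have blast scores below 1, it yields a lower value than the standard CDI for the same pool.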

Combination pooling algorithm
The combination pooling algorithm builds pools of drug combinations by serially constructing optimal pools. Given a set of compounds, a pool is started by selecting a random compound from that set. Then the algorithm attempts to identify which compounds, if added to that pool, maximize the "pool-combination score" heuristic. For a potential compound, the pool-combination score increases by one for each new compound pair that is generated by adding the potential compound to a pool. Conversely, the pool-combination score decreases by the square of the frequency of the repeat if adding a compound generates repeat pairs. Once pools with repeats start to generate, the algorithm uses a "pool-elimination heuristic." This step assesses which previously generated pool, if eliminated from the current pool set, would enable the generation of a new pool with an optimal pool-combination score. The algorithm continues to run until all compounds have been paired with every other compound in the library.
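The greedy construction described above can be sketched as follows. This is a simplified illustration and not the published ComboPooler code: it omits the pool-elimination heuristic, and it seeds each pool with a random compound that still has an uncovered partner (an assumption added here so that the loop is guaranteed to terminate). All function names are hypothetical.

```python
import random
from itertools import combinations


def pool_combination_score(pool, candidate, pair_counts):
    """+1 per new pair created by the candidate; minus (repeat count)^2 per repeat."""
    score = 0
    for member in pool:
        seen = pair_counts.get(frozenset((member, candidate)), 0)
        score += 1 if seen == 0 else -(seen ** 2)
    return score


def build_pools(compounds, pool_size, seed=0):
    """Greedily build pools until every compound pair appears in at least one pool."""
    if pool_size < 2 or len(compounds) < pool_size:
        raise ValueError("need pool_size >= 2 and at least pool_size compounds")
    rng = random.Random(seed)
    all_pairs = {frozenset(p) for p in combinations(compounds, 2)}
    pair_counts = {}  # pair -> number of pools containing it
    pools = []
    while len(pair_counts) < len(all_pairs):
        # Assumption: seed with a compound that still has an uncovered partner.
        open_compounds = sorted(
            {c for pair in all_pairs if pair not in pair_counts for c in pair}
        )
        pool = [rng.choice(open_compounds)]
        while len(pool) < pool_size:
            candidates = [c for c in compounds if c not in pool]
            # Highest heuristic score wins; ties are broken at random.
            best = max(
                candidates,
                key=lambda c: (pool_combination_score(pool, c, pair_counts), rng.random()),
            )
            pool.append(best)
        for pair in combinations(pool, 2):
            key = frozenset(pair)
            pair_counts[key] = pair_counts.get(key, 0) + 1
        pools.append(pool)
    return pools


example_pools = build_pools(list(range(10)), pool_size=3)
print(len(example_pools), "pools cover all", 10 * 9 // 2, "pairs")
```

Each new pool covers at least one previously uncovered pair, so the loop always finishes, although without the elimination step the resulting pool set is generally larger than the counting lower bound.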
If this heuristic approach does not converge on an efficient solution, the algorithm can instead model the pool-set generation problem as a "block design" and use non-serial, constructive methods from combinatorics to create efficient pool sets [34]. In the context of combinatorial theory for block designs, our pooling problem finds an analogy with finite projective planes, an abstract mathematical concept from algebraic geometry. In these planes, compounds from our library are akin to points in the mathematical space, while the pools of compounds equate to lines connecting these points. Finite projective planes have the property that every pair of distinct points (or compounds) lies on precisely one line (or pool), and each line contains several points [34]. Hence, if the number of compounds and the desired pool size meet the criteria for which a finite projective plane can be constructed, such a plane presents an optimal solution to the pooling problem. This algebraic geometric approach offers a constructive block design method, permitting efficient generation of pools in certain scenarios.
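To make the projective-plane analogy concrete, the sketch below builds the lines of a projective plane of prime order q using the textbook construction (affine lines over the integers modulo q, extended with points at infinity). It is an illustration under the assumption that q is prime and is not taken from the authors' repository; the function name and compound indexing are hypothetical. For order q, it yields q² + q + 1 compounds and the same number of pools, each with q + 1 compounds.

```python
from collections import Counter
from itertools import combinations


def projective_plane_pools(q):
    """Pools (lines) of the projective plane of prime order q.

    Compounds are indexed 0 .. q*q + q. Indices below q*q are affine points
    (x, y); the next q indices are 'points at infinity', one per slope; the
    last index is the point at infinity shared by all vertical lines.
    """
    def affine_point(x, y):
        return x * q + y

    def slope_point(m):
        return q * q + m

    vertical_point = q * q + q
    pools = []
    # Lines y = m*x + b, each extended with the infinity point of its slope.
    for m in range(q):
        for b in range(q):
            line = [affine_point(x, (m * x + b) % q) for x in range(q)]
            pools.append(line + [slope_point(m)])
    # Vertical lines x = c, extended with the vertical infinity point.
    for c in range(q):
        pools.append([affine_point(c, y) for y in range(q)] + [vertical_point])
    # The line at infinity joins all infinity points.
    pools.append([slope_point(m) for m in range(q)] + [vertical_point])
    return pools


plane_pools = projective_plane_pools(5)
pair_usage = Counter(
    frozenset(pair) for pool in plane_pools for pair in combinations(pool, 2)
)
print(len(plane_pools), "pools of size", len(plane_pools[0]))
print("every pair appears exactly once:", set(pair_usage.values()) == {1})
```

Since every pair falls in exactly one pool, no well is spent re-testing a pair, which is why such a design is optimal whenever the library size and pool size happen to match a constructible plane.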
Example code implementing the described pooling algorithm is publicly accessible and can be found at our GitHub repository: https://github.com/kamran-allele/SynergyPooling. The combination pooling algorithm was implemented in Python v.3.7.3. Examples of pool sets are listed in Supplementary Tables 2-3.

AutoGater training and testing
To train the DNN blast classifier, we processed cytometry data from 25 MiniFlow screens, assembling an input matrix where columns represent flow markers and the rows correspond to individual cell events. The classifier's objective was to categorize each event as either "blast" or "not blast". Optimization of the classifier was carried out by comparing its predictions with annotations derived from manual gating by a human analyst using FlowJo v10 (BD Life Sciences, Ashland, OR, USA). The manual gating strategy is further described in the preceding Flow Cytometry section. Within this manually gated data, events classified under the blast gates, such as those for CD34+ cells, were labeled as "blasts". Conversely, events that fell under other categories, including lymphocytes, granulocytes, dead cells, and debris, were designated as "not blast".
Raw values for the fluorescence channels from each event were rescaled with an arcsine transformation. For each screen, we used fluorescence-minus-one (FMO) staining controls to normalize fluorescence channel values. This normalization was done for each fluorescence channel by calculating the 95th percentile of the fluorescence intensity in that channel. This percentile calculation used the respective fluorescence-minus-one control for that channel to obtain channel normalization thresholds. Then for all transformed channel values we subtracted the normalization thresholds to obtain normalized channel values. To ensure numerical stability during training, channel values were then standardized to have zero mean and unit variance. In the training dataset of MiniFlow screens, blasts constitute only a small fraction of all events detected by the flow cytometer. Thus, the dataset was downsampled and balanced such that half of the events in the subsample dataset were blasts.
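The preprocessing steps above can be sketched for a single fluorescence channel as follows. Several details are assumptions made for illustration: the sketch uses the inverse hyperbolic sine (arcsinh) with a cofactor of 150, a transform commonly applied to cytometry data, because a literal arcsine is undefined for intensities outside [-1, 1]; the percentile uses the nearest-rank method; and all function names and values are hypothetical.

```python
import math
import random
from statistics import mean, pstdev

ARCSINH_COFACTOR = 150.0  # assumed value, not stated in the text


def arcsinh_transform(raw_value):
    """Assumed rescaling: inverse hyperbolic sine of the cofactor-scaled intensity."""
    return math.asinh(raw_value / ARCSINH_COFACTOR)


def percentile_95(values):
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    return ordered[math.ceil(0.95 * len(ordered)) - 1]


def fmo_normalize(raw_events, raw_fmo_control):
    """Subtract the FMO-derived threshold from transformed channel values."""
    threshold = percentile_95([arcsinh_transform(v) for v in raw_fmo_control])
    return [arcsinh_transform(v) - threshold for v in raw_events]


def standardize(values):
    """Rescale to zero mean and unit variance (apply to the pooled training values)."""
    center, spread = mean(values), pstdev(values)
    return [(v - center) / spread for v in values]


def balance(events, labels, seed=0):
    """Downsample non-blast events so that half of the kept events are blasts."""
    rng = random.Random(seed)
    blasts = [i for i, label in enumerate(labels) if label == 1]
    others = [i for i, label in enumerate(labels) if label == 0]
    keep = blasts + rng.sample(others, min(len(blasts), len(others)))
    rng.shuffle(keep)
    return [events[i] for i in keep], [labels[i] for i in keep]


# Hypothetical raw intensities for one channel and its FMO control.
normalized = fmo_normalize([10, 200, 1500, 8000], [5, 8, 12, 20, 40])
print(standardize(normalized))
print(balance(["e0", "e1", "e2", "e3", "e4", "e5"], [1, 0, 0, 0, 1, 0]))
```

After FMO normalization, values above zero exceed the 95th percentile of the unstained control for that channel, which gives each channel a comparable "positive" reference point across screens before standardization.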
This processed data served as the input for a deep neural network (DNN) implemented using the Scikit-learn, Keras, and TensorFlow Python libraries. This network was a fully connected, feed-forward architecture with four hidden layers. Each layer utilized exponential linear unit activation functions. The layers were structured with 256, 128, 64, and 8 nodes, respectively. The network culminated in an output layer with a sigmoid activation function to predict binary states ("blast" or "not blast"). To train the DNN, we used the Adam algorithm with a learning rate of 10⁻³, binary cross-entropy as the loss function, a batch size of 1024, and 100 total epochs.
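To show the shape of the architecture described above, here is a dependency-free sketch of an untrained forward pass through the same layer sizes. It is not the authors' Keras/TensorFlow model and performs no training (no Adam, no loss). The input width of 12 channels, the weight initialization, and the function names are assumptions for illustration only.

```python
import math
import random

HIDDEN_LAYER_SIZES = [256, 128, 64, 8]  # hidden layers from the text
N_INPUT_CHANNELS = 12                   # assumption: one input per marker


def elu(x):
    """Exponential linear unit with alpha = 1."""
    return x if x > 0 else math.exp(x) - 1.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_inputs, hidden_sizes, seed=0):
    """Random (untrained) weights; each layer is a (weights, biases) tuple."""
    rng = random.Random(seed)
    sizes = [n_inputs] + list(hidden_sizes) + [1]
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        scale = math.sqrt(2.0 / (fan_in + fan_out))
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)] for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers


def predict_blast_probability(layers, event):
    """ELU on hidden layers, sigmoid on the single output node."""
    activations = list(event)
    for index, (weights, biases) in enumerate(layers):
        pre_activation = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_output = index == len(layers) - 1
        activations = [sigmoid(z) if is_output else elu(z) for z in pre_activation]
    return activations[0]


def count_parameters(layers):
    return sum(len(weights) * len(weights[0]) + len(biases) for weights, biases in layers)


network = init_network(N_INPUT_CHANNELS, HIDDEN_LAYER_SIZES)
print("trainable parameters:", count_parameters(network))
print("untrained output:", predict_blast_probability(network, [0.0] * N_INPUT_CHANNELS))
```

In practice, an event would be called a blast when the sigmoid output exceeds a chosen threshold; the threshold itself is not specified in the text.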
To validate the classifiers, we tested the trained classifier on a separate dataset of 25 MiniFlow screens and generated precision-recall curves. Classifier performance was determined by calculating the average precision of the precision-recall curve. Classifier performance was further evaluated by using the classifier to estimate the blast scores for every drug condition in the validation dataset and comparing these estimates to blast scores assessed by human analysts. Pearson's correlation coefficient was used to measure the correlation between these methods.
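The two validation metrics named above can be sketched without external libraries. Average precision is computed here as the sum over score thresholds of the recall increment times the precision at that threshold, which is one common definition and is assumed rather than confirmed to be the one the authors used. All labels, scores, and blast scores below are hypothetical.

```python
import math
from statistics import mean


def average_precision(labels, scores):
    """Sum of (recall increment x precision) over descending score thresholds."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_positives = sum(labels)
    true_pos = false_pos = 0
    previous_recall = 0.0
    ap = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Events with tied scores enter at the same threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            true_pos += ranked[i][1]
            false_pos += 1 - ranked[i][1]
            i += 1
        recall = true_pos / total_positives
        precision = true_pos / (true_pos + false_pos)
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical per-event labels (1 = blast by manual gating) and classifier scores.
print(average_precision([1, 0, 1, 1, 0], [0.9, 0.8, 0.7, 0.6, 0.2]))

# Hypothetical blast scores per drug condition: manual gating vs classifier.
manual = [1.00, 0.80, 0.45, 0.10]
automated = [0.98, 0.83, 0.40, 0.12]
print(pearson_r(manual, automated))
```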
For all 50 MiniFlow screens in the training and validation datasets, the patient sample for each screen was unique. This was done to ensure AutoGater had sufficient exposure to the phenotypic diversity across AML specimens and to prevent overfitting to a subset of AML phenotypes.

Execution of pooled screens and deconvolution of hits with ComboFlow
We used ComboPooler to design pooled combination screens that efficiently cover all 3160 drug pairs of an 80-drug library using 320 pools, with 5 drugs per pool (Supplementary Table 2). To detect synergy amongst compounds in a pool, each compound must be dosed at a sufficiently low concentration at which the drug is still active but has only modest single-agent activity [35,36]. Based on the single-agent dose-response data from previous MiniFlow screens of our compound library, we identified optimal dosages that would be effective in our pooled screens. We adopted the rationale of Severyn et al., choosing the approximate EC10 (the dose eliciting 10 % of the maximal response) for each compound, allowing us to capture submaximal yet active responses facilitating the detection of potential synergy (Supplementary Table 1) [35].
After conducting the initial pooled combination screen on a patient sample, we used AutoGater to analyze flow cytometry data and calculate blast scores for single agents and pools. Initial identification of pool hits was conducted by filtering out pools with insufficient blast cell reduction, specifically those with more than 50 % of blast cells remaining. This step ensured focus was maintained on pools demonstrating high antitumor activity. Subsequently, the remaining pools were ranked by CDI to assess and prioritize pools based on their synergistic activity. The most promising pools, as determined by CDI values, were advanced to deconvolution in secondary screens. For this phase, 3-5 pools exhibiting the highest synergy were selected, contingent upon the amount of patient sample available. During secondary screening, each individual drug pair from these top-performing pools was tested both as single agents and as pairs. This allowed for the precise identification of drug pairs that contributed to the synergistic killing of blast cells, as quantified by their CDI scores.
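The hit-selection and deconvolution steps above can be sketched as a small pipeline. The data structures, function names, and example values are hypothetical, and the example uses three-drug pools only to keep it short; the 0.5 blast-score cutoff and the CDI ranking follow the text.

```python
from itertools import combinations
from math import prod


def select_pool_hits(pools, pool_scores, single_scores, max_hits=5, max_remaining=0.5):
    """Drop pools with > max_remaining blasts left, then rank the rest by CDI."""
    ranked = []
    for pool_id, drugs in pools.items():
        observed = pool_scores[pool_id]
        if observed > max_remaining:
            continue  # insufficient blast reduction
        expected = prod(single_scores[drug] for drug in drugs)
        ranked.append((observed / expected, pool_id))
    ranked.sort()  # lowest CDI (strongest apparent synergy) first
    return [pool_id for _, pool_id in ranked[:max_hits]]


def deconvolution_pairs(pools, hit_ids):
    """All unique drug pairs within the selected pools, for the secondary screen."""
    pairs = set()
    for pool_id in hit_ids:
        pairs.update(frozenset(pair) for pair in combinations(pools[pool_id], 2))
    return sorted(tuple(sorted(pair)) for pair in pairs)


# Hypothetical primary-screen results.
pools = {"P1": ["A", "B", "C"], "P2": ["C", "D", "E"], "P3": ["A", "D", "F"]}
pool_scores = {"P1": 0.30, "P2": 0.70, "P3": 0.45}
single_scores = {drug: 0.9 for drug in "ABCDEF"}

hits = select_pool_hits(pools, pool_scores, single_scores, max_hits=1)
print(hits)
print(deconvolution_pairs(pools, hits))
```

The `max_hits` parameter stands in for the 3-5 pools chosen according to how much patient sample remains.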

Statistical analysis
All statistical analyses were conducted with GraphPad Prism version 7.0 (GraphPad Software, San Diego, CA). For all analyses, P < .05 was considered significant.

Design of the ComboFlow system
To make ex vivo DSS amenable to drug combination screening, we designed ComboFlow to comprise three modules (Fig. 1). The first module, which we call MiniFlow, is a miniaturized screening assay that enables ex vivo drug testing with flow cytometry in a 384-well format using a no-wash staining method (Supp Fig. 1A). This smaller format uses fivefold less sample per well than existing larger formats and supports testing a greater number of drug combinations in each patient sample (Supp Fig. 1B).
To increase the number of screenable conditions and drug combinations, we developed the second module, ComboPooler. This module designs screens of pooled drug combinations that further reduce consumption of sample material collected from patients (Fig. 1). Rather than test one pair of drugs per well, the ComboPooler algorithm designs screens in which drugs are pooled into sets of five drugs per well, such that each well contains up to 10 possible drug pairs. Using this pooling approach, we can screen for potentially synergistic compounds in pooled MiniFlow screens by identifying "pool hits," or wells in which drug pools eliminate cancer cells more effectively than single agents alone. This pooling strategy reduces the number of wells needed for synergy screening by up to 10-fold. However, to deconvolute and definitively determine which compounds in a pool hit are actually synergistic, secondary screens are still needed.
To conduct secondary screens ex vivo, we developed the third module, AutoGater. AutoGater is a classifier based on deep learning that automatically analyzes the high-dimensional data of the initial screening datasets from the ComboPooler (Fig. 1). With AutoGater, we can rapidly identify synergistic pool hits from the initial ComboPooler screen run on part of a patient sample. We can then determine which specific compound pairs in the top pool hits are synergistic by immediately conducting a secondary MiniFlow screen on the remaining part of a patient sample in a separate plate. In this secondary screen, each potentially synergistic pair is tested individually instead of pooled, which can help identify personalized combination therapies for individual patients. This rapid analysis-design loop is an important aspect of this process because primary AML samples have limited viability. Without stromal support, primary AML samples can often only be maintained in ex vivo culture for 2-5 days before the sample declines or changes due to differentiation of blasts or clonal selection [37].
We designed ComboFlow to integrate all three modules in an automated platform. With this approach, we can take sample tissue from a patient and quickly conduct ex vivo drug screening (ComboPooler), assay for drug effects (MiniFlow), identify pool hits (AutoGater), and deconvolute potential hits using the remaining patient material. In just 4 days, our system can identify, confirm, and personalize synergistic drug combinations for a patient.

MiniFlow: efficient and scalable assay of drug effects ex vivo
Patient responses to therapies have been accurately predicted by ex vivo drug screens with flow cytometry assays in 96-well plates [20,23]. These flow readouts are predictive because they can distinguish drug effects in leukemic versus non-leukemic cells from patients [17,20]. However, these flow readouts are limited to screening small sets of drugged conditions because they consume up to five times more cells than simpler bulk methods that are used for larger-scale, exploratory searches of patient drug sensitivity. Thus, flow cytometry approaches have limited use as an assay for screening huge numbers of drug combinations across many patient samples [17].
To maximize the number of drugs or conditions that could be screened per sample, and still capture physiologically relevant information, we miniaturized a flow-based assay to a 384-well format that requires fewer cells (Supp Fig. 1). However, scaling down flow cytometry assays to use fewer cells is challenging. These assays use aggressive centrifugation and cell resuspension steps that remove unbound staining antibodies from cell suspensions but also remove fragile cell populations. These cell populations are particularly valuable when staining small numbers of cells from primary samples [38]. We reasoned that developing a no-wash staining panel that eliminates the wash step entirely would reduce the number of cells needed for a flow-based screening assay (Supp Fig. 1B). However, because unbound antibody-fluorophore conjugates cause high background fluorescence, we needed to identify conjugates that provide adequate signal-to-noise [39]. We surveyed many antibody-fluorophore conjugates for our 12 markers of interest and found more than 500 conjugates. This high number makes manual testing difficult. To overcome this difficulty, we adapted acoustic dispensing, a method commonly used for dispensing small molecules dissolved in DMSO for HTS [40], to dispense antibody-fluorophore solutions. With this approach, we could rapidly test many different antibody stains (Fig. 2A) precisely and accurately (Fig. 2B).
We then used acoustic dispensing to (1) test 110 antibody-fluorophore conjugates and (2) identify conjugates for which no-wash staining could effectively resolve marker-positive populations, despite lower signal-to-noise in a no-wash assay (Fig. 2C). With this approach, we optimized the MiniFlow flow cytometry panel for no-wash staining of our 12 markers of interest to identify leukemic blasts in patient samples in a 384-well plate format. We found that this panel accurately identified leukemic blasts and, with far fewer cells, performed similarly to standard panels with wash staining in 96-well plates (Fig. 2C-D).
Next, we found that our no-wash approach accurately and precisely quantified leukemic blasts in 384-well plate cultures. These quantifications were accurate and precise across a wide dynamic range of cell counts and with a lower limit of detection of just 200 blasts in 10 µL of cell suspension per well (Fig. 2E). These findings suggest that our method could be used in HTS of drug effects via flow cytometry.
Next, we used MiniFlow to screen drugs in primary samples from patients with AML. With our ex vivo DSS results (with the no-wash method), we saw differences in drug sensitivity across patient samples (Fig. 2F) and multiple drug classes (Fig. 2G) while using five times fewer cells per well than 96-well methods.
In separate work, we showed that MiniFlow accurately predicted responses to single-agent and combination therapies in AML and related myeloid malignancies [21,31]. These findings show that MiniFlow can extract clinically relevant results, supporting that MiniFlow may be an effective tool to discover therapeutics and produce translatable results.

ComboPooler: efficient combination pooling to explore drug combinations
Although MiniFlow increases the number of screenable conditions per sample by reducing the number of cells required, we also needed to increase capacity for combination screening. For example, with MiniFlow, more than 95 % of AML samples we receive can be screened with a library of 80 drugs that are FDA-approved or in clinical trials for AML. However, the pair-wise combination space of that library spans more than 3000 unique pairs. Testing all these combinations in AML samples, which are often too small for such testing, would be challenging, even with MiniFlow.

Fig. 1. ComboFlow is an automated platform for identifying drug combination synergies ex vivo. This platform consists of three modules (MiniFlow, ComboPooler, and AutoGater) that work in concert to screen many drug combinations ex vivo using size-limited primary samples from patients with hematologic malignancies. MiniFlow is a miniaturized assay for screening drugs in cell populations with flow cytometry. In the ComboFlow platform, ComboPooler designs MiniFlow screens that efficiently search the drug combination space for synergies of pairwise drug combinations in pools of five or more drugs using limited sample material. AutoGater is a classifier based on a deep neural network that automatically gates and rapidly analyzes large volumes of high-dimensional datasets from flow cytometry screening generated by MiniFlow.

K.A. Ali et al.
To explore a large combinatorial space, standard screening methods require large screening sizes. To reduce this size and increase the capacity for combination screening, we developed a strategy to screen pooled combinations and identify synergistic drug combinations. This strategy moved beyond individually testing drug pairs. Instead, we designed screens consisting of many pools in which each pool contained many drug combinations per well. Across all the pools in a screen, every possible pair of drugs occurs in at least one pool. This approach maximizes the number of drug combinations tested per pool while limiting the number of pools needed to search the drug combination space. For example, with pool sizes of five compounds per pool, creating 10 drug pairs per pool can theoretically reduce screening conditions by up to 10-fold (Fig. 3A).
To design pooled combination screens with optimal efficiency and enable HTS of drug combinations ex vivo, we created ComboPooler. This pooling algorithm efficiently constructs sets of pools that optimally span the drug combination space. We used ComboPooler to generate pool sets covering all possible pair-wise combinations of 80 compounds in pool sizes containing five compounds. ComboPooler generated a highly efficient pool set that covered the combination space with just 320 pools (Supp. Table 2). This number is only 2 % more pools than the optimal lower bound of 316 pools and a 9.9-fold smaller screen size than directly testing compound pairs (Fig. 3B). ComboPooler also outperformed existing pooling approaches used for combination screening in AML cell lines. For example, Lappin et al. required at least 457 pools, which is 44.6 % more pools than the optimal lower bound [35,36,41].
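The pool counts quoted above follow from simple counting, which the short sketch below reproduces. The lower bound used here is the counting bound (total pairs divided by pairs per pool, rounded up); whether a pool set meeting that bound actually exists for a given library is a separate question. The function name is hypothetical.

```python
from math import ceil, comb


def pooling_stats(n_compounds, pool_size, n_pools):
    """Counting-based summary of a pooled pair-wise combination screen."""
    total_pairs = comb(n_compounds, 2)
    pairs_per_pool = comb(pool_size, 2)
    return {
        "total_pairs": total_pairs,
        "pairs_per_pool": pairs_per_pool,
        "lower_bound_pools": ceil(total_pairs / pairs_per_pool),
        "fold_reduction": total_pairs / n_pools,
    }


stats = pooling_stats(n_compounds=80, pool_size=5, n_pools=320)
for name, value in stats.items():
    print(name, value)
```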
ComboPooler's increased efficiency was more apparent with a larger pool size. We found that for pool sizes of nine compounds, we could cover the pair-wise combination space of an 81-compound library with optimal efficiency using just 81 pools (Supp. Table 3). Although Lappin et al. covered a library of 80 compounds with pool sizes of 10 compounds, they still required nearly twice as many pools (160 pools) to cover all pairs [36]. Our approach was more efficient because we used roughly half as many pools to cover all pairs in an 81-compound library.

AutoGater: automated gating of pooled flow screens with supervised machine learning
Together, MiniFlow and ComboPooler efficiently design pool sets to explore the drug combination space and identify pools that may contain synergistic drug combinations. However, secondary screens are still needed to deconvolute and definitively determine which compounds in a pool hit are actually synergistic. This process requires follow-up screens, similar to those done for pooled screening approaches developed for cell lines. However, our workflow uses primary samples, which can only be reliably maintained in ex vivo culture for a short time. After just 5-7 days of culture, most AML samples quickly lose their viability or change due to differentiation of blasts or clonal selection [37]. Thus, after collecting data from the initial pooling screen, the data must be quickly analyzed to design and conduct the secondary screen. This analysis step must rapidly process large amounts of high-dimensional data at single-cell resolution across many drug conditions.
Fig. 2. MiniFlow uses acoustic dispensing and no-wash staining to scale drug screening on ex vivo samples with a flow-based assay. (A) Acoustic dispensing can be used to add antibody-staining solutions to 384-well plates for high-throughput flow cytometry assays. (B) To assess the accuracy of acoustic dispensing, various volumes of antibody-fluorophore conjugates were dispensed with the Echo into 384-well plates containing PBS, and fluorescence intensity was measured with a plate reader. Fluorescence intensities from a manually prepared standard curve were used to calculate the "observed dispense volume" for each acoustic dispense and were plotted against the expected dispense volumes. The correlation was calculated by the Pearson coefficient (r ≥ .991, P < .0001), and the line of identity is indicated by the dashed line. Precision is maintained until the dispense volume drops to ~10 nL (n = 28 technical replicates per stain per dispense volume). (C) Differences in staining signal-to-noise for identifying blast populations with MiniFlow (no-wash, low cell input) versus the standard approach (washed, scaled-up). With the standard approach, blasts (blue) stained positive for the blast marker CD34 and myeloid marker CD33, but negative for the T-cell marker CD3. These markers were resolved from patient T cells (red), which have the opposite marker profile. With MiniFlow staining, blasts can still be resolved, despite reduced separation in populations due to background signal. (D) MiniFlow's no-wash, acoustic staining accurately measures blast percentages across patient samples. The blast percentages estimated from 28 leukemia samples stained with the MiniFlow approach in 384-well plates (10,000 cells per well) match those from the standard approach in 96-well plates (50,000 cells per well). The correlation was calculated by the Pearson coefficient (r = .997, P < .0001), and the line of identity is indicated by the dotted line. (E) Blasts can be accurately counted in MiniFlow's small-volume microcultures in 384-well plates. Serial dilutions of blasts were detected across a large dynamic range in miniaturized culture (n = 16 replicates per sample per dilution). For each sample, the correlation was calculated by the Pearson coefficient (r ≥ .970, P < .0001), and the line of identity is indicated by the dotted line. The dynamic range for measuring blast counts with MiniFlow is wide, with precision dropping only at very low concentrations of blasts per well, as indicated by the vertical dashed line. (F-G) MiniFlow can screen drugs on primary samples from patients with acute myeloid leukemia and can profile differences in ex vivo DSS across patients. (F) The BCL2 inhibitor venetoclax was tested on 10 patient samples using MiniFlow, and a broad range of sensitivity was observed across samples. Error bars represent mean ± SD from n = 3 replicates. (G) Hierarchical clustering of area under the curve (AUC) values of MiniFlow dose responses for 10 drugs tested across 24 patients. Patients cluster into two groups, with one group showing relatively higher sensitivity to the broad-class chemotherapies.
To manually gate drug screens based on ex vivo flow cytometry with 12 markers and no wash step, an expert analyst needed several hours per screen. Also, because these assays were completely automated, readouts were frequently done late at night, when human analysts were not available to manually gate the data. This mistiming delayed follow-up screens by 12-24 hours. To reduce the time between completing the primary screen, gating the resulting flow data, and designing a follow-up screen from the results of the primary screen, we set out to automate the gating and analysis of the flow screens. We assessed other automated methods that used supervised learning to train classifiers to gate data from standard flow cytometry assays. We noticed that the methods were trained on datasets that are low throughput, use washes, have high signal-to-noise, and have many events per sample [42][43][44]. These automated methods also have suboptimal performance with test datasets and are sensitive to batch effects [45][46][47]. Our application is much more challenging because MiniFlow uses no-wash staining, which has lower signal-to-noise and leads to less well-resolved populations than standard washed staining methods. Furthermore, the flow data for our analysis come from a high-throughput drugged assay with large numbers of dead cells and debris, and far fewer events per condition, which further complicates clear resolution of populations.
To aid human analysts in gating, we incorporated fluorescence-minus-one (FMO) staining controls into MiniFlow for every patient drug screen. Although FMO staining controls are often used in traditional flow cytometry workflows, incorporating such controls in ex vivo flow cytometry screens can be challenging. Patient samples are limited, and including the many staining controls needed for a high-color flow cytometry panel can consume large amounts of precious patient sample. However, because MiniFlow uses small numbers of cells per well, we could easily include staining controls for all 12 markers in our flow cytometry panel for all screens.
We reasoned that by normalizing the MiniFlow data for each screen by the FMO staining control data, we could better control for batch effects, low signal-to-noise, and sample variability. Incorporating FMO normalization into the preprocessing of data for training a classifier would thereby support robust automated gating of MiniFlow screens. To test this reasoning, we trained classifiers with and without screen normalization of training data and validated the classifiers' performance on a separate validation dataset. We found that with screen normalization, the classifier reliably and accurately identified blasts across all screens with an average precision (AP) of 0.95 and had 10.6 % better performance than the classifier trained without normalization (Fig. 4A). We also saw that for most screens in the validation dataset (22/25), classifier performance was high even without normalization, with AP values greater than 0.9. For these screens, training with normalization only modestly improved classifier performance, with an average increase in AP of 0.058 (or a 5.8 % relative improvement). Conversely, for three validation screens, the classifier without normalization performed poorly, with AP values of 0.59, 0.69, and 0.73. However, incorporating normalization into the training improved the classifier AP for these screens to 0.86, 0.91, and 0.92, respectively, an average relative improvement of 34 %. These findings suggest that the increase in classifier performance with screen normalization is due to better compensation for batch effects between screens rather than a broad improvement in accuracy across all screens. With this high accuracy, AutoGater enabled instantaneous gating and automated analysis of MiniFlow screening data to produce screening analysis similar to that of human analysts (Fig. 4B).
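The two ideas in this paragraph, per-screen normalization against FMO controls and scoring a classifier by average precision, can be illustrated with a minimal sketch. The normalization rule below (dividing each channel by an upper percentile of its FMO control) is an assumption chosen for illustration; the text does not specify how the MiniFlow data were normalized, and the function names `fmo_normalize` and `average_precision` are hypothetical.

```python
from typing import Dict, List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def fmo_normalize(
    events: List[Dict[str, float]],
    fmo_controls: Dict[str, Sequence[float]],
    q: float = 99.0,
) -> List[Dict[str, float]]:
    """Rescale each channel so its FMO background cutoff maps to 1.0.

    ASSUMPTION: the cutoff is the q-th percentile of the channel's FMO
    control from the same screen; this is illustrative, not the paper's rule.
    """
    cutoffs = {ch: percentile(vals, q) for ch, vals in fmo_controls.items()}
    for ch, cutoff in cutoffs.items():
        if cutoff <= 0:
            raise ValueError(f"non-positive FMO cutoff for channel {ch}")
    return [{ch: ev[ch] / cutoffs[ch] for ch in cutoffs} for ev in events]


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("average precision is undefined without positives")
    return total / hits


if __name__ == "__main__":
    # Made-up intensities for two channels of one screen.
    controls = {"CD34": [10.0, 20.0, 30.0, 40.0, 50.0], "CD3": [1.0, 2.0, 3.0]}
    events = [{"CD34": 100.0, "CD3": 1.5}]
    print(fmo_normalize(events, controls, q=100.0))
    # Made-up blast labels (1 = blast) and classifier scores.
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Because every screen is rescaled against its own controls, a value above 1.0 means "brighter than background in this screen" regardless of day-to-day shifts in staining intensity, which is one way such normalization could reduce batch effects.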

ComboFlow - identifying synergistic drug combinations for AML with ex vivo screening
After developing MiniFlow, ComboPooler, and AutoGater, we integrated these components to create the ComboFlow system. We then used this system to identify potentially clinically actionable synergistic drug combinations in samples from patients with AML. We designed a screen to explore the combination space among all 3160 pairs from a library of 80 drugs. These drugs had safety profiles that were previously tested in clinical trials for treating hematologic malignancies [21]. To validate our approach, we tested ComboFlow with 20 primary AML samples. The system quickly identified the known synergistic combination of bortezomib and panobinostat [48,49] (Fig. 5). Interestingly, with our approach, the bortezomib and panobinostat combination seemed broadly toxic because it killed non-leukemic white blood cells in the samples. This finding aligns with the narrow therapeutic window of that combination therapy for patients with multiple myeloma, who commonly experience toxicity and high-grade leukopenia due to this treatment [50,51]. More importantly, this finding shows that the ComboFlow approach could reveal not only valuable synergistic combinations, but also combinations with adverse effects due to their broad, non-specific toxicity.
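The size of the search space quoted above follows from simple counting, sketched below. The lower bound on pools is a generic counting argument (a pool of k drugs contains at most k(k-1)/2 pairs), not the ComboPooler design algorithm, and it ignores replicates, single-agent controls, and deconvolution screens; the function names are hypothetical.

```python
import math


def pair_count(n_drugs: int) -> int:
    """Number of unordered drug pairs in a library of n_drugs compounds."""
    return math.comb(n_drugs, 2)


def min_pools(n_drugs: int, pool_size: int) -> int:
    """Counting lower bound on pools needed so every pair shares a pool.

    Each pool of pool_size drugs covers at most C(pool_size, 2) pairs,
    so at least C(n, 2) / C(pool_size, 2) pools are required.
    """
    return math.ceil(pair_count(n_drugs) / math.comb(pool_size, 2))


if __name__ == "__main__":
    print(pair_count(80))       # pairs in the 80-drug library
    print(min_pools(80, 5))     # idealized number of 5-drug pools
    print(pair_count(10_000))   # pairs in a ~10,000-compound library
```

For 80 drugs this gives 3160 pairs, and pools of five drugs cover 10 pairs each, so an idealized design needs at least 316 pools. That is consistent with the roughly 10-fold reduction in wells reported for ComboPooler in Fig. 3.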
ComboFlow also identified a new synergistic combination of agents: dactinomycin combined with fludarabine (hereafter called ND475) (Fig. 5). ND475 was synergistic in a subset (35 %) of patient samples (Fig. 6A-B) and had much higher specificity for leukemic blast populations than for other white blood cells (Figs. 5, 6C). We reasoned that this restricted activity was desirable for a synergistic combination for clinical treatment because frequent synergy across patients and cell types was more likely to indicate combinations that were broadly myelosuppressive and very toxic. Notably, ND475 synergistically killed leukemic cells that were otherwise insensitive to either drug alone (Fig. 6A-B). We characterized ND475 with isobologram analysis, which revealed that ND475 showed ex vivo synergy at various concentrations of the two agents (Fig. 6D). For example, in two patient samples, the dose-response curves for fludarabine were shallow, with many blasts still alive at even the highest dose of fludarabine. In both samples, low-dose dactinomycin alone did not reduce blast counts. However, when the drugs were combined, low-dose dactinomycin markedly enhanced the depth of the fludarabine dose response, and ND475 synergistically killed the leukemic blasts (Fig. 6D). We also tested ND475 in healthy bone marrow samples. In contrast to AML blasts, we found that ND475 had limited effects in hematopoietic progenitors and no synergy (Fig. 6C-D). These findings suggest that a new therapeutic approach could combine low-concentration dactinomycin with fludarabine therapies to maximize synergistic killing of leukemic blasts while minimizing toxicity to healthy white blood cells.
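The synergy measures referred to here and in the captions of Figs. 5 and 6 can be illustrated with a short sketch. It assumes the commonly used definitions: Bliss independence predicts a combined fractional kill of f_A + f_B - f_A·f_B, and the coefficient of drug interaction is CDI = V_AB / (V_A × V_B), where each V is viability relative to the untreated control and CDI < 1 indicates synergy. The exact formulas used in the study are not stated in this section, and the numbers below are invented for illustration.

```python
def bliss_expected_kill(kill_a: float, kill_b: float) -> float:
    """Fractional kill expected if the two drugs act independently (Bliss)."""
    return kill_a + kill_b - kill_a * kill_b


def bliss_excess(kill_a: float, kill_b: float, kill_ab: float) -> float:
    """Observed minus expected fractional kill; positive values suggest synergy."""
    return kill_ab - bliss_expected_kill(kill_a, kill_b)


def cdi(viab_a: float, viab_b: float, viab_ab: float) -> float:
    """Coefficient of drug interaction from viabilities relative to control.

    ASSUMPTION: standard definition CDI = V_AB / (V_A * V_B); values below 1
    suggest synergy, about 1 additivity, and above 1 antagonism.
    """
    return viab_ab / (viab_a * viab_b)


if __name__ == "__main__":
    # Invented example: each drug alone kills few blasts, the pair kills many.
    kill_a, kill_b, kill_ab = 0.20, 0.30, 0.70
    print(bliss_expected_kill(kill_a, kill_b))
    print(bliss_excess(kill_a, kill_b, kill_ab))
    print(cdi(1 - kill_a, 1 - kill_b, 1 - kill_ab))
```

The two measures are closely related: the Bliss-expected viability is V_A × V_B, so a CDI below 1 and a positive Bliss excess both indicate that fewer cells survived than independence would predict.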
In this study, we showed that ComboFlow can efficiently search the drug combination space for individual patients with AML. With this finding, we can envision using ComboFlow to support personalized medicine approaches for combination therapy. For example, fludarabine is routinely used as part of the FLAG therapy regimen (fludarabine, cytarabine, and filgrastim) to treat relapsed or refractory AML [52]. FLAG treatment is often intensified by adding idarubicin, mitoxantrone, gemtuzumab-ozogamicin, or venetoclax [52,53]. However, these combinations are often highly toxic due to their myelosuppression. Also, we do not know which of these combinations is optimal for each patient, or if alternative combinations may be more effective [52]. With ComboFlow, we found that the combination of fludarabine and dactinomycin is synergistic in a subset of AML samples and that the synergy specifically kills leukemic cells while sparing healthy white blood cells. These data suggest that adding low-dose dactinomycin to FLAG may be an effective way to improve FLAG therapy for certain patients. To identify the patients who would most benefit from this combined therapy, we could take a personalized medicine approach with ex vivo DSS. This approach could be done by using MiniFlow to determine if ND475 is synergistic ex vivo for each patient. Furthermore, because MiniFlow is fast and fully automated, these tests could be rapidly completed upon receipt of a patient sample, and the data could be quickly returned to the clinic. This approach to patient selection and personalized medicine could maximize the likelihood that each patient will benefit from a particular combination. However, it is essential to underscore that while these findings open avenues for personalized medicine, the data generated require further validation, especially when contemplating their integration into human therapy decisions. Such validation would be instrumental in ensuring the safety and efficacy of proposed treatments.

Limitations of ComboFlow and synergy for combination therapy development
Although ComboFlow enables unprecedented ex vivo explorations of the drug combination space, the approach has several limitations.First, our assay is short, which restricts us to measuring the effects of drugs that act relatively quickly, within a couple of days.However, drugs like chromatin remodelers have epigenetic mechanisms of action, and these drugs may need extended incubations of over 1 week before their effects emerge [54].Thus, to observe synergy with slower-acting agents, we might need longer-term cultures.In this study, we limited the MiniFlow assay to short-term incubations because it can be challenging to culture patient samples for extended periods.However, the MiniFlow assay could be improved to incorporate stromal support, which would allow samples to be cultured for extended times and enable us to screen for longer-term activity of synergistic drugs [54].
To address drugs with shorter in vivo exposures on the order of hours, the ComboFlow assay could be adapted to a two-step dosing protocol. This would involve an initial exposure to short-exposure agents, followed by a wash step and subsequent addition of longer-exposure agents. In our development of MiniFlow, we observed that wash steps introduce high variability in cell counts at readout for AML samples cultured in 384-well plates, especially 48-72 hours post drug application in conditions with significant cell death. As such, we designed MiniFlow to be a no-wash assay. However, this does not preclude incorporating a wash step shortly after dosing into MiniFlow, since we observed in our initial testing that wells with high viability, and thus minimal cell death and debris, are less affected by wash steps, demonstrating little variability in cell counts. Consequently, since short-term exposure to drugs at clinically relevant concentrations is unlikely to cause a substantial drop in viability that is observable within just a few hours after dosing, we expect that incorporating a wash step early [...]
Another limitation is that ComboFlow searches specifically for synergistic drug combinations. However, other effective combination strategies do not rely on synergy. For example, Palmer et al. found that synergy is not needed to explain why many approved combination therapies for cancer have clinical activity [55,56]. Instead, these combinations are effective because they have additive effects or independent action that can address key clinical challenges of intra- and inter-tumor heterogeneity [57,58]. This limitation could potentially be addressed by incorporating into ComboFlow alternative combination efficacy metrics, such as the "drug combination sensitivity score" introduced by Malyutina et al., which assesses overall combination sensitivity in addition to synergy [59]. For those combination strategies, we believe that ex vivo DSS with flow cytometry is particularly useful because the approach can resolve differences in drug sensitivity across patients as well as within a heterogeneous tumor. In this study, we used flow cytometry to identify how drugs specifically affect the whole population of leukemic cells for each patient sample. But with additional analysis or more sophisticated staining panels, we could use MiniFlow to measure differences in drug sensitivity among leukemic subpopulations within each sample. With this approach, MiniFlow could be used to better identify drugs with additive or independent effects, and to facilitate the discovery of effective combinations independent of synergy.

Conclusions
In conclusion, we created the automated ComboFlow system to explore the massive landscape of drug combinations with a high-dimensional screening assay in ex vivo models. This low-cost, translatable approach overcomes size constraints of using primary samples with limited cells and makes combination drug screening feasible ex vivo. We used this system to identify a highly synergistic drug combination (dactinomycin with fludarabine) whose components have been used separately to treat AML for years but have not been reported as synergistic or tested in drug combination trials. And because ComboFlow identifies synergy rapidly and for individual patients, the approach could also be used for personalized medicine. We envision that at scale, the ComboFlow platform can explore even larger landscapes of drug combinations across many cancer indications. In this way, ComboFlow could uncover combination therapeutics with high efficacy in targeted patient populations and greatly increase the screenable landscape for therapeutic development in oncology.

Fig. 1. ComboFlow is an automated platform for identifying drug combination synergies ex vivo. This platform consists of three modules (MiniFlow, ComboPooler, and AutoGater) that work in concert to screen many drug combinations ex vivo using size-limited primary samples from patients with hematologic malignancies. MiniFlow is a miniaturized assay for screening drugs in cell populations with flow cytometry. In the ComboFlow platform, ComboPooler designs MiniFlow screens that efficiently search the drug combination space for synergies of pairwise drug combinations in pools of five or more drugs using limited sample material. AutoGater is a classifier based on a deep neural network that automatically gates and rapidly analyzes large volumes of high-dimensional datasets from flow cytometry screening generated by MiniFlow.

Fig. 3. The ComboPooler strategy for efficiently screening pooled drug combinations. (A) Drug pooling can theoretically reduce the number of assay wells or tests needed to screen a compound library for synergistic drug pairs. (B) ComboPooler efficiently designs pooled drug-combination screens and reduces the number of wells needed to screen the pairwise drug combination space for a given compound library by approximately 10-fold across 5 drug pools.

Fig. 4. AutoGater classifies blasts and enables automated gating and analysis of MiniFlow data. (A) AutoGater was trained on MiniFlow screens with and without per-screen normalization of fluorescence channels. Using the validation screen dataset, blast classification was better when AutoGater used per-screen normalization. (B) In the validation screens, drug sensitivity (blast score) was similar between the automated analysis of MiniFlow data with AutoGater and the manual calculations by human analysts. The correlation was calculated by the Pearson coefficient (r = .995, P < .0001), and the line of identity is indicated by the red dashed line.

Fig. 5. ComboFlow screening identified known and new synergistic combinations. Representative examples of deconvoluted hits from pooled synergy screens were plotted to highlight which combinations were synergistic (low coefficient of drug interaction (CDI)) and specifically killed blasts but spared non-leukemic white blood cells. Each point represents a drug pair's CDI, blast score, and white blood cell score averaged across all screened AML patient samples in which non-leukemic white blood cells constituted at least 15 % of all cells in the untreated sample. This inclusion criterion was adopted to allow us to simultaneously assess drug effects across different cell populations within a patient sample and visualize the specificity of each compound pair toward leukemic cells over normal white blood cells. Known synergies were identified, including proteasome inhibitors in combination with the histone deacetylase inhibitor panobinostat. However, that combination had activity in normal white blood cells as well, consistent with its narrow therapeutic window in the clinic. ComboFlow also identified the novel synergy of dactinomycin in combination with fludarabine, which was found to be highly specific to blasts with little activity in normal white blood cells. The concentrations of each drug within the tested pairs are provided in Supplementary Table 1.

Fig. 6. Dactinomycin and fludarabine appear synergistic in ComboFlow screens across a subset of samples from patients with AML. (A) Several samples toward the left of the plot were relatively resistant to dactinomycin or fludarabine alone but highly sensitive to their combination. Error bars represent mean ± SD from n = 3 replicates. (B) Across samples from 16 patients with AML, dactinomycin and fludarabine are synergistic. The observed fractional killing was greater than would be expected if the drug combination had shown Bliss independence (Wilcoxon matched-pairs signed rank test; P < .05). (C) Boxplot of the synergy (excess blast killing over Bliss independence) of dactinomycin and fludarabine across different cell populations in the MiniFlow assay. Synergy was not seen with all cells in an AML sample in bulk, but was apparent when focusing on the leukemic blast population. In contrast, the combination does not appear synergistic in other cell populations, such as lymphocytes or hematopoietic progenitors from bone marrow of healthy donors. In each boxplot, the median is indicated by the line, the bottom and top edges of the box correspond to the 1st and 3rd quartiles, and the whiskers denote the minimum and maximum values from 8 patient samples. (D) Sensitivity to fludarabine in dose-response testing was enhanced by a low dose of dactinomycin in AML samples. In contrast, the combination has little effect in hematopoietic progenitors from healthy bone marrow donors. Error bars represent mean ± SD from n = 3 replicates.