1 Introduction
There are inherent limitations of clinical trials (RCTs) done before a medication is authorised in terms of cost, surveillance time, size and a lack of diversity of patients included (e.g. that pregnant women, children and elderly often are not included) (Sanson-Fisher et al. 2007). Another issue is that RCTs are limited in their ability to detect rare adverse drug reactions (ADRs) and potential drug interactions (DDIs) since they usually only include a few thousand participants at most. Moreover, the concurrent use of multiple medications is often an exclusion criterion in RCTs, leaving gaps in the assessment of polypharmacy risks. As a result, tested drug combinations tend to be limited to those where physicians and pharmacologists have existing clinical experience or hypotheses about potential interactions (Heijden et al. 2002). This means that monitoring of the safety of medications and the potential for DDIs after marketing is essential. This monitoring is often referred to as post-authorization pharmacovigilance.
Elderly individuals are commonly prescribed multiple medications due to the increasing prevalence of comorbidities associated with aging. One study suggested that patients aged 75 or older in Austria took on average 7.5 drugs (Schuler et al. 2008). Among this demographic, 10\% of hospitalizations were attributed to ADRs, and 18.7\% of these ADRs were caused by a combination of more than one drug, a drug-drug interaction. More broadly, a meta-analysis attributed approximately 22.2\% of hospitalizations caused by adverse drug reactions to DDIs (Dechanont et al. 2014).
Disproportionality analysis (DA) encompasses widely used methods in pharmacovigilance to detect ADRs by assessing the frequency of adverse events (AE) associated with specific drug consumption relative to what would be expected. Most established disproportionality methods like the proportional reporting ratio (PRR) (Evans et al. 2001), reporting odds ratio (ROR) (Puijenbroek et al. 2002), Bayesian Confidence Propagation Neural Network (BCPNN) (Bate et al. 1998), and the Gamma Poisson Shrinker (GPS) (DuMouchel 1999) allow for the identification of signals that warrant further investigation in the context of single drug assessment. The tree-based scan statistic (Kulldorff et al. 2003; Heo et al. 2024) takes advantage of the tree structure of the Anatomical Therapeutic Chemical (ATC) drug classification in order to allow to select a drug family rather than a single drug, a family corresponding to a subtree of the ATC tree. Likelihood-ratios tests then allow to choose the most relevant family.
In order not to overinterpret high proportions in case of rare drug cocktails, it is possible to derive a score from an independence test on the contingency table counting the presence or absence of the adverse event depending on the consumption of the drug cocktail. (Gosho et al. 2017) use a chi-square statistic to do so, while (Ahmed et al. 2010) rather use the Fisher exact test which has the advantage of being non-asymptotic.
Fewer DA methods are available to use for multiple drug consumption, such as the \Omega shrinkage method (Norén et al. 2008). This method allows the use of a DA measure in order to detect sets of the type Drug-Drug-Adverse Event and stands as the standard measure when it comes to detecting the interactions of two drugs. Adaptations of the PRR for a single drug has been proposed like the Concomitant Signal Score (CSS) (Noguchi et al. 2020) and the PRR adaptation for DDIs (Wang et al. 2020).
Other methods than DA measures exist in order to detect DDIs. Among them, well known methods encompass logistic models (Van Puijenbroek et al. 2000, 1999) and association rules methods (Noguchi et al. 2018; Ibrahim et al. 2016). One can refer to multiple reviews for a more detailed overview of existing methods (Ibrahim et al. 2021; Hauben 2023).
Computational pharmacovigilance methods are typically applied to Spontaneous Reporting Systems (SRS), which aggregate Individual Case Safety Reports (ICSRs)—real-world data submitted by healthcare professionals, manufacturers, and patients. Such reports contain information about intake of medications for each patient as well as their experienced AE. While SRS databases, such as the FDA Adverse Event Reporting System (FAERS), provide the scale necessary to detect rare events, they possess inherent limitations. These include under-reporting, where only a fraction of adverse events are captured (Bate and Evans 2009), and various reporting biases, such as the notoriety bias (Pariente et al. 2007). Crucially, the spontaneity of these reports often results in an incomplete recording of all drugs taken by a patient. Furthermore, because the exact timing of drug administration is frequently omitted, it is difficult to distinguish between simultaneous intake and medications taken at different times within the same reporting window. This temporal ambiguity potentially limits the precision of interaction analyses, as the observed cocktail may represent a sequence of exposures rather than a concurrent pharmacological event. Consequently, because the total number of patients exposed to a drug is unknown, these databases cannot provide true incidence rates, necessitating the use of disproportionality analysis to identify safety signals (Almenoff et al. 2005).
While previous work showed that further practical experiences should be of interest for high-order drug interaction testing (Tekin et al. 2018), it is hard to know which exact combination one should test and easy to miss an important combination in the overwhelming space of high-order drug cocktails. The present article proposes a novel method to identify drug cocktails that are overrepresented in adverse event reports, which constitute candidates for further interaction assessment. It requires the choice of a scoring function able to measure, for a given AE, the importance of the disproportionality for any medication set, including single medications. It then takes advantage of the tree structure of the ATC classification to explore the space of medication sets in two ways: a genetic algorithm looking for the sets of high score, and a MCMC algorithm learning the distribution of the scores for a given cocktail size in order to evaluate how extreme a high score is. A post-treatment step using penalized logistic regression then allows to assess whether an identified cocktail reflects a true pharmacological interaction or merely the combined effect of its components. The approach is also original with respect to the fact that a medication denotes either a drug or a drug family in the sense of an internal node of the ATC classification.
The developed algorithms, available in the corresponding R package emcAdr (Bangard 2025), aim to guide pharmacology researchers toward drug combinations that might correspond to drug interactions and require more rigorous monitoring.
2 Methods
2.1 ATC Tree and Cocktail Definition
Multiple classifications of drug active ingredients exist, one being the ATC classification. This system is structured hierarchically and can be represented as a tree with five levels containing, at the time of this study, 6809 nodes. The leaves of this tree are the active ingredients, while the first level consists of nodes representing organs or systems affected by the descendant active ingredients. The remaining nodes correspond to therapeutic or pharmacological families.
By applying a Depth-First Search algorithm to enumerate each node of the ATC tree, we can define a drug cocktail as a set of integers corresponding to specific nodes in the tree. More formally, a cocktail C of size k\geq 1 is defined by C = (x_1, \dots, x_k) \in \Delta^{k} where k \in \mathbb{N} , and \Delta represents the set of numbered nodes of the ATC tree T. Figure 1 shows examples in a simplified tree, the green nodes denoting the considered cocktail. Note that for convenience, we also refer to single drugs as cocktails.
Considered cocktails can include internal nodes of the tree (thus representing families of active ingredients), allowing for the detection of more general signals. For example, paracetamol might send a weak signal, while if we move up the tree, analgesics as a whole could represent a stronger and more general signal. Therefore, all patients taking at least one drug from this drug family will be considered in the computation of the risk. That notion is equivalent to the notion of cut in the tree-scan method (Kulldorff et al. 2003).
2.2 Cocktail Risk Characterization
In the field of pharmacology, accurately characterizing the risk associated with drug administration is a complex task. The aim of the developed method is to search the space of cocktails to maximize a score indicating if individuals taking a given drug combination have a higher risk of an AE. While the algorithm is designed to be flexible and can be run with any scoring function, we focus on a comparison of established metrics to justify our selection (see Section 3.1).
2.2.1 Scoring Functions in Pharmacovigilance
To evaluate the disproportionality of drug cocktails presence for a particular AE, several statistical measures have been proposed in the literature:
- Proportional Reporting Ratio (RR)
- A standard measure defined as the ratio of the probability of an AE occurring in the group exposed to the cocktail versus the non-exposed group. Its application in signal generation from spontaneous reports was notably discussed by (Evans et al. 2001).
- PRR for Drug-Drug Interactions (PRR)
- Unlike the single-drug PRR, this adaptation for drug combinations evaluates whether a combination represents a synergistic risk. Specifically, as discussed by (Wang et al. 2020), a signal is often defined by comparing the lower bound of the confidence interval of the combination’s risk to the maximum risk observed for the individual drug components.
- \Omega Shrinkage
- A Bayesian measure specifically developed for drug-drug-event combinations. As proposed by (Norén et al. 2008), it utilizes a “shrinkage” approach to reduce false positive signals in cases with limited data by pulling the observed association toward the value expected under independence.
- Concomitant Signal Score (CSS)
- Introduced as an improved detection criterion, this metric aims to enhance the detection of DDI signals by building upon the PRR framework, as detailed by (Noguchi et al. 2020).
- Hypergeometric Disproportionality Score
- While the methodology can accommodate various scores, we propose the use of a score derived from the Fisher exact test. As highlighted by (Ahmed et al. 2010), this approach has the advantage of being non-asymptotic. Consider a dataset of N patients, among which K experience the adverse event AE. Let n_C be the number of people taking a cocktail C and x the number of patients taking C and experiencing AE. We define the risk to be:
H(C) = -\log(\mathbb{P}(X \ge x))
where X \sim \mathcal{H}(n_C, K, N). This measure effectively functions as the log p-value under the null hypothesis that the number of people with AE in a uniform sample of n_C out of N people follows the hypergeometric distribution. It has the distinct advantage of accounting for the total number of patients exposed to the cocktail; for instance, H(C) will assign a higher risk to a cocktail taken by 100 patients with 50 AE cases than to a cocktail taken by 10 patients with 5 AE cases, whereas simple ratios like the RR might treat them identically. Such hypergeometric measures are well-established in other high-dimensional fields like bioinformatics for functional enrichment analysis (Grossmann et al. 2007).
2.3 Distinguishing Combined Risk from True Interaction
While the H(C) score effectively identifies cocktails associated with a significant increase in the AE frequency, it does not inherently distinguish a synergistic pharmacological interaction from situations corresponding to the independent addition of individual effects or an effect driven by a subset of the cocktail (e.g., “innocent” medications co-prescribed with a high-risk drug).
To address this, we propose a post-treatment of detected signals using Firth’s penalized likelihood logistic regression (Firth 1993). This method is particularly suited for spontaneous reporting systems as it reduces bias in maximum likelihood estimates and provides a solution to the problem of separation, which often occurs with rare adverse events. Using the logistf R package (Heinze et al. 2023), we estimate a model for a given cocktail C (e.g., d_1, d_2) as follows:
\text{logit}(P(AE=1)) = \beta_0 + \beta_1 \mathbb{1}_{d_1} + \beta_2 \mathbb{1}_{d_2} + \beta_{3} \mathbb{1}_{d_1 \cap d_2}
An interaction is confirmed if the interaction coefficient \beta_{3} is positive and statistically significant. This indicates that the risk associated with the co-prescription exceeds the additive effect of the individual components on the logit scale. While the example above illustrates the case of a size-two cocktail for simplicity, this post-treatment generalizes naturally to cocktails of any size k by including all 2^k - 1 interaction terms in the model, or by testing nested models comparing the full cocktail to its sub-cocktails. Users may implement the post-treatment of their choice.
2.4 Disproportionality Identification
2.4.1 High-risk Drug Cocktails Identification
The number of possible cocktails is 2^L, where L denotes the number of nodes in the ATC tree, and even for a given cocktail size k, the number of possibilities is still {L}\choose{k}. It is therefore not possible to compute H(C) for each possible cocktail C included in the dataset. Instead, to explore the space of cocktails and locate those at high-risk of AE, we use a genetic algorithm (Pétrowski and Ben-Hamida 2017). It simulates the evolution of a population of cocktails according to the principle of natural selection, in order to search for the most performing individuals with respect to an evaluation criterion based on H.
The steps are the following, the algorithm repeating steps 2 and 3 until a user-defined number of iterations is reached.
- Initialization.
-
The genetic algorithm’s population consists of m cocktails. These cocktails are randomly initialized and can vary in size.
- Evaluation and selection.
-
At iteration n, the current population P_n undergoes an evaluation and selection phase. The evaluation computes the score H(C) for each cocktail C in the population. A new population Q_{n+1} of the same size is drawn by sampling m times a pair of cocktails in the original population and copying the one with highest score. Note that this step performs selection as the expectation of the number of copies in Q_{n+1} of a cocktail from P_n is an increasing function of its score. A penalization is however applied to avoid a uniformization of the population, as further explained in Section 2.4.2.
- Stochastic modification.
-
Mimicking the genetic drift in nature, stochastic modifications occur in the population in order to explore the large space of cocktails. First, a crossover operation allows two cocktails to exchange information. Here, the crossover involves exchanging sub-trees between two cocktails as follows:
-
- an internal node v of the ATC tree is randomly selected;
-
- the nodes of the subtree rooted at v are exchanged between the two sequences.
-
After performing the crossover, a mutation is applied to the resulting individuals, chosen from two types. The first type is a local mutation which changes a randomly selected node of the cocktail to one of its free neighboring nodes. This mutation is further explained in Section 2.5. The second type is an addition/deletion mutation which operates as follows, with k being the cocktail length and \alpha a chosen hyperparameter:
-
- with probability \min(1,\frac{\alpha}{k}), a node uniformly drawn from \Delta is added to the sequence;
-
- with probability 1 - \min(1,\frac{\alpha}{k}), a uniformly drawn node from the cocktail is removed.
-
An example of crossover and mutations can be seen on Figure 1 (a), (c) and (d).
-
Applying those modifications to Q_{n+1} yields a population P_{n+1} which is used to loop at step n + 1.
2.4.2 Distance Between Drug Cocktails and Cocktail Penalization
If at some point of the algorithm, an individual of P_n has a high score compared to other individuals, the genetic algorithm may converge to a population consisting entirely of that cocktail. To avoid this uniformization phenomenon, similar cocktails are penalized in the evaluation phase as follows,
H_{pen}(C) = \frac{H(C)}{\sum_{C_i \in \mathcal{C}} \text{Sim}(C,C_i) }
The computation of the similarity \text{Sim}(C,C') is based on a distance inspired by the Levenshtein distance (Levenshtein et al. 1966). However, unlike the traditional Levenshtein distance, sequences are treated as unordered sets. For two drug cocktails, C_1 and C_2, of sizes n_1 and n_2, the distance d(C_1 , C_2) is defined as the minimal cost required to transform C_1 into C_2 using three elementary operations.
- \text{Ins}_a(C) consists of adding a to the cocktail C.
- \text{Del}_a(C) consists of deleting a from the cocktail C.
- \text{Sub}_{a,b}(C) consists of substituting a \in C by b.
An associated cost is defined for each operation.
- Substitution.
-
The cost associated with the substitution operation is chosen to be consistent with the conceptual similarity of cocktails. If a is a drug belonging to C, the cost should increase as the drug b diverges further from drug a. For example, if a is a drug, and b a drug family that contains a, the cost should be moderate. Conversely, if b is a drug family not containing a the cost should be higher. This distance is thus defined by the maximal distance between a and b to their Lowest Common Ancestor.
- Insertion, Deletion.
-
The deletion and insertion cost are chosen as \frac{depth(T)}{2}. This choice implies that a substitution always costs less than a deletion followed by an insertion. The latter are used only when the two cocktails do not have the same length.
A transformation f from C_1 to C_2 is a composition of elementary operations that go from C_1 to C_2. The associated cost \text{cost}(f) is defined as the sum of the cost of the operations used in f. Finally, d(C_1,C_2) = \min\limits_{f} \text{cost}(f : f(C_1) = C_2 ).
Finally, the maximum distance between C_1 and C_2 being (n_1 + n_2) \frac{depth(T)}{2}, we define the similarity as Sim(C_1,C_2) = 1 - \frac{2D(C_1,C_2)}{(n_1 + n_2)depth(T)}
The computation of the similarity is achievable in \mathcal{O}(n_1 \times n_2 \times \text{depth}(T) + |\Delta|) operations in the worst case. The algorithm to compute the similarity is detailed in the Section 5.1 (see Algorithm 1).
2.4.3 Output Clustering
Despite the diversity mechanisms integrated into the genetic algorithm, not all drug cocktails retrieved within the genetic algorithm population are unique. It is moreover common to encounter solutions that are merely variations of others, differing only by transformations such as changing a node in the ATC tree to its parent or child. To streamline analysis and enhance the efficiency, a post-treatment clustering of similar drug cocktails is implemented. This method allows to focus on the most risky cocktails within each cluster or to interpret pharmaceutically clusters rather than individual cocktails.
To do so, drug cocktails are embedded into a two-dimensional space using the UMAP algorithm (McInnes et al. 2018), which aims to preserve similarity in the latent space. This representation enables the effective use of conventional machine learning clustering algorithms in \mathbb{R}^2. Specifically, the DBSCAN algorithm (Ester et al. 1996) is applied to identify clusters of similar drug cocktails with an intuitive way of choosing hyperparameters.
2.5 Approximate P-Value Assignment for Drug Cocktails
Once a list of cocktails with high score has been established by the genetic algorithm, an important step of the analysis is to assign them p-values to decide if they are significant.
Consider a cocktail C of size k in the list and its score S_{obs}. Denote by H_0 the null hypothesis according to which a cocktail does not favor the AE, and by N_0 the number of cocktails of size k for which H_0 is the truth. Similarly, H_1 represents the alternative hypothesis of a cocktail favoring the AE and N_1 is the number of such cocktails of size k. The p-value corresponding to S_{obs} is then \mathbb{P}_{H_0}(S>S_{obs}), but cannot be estimated without a modelling hypothesis under H_0.
However, if \mathbb{P}_M (S) refers to the marginal distribution, that is the score distribution for the cocktails both in H_0 and H_1,
\mathbb{P}_M(S>S_{obs}) = \mathbb{P}_{H_0}(S>S_{obs}) \frac{N_0}{N_0+N_1} + \mathbb{P}_{H_1}(S>S_{obs}) \frac{N_1}{N_0+N_1} so that
\frac{N_0+N_1}{N_0}\mathbb{P}_M(S>S_{obs}) - \frac{N_1}{N_0} \mathbb{P}_{H_1}(S>S_{obs}) \leq \mathbb{P}_{H_0}(S>S_{obs}) \leq \frac{N_0+N_1}{N_0}\mathbb{P}_M(S>S_{obs}) Under the reasonable assumption N_1 << N_0, the probability \mathbb{P}_M(S>S_{obs}) on the marginal distribution can thus be used as an approximation of the p-value. Note that the upper bound yields that, under the weak hypothesis that less than 10\% of the cocktails of size k favor the AE, the real p-value is at most equal to 1.11 times its approximation. Moreover, the approximated p-value can be estimated by computing the score of cocktails sampled in the general cocktail population. The proposed method therefore focuses on a sampling scheme in the general population and uses the approximated p-value to declare significance.
A naive sampling of cocktails considers almost only cocktails taken by no patient in the dataset. A Metropolis-Hastings MCMC algorithm is thus considered, as it can be used by conditioning on the fact that all visited cocktails are present in the dataset (Au and Beck 2001).
To employ such an algorithm, it is necessary to define a state space \mathcal{C} = \{C_{1}, \dots, C_{p}\}, a computable target measure f(C_{i}), and conditional laws q(.|C_{i}) under which simulation is possible and new states can be proposed.
- State set.
-
The state set is made of all cocktails of k drugs for a fixed k.
- Proposal law.
-
The proposal law is defined as a mixture of two mutation laws of the current cocktail. They operate as follows:
-
- Random mutation consists of a completely random movement in the cocktail space.
-
- Local mutation involves a local movement relative to the structure of the drug tree. Here, a node x_p of the state C_i is changed to one of its free neighboring nodes.
At each iteration, the random and local mutations have probability p_R and 1-p_R, where p_R is a hyperparameter.
Figure 1 (b, c) presents examples of a random and a local mutation.
- State evaluation.
-
The evaluation of a drug cocktail is based on the score H(C). The chosen target measure is then: f_{T}(C_i) = \frac{1}{Z(T)} \times {e^{\frac{H(C_i)}{T}}}
-
where Z(T) = \sum_{C} e^{\frac{H(C)}{T}}. T is a parameter known as temperature, which modulates space exploration by more readily accepting cocktails with moderate scores (high T) or, conversely, by strongly favoring combinations of drugs with high scores (low T).
-
The acceptance probability of cocktail C_{i+1} from cocktail C_{i} is given by:
\min\left(1,\frac{f_T(C_{i+1})}{f_T(C_{i})} \times \frac{q(C_{i}|C_{i+1})}{q(C_{i+1}|C_{i})}\right)
The theory related to the Metropolis-Hastings algorithm (Robert and Casella 2004) ensures that the empirical distribution of f_T(C_i) for the constructed cocktail chain converges to the distribution of f_T(C). A very long realization of such a walk, therefore, allows for the approximation of the distribution that can be used to determine an empirical p-value for the score of a cocktail of interest. This enables the possibility to say whether or not a high-risk is truly significant (e.g. among the top 5% of scores). It defines what a high-risk is in our case.
2.6 Datasets
2.6.1 Simulated Data
Multiple datasets were simulated to evaluate the method performance against known outcomes. The datasets, designed to challenge the algorithm, simulate various patient scenarios. Each patient record includes prescribed medications and the corresponding occurrence of an adverse event AE.
The first dataset (D_1) is composed of 200,000 patients, and has the following characteristics:
- 1\% of the patients take a size-3 drug cocktail C_1 and have a \frac{1}{100} chance of having AE.
- 1\% take a size-3 drug cocktail C_2 and have a \frac{1}{200} chance of having AE.
- 1\% take a size-2 drug cocktail C_3 and have a \frac{1}{100} chance of having AE.
- 1\% take a size-2 drug cocktail C_4 and have a \frac{1}{200} chance of having AE.
A small percentage of the dataset (1.5 \% per combination) are combinations of 2 out of the 3 drugs from C_1 and C_2, but with no risk of AE. This helps to mitigate the false identification of sub-cocktails of C_1 and C_2 as high-risk cocktail because those who take two drugs of these cocktails will almost surely take the remaining drug of the cocktail.
The remaining 87\% of the dataset consists of patients assigned random cocktails drawn uniformly. The size s of each cocktail is drawn according to a Poisson distribution with \lambda = 4 (mean size of drugs cocktails taken by patients in the dataset). For each cocktail, s nodes of the ATC tree are selected uniformly, with each combination assigned an adverse event with probability \frac{1}{15000}.
All simulations use the complete ATC classification tree comprising 6809 nodes across five hierarchical levels at the time of this study. The cocktails C_1 through C_4 inducing an elevated AE probability were chosen as nodes at level 4 of the ATC hierarchy (i.e., pharmacological subgroups, one level above the leaf nodes representing individual active substances). This choice is deliberate: in the simulated data, patients are assigned specific drugs at level 5 (leaves) as would occur in practice, but any patient taking a drug that descends from one of the C_i nodes inherits the elevated AE probability associated with that family. This design tests the method’s ability to detect family-level signals — an important feature, since the underlying pharmacological mechanism may apply to an entire drug class rather than a single substance. The C_1 through C_4 nodes were selected to be non-overlapping (i.e., belonging to distinct subtrees) and to have a sufficient number of descendant leaves to generate realistic prescription patterns.
Three additional datasets were similarly constructed, varying the sizes of the cocktails inducing AE. Dataset D_1 contains cocktails of sizes two and three as described above. Dataset D_2 contains only size-two cocktails, Dataset D_3 only size-three cocktails, and Dataset D_4 contains size-two, three, and four cocktails.
2.6.2 FAERS Data
As a proof-of-concept on real data, the method was also assessed using the FAERS dataset, which consists of ICSRs submitted by healthcare professionals, consumers, and manufacturers. These reports include details on patient drug intake and the side effects experienced. The methods were deployed on FAERS data from the second quarter of 2013 to the second quarter of 2015, the restriction to a two-year window being motivated by computational time constraints.
Significant refinement of the FAERS data was required. Duplicate reports were removed, retaining only the report with the most recent ID. Subsequently, a link was established between the prescribed drugs and the ATC codes of each active ingredient. This process involved matching drug names to their respective active ingredients and converting these ingredients to their corresponding ATC codes. The DiAna dictionary (Fusaroli et al. 2024) facilitated the standardization of FAERS drug names to ATC codes. It is important to note that the DiAna dictionary handles combination products by splitting them into their constituent active substances rather than assigning a single ATC combination code. Reports with unmatchable drug names in the DiAna dictionary were excluded, reducing the dataset from 2,043,231 to 1,612,931 patients.
For this study, we focused on myopathy as the selected adverse event outcome. It is a clinically concerning condition with a sufficient number of reported cases in the dataset (536 cases). To validate our results, we compared the identified drug-myopathy associations with known drugs already established to cause myopathy (Miernik et al. 2024; Hall et al. 2011; Valiyil and Christopher-Stine 2010). Code for data refinement is available on GitHub.
3 Results
3.1 Score Comparison
To support the use of the hypergeometric score H(C) introduced in Section 2.2, we compared it to other risk scores on Dataset D_2, which comprises pairwise drug combinations. A set of cocktails was generated under the alternative hypothesis H_1, meaning they correspond to the drug combinations specifically designed to increase the probability of AE in Dataset D_2 (see Section 2.6.1). A second set of cocktails was sampled under the null hypothesis H_0, representing a baseline risk without specific interaction. For each cocktail in both sets, the risk scores were computed using the patient data in Dataset D_2. Figure 2 illustrates the performance of the risk scoring methods for detecting high-risk drug combinations introduced in Section 2.2.1. Each subplot displays the score values for cocktails representing true solutions, sampled under H_1 (green) and cocktails not representing true solutions, sampled under H_0 (red). The bottom-right panel presents the Precision-Recall (PR) curve, comparing the detection power of the scores in identifying high-risk cocktails. PRR is a special score because its value is either one or zero, which allows the computation of only one value of precision and recall. PRR is therefore represented by a single point rather than a PR curve.
PR curves of the different scoring methods are mainly comparable, confirming the conclusions of (Candore et al. 2015). However, as illustrated by the jitter plots, the hypergeometric score and the \Omega Shrinkage measure better rank and isolate true solutions from other cocktails. It must be emphasized that the \Omega Shrinkage measure is difficult to compare using a threshold, as the original article (Norén et al. 2008) suggests signaling a cocktail when the score exceeds zero. However, in the simulations, the computed score never exceeded zero, as depicted by the x-axis of its jitter plot. Despite these considerations, these two methods are the least biased toward cocktails taken by only a few patients.
The good behavior of the hypergeometric score and its easy generalization to cocktails of any size justify the use of that scoring function for the exploration of the cocktail space.
3.2 Application to Dataset D_1
3.2.1 Estimation of Risk Distribution
Risk distribution was estimated for size-two drug cocktails on Dataset D_1 (Section 2.6.1). Estimation of risk distribution for higher cocktail sizes is possible but it is nearly impossible to compare it to the true distribution as it is computationally prohibitive to obtain. The distribution estimated by the MCMC algorithm, is compared to the true risk distributions in Figure 3.
The left panels display the distributions of risk scores for both the estimated (top-left) and true (bottom-left) risk values. Both distributions share a similar shape, with the majority of risk scores concentrated at low values, as 95\% of the scores fall below 11. However, some differences can be observed, particularly in the tail of the distribution, where cocktails of higher risks are under-represented in the estimated distribution.
The right panels of Figure 3 present a Probability-Probability (PP) plot (top) and a Quantile-Quantile (QQ) plot (bottom), comparing the quantiles and probabilities of the estimated and true risk distributions. While the right panel of the figure demonstrates good agreement at lower risks, where most of the data lie, deviations at higher risk values suggest that the estimated distribution slightly underrepresents the risk for more extreme values.
Results indicate that the method performs well in estimating risk scores for the majority of cocktails, capturing the overall risk distribution with reasonable accuracy. The slight underestimations which are present for high-risk cocktails are not a problem since the interest of the method is to assign p-values. P-values are still reliable as shown in the PP-plot, Figure 3. Consequently, we can assign robust empirical p-values to any size-two cocktails based on their risks. Furthermore the calibration under the null hypothesis for the p-values attributed by the algorithm is assessed in Section 5.2.
3.2.2 Genetic Algorithm Output and Clustering
The genetic algorithm was applied to Dataset D_1 to identify high-risk drug cocktails. Multiple runs of the algorithm were conducted using different hyperparameter sets (population size, number of generations, parameter \alpha) to ensure the visit of different regions of the space of possible cocktails. The results were subsequently concatenated to create a comprehensive list of high-risk cocktails.
The genetic algorithm successfully identified nearly all high-risk size-two and size-three drug cocktails in Dataset D_1 (Figure 4). For size-two cocktails, the algorithm consistently found the exact high-risk combinations. However, for size-three cocktails, the algorithm sometimes identified cocktails that were very close to the true high-risk combinations, missing only one drug from the correct cocktail in a few cases (oftentimes, choosing parent nodes instead of the actual nodes).
To streamline the analysis of the large set of results, significant cocktails were filtered using the empirical p-value by setting a threshold of 5\%. Clustering techniques were then applied to group similar cocktails together. As discussed in Section 2.4.3, the UMAP algorithm has been used for dimensionality reduction, followed by the DBSCAN clustering method. This post-processing step allows to reduce redundancy by grouping cocktails that differed only slightly, such as by substituting a drug for another within the same pharmacological family. Such cocktails would have similar medical interpretations.
Figure 4 illustrates the results of the clustering process after mapping the selected cocktails in the plane. Each point represents a drug cocktail, and the 7 automatically determined clusters are represented by the colors.
The first conclusion is that the clusters are well separated in terms of the editing distance defined in Section 2.4.2. The genetic algorithm thus identified seven regions of the cocktails space of high score that are different in terms of interpretation as their cocktails are far away between classes. On the other hand, cocktails in the same clusters are close, indicating a possible pharmacological interpretation of the cluster.
The second conclusion is that the algorithm effectively found the true solutions or at least very similar solutions. Indeed, the embedding of the true solutions yields the four squares, that clearly belong to four different clusters among the seven. The method therefore identifies and separates regions of the cocktail space containing the true solutions. That simulation thus enhances the confidence that investigating the high-score cocktails returned by the methods may allow to detect relevant phenomena.
3.3 Application to the FAERS Spontaneous Reporting Data
3.3.1 Estimation of Risk Distribution
The risk estimation method was applied to the FAERS spontaneous reporting dataset presented in Section 2.6.2. Figure 5 presents a comparison between the estimated risk distribution and the true risk distribution for size-two drug cocktails. The left panel of the figure shows the distributions of risk scores, while the right panel presents the QQ plot and the PP plot, comparing the quantiles and probabilities of both distributions.
The histogram reveals that the estimated distribution aligns well with the true distribution for the majority of cocktails with lower risk scores. However, deviations begin to emerge in the tail of the distribution. Specifically, 15 of the riskiest cocktails in the true distribution were not captured by the estimated distribution as we see in the QQ plot. This explains the observed shift in the QQ plot at the highest quantiles since higher risk cocktails have not been found by the MCMC algorithm, highlighting the need of a complementary method like the genetic algorithm in order to find riskiest cocktails.
Despite this slight deviation, the empirical p-values remain robust for both lower and higher-risk cocktails as shown by the PP-plot in the Figure 5.
3.3.2 Genetic Algorithm Output and Clustering
The genetic algorithm was applied to the FAERS data focusing on the myopathy AE by running 180 parallel executions of the genetic algorithm on varying population sizes (from 100 to 1000 cocktails per generation). The whole procedure ran in less than 8 hours on a 24-core server.
Cocktail sizes present in the merged final populations vary from 1 to 6. The MCMC algorithm was run for each cocktail size in this range to assign an empirical p-value to each solution. All solutions having a p-value lower than 0.05 were kept. No multiple testing correction was made in order to avoid false negatives, that is disregarding interesting cocktails, even if this may inflate the number of false positives. 682 drug cocktails compose the final list. Applying Benjamini-Hochberg corrections to control the FDR at level 5\% and 10\% respectively kept the first 107 and 564 elements of the list.
The clustering procedure was applied to the 682 cocktails, leading to 15 clusters. To validate the results, the first 150 solutions were tagged with the families they correspond to, if any, without knowledge of the clustering’s result. To do so, some drugs or drug families known to be linked to myopathy adverse events were considered, that is hypolipemic drugs, Colchicine, corticosteroids, Ciclosporine, Beta blocking agents, Fluoroquinolones, and anti-malaria drugs (Miernik et al. 2024; Hall et al. 2011; Valiyil and Christopher-Stine 2010).
The final result is a table of 682 rows, one per selected cocktail, indicating its composition, the number of patients taking it, the number of patients taking it and facing myopathy, hypergeometric and RR score. It also contains the cluster it belongs to and the tagged families. The entire table is available on this link. Table 1 summarizes the cluster assignments for the top 150 cocktails identified in the genetic algorithm. Among these, eleven cocktails could not be assigned to any drug or drug family listed in the table headers. Note that a cocktail can be associated with more than one drug or drug family.
| Cluster | Hypolipemic Drugs | Colchicine | Steroids | Cyclosporine | Beta blocking agents | Domperidone | Fluoroquinolones |
|---|---|---|---|---|---|---|---|
| 1 | 31 | 0 | 3 | 0 | 0 | 0 | 4 |
| 2 | 0 | 11 | 0 | 4 | 0 | 0 | 0 |
| 3 | 7 | 0 | 0 | 0 | 19 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | 50 | 0 |
| 5 | 0 | 0 | 0 | 0 | 0 | 19 | 0 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 4 |
This table shows the method’s capability to detect known ADRs from the FAERS dataset and highlights the ability of the clustering method to group cocktails with similar pharmacological interpretations. Specifically, cluster 1 corresponds to cocktails containing Hypolipemic Drugs, cluster 2 to Colchicine and Cyclosporine, cluster 3 to beta blocking agents and cluster 6 to Fluoroquinolones. These clusters align with previously reported pharmacological associations (Miernik et al. 2024). Similarly, clusters 4 and 5 correspond to Domperidone-containing cocktails which have been associated with adverse effects, including potential cardiac complications, as reported by the British Medicines and Healthcare products Regulatory Agency (Medicines and Healthcare products Regulatory Agency 2014).
Interestingly, additional drug families reported in (Miernik et al. 2024), such as anti-malarial drugs, were also identified but at later ranks. For instance, anti-malarial drugs are primarily grouped in cluster 9, as shown in the complete solution table.
3.3.3 Post-Treatment Interaction Analysis
The penalized logistic regression described in Section 2.3 was applied to the strongest signals identified by the genetic algorithm to assess whether the detected cocktails reflect true pharmacological interactions. For the cyclosporine–colchicine combination, the interaction coefficient remained positive and highly significant (p = 1.02 × 10^{−6}) after accounting for individual drug effects, corroborating existing medical literature indicating a potentiation of muscle toxicity (Ducloux et al. 1997).
Conversely, the analysis of the identified size-four cocktail (metformin, prasugrel, bisoprolol, simvastatin) revealed that the risk signal was almost entirely explained by a size-three sub-cocktail (metformin, bisoprolol, simvastatin). In this instance, the addition of prasugrel did not yield a significant additional interaction effect, illustrating the importance of the post-treatment step in identifying the core of a detected signal.
4 Conclusion
As co-medication becomes increasingly common, there is a growing need for methods capable of detecting signals of harmful drug combinations from the available large databases for further assessment. The proposed method addresses this need by identifying signals and assigning them a p-value using a hypergeometric disproportionality analysis measure. Additionally, the method enables the identification of broader signals within the ATC hierarchy by proposing “cocktails” of not only active substances but also chemical, therapeutic, and anatomical families, leveraging the hierarchical classification of active substances.
Application on synthetic datasets demonstrated that using the hypergeometric score reduces the false positives from cocktails taken by a small number of patients, enhancing the robustness of the measure. The results on these datasets of our MCMC algorithm to estimate the distributions of cocktail risks were encouraging, as the estimated distributions closely aligned with the true distributions, indicating reliable p-value assignment. Furthermore, the genetic algorithm effectively identified the majority of the harmful cocktails, with a high success rate, highlighting its efficiency in navigating the large solution space.
Applying this method to previous FAERS data for the myopathy adverse event yielded promising results. A literature review confirmed the intersection between the identified signals and drugs known to have a higher likelihood of causing myopathy, demonstrating the effectiveness of the proposed methodology. Results also indicate that certain drug combinations are more strongly associated with myopathy than individual medications. Notably, the cyclosporine/colchicine combination exhibits a higher hypergeometric score (73.1 vs. 63.4) and RR (879.4 vs. 57.1) compared to colchicine alone, reinforcing the importance of analyzing drug interactions in pharmacovigilance. This combination is known to be more likely to induce myopathy (Ducloux et al. 1997).
Furthermore, our approach identified a size-four drug cocktail (metformin, prasugrel, bisoprolol, simvastatin) associated with an increased risk signal (hypergeometric score = 72.2, RR = 3060.6). This combination was observed in nine patients, all of whom experienced the adverse event. While this finding demonstrates the feasibility of detecting higher-order drug interactions, we emphasize that no clinical validation is currently available for this specific combination. This underscores both the potential of the method in identifying complex drug interactions and the need for further validation through complementary studies.
As demonstrated in Section 3.3.3, the post-treatment logistic regression step provides clinical refinement by distinguishing true interactions from combined individual effects.
This approach can also be extended to other settings where adverse events are explored. For instance, the method could be applied using the ICD diagnosis classification or the MedDRA system, both of which are hierarchical classifications. Such an application would facilitate the identification of symptoms associated with the consumption of drug combinations.
The proposed method is implemented as an R package emcAdr, available on GitHub and on the CRAN (Bangard 2025). A tool allowing the processing of quarterly FAERS xml files to csv file directly usable by our method is also available on GitHub.
For researchers and stakeholders it is crucial to remember that our method is hypothesis generating as are other signal detection methods in single medications. These methods aim to generate as few false negative results as possible but with the trade-off of more false positives. This means that any results should firstly be assessed for biological plausibility as well as validated in a more formal causal framework such as RCTs and target trial emulations. Furthermore, while spontaneous reporting systems have a key role in pharmacovigilance, they have known drawbacks as underlined in the introduction. They however provide important knowledge about the safety of drugs, and with our method also drug cocktails.
Funding
This work was partially supported by the “PHC AURORA” programme (project number: 49704QC), funded by the French Ministry for Europe and Foreign Affairs, the French Ministry for Higher Education and Research and the Norwegian Council for Research.
References
5 Appendices
5.1 Appendix A : Distance Pseudo-code Algorithm
Algorithm 1 computes the distance between two cocktails, C_1 and C_2, based on three inputs.
The first input is an integer matrix M. This matrix represents the nodes of C_1 and C_2, as well as their corresponding parent nodes. The matrix has dimensions (PN, \text{ATC\_HEIGHT}), where P and N denote the sizes of C_1 and C_2, respectively. For example, consider the simplified tree shown in Figure 1 of the article. In this tree, C_1 could be the cocktail containing only the node \begin{bmatrix} 3 \end{bmatrix}, and C_2 could be the cocktail containing only the node \begin{bmatrix} 2 \end{bmatrix} (P = 1 and N = 1).
The corresponding matrix M would be: M = \begin{bmatrix} 3 & 1 & 0 \\ 2 & 2 & 0 \end{bmatrix}
The second input consists of idxC_1 and idxC_2, which represent the indices of the rows in M that contain the nodes of C_1 and C_2, respectively.
5.2 Appendix B : Calibration Under the Null
To assess the validity of the approximation of the p-value, we plotted the empirical distribution of p-values for cocktails presumed not to increase the adverse-event rate on the Dataset D_1. The theoretical p-values following a Uniform(0, 1) distribution, the approximated ones should exhibit a distribution close to uniform if the approximation is valid. Figure 6 shows a histogram of these p-values, which appears approximately flat.
5.3 Appendix C : FAERS Clustering
Figure 7 shows the clustering applied to the results of the genetic algorithm run on the FAERS datasets.
5.4 Appendix D : Algorithmic Complexity
The total time complexity of the high-risk cocktail identification is given by: \mathcal{O}(G \cdot P \cdot T_{eval}) Where G represents the number of generations (or MCMC iterations) and P represents the population size of the GA (this equals one for the MCMC algorithm). The evaluation time for a single cocktail, T_{eval}, is defined by the matching logic required to count occurrences across the database: T_{eval} = \mathcal{O}(N \cdot K \cdot R)
N (Number of Reports): The algorithm iterates once through the database to compute the hypergeometric score for a given cocktail. The runtime scales linearly with the number of patients.
K (Cocktail Size): The number of drugs in the candidate cocktail. In this study, K is typically small since there is a limit to the number of drugs a patient takes.
R (Drugs per Report): The number of medications listed in a single patient report. On our data the mean size of drug cocktails per patient is approximately 2.2. Because K and R are bounded, the complexity simplifies to O(N) per candidate evaluation.
5.5 Appendix E: Illustrative Example of Package Usage
This appendix demonstrates the end-to-end usage of the emcAdr package on a small simulated dataset that compiles in under a minute. It is intended as a hands-on illustration of the API: full reproduction of the results in Section 3.2 and Section 3.3 requires the longer-running scripts available in the article’s GitHub repository. An accompanying README.md file in the article’s GitHub repository documents the full reproduction workflow.
The toy dataset follows the same simulation design as Dataset D_1 (Section 2.6.1), at a reduced scale and with deliberately inflated adverse-event probabilities so that the signals remain detectable in a small sample. Patients taking only one of the two component drugs are added as decoys to break the spurious equivalence between the size-two cocktail and either of its components in a small dataset.
5.5.1 E.1 Setup and Toy Dataset Generation
Hide/Show the code
library(emcAdr)
library(dplyr)
library(purrr)
library(tidyr)
library(ggplot2)
library(umap)
# we load data in order to have the ATC_Tree_UpperBound_2014
load("./figures/datasets/simulated_dataset_2_3_med.RData")
set.seed(42)
# ATC tree leaves (active substances) -- 0-indexed for the C++ backend
leaves <- which(ATC_Tree_UpperBound_2014$ATC_length == 7) - 1L
# Two size-two cocktails solution, each defined by two drug-family ranges
solutions <- list(
C1 = list(d1_range = 48:53, d2_range = 1394:1396, ae_prob = 1/5),
C2 = list(d1_range = 889:892, d2_range = 660:661, ae_prob = 1/7)
)
N <- 5000
prop_full <- 0.02 # patients taking BOTH drugs of a cocktail
prop_decoy <- 0.03 # patients taking ONLY ONE drug of a cocktail
ae_baseline <- 1/300
# Decoy mechanism: for each cocktail we add patients who take ONLY one
# of the two component drugs, with no elevated AE risk. Without these
# decoys, in such a small dataset, every taker of d1 would also take d2,
# so H({d1}) would match H({d1, d2}) and the algorithm would surface size-1
# solutions "equivalent" to the size-2 cocktail solutions. Decoys break that
# perfect correlation: a sizeable share of d1-takers are now AE-free.
# This mirrors the "2 out of 3" decoys used for D1 in Section 2.6.1.
group_spec <- imap(solutions, \(cock, cname) list(
full = list(p = prop_full, ae_p = cock$ae_prob,
draw = \() c(sample(cock$d1_range, 1),
sample(cock$d2_range, 1))),
decoy_d1 = list(p = prop_decoy, ae_p = ae_baseline,
draw = \() sample(cock$d1_range, 1)),
decoy_d2 = list(p = prop_decoy, ae_p = ae_baseline,
draw = \() sample(cock$d2_range, 1))
)) |>
list_flatten(name_spec = "{outer}_{inner}")
# The remaining patients receive a random Poisson-sized cocktail
group_spec$random <- list(
p = 1 - sum(map_dbl(group_spec, "p")),
ae_p = ae_baseline,
draw = \() sample(leaves, 1 + rpois(1, 4), replace = FALSE)
)
stopifnot(group_spec$random$p > 0)
# Sample a group label for each patient, then realize their data
group <- sample(names(group_spec), N, replace = TRUE,
prob = map_dbl(group_spec, "p"))
# Note : columns of observations must have those names
toy_data <- tibble(group = group) |>
mutate(
patientATC = map(group, \(g) group_spec[[g]]$draw()),
patientADR = map_lgl(group, \(g) runif(1) <= group_spec[[g]]$ae_p)
) |>
select(patientATC, patientADR)
# Patient counts and AE counts by group
tibble(group = group, ADR = toy_data$patientADR) |>
count(group, ADR) |>
pivot_wider(names_from = ADR, values_from = n,
names_prefix = "AE_", values_fill = 0L)# A tibble: 7 × 3
group AE_FALSE AE_TRUE
<chr> <int> <int>
1 C1_decoy_d1 132 0
2 C1_decoy_d2 134 2
3 C1_full 80 16
4 C2_decoy_d1 174 0
5 C2_decoy_d2 177 0
6 C2_full 84 10
7 random 4174 17
The dataset contains 5000 patients, of whom approximately 2% take the full cocktail C_1, 2% the full cocktail C_2, and 3% take a single drug from each cocktail as a decoy. The remaining patients receive a random combination of active substances drawn from the ATC tree leaves.
C1_full patients show a 20% AE rate vs 0.4% in the random group. Decoys show baseline AE rates as designed.
Please note that the column of the patient dataframe has to be PatientATC and PatientADR.
5.5.2 E.2 Evaluating Cocktail Risk via the Hypergeometric Score
We start by computing the hypergeometric score H(C) (Section 2.2.1) for a solutions cocktail and for a random combination, to confirm that C_1 is clearly separated on the score scale.
Hide/Show the code
# Pick the C1 family and a random size-2 cocktail
solution_C1 <- c(47, 1393)
random_C <- sample(leaves, 2)
compute_hypergeom_on_list(list(solution_C1, random_C),
ATCtree = ATC_Tree_UpperBound_2014,
observations = toy_data)[1] 37.79204 0.00000
In datasets of this scale, the score function is sparse: most randomly drawn size-2 pairs are taken by zero or one patient. A random pair therefore has near 100% probability of scoring exactly zero, as illustrated by random_C above. The solution family C_1 scores H(C_1) \approx 38, but whether this magnitude is extreme among cocktails actually present in the data is the question answered by the next section.
5.5.3 E.3 Estimating the Null Distribution via MCMC
A short Metropolis-Hastings run (Section 2.5) approximates the distribution of H(C) for size-two cocktails. The full-scale analysis uses substantially longer chains.
Hide/Show the code
mcmc_chain <- DistributionApproximation(
epochs = 200000,
ATCtree = ATC_Tree_UpperBound_2014,
observations = toy_data,
temperature = 25,
Smax = 2, # Size of cocktail
num_thread = 4
)openMP available
Hide/Show the code
plot_frequency(
mcmc_chain$ScoreDistribution[-1],
binwidth = .3, xlab = "H"
)Hide/Show the code
# Convert the histogram-style MCMC output to a flat score vector.
# Index 1 corresponds to score = 0 cocktails, we drop it since we work only on
# non zero score.
mcmc_null <- histogramToDitribution(mcmc_chain$ScoreDistribution[-1])
# Empirical p-value of a solution cocktail under the estimated null
solution_score <- compute_hypergeom_on_list(
list(solution_C1),
ATC_Tree_UpperBound_2014,
toy_data
)
cat("p-value of solution C1 (", solution_C1, ") :",
mean(mcmc_null >= solution_score))p-value of solution C1 ( 47 1393 ) : 0.002559415
The bulk of the distribution sits near zero. The visible spike around H(C) \approx 4-5 corresponds to a small-sample artifact, cocktails taken by only one patient where the patient happens to experience the AE. With only \sim 45 AE cases over 5000 patients, such configurations are common enough to form a visible noise peak. Furthermore, with only 5000 data points, each cocktail tend to be taken by few patient. The solution families generate H(C) values in the 18-40 range, above this peak, which is why the empirical p-value of C_1 comes out at \sim 2.5 \times 10^{-3}.
For size-two cocktails, computing the exact score distribution is tractable and offered by the package via trueDistributionSizeTwoCocktail(). This allows assessing the MCMC approximation directly (see also Figure 3 in the body):
Hide/Show the code
true_dist <- trueDistributionSizeTwoCocktail(
ATCtree = ATC_Tree_UpperBound_2014,
observations = toy_data,
beta = 4,
num_thread = 4
)
# Compare via the package's qq-plot helper
qq_plot_output(mcmc_chain, true_dist)The PP/QQ-plot comparison would confirm that the MCMC tracks the null faithfully on the regions of H that actually occur in the data.
5.5.4 E.4 Identifying High-Risk Cocktails with the Genetic Algorithm
To mirror the multi-run aggregation workflow of Section 3.2, we run the genetic algorithm five times with different seeds and pool the final populations. A single short run typically converges on one of the two solutions. Aggregation across runs increases the chance of recovering both. For full-scale runs on D_1, parallel execution scripts compatible with SLURM and similar workflow managers are available in the article’s GitHub repository under figures/scripts/.
Hide/Show the code
n_runs <- 5
ga_runs <- map(seq_len(n_runs), \(i) {
set.seed(100 + i)
GeneticAlgorithm(
epochs = 400,
nbIndividuals = 200,
ATCtree = ATC_Tree_UpperBound_2014,
observations = toy_data,
num_thread = 4,
alpha = 1.6,
diversity = TRUE,
p_mutation = .09,
summary = FALSE
)$FinalPopulation |> as_tibble()
})
ga_top <- bind_rows(ga_runs) |>
mutate(cocktails = lapply(cocktails, sort)) |>
group_by(cocktails) |>
summarise(score = max(score), .groups = "drop") |>
filter(score > 0) |>
arrange(desc(score)) |>
mutate(cocktails_display = map_chr(cocktails, paste, collapse = ", "))
ga_top |> head(10)# A tibble: 10 × 3
cocktails score cocktails_display
<list> <dbl> <chr>
1 <int [2]> 40.6 0, 1393
2 <int [2]> 37.8 47, 1391
3 <int [2]> 37.6 45, 1393
4 <int [2]> 36.0 47, 1023
5 <int [1]> 33.2 1393
6 <int [2]> 18.9 630, 888
7 <int [2]> 18.9 658, 888
8 <int [2]> 18.7 659, 863
9 <int [2]> 18.6 659, 738
10 <int [2]> 18.2 0, 888
The top outputs correspond to the solution families. Cocktails involving the parent ATC nodes for C_1 (nodes 0, 45, 47 paired with 1391, 1393) score in the 36-40 range. Those for C_2 (nodes 630, 658, 659 paired with 888, 863) score around 18-19.
5.5.5 E.5 Output Clustering
For didactic purposes we apply the same UMAP + DBSCAN pipeline as described in Section 2.4.3 on the recovered cocktails. We expect only a handful of distinct signals at this scale.
Hide/Show the code
# We put solution C1 and C2 in order to compare them to the projection
# of found cocktail in the plane.
ga_cocktails <- c(list(c(47, 1393), c(888, 659)), ga_top$cocktails)
# Compute the distance between cocktails
divergence <- get_dissimilarity_from_cocktail_list(
ga_cocktails, ATC_Tree_UpperBound_2014, normalization = TRUE
)
divergence <- as.matrix(do.call(rbind, divergence))
umap_config = umap::umap.defaults
umap_config$n_neighbors <- 5
# Precise UMAP that the input is a dissimilarity matrix
umap_config$input <- "dist"
umap_config$min_dist <- .001
umap_config$random_state <- 123
umap_results <- umap::umap(divergence, config = umap_config)
dbscan_results <- dbscan::dbscan(umap_results$layout, eps = 1)
ga_plot <- tibble(cocktails = ga_cocktails) |>
mutate(
UMAP1 = umap_results$layout[, 1],
UMAP2 = umap_results$layout[, 2],
cluster = factor(dbscan_results$cluster),
solution = ifelse(
row_number() %in% 1:2, 1, 0 # First two points are the true signals
)
)
ggplot(ga_plot, aes(UMAP1, UMAP2, colour = cluster)) +
geom_point(alpha = 0.8, size = 2,aes(shape=as.factor(solution))) +
labs(colour = "Cluster", shape = "Solution") +
theme_minimal()The two triangles (true solutions) embed in clusters 1 and 2 alongside related GA cocktails, demonstrating that the dissimilarity-based clustering groups variants of the same underlying signal in a reasonable way.
5.5.6 E.6 Post-Treatment with Firth’s Penalized Logistic Regression
For the C_1 cocktails, we apply the post-treatment of Section 2.3 to test whether the signal reflects a true synergistic interaction or merely the additive effect of its components. A function in emcAdr has been added in order to test a cocktail.
Hide/Show the code
firth_reg <- run_firth_regression(
cocktail = solution_C1,
upper_bound = ATC_Tree_UpperBound_2014$upperBound,
patient_data = toy_data,
adr_column = "patientADR"
)
summary(firth_reg)logistf::logistf(formula = Y ~ ., data = df_model)
Model fitted by Penalized ML
Coefficients:
coef se(coef) lower 0.95 upper 0.95 Chisq p
(Intercept) -5.188918 0.1985810 -5.6048123 -4.823127 Inf 0.000000000
X47.TRUE -0.557285 1.4303230 -5.4020854 1.417979 0.1824861 0.669245382
X1393.TRUE 1.692411 0.5180643 0.5357799 2.615262 7.3118976 0.006849955
X47.1393.TRUE 2.468896 1.5322560 0.1735472 7.390559 4.6094408 0.031796401
method
(Intercept) 2
X47.TRUE 2
X1393.TRUE 2
X47.1393.TRUE 2
Method: 1-Wald, 2-Profile penalized log-likelihood, 3-None
Likelihood ratio test=80.73281 on 3 df, p=0, n=5000
Wald test = 787.0257 on 3 df, p = 0
The p-value of 0.032 on the interaction term confirms that this synergistic effect is statistically distinguishable, even with only 16 AE cases in the C1_full group. It confirms that the genetic algorithm’s discovery reflects a true drug–drug interaction rather than the marginal contribution of either family alone.
Genetic algorithm output also contains cocktail 1393 which contributes significantly to the AE on his own as depicted in the summary.
6 Code and Results
See the complete results of application of methodology on the FAERS dataset on this link.
Code to reproduce the experiments conducted in this article can be found here :
- GitHub link of the data refinement code : JulesBa-Git/FAERS-xml-parser.
- GitHub link to the latest release of the R package of the complete methodology : JulesBa-Git/emcAdr.
- The code of the methodology is also available on the CRAN.
- The code to obtain figures is available on the
figures/scriptsfolder of this repository.
Session information
Hide/Show the code
sessionInfo()R version 4.5.1 (2025-06-13)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] tidyr_1.3.2 purrr_1.2.2 stringr_1.6.0 umap_0.2.10.0 ggplot2_4.0.3
[6] dbscan_1.2.4 dplyr_1.2.1 emcAdr_1.3
loaded via a namespace (and not attached):
[1] gtable_0.3.6 shape_1.4.6.1 xfun_0.57
[4] formula.tools_1.7.1 lattice_0.22-9 vctrs_0.7.3
[7] tools_4.5.1 Rdpack_2.6.6 generics_0.1.4
[10] tibble_3.3.1 pan_1.9 pkgconfig_2.0.3
[13] jomo_2.7-6 Matrix_1.7-5 RColorBrewer_1.1-3
[16] S7_0.2.2 lifecycle_1.0.5 compiler_4.5.1
[19] farver_2.1.2 codetools_0.2-20 htmltools_0.5.9
[22] yaml_2.3.12 glmnet_4.1-10 mice_3.19.0
[25] nloptr_2.2.1 pillar_1.11.1 MASS_7.3-65
[28] openssl_2.4.0 reformulas_0.4.4 iterators_1.0.14
[31] rpart_4.1.24 boot_1.3-31 foreach_1.5.2
[34] mitml_0.4-5 nlme_3.1-169 RSpectra_0.16-2
[37] tidyselect_1.2.1 digest_0.6.39 stringi_1.8.7
[40] labeling_0.4.3 splines_4.5.1 operator.tools_1.6.3.1
[43] fastmap_1.2.0 grid_4.5.1 cli_3.6.6
[46] magrittr_2.0.5 survival_3.8-3 utf8_1.2.6
[49] broom_1.0.12 withr_3.0.2 scales_1.4.0
[52] backports_1.5.1 logistf_1.26.1 rmarkdown_2.31
[55] nnet_7.3-20 lme4_2.0-1 reticulate_1.46.0
[58] askpass_1.2.1 png_0.1-9 evaluate_1.0.5
[61] knitr_1.51 rbibutils_2.4.1 mgcv_1.9-4
[64] rlang_1.2.0 Rcpp_1.1.1-1.1 glue_1.8.1
[67] renv_1.2.2 minqa_1.2.8 jsonlite_2.0.0
[70] R6_2.6.1
