# SurvReLU: Inherently Interpretable Survival Analysis via Deep ReLU Networks

## Abstract

The paper proposes a deep rectified linear unit (ReLU) network called SurvReLU that bridges the gap between deep survival models and traditional tree-based survival models. SurvReLU can achieve interpretability like a tree-based model while maintaining the representational power of a neural network. The key contributions are:

- Establishing an explicit connection between ReLU networks and tree-based survival models by showing that ReLU networks can partition the input space into locally homogeneous regions like a tree.
- Introducing a statistically-driven method to dynamically optimize the topology of the ReLU network for survival analysis, enabling automatic pruning of the resulting tree structure.
- Demonstrating that SurvReLU can be optimized end-to-end with flexible loss functions, including both continuous-time and discrete-time survival losses.

Experiments on both simulated and real-world datasets show that SurvReLU achieves competitive performance compared to previous deep and tree-based survival models, while providing better interpretability.

## Q&A

### [01] Introduction

**1. What are the key limitations of traditional survival analysis models like Kaplan-Meier estimator and Cox proportional hazards model?**

- The Kaplan-Meier estimator ignores time-dependent covariates, while the Cox proportional hazards model assumes the risk function is linear in the covariates, an assumption that is easily violated in practice.

**2. How have deep neural networks been introduced to address the limitations of traditional survival models?**

- Deep neural networks have been used to replace the linear model in the Cox proportional hazards framework, as in DeepSurv, to better approximate arbitrary functions.
- Assumption-free deep survival models have also been explored, such as directly learning the distribution of survival times using a deep neural network.
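
For reference, the linear constraint and its neural replacement can be written out as follows. The notation ($h_0$, $\beta$, $g_\theta$) is an assumption chosen here for illustration; the summary itself does not fix any symbols. The Cox model is usually written as

$$
h(t \mid x) = h_0(t)\,\exp\!\big(\beta^\top x\big),
$$

where $h_0(t)$ is a baseline hazard shared by all subjects and $\beta^\top x$ is the linear log-risk. DeepSurv-style models keep the same form but swap the linear term for a network output $g_\theta(x)$:

$$
h(t \mid x) = h_0(t)\,\exp\!\big(g_\theta(x)\big).
$$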

**3. What are the key limitations of deep survival models compared to tree-based survival models?**

- Deep survival models are challenging to interpret due to their "black-box" nature, while tree-based survival models offer better interpretability.
- However, tree-based survival models typically underperform deep survival models. A likely cause is that their greedy expansion and reliance on predefined splitting rules cannot guarantee convergence to a global optimum.

### [02] Method

**1. How does the proposed SurvReLU network establish a connection between ReLU networks and tree-based survival models?**

- The ReLU network can partition the input space into locally homogeneous and disjoint polyhedrons, similar to how tree-based models partition the input space.
- The activation patterns and composite function in the ReLU network serve as the nodes and leaves in survival trees, respectively.
- The ReLU network can incorporate the input covariates at each layer to resemble a non-axis-aligned tree-splitting rule.
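
The partitioning idea above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tiny hand-picked weights, and the choice to append the raw covariates to the hidden state from the second layer onward are all assumptions made for illustration. Each sample is mapped to the on/off state of every hidden unit, and samples sharing the same pattern fall in the same polyhedral region, which plays the role of a tree leaf.

```python
def relu_layer(inputs, weights, biases):
    """Apply one dense ReLU layer; return (outputs, on/off state of each unit)."""
    pre = [sum(w * v for w, v in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return [max(z, 0.0) for z in pre], tuple(int(z > 0) for z in pre)


def activation_pattern(x, layers):
    """Concatenate the on/off states of all hidden units for one sample."""
    pattern = ()
    h = list(x)
    for depth, (weights, biases) in enumerate(layers):
        # Assumption: from the second layer on, raw covariates are appended
        # to the hidden state so every split can depend on the inputs.
        inputs = h if depth == 0 else h + list(x)
        h, bits = relu_layer(inputs, weights, biases)
        pattern += bits
    return pattern


def group_by_region(samples, layers):
    """Map each distinct activation pattern to the indices of its samples."""
    regions = {}
    for idx, x in enumerate(samples):
        regions.setdefault(activation_pattern(x, layers), []).append(idx)
    return regions


# Hypothetical two-layer network on 2 covariates (weights chosen by hand).
example_layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [-0.5, -0.5]),   # 2 inputs -> 2 units
    ([[1.0, -1.0, 0.5, -0.5]], [0.0]),          # 2 hidden + 2 raw -> 1 unit
]
```

Calling `group_by_region(samples, example_layers)` returns a dictionary whose keys are activation patterns and whose values are the sample indices in each region. Within one region every ReLU is fixed on or off, so the network reduces to a single affine function there.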

**2. How does the proposed method dynamically optimize the topology of the SurvReLU network?**

- The method employs the log-rank test to localize which partitions should be pruned, merging nodes whose survival distributions are not significantly different, i.e., whose log-rank p-value is not below 0.05.
- The timing of the topology optimization is determined by tracking changes in the rank of the activation-pattern matrix; the optimization is performed once the rank, and hence the topology, has stopped changing.
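
A two-sample log-rank test of the kind referred to above can be sketched with the standard library alone. This is a generic textbook version, not code from the paper; the function name and argument layout are assumptions. It compares observed and expected event counts in one group at every distinct event time and converts the resulting chi-square statistic (one degree of freedom) to a p-value.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    stat = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A small p-value indicates that the two groups' survival curves differ; a large one means the test cannot tell them apart. The 0.05 threshold mentioned in the text would be applied to the returned p-value.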

**3. What loss functions can be used to optimize the SurvReLU network?**

- The end-to-end parameterization of SurvReLU enables a flexible choice of loss functions, including both continuous-time (e.g., DeepSurv) and discrete-time (e.g., DeepHit) survival losses.
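
As an example of a continuous-time loss, the negative log partial likelihood used by Cox-style models such as DeepSurv can be sketched as below. This is a plain-Python illustration under stated assumptions, not the paper's training code: the function name is hypothetical, ties are handled in the simple Breslow fashion, and averaging over the number of events is one common normalization choice.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log Cox partial likelihood, averaged over observed events."""
    total = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored subjects only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        log_denominator = math.log(
            sum(math.exp(r) for r, t in zip(risk_scores, times) if t >= t_i)
        )
        total += risk_scores[i] - log_denominator
        n_events += 1
    return -total / max(n_events, 1)
```

Here `risk_scores` stands for the network's scalar output per subject. The loss is lower when subjects who fail earlier receive higher scores than those still at risk.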

### [03] Experiments and Results

**1. How did the proposed SurvReLU perform on the simulated datasets compared to other methods?**

- On the simulated datasets with linear and Gaussian risk functions, SurvReLU achieved similar or better performance than other deep survival models and tree-based survival models in terms of the time-dependent concordance index (C-index).
- SurvReLU was able to effectively approximate the true risk functions, while the other tree-based survival models showed inferior performance.
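
To make the evaluation metric concrete, here is a minimal sketch of a concordance index. Note the substitution: the results above use the time-dependent C-index, whereas this sketch implements the simpler Harrell's C-index as a stand-in, and the function name is an assumption. It counts, among comparable pairs, how often the subject who fails earlier was assigned the higher risk score.

```python
def harrell_c_index(risk_scores, times, events):
    """Fraction of comparable pairs ordered correctly by risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / comparable if comparable else float("nan")
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 to a fully reversed ranking.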

**2. How did the proposed SurvReLU perform on the real-world SUPPORT and METABRIC datasets?**

- On the real-world datasets, SurvReLU (Cont.) outperformed other continuous-time deep survival models, and SurvReLU (Disc.) achieved even better performance than the discrete-time DeepHit model.
- SurvReLU resulted in a compact tree structure that is inherently as interpretable as tree-based survival models, a feature not provided by the other deep survival models.

**3. What were the key findings from the ablation studies on SurvReLU?**

- The performance of SurvReLU typically saturates after reaching a certain number of layers, similar to many deep survival models.
- There is a trade-off between the performance of SurvReLU and the sparsity of its weight matrices: sparser weight matrices push SurvReLU toward a tree structure with axis-aligned partitioning rules, which is easier to interpret.
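
The link between sparsity and axis-aligned splits can be seen with a small sketch. The helper below is hypothetical and only illustrates the geometry: a unit's split `w . x + b > 0` involves every covariate with a nonzero weight, so when exactly one weight is nonzero the split collapses to a threshold on a single feature, as in a classical decision tree.

```python
def as_axis_aligned_rule(weights, bias, tol=1e-12):
    """Return (feature_index, operator, threshold) if the split w.x + b > 0
    depends on exactly one feature; otherwise return None."""
    nonzero = [(k, w) for k, w in enumerate(weights) if abs(w) > tol]
    if len(nonzero) != 1:
        return None  # oblique (or empty) split: no single-feature threshold
    k, w = nonzero[0]
    threshold = -bias / w
    # Dividing by a negative weight flips the inequality.
    return (k, ">" if w > 0 else "<", threshold)
```

For instance, weights `[0, 2, 0]` with bias `-4` describe the rule "feature 1 > 2", while dense weights such as `[1, 1]` describe an oblique split that has no single-feature reading.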