Support vector methods for survival analysis: a comparison between ranking and regression approaches

https://doi.org/10.1016/j.artmed.2011.06.006

Abstract

Objective

To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data.

Methods

The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach, the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. The second approach uses regression, dealing with censoring by means of inequality constraints. The goal of this paper is twofold: (i) to introduce a new model combining the ranking and regression strategies, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) to compare the three techniques on 6 clinical and 3 high-dimensional datasets and to discuss the relevance of these techniques over classical approaches for survival data.

Results

We compare SVM-based survival models based on ranking constraints, on regression constraints, and on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model’s discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significantly different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints than for models based only on ranking constraints.

Conclusions

This work gives empirical evidence that SVM-based models using regression constraints perform significantly better than SVM-based models based on ranking constraints. Our experiments show a comparable performance on clinical data for methods including only regression constraints and methods including both regression and ranking constraints. On high-dimensional data, the regression-only model performs better. However, this approach does not have a theoretical link with standard statistical models for survival data. This link can be made by means of transformation models when ranking constraints are included.

Introduction

Survival studies arise in different areas. Although they are best known in medical studies, and in cancer studies in particular, they also occur in economics (e.g. prediction of bankruptcy of factories), mechanics (e.g. failure of airplanes, breakdown of engines, etc.), electronics (e.g. lifetime of electrical components), social sciences (e.g. estimating the time from marriage to divorce) and many other fields. Depending on the question under study, one is interested in risk groups (which group of patients/components is more likely to experience the event?) or time predictions (before which time should the engine be replaced to decrease the risk of failure?).

The survival literature describes different models to answer these questions. Many common methods, including the proportional hazards model (Cox model) and the log-odds model, are transformation models (TM) [1], [2], [3], [4], [5], [6]. This type of model first assembles a prognostic index based on the covariates and, in a second step, links this index to the observed event times by means of a monotonic transformation function. TMs for survival analysis mainly focus on the first step. The standard Cox model [7], for example, avoids the second step by assuming that the hazard (the instantaneous risk of observing the event now, given that the event did not occur before) is proportional to an unspecified baseline hazard. Other models assume a fixed transformation function h. The accelerated failure time model is one example, which assumes that the transformation function h(y), with y the outcome under study, equals the logarithmic function, and the proportional odds model takes h(y) = logit(y) [6].
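As a hedged illustration of the two-step structure described above, a transformation model can be written in the following generic form. The symbols w (coefficient vector) and \epsilon (error variable), and the plus sign in front of the index, are assumptions of this sketch; sign conventions differ between sources.

```latex
% Generic transformation model: a monotonic function h of the outcome y
% is modelled as a linear prognostic index plus an error term.
h(y) = w^{T} x + \epsilon

% Fixed choices of h named in the text:
%   accelerated failure time model:  h(y) = \log(y)
%   proportional odds model:         h(y) = \operatorname{logit}(y)
```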

Survival models based on support vector machines (SVM) [8] can incorporate non-linearities automatically and, when non-additive kernels are used, interactions are incorporated automatically as well. These methods take an approach that differs from the standard statistical one: SVM-based models do not assume a true underlying function whose parameters need to be estimated. Instead, the empirical risk of misranking two instances with regard to their failure time is minimized [9]. The survival problem was therefore reformulated as a ranking problem. To reduce the computational load, a simplified version that compares each observation only with its closest neighbor, instead of with all other observations, was proposed in [10]. A more theoretical framework was provided in [8]. We will refer to the survival model proposed in the latter work as Model 1. In this work, we ask whether the inclusion of regression constraints can improve the performance. Therefore, the performance of Model 1 is compared with that of Model 2, which includes both ranking and regression constraints. The proposed model is also compared with survival methods including only ranking constraints (see [9], [10], [11]) and only regression constraints (see [12], [13]). Table 1 gives an overview of the different models handled in this work, their constraints, the number of tuning parameters in case of a linear kernel and how the ranking constraints are defined.
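The difference between comparing every pair of observations and comparing each observation only with its closest neighbor can be sketched as follows. This is a minimal illustration, not the implementation from [10]: the function name `comparable_pairs` and the rule that a pair is comparable only when the earlier observation is an observed event are assumptions of this sketch.

```python
def comparable_pairs(times, events, nearest_only=False):
    """Return index pairs (i, j) where j is an observed event with times[j] < times[i].

    Assumption of this sketch: a pair can be ranked only if the observation
    with the shorter time experienced the event (events[j] == 1).
    With nearest_only=True, each i is paired only with the closest such j.
    """
    # Indices sorted by increasing survival time.
    order = sorted(range(len(times)), key=lambda k: times[k])
    pairs = []
    for pos, i in enumerate(order):
        # Walk backwards from the closest earlier observation.
        for j in reversed(order[:pos]):
            if events[j] == 1 and times[j] < times[i]:
                pairs.append((i, j))
                if nearest_only:
                    break
    return pairs


times = [5, 3, 8, 1]
events = [1, 0, 1, 1]  # 1 = event observed, 0 = right censored
all_pairs = comparable_pairs(times, events)
nearest_pairs = comparable_pairs(times, events, nearest_only=True)
```

With all pairs the number of ranking constraints can grow quadratically in the number of observations, whereas the nearest-neighbor variant yields at most one constraint per observation.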

This paper is organized as follows. Section 2 gives an overview of transformation models in survival analysis. Section 3 starts with a summary of existing SVM-based survival methods, followed by the introduction of a new model, proposed by the authors. Section 4 compares the different SVM-based survival models on 8 different datasets. In addition to the methods mentioned before, the experiments include the performance of the Cox model for comparison.

The following notation is used throughout the text. D denotes the set of observations $\{(x_i, y_i, \delta_i)\}_{i=1}^{n}$, where $x_i$ is a d-dimensional covariate vector, $y_i$ is the corresponding survival time and $\delta_i$ denotes whether an event was observed ($\delta_i = 1$) or the observation was right censored ($\delta_i = 0$). For notational convenience, it is assumed that the observations in D are sorted such that for two observations $(x_i, y_i, \delta_i)$ and $(x_j, y_j, \delta_j)$ with $j < i$, it holds that $y_j < y_i$.

Transformation models

A TM models a possibly unknown transformation of the outcome, instead of the outcome itself, as a function of the covariates. Initially, TMs were introduced in regression problems where the assumptions of normally distributed errors and constant variance were not satisfied. A standard regression model, for example, tries to model the outcome y as a linear combination of the covariates: $y = w^{T} x + \epsilon$, where w is a coefficient vector and $\epsilon$ is the error variable. In cases where y is not

Kernel-based survival models

This section starts with a brief discussion of existing survival models based on SVMs. In a second subsection, a new method is proposed. Since the outcome of this type of survival model cannot, in general, be interpreted as a failure time, we will denote the outcome of the model as the prognostic index u(x) instead of the prediction of the model. For the Cox model this corresponds to $u(x) = w^{T} x$.
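To make the distinction between ranking and regression constraints concrete, the per-observation penalties can be sketched as below. This is an illustrative sketch under stated assumptions, not the formulation of any of the cited models: it assumes a linear prognostic index, an absolute-error penalty for observed events, a one-sided penalty for right-censored observations, and a ranking margin equal to the difference in survival times. All function names are hypothetical.

```python
def prognostic_index(w, x):
    """Linear prognostic index u(x) = w^T x."""
    return sum(wk * xk for wk, xk in zip(w, x))


def regression_penalty(u, y, delta):
    """Penalty of a regression constraint for one observation.

    delta == 1 (event): deviations of u from y in both directions are penalized.
    delta == 0 (right censored): the true time is at least y, so only u < y is penalized.
    Assumes u is on the same scale as the survival time.
    """
    if delta == 1:
        return abs(u - y)
    return max(0.0, y - u)


def ranking_penalty(u_i, u_j, y_i, y_j):
    """Hinge-type penalty of a ranking constraint for a pair with y_j < y_i.

    Assumed margin: the index difference should be at least the time difference.
    """
    return max(0.0, (y_i - y_j) - (u_i - u_j))


u = prognostic_index([1.0, 2.0], [3.0, 4.0])
```

A ranking-only model would sum `ranking_penalty` over the chosen pairs, a regression-only model would sum `regression_penalty` over the observations, and a combined model would use a weighted sum of both, with the weights as tuning parameters.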

Experiments

This section compares the performance of the discussed methods on 5 clinical data sets and 3 high-dimensional data sets. A description of the data and the different performance measures is given first. Next, the results on real data and on artificial data are discussed.
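One of the performance measures, the concordance index, can be sketched as follows. This is a minimal sketch and not the evaluation code used in the experiments: it assumes that a higher prognostic index corresponds to a longer survival time, that a pair is comparable only when the shorter time is an observed event, and that ties in the prognostic index count as half concordant. The function name `concordance_index` is chosen for illustration.

```python
def concordance_index(times, events, u):
    """Fraction of comparable pairs whose prognostic indices are ordered like their times.

    Assumptions of this sketch: higher u means longer survival; a pair (i, j)
    is comparable when j is an observed event with times[j] < times[i];
    ties in u contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[j] == 1 and times[j] < times[i]:
                comparable += 1
                if u[j] < u[i]:
                    concordant += 1.0
                elif u[j] == u[i]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


times = [1, 2, 3, 4]
events = [1, 1, 0, 1]  # 1 = event observed, 0 = right censored
c_perfect = concordance_index(times, events, [0.1, 0.2, 0.3, 0.4])
c_reversed = concordance_index(times, events, [0.4, 0.3, 0.2, 0.1])
```

A value of 1 indicates that all comparable pairs are ranked correctly, 0.5 corresponds to random ranking, and 0 to a fully reversed ranking.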

Conclusions

This work compared different methods for survival analysis based on support vector machines. Three different approaches were discussed: (i) the ranking approach, (ii) the regression approach and (iii) the combined approach. On a theoretical basis, the first and third methods are preferred since they can be linked with well-known statistical models for survival analysis. However, the experiments revealed that the ranking approach performs significantly worse than the other two approaches.

Acknowledgments

This research is supported by Research Council KUL: GOA AMBioRICS, GOA MANET, CoE EF/05/006, IDO 05/010, IOF KP06/11, IOF SCORES4CHEM, several PhD, postdoc and fellow grants; Flemish Government: FWO: PhD and postdoc grants, IBBT, G.0407.02, G.0360.05, G.0519.06, G.0321.06, G.0341.07 and projects G.0452.04, G.0499.04, G.0211.05, G.0226.06, G.0302.07; IWT: PhD Grants, McKnow-E, Eureka-Flite; Belgian Federal Science Policy Office: IUAP P6/04; EU: FP6-2002 LIFESCIHEALTH 503094, IST 2004-27214,

References (29)

  • J.D. Kalbfleisch. Likelihood methods and nonparametric tests. Journal of the American Statistical Association (1978)
  • D.M. Dabrowska et al. Partial likelihood in transformation models with censored data. Scandinavian Journal of Statistics (1988)
  • K. Doksum et al. On a correspondence between models in binary regression and survival analysis. International Statistical Review (1990)
  • S.C. Cheng et al. Analysis of transformation models with censored data. Biometrika (1995)
  • S.C. Cheng et al. Predicting survival probabilities with semiparametric transformation models. Journal of the American Statistical Association (1997)
  • J.D. Kalbfleisch et al. The statistical analysis of failure time data (2002)
  • D.R. Cox. Regression models and life-tables (with discussion). Journal of the Royal Statistical Society, Series B (1972)
  • V. Van Belle et al. Learning transformation models for ranking and survival analysis. Journal of Machine Learning Research (2011)
  • V. Van Belle et al. Support vector machines for survival analysis
  • V. Van Belle et al. Survival SVM: a practical scalable algorithm
  • L. Evers et al. Sparse kernel methods for high-dimensional survival data. Bioinformatics (2008)
  • P.K. Shivaswamy et al. A support vector approach to censored targets
  • F.M. Khan et al. Support vector regression for censored data (SVRc): a novel tool for survival analysis
  • J.F. Lawless. Survival and event history analysis. Wiley reference series in biostatistics. Chapter: Parametric models in survival analysis (2006)