
Method of flexible covariates in proportional hazards models

D. Vovoras, C.P. Tsokos

University of South Florida, Mathematics & Statistics Department

Tampa, FL 33620-5700, USA

The analysis of survival data has been a very active research field for many decades. Much of this work centers on the Cox regression model, its extensions and alternative forms. The present study employs generalized additive regression modeling to extend the scope of the classical Cox proportional hazards (Cox-PH) model and to provide flexible models for studying the effect of prognostic factors on the hazard function. Smoothing splines are used to model the behavior of the covariates and to form non-linear additive proportional hazards models, in order to test the effectiveness of alternative treatments in a breast cancer clinical trial.

Introduction.

Each year governments and organizations around the world fund thousands of clinical trials which follow the history of disease and evaluate alternative treatments. Accurate analysis of the resulting information is critical, not only because the findings directly affect the care given to individuals, but also in terms of time and money. While determining a treatment's efficacy is an important goal of a clinical trial, the identification of prognostic factors is an equally important component of the analysis. This article describes methods to identify and characterize the effect of potential prognostic factors on disease endpoints, as well as to define differing risk groups. We note that valid comparisons between treatments are possible only after correctly accounting for factors that may affect the course of the disease.

Survival analysis, or failure time data analysis, is concerned with the time from a defined time origin to the occurrence of some given event. In biomedicine the typical example is the time from randomization to a given treatment until the event under study occurs, which yields the observed survival times of the patients involved. The objective is usually to compare the effects of different treatments on survival time while incorporating the information available for each patient. This leads to a statistical regression problem. On many occasions the survival times are incompletely observed, the most common example being right censoring: all that is known about the survival time is that it is larger than an observed right censoring value.

The intention of this paper is to expose the reader to some of the important issues dealt with in these kinds of studies. If the focus is non-parametric, the cumulative hazard function and the survival function can be estimated using the Nelson-Aalen and the Kaplan-Meier estimators, respectively. The proportional hazards model is a popular semi-parametric tool for analyzing censored failure time data; we use failure time as a generic term for the time up to the endpoint of interest. The model is semi-parametric in the sense that it makes no distributional assumptions about the failure times, but does specify the form in which covariates, or prognostic factors, affect the hazard rate of failure. It easily accommodates right censored data, which is the usual case in clinical trials where patients enter at random times but follow-up ends at a fixed time point.
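The two non-parametric estimators named above can be illustrated with a minimal sketch. It is not the authors' code; the function name `km_and_na` and the toy data are hypothetical, and each observation is assumed to be a pair of an observed time and an event indicator (1 for an observed failure, 0 for right censoring).

```python
from collections import Counter


def km_and_na(times, events):
    """Return [(t, S_KM(t), H_NA(t))] at each distinct observed failure time.

    times  : observed times (failure time or right censoring time)
    events : 1 if the failure was observed, 0 if the time is right censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    out, surv, cumhaz = [], 1.0, 0.0
    for t in sorted(deaths):
        # Subjects still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        d = deaths[t]
        surv *= 1.0 - d / at_risk   # Kaplan-Meier product-limit step
        cumhaz += d / at_risk       # Nelson-Aalen increment
        out.append((t, surv, cumhaz))
    return out


# Toy data, invented for illustration only (not from the trial)
toy_times = [2, 3, 3, 5, 7, 8]
toy_events = [1, 1, 0, 1, 0, 1]
for t, s, h in km_and_na(toy_times, toy_events):
    print(t, round(s, 4), round(h, 4))
```

Censored subjects contribute to the risk set up to their censoring time but never to the failure count, which is how right censoring enters both estimators.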

The Cox, or proportional hazards, regression model is often used to model prognostic factors and treatment effects on the failure times simultaneously. We briefly outline the model in the next section and give a way to incorporate tools that can investigate whether a prognostic factor is important, and whether it has a linear or nonlinear relationship to the failure time. The identified model automatically provides estimates of treatment effects related nonlinearly to the prognostic factors. Although the focus here is on censored survival data, the same techniques have been applied to other problems, for example regression modeling for time dependent distributional parameters. While some prognostic factors (usually categorical ones) are linearly related to survival, the influence of others (clinical laboratory values or clinical characteristics) may well be described more accurately by a non-linear relationship.
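To make concrete how the Cox model uses censored data without specifying a baseline hazard, the following sketch evaluates the log partial likelihood for a single covariate. It is illustrative only: the function name `cox_partial_loglik` is hypothetical, and tied failure times are assumed not to occur.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Cox log partial likelihood for one covariate, assuming no tied failure times.

    beta   : regression coefficient (log hazard ratio per unit of x)
    times  : observed times
    events : 1 for an observed failure, 0 for right censoring
    x      : covariate value of each subject
    """
    ll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i == 1:
            # Risk set: everyone still under observation at time t_i
            risk = [math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i]
            ll += beta * x[i] - math.log(sum(risk))
    return ll


# With beta = 0 every subject in a risk set is equally likely to fail,
# so each failure contributes minus the log of the risk set size.
print(cox_partial_loglik(0.0, [1, 2, 3], [1, 1, 1], [0.5, -1.0, 2.0]))
```

Only the ordering of the times and the composition of each risk set enter the expression, which is why no distributional assumption about the failure times is needed.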

The method used, namely an additive proportional hazards model, relaxes the linearity assumptions and allows smooth non-linear functions of the covariates to be included in the log hazard ratio. The advantage is that the transformations involved are not chosen a priori by the analyst, but are estimated flexibly from the data at hand. Another attractive feature is that there is no longer any need to categorize a continuous covariate in order to discover the nature of its effect. Though there are several methods to accommodate the non-linear terms, the methods used here are splines: piecewise polynomials satisfying continuity constraints at the knots joining the pieces. General reviews of splines are given by De Boor, Eubank, and Wahba. Hastie et al., as well as Hastie and Tibshirani, and Gray discuss the effect of the number of knots on the sensitivity of the proposed models, as well as alternatives such as regression splines.
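In symbols, the relaxation described above can be written as follows. The notation is an assumption made here for illustration: h_0 denotes an unspecified baseline hazard, x_1, ..., x_p the covariates, f_j the smooth functions, and lambda_j the smoothing parameters. The penalized criterion shown is the form commonly used with smoothing splines; the text does not state the exact criterion used by the authors.

```latex
% Classical Cox-PH model: covariates enter the log hazard ratio linearly
h(t \mid x) = h_0(t)\, \exp\Bigl( \sum_{j=1}^{p} \beta_j x_j \Bigr)

% Additive proportional hazards model: each linear term becomes a smooth function
h(t \mid x) = h_0(t)\, \exp\Bigl( \sum_{j=1}^{p} f_j(x_j) \Bigr)

% A common smoothing-spline criterion: penalized log partial likelihood
\ell_{\lambda}(f_1, \dots, f_p) = \ell(f_1, \dots, f_p)
  - \tfrac{1}{2} \sum_{j=1}^{p} \lambda_j \int \bigl( f_j''(s) \bigr)^2 \, ds
```

Taking f_j(x_j) = beta_j x_j recovers the classical model, so linearity is a special case of the additive form.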

As mentioned, we aim mainly at studying nonlinear effects of the covariates involved, as well as important interactions between prognostic factors, again with emphasis on examining the shape of the effect; the contribution of an interaction may turn out to be more significant than that of the individual factors. To look for possible interactions between continuous and categorical covariates, separate spline functions are fit for the continuous covariate within the levels of the categorical covariate. Gray proposed a test of the hypothesis that the shape of the function is the same within the levels; this test, however, cannot be regarded as reliable.
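Fitting a separate spline within each level of a categorical covariate amounts to building one block of spline columns per level. The sketch below shows one way to do this under stated assumptions: it uses a truncated power basis for a cubic regression spline in place of the smoothing splines used in this study, and the function names, knot positions and level labels are hypothetical.

```python
def cubic_spline_basis(x, knots):
    """Truncated power basis of a cubic regression spline: x, x^2, x^3, (x - k)_+^3."""
    return [x, x ** 2, x ** 3] + [max(x - k, 0.0) ** 3 for k in knots]


def by_level_design(x, level, levels, knots):
    """Design row with one block of spline columns per level of the categorical covariate.

    Only the block belonging to the subject's own level is filled in, so each
    level receives its own set of spline coefficients, that is, its own curve.
    """
    basis = cubic_spline_basis(x, knots)
    row = []
    for lv in levels:
        row.extend(basis if lv == level else [0.0] * len(basis))
    return row


# Hypothetical subject: continuous covariate 2.0, categorical level "B", one knot at 1.0
print(by_level_design(2.0, "B", ["A", "B"], [1.0]))
```

In a full model an indicator column for each level would normally be added as well, so that the curves may differ in height as well as in shape.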

The subject of examining the proportional hazards model assumptions has generated an extensive literature. Cox, in his original paper, suggested fitting time varying covariates, and Zucker and Karr proposed using non-parametric penalized likelihood estimation to fit flexible functions of time for the covariates.

Discussion.

In this work we described different approaches to modeling. The procedures described are useful for a variety of reasons. Firstly, they help to identify a proper model for the data at hand, since they give valuable insight into the behavior of the hazard function for various prognostic factors, thus preventing model misspecification, which can lead to incorrect conclusions about treatment efficacy. Secondly, they provide information about the relationship between the covariates and disease risk that goes beyond the standard techniques. For example, we found that the relationship with hemoglobin concentration is U shaped, so that the best fitting straight line would have zero slope; this is why hemoglobin concentration was not identified as a significant predictor in the linear analysis. We should stress that linearity remains a special case, and thus linear relationships can easily be confirmed with flexible modeling, as was the case for the effect of tumor size. Lastly, these methods can be used beyond the modeling of prognostic factors for patient care or the planning of future trials, as an investigative tool in a variety of settings.
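The remark about the U shaped effect can be checked on synthetic data. In the sketch below, ordinary least squares stands in for the linear term of a hazard model, and the covariate values are invented for illustration (they are not the trial's hemoglobin data). A response that is an exact, symmetric U shape in the covariate gives a fitted straight line with essentially zero slope.

```python
def ols_slope(x, y):
    """Slope of the least-squares straight line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Synthetic covariate centered at zero and a perfectly U shaped response
x = [i / 10 for i in range(-10, 11)]
y = [v ** 2 for v in x]

print(ols_slope(x, y))                     # linear term: slope is numerically zero
print(ols_slope([v ** 2 for v in x], y))   # against x squared the slope is one
```

A linear analysis would therefore report no effect for such a covariate even though the response depends on it completely, which is the situation a flexible term is designed to reveal.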

Finally, a different approach to flexible survival analysis, the regression tree approach of Segal, focuses on identifying subgroups of individuals with similar survival. The two approaches are somewhat complementary in that the additive model seeks smooth main effects, while the regression tree technique is designed to detect interactions between variables. The authors have investigated decision tree issues for relapse times in the same trial; a comparison between these two methods would be an interesting topic for further study.

 

Acknowledgement. The authors wish to thank N.A. Ibrahim for supplying the data.





© 1995-2008 Kazan State University