Pancreatic cancer is a common disease worldwide with a very low survival rate. Recent studies have shown that survival has improved over the last few decades. However, changes in the survival of patients after pancreatoduodenectomy (PD, a major treatment for pancreatic cancer) have not yet been explored. This project used Hospital Episode Statistics (HES) data to investigate the postoperative and long-term survival of patients in England after PD between 2001 and 2014, with respect to 6 variables: gender, age, ethnicity, Charlson score, IMD (Index of Multiple Deprivation) and centre volume. We found that age and Charlson score are the most significant factors affecting patients' survival after PD. IMD and centre volume are also significant. We analysed the importance of centralisation and found that the optimal annual volume for a centre may be around 30. In addition, we compared the predictive performance of three kinds of models: the Cox proportional hazards regression model (linear model), the Cox model with lasso penalty (glmnet) and the random forest model (non-linear model). We found that the linear model works better than the non-linear model on this dataset.
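The summary above does not say which metric was used to compare the three models. One common choice for survival models is Harrell's concordance index (C-index), so the following is only a minimal sketch of that idea, not the project's actual evaluation code. The function name `concordance_index` and the toy data are hypothetical and are not taken from the HES dataset.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       -- observed follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's
            # last observed time, so the ordering of the two is known.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk died sooner
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: four patients, one of them censored.
toy_times = [5, 10, 12, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value near 1 means the model ranks patients in the right order, 0.5 is no better than chance, and values below 0.5 mean the ranking is inverted. Under this assumption, the same function could be applied to the risk scores of each model on a held-out set to compare them.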
- The report and supplementary material are in the folder report.
- All code can be found in the folder src.
All content in this repository is openly licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license. This means you may not use the material for commercial purposes, and if you remix, transform, or build upon the material, you may not distribute the modified material. You are free to copy and redistribute the material as long as you credit the source.
If you use content from this repo in your own work, please credit me with a sentence like:
The material is (partially) derived from Zhangdaihong Liu's project "Survival Following Pancreatoduodenectomy in England Perspectives from the HES Database".