- Research Note
- Open access
Factors associated with time to relapse in pulmonary tuberculosis patients using penalized Cox models
BMC Research Notes volume 17, Article number: 333 (2024)
Abstract
Background
Pulmonary tuberculosis (TB) is a contagious bacterial infection caused by Mycobacterium tuberculosis that primarily affects the lungs. Despite advances in treatment, TB remains a major public health challenge, particularly in developing countries. This retrospective cohort study aimed to identify factors that influence the time to relapse in patients with pulmonary tuberculosis.
Methods
The study included all 1548 smear-positive pulmonary tuberculosis patients whose data were recorded in the tuberculosis registration system of Golestan University of Medical Sciences and Health Services from March 2014 to March 2019 and who were followed up until February 2021. To analyze the data, we applied LASSO, MCP, and SCAD penalized Cox models.
Results
Based on the AIC and BIC criteria, the MCP penalized model performed better than the LASSO and SCAD penalized models. According to the MCP penalized Cox regression model, the variables that influenced time to relapse were age at diagnosis (> 35 vs. \(\le\) 35, HR = 2.77), marital status (married vs. single, HR = 8.49), chronic renal failure (yes vs. no, HR = 5.36), and pre-treatment smear test results (+ 1 vs. 1–9 bacilli, HR = 0.65 and + 2 vs. 1–9 bacilli, HR = 0.96).
Conclusions
Health care systems should focus on identification of factors that influence time to relapse and developing interventions to reduce relapse rates.
Introduction
Tuberculosis (TB) is one of the oldest contagious infectious diseases in humans caused by Mycobacterium tuberculosis [1, 2]. TB is a preventable and usually curable disease. However, in 2022, TB was the second leading cause of death from an infectious agent, after COVID-19, and caused nearly twice as many deaths as HIV/AIDS. Approximately 10 million people worldwide are infected with TB annually [1]. Tuberculosis usually attacks the lungs. Therefore, the most common type of tuberculosis is pulmonary tuberculosis [3]. Patients who have undergone complete treatment and achieved recovery remain susceptible to tuberculosis relapse. Individuals experiencing a relapse exhibit reduced treatment success rates and higher mortality compared to those initially infected [4].
Recognizing the risks of TB relapse for patients, identifying the factors influencing the relapse of TB can help to provide targeted care and more monitoring after TB treatment. Several studies have investigated the factors affecting patient survival and TB relapse. Factors such as drug resistance [5], smoking [6], HIV infection [6, 7], chronic lung disease [8], drug use [9] and positive sputum smear [4] have been associated with an increased risk of TB relapse. These factors may increase the risk of relapse due to the potential persistence of mycobacteria in patients after the treatment period. The aim of this study is to identify the factors that influence the time to TB relapse in treated patients using penalized methods.
Materials and methods
Data
In this retrospective cohort study, we used a dataset containing information on 1548 patients with smear-positive (bacteriologically confirmed) pulmonary tuberculosis. The dataset comprised all such patients whose data were recorded in the tuberculosis registration system of Golestan University of Medical Sciences and Health Services from March 2014 to March 2019 and who were followed up until February 2021. Information on the following factors was available in the dataset: age at diagnosis (years), duration of treatment (days), delayed diagnosis (days), delayed treatment (days), gender (female/male), marital status (single/married), level of education (illiterate/high school/university education), residence (urban/rural), BMI (kg/m²), currently incarcerated (yes/no), prison history (yes/no), injection drug addiction (yes/no), history of TB contact (yes/no), HIV infection (yes/no), diabetes (yes/no), chronic renal failure (yes/no), smear test results before treatment (1–9 bacilli/+ 1/+ 2/+ 3), smear test results after 2 months (1–9 bacilli/+ 1/+ 2/+ 3/negative), and chest X-ray result (non-suggestive/less suggestive/more suggestive). The outcome of interest in this study was time to relapse.
Ethical considerations
The study followed the ethical guidelines for the participation of human subjects as proposed in the Declaration of Helsinki, and formal ethical approval was obtained from Hamadan University of Medical Sciences. During data collection, participants were informed of the study objectives and other ethical issues related to the study. Written informed consent was obtained from all participants. The study was approved by the ethics committee of Hamadan University of Medical Sciences (ethics code: IR.UMSHA.REC.1400.411).
Statistical analysis
Penalized Cox regression was used to select important factors associated with time to relapse. We used the group least absolute shrinkage and selection operator (LASSO), group minimax concave penalty (MCP), and group smoothly clipped absolute deviation (SCAD) penalties in the Cox regression model. The methods were compared using the Akaike Information Criterion (AIC), and the results of the best model, defined as the one with the minimum AIC, were considered. Analysis was performed using R software version 4.1.1 (URL http://www.R-project.org) and the grpreg package. Missing values were imputed using the predictive mean matching (PMM) method from the mice package.
Penalized regression models
Penalized regression methods are preferred by researchers because they perform variable selection and coefficient estimation simultaneously. These methods add a penalty function to the likelihood function that applies different penalties to the regression coefficients. This facilitates shrinkage and selection, with the coefficient estimation achieved by maximizing the log-likelihood function under a constraint [10, 11]. Various penalty methods have been proposed for variable selection, including LASSO, MCP and SCAD. These methods improve model interpretability and reduce overfitting by eliminating predictor variables that are unrelated to the response variable [12,13,14].
Cox regression is a commonly used method for analyzing survival data. However, traditional Cox regression can lead to overfitting, particularly when dealing with many covariates. The decision to use penalized Cox regression instead of the traditional model stems from the need to address potential multicollinearity while also performing variable selection. Penalized methods allow the coefficients of less important variables to be reduced to zero, resulting in a more parsimonious model that improves predictive accuracy. By using penalized Cox regression, we achieve a more robust and interpretable model [12, 14].
LASSO penalty
The LASSO penalty, introduced by Tibshirani, is defined by a penalty function that depends on a tuning parameter, λ. This parameter is crucial in variable selection and affects the regression coefficients [14]. The LASSO penalty is defined as follows:
\[{P}_{\lambda }\left({\beta }_{j}\right)=\lambda \left|{\beta }_{j}\right|\]
where \(\lambda \ge 0\).
SCAD penalty
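The SCAD penalty was proposed by Fan and Li [12]. The SCAD penalty takes the following form:
\[{P}_{\lambda ,\alpha }\left({\beta }_{j}\right)=\begin{cases}\lambda \left|{\beta }_{j}\right|, & \left|{\beta }_{j}\right|\le \lambda \\ \frac{2\alpha \lambda \left|{\beta }_{j}\right|-{\beta }_{j}^{2}-{\lambda }^{2}}{2\left(\alpha -1\right)}, & \lambda <\left|{\beta }_{j}\right|\le \alpha \lambda \\ \frac{{\lambda }^{2}\left(\alpha +1\right)}{2}, & \left|{\beta }_{j}\right|>\alpha \lambda \end{cases}\]
where \(\lambda>0\) and \(\alpha>2\) are tuning parameters.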
MCP penalty
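The MCP penalty was proposed by Zhang [15]. It is expressed as follows:
\[{P}_{\lambda ,\alpha }\left({\beta }_{j}\right)=\begin{cases}\lambda \left|{\beta }_{j}\right|-\frac{{\beta }_{j}^{2}}{2\alpha }, & \left|{\beta }_{j}\right|\le \alpha \lambda \\ \frac{\alpha {\lambda }^{2}}{2}, & \left|{\beta }_{j}\right|>\alpha \lambda \end{cases}\]
where \(\lambda>0\) and \(\alpha>1\) are tuning parameters.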
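To make the three penalties concrete, the following minimal Python sketch evaluates each one for a single coefficient. It assumes the standard per-coefficient forms of the LASSO [14], SCAD [12] and MCP [15] penalties; the function names and the default values of \(\alpha\) are illustrative choices and are not settings reported in this study.

```python
def lasso_penalty(beta, lam):
    """LASSO penalty: grows linearly with |beta| (lam >= 0)."""
    return lam * abs(beta)


def scad_penalty(beta, lam, alpha=3.7):
    """SCAD penalty (requires lam > 0, alpha > 2); alpha=3.7 is an illustrative default."""
    b = abs(beta)
    if b <= lam:
        # Small coefficients: identical to the LASSO penalty
        return lam * b
    if b <= alpha * lam:
        # Intermediate coefficients: quadratic transition
        return (2 * alpha * lam * b - b * b - lam * lam) / (2 * (alpha - 1))
    # Large coefficients: constant penalty, so no further shrinkage
    return lam * lam * (alpha + 1) / 2


def mcp_penalty(beta, lam, alpha=3.0):
    """MCP penalty (requires lam > 0, alpha > 1); alpha=3.0 is an illustrative default."""
    b = abs(beta)
    if b <= alpha * lam:
        # Penalization rate decreases steadily as |beta| grows
        return lam * b - b * b / (2 * alpha)
    # Beyond alpha * lam the penalty is flat
    return alpha * lam * lam / 2


if __name__ == "__main__":
    for beta in (0.5, 2.0, 10.0):
        print(beta,
              lasso_penalty(beta, 1.0),
              scad_penalty(beta, 1.0),
              mcp_penalty(beta, 1.0))
```

For large coefficients the LASSO penalty keeps growing, whereas the SCAD and MCP penalties level off, which is why the latter two shrink large effects less.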
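In penalized regression, a penalty function is combined with the log-likelihood function to penalize the coefficients, and the coefficients are estimated by maximizing the penalized log-likelihood:
\[Q\left(\beta \right)=l\left(\beta \right)-\sum_{j=1}^{k}P\left(\left|{\beta }_{j}\right|\right)\]
where \(l(\beta )\) is the log-likelihood function, P(.) is the penalty function, \(\beta ={({\beta }_{1},\dots ,{\beta }_{p})}^{T}\) is the vector of regression coefficients, and \(k\) is the number of explanatory variables [13].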
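As an illustration of how a penalty enters the estimation, the sketch below evaluates a penalized Cox log partial likelihood on a toy dataset. It is a simplified example under several assumptions: it uses the MCP penalty, assumes no tied event times, and applies no scaling by sample size (software such as grpreg may scale the likelihood differently). All function names and data values are hypothetical.

```python
import math


def cox_partial_loglik(beta, times, events, covariates):
    """Cox log partial likelihood, assuming no tied event times."""
    # Linear predictor x_i' beta for every subject
    eta = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 1:
            # Risk set: subjects still under observation at time t_i
            risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
            loglik += eta[i] - math.log(risk)
    return loglik


def mcp(beta_j, lam, alpha):
    """MCP penalty for one coefficient (lam > 0, alpha > 1)."""
    b = abs(beta_j)
    if b <= alpha * lam:
        return lam * b - b * b / (2 * alpha)
    return alpha * lam * lam / 2


def penalized_loglik(beta, times, events, covariates, lam, alpha=3.0):
    """Log partial likelihood minus the summed penalty (to be maximized)."""
    penalty = sum(mcp(b, lam, alpha) for b in beta)
    return cox_partial_loglik(beta, times, events, covariates) - penalty


if __name__ == "__main__":
    # Hypothetical data: follow-up time, relapse indicator, one binary covariate
    times = [1.0, 2.0, 3.0]
    events = [1, 1, 0]
    covariates = [[0.0], [1.0], [0.0]]
    print(penalized_loglik([0.5], times, events, covariates, lam=0.1))
```

In practice the maximization over \(\beta\) and the choice of \(\lambda\) are handled by the fitting software; the sketch only shows the quantity being optimized.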
Results
Of the 1548 patients with smear-positive pulmonary tuberculosis in this study, 40 (2.6%) relapsed and 1508 (97.4%) did not. The mean follow-up time was 133.98 months, and the median (IQR) follow-up time was 136.87 (96.16, 175.64) months. The mean age of patients at diagnosis was 49.63 ± 20.26 years, with a range of 9 to 92 years. The mean duration of treatment was 210.11 ± 45.50 days, with a minimum and maximum of 180 and 557 days, respectively. The demographic, clinical and laboratory characteristics of all study participants are listed in Table 1. According to Table 1, the majority of patients were male (51.2%), married (84.8%), with high school education (48.8%), living in rural areas (59.3%), with normal BMI (48.7%), with no history of imprisonment (85.9%), without injection drug addiction (98.7%), without HIV infection (99.7%), without diabetes (89.8%), and without chronic renal failure (98.9%). A history of contact with pulmonary TB patients was reported by 40.7% of participants. Among the patients, 74.2% had chest X-ray findings highly suggestive of TB, while 20.6% had findings less suggestive of TB. Furthermore, the smear test results before treatment were + 1 in 37.7% of patients, + 2 in 20.8%, and + 3 in 34.5%.
Figure 1 shows the Kaplan–Meier estimate of the survival function for time to relapse of pulmonary tuberculosis in 1054 patients, with 95% confidence intervals and the number at risk over time. The survival curve indicates that the risk of relapse is high early in the patient's recovery and decreases after 130 months.
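For readers unfamiliar with the Kaplan–Meier estimator, the following minimal Python sketch shows how the relapse-free survival probability is updated at each observed relapse time. The data are hypothetical toy values, not the study data, and the function name is illustrative; confidence intervals are omitted for brevity.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if relapse was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    for t in event_times:
        # Patients still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        # Relapses observed exactly at time t
        relapses = sum(1 for u, d in zip(times, events) if u == t and d == 1)
        survival *= 1.0 - relapses / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times (months) and relapse indicators
    toy_times = [2, 3, 3, 5, 8]
    toy_events = [1, 1, 0, 1, 0]
    for t, s in kaplan_meier(toy_times, toy_events):
        print(f"t = {t}: S(t) = {s:.3f}")
```

Censored patients contribute to the number at risk until their last follow-up time but never trigger a drop in the curve.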
The results of the LASSO, MCP, and SCAD penalized Cox models are shown in Table 2. The MCP penalized model had the lowest AIC value (570.416) compared to the LASSO (578.682) and SCAD (571.778) models. BIC results indicated that the MCP and SCAD models were similarly effective, both outperforming the LASSO model. However, the Likelihood Ratio Test showed no statistically significant differences between the models: (LASSO vs. MCP: LRT = 3.71; p = 0.29), (LASSO vs. SCAD: LRT = 2.40; p = 0.50), and (MCP vs. SCAD: LRT = 1.31; p = 0.99).
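The model comparison can be sketched in a few lines of Python. The AIC values below are those reported in the text; the helper functions use the usual definitions AIC = −2ℓ + 2k and BIC = −2ℓ + k·log(n), where the log-likelihood ℓ, the number of non-zero coefficients k, and the sample size n are inputs that are not reported here (for Cox models, whether n counts patients or events is a convention that varies between software, so it is left as an argument).

```python
import math


def aic(loglik, k):
    """Akaike Information Criterion for log-likelihood `loglik` and k parameters."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Bayesian information criterion; n is the sample size used by convention."""
    return -2.0 * loglik + k * math.log(n)


# AIC values reported for the three penalized Cox models
reported_aic = {"LASSO": 578.682, "MCP": 570.416, "SCAD": 571.778}

# The preferred model is the one with the smallest AIC
best = min(reported_aic, key=reported_aic.get)

# Difference of each model's AIC from the best model
delta_aic = {m: round(v - reported_aic[best], 3) for m, v in reported_aic.items()}

if __name__ == "__main__":
    print("Best model by AIC:", best)
    print("AIC differences:", delta_aic)
```

The small gap between MCP and SCAD relative to the gap between either of them and LASSO is in line with the bootstrap comparison reported in the next paragraph.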
To compare the performance of the MCP, LASSO and SCAD models, bootstrap sampling with 10,000 iterations was used to estimate the mean AIC value for each model. The ANOVA results indicated statistically significant differences among the models (F = 78.33; p < 0.001). Tukey's HSD test further revealed that both the MCP and SCAD models significantly differed from the LASSO model (MCP vs. LASSO: mean difference = − 8.42; p < 0.001 and SCAD vs. LASSO: mean difference = − 7.14; p < 0.001), while there was no significant difference between the MCP and SCAD models (MCP vs. SCAD: mean difference = 1.47; p = 0.97).
For both the MCP and SCAD penalties, age at diagnosis, marital status, and pre-treatment smear test results were important factors influencing TB relapse. Additionally, chronic renal failure was selected by the MCP penalty, while BMI was selected by the SCAD penalty. Based on our analysis, the MCP method demonstrated the best performance according to both AIC and BIC criteria. Therefore, we have chosen to explain the variables based on the MCP penalty.
According to the results of the MCP penalized model, the hazard of relapse for patients over 35 years of age was 2.77 times that of patients aged 35 years or younger \(\left(\beta =1.02,\text{exp}\left(\beta \right)=2.77\right)\). Married patients had a hazard of relapse nearly nine times that of single patients \(\left(\beta =2.14,\text{exp}\left(\beta \right)=8.49\right)\). The hazard of relapse among patients with chronic renal failure was 5.36 times that of patients without chronic renal failure \(\left(\beta =1.68,\text{exp}\left(\beta \right)=5.36\right)\). Conversely, patients with a pre-treatment smear result of + 1 had a 35% lower hazard of relapse than those with a result of 1–9 bacilli \(\left(\beta =-0.43,\text{exp}\left(\beta \right)=0.65\right)\). Additionally, patients with a pre-treatment smear result of + 2 had a 4% lower hazard of relapse than those with a result of 1–9 bacilli \(\left(\beta =-0.04,\text{exp}\left(\beta \right)=0.96\right)\).
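The relationship between the coefficients and the hazard ratios quoted above is HR = exp(β), and the percentage change in hazard is (HR − 1) × 100. The short Python sketch below checks this for the reported values; the dictionary labels and the function name are illustrative. Because the published coefficients are rounded to two decimals, the check goes from each reported hazard ratio back to its coefficient.

```python
import math

# (beta, hazard ratio) pairs as reported for the MCP penalized Cox model
reported = {
    "age > 35 vs <= 35": (1.02, 2.77),
    "married vs single": (2.14, 8.49),
    "chronic renal failure, yes vs no": (1.68, 5.36),
    "smear +1 vs 1-9 bacilli": (-0.43, 0.65),
    "smear +2 vs 1-9 bacilli": (-0.04, 0.96),
}


def percent_change(hazard_ratio):
    """Percentage change in hazard relative to the reference group."""
    return (hazard_ratio - 1.0) * 100.0


if __name__ == "__main__":
    for label, (beta, hr) in reported.items():
        print(f"{label}: log(HR) = {math.log(hr):.2f}, "
              f"reported beta = {beta:.2f}, "
              f"change in hazard = {percent_change(hr):+.0f}%")
```

A hazard ratio below 1, as for the smear grades, corresponds to a negative coefficient and a lower hazard than in the reference group.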
As shown in Fig. 2, the time to TB relapse was shorter in patients over 35 years of age compared to those under 35, and in married patients compared to single patients. Furthermore, patients with chronic renal failure experienced a shorter time to TB relapse than those without renal failure.
Discussion
In this study, we addressed risk factors for time to relapse in patients with pulmonary tuberculosis using penalized models. Specifically, we employed the LASSO, MCP, and SCAD penalized Cox models. Based on both the AIC and BIC criteria, the MCP penalty emerged as the best approach for identifying factors that influence patients' time to relapse.
Our findings indicate that age is a significant risk factor for time to relapse. Consistent with previous studies [4, 16], the risk of TB relapse for patients over 35 years of age was approximately 3 times that of patients aged 35 years or younger; that is, patients over 35 years of age experienced a shorter time to relapse than younger patients. However, in Cudahy's study, the risk of TB relapse was significantly higher in younger people. This discrepancy may be due to differences in study populations and clinical factors. Specifically, the average age of participants in Cudahy's study was approximately 35 years, while in our study, the average age was nearly 50 years. These age differences could influence the observed risk factors and highlight the need for further investigation into the factors contributing to these differing outcomes [4].
This study also showed that marital status was a risk factor for time to relapse: the risk of TB relapse for married patients was approximately nine times that of single patients. Consistent with our findings, marital status was recognized as an important factor for TB relapse in the studies by Hill et al. [17] and Dizaji et al. [16]. In Hill’s study, the expected time to relapse for divorced or widowed patients was shorter than that for married patients, and the risk of TB relapse for divorced or widowed patients was nearly 3 times that of married patients. Additionally, the difference in the risk of TB relapse between single and married patients was not statistically significant [17].
The study also revealed that the bacilli density in the initial smear taken before treatment was a significant factor for time to relapse. The association between smear grade and TB relapse may be attributed to a higher mycobacterial load and the presence of cavitary disease, leading to an increased risk of relapse after completing treatment. The relationship between smear grade and both relapse and unsuccessful TB treatment has been reported in other studies [4, 18, 19].
This study also found that patients with chronic renal failure had a shorter expected time to TB relapse. Several studies have reported an increased risk of TB infection in patients with renal diseases. Patients with kidney disease have an increased susceptibility to TB infection, in part due to suppression of the immune system [20,21,22]. Some studies have identified variables such as a history of TB contact [7, 16, 23], HIV coinfection [7, 16, 18, 23], diabetes [23] and educational level [7] as factors influencing TB relapse; these were not identified in this study. In addition, gender, chest X-ray results and a history of imprisonment were not found to have a significant effect on relapse, either in this study or in other studies with similar findings [4, 18].
Conclusion
This study has identified several risk factors that influence the time to occurrence of TB relapse, including age, marital status, initial smear bacilli density, and chronic renal failure. Understanding these factors can help healthcare providers design more effective prevention and intervention programs to reduce the risk of TB relapse and enhance patient care. Furthermore, this information can assist health authorities in developing improved public health initiatives aimed at controlling TB.
Limitations
The data utilized in this study were obtained from the Tuberculosis Registration System and Health Services. However, certain variables, such as smoking and alcohol consumption, were not recorded. Variables with missing values were excluded from the analysis, and those with less than 5% missing values were imputed.
Availability of data and materials
The data are available upon request from the corresponding author.
Abbreviations
- AIC: Akaike Information Criterion
- BIC: Bayesian information criterion
- LASSO: Least absolute shrinkage and selection operator
- LRT: Likelihood ratio test
- MCP: Minimax concave penalty
- SCAD: Smoothly clipped absolute deviation
- SD: Standard deviation
- TB: Tuberculosis
References
World Health Organization. Tuberculosis. Available from: https://www.who.int/news-room/fact-sheets/detail/tuberculosis. Accessed 2024
Weiangkham D, Umnuaypornlert A, Saokaew S, Prommongkol S, Ponmark J. Effect of alcohol consumption on relapse outcomes among tuberculosis patients: a systematic review and meta-analysis. Front Public Health. 2022;10:962809.
Torshizi F, Honarvar M, Rahimarbabi E, Sheikhy M, Hajiebrahimi M, Behnampour N. Incidence and treatment outcomes of pulmonary tuberculosis in Islamic Republic of Iran. East Mediterr Health J. 2023;29(6):417.
Cudahy PGT, Wilson D, Cohen T. Risk factors for recurrent tuberculosis after successful treatment in a high burden setting: a cohort study. BMC Infect Dis. 2020;20:1–8.
Sun Y, Harley D, Vally H, Sleigh A. Impact of multidrug resistance on tuberculosis recurrence and long-term outcome in China. PLoS ONE. 2017;12(1): e0168865.
Yen Y, Yen M, Lin Y, Lin Y, Shih H, Li L, et al. Smoking increases risk of recurrence after successful anti-tuberculosis treatment: a population-based study. Int J Tuberc Lung Dis. 2014;18(4):492–8.
Diriba K, Awulachew E. Associated risk factor of tuberculosis infection among adult patients in Gedeo Zone, Southern Ethiopia. SAGE Open Med. 2022;10:20503121221086724.
Gadoev J, Asadov D, Harries AD, Parpieva N, Tayler-Smith K, Isaakidis P, et al. Recurrent tuberculosis and associated factors: a five-year countrywide study in Uzbekistan. PLoS ONE. 2017;12(5): e0176473.
Dooley KE, Lahlou O, Ghali I, Knudsen J, Elmessaoudi MD, Cherkaoui I, et al. Risk factors for tuberculosis treatment failure, default, or relapse and outcomes of retreatment in Morocco. BMC Public Health. 2011;11:1–7.
Arayeshgari M, Tapak L, Roshanaei G, Poorolajal J, Ghaleiha A. Application of group smoothly clipped absolute deviation method in identifying correlates of psychiatric distress among college students. BMC Psychiatry. 2020;20(1):1–11.
Fan J, Lv J. A selective overview of variable selection in high dimensional feature space. Stat Sin. 2010;20(1):101.
Fan J, Li R. Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc. 2001;96(456):1348–60.
Ogutu JO, Piepho H-P. Regularized group regression methods for genomic prediction: bridge, MCP, SCAD, group bridge, group lasso, sparse group lasso, group MCP and group SCAD. BMC Proc. 2014. https://doi.org/10.1186/1753-6561-8-S5-S7.
Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Stat Methodol. 1996;58(1):267–88.
Zhang C-H. Nearly unbiased variable selection under minimax concave penalty. 2010.
Dizaji MK, Kazemnejad A, Tabarsi P, Zayeri F. Using competing risks model and competing events in outcome of pulmonary tuberculosis patients. Int J Mycobacteriol. 2016;5(Suppl 1):S237.
Hill PC, Jackson-Sillah D, Donkor SA, Otu J, Adegbola RA, Lienhardt C. Risk factors for pulmonary tuberculosis: a clinic-based case control study in The Gambia. BMC Public Health. 2006;6:1–7.
Charoensakulchai S, Lertpheantum C, Aksornpusitpong C, Trakulsuk P, Sakboonyarat B, Rangsin R, et al. Six-year trend and risk factors of unsuccessful pulmonary tuberculosis treatment outcomes in Thai Community Hospital. BMC Res Notes. 2021;14:1–8.
Nazar E, Baghishani H, Doosti H, Ghavami V, Aryan E, Nasehi M, et al. Bayesian Spatial Survival Analysis of Duration to Cure among New Smear-Positive Pulmonary Tuberculosis (PTB) Patients in Iran, during 2011–2018. Int J Environ Res Public Health. 2021;18(1):54.
Ruzangi J, Iwagami M, Smeeth L, Mangtani P, Nitsch D. The association between chronic kidney disease and tuberculosis; a comparative cohort study in England. BMC Nephrol. 2020;21:1–9.
Romanowski K, Clark EG, Levin A, Cook VJ, Johnston JC. Tuberculosis and chronic kidney disease: an emerging global syndemic. Kidney Int. 2016;90(1):34–40.
Milburn H, Ashman N, Davies P, Doffman S, Drobniewski F, Khoo S, et al. Guidelines for the prevention and management of Mycobacterium tuberculosis infection and disease in adult patients with chronic kidney disease. Thorax. 2010;65(6):559–70.
Kirenga BJ, Ssengooba W, Muwonge C, Nakiyingi L, Kyaligonza S, Kasozi S, et al. Tuberculosis risk factors among tuberculosis patients in Kampala, Uganda: implications for tuberculosis control. BMC Public Health. 2015;15:1–7.
Acknowledgements
We thank the Vice-Chancellor of Education of Hamadan University of Medical Sciences for technical support and for approving and supporting this work.
Funding
This study was supported and approved by Hamadan University of Medical Sciences (Grant No: 140006094622). The funding body had no role in the design of the study, the collection of data, or the writing of the manuscript.
Author information
Contributions
LT, ZM and GR conceived the research topic, explored that idea, performed the statistical analysis and drafted the manuscript. GR and NB participated in the interpretations and drafting of the manuscript. All authors contributed to the final version of the manuscript; revised the article critically for important intellectual content and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
This study was submitted to and approved by the Ethical Committee of Hamadan University of Medical Sciences (IR.UMSHA.REC.1400.411). Informed written consent was obtained from all participants.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Mehrbakhsh, Z., Roshanaei, G., Behnampour, N. et al. Factors associated with time to relapse in pulmonary tuberculosis patients using penalized Cox models. BMC Res Notes 17, 333 (2024). https://doi.org/10.1186/s13104-024-06986-3
DOI: https://doi.org/10.1186/s13104-024-06986-3