In \cite{kras}, feature importance analysis was employed to identify genes associated with resistance to KRAS G12C inhibitor treatment in cancer cells. The authors used seven different feature ranking algorithms: LASSO, LightGBM, MCFS, mRMR, RF-based, CatBoost, and XGBoost. Because these algorithms rank features according to different principles, combining them enabled a comprehensive evaluation of gene significance. To refine the selection, the authors applied Incremental Feature Selection (IFS), testing the performance of classifiers such as Decision Tree (DT), k-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machine (SVM) on growing prefixes of each ranked feature list. This analysis highlighted several key genes, such as H2AFZ, CKS1B, and TUBA1B, which were consistently ranked highly across multiple algorithms and are linked to tumor progression and drug resistance.
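The IFS step can be sketched in a few lines: a classifier is evaluated on the top-$k$ features for increasing $k$, and the prefix with the best cross-validated score is kept. This is a minimal illustration, not the study's actual pipeline; the synthetic data and the variance-based ranking below are stand-in assumptions (the paper derives its rankings from LASSO, LightGBM, MCFS, and the other listed algorithms).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def incremental_feature_selection(X, y, ranked_features, estimator, cv=3):
    """Evaluate the estimator on growing prefixes of a ranked feature list
    and return the prefix length with the best mean CV accuracy."""
    scores = []
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        scores.append(cross_val_score(estimator, X[:, subset], y, cv=cv).mean())
    best_k = int(np.argmax(scores)) + 1
    return best_k, scores

# Synthetic stand-in for an expression matrix (samples x genes)
X, y = make_classification(n_samples=120, n_features=10,
                           n_informative=3, random_state=0)
# Hypothetical ranking by variance, in place of the paper's seven rankers
ranking = list(np.argsort(-np.var(X, axis=0)))
best_k, scores = incremental_feature_selection(
    X, y, ranking, DecisionTreeClassifier(random_state=0))
```

In the study this loop is repeated for every ranking algorithm and every classifier, and genes appearing in the best prefixes across rankers are treated as the consensus signature.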
In \cite{glut}, univariate Cox regression analysis was employed to identify immuno-senescence-related genes with prognostic significance in pancreatic cancer. The Cox proportional hazards model was used to assess the relationship between the expression of each individual gene and overall survival. The hazard ratio (HR) for each gene was estimated, and genes with a p-value below 0.01 were retained as meaningful features for subsequent analysis, since they were considered to have a potential impact on the prognosis of pancreatic cancer patients. This approach allows for the identification of genes that might serve as independent prognostic biomarkers.
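The screening step above can be sketched with a hand-rolled univariate Cox fit: for one gene, maximize the Breslow partial likelihood by Newton's method, then report the hazard ratio and a two-sided Wald p-value. This is a minimal numpy sketch, not the study's code (which would typically use a survival package); the `cox_univariate` helper and the toy one-gene data are illustrative assumptions.

```python
import math
import numpy as np

def cox_univariate(time, event, x, n_iter=30):
    """Newton-Raphson fit of a one-covariate Cox model (Breslow partial
    likelihood, no tied event times in this toy example)."""
    order = np.argsort(-time)            # descending time: risk set = prefix
    d, xv = event[order], x[order]
    beta = 0.0
    for _ in range(n_iter):
        w = np.exp(beta * xv)
        s0, s1, s2 = np.cumsum(w), np.cumsum(w * xv), np.cumsum(w * xv * xv)
        grad = np.sum(d * (xv - s1 / s0))
        hess = -np.sum(d * (s2 / s0 - (s1 / s0) ** 2))
        beta -= grad / hess              # Newton step on the concave log-likelihood
    se = 1.0 / math.sqrt(-hess)
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))  # two-sided Wald p-value
    return math.exp(beta), p             # hazard ratio, p-value

# Toy data: one "gene" (1 = high expression), all deaths observed
time = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
event = np.ones(8)
expr = np.array([1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0])
hr, p = cox_univariate(time, event, expr)
```

In the study this fit is repeated per gene, and only genes with p < 0.01 pass to the downstream model.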
The authors of \cite{cervical} used feature importance analysis based on the Random Forest (RF) model to identify key SNPs related to neoadjuvant chemotherapy (NACT) sensitivity in locally advanced cervical cancer (LACC) patients. The importance of each feature was calculated by assessing its impact on impurity reduction at each node of the RF model, with a larger decrease in impurity indicating greater feature importance. The mean decrease in impurity (MDI) was obtained as the total decrease in impurity averaged over all decision trees. The impurity \(g\) of a split was computed as:
\[