PD biomarkers
- 2 Devlogs
- 3 Total hours
Transparent "Glass-Box" XAI pipeline using Boruta + SHAP for discovering robust serum exosome miRNA biomarkers in Parkinson's Disease (GSE269781 dataset).
Yesterday XGBoost wasn't really working, so today I switched to logistic regression and it worked perfectly. My ROC AUC and AP (average precision) are now pretty solid, and I can trust the biomarkers that my model extracts.
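For reference, here is a minimal sketch of what AP measures, written in plain Python so that nothing is hidden. The labels and scores are made-up toy values, not results from this project, and the helper name `average_precision` is hypothetical. In practice a library function such as scikit-learn's `average_precision_score` would normally be used instead.

```python
def average_precision(y_true, scores):
    """Mean of precision@k over the ranks k where a true positive appears.

    Simplified sketch: assumes no tied scores and at least one positive label.
    """
    # Rank samples from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / k)  # precision among the top k samples
    return sum(precisions) / hits


# Toy example (made-up values): 1 = patient, 0 = control.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.95, 0.80, 0.70, 0.60, 0.30, 0.10]
ap = average_precision(y_true, scores)
print(f"AP = {ap:.3f}")
```

Unlike accuracy, AP only depends on how the samples are ranked, so it rewards a model that puts true patients at the top of the list.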
Tomorrow, I’m going to try a standard Random Forest instead of XGBoost and see if that makes it better. I’m also going to focus on documentation: READMEs and all that.
As you can see in the PCA space plot, the controls and the patients are separated pretty well and the split looks clean.
I’m also looking into using TargetScan with KEGG pathway analysis to biologically validate my top 14 biomarkers.
Hi! This is my first Parkinson’s project devlog.
I started this project around 3 months ago, and I knew nothing about machine learning then. In that time, I learnt everything I needed for my project. So far, I have made a decent pipeline to identify biomarkers.
I’m using an NCBI GEO SuperSeries for training, and I was planning on using a Portuguese cohort for extra real-world testing.
Today, when I was comparing XGBoost and Random Forest, they gave accuracies of 93-95%, but after more tests, I realized that my AUC right now is only around 70%.
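A gap like this between accuracy and AUC can happen when the classes are imbalanced: a model that labels almost everyone as the majority class scores high accuracy while ranking patients and controls no better than chance. Whether that is what happened here is an assumption. The sketch below uses a made-up toy cohort (not the GSE269781 data) and hypothetical helper names to show the effect in plain Python.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up imbalanced cohort: 18 patients (1) and 2 controls (0).
y_true = [1] * 18 + [0] * 2
# The model gives everyone a score above 0.5, controls included.
scores = [0.9] * 9 + [0.6] * 9 + [0.8, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy = {acc:.2f}, ROC AUC = {auc:.2f}")
```

Here every sample is predicted as a patient, so accuracy just equals the share of patients in the cohort, while the AUC shows that the scores do not actually separate the two groups.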
Tomorrow, I’m going to try switching from XGBoost + Boruta feature selection to Logistic Regression to see if my AUC can be brought up to at least 80%.