Projects

Life Expectancy Prediction Using Machine Learning

Built predictive models to estimate life expectancy based on socioeconomic and health indicators across countries. Achieved 91% accuracy using regression models and explored key factors influencing public health.

Tools:
Python · Pandas · Seaborn · scikit-learn · Jupyter Notebook

Clinical Concept Normalization with BioBERT & SapBERT

Built a semantic normalization pipeline that maps noisy clinical phrases (e.g., misspellings, shorthand) to standardized SNOMED CT concepts. Compared BioBERT and SapBERT using cosine similarity against curated ground truth sets to guide improvements in terminology alignment for clinical NLP applications.

Key methods:
Cosine Similarity · Concept Embedding · SNOMED CT Normalization
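
The matching step can be illustrated with a minimal sketch. It assumes phrase and concept embeddings have already been computed by an encoder such as BioBERT or SapBERT; the toy 3-dimensional vectors, the placeholder concept IDs, and the helper names (`normalize_phrases`, `top1_accuracy`) are hypothetical stand-ins, not the project's actual code or data.

```python
import numpy as np


def cosine_similarity_matrix(queries, concepts):
    """Cosine similarity between every query row and every concept row."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = concepts / np.linalg.norm(concepts, axis=1, keepdims=True)
    return q @ c.T


def normalize_phrases(phrase_vecs, concept_vecs, concept_ids):
    """Map each phrase to the concept with the highest cosine similarity."""
    sims = cosine_similarity_matrix(phrase_vecs, concept_vecs)
    best = sims.argmax(axis=1)
    return [(concept_ids[i], float(sims[row, i])) for row, i in enumerate(best)]


def top1_accuracy(matches, ground_truth):
    """Fraction of phrases whose best match equals the curated label."""
    hits = sum(pred == gold for (pred, _), gold in zip(matches, ground_truth))
    return hits / len(ground_truth)


# Toy vectors standing in for model embeddings (illustrative only).
concept_ids = ["concept_A", "concept_B"]  # placeholders, not real SNOMED CT codes
concept_vecs = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0]])
phrase_vecs = np.array([[0.9, 0.1, 0.0],   # e.g. a misspelled phrase
                        [0.2, 0.8, 0.1]])  # e.g. a shorthand phrase

matches = normalize_phrases(phrase_vecs, concept_vecs, concept_ids)
accuracy = top1_accuracy(matches, ["concept_A", "concept_B"])
```

Running the same routine on embeddings from each model and comparing `accuracy` on a shared ground truth set is one way the two encoders could be compared.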

Tools:
Python · Hugging Face Transformers · scikit-learn · Google Colab · pandas · matplotlib

Heart Disease Risk Modeling with Logistic Regression

Developed an interpretable logistic regression model to estimate heart disease risk from clinical variables. Applied statistical tests and ROC analysis to evaluate model performance and guide data-driven decisions in cardiovascular care.

Tools:
R · RMarkdown · Power BI · Statistical Testing (Chi-square, Mann-Whitney, Spearman)

Breast Cancer Outcomes & Clinical DBMS Design

Designed a normalized 3NF database for analyzing protein expression and treatment outcomes in breast cancer patients. Wrote optimized SQL queries and mapped clinical concepts for deeper insight.

Tools:
SQL · Python · Jupyter Notebook · MySQL · Clinical Terminologies