Projects

Here is a collection of my important projects.

Power BI Reports

This covers all my interactive report projects built with Power BI. You'll need a Power BI account to interact with them.

View Reports

🍄 Mushroom Classification: Predicting Edible or Poisonous Mushrooms!

In this project, I participated in a Kaggle competition where the goal was to predict whether a mushroom is edible (e) or poisonous (p) based on its physical characteristics. I explored several advanced machine learning models, including XGBoost, LightGBM, and Random Forest, with XGBoost achieving a remarkable 98% accuracy! 🚀

See Repo

📈 Data Analysis Made Simple with Data-Cent

Data-Cent is an open-source platform, built with Python and Streamlit, designed to help analysts and students explore data with ease. It features interactive Plotly dashboards and preset Matplotlib plots for quick EDA, and users can analyse preloaded datasets or upload their own data for customised insights. Whether you're a beginner or an experienced analyst, Data-Cent simplifies the data discovery process.

Careful, you'll need to wake up this giant. 😉

Try App See Repo

🐕 Dog Breed Classification with Deep Learning

This project involves building a deep learning model to classify dog breeds from image data. Using transfer learning with TensorFlow and a pre-trained MobileNetV2 model, the system identifies over 120 dog breeds, working from a dataset of 10,000+ images. The model handles both preloaded data and custom user inputs, making it versatile for real-world applications. It achieves high accuracy on both validation and unseen data.

Careful, you'll need to wake up this giant. 😉

Try App See Repo

Research and Articles

Here is a collection of my articles and studies focused on data science and AI.

My Thoughts on PyForest — Easy or Just Bad Practice?: PyForest simplifies data science coding by automatically importing popular libraries like pandas and seaborn, making exploratory analysis faster and cleaner. Its main benefits are convenience, time saved, and less clutter in code. However, it has drawbacks: it hides dependencies, risks namespace conflicts, and reduces educational value. For quick analysis, PyForest is useful, but for production code and learning, explicit imports are better for clarity and maintainability. Use it in the early stages, but switch to explicit imports for long-term projects.
Hyperparameters Tuning — RandomizedSearchCV vs GridSearchCV, which is better?: This article compares GridSearchCV and RandomizedSearchCV for tuning a Random Forest Classifier on a synthetic dataset. GridSearchCV performs an exhaustive search, which is thorough but computationally expensive. RandomizedSearchCV samples fewer combinations and so finishes faster. Both methods produced comparable results, but RandomizedSearchCV is more efficient for larger parameter spaces, making it the preferred choice for practical use.
Multifactorial Analysis of wildfire occurrences and causes: This research explores the use of machine learning and spatial analysis to improve wildfire prediction and management in England. Integrating environmental factors, human behaviours, and spatial dynamics, the study combined data from multiple sources into a multivariate dataset. Using models such as Random Forest, XGBoost, LightGBM, and a feedforward neural network (FNN), the research aimed to better understand wildfire risk and support evidence-based decision-making for more effective wildfire management.
Analysis on Big Data and A.I for Future Crime Predictions: This report examines the increasing use of Big Data and AI in crime prediction and prevention, drawing parallels to the film Minority Report. Similar to the movie's concept of "pre-crime," advanced algorithms analyse vast datasets to predict criminal activity and help law enforcement allocate resources more effectively, potentially lowering crime rates and enhancing public safety. However, as in the film, this approach raises concerns about reliability, ethics, legal implications, and public privacy, all of which must be addressed for responsible implementation.
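
For readers curious about the grid-versus-randomized trade-off mentioned in the hyperparameter tuning article above, here is a minimal plain-Python sketch of the idea. It is not the article's code and not scikit-learn's implementation: the search space, the function names `grid_candidates` and `random_candidates`, and the value `n_iter=10` are all illustrative assumptions. It only shows how many candidate settings each strategy would have to evaluate.

```python
import itertools
import random

# Hypothetical search space for a random forest; names and values are illustrative only.
param_grid = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [None, 5, 10, 20],
    "min_samples_split": [2, 5, 10],
}


def grid_candidates(grid):
    """Return every combination in the grid (the exhaustive strategy)."""
    keys = list(grid)
    value_lists = [grid[k] for k in keys]
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


def random_candidates(grid, n_iter, seed=0):
    """Return a fixed-size random sample of distinct combinations (the randomized strategy)."""
    rng = random.Random(seed)
    all_combos = grid_candidates(grid)
    return rng.sample(all_combos, min(n_iter, len(all_combos)))


grid = grid_candidates(param_grid)
sampled = random_candidates(param_grid, n_iter=10)

print(len(grid))     # 48 combinations (4 * 4 * 3), each needing its own cross-validated fit
print(len(sampled))  # 10 combinations, however large the grid grows
```

Adding one more value to each list grows the exhaustive grid multiplicatively, while the sampled budget stays at whatever `n_iter` is set to, which is the efficiency argument the article makes.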