Project

HDB Resale Price Predictor

A Singapore housing estimator built by narrowing a research model into practical user inputs.

Python, Streamlit, pandas, scikit-learn, Regression modeling, Feature engineering, Model artifacts

Business context

Housing-price models are easy to over-engineer for accuracy and hard to turn into a usable product. This project focused on Singapore HDB resale pricing. The main challenge was keeping predictive quality high while limiting inputs to factors a real user can provide.

Outcome

  • The production model reports an R^2 of roughly 0.9261 and an RMSE of around 39,180 SGD.
  • Refactored a research notebook with 150+ engineered features into a deployable Streamlit application.
  • Built a consistent training and inference pipeline around a reduced, user-facing feature set.
  • Framed the product around Singapore-specific housing decisions rather than generic regression output.
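
For readers less familiar with the two reported metrics, the sketch below shows how R^2 and RMSE are computed from actual and predicted prices. It is a generic illustration in plain Python, not the project's evaluation code; the function name `r2_and_rmse` and the four prices are made up for the example.

```python
import math


def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for two equal-length sequences of prices."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Sum of squared prediction errors
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total variation of the actual prices around their mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse


# Made-up resale prices in SGD, for illustration only
actual = [400_000, 520_000, 610_000, 450_000]
predicted = [410_000, 500_000, 600_000, 470_000]

r2, rmse = r2_and_rmse(actual, predicted)
print(f"R^2 = {r2:.4f}, RMSE = {rmse:,.0f} SGD")
```

RMSE is expressed in the same unit as the target, so an RMSE near 39,180 SGD can be read as the typical size of a prediction error in dollars, while R^2 describes the share of price variation the model accounts for.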

Key decisions

  • Explicitly prioritized usability over maximum predictive accuracy.
  • Reduced the feature space to variables normal users can actually enter.
  • Kept preprocessing shared between training and inference so the deployed app matched the modeling pipeline.
  • Chose a simpler, more explainable production path over a larger academic feature set.
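
One way to keep a reduced feature set consistent is to route both training data and app input through a single preparation function. The sketch below assumes pandas; the column names in `USER_FEATURES` and the helper `prepare_features` are hypothetical, since the project's actual input fields are not listed here.

```python
import pandas as pd

# Hypothetical user-facing inputs; the real project's columns may differ.
USER_FEATURES = [
    "town",
    "flat_type",
    "floor_area_sqm",
    "storey",
    "remaining_lease_years",
]
NUMERIC_FEATURES = ["floor_area_sqm", "storey", "remaining_lease_years"]
TEXT_FEATURES = ["town", "flat_type"]


def prepare_features(records: pd.DataFrame) -> pd.DataFrame:
    """Select and normalize the user-facing columns.

    Intended to be called on historical records before training and on
    form input inside the app, so both paths see identical columns.
    """
    missing = [col for col in USER_FEATURES if col not in records.columns]
    if missing:
        raise ValueError(f"missing required inputs: {missing}")

    out = records[USER_FEATURES].copy()
    for col in NUMERIC_FEATURES:
        # Fail loudly on non-numeric values instead of guessing
        out[col] = pd.to_numeric(out[col], errors="raise")
    for col in TEXT_FEATURES:
        # Normalize casing and whitespace so categories match at inference
        out[col] = out[col].astype(str).str.strip().str.upper()
    return out


# Example: a single row as it might arrive from a form, with an extra column
form_input = pd.DataFrame(
    {
        "town": [" tampines "],
        "flat_type": ["4 room"],
        "floor_area_sqm": ["92"],
        "storey": [8],
        "remaining_lease_years": [70],
        "session_id": ["abc"],
    }
)
clean = prepare_features(form_input)
```

Dropping every column outside the agreed list at this single point means any extra engineered feature from the research notebook cannot silently leak into the production path.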

System design

Historical HDB resale records are cleaned, transformed, and passed through a modular preprocessing and regression pipeline. The trained artifacts are then reused inside the Streamlit app so feature handling and predictions remain consistent between model development and live inference.
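
The design described above can be sketched as a single scikit-learn `Pipeline` that bundles preprocessing with the regressor and is persisted with joblib. This is a minimal sketch under stated assumptions: the column names, the tiny made-up training frame, and the choice of `Ridge` as the regressor are all illustrative, since the project's actual estimator and features are not specified here.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical feature names, for illustration only
CATEGORICAL = ["town", "flat_type"]
NUMERIC = ["floor_area_sqm", "remaining_lease_years"]


def build_pipeline() -> Pipeline:
    """Bundle preprocessing and the regressor into one fitted artifact."""
    preprocess = ColumnTransformer(
        [
            # Unseen categories at inference are encoded as all zeros
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
            ("num", StandardScaler(), NUMERIC),
        ]
    )
    # Ridge is an assumed stand-in for the project's regressor
    return Pipeline([("preprocess", preprocess), ("model", Ridge(alpha=1.0))])


# Made-up records standing in for cleaned historical resale data
train = pd.DataFrame(
    {
        "town": ["TAMPINES", "BEDOK", "TAMPINES", "BEDOK", "TAMPINES", "BEDOK"],
        "flat_type": ["4 ROOM", "3 ROOM", "5 ROOM", "4 ROOM", "3 ROOM", "5 ROOM"],
        "floor_area_sqm": [92.0, 68.0, 121.0, 90.0, 67.0, 118.0],
        "remaining_lease_years": [70, 55, 80, 62, 58, 75],
    }
)
prices = [520_000, 350_000, 680_000, 480_000, 370_000, 640_000]

pipeline = build_pipeline().fit(train, prices)

# One row of user input; the town was never seen during training
query = pd.DataFrame(
    {
        "town": ["YISHUN"],
        "flat_type": ["4 ROOM"],
        "floor_area_sqm": [93.0],
        "remaining_lease_years": [68],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    artifact_path = os.path.join(tmp_dir, "model.joblib")
    # Training side: save the whole fitted pipeline, not just the model
    joblib.dump(pipeline, artifact_path)
    # App side: load the same artifact and predict on raw user input
    loaded = joblib.load(artifact_path)
    estimate = loaded.predict(query)[0]
```

Because the encoder and scaler travel inside the saved artifact, the app never re-implements feature handling; it only builds a one-row frame from the form and calls `predict`. In a Streamlit app the load step would typically run once at startup rather than on every interaction.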

Stack

  • Python, pandas, and scikit-learn, with joblib for persisted model artifacts
  • Streamlit for the user-facing estimator
  • Regression modeling, feature engineering, and Singapore housing domain logic