XGBoost learns better with monotonicity constraints for Cartoq
Cartoq, a leading automobile site in India (43 million annual unique visitors), produces automated content [example of automated content here] about the best used-car deals on the market. The content is automated by predicting the best deals among all used-car listings with an XGBoost model trained on a database of more than 100,000 listings.
Recently, we imposed monotonic constraints on the model (an increase in a car's age should always reduce its price, for instance), and performance jumped significantly. The gap between train and test R-squared narrowed by 500 basis points (5 percentage points), with the constrained model reaching a test R-squared of 0.8219, and test RMSE fell by almost 3,200 points (until now, the largest reduction had been about 500 points).
Monotonicity evidently gave the trees a better direction: average Gain scores were distributed more “realistically” across features, and Coverage across features was also more balanced.
The XGBoost model predicts a “True Price” for each used car by estimating its depreciation, and can thus identify the most attractive deals for buyers based on the difference between list price and predicted price. [More on this here].
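The deal-ranking step can be sketched in a few lines. This is an illustrative assumption about how the comparison might be coded, not our production logic: the `Listing` class, the `rank_deals` helper, and the sample prices are all hypothetical, and the predicted prices stand in for model output.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    listing_id: str
    list_price: float       # price asked by the seller
    predicted_price: float  # "True Price" from the model (made-up values here)


def rank_deals(listings, top_n=3):
    """Return the listings priced furthest below their predicted price."""
    scored = [(item, item.predicted_price - item.list_price) for item in listings]
    # Keep only listings that are cheaper than the model thinks they are worth.
    bargains = [pair for pair in scored if pair[1] > 0]
    bargains.sort(key=lambda pair: pair[1], reverse=True)
    return bargains[:top_n]


listings = [
    Listing("A", list_price=400_000, predicted_price=450_000),
    Listing("B", list_price=300_000, predicted_price=290_000),
    Listing("C", list_price=520_000, predicted_price=600_000),
]

best = rank_deals(listings)
for item, saving in best:
    print(f"{item.listing_id}: listed {saving:,.0f} below predicted price")
```

Ranking by absolute difference favors expensive cars. Dividing the difference by the predicted price would rank by relative discount instead.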
Image Source: Shutterstock