Switching to float for histogram building #10913
Comments
Hi, XGBoost GPU used to have this option. I proposed and merged its removal a long time ago. The model wasn't converging when f32 was used for accumulation and the number of samples was large. This can be obvious for typical regression since one would see metrics like
As a result, it's not about variation. It's about whether the model converges at all, and whether users can quickly tell that they are running into trouble. I don't have the experiment's results now. I think I was using the HIGGS dataset as a demonstration. The performance gain is negligible for small datasets, whereas for large datasets the model might not converge. The parameter's practical usefulness was small.
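To illustrate the kind of failure described above, here is a minimal sketch of how a single-precision running sum can stop growing once it gets large. This is not XGBoost's histogram code: the helper names `to_f32` and `accumulate` are hypothetical, float32 is emulated with the standard `struct` module, and the starting value of 2^24 simply stands in for a histogram bin that has already absorbed many samples.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(values, start=0.0, single_precision=False):
    """Sequentially sum `values` onto `start`.

    With single_precision=True, the running total is rounded to float32
    after every addition, mimicking an f32 accumulator (illustrative only).
    """
    total = to_f32(start) if single_precision else start
    for v in values:
        if single_precision:
            total = to_f32(total + to_f32(v))
        else:
            total += v
    return total


# Hypothetical scenario: a bin already holds 2**24, then receives 1000 more
# gradients of 1.0 each. Above 2**24 the spacing between adjacent float32
# values is 2, so adding 1.0 is rounded away every time.
grads = [1.0] * 1000
bin_start = 2.0 ** 24

f32_sum = accumulate(grads, start=bin_start, single_precision=True)
f64_sum = accumulate(grads, start=bin_start, single_precision=False)

print(f32_sum)  # 16777216.0, the 1000 new gradients are lost entirely
print(f64_sum)  # 16778216.0
```

The double-precision accumulator keeps every contribution, while the emulated f32 accumulator silently drops all of them. That matches the point that the problem shows up only once the number of samples is large.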
Feel free to close if there are no further questions.
@trivialfis thanks, that was very helpful
Hi, I'd like to get some clarification about the decision to stick to double for GHistRow, which is used in histogram building. @RAMitchell mentioned in this comment that switching to float will result in significant accuracy degradation. I'm just wondering how performance is measured in this case. More generally, if we have ways to improve the runtime of model building that would lead to changes in model metrics, what level of variation is acceptable from the development team's perspective?