A mean is taken inside BaseVariationalLayer_.kl_div(), but later a sum is used inside get_kl_loss() and when reducing the KL loss of a layer's bias and weights (e.g. inside Conv2dReparameterization.kl_loss()).
Is there a mathematical justification for this? Why take the mean of the individual weight KL divergences within a layer, only to later sum across layers?
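For reference, here is a minimal sketch of what I mean (not the library's actual code; the function name `gaussian_kl` and its signature are mine). For a mean-field Gaussian posterior the ELBO's KL term factorizes into a plain sum of per-weight KLs, so averaging within a layer and then summing across layers appears to rescale each layer's contribution by 1/(number of weights in that layer):

```python
import torch

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p, reduction="mean"):
    """Closed-form KL(N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2)), element-wise."""
    kl = (torch.log(sigma_p / sigma_q)
          + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
          - 0.5)
    # "mean" divides by the number of weights in the tensor,
    # "sum" keeps every weight's full contribution to the ELBO.
    return kl.mean() if reduction == "mean" else kl.sum()

# Toy comparison: a "layer" with many weights vs. one with few.
big = torch.randn(1000)
small = torch.randn(10)
prior_mu, prior_sigma = torch.zeros(1), torch.ones(1)

kl_big = gaussian_kl(big, 0.5 * torch.ones_like(big), prior_mu, prior_sigma, "mean")
kl_small = gaussian_kl(small, 0.5 * torch.ones_like(small), prior_mu, prior_sigma, "mean")

# Summing these per-layer means across layers weights each layer's KL
# by 1 / (number of parameters in that layer), unlike a plain sum.
print(kl_big + kl_small)
```

With a plain sum everywhere, the large layer would dominate the KL term in proportion to its parameter count; with the mean-then-sum scheme every layer contributes on roughly the same scale, which is the behaviour I'd like to understand the justification for.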