Supporting higher derivatives of the log of the gamma function #196
It seems @johnmyleswhite has been busy with other stuff and there was no response, so I tried to implement this directly in ForwardDiff's `dual.jl`. My first attempt was

```julia
@ambiguous @inline function Base.polygamma{N}(m::Real, x::Dual{N})
    return Dual(polygamma(m, x), polygamma(m+1, x) * partials(x))
end
```

Running the MWE from above, I get

```julia
julia> harr(x::Vector) = 1/2 * log(x[1]*trigamma(x[1])-1) - log(x[2])
harr (generic function with 1 method)

julia> ForwardDiff.hessian(harr, [1.5, 60.])
ERROR: MethodError: polygamma(::Int64, ::ForwardDiff.Dual{2,Float64}) is ambiguous. Candidates:
  polygamma{N}(m::Real, x::ForwardDiff.Dual{N,T<:Real}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/dual.jl:424
  polygamma(m::Integer, z::Number) at special/gamma.jl:452
 in harr(::Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}) at ./REPL[16]:1
 in vector_mode_gradient at /home/beaujean/.julia/v0.5/ForwardDiff/src/gradient.jl:60 [inlined]
 in gradient(::#harr, ::Array{ForwardDiff.Dual{2,Float64},1}, ::ForwardDiff.GradientConfig{2,ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/gradient.jl:7
 in vector_mode_jacobian(::ForwardDiff.##37#38{#harr,ForwardDiff.HessianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1},ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}}, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/jacobian.jl:75
 in jacobian(::ForwardDiff.##37#38{#harr,ForwardDiff.HessianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1},ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}}, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/jacobian.jl:7
 in hessian(::#harr, ::Array{Float64,1}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/hessian.jl:6
```

Removing `@ambiguous` and restricting the first argument to `Integer`,

```julia
@inline function Base.polygamma{N}(m::Integer, x::Dual{N})
    return Dual(polygamma(m, x), polygamma(m+1, x) * partials(x))
end
```

I instead get a stack overflow:

```julia
julia> ForwardDiff.hessian(harr, [1.5, 60.])
ERROR: StackOverflowError:
 in polygamma(::Int64, ::ForwardDiff.Dual{2,Float64}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/dual.jl:0
 in polygamma(::Int64, ::ForwardDiff.Dual{2,Float64}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/dual.jl:424 (repeats 79999 times)
```
Update: the same problem shows up for other special functions, for example:

```julia
julia> ForwardDiff.derivative(x->besselk(2, x), 1.1)
ERROR: StackOverflowError:
 in besselk(::Int64, ::ForwardDiff.Dual{1,Float64}) at ./special/bessel.jl:455 (repeats 80000 times)
```
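For reference, the rule a `besselk` derivative definition would encode is the standard modified-Bessel identity d/dx K_n(x) = -(K_{n-1}(x) + K_{n+1}(x)) / 2. A quick finite-difference sanity check (a sketch; `besselk` lives in Base on Julia 0.5/0.6 and in SpecialFunctions on later versions):

```julia
# Check the modified-Bessel derivative identity numerically:
#   d/dx besselk(n, x) = -(besselk(n-1, x) + besselk(n+1, x)) / 2
using SpecialFunctions  # not needed on Julia <= 0.6, where besselk is in Base

n, x, h = 2, 1.1, 1e-6
fd = (besselk(n, x + h) - besselk(n, x - h)) / (2h)      # central difference
exact = -(besselk(n - 1, x) + besselk(n + 1, x)) / 2
@assert isapprox(fd, exact; rtol = 1e-5)
```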
Hi @fredRos, sorry for such a late reply. Thanks for tackling this. Just narrowing it down to this function:

```julia
function Base.polygamma(m::Integer, x::Dual)
    return Dual(polygamma(m, x), polygamma(m+1, x) * partials(x))
end
```

This definition is infinitely recursive, because you're defining `polygamma(m, x::Dual)` in terms of itself. The recursive calls need to act on the primal value instead:

```julia
function Base.polygamma(m::Integer, x::Dual)
    return Dual(polygamma(m, value(x)), polygamma(m+1, value(x)) * partials(x))
end
```

It'll likely be easier for us to add the definition via Calculus. I have commit access there, so I can help out in resolving JuliaMath/Calculus.jl#104 (apologies for not doing so earlier; that issue must've slipped through the cracks on my end).

Nice catch! It looks like ForwardDiff's auto-definition code for binary functions needs fixing here as well. I'll play around with fixing this up. Thanks again for looking into it.
> ```julia
> function Base.polygamma(m::Integer, x::Dual)
>     return Dual(polygamma(m, x), polygamma(m+1, x) * partials(x))
> end
> ```

Oh my goodness, how could I miss that? I guess not enough experience with dual numbers. Yes, this quickly fixes my concrete problem.
Great, these fixes will be useful for the general public. Seems like
nobody has differentiated Bessel functions and the like. I will file an
issue with SpecialFunctions because they forgot to mention `polygamma`
in their docs even though it is provided. Let me know if I can help further.
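For anyone else tripped up by the same thing: the point of a dual number is that a lifted function applies the original function to the primal value and the chain rule to the partial. A minimal toy sketch (illustrative only; `MyDual` and `lift` are made-up names, not ForwardDiff's actual types or API):

```julia
# Toy dual number: val carries the evaluation point, eps the derivative seed.
struct MyDual
    val::Float64
    eps::Float64
end

# Lift a univariate f with known derivative df via the chain rule.
# Note that f and df are applied to d.val, never to the dual itself --
# calling f(d) here would reproduce exactly the infinite recursion in
# the broken polygamma definition above.
lift(f, df, d::MyDual) = MyDual(f(d.val), df(d.val) * d.eps)

# d/dx exp(x) at x = 1: seed eps = 1, derivative comes out in .eps
x = MyDual(1.0, 1.0)
y = lift(exp, exp, x)
@assert isapprox(y.eps, exp(1.0))
```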
An update: I've added `polygamma` to a new rules system I've implemented for DiffBase: JuliaAttic/DiffBase.jl#5. Follow-up work will be to switch ForwardDiff over to using DiffBase's rules instead of reaching into Calculus' internals to generate derivative definitions.
The original example now runs with no issues:

```julia
julia> harr(x::Vector) = 1/2 * log(x[1]*trigamma(x[1])-1) - log(x[2])
harr (generic function with 1 method)

julia> ForwardDiff.hessian(harr, [1.5, 60.])
2×2 Array{Float64,2}:
 0.272972  0.0
 0.0       0.000277778
```

However, I am unable to take gradients of the log probability density function of a Gamma with respect to the parameters:

```julia
julia> using Distributions

julia> ForwardDiff.gradient(x -> logpdf(Gamma(x[1], x[2]), 1.0), [0.1, 1.0])
ERROR: MethodError: Cannot `convert` an object of type ForwardDiff.Dual{ForwardDiff.Tag{##7#8,0xb046287d533b082d},Float64,2} to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] gammalogpdf at C:\Users\jcarey\.julia\v0.6\StatsFuns\src\rmath.jl:59 [inlined]
 [2] logpdf at C:\Users\jcarey\.julia\v0.6\Distributions\src\univariates.jl:590 [inlined]
 [3] (::##7#8)(::Array{ForwardDiff.Dual{ForwardDiff.Tag{##7#8,0xb046287d533b082d},Float64,2},1}) at .\REPL[16]:1
 [4] vector_mode_gradient(::##7#8, ::Array{Float64,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{##7#8,0xb046287d533b082d},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{##7#8,0xb046287d533b082d},Float64,2},1}}) at C:\Users\jcarey\.julia\v0.6\ForwardDiff\src\gradient.jl:94
 [5] gradient at C:\Users\jcarey\.julia\v0.6\ForwardDiff\src\gradient.jl:19 [inlined]
 [6] gradient(::##7#8, ::Array{Float64,1}) at C:\Users\jcarey\.julia\v0.6\ForwardDiff\src\gradient.jl:18
```

I don't have this problem with other distributions I have tried, such as `Normal`:

```julia
julia> ForwardDiff.gradient(x -> logpdf(Normal(x[1], x[2]), 1.0), [0.1, 1.0])
2-element Array{Float64,1}:
  0.9
 -0.19
```
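The stack trace points at the cause: `gammalogpdf` dispatches into Rmath's C code (`StatsFuns/src/rmath.jl`), which only accepts `Float64`, so `Dual` numbers cannot flow through it. One possible workaround (a sketch, not from this thread; `gamma_logpdf` is a hypothetical helper) is to write the Gamma(α, θ) log-density in pure, type-generic Julia:

```julia
using ForwardDiff
using SpecialFunctions  # loggamma here; this was lgamma in the Julia 0.6 era

# Pure-Julia Gamma(α, θ) log-density (shape/scale parameterization),
# generic in its argument types so Dual numbers pass straight through.
gamma_logpdf(α, θ, x) = -loggamma(α) - α * log(θ) + (α - 1) * log(x) - x / θ

g = ForwardDiff.gradient(p -> gamma_logpdf(p[1], p[2], 1.0), [0.1, 1.0])
# Analytically, ∂/∂θ at (α = 0.1, θ = 1, x = 1) is -α/θ + x/θ^2 = 0.9
```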
The highest available derivative of `lgamma` is currently `trigamma(x) = polygamma(1,x)`, probably because it has a special name. But of course the derivative of `polygamma(n, x)` is `polygamma(n+1, x)`, and they are all available. I ran into this problem when computing the Hessian of a function involving `trigamma`. Here is a minimal example:

```julia
harr(x::Vector) = 1/2 * log(x[1]*trigamma(x[1])-1) - log(x[2])
ForwardDiff.hessian(harr, [1.5, 60.])
```

I checked the source code of `Calculus` and will post an issue there to add this functionality. I'm wondering why the generic support for `polygamma` hasn't been added. Perhaps it can't pick up the first argument to `polygamma`?
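The rule stated above, that the derivative of `polygamma(n, x)` is `polygamma(n+1, x)`, is easy to verify numerically; a quick sketch:

```julia
# Central-difference check of d/dx polygamma(n, x) = polygamma(n+1, x),
# which also covers digamma (n = 0) and trigamma (n = 1).
using SpecialFunctions  # polygamma is in Base on Julia <= 0.6

x, h = 1.5, 1e-6
for n in 0:3
    fd = (polygamma(n, x + h) - polygamma(n, x - h)) / (2h)
    @assert isapprox(fd, polygamma(n + 1, x); rtol = 1e-4)
end
```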