
Supporting higher derivatives of the log of the gamma function #196

Closed
fredRos opened this issue Feb 16, 2017 · 6 comments · Fixed by #223

fredRos commented Feb 16, 2017

Derivatives of lgamma are currently supported only up to trigamma(x) = polygamma(1, x), probably because it has a special name. But of course the derivative of polygamma(n, x) is polygamma(n+1, x), and they are all available. I ran into this problem when computing the Hessian of a function involving trigamma. Here is a minimal example:

julia> harr(x::Vector) = 1/2 * log(x[1]*trigamma(x[1])-1) - log(x[2])
harr (generic function with 1 method)

julia> ForwardDiff.hessian(harr, [1.5, 60.])
ERROR: MethodError: Cannot `convert` an object of type ForwardDiff.Dual{2,Float64} to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
 in harr(::Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}) at ./REPL[15]:1
 in vector_mode_gradient at /home/beaujean/.julia/v0.5/ForwardDiff/src/gradient.jl:60 [inlined]
 in gradient(::#harr, ::Array{ForwardDiff.Dual{2,Float64},1}, ::ForwardDiff.GradientConfig{2,ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/gradient.jl:7
 in vector_mode_jacobian(::ForwardDiff.##33#34{#harr,ForwardDiff.HessianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1},ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}}, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/jacobian.jl:75
 in jacobian(::ForwardDiff.##33#34{#harr,ForwardDiff.HessianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1},ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}}, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/jacobian.jl:7
 in hessian(::#harr, ::Array{Float64,1}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/hessian.jl:6

I checked the source code of Calculus and will post an issue there to add this functionality. I'm wondering why generic support for polygamma hasn't been added yet. Perhaps ForwardDiff can't pick up the first argument to polygamma?
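The identity this request relies on, d/dx polygamma(n, x) = polygamma(n+1, x), is easy to check numerically. A small Python sketch (the `polygamma` helper below is a truncated-series stand-in built from the standard series polygamma(m, x) = (-1)^(m+1) m! Σ_k 1/(x+k)^(m+1) for m ≥ 1; it is not SpecialFunctions.polygamma):

```python
import math

def polygamma(m, x, terms=200_000):
    """Truncated-series stand-in for polygamma(m, x), valid for m >= 1:
    (-1)**(m+1) * m! * sum_k 1/(x+k)**(m+1)."""
    s = sum(1.0 / (x + k) ** (m + 1) for k in range(terms))
    return (-1) ** (m + 1) * math.factorial(m) * s

# A central difference of polygamma(2, .) should agree with polygamma(3, .)
x, h = 1.5, 1e-4
numeric = (polygamma(2, x + h) - polygamma(2, x - h)) / (2 * h)
analytic = polygamma(3, x)
```

With the series truncated at 200,000 terms the two values agree to well below 1e-4 at this point.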


fredRos commented Apr 24, 2017

It seems @johnmyleswhite has been busy with other things and there was no response, so I tried to implement this directly in ForwardDiff's dual.jl, following the atan example, but couldn't get it to work. Any help is appreciated.

My first attempt was

@ambiguous @inline function Base.polygamma{N}(m::Real, x::Dual{N})
    return Dual(polygamma(m, x), polygamma(m+1, x) * partials(x))
end

Running the MWE from above, I get

julia> harr(x::Vector) = 1/2 * log(x[1]*trigamma(x[1])-1) - log(x[2])
harr (generic function with 1 method)

julia> ForwardDiff.hessian(harr, [1.5, 60.])
ERROR: MethodError: polygamma(::Int64, ::ForwardDiff.Dual{2,Float64}) is ambiguous. Candidates:
  polygamma{N}(m::Real, x::ForwardDiff.Dual{N,T<:Real}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/dual.jl:424
  polygamma(m::Integer, z::Number) at special/gamma.jl:452
 in harr(::Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}) at ./REPL[16]:1
 in vector_mode_gradient at /home/beaujean/.julia/v0.5/ForwardDiff/src/gradient.jl:60 [inlined]
 in gradient(::#harr, ::Array{ForwardDiff.Dual{2,Float64},1}, ::ForwardDiff.GradientConfig{2,ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/gradient.jl:7
 in vector_mode_jacobian(::ForwardDiff.##37#38{#harr,ForwardDiff.HessianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1},ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}}, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/jacobian.jl:75
 in jacobian(::ForwardDiff.##37#38{#harr,ForwardDiff.HessianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1},ForwardDiff.Dual{2,Float64},Array{ForwardDiff.Dual{2,ForwardDiff.Dual{2,Float64}},1}}}, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{2,Float64,Array{ForwardDiff.Dual{2,Float64},1}}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/jacobian.jl:7
 in hessian(::#harr, ::Array{Float64,1}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/hessian.jl:6

Removing @ambiguous didn't change anything. Using an Integer instead of a Real leads to a StackOverflowError:

@ambiguous @inline function Base.polygamma{N}(m::Integer, x::Dual{N})
    return Dual(polygamma(m, x), polygamma(m+1, x) * partials(x))
end

julia> ForwardDiff.hessian(harr, [1.5, 60.])
ERROR: StackOverflowError:
 in polygamma(::Int64, ::ForwardDiff.Dual{2,Float64}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/dual.jl:0
 in polygamma(::Int64, ::ForwardDiff.Dual{2,Float64}) at /home/beaujean/.julia/v0.5/ForwardDiff/src/dual.jl:424 (repeats 79999 times)


fredRos commented Apr 25, 2017

Update: the StackOverflowError also occurs for other two-argument special functions declared in ForwardDiff.SPECIAL_FUNCS:

julia> ForwardDiff.derivative(x->besselk(2, x), 1.1)
ERROR: StackOverflowError:
 in besselk(::Int64, ::ForwardDiff.Dual{1,Float64}) at ./special/bessel.jl:455 (repeats 80000 times)

polygamma is missing from SPECIAL_FUNCS. Why is that?


jrevels commented Apr 25, 2017

Hi @fredRos, sorry for such a late reply. Thanks for tackling this.

Just narrowing it down to this function:

function Base.polygamma(m::Integer, x::Dual)
    return Dual(polygamma(m, x), polygamma(m+1, x) * partials(x))
end

This definition is infinitely recursive, because you're defining polygamma(::Integer, ::Dual) to call polygamma(::Integer, ::Dual) with no base case. The definition should probably be something closer to:

function Base.polygamma(m::Integer, x::Dual)
    return Dual(polygamma(m, value(x)), polygamma(m+1, value(x)) * partials(x))
end

It'll likely be easier for us to add the definition via Calculus. I have commit access there, so I can help out in resolving JuliaMath/Calculus.jl#104 (apologies for not doing so earlier, that issue must've slipped through the cracks on my end).

The StackOverflowError also occurs for other special functions with two arguments declared in ForwardDiff.SPECIAL_FUNCS

Nice catch! It looks like ForwardDiff's auto-definition code for binary SpecialFunctions primitives is incorrect, because it assumes they're unary (like the other auto-defined primitives). I'm not sure how our tests didn't catch this; there might be a bug in Calculus...

I'll play around with fixing this up. Thanks again for looking into it.


fredRos commented Apr 25, 2017 via email


jrevels commented Apr 28, 2017

An update: I've added polygamma to a new rules system I've implemented for DiffBase: JuliaAttic/DiffBase.jl#5.

Follow-up work will be for me to switch ForwardDiff over to using DiffBase's rules instead of reaching into Calculus' internals to generate derivative definitions.

@jacobcvt12

I can now run the original example with no issues:

julia> harr(x::Vector) = 1/2 * log(x[1]*trigamma(x[1])-1) - log(x[2])
harr (generic function with 1 method)

julia> ForwardDiff.hessian(harr, [1.5, 60.])
2×2 Array{Float64,2}:
 0.272972  0.0
 0.0       0.000277778

However, I am unable to take the gradient of the log probability density function of a Gamma with respect to its parameters:

julia> using Distributions
julia> ForwardDiff.gradient(x -> logpdf(Gamma(x[1], x[2]), 1.0), [0.1, 1.0])
ERROR: MethodError: Cannot `convert` an object of type ForwardDiff.Dual{ForwardDiff.Tag{##7#8,0xb046287d533b082d},Float64,2} to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] gammalogpdf at C:\Users\jcarey\.julia\v0.6\StatsFuns\src\rmath.jl:59 [inlined]
 [2] logpdf at C:\Users\jcarey\.julia\v0.6\Distributions\src\univariates.jl:590 [inlined]
 [3] (::##7#8)(::Array{ForwardDiff.Dual{ForwardDiff.Tag{##7#8,0xb046287d533b082d},Float64,2},1}) at .\REPL[16]:1
 [4] vector_mode_gradient(::##7#8, ::Array{Float64,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{##7#8,0xb046287d533b082d},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{##7#8,0xb046287d533b082d},Float64,2},1}}) at C:\Users\jcarey\.julia\v0.6\ForwardDiff\src\gradient.jl:94
 [5] gradient at C:\Users\jcarey\.julia\v0.6\ForwardDiff\src\gradient.jl:19 [inlined]
 [6] gradient(::##7#8, ::Array{Float64,1}) at C:\Users\jcarey\.julia\v0.6\ForwardDiff\src\gradient.jl:18

I don't have this problem with other distributions I have tried such as Normal, InverseGaussian, and Cauchy.

julia> ForwardDiff.gradient(x -> logpdf(Normal(x[1], x[2]), 1.0), [0.1, 1.0])
2-element Array{Float64,1}:
  0.9
 -0.19
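The stack trace points at StatsFuns' rmath.jl, which suggests the Gamma failure comes from `gammalogpdf` dispatching into a compiled Rmath routine that only accepts Float64, so Dual numbers cannot flow through it. Writing the density in plain arithmetic plus lgamma is enough for a tracing AD tool to differentiate; a Python sketch of that idea (the `gamma_logpdf` helper is illustrative, not the StatsFuns function), checked against the analytic scale-gradient -α/θ + x/θ²:

```python
import math

def gamma_logpdf(alpha, theta, x):
    """Log-density of Gamma(shape=alpha, scale=theta) at x, written in
    plain arithmetic + lgamma so an AD tool could trace through it
    (unlike a call into compiled Rmath code)."""
    return ((alpha - 1) * math.log(x) - x / theta
            - alpha * math.log(theta) - math.lgamma(alpha))

# Finite-difference gradient w.r.t. theta at the point from the thread
a, t, x, h = 0.1, 1.0, 1.0, 1e-6
d_theta = (gamma_logpdf(a, t + h, x) - gamma_logpdf(a, t - h, x)) / (2 * h)
# Analytic: -alpha/theta + x/theta**2 = -0.1 + 1.0 = 0.9 at this point
```

A pure-Julia logpdf (as the later StatsFuns/Distributions generic fallbacks provide) resolves this for the same reason.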
