Estimation and sampling of copulas in Julia with Copulas.jl

Announce

I am proud to annonce the publication and the registration of my new Julia package, Copulas.jl.

As it’s name suggests, Copulas.jl is a package that implements methods and tools to work with an arround copulas in the Julia programming language. Quoting the github page :

This package brings most standard [copula](https://en.wikipedia.org/wiki/Copula_(probability_theory)) features into native Julia: random number generation, pdf and cdf, fitting, copula-based multivariate distributions through Sklar's theorem, etc., while fully complying with the [`Distributions.jl`](https://github.com/JuliaStats/Distributions.jl) API (after all, copulas are distributions functions) in order to provide interoperability with other packages based on this API such as, e.g., [`Turing.jl`](https://github.com/TuringLang/Turing.jl).

Usually, people that use and work with copulas turn to R, because of the amazing R package copula. While it is still well maintained and regularly updated, the R package copula is a mixture of obscure, heavily optimized C code and more standard R code, which makes it a complicated code base for readability, extensibility, reliability and maintenance.

This is an attempt to provide a very light, fast, reliable and maintainable copula implementation in native Julia (which means, in particular, floating point type agnostic, i.e. compatibility with BigFloat, DoubleFloats, MultiFloats and other kind of numbers). The two most important exported types are:

  • Copula: an abstract mother type for all the copulas in the package.
  • SklarDist: allows construction of a multivariate distribution by specifying the copula and the marginals through Sklar’s theorem.

What is already implemented

The API contains random number generation, cdf and pdf evaluation, and the fit function from Distributions.jl. A typical use case might look like this:

using Copulas, Distributions, Random
X₁ = Gamma(2,3)
X₂ = Pareto()
X₃ = LogNormal(0,1)
C = ClaytonCopula(3,0.7) # A 3-variate Frank Copula with θ = 0.7
D = SklarDist(C,(X₁,X₂,X₃)) # The final distribution

# This generates a (3,1000)-sized dataset from the multivariate distribution D
simu = rand(D,1000)

# While the following estimates the parameters of the model from a dataset: 
D̂ = fit(SklarDist{FrankCopula,Tuple{Gamma,Normal,LogNormal}}, simu)
# Increase the number of observations to get a beter fit (or not?)  

Available copula families are:

  • GaussianCopula,
  • TCopula,
  • ArchimedeanCopula (for any generator),
  • ClaytonCopula,FrankCopula, AMHCopula, JoeCopula, GumbelCopula as example of the ArchimedeanCopula abstract type, see below,
  • WCopula and MCopula, which are Fréchet-Hoeffding bounds,
  • EmpiricalCopula to follow closely a given dataset.

The next ones to be implemented will probably be:

  • Nested archimedeans (general, with the possibility to nest any family with any family, assuming it is possible, with parameter checks.)
  • Bernstein copula and more general Beta copula as smoothing of the Empirical copula.
  • CheckerboardCopula (and more generally PatchworkCopula)

Adding a new ArchimedeanCopula is very easy. The Clayton implementation is as short as:

struct ClaytonCopula{d,T} <: ArchimedeanCopula{d}
    θ::T
end
ClaytonCopula(d,θ)            = ClaytonCopula{d,typeof(θ)}(θ)     # Constructor
ϕ(C::ClaytonCopula, t)        = (1+sign(C.θ)*t)^(-1/C.θ)          # Generator
ϕ⁻¹(C::ClaytonCopula,t)       = sign(C.θ)*(t^(-C.θ)-1)            # Inverse Generator
τ(C::ClaytonCopula)           = C.θ/(C.θ+2)                       # θ -> τ
τ⁻¹(::Type{ClaytonCopula},τ)  = 2τ/(1-τ)                          # τ -> θ
radial_dist(C::ClaytonCopula) = Distributions.Gamma(1/C.θ,1)      # Radial distribution

The Archimedean API is modular:

  • To sample an archimedean, only radial_dist and ϕ are needed.
  • To evaluate the cdf, only ϕ and ϕ⁻¹ are needed
  • Currently, to fit the copula τ⁻¹ is needed as we use the inverse tau moment method. But we plan on also implementing inverse rho and MLE (density needed).
  • Note that the generator ϕ follows the convention ϕ(0)=1, while others (e.g., https://en.wikipedia.org/wiki/Copula_(probability_theory)#Archimedean_copulas) use ϕ⁻¹ as the generator.
  • The density and thus log-likelyhood of all archimedean copulas is upcoming soon.

Next step

The main thing to do on this package is to add test and documentations to the code that is already done. Suggestions and contributions are, of course, welcomed !

Oskar Laverny
Oskar Laverny
Maître de Conférence

What would be the dependence structure between quality of code and quantity of coffee ?

comments powered by Disqus

Related