Dual Numbers, Automatic Differentiation, Haskell

Duals are an extension to a number system such that x^2 = 0 has a non-zero solution.
We could consider them as a quotient:
– take a ring R
– quotient R[X] by the ideal < X^2 >

Since we can keep things general in the theory, I’m going to take
duals over a general type a. Maybe I should look into the mathematical
prelude so that I will be able to specify a to be a ring, but that’s
for another day.

> data Dual a = D a a deriving Show

Some Notation

I shall also slip in and out of the notation a + b\mathrm{j} to represent the Dual (D a b).

We could maybe use some utility functions to give us the ‘real’ and ‘dual’ parts

> real, dual :: Dual a -> a
> real (D a _) = a
> dual (D _ b) = b

In some ways, we can consider these as an analogue of the complex numbers,
but instead with j^2 = 0. Anyhow, equality comes naturally as follows.
(We could get this from the polynomial definition using the coset definition of
equality)

> instance Eq a => Eq (Dual a) where
> (D a b) == (D c d) = (a == c) && (b == d)

Now, we want to treat this as a number.
Since things commute, addition and multiplication come naturally.

> instance Num a => Num (Dual a) where
> (D a b) + (D c d) = D (a+c) (b+d)
> (D a b) * (D c d) = D (a*c) (b*c + a*d)

Again, negation is fairly obvious

> negate (D a b) = D (-a) (-b)

Now Haskell wants us to give a definition for signum – i.e. give
some kind of subset of ‘positive’ duals.
Not sure that this really makes much sense. I’ll leave it undefined for now.

> signum (D a b) = undefined

We also need to give an ‘abs’. Taking inspiration from complex numbers,
if we were to define abs (D a b) to be (D a b)*(D a -b) we end up with

> abs (D a b) = D (abs a) 0

This is all kind of crazy! The complex numbers modulus maps from \mathbb{C} \rightarrow \mathbb{R} whereas Haskell requires us to give a function from Duals to Duals. I can’t seem to see anywhere what properties such an abs function should have.

Haskell’s implementation of complex numbers does seem to give something similar to my implementation.
\mathrm{abs } z \mapsto |z| + 0i

For now then, I’ll just use the definition given here: http://en.wikipedia.org/wiki/Automatic_differentiation#Automatic_differentiation_using_dual_numbers

Finally we need to give a way of converting from integrals to duals – just
take the ‘dual’ part to be 0

> fromInteger n = D (fromIntegral n) 0

Now since Dual numbers are PRETTY MAGIC
– f(D a 1) = D f(a) f’(a)
we can automagically differentiate functions of type Num a => a -> a

> deriv :: Num a => (Dual a -> Dual b) -> a -> b
> deriv f x = dual $ f (D x 1)

Now we could probably do with some notion of division!

but how?!

(D a b) / (D c d) = (D a b) * (D c -d) / c² ?!?!

let’s try and see if crazy happens.

> instance Fractional a => Fractional (Dual a) where
> (D a b) / (D c d) = D (real x / (c*c)) (dual x / (c*c))
> where
> x = (D a b) * (D c (-d))
> fromRational r = D (fromRational r) 0

Everything seems fine!

Now for maybe the more interesting bit, I’d like to implement Duals for
standard functions – sin, cos, … etc
In haskell, this means implementing the Floating class.

> instance Floating a => Floating (Dual a) where

Taking pi to be the Dual with ‘real’ part of pi seems sensible,
though maybe we should ensure that this is still twice the first
‘positive’ root of cos, though quite what positive means.
Since it seems good to have that cos(\pi/2) = 0,
let us take pi to be \pi + 2\pi\mathrm{j}

> pi = D pi (2*pi)

Next thing to define is exp. Taking the power series definition,
and noting that

(a+b\mathrm{j})^n = a^n + n b a^{n-1}\mathrm{j}
(which follows from binomially expanding (a+b\mathrm{j})^n and then noting that all other terms contain a j^2.

it follows trivially that

> exp (D a b) = D (exp a) (b*(exp a))

Ok, so maybe something better is happening here.
Why not try a general power series. If we have:
f(x) = \sum_0^\infty a_n x^n
Then we can try sticking in the dual x + y\mathrm{j}
f(x + y\mathrm{j}) = \sum_0^\infty a_n x^n nyx^{n-1}\mathrm{j}
= \sum_0^\infty a_n x^n + \left (y\sum_1^\infty a_n n x^{n-1} \right ) \mathrm{j}

where we see the dual part is the termwise derivative! Woooop!
so, taking what we already ‘know’ of the derivatives of these functions
(though quite how this is applicable when our underlying Num type isn’t sane)
we get

f(a + b\mathrm{j}) = f(a) + bf'(a)

though, there is a problem in that, since haskell doesn’t ‘know’
these functions are defined in terms of power series, we can’t use
our existing deriv function to produce f' :(
Lets get cracking then…

> log (D a b) = D (log a) (b*(1/a))
> sin (D a b) = D (sin a) (b*(cos a))
> cos (D a b) = D (cos a) (b*(- sin a))
> sinh (D a b) = D (sinh a) (b*(cosh a))
> cosh (D a b) = D (cosh a) (b*(sinh a))

–leave the inverse trigometric and hyperbolics for now, I’m lazy :D

We can even do crazy things now by taking duals over complex numbers, though
what this ‘means’ I’m not yet sure.

I shall have to think on this.

On the other hand, we are now able to differentiate fairly complicated functions.

> g x = (sin (x * cos (log (1 / (x**2 – 2*x + 3)))))**3
> g’ = deriv g