Automatic differentiation

Automatic (forward-mode) differentiation (AD) is performed here by first defining a function as a composition of elementary functions (as shown in the introduction).

There is NumPy and SymPy support. For example, running each AD step separately …

import numpy as np
from njet import derive
from njet.functions import exp

d2 = derive(lambda x, y: exp(-x**2 + y), order=3)
lin = np.linspace(3, 4, 4)

for e in lin:
    print( d2(2.1, e) )

{(0, 0): 0.24414328315343706, (1, 0): -1.0254017892444356, (0, 1): 0.24414328315343706, (2, 0): 3.818400948519755, (1, 1): -1.0254017892444356, (0, 2): 0.24414328315343706, (2, 1): 3.8184009485197556, (0, 3): 0.24414328315343706, (3, 0): -11.935676826805235, (1, 2): -1.0254017892444356}
{(0, 0): 0.34072939947024816, (1, 0): -1.4310634777750424, (0, 1): 0.34072939947024816, (2, 0): 5.329007807714682, (1, 1): -1.4310634777750424, (0, 2): 0.34072939947024816, (2, 1): 5.329007807714681, (0, 3): 0.34072939947024816, (3, 0): -16.657578881301497, (1, 2): -1.4310634777750422}
{(0, 0): 0.47552618349279985, (1, 0): -1.9972099706697595, (0, 1): 0.47552618349279985, (2, 0): 7.4372295098273895, (1, 1): -1.9972099706697595, (0, 2): 0.47552618349279985, (2, 1): 7.4372295098273895, (0, 3): 0.47552618349279985, (3, 0): -23.247524058596003, (1, 2): -1.9972099706697595}
{(0, 0): 0.6636502501363193, (1, 0): -2.7873310505725413, (0, 1): 0.6636502501363193, (2, 0): 10.379489912132033, (1, 1): -2.7873310505725413, (0, 2): 0.6636502501363193, (2, 1): 10.379489912132033, (0, 3): 0.6636502501363193, (3, 0): -32.44453342866438, (1, 2): -2.787331050572541}

… can be speed up by directly passing the NumPy arrays into the code:

d2(2.1, lin)

{(0, 0): array([0.24414328, 0.3407294 , 0.47552618, 0.66365025]),
 (1, 0): array([-1.02540179, -1.43106348, -1.99720997, -2.78733105]),
 (0, 1): array([0.24414328, 0.3407294 , 0.47552618, 0.66365025]),
 (2, 0): array([ 3.81840095,  5.32900781,  7.43722951, 10.37948991]),
 (1, 1): array([-1.02540179, -1.43106348, -1.99720997, -2.78733105]),
 (0, 2): array([0.24414328, 0.3407294 , 0.47552618, 0.66365025]),
 (2, 1): array([ 3.81840095,  5.32900781,  7.43722951, 10.37948991]),
 (0, 3): array([0.24414328, 0.3407294 , 0.47552618, 0.66365025]),
 (3, 0): array([-11.93567683, -16.65757888, -23.24752406, -32.44453343]),
 (1, 2): array([-1.02540179, -1.43106348, -1.99720997, -2.78733105])}

It is also possible to pass SymPy symbols:

from sympy import Symbol
d2(Symbol('x'), Symbol('y'))

{(0, 0): 1.0*exp(-x**2 + y),
 (1, 0): -2.0*x*exp(-x**2 + y),
 (0, 1): 1.0*exp(-x**2 + y),
 (2, 0): 4.0*x**2*exp(-x**2 + y) - 2.0*exp(-x**2 + y),
 (1, 1): -2.0*x*exp(-x**2 + y),
 (0, 2): 1.0*exp(-x**2 + y),
 (2, 1): 4.0*x**2*exp(-x**2 + y) - 2.0*exp(-x**2 + y),
 (0, 3): 1.0*exp(-x**2 + y),
 (3, 0): -8.0*x**3*exp(-x**2 + y) + 12.0*x*exp(-x**2 + y),
 (1, 2): -2.0*x*exp(-x**2 + y)}

Nested expressions

Expressions containing higher-order derivatives can straightforwardly be derived. For example, in the case of one variable:

  def prime(f, k=0):
      # Return \partial f / \partial x_k
      df = derive(f, order=1)
      return lambda x: df.grad(x)[(k,)]

  from njet.functions import sin
  xmpl = lambda x: sin(x**2)

  from sympy import Symbol
  f3 = prime(prime(prime(xmpl)))(Symbol('x'))
  f3.expand()
> -8*x**3*cos(x**2) - 12*x*sin(x**2)

Here a more sophisticated example for two variables:

f = lambda x, y: sin(1/x + y)
df = derive(f, order=3)

dxxf = lambda x, y: df(x, y)[(1, 1)]
dxxyf = lambda x, y: df(x, y)[(2, 1)]

g = lambda x, y: f(x, y)/(1 + dxxf(x, y)) + dxxyf(x, y)**-3

We obtain the derivatives of the function g, containing the function f itself and some of its higher-order derivatives, up to fourth order:

dg = derive(g, order=4)
dg(0.2, 1.1)

{(0, 0): 0.05125472033924497,
 (1, 0): -1.2893245550004897,
 (0, 1): 0.07784343804658536,
 (0, 2): 1.0912311590829322,
 (2, 0): 608.7138384279668,
 (1, 1): -25.28585653666342,
 (2, 1): 13059.768934688353,
 (0, 3): 22.863930873679625,
 (3, 0): -321809.8708462019,
 (1, 2): -540.6030970198217,
 (3, 1): -9115018.77817841,
 (0, 4): 638.7760280114579,
 (1, 3): -15256.040672728306,
 (4, 0): 227762162.9608126,
 (2, 2): 370126.0266704479}

Of course, this also synergizes with either NumPy arrays or SymPy symbols. E.g.:

dg(np.linspace(0.2, 0.64, 5), 1.1)

{(0, 0): array([0.05125472, 0.10722629, 0.37526799, 0.10042023, 0.39647601]),
 (1, 0): array([-1.28932456,  0.8244532 , 10.59476301,  1.63373827, 12.93969049]),
 (0, 1): array([ 0.07784344, -0.00505105, -1.19829647, -0.41567626, -3.08965192]),
 (0, 2): array([ 1.09123116e+00,  1.70120125e-02,  1.48647193e+01, -1.90986854e+00, 6.21825808e+01]),
 (2, 0): array([608.71383843,   6.65953138, 770.48242475, -21.18301586, 946.75436686]),
 (1, 1): array([ -25.28585654,   -0.24971327, -108.54451995,    5.59886998, -245.80653651]),
 (2, 1): array([ 1.30597689e+04, -8.68288739e+00, -1.32996630e+04, -1.96661374e+02, -2.38423630e+04]),
 (0, 3): array([ 2.28639309e+01, -3.49863419e-02, -2.75171895e+02, -1.34122433e+01, -1.70211185e+03]),
 (3, 0): array([-3.21809871e+05,  1.13907036e+02,  8.86275855e+04,  8.81153961e+02, 8.62690730e+04]),
 (1, 2): array([-5.40603097e+02,  6.12535213e-01,  1.93861032e+03,  4.78129009e+01, 6.43635888e+03]),
 (3, 1): array([-9.11501878e+06, -3.41856494e+02, -2.10459443e+06,  7.83829609e+03, -2.60956636e+06]),
 (0, 4): array([ 6.38776028e+02,  1.56152245e-01,  6.79315246e+03, -1.18250949e+02, 5.53778566e+04]),
 (1, 3): array([-1.52560407e+04, -2.16266355e+00, -4.69698546e+04,  4.43941387e+02, -2.04118615e+05]),
 (4, 0): array([ 2.27762163e+08,  3.56314244e+03,  1.36083587e+07, -3.62297999e+04, 9.01586775e+06]),
 (2, 2): array([ 3.70126027e+05,  2.87900980e+01,  3.17958465e+05, -1.80550952e+03, 7.37728082e+05])}

Complex differentiation

Wirtinger calculus can (currently) be realized by modifying a function so that every variable has their complex conjugate counterpart. So what will not work is

  f = lambda z: z.conjugate()
  df = derive(f)
  df(Symbol('z'))
> {(0,): conjugate(z), (1,): 1.0}

The correct way to deal with this anti-holomorphic function would be:

  f = lambda z, zc: zc
  df = derive(f)
  df(Symbol('z'), Symbol('z').conjugate())
> {(0, 0): conjugate(z), (0, 1): 1.0}

In the following the AD routines are explained in more detail.

njet.ad.build_tensor(D, k: int, **kwargs)

Convert the components of the k-th derivative into the entries of a (self.n_args)**k tensor. See njet.jet.build_tensor for details.

Parameters
  • D (derive or cderive object) –

  • k (int) – The number of derivatives to be considered.

Returns

Return type

dict or list

class njet.ad.derive(func, order: int = 1, n_args: int = 0)

Class to handle the derivatives of a (jet-)function (i.e. a function consisting of a composition of elementary functions).

Parameters
  • func (callable(s)) – The function(s) to be derived. Must be expressed in terms of functions (e.g. polynomials) supported by njet.functions. Note that the first function in the given chain will be executed first.

  • order (int, optional) – The order up to which the function should be derived.

  • n_args (int, optional) – The number of arguments on which func depends on; passed to getNargs routine.

  • n_out (int, optional) – Define the number of output parameters of the function(s). This is required for multi-dimensional output.

  • truncate (int, optional) – If given, truncate the jets after each iteration through the given functions.

eval(*z, **kwargs)

Pass a jet of order self.order, having polynomials in its higher-order entries, through the given function.

Parameters
  • z (subscriptable) – List of values at which the function and its derivatives should be evaluated.

  • **kwargs – Optional keyword arguments passed to the underlying function.

Returns

Jet containing the value of the function in its zero-th entry and the jet-polynomials in the higher-order entries.

Return type

jet

njet.ad.getNargs(func)

Determine the number of arguments of a function. Raise an error if they can not be determined.

Parameters

func (callable) – The function to be examined.

Returns

The number of arguments of the given function.

Return type

int

njet.ad.grad(D, *z, **kwargs)

Returns the gradient of the current function. See njet.jet.grad for details.

njet.ad.hess(D, *z, **kwargs)

Returns the Hessian of the function. See njet.jet.hess for details.

njet.ad.identity(*z, order: int)

Create a set of jets representing the components of the identity operator.

Parameters
  • *z – The point at which the identitity is evaluated.

  • order (int, optional) – The order of the desired jets.

Returns

A list of jets, representing the components of the identity.

Return type

list

class njet.ad.jet(*values, **kwargs)

Class to store and operate with higher-order derivatives of a given function.

njet.ad.taylor_coefficients(evaluation, output_format=0, **kwargs)

Return the Taylor coefficients of a jet evaluation.

Parameters
  • output_format (int, optional) – If 0 and the output would be a list of length 1, return the entry of this list instead. If != 0, disable this behavior.

  • **kwargs – Parameters passed to njet.jet.taylor_coefficients routine.

Returns

One or more dictionaries, representing the Taylor-coefficients of the given evaluation (see njet.poly.jetpoly.taylor_coefficients for details).

Return type

dict