arsenal.maths package

Submodules

arsenal.maths.checkgrad module

arsenal.maths.checkgrad.fd_Hessian(f, x, eps=1e-05)[source]
arsenal.maths.checkgrad.fd_Jv(f, x, shape=None)[source]

Finite-difference approximation to a Jacobian-vector product, J[f](x) v, where f: Rⁿ ↦ Rᵐ.

Example: If f is a gradient, this function approximates Hessian-vector products.

Solve a linear system for second-order gradient methods without materializing the Hessian or Fisher matrix. For example, the natural gradient direction in an exponential family:

g = Euclidean_gradient(w) Fvp = fd_Jv(dlogZ, w) # Fisher-vector products ng = implicit_cg(Fvp, g)[0]
arsenal.maths.checkgrad.fdcheck(func, w, g, keys=None, eps=1e-05, quiet=0, verbose=1, progressbar=1, throw=True)[source]

Finite-difference check.

Returns arsenal.math.compare instance.

  • func: zero argument function, which references w in caller’s scope.
  • w: parameters.
  • g: gradient estimate to compare against
  • keys: dimensions to check
  • eps: perturbation size
arsenal.maths.checkgrad.finite_difference(f, eps=1e-05)[source]
arsenal.maths.checkgrad.prox_numerical(f, x, s, jac=None)[source]

Numerically estimate the proximal operator Prox_{s*f}(x).

arsenal.maths.checkgrad.quick_fdcheck(func, w, g, n_checks=20, eps=1e-05, quiet=True, verbose=False, progressbar=False, throw=True)[source]

Check gradient along random directions (a faster alternative to axis-aligned directions).

Tim Vieira (2017) “How to test gradient implementations” https://timvieira.github.io/blog/post/2017/04/21/how-to-test-gradient-implementations/

arsenal.maths.cholesky module

Utilities for working with the Cholesky factorization of a positive-definite matrix.

class arsenal.maths.cholesky.Cholesky(M)[source]

Bases: object

Interface to Cholesky factorization of a matrix, which supports updates (growing and rank-one updates) as well as efficient utility methods (e.g., solve, det).

Remarks:

  • We store the factorization (self.L) as upper triangular to be consistent with scipy.linalg.cholesky. Note that numpy.linalg.cholesky is lower triangular for some reason.
det()[source]
downdate_rank_one(z)[source]

Rank-one down-date: compute Chol(M - x xᵀ) given L = Chol(M).

solve(b)[source]
update_grow(B, D)[source]

Growth update: compute Chol([[ A, B], [B.T, D]]) given L = Chol(A)

Remarks:
  • We don’t need to know A to peform the update.
update_rank_one(x, x_copy=True)[source]

Rank-one update: compute Chol(M + x xᵀ) given L = Chol(M).

Remarks:
  • Update is performed in-place (so self.L is mutated).
  • We don’t need to know M to peform the update.
arsenal.maths.cholesky.test_grow()[source]
arsenal.maths.cholesky.test_rank_one()[source]
arsenal.maths.cholesky.test_util()[source]

arsenal.maths.combinatorics module

arsenal.maths.compare module

arsenal.maths.compare.align(X, Y, distance=<function <lambda>>, maximize=False)[source]

Find a cheap alignment of X and Y.

arsenal.maths.compare.check_equal(want, have, verbose=0, **kwargs)[source]
class arsenal.maths.compare.compare(want, have, name=None, data=None, P_LARGER=0.9, regression=True, ax=None, alphabet=None, want_label=None, have_label=None, verbose=1)[source]

Bases: object

format_message()[source]
live_plot(*args, **kw)[source]
message()[source]
plot(regression=True, seaborn=False, ax=None, title=None, name=None, **scatter_kw)[source]
regression()[source]

least squares linear regression

regression_line()[source]
show(*args, **kw)[source]
show_errors()[source]

show largest relative errors

arsenal.maths.compare.pp_plot(a, b, pts=100, show_line=True)[source]

P-P plot for samples a and b.

arsenal.maths.featureselection module

Find most informative features ranked by information gain (i.e. the ID3 hueristic for decision trees).

Input: a tab-delimited file where each line starts with a label followed by features. Output: a sorted list of the most informative features.

class arsenal.maths.featureselection.Alphabet(random_int=None)[source]

Bases: object

Bijective mapping from strings to integers.

>>> a = Alphabet()
>>> [a[x] for x in 'abcd']
[0, 1, 2, 3]
>>> list(map(a.lookup, range(4)))
['a', 'b', 'c', 'd']
>>> a.stop_growth()
>>> a['e']
>>> a.freeze()
>>> a.add('z')
Traceback (most recent call last):
  ...
ValueError: Alphabet is frozen. Key "z" not found.
>>> print(a.plaintext())
a
b
c
d
add(k)
add_many(x)[source]
enum()[source]
freeze()[source]
classmethod from_iterable(s)[source]

Assumes keys are strings.

imap(seq, emit_none=False)[source]

Apply alphabet to sequence while filtering. By default, None is not emitted, so the Note that the output sequence may have fewer items.

items()[source]
keys()[source]
lookup(i)[source]
lookup_many(x)[source]
map(seq, *args, **kwargs)[source]
plaintext()[source]

assumes keys are strings

stop_growth()[source]
arsenal.maths.featureselection.integerize(data)[source]

Integerize dataset returns a triple (label alphabet, feature alphabet, integerized dataset)

arsenal.maths.featureselection.kl_divergence(p, q)[source]

Compute KL divergence of two vectors, K(p || q). NOTE: If any value in q is 0.0 then the KL-divergence is infinite.

arsenal.maths.featureselection.kl_filter(data, verbose=True, progress=False, out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, feature_label_cuttoff=0, feature_count_cuttoff=0, do_label_count=False)[source]

data = (label, [features …])

KL is a synonym for Information Gain

KL( p(label) || p(label|feature) )

arsenal.maths.featureselection.lidstone(p, delta)[source]

Lidstone smoothing is a generalization of Laplace smoothing.

arsenal.maths.featureselection.normalize(p)[source]
arsenal.maths.featureselection.read_tab_file(f)[source]

Load really simple tab-separated file format for labeled data. Label is the first column, features are the remaining columns.

arsenal.maths.inv module

class arsenal.maths.inv.Inv(A)[source]

Bases: object

rank_one_update(u, v)[source]
value
class arsenal.maths.inv.InvAndDet(A)[source]

Bases: arsenal.maths.inv.Inv

rank_one_update(u, v)[source]
arsenal.maths.inv.test()[source]

arsenal.maths.optimize module

class arsenal.maths.optimize.NProj(C, type)[source]

Bases: object

func(x, **kw)[source]
grad_like(x, **kw)[source]

Gradient of prox_func wrt x.

op(x, **kw)[source]
solve(x, y0=None)[source]
class arsenal.maths.optimize.NProx(f, s)[source]

Bases: object

Numerically estimate the proximal operator Prox_{s*f}(x) and related concepts such as the Moreau envelope.

This implementation is not efficient and should only be used for testing purposes.

https://web.stanford.edu/~boyd/papers/pdf/prox_algs.pdf

func(x, **kw)[source]

Proximally smoothed f with smoothing parameter s > 0.

grad_like(x, **kw)[source]

Gradient of prox_func wrt x.

objective(x, y)[source]
op(x, **kw)[source]
solve(x, y0=None)[source]
arsenal.maths.optimize.is_subgradient(f, g, x)[source]

Numerically check whether or not g is a subgradient of f at x.

arsenal.maths.pareto module

Pareto frontier

class arsenal.maths.pareto.Pareto(df, xcol, ycol, ax=None)[source]

Bases: object

lookup_x(x)[source]
Find Pareto point constrained by x.
argmax p.y

p: p.x <= x

lookup_y(y)[source]
Find Pareto point constrained by y.
argmax p.x

p: p.x >= y

scatter(**plot_kwargs)[source]
show_frontier(**plot_kwargs)[source]
arsenal.maths.pareto.pareto_frontier(X, Y, maxX=True, maxY=True, indices=False)[source]

Determine Pareto frontier, returns list of sorted points.

Args:

X, Y: data.

maxX, maxY: (bool) whether to maximize or minimize along respective
coordinate.
arsenal.maths.pareto.pareto_ix(X, Y, *a, **kw)[source]

Determine Pareto frontier, returns list of indices

arsenal.maths.pareto.show_frontier(X, Y, maxX=False, maxY=True, dots=False, XMIN=None, XMAX=None, YMIN=None, YMAX=None, ax=None, label=None, interpolation='pessimistic', **style)[source]

Plot Pareto frontier.

Args:

X, Y: data.

maxX, maxY: (bool) whether to maximize or minimize along respective
coordinate.
dots: (bool) highlight points on the frontier (will use same color as
style).

ax: use an existing axis if non-null.

style: keyword arguments, which will be passed to lines connecting the
points on the Pareto frontier.

XMAX: max value along x-axis YMIN: min value along y-axis

arsenal.maths.pareto.test()[source]

arsenal.maths.rvs module

class arsenal.maths.rvs.Empirical(x)[source]

Bases: object

Empirical CDF of data a, returns function which makes values to their cumulative probabilities.

>>> g = cdf([5, 10, 15])

Evaluate the CDF at a few points

>>> g([5,9,13,15,100])
array([0.33333333, 0.33333333, 0.66666667, 1.        , 1.        ])

Check that ties are handled correctly

>>> g = cdf([5, 5, 15])

The value p(x <= 5) = 2/3

>>> g([0, 5, 15])
array([0.        , 0.66666667, 1.        ])

The auantile function should be the inverse of the cdf.

>>> g = cdf([-1, 5, 5, 15])
>>> g.quantile(np.linspace(0, 1, 10))
array([-1, -1, -1,  5,  5,  5,  5,  5,  5, 15])
cdf(z)

Call self as a function.

conditional_mean(a, b)[source]

E[T | a <= T < b]

plot(confidence=0.95)[source]
ppf(q)
quantile(q)[source]
sf(z)[source]
class arsenal.maths.rvs.Max(X, Y)[source]

Bases: object

cdf(t)[source]
pdf(t)[source]
ppf(u)[source]
rvs(size)[source]
class arsenal.maths.rvs.Mixture(w, ds)[source]

Bases: object

Mixture of several densities

cdf(x)[source]
mean()[source]
pdf(x)[source]
ppf(q)[source]
rvs(size=1)[source]
class arsenal.maths.rvs.TruncatedDistribution(d, a, b)[source]

Bases: object

cdf(x)[source]
mean()[source]
pdf(x)[source]
ppf(u)[source]
rvs(size=None)[source]
sf(x)[source]
arsenal.maths.rvs.anneal(p, *, invT=None, T=None)[source]

p ** (1/T)

arsenal.maths.rvs.cdf

alias of arsenal.maths.rvs.Empirical

arsenal.maths.rvs.compare_samples_to_distr(D, samples, a, b, bins)[source]
arsenal.maths.rvs.is_distribution(p)[source]
arsenal.maths.rvs.log_sample(w)[source]

Sample from unnormalized log-distribution.

arsenal.maths.rvs.random_dist(*size)[source]

Generate a random conditional distribution which sums to one over the last dimension of the input dimensions.

arsenal.maths.rvs.random_psd(n)[source]
arsenal.maths.rvs.sample(w, size=None, u=None)[source]

Uses the inverse CDF method to return samples drawn from an (unnormalized) discrete distribution.

arsenal.maths.rvs.show_distr(D, a, b, resolution=1000)[source]
arsenal.maths.rvs.spherical(size)[source]

Generate random vector from spherical Gaussian.

arsenal.maths.rvs.test_cdf()[source]
arsenal.maths.rvs.test_mixture()[source]
arsenal.maths.rvs.test_truncated_distribution()[source]

arsenal.maths.stepsize module

Adaptive stepsize algorithms

class arsenal.maths.stepsize.adagrad(x, damping=0.0001)[source]

Bases: object

Adagrad

class arsenal.maths.stepsize.adam(x)[source]

Bases: object

Adam as described in http://arxiv.org/pdf/1412.6980.pdf.

arsenal.maths.stepsize.ewma(x, y, alpha)[source]

Exponentially weighted moving average (x is updated in place).

arsenal.maths.stepsize.norm_clip(x, C)[source]

Rescale x (in place) so that ||x|| <= C.

class arsenal.maths.stepsize.sgd(x)[source]

Bases: object

Simple sgd

class arsenal.maths.stepsize.sgd_momentum(x)[source]

Bases: object

arsenal.maths.util module

arsenal.maths.util.argmax_random_tie(x)[source]

argmax with randomized tie breaking.

arsenal.maths.util.argmin_random_tie(x)[source]

argmin with randomized tie breaking.

arsenal.maths.util.assert_equal(a, b, name='', verbose=False, throw=True, tol=0.001, color=1)[source]

isfinite: asserts that both a and b must be finite.

>>> assert_equal(0, 1, throw=0, color=0)
0 1 err=[1.] fail
>>> assert_equal(0, 0, verbose=1, color=0)
0 0 err=[0] ok
arsenal.maths.util.assert_isdistr(p)[source]
arsenal.maths.util.bernstein(samples, delta, R, check=True)[source]

Plug-n-chug empirical Bernstein bound, computes “error bars” which hold with probability 1-delta for the mean of independent samples from a given range R=b-a (known a priori).

Returns epsilon such that the following bound hold,

p( mean(samples) - true_mean <= eps ) >= 1-delta

We assume that sample are independent (not necessarily identically distributed).

Bound is based on sample variance V and a priori knowledge that RVs in are in the range [a,b] (although we only really require a known range b-a)

The bound holds with probability >=(1-delta).

The sample mean has symmetric deviations so we get a two-sided bound by passing in 2*delta, i.e.,

p( |mean(samples) - true_mean| <= eps ) >= 1-2*delta

This is analogous to p-values, which make assumption of normally distributed random variables. This means that the bounds can be ‘tighter’, but the assumptions are usually not valid.

arsenal.maths.util.blocks(M, k)[source]

Partition the matrix M into blocks

M = [[A, B],
[C, D]]

Output: A: k x k, D: (n-k) x (n-k) if M: n x n.

TODO: support arbitrary partitions when k is a list

arsenal.maths.util.cosine(a, b)[source]
arsenal.maths.util.cross_entropy(p, q)[source]

Cross Entropy of two vectors,

CE(p,q) = - sum_i p[i] log q[i]

Relationship to KL-divergence:

CE(p,q) = entropy(p) + KL(p||q)
arsenal.maths.util.cumavg(x)[source]

Cumulative average.

>>> cumavg([1,2,3,4,5])
array([1. , 1.5, 2. , 2.5, 3. ])
arsenal.maths.util.d_softmax(out, x, adj)[source]

out = softmax(x), adj are the adjoints we are chaining together.

A(x) = logsumexp(x)

∇ A(x) = softmax(x) ∇² A(x) = d_softmax(x) % this method.

arsenal.maths.util.entropy(p)[source]

Entropy of a discrete random variable with distribution p

arsenal.maths.util.f1(A, B)[source]

Compute the F1 measure on two bit vectors.

>>> f1([1], [1])
1.0
>>> f1([1], [0])
0.0
>>> f1([1,0,1], [0,1,1])
0.5
arsenal.maths.util.fit_curve(xs, ys)[source]
arsenal.maths.util.hardmax(x)[source]
arsenal.maths.util.inf_norm(a, b)[source]
arsenal.maths.util.kl_divergence(p, q)[source]

Compute KL divergence of two vectors, K(p || q). NOTE: If any value in q is 0.0 then the KL-divergence is infinite.

arsenal.maths.util.lidstone(p, delta)[source]

Lidstone smoothing is a generalization of Laplace smoothing.

arsenal.maths.util.linf(a, b)[source]
arsenal.maths.util.log1mexp(x)[source]

Numerically stable implementation of log(1-exp(x))

Note: function is finite for x < 0.

Source: http://cran.r-project.org/web/packages/Rmpfr/vignettes/log1mexp-note.pdf

arsenal.maths.util.logsubexp(x, y)[source]

Numerically stable computation of subtraction in log-space z = log(exp(x) - exp(y))

arsenal.maths.util.logsumexp(arr, axis=None)[source]

Computes the sum of arr assuming arr is in the log domain.

Returns log(sum(exp(arr))) while minimizing the possibility of over/underflow.

Examples:

>>> a = arange(10)
>>> log(sum(exp(a)))
9.45862974442671
>>> logsumexp(a)
9.45862974442671
>>> x = [[0, 0, 1000.0], [1000.0, 0, 0]]
>>> logsumexp(x, axis=1)
array([1000., 1000.])
>>> logsumexp(x)
1000.6931471805599
>>> logsumexp(x, axis=0)
array([1.00000000e+03, 6.93147181e-01, 1.00000000e+03])
arsenal.maths.util.mean_confidence_interval(a, confidence=0.95)[source]

returns (mean, lower, upper)

arsenal.maths.util.mutual_information(joint)[source]

Mutual Information

MI(x,y) = KL( p(x,y) || p(x) p(y) )

We can compute this easily from the joint distribution

joint = p(X=x,Y=y)
because
p(X=x) = sum_y p(X=x, Y=y) p(Y=y) = sum_x p(X=x, Y=y)
relationships:
MI(x,y) is the expected PMI(x,y) wrt p(x,y) MI(x,y) = KL(p(x,y) || p(x) p(y))
properties:
MI(X,Y) = MI(Y,X) is symmetric
arsenal.maths.util.norm(x, p=2)[source]
arsenal.maths.util.normalize(x, p=1)[source]
arsenal.maths.util.normalize_interval(data)[source]

Shift and rescale data so that it lies in the range [0,1].

arsenal.maths.util.normalize_zscore(data)[source]

Shift and rescale data to be zero-mean and unit-variance along axis 0.

arsenal.maths.util.onehot(i, n)[source]

Create a one-hot vector: a vector of length n with a 1 at position i and zeros elsewhere.

arsenal.maths.util.project_onto_simplex(a, radius=1.0)[source]

Project point a to the probability simplex. Returns (the projected point x, the zero threshold, and the residual value).

arsenal.maths.util.project_pmin_simplex(p, pmin)[source]
arsenal.maths.util.quadratic_formula(a, b, c)[source]
arsenal.maths.util.relative_difference(a, b)[source]

Element-wise relative difference of two arrays

Definition:
{ 0 if if x=y=0 { abs(x-y) / max(|x|, |y|) otherwise

Choices for denominator include:

max(|x|, |y|) # problem with x=y=0

avg(x, y) # problem with x = -y; gives geometric mean

avg(|x|, |y|) # problem when x=y=0

Further reading:
http://en.wikipedia.org/wiki/Relative_change_and_difference
arsenal.maths.util.restore_random_state(seed=None)[source]
arsenal.maths.util.set_printoptions(*args, **kw)[source]

Context manager for numpy’s print options.

arsenal.maths.util.simpless(w, B)[source]
Reparameterization transform similar to softmax, but for the constraints set
{p | p >= 0, sum(p) <= B}.
arsenal.maths.util.softmax(x, axis=None)[source]
>>> x = [1, -10, 100, .5]
>>> softmax(x)
array([1.01122149e-43, 1.68891188e-48, 1.00000000e+00, 6.13336839e-44])
>>> exp(x) / exp(x).sum()
array([1.01122149e-43, 1.68891188e-48, 1.00000000e+00, 6.13336839e-44])
>>> x = [[0, 0, 1000], [1000, 0, 0]]

Normalize by row: >>> softmax(x, axis=0) array([[0. , 0.5, 1. ],

[1. , 0.5, 0. ]])

Normalize by column: >>> softmax(x, axis=1) array([[0., 0., 1.],

[1., 0., 0.]])

Normalize by cell: >>> softmax(x, axis=None) array([[0. , 0. , 0.5],

[0.5, 0. , 0. ]])
arsenal.maths.util.split_ix(N, p, randomize=1)[source]

Sample a random partition of N integers with proportions p.

class arsenal.maths.util.subspace(A)[source]

Bases: object

Linear subspace

basic_checks()[source]
project(q)[source]
visualize(ax, num=100)[source]
arsenal.maths.util.wide_dataframe()[source]
arsenal.maths.util.wls(A, b, w)[source]

weighted least-squares estimation.

arsenal.maths.util.zero_retrieval(want, have)[source]

How good are we at retrieving zero values? Measured with F1 score.

Module contents