mizani - Man Page

Name

mizani — Mizani Documentation

Mizani is a Python library that provides the pieces necessary to create scales for a graphics system. It is based on the R Scales package.

Contents

bounds - Limiting data values for a palette

Continuous variables have values anywhere in the range minus infinity to plus infinity. However, when creating a visual representation of these values, what usually matters is the relative difference between the values. This is where rescaling comes into play.

The values are mapped onto a range that a scale can deal with. For graphical representation that range tends to be [0, 1] or [0, n], where n is some number that makes the plotted object overflow the plotting area.

Although a scale may be able to handle the [0, n] range, it may be desirable to have a lower bound greater than zero. For example, if data values get mapped to zero on a scale whose graphical representation is the size/area/radius/length, some data will be invisible. The solution is to restrict the lower bound, e.g. [0.1, 1]. Similarly, you can restrict the upper bound using these functions.

mizani.bounds.censor(x, range=(0, 1), only_finite=True)

Convert any values outside of range to a NULL type object.

Parameters
x

numpy:array_like Values to manipulate

range

python:tuple (min, max) giving desired output range

only_finite

bool If True (the default), will only modify finite values.

Returns
x

numpy:array_like Censored array

Notes

All values in x should be of the same type. only_finite parameter is not considered for Datetime and Timedelta types.

The NULL type object depends on the type of values in x.

  • float : float('nan')
  • int : float('nan')
  • datetime.datetime : np.datetime64(NaT)
  • datetime.timedelta : np.timedelta64(NaT)

Examples

>>> a = [1, 2, np.inf, 3, 4, -np.inf, 5]
>>> censor(a, (0, 10))
[1, 2, inf, 3, 4, -inf, 5]
>>> censor(a, (0, 10), False)
[1, 2, nan, 3, 4, nan, 5]
>>> censor(a, (2, 4))
[nan, 2, inf, 3, 4, -inf, nan]
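
The behaviour shown in these examples can be sketched in plain Python. This is an illustration only, not mizani's implementation: the name censor_sketch is hypothetical, and it handles only int and float values (no datetime or timedelta support).

```python
import math

def censor_sketch(values, limits=(0, 1), only_finite=True):
    """Replace out-of-range numbers with NaN (hypothetical helper)."""
    low, high = limits
    out = []
    for v in values:
        outside = v < low or v > high
        # With only_finite=True, infinite values are left untouched.
        if outside and (math.isfinite(v) or not only_finite):
            out.append(float("nan"))
        else:
            out.append(v)
    return out

a = [1, 2, math.inf, 3, 4, -math.inf, 5]
kept = censor_sketch(a, (0, 10))             # infinities survive
dropped = censor_sketch(a, (0, 10), False)   # infinities become NaN
narrow = censor_sketch(a, (2, 4))            # 1 and 5 become NaN
```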
mizani.bounds.expand_range(range, mul=0, add=0, zero_width=1)

Expand a range with a multiplicative or additive constant

Parameters
range

python:tuple Range of data. Size 2.

mul

python:int | python:float Multiplicative constant

add

python:int | python:float | timedelta Additive constant

zero_width

python:int | python:float | timedelta Distance to use if range has zero width

Returns
out

python:tuple Expanded range

Notes

If expanding datetime or timedelta types, add and zero_width must be suitable timedeltas, i.e. you should not mix types between Numpy, Pandas and the datetime module.

Examples

>>> expand_range((3, 8))
(3, 8)
>>> expand_range((0, 10), mul=0.1)
(-1.0, 11.0)
>>> expand_range((0, 10), add=2)
(-2, 12)
>>> expand_range((0, 10), mul=.1, add=2)
(-3.0, 13.0)
>>> expand_range((0, 1))
(0, 1)

When the range has zero width

>>> expand_range((5, 5))
(4.5, 5.5)
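
The arithmetic behind these results can be reproduced with a short sketch. It is inferred from the examples above rather than taken from the library source; expand_range_sketch is a hypothetical name and only plain numbers are handled.

```python
def expand_range_sketch(limits, mul=0, add=0, zero_width=1):
    """Numeric-only illustration of range expansion (hypothetical helper)."""
    low, high = limits
    width = high - low
    if width == 0:
        # Degenerate range: centre a window of zero_width on the value.
        return (low - zero_width / 2, high + zero_width / 2)
    # The same padding is applied to both ends of the range.
    pad = width * mul + add
    return (low - pad, high + pad)
```

For (0, 10) with mul=0.1 and add=2 the padding is 10 * 0.1 + 2 = 3, which gives (-3.0, 13.0) as in the example.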
mizani.bounds.rescale(x, to=(0, 1), _from=None)

Rescale numeric vector to have specified minimum and maximum.

Parameters
x

numpy:array_like | numeric 1D vector of values to manipulate.

to

python:tuple output range (numeric vector of length two)

_from

python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x

Returns
out

numpy:array_like Rescaled values

Examples

>>> x = [0, 2, 4, 6, 8, 10]
>>> rescale(x)
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
>>> rescale(x, to=(0, 2))
array([0. , 0.4, 0.8, 1.2, 1.6, 2. ])
>>> rescale(x, to=(0, 2), _from=(0, 20))
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
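
A minimal sketch of the linear map these examples illustrate, assuming a non-degenerate input range. The name rescale_sketch and the keyword from_ are hypothetical, and plain lists stand in for arrays.

```python
def rescale_sketch(values, to=(0, 1), from_=None):
    """Linearly map values from the range from_ onto the range to."""
    if from_ is None:
        from_ = (min(values), max(values))
    (f0, f1), (t0, t1) = from_, to
    # Normalise to [0, 1] within from_, then stretch and shift into to.
    return [(v - f0) / (f1 - f0) * (t1 - t0) + t0 for v in values]

x = [0, 2, 4, 6, 8, 10]
unit = rescale_sketch(x)
half = rescale_sketch(x, to=(0, 2), from_=(0, 20))
```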
mizani.bounds.rescale_max(x, to=(0, 1), _from=None)

Rescale numeric vector to have specified maximum.

Parameters
x

numpy:array_like | numeric 1D vector of values to manipulate.

to

python:tuple output range (numeric vector of length two)

_from

python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x. Only the 2nd (max) element is essential to the output.

Returns
out

numpy:array_like Rescaled values

Examples

>>> x = [0, 2, 4, 6, 8, 10]
>>> rescale_max(x, (0, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])

Only the 2nd (max) element of the parameters to and _from are essential to the output.

>>> rescale_max(x, (1, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])
>>> rescale_max(x, (0, 20))
array([ 0.,  4.,  8., 12., 16., 20.])

If max(x) > _from[1] then values will be scaled beyond the requested maximum (to[1]).

>>> rescale_max(x, to=(1, 3), _from=(-1, 6))
array([0., 1., 2., 3., 4., 5.])

If the values are the same, they take on the requested maximum. This includes an array of all zeros.

>>> rescale_max([5, 5, 5])
array([1., 1., 1.])
>>> rescale_max([0, 0, 0])
array([1, 1, 1])
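
The examples are consistent with dividing by the input maximum and multiplying by the output maximum. The sketch below illustrates that reading; rescale_max_sketch is a hypothetical name, and the all-zeros special case described above is deliberately not handled.

```python
def rescale_max_sketch(values, to=(0, 1), from_=None):
    """Scale values so that from_[1] lands on to[1] (hypothetical helper)."""
    if from_ is None:
        from_ = (min(values), max(values))
    # Only the upper ends of both ranges take part in the calculation.
    return [v / from_[1] * to[1] for v in values]

x = [0, 2, 4, 6, 8, 10]
scaled = rescale_max_sketch(x, (0, 3))
beyond = rescale_max_sketch(x, to=(1, 3), from_=(-1, 6))
```

In the second call max(x) is 10 while the input maximum is 6, so the largest output is 5, above the requested maximum of 3.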
mizani.bounds.rescale_mid(x, to=(0, 1), _from=None, mid=0)

Rescale numeric vector to have specified minimum, midpoint, and maximum.

Parameters
x

numpy:array_like | numeric 1D vector of values to manipulate.

to

python:tuple output range (numeric vector of length two)

_from

python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x

mid

numeric mid-point of input range

Returns
out

numpy:array_like Rescaled values

Examples

>>> rescale_mid([1, 2, 3], mid=1)
array([0.5 , 0.75, 1.  ])
>>> rescale_mid([1, 2, 3], mid=2)
array([0. , 0.5, 1. ])
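
One formula that reproduces both results centres the input on mid and uses a symmetric extent wide enough to cover both ends of the input range. This is a sketch under that assumption, not necessarily how the library computes it; rescale_mid_sketch is a hypothetical name.

```python
def rescale_mid_sketch(values, to=(0, 1), from_=None, mid=0):
    """Map mid to the centre of `to`, keeping distances from mid proportional."""
    if from_ is None:
        from_ = (min(values), max(values))
    # Symmetric extent around mid that covers both ends of from_.
    extent = 2 * max(abs(from_[0] - mid), abs(from_[1] - mid))
    centre = (to[0] + to[1]) / 2
    return [(v - mid) / extent * (to[1] - to[0]) + centre for v in values]
```

With mid=1 the extent is 4, so 1 maps to 0.5 and 3 maps to 1.0. With mid=2 the extent is 2, and the values spread over the whole output range.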
mizani.bounds.squish_infinite(x, range=(0, 1))

Truncate infinite values to a range.

Parameters
x

numpy:array_like Values that should have infinities squished.

range

python:tuple The range onto which to squish the infinites. Must be of size 2.

Returns
out

numpy:array_like Values with infinites squished.

Examples

>>> squish_infinite([0, .5, .25, np.inf, .44])
[0.0, 0.5, 0.25, 1.0, 0.44]
>>> squish_infinite([0, -np.inf, .5, .25, np.inf], (-10, 9))
[0.0, -10.0, 0.5, 0.25, 9.0]
mizani.bounds.zero_range(x, tol=2.220446049250313e-14)

Determine if range of vector is close to zero.

Parameters
x

numpy:array_like | numeric Value(s) to check. If it is an array_like, it should be of length 2.

tol

python:float Tolerance. Default tolerance is the machine epsilon times 10^2.

Returns
out

bool Whether x has zero range.

Examples

>>> zero_range([1, 1])
True
>>> zero_range([1, 2])
False
>>> zero_range([1, 2], tol=2)
True
mizani.bounds.expand_range_distinct(range, expand=(0, 0, 0, 0), zero_width=1)

Expand a range with multiplicative or additive constants

Similar to expand_range(), but the two sides of the range are expanded using different constants

Parameters
range

python:tuple Range of data. Size 2

expand

python:tuple Length 2 or 4. If length is 2, then the same constants are used for both sides. If length is 4, then the first two are the multiplicative (mul) and additive (add) constants for the lower limit, and the second two are the constants for the upper limit.

zero_width

python:int | python:float | timedelta Distance to use if range has zero width

Returns
out

python:tuple Expanded range

Examples

>>> expand_range_distinct((3, 8))
(3, 8)
>>> expand_range_distinct((0, 10), (0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0, 0))
(-1.0, 10)
>>> expand_range_distinct((0, 10), (0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 2, 0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 0, 0, 2))
(0, 12)
>>> expand_range_distinct((0, 10), (.1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (.1, 2, .1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (0, 0, .1, 2))
(0, 13.0)
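
The per-side arithmetic can be sketched as follows. The helper name is hypothetical, only plain numbers are handled, and the zero-width branch is an assumption carried over from the expand_range() example.

```python
def expand_range_distinct_sketch(limits, expand=(0, 0, 0, 0), zero_width=1):
    """Numeric-only illustration of per-side expansion (hypothetical helper)."""
    if len(expand) == 2:
        # (mul, add) applies to both sides.
        expand = tuple(expand) * 2
    low, high = limits
    width = high - low
    if width == 0:
        # Assumed to behave like expand_range() for a zero-width range.
        return (low - zero_width / 2, high + zero_width / 2)
    mul_lo, add_lo, mul_hi, add_hi = expand
    return (low - (width * mul_lo + add_lo), high + (width * mul_hi + add_hi))
```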
mizani.bounds.squish(x, range=(0, 1), only_finite=True)

Squish values into range.

Parameters
x

numpy:array_like Values that should have out of range values squished.

range

python:tuple The range onto which to squish the values.

only_finite

bool When True, only squishes finite values.

Returns
out

numpy:array_like Values with out of range values squished.

Examples

>>> squish([-1.5, 0.2, 0.5, 0.8, 1.0, 1.2])
[0.0, 0.2, 0.5, 0.8, 1.0, 1.0]
>>> squish([-np.inf, -1.5, 0.2, 0.5, 0.8, 1.0, np.inf], only_finite=False)
[0.0, 0.0, 0.2, 0.5, 0.8, 1.0, 1.0]
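
Squishing amounts to clamping each value to the range. A plain-Python sketch of that idea follows; squish_sketch is a hypothetical name and plain lists stand in for arrays.

```python
import math

def squish_sketch(values, limits=(0, 1), only_finite=True):
    """Clamp values into limits, optionally leaving non-finite values alone."""
    low, high = limits
    out = []
    for v in values:
        if only_finite and not math.isfinite(v):
            out.append(v)  # infinities (and NaN) pass through unchanged
        else:
            out.append(min(max(v, low), high))
    return out

finite_only = squish_sketch([-math.inf, -1.5, 0.2, 1.2, math.inf])
everything = squish_sketch([-math.inf, -1.5, 0.2, 1.2, math.inf], only_finite=False)
```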

breaks - Partitioning a scale for readability

All scales have a means by which the values that are mapped onto the scale are interpreted. Numeric digital scales put out numbers for direct interpretation, but most scales cannot do this. What they offer is named markers/ticks that aid in assessing the values, e.g. the common speedometer will have ticks and values to help gauge the speed of the vehicle.

The named markers are what we call breaks. Properly calculated breaks make interpretation straightforward. These functions provide ways to calculate good (hopefully) breaks.

class mizani.breaks.mpl_breaks(*args, **kwargs)

Compute breaks using MPL's default locator

See MaxNLocator for the parameter descriptions

Examples

>>> x = range(10)
>>> limits = (0, 9)
>>> mpl_breaks()(limits)
array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> mpl_breaks(nbins=2)(limits)
array([  0.,   5.,  10.])
__call__(limits)

Compute breaks

Parameters
limits

python:tuple Minimum and maximum values

Returns
out

numpy:array_like Sequence of break points

class mizani.breaks.log_breaks(n=5, base=10)

Integer breaks on log transformed scales

Parameters
n

python:int Desired number of breaks

base

python:int Base of logarithm

Examples

>>> x = np.logspace(3, 6)
>>> limits = min(x), max(x)
>>> log_breaks()(limits)
array([     1000,    10000,   100000,  1000000])
>>> log_breaks(2)(limits)
array([  1000, 100000])
>>> log_breaks()([0.1, 1])
array([0.1, 0.3, 1. , 3. ])
__call__(limits)

Compute breaks

Parameters
limits

python:tuple Minimum and maximum values

Returns
out

numpy:array_like Sequence of break points

class mizani.breaks.minor_breaks(n=1)

Compute minor breaks

Parameters
n

python:int Number of minor breaks between the major breaks.

Examples

>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> minor_breaks()(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])
>>> minor_breaks()([1, 2], (1, 2))
array([1.5])

More than 1 minor break.

>>> minor_breaks(3)([1, 2], (1, 2))
array([1.25, 1.5 , 1.75])
>>> minor_breaks()([1, 2], (1, 2), 3)
array([1.25, 1.5 , 1.75])
__call__(major, limits=None, n=None)

Minor breaks

Parameters
major

numpy:array_like Major breaks

limits

numpy:array_like | python:None Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.

n

python:int Number of minor breaks between the major breaks. If None, then self.n is used.

Returns
out

numpy:array_like Minor breaks
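
The minor_breaks examples are consistent with the following procedure: pad the major breaks by one step on each side, place n evenly spaced points in every gap, and keep the points that fall within the limits. The sketch below is inferred from those examples, assumes at least two evenly spaced major breaks, and uses a hypothetical name.

```python
def minor_breaks_sketch(major, limits=None, n=1):
    """Place n minor breaks in each gap between (padded) major breaks."""
    if limits is None:
        limits = (min(major), max(major))
    step = major[1] - major[0]
    # One extra major break on each side so the edges get minor breaks too.
    padded = [major[0] - step] + list(major) + [major[-1] + step]
    minor = []
    for left, right in zip(padded, padded[1:]):
        gap = (right - left) / (n + 1)
        minor.extend(left + gap * i for i in range(1, n + 1))
    # Discard anything outside the scale limits.
    return [m for m in minor if limits[0] <= m <= limits[1]]
```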

class mizani.breaks.trans_minor_breaks(trans, n=1)

Compute minor breaks for transformed scales

The minor breaks are computed in data space. This, together with major breaks computed in transform space, reveals the non-linearity of a scale. See the log transforms created with log_trans() like log10_trans.

Parameters
trans

trans or type Trans object or trans class.

n

python:int Number of minor breaks between the major breaks.

Examples

>>> from mizani.transforms import sqrt_trans
>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> sqrt_trans().minor_breaks(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])
>>> class sqrt_trans2(sqrt_trans):
...     def __init__(self):
...         self.minor_breaks = trans_minor_breaks(sqrt_trans2)
>>> sqrt_trans2().minor_breaks(major, limits)
array([1.58113883, 2.54950976, 3.53553391])

More than 1 minor break

>>> major = [1, 10]
>>> limits = [1, 10]
>>> sqrt_trans().minor_breaks(major, limits, 4)
array([2.8, 4.6, 6.4, 8.2])
__call__(major, limits=None, n=None)

Minor breaks for transformed scales

Parameters
major

numpy:array_like Major breaks

limits

numpy:array_like | python:None Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.

n

python:int Number of minor breaks between the major breaks. If None, then self.n is used.

Returns
out

numpy:array_like Minor breaks
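
The sqrt_trans2 result in the examples can be reproduced by mapping the major breaks back to data space, taking evenly spaced points between them there, and transforming those points forward again. The sketch below illustrates that reading only; the function name and its transform/inverse arguments are hypothetical, and filtering by limits is omitted.

```python
import math

def trans_minor_breaks_sketch(major, transform, inverse, n=1):
    """Minor breaks spaced evenly in data space, returned in transformed space."""
    # Major breaks live in transformed space; undo the transform first.
    data_major = [inverse(m) for m in major]
    minor = []
    for left, right in zip(data_major, data_major[1:]):
        gap = (right - left) / (n + 1)
        minor.extend(transform(left + gap * i) for i in range(1, n + 1))
    return minor

# Square root transform: the inverse is squaring.
result = trans_minor_breaks_sketch([1, 2, 3, 4], math.sqrt, lambda v: v ** 2)
```

Here the data-space midpoints are 2.5, 6.5 and 12.5, whose square roots are the three values shown for sqrt_trans2.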

class mizani.breaks.date_breaks(width=None)

Regularly spaced dates

Parameters
width

python:str | python:None An interval specification. The unit must be one of [second, minute, hour, day, week, month, year], optionally preceded by a count (e.g. '4 year'). If None, the interval is automatic.

Examples

>>> from datetime import datetime
>>> x = [datetime(year, 1, 1) for year in [2010, 2026, 2015]]

Default breaks will be regularly spaced but the spacing is automatically determined

>>> limits = min(x), max(x)
>>> breaks = date_breaks()
>>> [d.year for d in breaks(limits)]
[2010, 2012, 2014, 2016, 2018, 2020, 2022, 2024, 2026]

Breaks at 4 year intervals

>>> breaks = date_breaks('4 year')
>>> [d.year for d in breaks(limits)]
[2008, 2012, 2016, 2020, 2024, 2028]
__call__(limits)

Compute breaks

Parameters
limits

python:tuple Minimum and maximum datetime.datetime values.

Returns
out

numpy:array_like Sequence of break points.

class mizani.breaks.timedelta_breaks(n=5, Q=(1, 2, 5, 10))

Timedelta breaks

Returns
out

python:callable() f(limits) A function that takes a sequence of two datetime.timedelta values and returns a sequence of break points.

Examples

>>> from datetime import timedelta
>>> breaks = timedelta_breaks()
>>> x = [timedelta(days=i*365) for i in range(25)]
>>> limits = min(x), max(x)
>>> major = breaks(limits)
>>> [val.total_seconds() / (365*24*60*60) for val in major]
[0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
__call__(limits)

Compute breaks

Parameters
limits

python:tuple Minimum and maximum datetime.timedelta values.

Returns
out

numpy:array_like Sequence of break points.

class mizani.breaks.extended_breaks(n=5, Q=[1, 5, 2, 2.5, 4, 3], only_inside=False, w=[0.25, 0.2, 0.5, 0.05])

An extension of Wilkinson's tick position algorithm

Parameters
n

python:int Desired number of ticks

Q

python:list List of nice numbers

only_inside

bool If True, then all the ticks will be within the given range.

w

python:list Weights applied to the four optimization components (simplicity, coverage, density, and legibility). They should add up to 1.

References

  • Talbot, J., Lin, S., Hanrahan, P. (2010) An Extension of Wilkinson's Algorithm for Positioning Tick Labels on Axes, InfoVis 2010.

Additional Credit to Justin Talbot on whose code this implementation is almost entirely based.

Examples

>>> limits = (0, 9)
>>> extended_breaks()(limits)
array([  0. ,   2.5,   5. ,   7.5,  10. ])
>>> extended_breaks(n=6)(limits)
array([  0.,   2.,   4.,   6.,   8.,  10.])
__call__(limits)

Calculate the breaks

Parameters
limits

array Minimum and maximum values.

Returns
out

numpy:array_like Sequence of break points.

formatters - Labelling breaks

Scales have guides and these are what help users make sense of the data mapped onto the scale. Common examples of guides include the x-axis, the y-axis, the keyed legend and a colorbar legend. The guides have demarcations (breaks), some of which must be labelled.

The *_format functions below create functions that convert data values as understood by a specific scale and return string representations of those values. Manipulating the string representation of a value helps improve readability of the guide.

class mizani.formatters.comma_format(digits=0)

Format number with commas separating thousands

Parameters
digits

python:int Number of digits after the decimal point.

Examples

>>> comma_format()([1000, 2, 33000, 400])
['1,000', '2', '33,000', '400']
__call__(x)

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.formatters.custom_format(fmt='{}', style='new')

Custom format

Parameters
fmt

python:str, optional Format string. Default is the generic new style format braces, {}.

style

'new' | 'old' Whether to use new style or old style formatting. New style uses the str.format() while old style uses %. The format string must be written accordingly.

Examples

>>> formatter = custom_format('{:.2f} USD')
>>> formatter([3.987, 2, 42.42])
['3.99 USD', '2.00 USD', '42.42 USD']
__call__(x)

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.
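
The difference between the two styles is the standard Python one: str.format() for 'new' and the % operator for 'old'. A minimal sketch of a formatter factory in that spirit follows; custom_format_sketch is a hypothetical name, not the library class.

```python
def custom_format_sketch(fmt="{}", style="new"):
    """Return a function that formats each value with fmt."""
    if style == "new":
        return lambda values: [fmt.format(v) for v in values]
    if style == "old":
        return lambda values: [fmt % v for v in values]
    raise ValueError("style must be 'new' or 'old'")

new_style = custom_format_sketch("{:.2f} USD")
old_style = custom_format_sketch("%.2f USD", style="old")
```

Both formatters produce the same labels for the same input; only the syntax of the format string differs.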

class mizani.formatters.currency_format(prefix='$', suffix='', digits=2, big_mark='')

Currency formatter

Parameters
prefix

python:str What to put before the value.

suffix

python:str What to put after the value.

digits

python:int Number of digits after the decimal point

big_mark

python:str The thousands separator. This is usually a comma or a dot.

Examples

>>> x = [1.232, 99.2334, 4.6, 9, 4500]
>>> currency_format()(x)
['$1.23', '$99.23', '$4.60', '$9.00', '$4500.00']
>>> currency_format('C$', digits=0, big_mark=',')(x)
['C$1', 'C$99', 'C$5', 'C$9', 'C$4,500']
__call__(x)

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

mizani.formatters.dollar_format

alias of currency_format

class mizani.formatters.percent_format(use_comma=False)

Percent formatter

Multiply by one hundred and display percent sign

Parameters
use_comma

bool If True, use a comma to separate the thousands. Default is False.

Examples

>>> formatter = percent_format()
>>> formatter([.45, 9.515, .01])
['45%', '952%', '1%']
>>> formatter([.654, .8963, .1])
['65.4%', '89.6%', '10.0%']
__call__(x)

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.formatters.scientific_format(digits=3)

Scientific formatter

Parameters
digits

python:int Significant digits.

Notes

Be careful when using many digits (15+ on a 64 bit computer). Consider the machine epsilon.

Examples

>>> x = [.12, .23, .34, 45]
>>> scientific_format()(x)
['1.2e-01', '2.3e-01', '3.4e-01', '4.5e+01']
__call__(x)

Call self as a function.

class mizani.formatters.date_format(fmt='%Y-%m-%d', tz=None)

Datetime formatter

Parameters
fmt

python:str Format string. See strftime.

tz

datetime.tzinfo, optional Time zone information. If none is specified, the time zone will be that of the first date. If the first date has no time information then a time zone is chosen by other means.

Examples

>>> from datetime import datetime
>>> x = [datetime(x, 1, 1) for x in [2010, 2014, 2018, 2022]]
>>> date_format()(x)
['2010-01-01', '2014-01-01', '2018-01-01', '2022-01-01']
>>> date_format('%Y')(x)
['2010', '2014', '2018', '2022']

Can format time

>>> x = [datetime(2017, 12, 1, 16, 5, 7)]
>>> date_format("%Y-%m-%d %H:%M:%S")(x)
['2017-12-01 16:05:07']

Time zones are respected

>>> from zoneinfo import ZoneInfo
>>> UTC = ZoneInfo('UTC')
>>> UG = ZoneInfo('Africa/Kampala')
>>> x = [datetime(2010, 1, 1, i) for i in [8, 15]]
>>> x_tz = [datetime(2010, 1, 1, i, tzinfo=UG) for i in [8, 15]]
>>> date_format('%Y-%m-%d %H:%M')(x)
['2010-01-01 08:00', '2010-01-01 15:00']
>>> date_format('%Y-%m-%d %H:%M')(x_tz)
['2010-01-01 08:00', '2010-01-01 15:00']

Format with a specific time zone

>>> date_format('%Y-%m-%d %H:%M', tz=UTC)(x_tz)
['2010-01-01 05:00', '2010-01-01 12:00']
>>> date_format('%Y-%m-%d %H:%M', tz='EST')(x_tz)
['2010-01-01 00:00', '2010-01-01 07:00']
__call__(x)

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.
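
The time zone handling can be illustrated with the standard library alone. In this sketch, fixed UTC offsets stand in for named zones so that no time zone database is needed; date_format_sketch and EAT are hypothetical names, and only timezone-aware inputs are converted meaningfully.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for Africa/Kampala, which is UTC+3.
EAT = timezone(timedelta(hours=3), "EAT")
UTC = timezone.utc

def date_format_sketch(fmt="%Y-%m-%d", tz=None):
    """Return a formatter that optionally converts dates to tz first."""
    def formatter(dates):
        if tz is not None:
            dates = [d.astimezone(tz) for d in dates]
        return [d.strftime(fmt) for d in dates]
    return formatter

x_tz = [datetime(2010, 1, 1, hour, tzinfo=EAT) for hour in (8, 15)]
local = date_format_sketch("%Y-%m-%d %H:%M")(x_tz)
in_utc = date_format_sketch("%Y-%m-%d %H:%M", tz=UTC)(x_tz)
```

Without tz the labels keep the zone of the input (08:00 and 15:00); with tz=UTC they shift back three hours.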

class mizani.formatters.mpl_format

Format using MPL formatter for scalars

Examples

>>> mpl_format()([.654, .8963, .1])
['0.6540', '0.8963', '0.1000']
__call__(x)

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.formatters.log_format(base=10, exponent_limits=(-4, 4), mathtex=False)

Log Formatter

Parameters
base

python:int Base of the logarithm. Default is 10.

exponent_limits

python:tuple limits (int, int). If any of the powers of the numbers falls outside these limits, the labels will be in exponent form. This only applies for base 10.

mathtex

bool If True, return the labels in mathtex format as understood by Matplotlib.

Examples

>>> log_format()([0.001, 0.1, 100])
['0.001', '0.1', '100']
>>> log_format()([0.0001, 0.1, 10000])
['1e-4', '1e-1', '1e4']
>>> log_format(mathtex=True)([0.0001, 0.1, 10000])
['$10^{-4}$', '$10^{-1}$', '$10^{4}$']
__call__(x)

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.formatters.timedelta_format(units=None, add_units=True, usetex=False)

Timedelta formatter

Parameters
units

python:str, optional The units in which the breaks will be computed. If None, they are decided automatically. Otherwise, the value should be one of:

'ns'    # nanoseconds
'us'    # microseconds
'ms'    # milliseconds
's'     # seconds
'm'     # minute
'h'     # hour
'd'     # day
'w'     # week
'M'     # month
'y'     # year
add_units

bool Whether to append the units identifier string to the values.

usetex

bool If True, the microseconds identifier string is rendered with the Greek letter mu. Default is False.

Examples

>>> from datetime import timedelta
>>> x = [timedelta(days=31*i) for i in range(5)]
>>> timedelta_format()(x)
['0', '1 month', '2 months', '3 months', '4 months']
>>> timedelta_format(units='d')(x)
['0', '31 days', '62 days', '93 days', '124 days']
>>> timedelta_format(units='d', add_units=False)(x)
['0', '31', '62', '93', '124']
__call__(x)

Call self as a function.

class mizani.formatters.pvalue_format(accuracy=0.001, add_p=False)

p-values Formatter

Parameters
accuracy

python:float Number to round to

add_p

bool Whether to prepend "p=" or "p<" to the output

Examples

>>> x = [.90, .15, .015, .009, 0.0005]
>>> pvalue_format()(x)
['0.9', '0.15', '0.015', '0.009', '<0.001']
>>> pvalue_format(0.1)(x)
['0.9', '0.1', '<0.1', '<0.1', '<0.1']
>>> pvalue_format(0.1, True)(x)
['p=0.9', 'p=0.1', 'p<0.1', 'p<0.1', 'p<0.1']
__call__(x)

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.formatters.ordinal_format(prefix='', suffix='', big_mark='')

Ordinal Formatter

Parameters
prefix

python:str What to put before the value.

suffix

python:str What to put after the value.

big_mark

python:str The thousands separator. This is usually a comma or a dot.

Examples

>>> ordinal_format()(range(8))
['0th', '1st', '2nd', '3rd', '4th', '5th', '6th', '7th']
>>> ordinal_format(suffix=' Number')(range(11, 15))
['11th Number', '12th Number', '13th Number', '14th Number']
__call__(x)

Call self as a function.
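
The suffix rule behind the ordinal_format examples is the usual English one. A small sketch, using a hypothetical helper name and handling non-negative integers only:

```python
def ordinal_sketch(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # Numbers whose last two digits are 10 to 20 always take "th" (11th, 12th, 13th).
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

labels = [ordinal_sketch(i) for i in range(8)]
teens = [ordinal_sketch(i) for i in range(11, 15)]
```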

class mizani.formatters.number_bytes_format(symbol='auto', units='binary', fmt='{:.0f} ')

Bytes Formatter

Parameters
symbol

python:str Valid symbols are "B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", and "YB" for SI units, and the "iB" variants for binary units. Default is "auto" where the symbol to be used is determined separately for each value of x.

units

"binary" | "si" Which unit base to use, 1024 for "binary" or 1000 for "si".

fmt

python:str, optional Format string. Default is '{:.0f} '.

Examples

>>> x = [1000, 1000000, 4e5]
>>> number_bytes_format()(x)
['1000 B', '977 KiB', '391 KiB']
>>> number_bytes_format(units='si')(x)
['1 kB', '1 MB', '400 kB']
__call__(x)

Call self as a function.
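
How the "auto" symbol might be chosen for each value can be sketched as follows. The helper name and the truncated symbol lists are assumptions for illustration, with the binary spelling "KiB" taken from the example output above.

```python
def bytes_label_sketch(value, units="binary", fmt="{:.0f} "):
    """Label one byte count with an automatically chosen unit symbol."""
    if units == "binary":
        base, symbols = 1024, ["B", "KiB", "MiB", "GiB", "TiB"]
    else:
        base, symbols = 1000, ["B", "kB", "MB", "GB", "TB"]
    power = 0
    # Pick the largest unit that keeps the scaled value at or above 1.
    while power < len(symbols) - 1 and value >= base ** (power + 1):
        power += 1
    return fmt.format(value / base ** power) + symbols[power]

x = [1000, 1000000, 4e5]
binary = [bytes_label_sketch(v) for v in x]
si = [bytes_label_sketch(v, units="si") for v in x]
```

With the binary base, 1000 stays in bytes because it is below 1024, while with the SI base it becomes 1 kB.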

palettes - Mapping values onto the domain of a scale

Palettes are the link between data values and the values along the dimension of a scale. Before a collection of values can be represented on a scale, they are transformed by a palette. This transformation is known as mapping. Values are mapped onto a scale by a palette.

Scales tend to have restrictions on the magnitude of quantities that they can intelligibly represent. For example, the size of a point should be significantly smaller than the plot panel onto which it is plotted or else it would be hard to compare two or more points. Therefore palettes must be created that enforce such restrictions. This is the reason for the *_pal functions that create and return the actual palette functions.

mizani.palettes.hls_palette(n_colors=6, h=0.01, l=0.6, s=0.65)

Get a set of evenly spaced colors in HLS hue space.

h, l, and s should be between 0 and 1

Parameters
n_colors

python:int number of colors in the palette

h

python:float first hue

l

python:float lightness

s

python:float saturation

Returns
palette

python:list List of colors as RGB hex strings.

SEE ALSO:

husl_palette

Make a palette using evenly spaced circular hues in the HUSL system.

Examples

>>> len(hls_palette(2))
2
>>> len(hls_palette(9))
9
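
Evenly spaced hues can be generated with the standard colorsys module. The sketch below is an illustration under that assumption, with a hypothetical function name; it is not the library's code. With the default arguments and five colors, its first color is '#db5f57'.

```python
import colorsys

def hls_palette_sketch(n_colors=6, h=0.01, l=0.6, s=0.65):
    """Return n_colors hex strings with hues spaced evenly from h."""
    colors = []
    for i in range(n_colors):
        hue = (h + i / n_colors) % 1  # wrap around the hue circle
        rgb = colorsys.hls_to_rgb(hue, l, s)
        colors.append("#" + "".join(f"{round(c * 255):02x}" for c in rgb))
    return colors

five = hls_palette_sketch(5)
```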
mizani.palettes.husl_palette(n_colors=6, h=0.01, s=0.9, l=0.65)

Get a set of evenly spaced colors in HUSL hue space.

h, s, and l should be between 0 and 1

Parameters
n_colors

python:int number of colors in the palette

h

python:float first hue

s

python:float saturation

l

python:float lightness

Returns
palette

python:list List of colors as RGB hex strings.

SEE ALSO:

hls_palette

Make a palette using evenly spaced circular hues in the HSL system.

Examples

>>> len(husl_palette(3))
3
>>> len(husl_palette(11))
11
mizani.palettes.rescale_pal(range=(0.1, 1))

Rescale the input to the specific output range.

Useful for alpha, size, and continuous position.

Parameters
range

python:tuple Range of the scale

Returns
out

function Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.

Examples

>>> palette = rescale_pal()
>>> palette([0, .2, .4, .6, .8, 1])
array([0.1 , 0.28, 0.46, 0.64, 0.82, 1.  ])

The returned palette expects inputs in the [0, 1] range. Any value outside those limits is clipped to range[0] or range[1].

>>> palette([-2, -1, 0.2, .4, .8, 2, 3])
array([0.1 , 0.1 , 0.28, 0.46, 0.82, 1.  , 1.  ])
mizani.palettes.area_pal(range=(1, 6))

Point area palette (continuous).

Parameters
range

python:tuple Numeric vector of length two, giving range of possible sizes. Should be greater than 0.

Returns
out

function Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.

Examples

>>> x = np.arange(0, .6, .1)**2
>>> palette = area_pal()
>>> palette(x)
array([1. , 1.5, 2. , 2.5, 3. , 3.5])

The results are equidistant because the input x is in area space, i.e. it is squared.
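
The example output is consistent with taking the square root of the input before rescaling it onto the output range. A sketch under that assumption, with hypothetical names and no clipping of out-of-range input:

```python
import math

def area_pal_sketch(out_range=(1, 6)):
    """Return a palette that rescales sqrt(value) onto out_range."""
    low, high = out_range

    def palette(values):
        # sqrt turns area-like inputs into a linear size before rescaling.
        return [low + math.sqrt(v) * (high - low) for v in values]

    return palette

x = [(i / 10) ** 2 for i in range(6)]
sizes = area_pal_sketch()(x)
```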

mizani.palettes.abs_area(max)

Point area palette (continuous), with area proportional to value.

Parameters
max

python:float A number representing the maximum size

Returns
out

function Palette function that takes a sequence of values in the range [0, 1] and returns values in the range [0, max].

Examples

>>> x = np.arange(0, .8, .1)**2
>>> palette = abs_area(5)
>>> palette(x)
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5])

Compared to area_pal(), abs_area() will handle values in the range [-1, 0] without returning np.nan. And values whose absolute value is greater than 1 will be clipped to the maximum.

mizani.palettes.grey_pal(start=0.2, end=0.8)

Utility for creating continuous grey scale palette

Parameters
start

python:float grey value at low end of palette

end

python:float grey value at high end of palette

Returns
out

function Continuous color palette that takes a single int parameter n and returns n equally spaced colors.

Examples

>>> palette = grey_pal()
>>> palette(5)
['#333333', '#737373', '#989898', '#b5b5b5', '#cccccc']
mizani.palettes.hue_pal(h=0.01, l=0.6, s=0.65, color_space='hls')

Utility for making hue palettes for color schemes.

Parameters
h

python:float first hue. In the [0, 1] range

l

python:float lightness. In the [0, 1] range

s

python:float saturation. In the [0, 1] range

color_space

'hls' | 'husl' Color space to use for the palette

Returns
out

function A discrete color palette that takes a single int parameter n and returns n equally spaced colors. Though the palette is continuous, since it varies the hue it is good for categorical data. However, if n is large enough the colors show continuity.

Examples

>>> hue_pal()(5)
['#db5f57', '#b9db57', '#57db94', '#5784db', '#c957db']
>>> hue_pal(color_space='husl')(5)
['#e0697e', '#9b9054', '#569d79', '#5b98ab', '#b675d7']
mizani.palettes.brewer_pal(type='seq', palette=1, direction=1)

Utility for making a brewer palette

Parameters
type

'sequential' | 'qualitative' | 'diverging' Type of palette. Sequential, Qualitative or Diverging. The following abbreviations may be used, seq, qual or div.

palette

python:int | python:str Which palette to choose from. If it is an integer, it must be in the range [0, m], where m depends on the number of sequential, qualitative or diverging palettes. If it is a string, then it is the name of the palette.

direction

python:int The order of colours in the scale. If -1 the order of colors is reversed. The default is 1.

Returns
out

function A color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.

Examples

>>> brewer_pal()(5)
['#EFF3FF', '#BDD7E7', '#6BAED6', '#3182BD', '#08519C']
>>> brewer_pal('qual')(5)
['#7FC97F', '#BEAED4', '#FDC086', '#FFFF99', '#386CB0']
>>> brewer_pal('qual', 2)(5)
['#1B9E77', '#D95F02', '#7570B3', '#E7298A', '#66A61E']
>>> brewer_pal('seq', 'PuBuGn')(5)
['#F6EFF7', '#BDC9E1', '#67A9CF', '#1C9099', '#016C59']

The available color names for each palette type can be obtained using the following code:

import palettable.colorbrewer as brewer

print([k for k in brewer.COLOR_MAPS['Sequential'].keys()])
print([k for k in brewer.COLOR_MAPS['Qualitative'].keys()])
print([k for k in brewer.COLOR_MAPS['Diverging'].keys()])
mizani.palettes.gradient_n_pal(colors, values=None, name='gradientn')

Create an n color gradient palette

Parameters
colors

python:list list of colors

values

python:list, optional list of points in the range [0, 1] at which to place each color. Must be the same size as colors. Defaults to evenly spaced colors

name

python:str Name to call the resultant MPL colormap

Returns
out

function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = gradient_n_pal(['red', 'blue'])
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#bf0040', '#7f0080', '#3f00c0', '#0000ff']
>>> palette([-np.inf, 0, np.nan, 1, np.inf])
[nan, '#ff0000', nan, '#0000ff', nan]
mizani.palettes.cmap_pal(name, lut=None)

Create a continuous palette using an MPL colormap

Parameters
name

python:str Name of colormap

lut

python:None | python:int This is the number of entries desired in the lookup table. Default is None, which leaves it up to Matplotlib.

Returns
out

function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = cmap_pal('viridis')
>>> palette([.1, .2, .3, .4, .5])
['#482475', '#414487', '#355f8d', '#2a788e', '#21918c']
mizani.palettes.cmap_d_pal(name, lut=None)

Create a discrete palette using an MPL Listed colormap

Parameters
name

python:str Name of colormap

lut

python:None | python:int This is the number of entries desired in the lookup table. Default is None, which leaves it up to Matplotlib.

Returns
out

function A discrete color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.

Examples

>>> palette = cmap_d_pal('viridis')
>>> palette(5)
['#440154', '#3b528b', '#21918c', '#5cc863', '#fde725']
mizani.palettes.desaturate_pal(color, prop, reverse=False)

Create a palette that desaturates a color by some proportion

Parameters
color

matplotlib color hex, rgb-tuple, or html color name

prop

python:float saturation channel of color will be multiplied by this value

reverse

bool Whether to reverse the palette.

Returns
out

function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = desaturate_pal('red', .1)
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#e21d1d', '#c53a3a', '#a95656', '#8c7373']
mizani.palettes.manual_pal(values)

Create a palette from a list of values

Parameters
values

python:sequence Values that will be returned by the palette function.

Returns
out

function A function palette that takes a single int parameter n and returns n values.

Examples

>>> palette = manual_pal(['a', 'b', 'c', 'd', 'e'])
>>> palette(3)
['a', 'b', 'c']
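
The idea of a palette function that closes over a fixed list can be sketched in a few lines. The name manual_palette is hypothetical, and raising an error when more values are requested than exist is an assumption made for this sketch; the library may handle that case differently.

```python
def manual_palette(values):
    """Return a function that hands back the first n of the given values.

    Hypothetical stand-in for illustration, not Mizani's implementation.
    """
    values = list(values)

    def palette(n):
        # Assumption: asking for more values than were supplied is an error.
        if n > len(values):
            raise ValueError(f"palette holds {len(values)} values, {n} requested")
        return values[:n]

    return palette


shapes = manual_palette(["circle", "square", "triangle"])
print(shapes(2))
```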
mizani.palettes.xkcd_palette(colors)

Make a palette with color names from the xkcd color survey.

See xkcd for the full list of colors: http://xkcd.com/color/rgb/

Parameters
colors

python:list of strings List of keys in the mizani.external.xkcd_rgb dictionary.

Returns
palette

python:list List of colors as RGB hex strings.

Examples

>>> palette = xkcd_palette(['red', 'green', 'blue'])
>>> palette
['#e50000', '#15b01a', '#0343df']
>>> from mizani.external import xkcd_rgb
>>> list(sorted(xkcd_rgb.keys()))[:5]
['acid green', 'adobe', 'algae', 'algae green', 'almost black']
mizani.palettes.crayon_palette(colors)

Make a palette with color names from Crayola crayons.

The colors come from http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors

Parameters
colors

python:list of strings List of keys in the mizani.external.crayon_rgb dictionary.

Returns
palette

python:list List of colors as RGB hex strings.

Examples

>>> palette = crayon_palette(['almond', 'silver', 'yellow'])
>>> palette
['#eed9c4', '#c9c0bb', '#fbe870']
>>> from mizani.external import crayon_rgb
>>> list(sorted(crayon_rgb.keys()))[:5]
['almond', 'antique brass', 'apricot', 'aquamarine', 'asparagus']
mizani.palettes.cubehelix_pal(start=0, rot=0.4, gamma=1.0, hue=0.8, light=0.85, dark=0.15, reverse=False)

Utility for creating continuous palette from the cubehelix system.

This produces a colormap with linearly-decreasing (or increasing) brightness. That means that information will be preserved if printed to black and white or viewed by someone who is colorblind.

Parameters
start

python:float (0 <= start <= 3) The hue at the start of the helix.

rot

python:float Rotations around the hue wheel over the range of the palette.

gamma

python:float (0 <= gamma) Gamma factor to emphasize darker (gamma < 1) or lighter (gamma > 1) colors.

hue

python:float (0 <= hue <= 1) Saturation of the colors.

dark

python:float (0 <= dark <= 1) Intensity of the darkest color in the palette.

light

python:float (0 <= light <= 1) Intensity of the lightest color in the palette.

reverse

bool If True, the palette will go from dark to light.

Returns
out

function Continuous color palette that takes a single int parameter n and returns n equally spaced colors.

References

Green, D. A. (2011). "A colour scheme for the display of astronomical intensity images". Bulletin of the Astronomical Society of India, Vol. 39, p. 289-295.

Examples

>>> palette = cubehelix_pal()
>>> palette(5)
['#edd1cb', '#d499a7', '#aa688f', '#6e4071', '#2d1e3e']

transforms - Transforming variables, scales and coordinates

"The Grammar of Graphics (2005)" by Wilkinson, Anand and Grossman describes three types of transformations.

  • Variable transformations - Used to make statistical operations on variables appropriate and meaningful. They are also used to create new variables.
  • Scale transformations - Used to make statistical objects displayed on dimensions appropriate and meaningful.
  • Coordinate transformations - Used to manipulate the geometry of graphics to help perceive relationships and find meaningful structures for representing variations.

Variable and scale transformations are similar in that they lead to plotted objects that are indistinguishable. Typically, variable transformation is done outside the graphics system and so the system cannot provide transformation-specific guides and decorations for the plot. The trans class is aimed at being useful for scale and coordinate transformations.

class mizani.transforms.asn_trans(**kwargs)

Arc-sin square-root Transformation

static transform(x)

Transform of x

static inverse(x)

Inverse of x

class mizani.transforms.atanh_trans(**kwargs)

Inverse hyperbolic tangent (arctanh) Transformation

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'arctanh'>

inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'tanh'>

mizani.transforms.boxcox_trans(p, offset=0, **kwargs)

Boxcox Transformation

The Box-Cox transformation is a flexible transformation, often used to transform data towards normality.

The Box-Cox power transformation (type 1) requires strictly positive values and takes the following form for y \gt 0:

y^{(\lambda)} = \frac{y^\lambda - 1}{\lambda}

When \lambda = 0, the natural log transform is used.
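
A minimal sketch of the formula in plain Python, with the natural logarithm used as the limiting case when the exponent is zero. The function names and the zero tolerance of 1e-7 are assumptions for illustration; the offset argument follows the parameter description below (0 for type 1, a non-negative constant for type 2).

```python
import math


def boxcox(y, lam, offset=0):
    """Box-Cox transform of a single value; offset=0 gives type 1."""
    shifted = y + offset
    if shifted <= 0:
        raise ValueError("Box-Cox needs y + offset > 0")
    if abs(lam) < 1e-7:  # exponent treated as zero: natural-log limit
        return math.log(shifted)
    return (shifted**lam - 1) / lam


def boxcox_inverse(z, lam, offset=0):
    """Undo boxcox() for the same exponent and offset."""
    if abs(lam) < 1e-7:
        return math.exp(z) - offset
    return (z * lam + 1) ** (1 / lam) - offset


print(boxcox(4, 0.5))
print(boxcox_inverse(boxcox(3.0, 2), 2))
```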

Parameters
p

python:float Transformation exponent \lambda.

offset

python:int Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 0. modulus_trans() sets the default to 1.

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

SEE ALSO:

modulus_trans()

mizani.transforms.modulus_trans(p, offset=1, **kwargs)

Modulus Transformation

The modulus transformation generalises Box-Cox to work with both positive and negative values.

When \lambda \neq 0

y^{(\lambda)} = sign(y) * \frac{(|y| + 1)^\lambda - 1}{\lambda}

and when \lambda = 0

y^{(\lambda)} = sign(y) * \ln{(|y| + 1)}
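
The two cases can be written out in plain Python as a sketch. The function name modulus and the zero tolerance are assumptions for illustration; the offset default of 1 matches the signature above.

```python
import math


def modulus(y, lam, offset=1):
    """Modulus transform of a single value: Box-Cox applied to |y| + offset, sign restored."""
    sign = (y > 0) - (y < 0)  # -1, 0 or 1
    magnitude = abs(y) + offset
    if abs(lam) < 1e-7:  # exponent treated as zero: logarithmic case
        return sign * math.log(magnitude)
    return sign * (magnitude**lam - 1) / lam


# Symmetric around zero, unlike plain Box-Cox.
print(modulus(3, 2), modulus(-3, 2), modulus(0, 2))
```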

Parameters
p

python:float Transformation exponent \lambda.

offset

python:int Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 1. boxcox_trans() sets the default to 0.

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

SEE ALSO:

boxcox_trans()

class mizani.transforms.datetime_trans(tz=None, **kwargs)

Datetime Transformation

Parameters
tz

python:str | ZoneInfo Timezone information

Examples

>>> import datetime
>>> from zoneinfo import ZoneInfo  # python < 3.9: from backports.zoneinfo import ZoneInfo
>>> UTC = ZoneInfo("UTC")
>>> EST = ZoneInfo("EST")
>>> t = datetime_trans(EST)
>>> x = datetime.datetime(2022, 1, 20, tzinfo=UTC)
>>> x2 = t.inverse(t.transform(x))
>>> x == x2
True
>>> x.tzinfo == x2.tzinfo
False
>>> x.tzinfo.key
'UTC'
>>> x2.tzinfo.key
'EST'
dataspace_is_numerical = False

Whether the untransformed data is numerical

domain = (datetime.datetime(1, 1, 1, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')), datetime.datetime(9999, 12, 31, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')))

Limits of the transformed data

breaks_ = <mizani.breaks.date_breaks object>

Callable to calculate breaks

format = <mizani.formatters.date_format object>

Function to format breaks

transform(x)

Transform from date to a numerical format

inverse(x)

Transform to date from numerical format

property tzinfo

Alias of tz

mizani.transforms.exp_trans(base=None, **kwargs)

Create an exponential transform class for base

This is the inverse of the log transform.

Parameters
base

python:float Base of the logarithm

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

Returns
out

type Exponential transform class
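
The relationship to the log transform can be sketched with a pair of plain functions. The helper make_exp_pair is hypothetical, and treating base=None as the natural base e is an assumption carried over from the log_trans description.

```python
import math


def make_exp_pair(base=None):
    """Return (transform, inverse) for an exponential transform in the given base."""
    b = math.e if base is None else base  # assumption: None means base e

    def transform(x):
        return b**x

    def inverse(x):
        # The inverse of exponentiation is the logarithm in the same base.
        return math.log(x) / math.log(b)

    return transform, inverse


transform, inverse = make_exp_pair(2)
print(transform(3), inverse(8))
```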

class mizani.transforms.identity_trans(**kwargs)

Identity Transformation

class mizani.transforms.log10_trans(**kwargs)

Log 10 Transformation

breaks_ = <mizani.breaks.log_breaks object>

Callable to calculate breaks

domain = (2.2250738585072014e-308, inf)

Limits of the transformed data

format = <mizani.formatters.log_format object>

Function to format breaks

static inverse(x)

Inverse of x

minor_breaks = <mizani.breaks.trans_minor_breaks object>

Callable to calculate minor_breaks

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log10'>

class mizani.transforms.log1p_trans(**kwargs)

Log plus one Transformation

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log1p'>

inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'expm1'>

class mizani.transforms.log2_trans(**kwargs)

Log 2 Transformation

breaks_ = <mizani.breaks.log_breaks object>

Callable to calculate breaks

domain = (2.2250738585072014e-308, inf)

Limits of the transformed data

format = <mizani.formatters.log_format object>

Function to format breaks

static inverse(x)

Inverse of x

minor_breaks = <mizani.breaks.trans_minor_breaks object>

Callable to calculate minor_breaks

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log2'>

mizani.transforms.log_trans(base=None, **kwargs)

Create a log transform class for base

Parameters
base

python:float Base for the logarithm. If None, then the natural log is used.

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

Returns
out

type Log transform class

class mizani.transforms.logit_trans(**kwargs)

Logit Transformation

domain = (0, 1)

Limits of the transformed data

static inverse(x)

Inverse of x

static transform(x)

Transform of x

mizani.transforms.probability_trans(distribution, *args, **kwargs)

Probability Transformation

Parameters
distribution

python:str Name of the distribution. Valid distributions are listed at scipy.stats. Any of the continuous or discrete distributions.

args

python:tuple Arguments passed to the distribution functions.

kwargs

python:dict Keyword arguments passed to the distribution functions.

Notes

Make sure that the distribution is a good enough approximation for the data. When this is not the case, computations may run into errors. Absence of any errors does not imply that the distribution fits the data.

mizani.transforms.probit_trans

alias of norm_trans

class mizani.transforms.reverse_trans(**kwargs)

Reverse Transformation

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>

inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>

class mizani.transforms.sqrt_trans(**kwargs)

Square-root Transformation

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'sqrt'>

inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'square'>

domain = (0, inf)

Limits of the transformed data

class mizani.transforms.timedelta_trans(**kwargs)

Timedelta Transformation

dataspace_is_numerical = False

Whether the untransformed data is numerical

domain = (datetime.timedelta(days=-999999999), datetime.timedelta(days=999999999, seconds=86399, microseconds=999999))

Limits of the transformed data

breaks_ = <mizani.breaks.timedelta_breaks object>

Callable to calculate breaks

format = <mizani.formatters.timedelta_format object>

Function to format breaks

static transform(x)

Transform from Timedelta to numerical format

static inverse(x)

Transform to Timedelta from numerical format

class mizani.transforms.pd_timedelta_trans(**kwargs)

Pandas timedelta Transformation

dataspace_is_numerical = False

Whether the untransformed data is numerical

domain = (Timedelta('-106752 days +00:12:43.145224193'), Timedelta('106751 days 23:47:16.854775807'))

Limits of the transformed data

breaks_ = <mizani.breaks.timedelta_breaks object>

Callable to calculate breaks

format = <mizani.formatters.timedelta_format object>

Function to format breaks

static transform(x)

Transform from Timedelta to numerical format

static inverse(x)

Transform to Timedelta from numerical format

mizani.transforms.pseudo_log_trans(sigma=1, base=None, **kwargs)

Pseudo-log transformation

A transformation mapping numbers to a signed logarithmic scale with a smooth transition to linear scale around 0.
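
A sketch of how such a transformation can behave, assuming the asinh-based definition used by the R scales package on which Mizani is based. This page does not state the formula, so the functions below are an illustration under that assumption rather than the library's code.

```python
import math


def pseudo_log(x, sigma=1, base=None):
    """Signed, log-like transform that is roughly linear near zero."""
    b = math.e if base is None else base
    return math.asinh(x / (2 * sigma)) / math.log(b)


def pseudo_log_inverse(z, sigma=1, base=None):
    """Undo pseudo_log() for the same sigma and base."""
    b = math.e if base is None else base
    return 2 * sigma * math.sinh(z * math.log(b))


# Zero and negative values are fine; large values track the ordinary logarithm.
for value in (-100, 0, 0.5, 1e6):
    print(value, pseudo_log(value))
```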

Parameters
sigma

python:float Scaling factor for the linear part.

base

python:int Approximate logarithm used. If None, then the natural log is used.

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

class mizani.transforms.reciprocal_trans(**kwargs)

Reciprocal Transformation

static transform(x)

Transform of x

static inverse(x)

Inverse of x

class mizani.transforms.trans(**kwargs)

Base class for all transforms

This class is used to transform data and also tell the x and y axes how to create and label the tick locations.

The key methods to override are trans.transform() and trans.inverse(). Alternately, you can quickly create a transform class using the trans_new() function.

Parameters
kwargs

python:dict Attributes of the class to set/override

Examples

By default trans returns one minor break between every pair of major breaks

>>> major = [0, 1, 2]
>>> t = trans()
>>> t.minor_breaks(major)
array([0.5, 1.5])

Create a trans that returns 4 minor breaks

>>> t = trans(minor_breaks=minor_breaks(4))
>>> t.minor_breaks(major)
array([0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8])
aesthetic = None

Aesthetic that the transform works on

dataspace_is_numerical = True

Whether the untransformed data is numerical

domain = (-inf, inf)

Limits of the transformed data

format = <mizani.formatters.mpl_format object>

Function to format breaks

breaks_ = None

Callable to calculate breaks

minor_breaks = None

Callable to calculate minor_breaks

static transform(x)

Transform of x

static inverse(x)

Inverse of x

breaks(limits)

Calculate breaks in data space and return them in transformed space.

Expects limits to be in transform space, this is the same space as that where the domain is specified.

This method wraps around breaks_() to ensure that the calculated breaks are within the domain of the transform. This is helpful in cases where an aesthetic requests breaks with limits expanded for some padding, yet the expansion goes beyond the domain of the transform. For example, for a probability transform the breaks will be in the domain [0, 1] despite any outward limits.

Parameters
limits

python:tuple The scale limits. Size 2.

Returns
out

numpy:array_like Major breaks
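
The wrapping behaviour described above can be sketched as follows. Both function names are hypothetical and quarter_breaks is a toy break calculator; the sketch only shows the idea of clipping the limits to the domain before the breaks are computed and dropping any break that still falls outside.

```python
import math


def breaks_within_domain(limits, domain, compute_breaks):
    """Compute breaks for limits, keeping the result inside domain."""
    lo = max(limits[0], domain[0])
    hi = min(limits[1], domain[1])
    candidates = compute_breaks((lo, hi))
    return [b for b in candidates if domain[0] <= b <= domain[1]]


def quarter_breaks(limits):
    """Toy break calculator: multiples of 0.25 that span the limits."""
    first = math.floor(limits[0] / 0.25)
    last = math.ceil(limits[1] / 0.25)
    return [k * 0.25 for k in range(first, last + 1)]


padded = (-0.1, 1.1)  # limits expanded beyond a probability domain
print(quarter_breaks(padded))
print(breaks_within_domain(padded, (0, 1), quarter_breaks))
```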

mizani.transforms.trans_new(name, transform, inverse, breaks=None, minor_breaks=None, _format=None, domain=(-inf, inf), doc='', **kwargs)

Create a transformation class object

Parameters
name

python:str Name of the transformation

transform

python:callable() f(x) A function (preferably a ufunc) that computes the transformation.

inverse

python:callable() f(x) A function (preferably a ufunc) that computes the inverse of the transformation.

breaks

python:callable() f(limits) Function to compute the breaks for this transform. If None, then a default good enough for a linear domain is used.

minor_breaks

python:callable() f(major, limits) Function to compute the minor breaks for this transform. If None, then a default good enough for a linear domain is used.

_format

python:callable() f(breaks) Function to format the generated breaks.

domain

numpy:array_like Domain over which the transformation is valid. It should be of length 2.

doc

python:str Docstring for the class.

**kwargs

python:dict Attributes of the transform, e.g. if base is passed in kwargs, then t.base would be a valid attribute.

Returns
out

trans Transform class
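
A class factory of this kind can be sketched with the built-in type(). The function make_trans is a hypothetical stand-in, not Mizani's trans_new(): it omits breaks and formatting, and naming the class "<name>_trans" is an assumption made for the sketch.

```python
def make_trans(name, transform, inverse, domain=(float("-inf"), float("inf")), **attrs):
    """Build a class whose transform/inverse are the given functions."""
    namespace = {
        "transform": staticmethod(transform),
        "inverse": staticmethod(inverse),
        "domain": domain,
        **attrs,  # extra keyword arguments become class attributes
    }
    return type(f"{name}_trans", (object,), namespace)


cube_trans = make_trans("cube", lambda x: x**3, lambda x: x ** (1 / 3), base=3)
t = cube_trans()
print(cube_trans.__name__, t.transform(2), t.base)
```

Storing the functions as static methods keeps them callable on both the class and its instances, which matches the way transform and inverse are listed as static methods for several classes above.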

mizani.transforms.gettrans(t)

Return a trans object

Parameters
t

python:str | python:callable() | type | trans name of transformation function

Returns
out

trans

scale - Implementing a scale

According to "On the theory of scales of measurement" by S.S. Stevens, scales can be classified in four ways -- nominal, ordinal, interval and ratio. Using current (2016) terminology, nominal data is made up of unordered categories, ordinal data is made up of ordered categories and the two can be classified as discrete. On the other hand both interval and ratio data are continuous.

The scale classes below show how the rest of the Mizani package can be used to implement the two categories of scales. The key tasks are training and mapping and these correspond to the train and map methods.

To train a scale on data means to make the scale learn the limits of the data. This is elaborate (or worthy of a dedicated method) for two reasons:

  • Practical -- data may be split up across more than one object, yet all will be represented by a single scale.
  • Conceptual -- training is a key action that may need to be inserted into multiple locations of the data processing pipeline before a graphic can be created.

To map data onto a scale means to associate data values with values (potential readings) on a scale. This is perhaps the most important concept underpinning a scale.

The apply methods are simple examples of how to put it all together.
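
Training and mapping for a continuous scale can be sketched in plain Python. The functions train and map_to_palette are simplified, hypothetical stand-ins for the methods documented below: they assume the lower limit is strictly smaller than the upper limit, rescale values to [0, 1] before calling the palette, and replace out-of-bounds values with na_value (the censoring behaviour).

```python
def train(new_data, old=None):
    """Extend the known range (old) so that it also covers new_data."""
    values = list(new_data)
    if old is not None:
        values.extend(old)
    return (min(values), max(values))


def map_to_palette(x, palette, limits, na_value=None):
    """Rescale x from limits to [0, 1] and look each value up in palette."""
    lo, hi = limits
    out = []
    for value in x:
        if lo <= value <= hi:
            out.append(palette((value - lo) / (hi - lo)))
        else:
            out.append(na_value)  # out of bounds: censored
    return out


# Data arrives in two batches but shares one scale.
limits = train([3, 7])
limits = train([1, 5], old=limits)

# A toy "size" palette that maps [0, 1] onto point sizes 1 to 10.
sizes = map_to_palette([1, 4, 7, 10], lambda p: 1 + 9 * p, limits)
print(limits, sizes)
```

Keeping training separate from mapping is what lets the second batch widen the limits before any value is mapped.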

class mizani.scale.scale_continuous

Continuous scale

classmethod apply(x, palette, na_value=None, trans=None)

Scale data continuously

Parameters
x

numpy:array_like Continuous values to scale

palette

python:callable() f(x) Palette to use

na_value

object Value to use for missing values.

trans

trans How to transform the data before scaling. If None, no transformation is done.

Returns
out

numpy:array_like Scaled values

classmethod train(new_data, old=None)

Train a continuous scale

Parameters
new_data

numpy:array_like New values

old

numpy:array_like Old range. Most likely a tuple of length 2.

Returns
out

python:tuple Limits (range) of the scale

classmethod map(x, palette, limits, na_value=None, oob=<function censor>)

Map values to a continuous palette

Parameters
x

numpy:array_like Continuous values to scale

palette

python:callable() f(x) palette to use

na_value

object Value to use for missing values.

oob

python:callable() f(x) Function to deal with values that are beyond the limits

Returns
out

numpy:array_like Values mapped onto a palette

class mizani.scale.scale_discrete

Discrete scale

classmethod apply(x, palette, na_value=None)

Scale data discretely

Parameters
x

numpy:array_like Discrete values to scale

palette

python:callable() f(x) Palette to use

na_value

object Value to use for missing values.

Returns
out

numpy:array_like Scaled values

classmethod train(new_data, old=None, drop=False, na_rm=False)

Train a discrete scale

Parameters
new_data

numpy:array_like New values

old

numpy:array_like Old range. List of values known to the scale.

drop

bool Whether to drop (not include) unused categories

na_rm

bool If True, remove missing values. Missing values are either NaN or None.

Returns
out

python:list Values covered by the scale

classmethod map(x, palette, limits, na_value=None)

Map values to a discrete palette

Parameters
palette

python:callable() f(x) palette to use

x

numpy:array_like Discrete values to scale

na_value

object Value to use for missing values.

Returns
out

numpy:array_like Values mapped onto a palette

Installation

mizani can be installed in a couple of ways depending on purpose.

Official release installation

For a normal user, it is recommended to install the official release.

$ pip install mizani

Development installation

To do any development you have to clone the mizani source repository and install the package in development mode. These commands do all of that:

$ git clone https://github.com/has2k1/mizani.git
$ cd mizani
$ pip install -e .

If you only want to use the latest development sources and do not care about having a cloned repository, e.g. if a bug you care about has been fixed but an official release has not come out yet, then use this command:

$ pip install git+https://github.com/has2k1/mizani.git

Changelog

v0.8.1

2022-09-28

Bug Fixes

  • Fixed regression bug in log_format where formatting for bases 2, 8 and 16 would fail if the values were float-integers.

Enhancements

  • log_format now uses exponent notation for bases other than base 10.

v0.8.0

2022-09-26

API Changes

  • The lut parameter of cmap_pal and cmap_d_pal has been deprecated and will be removed in a future version.
  • datetime_trans gained parameter tz that controls the timezone of the transformation.
  • log_format gained boolean parameter mathtex for TeX values as understood by matplotlib instead of values in scientific notation.

Bug Fixes

  • Fixed bug in zero_range where uint64 values would cause a RuntimeError.

v0.7.4

2022-04-02

API Changes

  • comma_format is now imported automatically when using *.
  • Fixed issue with scale_discrete so that if you train on data with NaN and specify an old range that also has NaN, the resulting range does not include two NaN values.

v0.7.3

(2020-10-29)

Bug Fixes

  • Fixed log_breaks for narrow range if base=2 (GH76).

v0.7.2

(2020-10-29)

Bug Fixes

  • Fixed bug in rescale_max() to properly handle values whose maximum is zero (GH16).

v0.7.1

(2020-06-05)

Bug Fixes

  • Fixed regression in mizani.scales.scale_discrete.train() when training on values with some categoricals that have common elements.

v0.7.0

(2020-06-04)

Bug Fixes

  • Fixed issue with mizani.formatters.log_breaks where non-linear breaks could not be generated if the limits were greater than the largest integer sys.maxsize.
  • Fixed mizani.palettes.gradient_n_pal() to return nan for nan values.
  • Fixed mizani.scales.scale_discrete.train() when training categoricals to maintain the order. (plotnine #381)

v0.6.0

(2019-08-15)

New

  • Added pvalue_format
  • Added ordinal_format
  • Added number_bytes_format
  • Added pseudo_log_trans()
  • Added reciprocal_trans
  • Added modulus_trans()

Enhancements

  • mizani.breaks.date_breaks now supports intervals in the order of seconds.

  • mizani.palettes.brewer_pal now supports a direction argument to control the order of the returned colors.

API Changes

  • boxcox_trans() now only accepts positive values. For both positive and negative values, modulus_trans() has been added.

v0.5.4

(2019-03-26)

Enhancements

  • mizani.formatters.log_format now does a better job of approximating labels for numbers like 3.000000000000001e-05.

API Changes

  • exponent_threshold parameter of mizani.formatters.log_format has been deprecated.

v0.5.3

(2018-12-24)

API Changes

  • Log transforms now default to base - 2 minor breaks. So base 10 has 8 minor breaks and 9 partitions, base 8 has 6 minor breaks and 7 partitions, ..., base 2 has 0 minor breaks and a single partition.

v0.5.2

(2018-10-17)

Bug Fixes

  • Fixed issue where some functions that took pandas series would return output where the index did not match that of the input.

v0.5.1

(2018-10-15)

Bug Fixes

  • Fixed issue with log_breaks, so that it does not fail needlessly when the limits are in the (0, 1) range.

Enhancements

  • Changed log_format to return better formatted breaks.

v0.5.0

(2018-11-10)

API Changes

  • Support for python 2 has been removed.
  • call() and mizani.breaks.trans_minor_breaks.call() now accept the optional parameter n, which is the number of minor breaks between any two major breaks.

  • The parameter nan_value has been renamed to na_value.
  • The parameter nan_rm has been renamed to na_rm.

Enhancements

  • Better support for handling missing values when training discrete scales.
  • Changed the algorithm for log_breaks, it can now return breaks that do not fall on the integer powers of the base.

v0.4.6

(2018-03-20)

  • Added squish

v0.4.5

(2018-03-09)

  • Added identity_pal
  • Added cmap_d_pal

v0.4.4

(2017-12-13)

  • Fixed date_format to respect the timezones of the dates (GH8).

v0.4.3

(2017-12-01)

  • Changed date_breaks to have more variety in the spacing between the breaks.
  • Fixed date_format to respect time part of the date (GH7).

v0.4.2

(2017-11-06)

  • Fixed (regression) break calculation for the non ordinal transforms.

v0.4.1

(2017-11-04)

  • trans objects can now be instantiated with parameters that override attributes of the instance. The default methods for computing breaks and minor breaks on the transform instance are not class attributes, so they can be modified without global repercussions.

v0.4.0

(2017-10-24)

API Changes

  • Breaks and formatter generating functions have been converted to classes, with a __call__ method. How they are used has not changed, but this makes them more flexible.
  • ExtendedWilkson class has been removed. extended_breaks() now contains the implementation of the break calculating algorithm.

v0.3.4

(2017-09-12)

  • Fixed issue where some formatter methods failed if passed an empty breaks argument.
  • Fixed issue with log_breaks() where if the limits were within the same order of magnitude the calculated breaks were always the ends of the order of magnitude.

    Now log_breaks()((35, 50)) returns [35,  40,  45,  50] as breaks instead of [1, 100].

v0.3.3

(2017-08-30)

  • Fixed SettingWithCopyWarnings in squish_infinite().
  • Added log_format().

API Changes

  • log_trans now uses log_format() as the formatting method.

v0.3.2

(2017-07-14)

  • Added expand_range_distinct()

v0.3.1

(2017-06-22)

  • Fixed bug where using log_breaks() with Numpy 1.13.0 led to a ValueError.

v0.3.0

(2017-04-24)

  • Added xkcd_palette(), a palette that selects from 954 named colors.
  • Added crayon_palette(), a palette that selects from 163 named colors.
  • Added cubehelix_pal(), a function that creates a continuous palette from the cubehelix system.
  • Fixed bug where a color palette would raise an exception when passed a single scalar value instead of a list-like.
  • extended_breaks() and mpl_breaks() now return a single break if the limits are equal. Previously, one ran into an overflow and the other returned a sequence filled with n copies of the same limit.

API Changes

  • mpl_breaks() now returns a function that (strictly) expects a tuple with the minimum and maximum values.

v0.2.0

(2017-01-27)

  • Fixed bug in censor() where a sequence of values with an irregular index would lead to an exception.
  • Fixed boundary issues due to internal loss of precision in ported function seq().
  • Added mizani.breaks.extended_breaks() which computes breaks using a modified version of Wilkinson's tick algorithm.
  • Changed the default function mizani.transforms.trans.breaks_() used by mizani.transforms.trans to compute breaks from mizani.breaks.mpl_breaks() to mizani.breaks.extended_breaks().
  • mizani.breaks.timedelta_breaks() now uses mizani.breaks.extended_breaks() internally instead of mizani.breaks.mpl_breaks().
  • Added manual palette function mizani.palettes.manual_pal().
  • Requires pandas version 0.19.0 or higher.

v0.1.0

(2016-06-30)

First public release

Author

Hassan Kibirige

Info

Oct 21, 2022 0.8.1 Mizani