mizani - Man Page

Name

mizani — Mizani Documentation

Mizani is a Python library that provides the pieces necessary to create scales for a graphics system. It is based on the R Scales package.

Contents

bounds - Limiting data values for a palette

Continuous variables can take values anywhere in the range from minus infinity to plus infinity. However, when creating a visual representation of these values, what usually matters is the relative difference between the values. This is where rescaling comes into play.

The values are mapped onto a range that a scale can deal with. For graphical representation that range tends to be [0, 1] or [0, n], where n is some number that makes the plotted object overflow the plotting area.

Although a scale may be able to handle the [0, n] range, it may be desirable to have a lower bound greater than zero. For example, if data values get mapped to zero on a scale whose graphical representation is the size/area/radius/length, some data will be invisible. The solution is to restrict the lower bound, e.g. [0.1, 1]. Similarly, you can restrict the upper bound using these functions.
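
The effect of a non-zero lower bound can be sketched in plain Python. The helper name rescale_sketch is hypothetical and not part of mizani; it only illustrates a linear mapping onto [0.1, 1], under the assumption that the smallest input should land on the lower bound.

```python
def rescale_sketch(values, to=(0.1, 1.0)):
    """Linearly map values so min(values) -> to[0] and max(values) -> to[1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Zero-width input: use the midpoint of the target range
        # (a choice made for this sketch only).
        return [(to[0] + to[1]) / 2 for _ in values]
    span = to[1] - to[0]
    return [to[0] + (v - lo) / (hi - lo) * span for v in values]


# The smallest value maps to 0.1 rather than 0, so it stays visible
# when the result is used as a size or radius.
sizes = rescale_sketch([0, 5, 10])
```

With the default target range every output is strictly positive, which is the property the paragraph above asks for.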

mizani.bounds.censor(x: TFloatVector, range: TupleFloat2 = (0, 1), only_finite: bool = True) -> TFloatVector

Convert any values outside of range to a NULL type object.

Parameters
x

numpy:array_like Values to manipulate

range

python:tuple (min, max) giving desired output range

only_finite

bool If True (the default), will only modify finite values.

Returns
x

numpy:array_like Censored array

Notes

All values in x should be of the same type. The only_finite parameter is ignored for Datetime and Timedelta types.

The NULL type object depends on the type of values in x.

  • float : float('nan')
  • int : float('nan')
  • datetime.datetime : np.datetime64('NaT')
  • datetime.timedelta : np.timedelta64('NaT')

Examples

>>> a = np.array([1, 2, np.inf, 3, 4, -np.inf, 5])
>>> censor(a, (0, 10))
array([  1.,   2.,  inf,   3.,   4., -inf,   5.])
>>> censor(a, (0, 10), False)
array([ 1.,  2., nan,  3.,  4., nan,  5.])
>>> censor(a, (2, 4))
array([ nan,   2.,  inf,   3.,   4., -inf,  nan])
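
The censoring rule for floats can be sketched without numpy. censor_sketch is a hypothetical helper, not the library function; it assumes float input only and mirrors the documented behaviour that infinities are left alone when only_finite is True.

```python
import math


def censor_sketch(values, range_=(0, 1), only_finite=True):
    """Replace out-of-range floats with NaN; optionally leave infinities alone."""
    lo, hi = range_
    out = []
    for v in values:
        outside = v < lo or v > hi
        if only_finite and math.isinf(v):
            outside = False  # infinities are left untouched
        out.append(float("nan") if outside else float(v))
    return out


result = censor_sketch([1, 2, math.inf, 3, 4, -math.inf, 5], (2, 4))
```

Here 1 and 5 become NaN, while the two infinities pass through unchanged because only_finite defaults to True.
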
mizani.bounds.expand_range(range: TupleFloat2, mul: float = 0, add: float = 0, zero_width: float = 1) -> TupleFloat2

Expand a range with a multiplicative or additive constant

Parameters
range

python:tuple Range of data. Size 2.

mul

python:int | python:float Multiplicative constant

add

python:int | python:float | timedelta Additive constant

zero_width

python:int | python:float | timedelta Distance to use if range has zero width

Returns
out

python:tuple Expanded range

Notes

If expanding datetime or timedelta types, add and zero_width must be suitable timedeltas, i.e. you should not mix types between Numpy, Pandas and the datetime module.

Examples

>>> expand_range((3, 8))
(3, 8)
>>> expand_range((0, 10), mul=0.1)
(-1.0, 11.0)
>>> expand_range((0, 10), add=2)
(-2, 12)
>>> expand_range((0, 10), mul=.1, add=2)
(-3.0, 13.0)
>>> expand_range((0, 1))
(0, 1)

When the range has zero width

>>> expand_range((5, 5))
(4.5, 5.5)
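
The arithmetic behind these examples can be sketched as follows. expand_range_sketch is a hypothetical stand-in for numeric input only, and the exact equality test for a zero-width range is a simplification made for this sketch.

```python
def expand_range_sketch(range_, mul=0, add=0, zero_width=1):
    """Widen (low, high) by width * mul + add on each side."""
    low, high = range_
    if high == low:
        # Zero-width range: centre a window of size zero_width on the value.
        return (low - zero_width / 2, high + zero_width / 2)
    delta = (high - low) * mul + add
    return (low - delta, high + delta)


expanded = expand_range_sketch((0, 10), mul=0.1, add=2)
```

For the range (0, 10) the width is 10, so mul=0.1 contributes 1 and add=2 contributes 2, giving 3 units of padding on each side.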
mizani.bounds.rescale(x: FloatArrayLike, to: TupleFloat2 = (0, 1), _from: TupleFloat2 | None = None) -> NDArrayFloat

Rescale numeric vector to have specified minimum and maximum.

Parameters
x

numpy:array_like | numeric 1D vector of values to manipulate.

to

python:tuple output range (numeric vector of length two)

_from

python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x

Returns
out

numpy:array_like Rescaled values

Examples

>>> x = [0, 2, 4, 6, 8, 10]
>>> rescale(x)
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
>>> rescale(x, to=(0, 2))
array([0. , 0.4, 0.8, 1.2, 1.6, 2. ])
>>> rescale(x, to=(0, 2), _from=(0, 20))
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
mizani.bounds.rescale_max(x: FloatArrayLike, to: TupleFloat2 = (0, 1), _from: TupleFloat2 | None = None) -> NDArrayFloat

Rescale numeric vector to have specified maximum.

Parameters
x

numpy:array_like 1D vector of values to manipulate.

to

python:tuple output range (numeric vector of length two)

_from

python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x. Only the 2nd (max) element is essential to the output.

Returns
out

numpy:array_like Rescaled values

Examples

>>> x = np.array([0, 2, 4, 6, 8, 10])
>>> rescale_max(x, (0, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])

Only the 2nd (max) element of each of the parameters to and _from is essential to the output.

>>> rescale_max(x, (1, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])
>>> rescale_max(x, (0, 20))
array([ 0.,  4.,  8., 12., 16., 20.])

If max(x) < _from[1] then values will be scaled beyond the requested maximum (to[1]).

>>> rescale_max(x, to=(1, 3), _from=(-1, 6))
array([0., 1., 2., 3., 4., 5.])
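
The outputs above are consistent with a single ratio, x / _from[1] * to[1]. The sketch below assumes that formula; rescale_max_sketch is a hypothetical name, and the special case for identical values (including all zeros) is deliberately left out.

```python
def rescale_max_sketch(values, to=(0, 1), from_=None):
    """Scale values so that from_[1] maps onto to[1]; zero stays at zero."""
    if from_ is None:
        from_ = (min(values), max(values))
    # Only the upper ends of `to` and `from_` take part in the calculation.
    return [v / from_[1] * to[1] for v in values]


x = [0, 2, 4, 6, 8, 10]
scaled = rescale_max_sketch(x, to=(1, 3), from_=(-1, 6))
```

Because max(x) is 10 while from_[1] is 6, the largest result is 10 / 6 * 3 = 5, which exceeds the requested maximum of 3.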

If the values are all the same, they take on the requested maximum. This includes an array of all zeros.

>>> rescale_max(np.array([5, 5, 5]))
array([1., 1., 1.])
>>> rescale_max(np.array([0, 0, 0]))
array([1, 1, 1])
mizani.bounds.rescale_mid(x: FloatArrayLike, to: TupleFloat2 = (0, 1), _from: TupleFloat2 | None = None, mid: float = 0) -> NDArrayFloat

Rescale numeric vector to have specified minimum, midpoint, and maximum.

Parameters
x

numpy:array_like 1D vector of values to manipulate.

to

python:tuple output range (numeric vector of length two)

_from

python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x

mid

numeric mid-point of input range

Returns
out

numpy:array_like Rescaled values

Examples

>>> rescale_mid([1, 2, 3], mid=1)
array([0.5 , 0.75, 1.  ])
>>> rescale_mid([1, 2, 3], mid=2)
array([0. , 0.5, 1. ])

rescale_mid does not have the same signature as rescale and rescale_max because of its extra mid argument. In cases where we need a compatible function with the same signature, we use a closure around the extra mid argument.

>>> def rescale_mid_compat(mid):
...     def _rescale(x, to=(0, 1), _from=None):
...         return rescale_mid(x, to, _from, mid=mid)
...     return _rescale
>>> rescale_mid2 = rescale_mid_compat(mid=2)
>>> rescale_mid2([1, 2, 3])
array([0. , 0.5, 1. ])
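
As an alternative to a hand-written closure, functools.partial can bind mid. The sketch below uses a hypothetical pure-Python stand-in, rescale_mid_sketch, whose formula is an assumption chosen to reproduce the example outputs above; it does not guard against an input range that collapses onto mid.

```python
from functools import partial


def rescale_mid_sketch(values, to=(0, 1), from_=None, mid=0):
    """Map values so that `mid` lands on the centre of the `to` range."""
    if from_ is None:
        from_ = (min(values), max(values))
    # Twice the larger distance from mid to either end of the input range.
    extent = 2 * max(abs(from_[0] - mid), abs(from_[1] - mid))
    centre = (to[0] + to[1]) / 2
    return [(v - mid) / extent * (to[1] - to[0]) + centre for v in values]


# Binding mid by keyword leaves (values, to, from_) as the call signature.
rescale_mid2 = partial(rescale_mid_sketch, mid=2)
around_two = rescale_mid2([1, 2, 3])
```

The partial object accepts the same positional arguments as the closure version, so either can be passed where a three-argument rescaler is expected.
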
mizani.bounds.squish_infinite(x: FloatArrayLike, range: TupleFloat2 = (0, 1)) -> NDArrayFloat

Truncate infinite values to a range.

Parameters
x

numpy:array_like Values that should have infinities squished.

range

python:tuple The range onto which to squish the infinities. Must be of size 2.

Returns
out

numpy:array_like Values with infinities squished.

Examples

>>> arr1 = np.array([0, .5, .25, np.inf, .44])
>>> arr2 = np.array([0, -np.inf, .5, .25, np.inf])
>>> squish_infinite(arr1)
array([0.  , 0.5 , 0.25, 1.  , 0.44])
>>> squish_infinite(arr2, (-10, 9))
array([  0.  , -10.  ,   0.5 ,   0.25,   9.  ])
mizani.bounds.zero_range(x: tuple[Any, Any], tol: float = 2.220446049250313e-14) -> bool

Determine if range of vector is close to zero.

Parameters
x

numpy:array_like Value(s) to check. If it is an array_like, it should be of length 2.

tol

python:float Tolerance. Default tolerance is the machine epsilon times 10^2.

Returns
out

bool Whether x has zero range.

Examples

>>> zero_range([1, 1])
True
>>> zero_range([1, 2])
False
>>> zero_range([1, 2], tol=2)
True
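
One way to read the tolerance is as a relative measure. The sketch below assumes the width of the range is compared against the magnitude of the lower limit; zero_range_sketch is a hypothetical helper for plain numbers, and the library's actual rule may differ in its edge cases.

```python
def zero_range_sketch(x, tol=2.220446049250313e-14):
    """Return True if the two limits are equal up to a relative tolerance."""
    low, high = sorted(x)
    if low == high:
        return True
    if low == 0:
        # No scale to compare against, so treat the range as non-zero.
        return False
    return abs(high - low) / abs(low) < tol


checks = [
    zero_range_sketch([1, 1]),
    zero_range_sketch([1, 2]),
    zero_range_sketch([1, 2], tol=2),
]
```

With tol=2 the relative width of [1, 2] is 1, which is below the tolerance, matching the last example above.
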
mizani.bounds.expand_range_distinct(range: TupleFloat2, expand: TupleFloat2 | TupleFloat4 = (0, 0, 0, 0), zero_width: float = 1) -> TupleFloat2

Expand a range with multiplicative or additive constants

Similar to expand_range(), but the two sides of the range are expanded using different constants

Parameters
range

python:tuple Range of data. Size 2

expand

python:tuple Length 2 or 4. If length is 2, then the same constants are used for both sides. If length is 4, then the first two are the Multiplicative (mul) and Additive (add) constants for the lower limit, and the second two are the constants for the upper limit.

zero_width

python:int | python:float | timedelta Distance to use if range has zero width

Returns
out

python:tuple Expanded range

Examples

>>> expand_range_distinct((3, 8))
(3, 8)
>>> expand_range_distinct((0, 10), (0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0, 0))
(-1.0, 10)
>>> expand_range_distinct((0, 10), (0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 2, 0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 0, 0, 2))
(0, 12)
>>> expand_range_distinct((0, 10), (.1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (.1, 2, .1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (0, 0, .1, 2))
(0, 13.0)
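
How the 2-tuple and 4-tuple forms of expand relate can be sketched as below. expand_range_distinct_sketch is a hypothetical numeric-only helper; the zero-width handling is an assumption carried over from the expand_range examples.

```python
def expand_range_distinct_sketch(range_, expand=(0, 0, 0, 0), zero_width=1):
    """Expand each side of (low, high) with its own (mul, add) pair."""
    if len(expand) == 2:
        # Same (mul, add) pair for the lower and the upper limit.
        expand = tuple(expand) * 2
    low, high = range_
    if high == low:
        return (low - zero_width / 2, high + zero_width / 2)
    width = high - low
    mul_low, add_low, mul_high, add_high = expand
    return (
        low - (width * mul_low + add_low),
        high + (width * mul_high + add_high),
    )


upper_only = expand_range_distinct_sketch((0, 10), (0, 0, 0.1, 2))
```

Setting the first pair to zeros leaves the lower limit where it is and pads only the upper limit.
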
mizani.bounds.squish(x: FloatArrayLike, range: TupleFloat2 = (0, 1), only_finite: bool = True) -> NDArrayFloat

Squish values into range.

Parameters
x

numpy:array_like Values that should have out of range values squished.

range

python:tuple The range onto which to squish the values.

only_finite

bool When True, only squishes finite values.

Returns
out

numpy:array_like Values with out of range values squished.

Examples

>>> squish([-1.5, 0.2, 0.8, 1.0, 1.2])
array([0. , 0.2, 0.8, 1. , 1. ])
>>> squish([-np.inf, -1.5, 0.2, 0.8, 1.0, np.inf], only_finite=False)
array([0. , 0. , 0.2, 0.8, 1. , 1. ])
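
Squishing is a clamp that can optionally skip infinities. A minimal pure-Python sketch, with squish_sketch as a hypothetical name:

```python
import math


def squish_sketch(values, range_=(0, 1), only_finite=True):
    """Clamp values into range_; leave infinities alone if only_finite."""
    lo, hi = range_
    out = []
    for v in values:
        if only_finite and math.isinf(v):
            out.append(v)  # infinite values pass through unchanged
        else:
            out.append(min(max(v, lo), hi))
    return out


clamped = squish_sketch([-math.inf, -1.5, 0.2, 0.8, 1.0, math.inf], only_finite=False)
```

Unlike censoring, which discards out-of-range values, squishing keeps every value and moves it to the nearest limit.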

breaks - Partitioning a scale for readability

All scales have a means by which the values that are mapped onto the scale are interpreted. Numeric digital scales put out numbers for direct interpretation, but most scales cannot do this. What they offer is named markers/ticks that aid in assessing the values, e.g. the common speedometer will have ticks and values to help gauge the speed of the vehicle.

The named markers are what we call breaks. Properly calculated breaks make interpretation straightforward. These functions provide ways to calculate good (hopefully) breaks.

class mizani.breaks.breaks_log(n: int = 5, base: float = 10)

Integer breaks on log transformed scales

Parameters
n

python:int Desired number of breaks

base

python:int Base of logarithm

Examples

>>> x = np.logspace(3, 6)
>>> limits = min(x), max(x)
>>> breaks_log()(limits)
array([     1000,    10000,   100000,  1000000])
>>> breaks_log(2)(limits)
array([  1000, 100000])
>>> breaks_log()([0.1, 1])
array([0.1, 0.3, 1. , 3. ])
__call__(limits: TupleFloat2) -> NDArrayFloat

Compute breaks

Parameters
limits

python:tuple Minimum and maximum values

Returns
out

numpy:array_like Sequence of break points

class mizani.breaks.breaks_symlog

Breaks for the Symmetric Logarithm Transform

Parameters
n

python:int Desired number of breaks

base

python:int Base of logarithm

Examples

>>> limits = (-100, 100)
>>> breaks_symlog()(limits)
array([-100,  -10,    0,   10,  100])
__call__(limits: TupleFloat2) -> NDArrayFloat

Call self as a function.

class mizani.breaks.minor_breaks(n: int = 1)

Compute minor breaks

This is the naive method. It does not take into account the transformation.

Parameters
n

python:int Number of minor breaks between the major breaks.

Examples

>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> minor_breaks()(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])
>>> minor_breaks()([1, 2], (1, 2))
array([1.5])

More than 1 minor break.

>>> minor_breaks(3)([1, 2], (1, 2))
array([1.25, 1.5 , 1.75])
>>> minor_breaks()([1, 2], (1, 2), 3)
array([1.25, 1.5 , 1.75])
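
The naive method can be sketched as evenly dividing each gap between major breaks, including one extra gap on either side so that the space between the limits and the outermost major breaks is covered. minor_breaks_sketch is a hypothetical helper written to reproduce the examples above; it is not the library implementation.

```python
def minor_breaks_sketch(major, limits=None, n=1):
    """Place n evenly spaced minor breaks in every gap between major breaks."""
    major = sorted(major)
    if len(major) < 2:
        return []
    if limits is None:
        limits = (major[0], major[-1])
    # Extend by one major step on each side, then filter by the limits.
    step_lo = major[1] - major[0]
    step_hi = major[-1] - major[-2]
    extended = [major[0] - step_lo] + major + [major[-1] + step_hi]
    minor = []
    for left, right in zip(extended, extended[1:]):
        for k in range(1, n + 1):
            value = left + (right - left) * k / (n + 1)
            if limits[0] <= value <= limits[1]:
                minor.append(value)
    return minor


single = minor_breaks_sketch([1, 2, 3, 4], [0, 5])
triple = minor_breaks_sketch([1, 2], (1, 2), n=3)
```

The first call yields a break at 0.5 and at 4.5 because the limits extend one half step beyond the major breaks.
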
__call__(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) -> NDArrayFloat

Minor breaks

Parameters
major

numpy:array_like Major breaks

limits

numpy:array_like | python:None Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.

n

python:int Number of minor breaks between the major breaks. If None, then self.n is used.

Returns
out

numpy:array_like Minor breaks

class mizani.breaks.minor_breaks_trans(trans: Trans, n: int = 1)

Compute minor breaks for transformed scales

The minor breaks are computed in data space. This, together with major breaks computed in transform space, reveals the non-linearity of a scale. See the log transforms created with log_trans(), like log10_trans.

Parameters
trans

trans or type Trans object or trans class.

n

python:int Number of minor breaks between the major breaks.

Examples

>>> from mizani.transforms import sqrt_trans
>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> t1 = sqrt_trans()
>>> t1.minor_breaks(major, limits)
array([1.58113883, 2.54950976, 3.53553391])

Switching to the regular minor_breaks method

>>> t2 = sqrt_trans()
>>> t2.minor_breaks = minor_breaks()
>>> t2.minor_breaks(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])

More than 1 minor break

>>> major = [1, 10]
>>> limits = [1, 10]
>>> t2.minor_breaks(major, limits, 4)
array([2.8, 4.6, 6.4, 8.2])
__call__(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) -> NDArrayFloat

Minor breaks for transformed scales

Parameters
major

numpy:array_like Major breaks

limits

numpy:array_like | python:None Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.

n

python:int Number of minor breaks between the major breaks. If None, then self.n is used.

Returns
out

numpy:array_like Minor breaks

class mizani.breaks.breaks_date(n: int = 5, width: str | None = None)

Regularly spaced dates

Parameters
n

Desired number of breaks.

width

python:str | python:None An interval specification, optionally preceded by a multiple as in '4 year'. The unit must be one of [second, minute, hour, day, week, month, year]. If None, the interval is determined automatically.

Examples

>>> from datetime import datetime
>>> limits = (datetime(2010, 1, 1), datetime(2026, 1, 1))

Default breaks will be regularly spaced but the spacing is automatically determined

>>> breaks = breaks_date(9)
>>> [d.year for d in breaks(limits)]
[2010, 2012, 2014, 2016, 2018, 2020, 2022, 2024, 2026]

Breaks at 4 year intervals

>>> breaks = breaks_date(width='4 year')
>>> [d.year for d in breaks(limits)]
[2010, 2014, 2018, 2022, 2026]
__call__(limits: TupleT2[datetime]) -> Sequence[datetime]

Compute breaks

Parameters
limits

python:tuple Minimum and maximum datetime.datetime values.

Returns
out

numpy:array_like Sequence of break points.

class mizani.breaks.breaks_timedelta(n: int = 5, Q: Sequence[float] = (1, 2, 5, 10))

Timedelta breaks

Returns
out

python:callable() f(limits) A function that takes a sequence of two datetime.timedelta values and returns a sequence of break points.

Examples

>>> from datetime import timedelta
>>> breaks = breaks_timedelta()
>>> x = [timedelta(days=i*365) for i in range(25)]
>>> limits = min(x), max(x)
>>> major = breaks(limits)
>>> [val.total_seconds()/(365*24*60*60) for val in major]
[0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
__call__(limits: tuple[Timedelta, Timedelta]) -> NDArrayTimedelta

Compute breaks

Parameters
limits

python:tuple Minimum and maximum datetime.timedelta values.

Returns
out

numpy:array_like Sequence of break points.

class mizani.breaks.breaks_extended(n: int = 5, Q: Sequence[float] = (1, 5, 2, 2.5, 4, 3), only_inside: bool = False, w: Sequence[float] = (0.25, 0.2, 0.5, 0.05))

An extension of Wilkinson's tick position algorithm

Parameters
n

python:int Desired number of breaks

Q

python:list List of nice numbers

only_inside

bool If True, then all the breaks will be within the given range.

w

python:list Weights applied to the four optimization components (simplicity, coverage, density, and legibility). They should add up to 1.

References

  • Talbot, J., Lin, S., Hanrahan, P. (2010) An Extension of Wilkinson's Algorithm for Positioning Tick Labels on Axes, InfoVis 2010.

Additional Credit to Justin Talbot on whose code this implementation is almost entirely based.

Examples

>>> limits = (0, 9)
>>> breaks_extended()(limits)
array([  0. ,   2.5,   5. ,   7.5,  10. ])
>>> breaks_extended(n=6)(limits)
array([  0.,   2.,   4.,   6.,   8.,  10.])
__call__(limits: TupleFloat2) -> NDArrayFloat

Calculate the breaks

Parameters
limits

array Minimum and maximum values.

Returns
out

numpy:array_like Sequence of break points.

labels - Labelling breaks

Scales have guides and these are what help users make sense of the data mapped onto the scale. Common examples of guides include the x-axis, the y-axis, the keyed legend and a colorbar legend. The guides have demarcations (breaks), some of which must be labelled.

The label_* functions below create functions that convert data values as understood by a specific scale and return string representations of those values. Manipulating the string representation of a value helps improve readability of the guide.

class mizani.labels.label_comma(accuracy: float | None = None, precision: int = 0, scale: float = 1, prefix: str = '', suffix: str = '', big_mark: str = ',', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)

Labels of numbers with commas as separators

Parameters
precision

python:int Number of digits after the decimal point.

Examples

>>> label_comma()([1000, 2, 33000, 400])
['1,000', '2', '33,000', '400']
class mizani.labels.label_custom(fmt: str = '{}', style: Literal['old', 'new'] = 'new')

Creating a custom labelling function

Parameters
fmt

python:str, optional Format string. Default is the generic new style format braces, {}.

style

'new' | 'old' Whether to use new style or old style formatting. New style uses the str.format() while old style uses %. The format string must be written accordingly.

Examples

>>> label = label_custom('{:.2f} USD')
>>> label([3.987, 2, 42.42])
['3.99 USD', '2.00 USD', '42.42 USD']
__call__(x: FloatArrayLike) -> Sequence[str]

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.labels.label_currency(accuracy: float | None = None, precision: int | None = None, scale: float = 1, prefix: str = '$', suffix: str = '', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)

Labelling currencies

Parameters
prefix

python:str What to put before the value.

Examples

>>> x = [1.232, 99.2334, 4.6, 9, 4500]
>>> label_currency()(x)
['$1.23', '$99.23', '$4.60', '$9.00', '$4500.00']
>>> label_currency(prefix='C$', precision=0, big_mark=',')(x)
['C$1', 'C$99', 'C$5', 'C$9', 'C$4,500']
mizani.labels.label_dollar

alias of label_currency

class mizani.labels.label_percent(accuracy: float | None = None, precision: int | None = None, scale: float = 100, prefix: str = '', suffix: str = '%', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)

Labelling percentages

Multiply by one hundred and display percent sign

Examples

>>> label = label_percent()
>>> label([.45, 9.515, .01])
['45%', '952%', '1%']
>>> label([.654, .8963, .1])
['65%', '90%', '10%']
class mizani.labels.label_scientific(digits: int = 3)

Scientific number labels

Parameters
digits

python:int Significant digits.

Notes

Be careful when using many digits (15+ on a 64 bit computer). Consider the machine epsilon.

Examples

>>> x = [.12, .23, .34, 45]
>>> label_scientific()(x)
['1.2e-01', '2.3e-01', '3.4e-01', '4.5e+01']
__call__(x: FloatArrayLike) -> Sequence[str]

Call self as a function.

class mizani.labels.label_date(fmt: str = '%Y-%m-%d', tz: tzinfo | None = None)

Datetime labels

Parameters
fmt

python:str Format string. See strftime.

tz

datetime.tzinfo, optional Time zone information. If none is specified, the time zone will be that of the first date. If the first date has no time information then a time zone is chosen by other means.

Examples

>>> from datetime import datetime
>>> x = [datetime(x, 1, 1) for x in [2010, 2014, 2018, 2022]]
>>> label_date()(x)
['2010-01-01', '2014-01-01', '2018-01-01', '2022-01-01']
>>> label_date('%Y')(x)
['2010', '2014', '2018', '2022']

Can format time

>>> x = [datetime(2017, 12, 1, 16, 5, 7)]
>>> label_date("%Y-%m-%d %H:%M:%S")(x)
['2017-12-01 16:05:07']

Time zones are respected

>>> from zoneinfo import ZoneInfo
>>> UTC = ZoneInfo('UTC')
>>> UG = ZoneInfo('Africa/Kampala')
>>> x = [datetime(2010, 1, 1, i) for i in [8, 15]]
>>> x_tz = [datetime(2010, 1, 1, i, tzinfo=UG) for i in [8, 15]]
>>> label_date('%Y-%m-%d %H:%M')(x)
['2010-01-01 08:00', '2010-01-01 15:00']
>>> label_date('%Y-%m-%d %H:%M')(x_tz)
['2010-01-01 08:00', '2010-01-01 15:00']

Format with a specific time zone

>>> label_date('%Y-%m-%d %H:%M', tz=UTC)(x_tz)
['2010-01-01 05:00', '2010-01-01 12:00']
>>> label_date('%Y-%m-%d %H:%M', tz='EST')(x_tz)
['2010-01-01 00:00', '2010-01-01 07:00']
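
The time zone handling can be sketched with the standard library alone. label_date_sketch is a hypothetical helper, and a fixed +03:00 offset stands in for Africa/Kampala so the sketch does not depend on a time zone database being installed.

```python
from datetime import datetime, timedelta, timezone


def label_date_sketch(values, fmt="%Y-%m-%d", tz=None):
    """Format datetimes, converting aware values to tz first if one is given."""
    out = []
    for d in values:
        if tz is not None and d.tzinfo is not None:
            d = d.astimezone(tz)  # same instant, expressed in the target zone
        out.append(d.strftime(fmt))
    return out


EAT = timezone(timedelta(hours=3))  # fixed offset used instead of ZoneInfo
x_tz = [datetime(2010, 1, 1, h, tzinfo=EAT) for h in (8, 15)]
local_labels = label_date_sketch(x_tz, "%Y-%m-%d %H:%M")
utc_labels = label_date_sketch(x_tz, "%Y-%m-%d %H:%M", tz=timezone.utc)
```

Without tz the labels show the wall-clock time of the input; with tz set to UTC the same instants are shown three hours earlier.
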
__call__(x: Sequence[datetime]) -> Sequence[str]

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.labels.label_number(accuracy: float | None = None, precision: int | None = None, scale: float = 1, prefix: str = '', suffix: str = '', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)

Labelling numbers

Parameters
precision

python:int Number of digits after the decimal point.

suffix

python:str What to put after the value.

big_mark

python:str The thousands separator. This is usually a comma or a dot.

decimal_mark

python:str What to use to separate the decimals digits.

Examples

>>> label_number()([.654, .8963, .1])
['0.65', '0.90', '0.10']
>>> label_number(accuracy=0.0001)([.654, .8963, .1])
['0.6540', '0.8963', '0.1000']
>>> label_number(precision=4)([.654, .8963, .1])
['0.6540', '0.8963', '0.1000']
>>> label_number(prefix="$")([5, 24, -42])
['$5', '$24', '-$42']
>>> label_number(suffix="s")([5, 24, -42])
['5s', '24s', '-42s']
>>> label_number(big_mark="_")([1e3, 1e4, 1e5, 1e6])
['1_000', '10_000', '100_000', '1_000_000']
>>> label_number(width=3)([1, 10, 100, 1000])
['  1', ' 10', '100', '1000']
>>> label_number(align="^", width=5)([1, 10, 100, 1000])
['  1  ', ' 10  ', ' 100 ', '1000 ']
>>> label_number(style_positive=" ")([5, 24, -42])
[' 5', ' 24', '-42']
>>> label_number(style_positive="+")([5, 24, -42])
['+5', '+24', '-42']
>>> label_number(prefix="$", style_negative="parens")([5, 24, -42])
['$5', '$24', '($42)']
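
How several of these options combine into one label can be sketched with an f-string. number_label_sketch is a hypothetical helper covering only a subset of the parameters, and its default precision of 0 is an assumption of the sketch, not the library default.

```python
def number_label_sketch(v, precision=0, scale=1, prefix="", suffix="",
                        big_mark="", parens=False):
    """Format one number with scaling, a thousands mark and a sign style."""
    # Format the magnitude with "," as thousands separator, then swap in
    # the requested mark (an empty string removes the separator).
    body = f"{abs(v) * scale:,.{precision}f}".replace(",", big_mark)
    text = f"{prefix}{body}{suffix}"
    if v < 0:
        # The sign goes outside the prefix: -$42 or ($42).
        return f"({text})" if parens else f"-{text}"
    return text


labels = [number_label_sketch(v, prefix="$", parens=True) for v in (5, 24, -42)]
```

Placing the sign outside the prefix matches the '-$42' and '($42)' forms in the examples above.
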
__call__(x: FloatArrayLike) -> Sequence[str]

Call self as a function.

class mizani.labels.label_log(base: float = 10, exponent_limits: TupleInt2 = (-4, 4), mathtex: bool = False)

Log number labels

Parameters
base

python:int Base of the logarithm. Default is 10.

exponent_limits

python:tuple Limits (int, int). If any of the powers of the numbers falls outside these limits, the labels will be in exponent form. This only applies for base 10.

mathtex

bool If True, return the labels in mathtex format as understood by Matplotlib.

Examples

>>> label_log()([0.001, 0.1, 100])
['0.001', '0.1', '100']
>>> label_log()([0.0001, 0.1, 10000])
['1e-4', '1e-1', '1e4']
>>> label_log(mathtex=True)([0.0001, 0.1, 10000])
['$10^{-4}$', '$10^{-1}$', '$10^{4}$']
__call__(x: FloatArrayLike) -> Sequence[str]

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.labels.label_timedelta(units: DurationUnit | None = None, show_units: bool = True, zero_has_units: bool = True, usetex: bool = False, space: bool = True, use_plurals: bool = True)

Timedelta labels

Parameters
units

python:str, optional The units in which the breaks will be computed. If None, they are decided automatically. Otherwise, the value should be one of:

'ns'    # nanoseconds
'us'    # microseconds
'ms'    # milliseconds
's'     # seconds
'min'   # minute
'h'     # hour
'day'   # day
'week'  # week
'month' # month
'year'  # year
show_units

bool Whether to append the units symbol to the values.

zero_has_units

bool If True, a value of zero is labelled with its units (e.g. '0 days' rather than '0').

usetex

bool If True, the microseconds identifier string is rendered with the Greek letter mu. Default is False.

space

bool If True add a space between the value and the units

use_plurals

bool If True, the plural form of the unit is used when the value is not 1 and the units are one of week, month and year, e.g. 2 weeks.

Examples

>>> from datetime import timedelta
>>> x = [timedelta(days=31*i) for i in range(5)]
>>> label_timedelta()(x)
['0 months', '1 month', '2 months', '3 months', '4 months']
>>> label_timedelta(use_plurals=False)(x)
['0 month', '1 month', '2 month', '3 month', '4 month']
>>> label_timedelta(units='day')(x)
['0 days', '31 days', '62 days', '93 days', '124 days']
>>> label_timedelta(units='day', zero_has_units=False)(x)
['0', '31 days', '62 days', '93 days', '124 days']
>>> label_timedelta(units='day', show_units=False)(x)
['0', '31', '62', '93', '124']
__call__(x: NDArrayTimedelta) -> Sequence[str]

Call self as a function.

class mizani.labels.label_pvalue(accuracy: float = 0.001, add_p: float = False)

p-values labelling

Parameters
accuracy

python:float Number to round to

add_p

bool Whether to prepend "p=" or "p<" to the output

Examples

>>> x = [.90, .15, .015, .009, 0.0005]
>>> label_pvalue()(x)
['0.9', '0.15', '0.015', '0.009', '<0.001']
>>> label_pvalue(0.1)(x)
['0.9', '0.1', '<0.1', '<0.1', '<0.1']
>>> label_pvalue(0.1, True)(x)
['p=0.9', 'p=0.1', 'p<0.1', 'p<0.1', 'p<0.1']
__call__(x: FloatArrayLike) -> Sequence[str]

Format a sequence of inputs

Parameters
x

array Input

Returns
out

python:list List of strings.

class mizani.labels.label_ordinal(prefix: str = '', suffix: str = '', big_mark: str = '')

Ordinal number labelling

Parameters
prefix

python:str What to put before the value.

suffix

python:str What to put after the value.

big_mark

python:str The thousands separator. This is usually a comma or a dot.

Examples

>>> label_ordinal()(range(8))
['0th', '1st', '2nd', '3rd', '4th', '5th', '6th', '7th']
>>> label_ordinal(suffix=' Number')(range(11, 15))
['11th Number', '12th Number', '13th Number', '14th Number']
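
The English ordinal rule behind these labels is short enough to sketch. ordinal_sketch is a hypothetical helper for non-negative integers; numbers ending in 11, 12 or 13 take "th", and otherwise the last digit decides.

```python
def ordinal_sketch(n, prefix="", suffix="", big_mark=""):
    """Return n with its English ordinal ending, e.g. 1st, 2nd, 3rd, 4th."""
    n = int(n)
    if 11 <= n % 100 <= 13:
        ending = "th"  # 11th, 12th, 13th, 111th, ...
    else:
        ending = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    digits = f"{n:,}".replace(",", big_mark)
    return f"{prefix}{digits}{ending}{suffix}"


first_eight = [ordinal_sketch(i) for i in range(8)]
```

The teens check has to come first, otherwise 11 would be labelled "11st".
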
__call__(x: FloatArrayLike) -> Sequence[str]

Call self as a function.

class mizani.labels.label_bytes(symbol: Literal['auto'] | BytesSymbol = 'auto', units: Literal['binary', 'si'] = 'binary', fmt: str = '{:.0f} ')

Labelling byte numbers

Parameters
symbol

python:str Valid symbols are "B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", and "YB" for SI units, and the "iB" variants for binary units. Default is "auto" where the symbol to be used is determined separately for each value of x.

units

"binary" | "si" Which unit base to use, 1024 for "binary" or 1000 for "si".

fmt

python:str, optional Format string. Default is {:.0f}.

Examples

>>> x = [1000, 1000000, 4e5]
>>> label_bytes()(x)
['1000 B', '977 KiB', '391 KiB']
>>> label_bytes(units='si')(x)
['1 kB', '1 MB', '400 kB']
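
The "auto" symbol choice can be sketched as repeated division by the unit base. bytes_label_sketch is a hypothetical helper that handles a single value; the symbol lists follow the spellings used in the examples above.

```python
def bytes_label_sketch(value, units="binary", fmt="{:.0f} "):
    """Pick the largest unit that keeps the number below the unit base."""
    if units == "binary":
        base = 1024
        symbols = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]
    else:
        base = 1000
        symbols = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
    i = 0
    while abs(value) >= base and i < len(symbols) - 1:
        value /= base
        i += 1
    return fmt.format(value) + symbols[i]


x = [1000, 1000000, 4e5]
binary_labels = [bytes_label_sketch(v) for v in x]
si_labels = [bytes_label_sketch(v, units="si") for v in x]
```

The value 1000 stays in bytes under binary units because it is below 1024, while under SI units it becomes 1 kB.
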
__call__(x: FloatArrayLike) -> Sequence[str]

Call self as a function.

palettes - Mapping values onto the domain of a scale

Palettes are the link between data values and the values along the dimension of a scale. Before a collection of values can be represented on a scale, they are transformed by a palette. This transformation is known as mapping. Values are mapped onto a scale by a palette.

Scales tend to have restrictions on the magnitude of quantities that they can intelligibly represent. For example, the size of a point should be significantly smaller than the plot panel onto which it is plotted or else it would be hard to compare two or more points. Therefore palettes must be created that enforce such restrictions. This is the reason for the *_pal functions that create and return the actual palette functions.

mizani.palettes.hls_palette(n_colors: int = 6, h: float = 0.01, l: float = 0.6, s: float = 0.65) -> Sequence[TupleFloat3]

Get a set of evenly spaced colors in HLS hue space.

h, l, and s should be between 0 and 1

Parameters
n_colors

python:int number of colors in the palette

h

python:float first hue

l

python:float lightness

s

python:float saturation

Returns
palette

python:list List of colors as RGB hex strings.

SEE ALSO:

hsluv_palette

Make a palette using evenly spaced circular hues in the HSLuv system.

Examples

>>> len(hls_palette(2))
2
>>> len(hls_palette(9))
9
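
Evenly spaced hues can be sketched with the standard library colorsys module. hls_palette_sketch and to_hex are hypothetical helpers; the sketch returns RGB tuples and converts to hex separately, which is an assumption about presentation rather than a statement of what the library returns.

```python
import colorsys


def hls_palette_sketch(n_colors=6, h=0.01, l=0.6, s=0.65):
    """Return n_colors RGB tuples whose hues are evenly spaced from h."""
    hues = [(h + i / n_colors) % 1 for i in range(n_colors)]
    return [colorsys.hls_to_rgb(hue, l, s) for hue in hues]


def to_hex(rgb):
    """Convert an RGB tuple with components in [0, 1] to a hex string."""
    return "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in rgb))


colors = hls_palette_sketch(5)
hex_colors = [to_hex(c) for c in colors]
```

Only the hue changes between colors; lightness and saturation are held fixed, which is why the colors differ in kind rather than in intensity.
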
mizani.palettes.hsluv_palette(n_colors: int = 6, h: float = 0.01, s: float = 0.9, l: float = 0.65) -> Sequence[TupleFloat3]

Get a set of evenly spaced colors in HSLuv hue space.

h, s, and l should be between 0 and 1

Parameters
n_colors

python:int number of colors in the palette

h

python:float first hue

s

python:float saturation

l

python:float lightness

Returns
palette

python:list List of colors as RGB hex strings.

SEE ALSO:

hls_palette

Make a palette using evenly spaced circular hues in the HSL system.

Examples

>>> len(hsluv_palette(3))
3
>>> len(hsluv_palette(11))
11
class mizani.palettes.rescale_pal(range: TupleFloat2 = (0.1, 1))

Rescale the input to the specific output range.

Useful for alpha, size, and continuous position.

Parameters
range

python:tuple Range of the scale

Returns
out

function Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.

Examples

>>> palette = rescale_pal()
>>> palette([0, .2, .4, .6, .8, 1])
array([0.1 , 0.28, 0.46, 0.64, 0.82, 1.  ])

The returned palette expects inputs in the [0, 1] range. Any value outside those limits is clipped to range[0] or range[1].

>>> palette([-2, -1, 0.2, .4, .8, 2, 3])
array([0.1 , 0.1 , 0.28, 0.46, 0.82, 1.  , 1.  ])
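
The pattern of a *_pal factory returning a palette function can be sketched as a closure. rescale_pal_sketch is a hypothetical pure-Python version that clips input to [0, 1] before mapping it linearly onto the configured range.

```python
def rescale_pal_sketch(range_=(0.1, 1)):
    """Create a palette that maps [0, 1] linearly onto range_."""
    lo, hi = range_

    def palette(values):
        out = []
        for v in values:
            v = min(max(v, 0), 1)  # clip the input to [0, 1]
            out.append(lo + v * (hi - lo))
        return out

    return palette


palette = rescale_pal_sketch()
mapped = palette([-2, 0, 0.2, 1, 3])
```

The range is fixed once when the palette is created, so the returned function only needs the values to map.
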
class mizani.palettes.area_pal(range: TupleFloat2 = (1, 6))

Point area palette (continuous).

Parameters
range

python:tuple Numeric vector of length two, giving range of possible sizes. Should be greater than 0.

Returns
out

function Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.

Examples

>>> x = np.arange(0, .6, .1)**2
>>> palette = area_pal()
>>> palette(x)
array([1. , 1.5, 2. , 2.5, 3. , 3.5])

The results are equidistant because the input x is in area space, i.e. it is squared.
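
The example output is consistent with taking a square root before rescaling. The sketch below assumes that rule; area_pal_sketch is a hypothetical helper and expects inputs in [0, 1].

```python
import math


def area_pal_sketch(range_=(1, 6)):
    """Create a palette that maps sqrt(value) linearly onto range_."""
    lo, hi = range_

    def palette(values):
        return [lo + math.sqrt(v) * (hi - lo) for v in values]

    return palette


x = [(i / 10) ** 2 for i in range(6)]  # squared, i.e. already in area space
sizes = area_pal_sketch()(x)
steps = [b - a for a, b in zip(sizes, sizes[1:])]
```

Because the square root undoes the squaring, consecutive sizes differ by the same amount.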

class mizani.palettes.abs_area(max: float)

Point area palette (continuous), with area proportional to value.

Parameters
max

python:float A number representing the maximum size

Returns
out

function Palette function that takes a sequence of values in the range [0, 1] and returns values in the range [0, max].

Examples

>>> x = np.arange(0, .8, .1)**2
>>> palette = abs_area(5)
>>> palette(x)
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5])

Compared to area_pal(), abs_area() will handle values in the range [-1, 0] without returning np.nan. And values whose absolute value is greater than 1 will be clipped to the maximum.
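
The handling of negative and oversized inputs can be sketched as follows. abs_area_sketch is a hypothetical helper; it assumes the absolute value is taken first and that magnitudes above 1 are clipped, as the paragraph above describes.

```python
import math


def abs_area_sketch(max_size):
    """Create a palette mapping sqrt(|value|) onto [0, max_size]."""

    def palette(values):
        out = []
        for v in values:
            magnitude = min(abs(v), 1)  # negatives allowed, clip above 1
            out.append(math.sqrt(magnitude) * max_size)
        return out

    return palette


palette = abs_area_sketch(5)
sizes = palette([-0.25, 0.25, 4])
```

A value and its negative get the same size, and anything with magnitude above 1 gets max_size.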

class mizani.palettes.grey_pal(start: float = 0.2, end: float = 0.8)

Utility for creating continuous grey scale palette

Parameters
start

python:float grey value at low end of palette

end

python:float grey value at high end of palette

Returns
out

function Continuous color palette that takes a single int parameter n and returns n equally spaced colors.

Examples

>>> palette = grey_pal()
>>> palette(5)
['#333333', '#737373', '#989898', '#b4b4b4', '#cccccc']
class mizani.palettes.hue_pal(h: float = 0.01, l: float = 0.6, s: float = 0.65, color_space: Literal['hls', 'hsluv'] = 'hls')

Utility for making hue palettes for color schemes.

Parameters
h

python:float first hue. In the [0, 1] range

l

python:float lightness. In the [0, 1] range

s

python:float saturation. In the [0, 1] range

color_space

'hls' | 'hsluv' Color space to use for the palette. hls for https://en.wikipedia.org/wiki/HSL_and_HSV or hsluv for https://www.hsluv.org/.

Returns
out

function A discrete color palette that takes a single int parameter n and returns n equally spaced colors. Though the palette is continuous, since it varies the hue it is good for categorical data. However, if n is large enough the colors show continuity.

Examples

>>> hue_pal()(5)
['#db5f57', '#b9db57', '#57db94', '#5784db', '#c957db']
>>> hue_pal(color_space='hsluv')(5)
['#e0697e', '#9b9054', '#569d79', '#5b98ab', '#b675d7']
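
How equally spaced hues turn into colors can be sketched with the standard library's colorsys module. This is an illustration only, assuming the 'hls' color space, hues stepped by 1/n from the first hue h, and rounding to the nearest 8-bit channel value; hue_pal_sketch is a hypothetical name, not mizani API. Its result for n = 5 can be compared against the hue_pal()(5) output above.

```python
import colorsys


def hue_pal_sketch(n, h=0.01, l=0.6, s=0.65):
    """Return n hex colors with equally spaced hues (hypothetical helper)."""
    colors = []
    for i in range(n):
        hue = (h + i / n) % 1  # step around the hue wheel, wrapping at 1
        rgb = colorsys.hls_to_rgb(hue, l, s)
        # Convert each channel from [0, 1] to a two-digit hex value
        colors.append("#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in rgb)))
    return colors


five_colors = hue_pal_sketch(5)
```
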
class mizani.palettes.brewer_pal(type: ColorScheme | ColorSchemeShort = 'seq', palette: int | str = 1, direction: Literal[1, -1] = 1)

Utility for making a brewer palette

Parameters
type

'sequential' | 'qualitative' | 'diverging' Type of palette. Sequential, Qualitative or Diverging. The following abbreviations may be used, seq, qual or div.

palette

python:int | python:str Which palette to choose from. If it is an integer, it must be in the range [0, m], where m depends on the number of sequential, qualitative or diverging palettes. If it is a string, then it is the name of the palette.

direction

python:int The order of colours in the scale. If -1 the order of colors is reversed. The default is 1.

Returns
out

function A color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.

Examples

>>> brewer_pal()(5)
['#EFF3FF', '#BDD7E7', '#6BAED6', '#3182BD', '#08519C']
>>> brewer_pal('qual')(5)
['#7FC97F', '#BEAED4', '#FDC086', '#FFFF99', '#386CB0']
>>> brewer_pal('qual', 2)(5)
['#1B9E77', '#D95F02', '#7570B3', '#E7298A', '#66A61E']
>>> brewer_pal('seq', 'PuBuGn')(5)
['#F6EFF7', '#BDC9E1', '#67A9CF', '#1C9099', '#016C59']

The available color names for each palette type can be obtained using the following code:

from mizani._colors.brewer import get_palette_names

print(get_palette_names("sequential"))
print(get_palette_names("qualitative"))
print(get_palette_names("diverging"))
class mizani.palettes.gradient_n_pal(colors: Sequence[str], values: Sequence[float] | None = None)

Create an n color gradient palette

Parameters
colors

python:list list of colors

values

python:list, optional list of points in the range [0, 1] at which to place each color. Must be the same size as colors. By default the colors are evenly spaced.

Returns
out

function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps the value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = gradient_n_pal(['red', 'blue'])
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#bf0040', '#7f0080', '#4000bf', '#0000ff']
>>> palette([-np.inf, 0, np.nan, 1, np.inf])
[None, '#ff0000', None, '#0000ff', None]
class mizani.palettes.cmap_pal(name: str)

Create a continuous palette using a colormap

Parameters
name

python:str Name of colormap

Returns
out

function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps the value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = cmap_pal('viridis')
>>> palette([.1, .2, .3, .4, .5])
['#482475', '#414487', '#355f8d', '#2a788e', '#21918c']
class mizani.palettes.cmap_d_pal(name: str)

Create a discrete palette from a colormap

Parameters
name

python:str Name of colormap

Returns
out

function A discrete color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.

Examples

>>> palette = cmap_d_pal('viridis')
>>> palette(5)
['#440154', '#3b528b', '#21918c', '#5ec962', '#fde725']
class mizani.palettes.desaturate_pal(color: str, prop: float, reverse: bool = False)

Create a palette that desaturates a color by some proportion

Parameters
color

color html color name, hex, rgb-tuple

prop

python:float saturation channel of color will be multiplied by this value

reverse

bool Whether to reverse the palette.

Returns
out

function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps the value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = desaturate_pal('red', .1)
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#e21d1d', '#c53a3a', '#a95656', '#8c7373']
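
The desaturated end of such a palette can be sketched with the standard library. This is a minimal illustration, assuming the saturation channel is taken from the HLS representation of the color; desaturate_sketch and to_hex are hypothetical helpers, and the interpolation between the original and the desaturated color is not shown.

```python
import colorsys


def desaturate_sketch(rgb, prop):
    """Multiply the HLS saturation of an (r, g, b) color in [0, 1] by prop."""
    h, l, s = colorsys.rgb_to_hls(*rgb)
    return colorsys.hls_to_rgb(h, l, s * prop)


def to_hex(rgb):
    """Format an (r, g, b) color in [0, 1] as a hex string."""
    return "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in rgb))


# Pure red with its saturation reduced to 10%
end_color = to_hex(desaturate_sketch((1.0, 0.0, 0.0), 0.1))
```
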
class mizani.palettes.manual_pal(values: Sequence[Any])

Create a palette from a list of values

Parameters
values

python:sequence Values that will be returned by the palette function.

Returns
out

function A function palette that takes a single int parameter n and returns n values.

Examples

>>> palette = manual_pal(['a', 'b', 'c', 'd', 'e'])
>>> palette(3)
['a', 'b', 'c']
mizani.palettes.xkcd_palette(colors: Sequence[str]) -> Sequence[RGBHexColor]

Make a palette with color names from the xkcd color survey.

See xkcd for the full list of colors: http://xkcd.com/color/rgb/

Parameters
colors

python:list of strings List of keys in the mizani.colors.xkcd_rgb dictionary.

Returns
palette

python:list List of colors as RGB hex strings.

Examples

>>> palette = xkcd_palette(['red', 'green', 'blue'])
>>> palette
['#E50000', '#15B01A', '#0343DF']
>>> from mizani._colors.named_colors import XKCD
>>> list(sorted(XKCD.keys()))[:4]
['xkcd:acid green', 'xkcd:adobe', 'xkcd:algae', 'xkcd:algae green']
mizani.palettes.crayon_palette(colors: Sequence[str]) -> Sequence[RGBHexColor]

Make a palette with color names from Crayola crayons.

The colors come from http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors

Parameters
colors

python:list of strings List of keys in the mizani.colors.crayloax_rgb dictionary.

Returns
palette

python:list List of colors as RGB hex strings.

Examples

>>> palette = crayon_palette(['almond', 'silver', 'yellow'])
>>> palette
['#EED9C4', '#C9C0BB', '#FBE870']
>>> from mizani._colors.named_colors import CRAYON
>>> list(sorted(CRAYON.keys()))[:3]
['crayon:almond', 'crayon:antique brass', 'crayon:apricot']
class mizani.palettes.cubehelix_pal(start: int = 0, rotation: float = 0.4, gamma: float = 1.0, hue: float = 0.8, light: float = 0.85, dark: float = 0.15, reverse: bool = False)

Utility for creating discrete palette from the cubehelix system.

This produces a colormap with linearly-decreasing (or increasing) brightness. That means that information will be preserved if printed to black and white or viewed by someone who is colorblind.

Parameters
start

python:float (0 <= start <= 3) The hue at the start of the helix.

rotation

python:float Rotations around the hue wheel over the range of the palette.

gamma

python:float (0 <= gamma) Gamma factor to emphasize darker (gamma < 1) or lighter (gamma > 1) colors.

hue

python:float (0 <= hue <= 1) Saturation of the colors.

dark

python:float (0 <= dark <= 1) Intensity of the darkest color in the palette.

light

python:float (0 <= light <= 1) Intensity of the lightest color in the palette.

reverse

bool If True, the palette will go from dark to light.

Returns
out

function Continuous color palette that takes a single int parameter n and returns n equally spaced colors.

References

Green, D. A. (2011). "A colour scheme for the display of astronomical intensity images". Bulletin of the Astronomical Society of India, Vol. 39, p. 289-295.

Examples

>>> palette = cubehelix_pal()
>>> palette(5)
['#edd1cb', '#d499a7', '#aa678f', '#6e4071', '#2d1e3e']
mizani.palettes.identity_pal() -> Callable[[T], T]

Create palette that maps values onto themselves

Returns
out

function Palette function that takes a value or sequence of values and returns the same values.

Examples

>>> palette = identity_pal()
>>> palette(9)
9
>>> palette([2, 4, 6])
[2, 4, 6]
class mizani.palettes.none_pal

Discrete palette that returns only None values

transforms - Transforming variables, scales and coordinates

"The Grammar of Graphics (2005)" by Wilkinson, Anand and Grossman describes three types of transformations.

  • Variable transformations - Used to make statistical operations on variables appropriate and meaningful. They are also used to create new variables.
  • Scale transformations - Used to make statistical objects displayed on dimensions appropriate and meaningful.
  • Coordinate transformations - Used to manipulate the geometry of graphics to help perceive relationships and find meaningful structures for representing variations.

Variable and scale transformations are similar in that they lead to plotted objects that are indistinguishable. Typically, variable transformation is done outside the graphics system and so the system cannot provide transformation-specific guides and decorations for the plot. The trans class is aimed at being useful for scale and coordinate transformations.

class mizani.transforms.asn_trans(**kwargs: Any)

Arc-sin square-root Transformation

static transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

static inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

class mizani.transforms.atanh_trans(**kwargs: Any)

Inverse hyperbolic tangent (arctanh) Transformation

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'arctanh'>

inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'tanh'>

mizani.transforms.boxcox_trans(p, offset=0, **kwargs)

Boxcox Transformation

The Box-Cox transformation is a flexible transformation, often used to transform data towards normality.

The Box-Cox power transformation (type 1) requires strictly positive values and takes the following form for y \gt 0:

y^{(\lambda)} = \frac{y^\lambda - 1}{\lambda}

When \lambda = 0, the natural log transform is used.
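
The formula above, together with its logarithmic limiting case at \lambda = 0, can be sketched in plain Python. This is an illustration under stated assumptions rather than mizani's implementation: boxcox_sketch and boxcox_inverse_sketch are hypothetical names, they act on single floats, and the offset is assumed to be added to y before the power is applied.

```python
import math


def boxcox_sketch(y, lam, offset=0.0):
    """Box-Cox transform of one value (hypothetical helper, not mizani API)."""
    shifted = y + offset  # assumption: the offset shifts y before transforming
    if shifted <= 0:
        raise ValueError("Box-Cox requires y + offset > 0")
    if lam == 0:
        return math.log(shifted)  # limiting case as lambda approaches 0
    return (shifted ** lam - 1) / lam


def boxcox_inverse_sketch(z, lam, offset=0.0):
    """Undo boxcox_sketch for the same lam and offset."""
    if lam == 0:
        return math.exp(z) - offset
    return (z * lam + 1) ** (1 / lam) - offset


# Round trip: the inverse recovers the original value
recovered = boxcox_inverse_sketch(boxcox_sketch(3.7, 0.3), 0.3)
```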

Parameters
p

python:float Transformation exponent \lambda.

offset

python:int Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 0. modulus_trans() sets the default to 1.

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

SEE ALSO:

modulus_trans()

mizani.transforms.modulus_trans(p, offset=1, **kwargs)

Modulus Transformation

The modulus transformation generalises Box-Cox to work with both positive and negative values.

When \lambda \neq 0

y^{(\lambda)} = sign(y) * \frac{(|y| + 1)^\lambda - 1}{\lambda}

and when \lambda = 0

y^{(\lambda)} = sign(y) * \ln{(|y| + 1)}
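
A plain-Python sketch of the two cases may help. The names modulus_sketch and modulus_inverse_sketch are hypothetical, the functions act on single floats, and the constant 1 in the formulas is exposed as an offset argument on the assumption that this is what the offset parameter controls.

```python
import math


def modulus_sketch(y, lam, offset=1.0):
    """Modulus transform of one value (hypothetical helper, not mizani API)."""
    sign = math.copysign(1.0, y)
    magnitude = abs(y) + offset
    if lam == 0:
        return sign * math.log(magnitude)
    return sign * (magnitude ** lam - 1) / lam


def modulus_inverse_sketch(z, lam, offset=1.0):
    """Undo modulus_sketch for the same lam and offset."""
    sign = math.copysign(1.0, z)
    if lam == 0:
        return sign * (math.exp(abs(z)) - offset)
    return sign * ((abs(z) * lam + 1) ** (1 / lam) - offset)


# The transform is odd: negative inputs mirror positive ones
pair = (modulus_sketch(3.0, 2), modulus_sketch(-3.0, 2))
```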

Parameters
p

python:float Transformation exponent \lambda.

offset

python:int Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 1. boxcox_trans() sets the default to 0.

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

SEE ALSO:

boxcox_trans()

class mizani.transforms.datetime_trans(tz=None, **kwargs)

Datetime Transformation

Parameters
tz

python:str | ZoneInfo Timezone information

Examples

>>> from zoneinfo import ZoneInfo
>>> UTC = ZoneInfo("UTC")
>>> EST = ZoneInfo("EST")
>>> t = datetime_trans(EST)
>>> x = [datetime(2022, 1, 20, tzinfo=UTC)]
>>> x2 = t.inverse(t.transform(x))
>>> list(x) == list(x2)
True
>>> x[0].tzinfo == x2[0].tzinfo
False
>>> x[0].tzinfo.key
'UTC'
>>> x2[0].tzinfo.key
'EST'
breaks_: BreaksFunction = <mizani.breaks.breaks_date object>

Callable to calculate breaks

format: FormatFunction = label_date(fmt='%Y-%m-%d', tz=None)

Function to format breaks

transform(x: DatetimeArrayLike) -> NDArrayFloat

Transform from date to a numerical format

inverse(x: FloatArrayLike) -> NDArrayDatetime

Transform to date from numerical format

property tzinfo

Alias of tz

mizani.transforms.exp_trans(base: float | None = None, **kwargs: Any)

Create an exponential transform class for base

This is the inverse of the log transform.

Parameters
base

python:float Base of the logarithm

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

Returns
out

type Exponential transform class

class mizani.transforms.identity_trans(**kwargs: Any)

Identity Transformation

Examples

The default trans returns one minor break between every pair of major breaks

>>> major = [0, 1, 2]
>>> t = identity_trans()
>>> t.minor_breaks(major)
array([0.5, 1.5])

Create a trans that returns 4 minor breaks

>>> t = identity_trans(minor_breaks=minor_breaks(4))
>>> t.minor_breaks(major)
array([0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8])
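
The placement of minor breaks in the examples above can be sketched without mizani. This is a minimal illustration, assuming n evenly spaced minor breaks between each pair of adjacent major breaks and none outside them; minor_breaks_sketch is a hypothetical name.

```python
def minor_breaks_sketch(major, n=1):
    """Place n evenly spaced minor breaks between adjacent major breaks."""
    minor = []
    for low, high in zip(major, major[1:]):
        step = (high - low) / (n + 1)
        # The major breaks themselves are excluded
        minor.extend(low + step * i for i in range(1, n + 1))
    return minor


one_each = minor_breaks_sketch([0, 1, 2])
four_each = minor_breaks_sketch([0, 1, 2], n=4)
```
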
transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

static transform(param: T) -> T

Return whatever is passed in

static inverse(param: T) -> T

Return whatever is passed in

class mizani.transforms.log10_trans(**kwargs: Any)

Log 10 Transformation

breaks_: BreaksFunction = <mizani.breaks.breaks_log object>

Callable to calculate breaks

format: FormatFunction = label_log(base=10, exponent_limits=(-4, 4), mathtex=False)

Function to format breaks

static inverse(x)

Inverse of x

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log10'>

class mizani.transforms.log1p_trans(**kwargs: Any)

Log plus one Transformation

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log1p'>

inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'expm1'>

class mizani.transforms.log2_trans(**kwargs: Any)

Log 2 Transformation

breaks_: BreaksFunction = <mizani.breaks.breaks_log object>

Callable to calculate breaks

format: FormatFunction = label_log(base=2, exponent_limits=(-4, 4), mathtex=False)

Function to format breaks

static inverse(x)

Inverse of x

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log2'>

mizani.transforms.log_trans(base: float | None = None, **kwargs: Any) -> trans

Create a log transform class for base

Parameters
base

python:float Base for the logarithm. If None, then the natural log is used.

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.

Returns
out

type Log transform class
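
The idea of building a transform and its inverse from a base can be sketched as a small factory. This is an illustration only: make_log_pair is a hypothetical helper that returns two plain functions rather than a trans class, and it handles single floats.

```python
import math


def make_log_pair(base=None):
    """Return (transform, inverse) for a logarithm of the given base."""
    if base is None:
        # Natural log when no base is given
        return math.log, math.exp

    def transform(x):
        return math.log(x) / math.log(base)

    def inverse(y):
        return base ** y

    return transform, inverse


log2, exp2 = make_log_pair(2)
ln, exp = make_log_pair()
```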

class mizani.transforms.logit_trans(**kwargs: Any)

Logit Transformation

static inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

static transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

mizani.transforms.probability_trans(distribution: str, *args, **kwargs) -> trans

Probability Transformation

Parameters
distribution

python:str Name of the distribution. Valid distributions are listed at scipy.stats. Any of the continuous or discrete distributions.

args

python:tuple Arguments passed to the distribution functions.

kwargs

python:dict Keyword arguments passed to the distribution functions.

Notes

Make sure that the distribution is a good enough approximation for the data. When this is not the case, computations may run into errors. Absence of any errors does not imply that the distribution fits the data.

mizani.transforms.probit_trans

alias of norm_trans

class mizani.transforms.reverse_trans(**kwargs: Any)

Reverse Transformation

transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>

inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>

class mizani.transforms.sqrt_trans(**kwargs: Any)

Square-root Transformation

transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'sqrt'>

inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'square'>

class mizani.transforms.symlog_trans(**kwargs: Any)

Symmetric Log Transformation

The symmetric logarithmic transformation is defined as

f(x) = log(x+1) for x >= 0
       -log(-x+1) for x < 0

It can be useful for data that has a wide range of both positive and negative values (including zero).
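
The definition above can be written directly with the standard library. The helper names are hypothetical and the functions act on single floats; math.log1p and math.expm1 are used for accuracy near zero.

```python
import math


def symlog_sketch(x):
    """log(x + 1) for x >= 0 and -log(-x + 1) for x < 0."""
    return math.copysign(math.log1p(abs(x)), x)


def symlog_inverse_sketch(y):
    """Undo symlog_sketch."""
    return math.copysign(math.expm1(abs(y)), y)


# Zero maps to zero and the sign of the input is preserved
values = [symlog_sketch(v) for v in (-9.0, 0.0, 9.0)]
```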

breaks_: BreaksFunction = <mizani.breaks.breaks_symlog object>

Callable to calculate breaks

static transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

static inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

class mizani.transforms.timedelta_trans(**kwargs: Any)

Timedelta Transformation

breaks_: BreaksFunction = <mizani.breaks.breaks_timedelta object>

Callable to calculate breaks

format: FormatFunction = label_timedelta(units=None, show_units=True, zero_has_units=True, usetex=False, space=True, use_plurals=True)

Function to format breaks

static transform(x: NDArrayTimedelta | Sequence[timedelta]) -> NDArrayFloat

Transform from Timedelta to numerical format

static inverse(x: FloatArrayLike) -> NDArrayTimedelta

Transform to Timedelta from numerical format

class mizani.transforms.pd_timedelta_trans(**kwargs: Any)

Pandas timedelta Transformation

breaks_: BreaksFunction = <mizani.breaks.breaks_timedelta object>

Callable to calculate breaks

format: FormatFunction = label_timedelta(units=None, show_units=True, zero_has_units=True, usetex=False, space=True, use_plurals=True)

Function to format breaks

static transform(x: TimedeltaSeries) -> NDArrayFloat

Transform from Timedelta to numerical format

static inverse(x: FloatArrayLike) -> NDArrayTimedelta

Transform to Timedelta from numerical format

class mizani.transforms.pseudo_log_trans(sigma=1, base=None, **kwargs)

Pseudo-log transformation

A transformation mapping numbers to a signed logarithmic scale with a smooth transition to linear scale around 0.

Parameters
sigma

python:float Scaling factor for the linear part.

base

python:int Approximate logarithm used. If None, then the natural log is used.

kwargs

python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
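
A sketch of how sigma and base could enter such a transformation follows. It assumes the inverse hyperbolic sine form used by the R scales package on which Mizani is based, asinh(x / (2 * sigma)) / log(base); mizani's own implementation may differ in detail, and the helper names are hypothetical.

```python
import math


def pseudo_log_sketch(x, sigma=1.0, base=None):
    """Assumed form: asinh(x / (2 * sigma)) / log(base)."""
    log_base = 1.0 if base is None else math.log(base)  # natural log by default
    return math.asinh(x / (2 * sigma)) / log_base


def pseudo_log_inverse_sketch(y, sigma=1.0, base=None):
    """Undo pseudo_log_sketch for the same sigma and base."""
    log_base = 1.0 if base is None else math.log(base)
    return 2 * sigma * math.sinh(y * log_base)


# Far from zero the result approaches an ordinary logarithm
far = pseudo_log_sketch(1e6, base=10)
```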

transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

minor_breaks(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) -> NDArrayFloat

Calculate minor_breaks

class mizani.transforms.reciprocal_trans(**kwargs: Any)

Reciprocal Transformation

static transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

static inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

class mizani.transforms.trans(**kwargs: Any)

Base class for all transforms

This class is used to transform data and also tell the x and y axes how to create and label the tick locations.

The key methods to override are trans.transform() and trans.inverse(). Alternately, you can quickly create a transform class using the trans_new() function.

Parameters
kwargs

python:dict Attributes of the class to set/override

transform_is_linear: bool = False

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

breaks_: BreaksFunction = <mizani.breaks.breaks_extended object>

Callable to calculate breaks

format: FormatFunction = label_number(accuracy=None, precision=None, scale=1, prefix='', suffix='', big_mark='', decimal_mark='.', fill='', style_negative='-', style_positive='', align='>', width=None)

Function to format breaks

property domain_is_numerical: bool

Return True if transformation acts on numerical data. e.g. int, float, and imag are numerical but datetime is not.

minor_breaks(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) -> NDArrayFloat

Calculate minor_breaks

abstract static transform(x: TFloatArrayLike) -> TFloatArrayLike

Transform of x

abstract static inverse(x: TFloatArrayLike) -> TFloatArrayLike

Inverse of x

breaks(limits: DomainType) -> NDArrayFloat

Calculate breaks in data space and return them in transformed space.

Expects limits to be in transform space, which is the same space as that in which the domain is specified.

This method wraps around breaks_() to ensure that the calculated breaks are within the domain of the transform. This is helpful in cases where an aesthetic requests breaks with limits expanded for some padding, yet the expansion goes beyond the domain of the transform. e.g. for a probability transform the breaks will be in the domain [0, 1] despite any outward limits.

Parameters
limits

python:tuple The scale limits. Size 2.

Returns
out

numpy:array_like Major breaks

mizani.transforms.trans_new(name: str, transform: TransformFunction, inverse: InverseFunction, breaks: BreaksFunction | None = None, minor_breaks: MinorBreaksFunction | None = None, _format: FormatFunction | None = None, domain=(-inf, inf), doc: str = '', **kwargs) -> trans

Create a transformation class object

Parameters
name

python:str Name of the transformation

transform

python:callable() f(x) A function (preferably a ufunc) that computes the transformation.

inverse

python:callable() f(x) A function (preferably a ufunc) that computes the inverse of the transformation.

breaks

python:callable() f(limits) Function to compute the breaks for this transform. If None, then a default good enough for a linear domain is used.

minor_breaks

python:callable() f(major, limits) Function to compute the minor breaks for this transform. If None, then a default good enough for a linear domain is used.

_format

python:callable() f(breaks) Function to format the generated breaks.

domain

numpy:array_like Domain over which the transformation is valid. It should be of length 2.

doc

python:str Docstring for the class.

**kwargs

python:dict Attributes of the transform, e.g. if base is passed in kwargs, then t.base would be a valid attribute.

Returns
out

trans Transform class

mizani.transforms.gettrans(t: str | Callable[[], Type[trans]] | Type[trans] | trans | None = None)

Return a trans object

Parameters
t

python:str | python:callable() | type | trans Name of transformation function. If None, returns an identity transform.

Returns
out

trans

scale - Implementing a scale

According to "On the theory of scales of measurement" by S.S. Stevens, scales can be classified in four ways -- nominal, ordinal, interval and ratio. Using current (2016) terminology, nominal data is made up of unordered categories, ordinal data is made up of ordered categories and the two can be classified as discrete. On the other hand, both interval and ratio data are continuous.

The scale classes below show how the rest of the Mizani package can be used to implement the two categories of scales. The key tasks are training and mapping and these correspond to the train and map methods.

To train a scale on data means to make the scale learn the limits of the data. This is elaborate (or worthy of a dedicated method) for two reasons:

  • Practical -- data may be split up across more than one object, yet all will be represented by a single scale.
  • Conceptual -- training is a key action that may need to be inserted into multiple locations of the data processing pipeline before a graphic can be created.

To map data onto a scale means to associate data values with values (potential readings) on a scale. This is perhaps the most important concept underpinning a scale.

The apply methods are simple examples of how to put it all together.
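
The two tasks can be sketched for continuous data in plain Python. This is an illustration under stated assumptions, not the scale_continuous implementation: train_sketch and map_sketch are hypothetical helpers, the palette is any function of a value in [0, 1], out-of-bounds values are replaced by na_value, and the case of equal lower and upper limits is not handled.

```python
def train_sketch(new_data, old=None):
    """Extend the known (min, max) limits with new data."""
    values = list(new_data) + (list(old) if old is not None else [])
    return (min(values), max(values))


def map_sketch(x, palette, limits, na_value=None):
    """Rescale x to [0, 1] using limits, then apply the palette."""
    low, high = limits
    out = []
    for v in x:
        if v < low or v > high:
            out.append(na_value)  # values beyond the limits are censored
        else:
            out.append(palette((v - low) / (high - low)))
    return out


# Training on data split across two objects yields one set of limits
limits = train_sketch([3, 5])
limits = train_sketch([1, 4], limits)

# Map onto sizes between 2 and 10
sizes = map_sketch([1, 3, 5, 7], lambda p: 2 + 8 * p, limits)
```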

class mizani.scale.scale_continuous

Continuous scale

classmethod apply(x: FloatArrayLike, palette: ContinuousPalette, na_value: Any = None, trans: Trans | None = None) -> NDArrayFloat

Scale data continuously

Parameters
x

numpy:array_like Continuous values to scale

palette

python:callable() f(x) Palette to use

na_value

object Value to use for missing values.

trans

trans How to transform the data before scaling. If None, no transformation is done.

Returns
out

numpy:array_like Scaled values

classmethod train(new_data: FloatArrayLike, old: TupleFloat2 | None = None) -> TupleFloat2

Train a continuous scale

Parameters
new_data

numpy:array_like New values

old

numpy:array_like Old range

Returns
out

python:tuple Limits(range) of the scale

classmethod map(x: FloatArrayLike, palette: ContinuousPalette, limits: TupleFloat2, na_value: Any = None, oob: Callable[[TVector], TVector] = <function censor>) -> NDArrayFloat

Map values to a continuous palette

Parameters
x

numpy:array_like Continuous values to scale

palette

python:callable() f(x) palette to use

na_value

object Value to use for missing values.

oob

python:callable() f(x) Function to deal with values that are beyond the limits

Returns
out

numpy:array_like Values mapped onto a palette

class mizani.scale.scale_discrete

Discrete scale

classmethod apply(x: AnyArrayLike, palette: DiscretePalette, na_value: Any = None)

Scale data discretely

Parameters
x

numpy:array_like Discrete values to scale

palette

python:callable() f(x) Palette to use

na_value

object Value to use for missing values.

Returns
out

numpy:array_like Scaled values

classmethod train(new_data: AnyArrayLike, old: Sequence[Any] | None = None, drop: bool = False, na_rm: bool = False) -> Sequence[Any]

Train a discrete scale

Parameters
new_data

numpy:array_like New values

old

numpy:array_like Old range. List of values known to the scale.

drop

bool Whether to drop (not include) unused categories

na_rm

bool If True, remove missing values. Missing values are either NaN or None.

Returns
out

python:list Values covered by the scale
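
Training on discrete data can be sketched as accumulating the distinct values seen so far. This is a minimal illustration, assuming unseen values are appended in order of first appearance; train_discrete_sketch is a hypothetical helper, and mizani's own ordering rules and its handling of categoricals and the drop parameter are not modelled.

```python
import math


def train_discrete_sketch(new_data, old=None, na_rm=False):
    """Return the values known to the scale after seeing new_data."""
    known = list(old) if old is not None else []
    for value in new_data:
        is_missing = value is None or (
            isinstance(value, float) and math.isnan(value)
        )
        if na_rm and is_missing:
            continue  # skip None and NaN when na_rm is True
        if value not in known:
            known.append(value)
    return known


first = train_discrete_sketch(["b", "a", "b"])
second = train_discrete_sketch(["c", "a", None], old=first, na_rm=True)
```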

classmethod map(x: AnyArrayLike, palette: DiscretePalette, limits: Sequence[Any], na_value: Any = None) -> AnyArrayLike

Map values to a discrete palette

Parameters
palette

python:callable() f(x) palette to use

x

numpy:array_like Discrete values to scale

na_value

object Value to use for missing values.

Returns
out

numpy:array_like Values mapped onto a palette

Installation

mizani can be installed in a couple of ways depending on purpose.

Official release installation

For a normal user, it is recommended to install the official release.

$ pip install mizani

Development installation

To do any development you have to clone the mizani source repository and install the package in development mode. These commands do all of that:

$ git clone https://github.com/has2k1/mizani.git
$ cd mizani
$ pip install -e .

If you only want to use the latest development sources and do not care about having a cloned repository, e.g. if a bug you care about has been fixed but an official release has not come out yet, then use this command:

$ pip install git+https://github.com/has2k1/mizani.git

Changelog

v0.12.2

2024-09-04

Bug Fixes

  • Fixed squish and squish_infinite to work for non-writeable pandas series. This was broken with numpy 2.1.0.

v0.12.1

2024-08-19

Enhancements

  • Renamed "husl" color palette type to "hsluv". "husl" is the old name; it still works although it is not part of the API.

v0.12.0

2024-07-30

API Changes

  • mizani now requires python 3.9 and above.

Bug Fixes

  • Fixed bug where a date with a timezone could lose the timezone. #45.

v0.11.4

2024-05-24

Bug Fixes

  • Fixed squish and squish_infinite so that they do not reuse numpy arrays. The users object is not modified.

    This also prevents exceptions where the numpy array backs a pandas object and it is protected by copy-on-write.

v0.11.3

2024-05-09

Bug Fixes

  • Fixed bug when calculating monthly breaks where when the limits are narrow and do not align with the start and end of the month, there were no dates returned. (#42)

v0.11.2

2024-04-26

Bug Fixes

  • Added the ability to create reversed colormap for cmap_pal and cmap_d_pal using the matplotlib convention of name_r.

v0.11.1

2024-03-27

Bug Fixes

  • Fixed mizani.palettes.brewer_pal to return exact colors when the number of requested colors is less than or equal to the number in the palette.
  • Added all matplotlib colormaps and made them available from cmap_pal and cmap_d_pal (#39).

New

  • Added breaks_symlog to calculate breaks for the symmetric logarithm transformation.

Changes

  • The default big_mark for label_number has been changed from a comma to nothing.

v0.11.0

2024-02-12

Enhancements

  • Removed FutureWarnings when using pandas 2.1.0

New

  • Added breaks_symlog to calculate breaks for the symmetric logarithm transformation.

Changes

  • The default big_mark for label_number has been changed from a comma to nothing.

v0.10.0

2023-07-28

API Changes

  • mpl_format has been removed, number_format takes its place.
  • mpl_breaks has been removed, extended_breaks has always been the default and it is sufficient.
  • matplotlib has been removed as a dependency of mizani.
  • mizani now requires python 3.9 and above.
  • The units parameter of timedelta_format now accepts the values "min", "day", "week", "month", instead of "m", "d", "w", "M".
  • The naming convention for break formatting methods has changed from *_format to label_*. Specifically these methods have been renamed.

    • comma_format is now label_comma
    • custom_format is now label_custom
    • currency_format is now label_currency
    • dollar_format is now label_dollar
    • percent_format is now label_percent
    • scientific_format is now label_scientific
    • date_format is now label_date
    • number_format is now label_number
    • log_format is now label_log
    • timedelta_format is now label_timedelta
    • pvalue_format is now label_pvalue
    • ordinal_format is now label_ordinal
    • number_bytes_format is now label_bytes
  • The naming convention for break calculating methods has changed from *_breaks to breaks_*. Specifically these methods have been renamed.

    • log_breaks is now breaks_log
    • trans_minor_breaks is now minor_breaks_trans
    • date_breaks is now breaks_date
    • timedelta_breaks is now breaks_timedelta
    • extended_breaks is now breaks_extended
  • dataspace_is_numerical has changed to domain_is_numerical and it is now determined dynamically.
  • The default minor_breaks for all transforms that are not linear are now calculated in dataspace. But only if the dataspace is numerical.

New

  • symlog_trans for symmetric log transformation

v0.9.2

2023-05-25

Bug Fixes

  • Fixed regression bug in date_format where it could not deal with the UTC timezone from timezone. #30.

v0.9.1

2023-05-19

Bug Fixes

  • Fixed bug in date_format to handle datetime sequences within the same timezone but a mixed daylight saving state. (plotnine #687)

v0.9.0

2023-04-15

API Changes

  • palettable dropped as a dependency.

Bug Fixes

  • Fixed bug in datetime_trans where a pandas series with an index that did not start at 0 could not be transformed.
  • Install tzdata on pyodide/emscripten. #27

v0.8.1

2022-09-28

Bug Fixes

  • Fixed regression bug in log_format where formatting for bases 2, 8 and 16 would fail if the values were float-integers.

Enhancements

  • log_format now uses exponent notation for bases other than base 10.

v0.8.0

2022-09-26

API Changes

  • The lut parameter of cmap_pal and cmap_d_pal has been deprecated and will be removed in a future version.
  • datetime_trans gained parameter tz that controls the timezone of the transformation.
  • log_format gained boolean parameter mathtex for TeX values as understood by matplotlib instead of values in scientific notation.

Bug Fixes

  • Fixed bug in zero_range where uint64 values would cause a RuntimeError.

v0.7.4

2022-04-02

API Changes

  • comma_format is now imported automatically when using *.
  • Fixed issue with scale_discrete so that if you train on data with NaN and specify an old range that also has NaN, the result range does not include two NaN values.

v0.7.3

(2020-10-29)

Bug Fixes

  • Fixed log_breaks for narrow range if base=2 (#76).

v0.7.2

(2020-10-29)

Bug Fixes

  • Fixed bug in rescale_max() to properly handle values whose maximum is zero (#16).

v0.7.1

(2020-06-05)

Bug Fixes

  • Fixed regression in mizani.scales.scale_discrete.train() when training on values with some categoricals that have common elements.

v0.7.0

(2020-06-04)

Bug Fixes

  • Fixed issue with mizani.formatters.log_breaks where non-linear breaks could not be generated if the limits were greater than the largest integer sys.maxsize.
  • Fixed mizani.palettes.gradient_n_pal() to return nan for nan values.
  • Fixed mizani.scales.scale_discrete.train() when training categoricals to maintain the order. (plotnine #381)

v0.6.0

(2019-08-15)

New

  • Added pvalue_format
  • Added ordinal_format
  • Added number_bytes_format
  • Added pseudo_log_trans()
  • Added reciprocal_trans
  • Added modulus_trans()

Enhancements

  • mizani.breaks.date_breaks now supports intervals in the order of seconds.

  • mizani.palettes.brewer_pal now supports a direction argument to control the order of the returned colors.

API Changes

  • boxcox_trans() now only accepts positive values. For both positive and negative values, modulus_trans() has been added.
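
To illustrate why the two transforms differ in the values they accept, here is a sketch of the standard textbook definitions of the Box-Cox and modulus transformations. It is not Mizani's code: the function names boxcox and modulus and the parameter name p are assumptions chosen for this example, and the library's own classes may take additional parameters.

```python
import math


def boxcox(x, p):
    """Textbook Box-Cox transform; only defined for x > 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    return math.log(x) if p == 0 else (x ** p - 1) / p


def modulus(x, p):
    """Textbook modulus transform; defined for any real x."""
    sign = math.copysign(1.0, x)
    if p == 0:
        return sign * math.log(abs(x) + 1)
    return sign * ((abs(x) + 1) ** p - 1) / p


print(boxcox(4, 0.5))    # 2.0
print(modulus(3, 0.5))   # 2.0
print(modulus(-3, 0.5))  # -2.0
```

The modulus form works on the magnitude of x and restores the sign afterwards, which is why it handles negative inputs where Box-Cox cannot.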

v0.5.4

(2019-03-26)

Enhancements

  • mizani.formatters.log_format now does a better job of approximating labels for numbers like 3.000000000000001e-05.

API Changes

  • exponent_threshold parameter of mizani.formatters.log_format has been deprecated.

v0.5.3

(2018-12-24)

API Changes

  • Log transforms now default to base - 2 minor breaks. So base 10 has 8 minor breaks and 9 partitions, base 8 has 6 minor breaks and 7 partitions, ..., base 2 has 0 minor breaks and a single partition.
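
The arithmetic behind "base - 2" can be checked with a short sketch. It assumes minor breaks are equally spaced in data space between two consecutive major breaks (powers of the base); minor_breaks_in_decade is a hypothetical helper written for this example, not a Mizani function.

```python
def minor_breaks_in_decade(base, exponent=0):
    """Hypothetical helper: minor breaks strictly between base**exponent
    and base**(exponent + 1), equally spaced in data space."""
    low = base ** exponent
    return [k * low for k in range(2, base)]


for base in (10, 8, 2):
    minors = minor_breaks_in_decade(base)
    partitions = len(minors) + 1
    print(base, len(minors), partitions)
# 10 8 9
# 8 6 7
# 2 0 1
```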

v0.5.2

(2018-10-17)

Bug Fixes

  • Fixed issue where some functions that took pandas series would return output where the index did not match that of the input.

v0.5.1

(2018-10-15)

Bug Fixes

  • Fixed issue with log_breaks, so that it does not fail needlessly when the limits are in the (0, 1) range.

Enhancements

  • Changed log_format to return better formatted breaks.

v0.5.0

(2018-11-10)

API Changes

  • Support for python 2 has been removed.
  • __call__() and mizani.breaks.trans_minor_breaks.__call__() now accept an optional parameter n, which is the number of minor breaks between any two major breaks.

  • The parameter nan_value has been renamed to na_value.
  • The parameter nan_rm has been renamed to na_rm.

Enhancements

  • Better support for handling missing values when training discrete scales.
  • Changed the algorithm for log_breaks; it can now return breaks that do not fall on the integer powers of the base.

v0.4.6

(2018-03-20)

  • Added squish

v0.4.5

(2018-03-09)

  • Added identity_pal
  • Added cmap_d_pal

v0.4.4

(2017-12-13)

  • Fixed date_format to respect the timezones of the dates (#8).

v0.4.3

(2017-12-01)

  • Changed date_breaks to have more variety in the spacing between the breaks.
  • Fixed date_format to respect time part of the date (#7).

v0.4.2

(2017-11-06)

  • Fixed (regression) break calculation for the non ordinal transforms.

v0.4.1

(2017-11-04)

  • trans objects can now be instantiated with parameters that override attributes of the instance. The default methods for computing breaks and minor breaks on the transform instance are not class attributes, so they can be modified without global repercussions.
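
As a rough illustration of overriding attributes per instance without touching the class, consider the sketch below. demo_trans and its attributes domain and n_breaks are hypothetical names invented for this example; they are not part of Mizani's API.

```python
class demo_trans:
    """Hypothetical transform-like class with class-level defaults."""

    domain = (float("-inf"), float("inf"))
    n_breaks = 5

    def __init__(self, **overrides):
        for name, value in overrides.items():
            # Only allow overriding attributes the class already defines
            if not hasattr(type(self), name):
                raise AttributeError(f"unknown attribute: {name!r}")
            # Stored on this instance only; the class default is unchanged
            setattr(self, name, value)


t = demo_trans(n_breaks=3)
print(t.n_breaks)           # 3
print(demo_trans.n_breaks)  # 5
```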

v0.4.0

(2017-10-24)

API Changes

  • Breaks and formatter generating functions have been converted to classes, with a __call__ method. How they are used has not changed, but this makes them more flexible.
  • ExtendedWilkson class has been removed. extended_breaks() now contains the implementation of the break calculating algorithm.
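
The pattern of a class with a __call__ method can be sketched as follows. even_breaks is a hypothetical class written for this example and does not exist in Mizani; it only shows how configuration is held on the instance while the call syntax stays the same as with a function that returns a function.

```python
class even_breaks:
    """Hypothetical breaks generator: n evenly spaced breaks over the limits."""

    def __init__(self, n=5):
        self.n = n

    def __call__(self, limits):
        low, high = limits
        # Degenerate cases: return a single break
        if self.n < 2 or low == high:
            return [low]
        step = (high - low) / (self.n - 1)
        return [low + i * step for i in range(self.n)]


breaks = even_breaks(n=3)
print(breaks((0, 10)))  # [0.0, 5.0, 10.0]
```

Because the configuration lives on the instance, it can be inspected or subclassed, which a closure does not allow as easily.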

v0.3.4

(2017-09-12)

  • Fixed issue where some formatter methods failed if passed an empty breaks argument.
  • Fixed issue with log_breaks() where if the limits were within the same order of magnitude the calculated breaks were always the ends of the order of magnitude.

    Now log_breaks()((35, 50)) returns [35,  40,  45,  50] as breaks instead of [1, 100].

v0.3.3

(2017-08-30)

  • Fixed SettingWithCopyWarnings in squish_infinite().
  • Added log_format().

API Changes

  • log_trans now uses log_format() as the formatting method.

v0.3.2

(2017-07-14)

  • Added expand_range_distinct()

v0.3.1

(2017-06-22)

  • Fixed bug where using log_breaks() with Numpy 1.13.0 led to a ValueError.

v0.3.0

(2017-04-24)

  • Added xkcd_palette(), a palette that selects from 954 named colors.
  • Added crayon_palette(), a palette that selects from 163 named colors.
  • Added cubehelix_pal(), a function that creates a continuous palette from the cubehelix system.
  • Fixed bug where a color palette would raise an exception when passed a single scalar value instead of a list-like.
  • extended_breaks() and mpl_breaks() now return a single break if the limits are equal. Previously, one ran into an Overflow and the other returned a sequence filled with n copies of the same limit.

API Changes

  • mpl_breaks() now returns a function that (strictly) expects a tuple with the minimum and maximum values.

v0.2.0

(2017-01-27)

  • Fixed bug in censor() where a sequence of values with an irregular index would lead to an exception.
  • Fixed boundary issues due to internal loss of precision in ported function seq().
  • Added mizani.breaks.extended_breaks() which computes breaks using a modified version of Wilkinson's tick algorithm.
  • Changed the default function mizani.transforms.trans.breaks_() used by mizani.transforms.trans to compute breaks from mizani.breaks.mpl_breaks() to mizani.breaks.extended_breaks().
  • mizani.breaks.timedelta_breaks() now uses mizani.breaks.extended_breaks() internally instead of mizani.breaks.mpl_breaks().
  • Added manual palette function mizani.palettes.manual_pal().
  • Requires pandas version 0.19.0 or higher.

v0.1.0

(2016-06-30)

First public release

Author

Hassan Kibirige

Info

Sep 05, 2024 0.12.2 Mizani