mizani - Man Page
Name
mizani — Mizani Documentation
Mizani is a Python library that provides the pieces necessary to create scales for a graphics system. It is based on the R Scales package.
Contents
bounds - Limiting data values for a palette
Continuous variables have values anywhere in the range minus infinity to plus infinity. However, when creating a visual representation of these values, what usually matters is the relative difference between the values. This is where rescaling comes into play.
The values are mapped onto a range that a scale can deal with. For graphical representation that range tends to be [0, 1] or [0, n], where n is some number that makes the plotted object overflow the plotting area.
Although a scale may be able to handle the [0, n] range, it may be desirable to have a lower bound greater than zero. For example, if data values get mapped to zero on a scale whose graphical representation is the size/area/radius/length, some data will be invisible. The solution is to restrict the lower bound, e.g. [0.1, 1]. Similarly, you can restrict the upper bound using these functions.
- mizani.bounds.censor(x, range=(0, 1), only_finite=True)
Convert any values outside of range to a NULL type object.
- Parameters
- x
numpy:array_like Values to manipulate
- range
python:tuple (min, max) giving desired output range
- only_finite
bool If True (the default), will only modify finite values.
- Returns
- x
numpy:array_like Censored array
Notes
All values in x should be of the same type. only_finite parameter is not considered for Datetime and Timedelta types.
The NULL type object depends on the type of values in x.
- float - float('nan')
- int - float('nan')
- datetime.datetime : np.datetime64(NaT)
- datetime.timedelta : np.timedelta64(NaT)
Examples
>>> a = [1, 2, np.inf, 3, 4, -np.inf, 5]
>>> censor(a, (0, 10))
[1, 2, inf, 3, 4, -inf, 5]
>>> censor(a, (0, 10), False)
[1, 2, nan, 3, 4, nan, 5]
>>> censor(a, (2, 4))
[nan, 2, inf, 3, 4, -inf, nan]
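The censoring rule can be illustrated without NumPy. The following is a minimal pure-Python sketch for numeric input only; it is not Mizani's implementation, and the name censor_sketch is hypothetical.

```python
import math


def censor_sketch(values, limits=(0, 1), only_finite=True):
    """Replace numbers outside ``limits`` with NaN (numeric input only)."""
    low, high = limits
    out = []
    for v in values:
        if only_finite and math.isinf(v):
            # Infinite values are left untouched when only_finite is True.
            out.append(v)
        elif v < low or v > high:
            out.append(math.nan)
        else:
            out.append(v)
    return out
```

Because NaN never compares equal to itself, results are best checked with math.isnan() rather than with ==.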
- mizani.bounds.expand_range(range, mul=0, add=0, zero_width=1)
Expand a range with a multiplicative or additive constant
- Parameters
- range
python:tuple Range of data. Size 2.
- mul
python:int | python:float Multiplicative constant
- add
python:int | python:float | timedelta Additive constant
- zero_width
python:int | python:float | timedelta Distance to use if range has zero width
- Returns
- out
python:tuple Expanded range
Notes
If expanding datetime or timedelta types, add and zero_width must be suitable timedeltas, i.e. you should not mix types between Numpy, Pandas and the datetime module.
Examples
>>> expand_range((3, 8))
(3, 8)
>>> expand_range((0, 10), mul=0.1)
(-1.0, 11.0)
>>> expand_range((0, 10), add=2)
(-2, 12)
>>> expand_range((0, 10), mul=.1, add=2)
(-3.0, 13.0)
>>> expand_range((0, 1))
(0, 1)
When the range has zero width
>>> expand_range((5, 5))
(4.5, 5.5)
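The arithmetic behind these results can be sketched as follows. This is an illustration for plain numbers under the assumption that a zero-width range is detected by exact equality; it is not Mizani's implementation, and expand_range_sketch is a hypothetical name.

```python
def expand_range_sketch(limits, mul=0, add=0, zero_width=1):
    """Pad both ends of ``limits`` by ``width * mul + add``."""
    low, high = limits
    width = high - low
    if width == 0:
        # Degenerate range: centre a window of size zero_width on the value.
        return (low - zero_width / 2, high + zero_width / 2)
    pad = width * mul + add
    return (low - pad, high + pad)
```

For (0, 10) with mul=0.1 and add=2 the padding is 10 * 0.1 + 2 = 3, which gives (-3.0, 13.0).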
- mizani.bounds.rescale(x, to=(0, 1), _from=None)
Rescale numeric vector to have specified minimum and maximum.
- Parameters
- x
numpy:array_like | numeric 1D vector of values to manipulate.
- to
python:tuple output range (numeric vector of length two)
- _from
python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x
- Returns
- out
numpy:array_like Rescaled values
Examples
>>> x = [0, 2, 4, 6, 8, 10]
>>> rescale(x)
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
>>> rescale(x, to=(0, 2))
array([0. , 0.4, 0.8, 1.2, 1.6, 2. ])
>>> rescale(x, to=(0, 2), _from=(0, 20))
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
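Rescaling is a linear map from the input range onto the output range. A minimal pure-Python sketch, assuming numeric input and returning a list instead of an array, might look like this; rescale_sketch and from_ are hypothetical names, and the handling of a zero-width input range is an assumption of the sketch.

```python
def rescale_sketch(values, to=(0, 1), from_=None):
    """Linearly map ``values`` from the range ``from_`` onto ``to``."""
    if from_ is None:
        from_ = (min(values), max(values))
    f0, f1 = from_
    t0, t1 = to
    if f1 == f0:
        # Assumption: with no spread in the input, use the middle of ``to``.
        return [(t0 + t1) / 2 for _ in values]
    return [(v - f0) / (f1 - f0) * (t1 - t0) + t0 for v in values]
```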
- mizani.bounds.rescale_max(x, to=(0, 1), _from=None)
Rescale numeric vector to have specified maximum.
- Parameters
- x
numpy:array_like | numeric 1D vector of values to manipulate.
- to
python:tuple output range (numeric vector of length two)
- _from
python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x. Only the 2nd (max) element is essential to the output.
- Returns
- out
numpy:array_like Rescaled values
Examples
>>> x = [0, 2, 4, 6, 8, 10]
>>> rescale_max(x, (0, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])
Only the 2nd (max) element of the parameters to and _from is essential to the output.
>>> rescale_max(x, (1, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])
>>> rescale_max(x, (0, 20))
array([ 0., 4., 8., 12., 16., 20.])
If max(x) < _from[1] then values will be scaled beyond the requested maximum (to[1]).
>>> rescale_max(x, to=(1, 3), _from=(-1, 6))
array([0., 1., 2., 3., 4., 5.])
If the values are the same, they take on the requested maximum. This includes an array of all zeros.
>>> rescale_max([5, 5, 5])
array([1., 1., 1.])
>>> rescale_max([0, 0, 0])
array([1, 1, 1])
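The behaviour shown above amounts to multiplying every value by to[1] / _from[1]. The following pure-Python sketch reproduces the examples under that assumption; it is not Mizani's implementation, and rescale_max_sketch and from_ are hypothetical names.

```python
def rescale_max_sketch(values, to=(0, 1), from_=None):
    """Scale ``values`` so that the input maximum maps onto ``to[1]``."""
    from_max = max(values) if from_ is None else from_[1]
    to_max = to[1]
    if all(v == 0 for v in values):
        # An all-zero input takes on the requested maximum.
        return [to_max for _ in values]
    # Only the upper elements of ``to`` and ``from_`` are used.
    return [v * to_max / from_max for v in values]
```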
- mizani.bounds.rescale_mid(x, to=(0, 1), _from=None, mid=0)
Rescale numeric vector to have specified minimum, midpoint, and maximum.
- Parameters
- x
numpy:array_like | numeric 1D vector of values to manipulate.
- to
python:tuple output range (numeric vector of length two)
- _from
python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x
- mid
numeric mid-point of input range
- Returns
- out
numpy:array_like Rescaled values
Examples
>>> rescale_mid([1, 2, 3], mid=1)
array([0.5 , 0.75, 1. ])
>>> rescale_mid([1, 2, 3], mid=2)
array([0. , 0.5, 1. ])
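One way to obtain these numbers is to scale symmetrically around mid, so that mid always lands on the centre of the output range. The sketch below follows that approach, which is an assumption modelled on the examples rather than a copy of Mizani's code; rescale_mid_sketch and from_ are hypothetical names.

```python
def rescale_mid_sketch(values, to=(0, 1), from_=None, mid=0):
    """Map ``values`` onto ``to`` with ``mid`` landing on the centre of ``to``."""
    if from_ is None:
        from_ = (min(values), max(values))
    # Twice the larger distance from mid to either end of the input range.
    extent = 2 * max(abs(from_[0] - mid), abs(from_[1] - mid))
    centre = (to[0] + to[1]) / 2
    if extent == 0:
        # Assumption: with no spread around mid, everything sits at the centre.
        return [centre for _ in values]
    span = to[1] - to[0]
    return [(v - mid) / extent * span + centre for v in values]
```

With mid=1 the input [1, 2, 3] only occupies the upper half of the output range, which is why the first example starts at 0.5.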
- mizani.bounds.squish_infinite(x, range=(0, 1))
Truncate infinite values to a range.
- Parameters
- x
numpy:array_like Values that should have infinities squished.
- range
python:tuple The range onto which to squish the infinites. Must be of size 2.
- Returns
- out
numpy:array_like Values with infinites squished.
Examples
>>> squish_infinite([0, .5, .25, np.inf, .44])
[0.0, 0.5, 0.25, 1.0, 0.44]
>>> squish_infinite([0, -np.inf, .5, .25, np.inf], (-10, 9))
[0.0, -10.0, 0.5, 0.25, 9.0]
- mizani.bounds.zero_range(x, tol=2.220446049250313e-14)
Determine if range of vector is close to zero.
- Parameters
- x
numpy:array_like | numeric Value(s) to check. If it is an array_like, it should be of length 2.
- tol
python:float Tolerance. Default tolerance is the machine epsilon times 10^2.
- Returns
- out
bool Whether x has zero range.
Examples
>>> zero_range([1, 1])
True
>>> zero_range([1, 2])
False
>>> zero_range([1, 2], tol=2)
True
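The third example suggests that the tolerance is applied to a relative difference. A pure-Python sketch of such a test for a pair of numbers is given below; the exact rule (difference divided by the smaller absolute endpoint) is an assumption of the sketch, and zero_range_sketch is a hypothetical name.

```python
import math


def zero_range_sketch(limits, tol=2.220446049250313e-14):
    """Return True if the two numbers in ``limits`` are relatively close."""
    low, high = min(limits), max(limits)
    if low == high:
        return True
    if math.isinf(low) or math.isinf(high):
        return False
    smallest = min(abs(low), abs(high))
    if smallest == 0:
        # A relative difference is undefined when one endpoint is zero.
        return False
    return (high - low) / smallest < tol
```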
- mizani.bounds.expand_range_distinct(range, expand=(0, 0, 0, 0), zero_width=1)
Expand a range with multiplicative or additive constants
Similar to expand_range() but both sides of the range are expanded using different constants
- Parameters
- range
python:tuple Range of data. Size 2
- expand
python:tuple Length 2 or 4. If length is 2, then the same constants are used for both sides. If length is 4, then the first two are the multiplicative (mul) and additive (add) constants for the lower limit, and the second two are the constants for the upper limit.
- zero_width
python:int | python:float | timedelta Distance to use if range has zero width
- Returns
- out
python:tuple Expanded range
Examples
>>> expand_range_distinct((3, 8))
(3, 8)
>>> expand_range_distinct((0, 10), (0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0, 0))
(-1.0, 10)
>>> expand_range_distinct((0, 10), (0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 2, 0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 0, 0, 2))
(0, 12)
>>> expand_range_distinct((0, 10), (.1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (.1, 2, .1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (0, 0, .1, 2))
(0, 13.0)
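The way the expand tuple is unpacked can be sketched in a few lines. This is an illustration for plain numbers, not Mizani's implementation; expand_range_distinct_sketch is a hypothetical name and the exact-equality check for a zero-width range is an assumption.

```python
def expand_range_distinct_sketch(limits, expand=(0, 0, 0, 0), zero_width=1):
    """Pad each end of ``limits`` with its own (mul, add) pair."""
    if len(expand) == 2:
        # A length-2 tuple applies the same constants to both ends.
        expand = tuple(expand) * 2
    low_mul, low_add, high_mul, high_add = expand
    low, high = limits
    width = high - low
    if width == 0:
        return (low - zero_width / 2, high + zero_width / 2)
    return (low - width * low_mul - low_add,
            high + width * high_mul + high_add)
```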
- mizani.bounds.squish(x, range=(0, 1), only_finite=True)
Squish values into range.
- Parameters
- x
numpy:array_like Values that should have out of range values squished.
- range
python:tuple The range onto which to squish the values.
- only_finite
bool When True, only squishes finite values.
- Returns
- out
numpy:array_like Values with out of range values squished.
Examples
>>> squish([-1.5, 0.2, 0.5, 0.8, 1.0, 1.2])
[0.0, 0.2, 0.5, 0.8, 1.0, 1.0]
>>> squish([-np.inf, -1.5, 0.2, 0.5, 0.8, 1.0, np.inf], only_finite=False)
[0.0, 0.0, 0.2, 0.5, 0.8, 1.0, 1.0]
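Squishing is a clamp to the nearest limit, in contrast to censor(), which discards out-of-range values. A minimal pure-Python sketch for numeric input follows; squish_sketch is a hypothetical name and this is not Mizani's implementation.

```python
import math


def squish_sketch(values, limits=(0, 1), only_finite=True):
    """Clamp out-of-range numbers to the nearest limit."""
    low, high = limits
    out = []
    for v in values:
        if only_finite and not math.isfinite(v):
            # Infinities (and NaN) pass through unchanged.
            out.append(v)
        else:
            out.append(min(max(v, low), high))
    return out
```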
breaks - Partitioning a scale for readability
All scales have a means by which the values that are mapped onto the scale are interpreted. Numeric digital scales put out numbers for direct interpretation, but most scales cannot do this. What they offer is named markers/ticks that aid in assessing the values, e.g. the common speedometer will have ticks and values to help gauge the speed of the vehicle.
The named markers are what we call breaks. Properly calculated breaks make interpretation straightforward. These functions provide ways to calculate good (hopefully) breaks.
- class mizani.breaks.mpl_breaks(*args, **kwargs)
Compute breaks using MPL's default locator
See MaxNLocator for the parameter descriptions
Examples
>>> x = range(10)
>>> limits = (0, 9)
>>> mpl_breaks()(limits)
array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> mpl_breaks(nbins=2)(limits)
array([ 0., 5., 10.])
- __call__(limits)
Compute breaks
- Parameters
- limits
python:tuple Minimum and maximum values
- Returns
- out
numpy:array_like Sequence of break points
- class mizani.breaks.log_breaks(n=5, base=10)
Integer breaks on log transformed scales
- Parameters
- n
python:int Desired number of breaks
- base
python:int Base of logarithm
Examples
>>> x = np.logspace(3, 6)
>>> limits = min(x), max(x)
>>> log_breaks()(limits)
array([ 1000, 10000, 100000, 1000000])
>>> log_breaks(2)(limits)
array([ 1000, 100000])
>>> log_breaks()([0.1, 1])
array([0.1, 0.3, 1. , 3. ])
- __call__(limits)
Compute breaks
- Parameters
- limits
python:tuple Minimum and maximum values
- Returns
- out
numpy:array_like Sequence of break points
- class mizani.breaks.minor_breaks(n=1)
Compute minor breaks
- Parameters
- n
python:int Number of minor breaks between the major breaks.
Examples
>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> minor_breaks()(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])
>>> minor_breaks()([1, 2], (1, 2))
array([1.5])
More than 1 minor break.
>>> minor_breaks(3)([1, 2], (1, 2))
array([1.25, 1.5 , 1.75])
>>> minor_breaks()([1, 2], (1, 2), 3)
array([1.25, 1.5 , 1.75])
- __call__(major, limits=None, n=None)
Minor breaks
- Parameters
- major
numpy:array_like Major breaks
- limits
numpy:array_like | python:None Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.
- n
python:int Number of minor breaks between the major breaks. If None, then self.n is used.
- Returns
- out
numpy:array_like Minor breaks
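The minor_breaks examples can be reproduced by placing n evenly spaced points in every gap between major breaks, including one extra gap beyond each end, and keeping only the points inside the limits. The pure-Python sketch below follows that idea; it assumes evenly spaced major breaks, is not Mizani's implementation, and minor_breaks_sketch is a hypothetical name.

```python
def minor_breaks_sketch(major, limits=None, n=1):
    """Place ``n`` evenly spaced minor breaks in each gap between majors."""
    major = sorted(major)
    if len(major) < 2:
        return []
    if limits is None:
        limits = (major[0], major[-1])
    # Assumption: the major breaks are evenly spaced.
    step = major[1] - major[0]
    # Extend by one step on each side to cover the space up to the limits.
    extended = [major[0] - step] + major + [major[-1] + step]
    out = []
    for low, high in zip(extended, extended[1:]):
        for k in range(1, n + 1):
            value = low + (high - low) * k / (n + 1)
            if limits[0] <= value <= limits[1]:
                out.append(value)
    return out
```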
- class mizani.breaks.trans_minor_breaks(trans, n=1)
Compute minor breaks for transformed scales
The minor breaks are computed in data space. This, together with major breaks computed in transform space, reveals the nonlinearity of a scale. See the log transforms created with log_trans(), like log10_trans.
- Parameters
- trans
trans or type Trans object or trans class.
- n
python:int Number of minor breaks between the major breaks.
Examples
>>> from mizani.transforms import sqrt_trans
>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> sqrt_trans().minor_breaks(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])
>>> class sqrt_trans2(sqrt_trans):
...     def __init__(self):
...         self.minor_breaks = trans_minor_breaks(sqrt_trans2)
>>> sqrt_trans2().minor_breaks(major, limits)
array([1.58113883, 2.54950976, 3.53553391])
More than 1 minor break
>>> major = [1, 10]
>>> limits = [1, 10]
>>> sqrt_trans().minor_breaks(major, limits, 4)
array([2.8, 4.6, 6.4, 8.2])
- __call__(major, limits=None, n=None)
Minor breaks for transformed scales
- Parameters
- major
numpy:array_like Major breaks
- limits
numpy:array_like | python:None Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.
- n
python:int Number of minor breaks between the major breaks. If None, then self.n is used.
- Returns
- out
numpy:array_like Minor breaks
- class mizani.breaks.date_breaks(width=None)
Regularly spaced dates
- Parameters
- width
python:str | python:None An interval specification, e.g. '4 year'. The unit must be one of [second, minute, hour, day, week, month, year]. If None, the interval is determined automatically.
Examples
>>> from datetime import datetime
>>> x = [datetime(year, 1, 1) for year in [2010, 2026, 2015]]
Default breaks will be regularly spaced but the spacing is automatically determined
>>> limits = min(x), max(x)
>>> breaks = date_breaks()
>>> [d.year for d in breaks(limits)]
[2010, 2012, 2014, 2016, 2018, 2020, 2022, 2024, 2026]
Breaks at 4 year intervals
>>> breaks = date_breaks('4 year')
>>> [d.year for d in breaks(limits)]
[2008, 2012, 2016, 2020, 2024, 2028]
- __call__(limits)
Compute breaks
- Parameters
- limits
python:tuple Minimum and maximum datetime.datetime values.
- Returns
- out
numpy:array_like Sequence of break points.
- class mizani.breaks.timedelta_breaks(n=5, Q=(1, 2, 5, 10))
Timedelta breaks
- Returns
- out
python:callable() f(limits) A function that takes a sequence of two datetime.timedelta values and returns a sequence of break points.
Examples
>>> from datetime import timedelta
>>> breaks = timedelta_breaks()
>>> x = [timedelta(days=i*365) for i in range(25)]
>>> limits = min(x), max(x)
>>> major = breaks(limits)
>>> [val.total_seconds()/(365*24*60*60) for val in major]
[0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
- __call__(limits)
Compute breaks
- Parameters
- limits
python:tuple Minimum and maximum datetime.timedelta values.
- Returns
- out
numpy:array_like Sequence of break points.
- class mizani.breaks.extended_breaks(n=5, Q=[1, 5, 2, 2.5, 4, 3], only_inside=False, w=[0.25, 0.2, 0.5, 0.05])
An extension of Wilkinson's tick position algorithm
- Parameters
- n
python:int Desired number of ticks
- Q
python:list List of nice numbers
- only_inside
bool If True, then all the ticks will be within the given range.
- w
python:list Weights applied to the four optimization components (simplicity, coverage, density, and legibility). They should add up to 1.
References
- Talbot, J., Lin, S., Hanrahan, P. (2010) An Extension of Wilkinson's Algorithm for Positioning Tick Labels on Axes, InfoVis 2010.
Additional Credit to Justin Talbot on whose code this implementation is almost entirely based.
Examples
>>> limits = (0, 9)
>>> extended_breaks()(limits)
array([ 0. , 2.5, 5. , 7.5, 10. ])
>>> extended_breaks(n=6)(limits)
array([ 0., 2., 4., 6., 8., 10.])
- __call__(limits)
Calculate the breaks
- Parameters
- limits
array Minimum and maximum values.
- Returns
- out
numpy:array_like Sequence of break points.
formatters - Labelling breaks
Scales have guides and these are what help users make sense of the data mapped onto the scale. Common examples of guides include the x-axis, the y-axis, the keyed legend and a colorbar legend. The guides have demarcations (breaks), some of which must be labelled.
The *_format functions below create functions that convert data values as understood by a specific scale and return string representations of those values. Manipulating the string representation of a value helps improve readability of the guide.
- class mizani.formatters.comma_format(digits=0)
Format number with commas separating thousands
- Parameters
- digits
python:int Number of digits after the decimal point.
Examples
>>> comma_format()([1000, 2, 33000, 400])
['1,000', '2', '33,000', '400']
- __call__(x)
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
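The labels produced by comma_format can be approximated with Python's format specification mini-language, where "," requests a thousands separator. The helper below is a hypothetical sketch, not Mizani's implementation.

```python
def comma_labels(values, digits=0):
    """Format numbers with a comma as the thousands separator."""
    # "," inserts the separator; ".{digits}f" fixes the number of decimals.
    return [f"{v:,.{digits}f}" for v in values]
```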
- class mizani.formatters.custom_format(fmt='{}', style='new')
Custom format
- Parameters
- fmt
python:str, optional Format string. Default is the generic new style format braces, {}.
- style
'new' | 'old' Whether to use new style or old style formatting. New style uses the str.format() while old style uses %. The format string must be written accordingly.
Examples
>>> formatter = custom_format('{:.2f} USD')
>>> formatter([3.987, 2, 42.42])
['3.99 USD', '2.00 USD', '42.42 USD']
- __call__(x)
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
- class mizani.formatters.currency_format(prefix='$', suffix='', digits=2, big_mark='')
Currency formatter
- Parameters
- prefix
python:str What to put before the value.
- suffix
python:str What to put after the value.
- digits
python:int Number of digits after the decimal point
- big_mark
python:str The thousands separator. This is usually a comma or a dot.
Examples
>>> x = [1.232, 99.2334, 4.6, 9, 4500]
>>> currency_format()(x)
['$1.23', '$99.23', '$4.60', '$9.00', '$4500.00']
>>> currency_format('C$', digits=0, big_mark=',')(x)
['C$1', 'C$99', 'C$5', 'C$9', 'C$4,500']
- __call__(x)
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
- mizani.formatters.dollar_format
alias of currency_format
- class mizani.formatters.percent_format(use_comma=False)
Percent formatter
Multiply by one hundred and display percent sign
- Parameters
- use_comma
bool If True, use a comma to separate the thousands. Default is False.
Examples
>>> formatter = percent_format()
>>> formatter([.45, 9.515, .01])
['45%', '952%', '1%']
>>> formatter([.654, .8963, .1])
['65.4%', '89.6%', '10.0%']
- __call__(x)
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
- class mizani.formatters.scientific_format(digits=3)
Scientific formatter
- Parameters
- digits
python:int Significant digits.
Notes
Be careful when using many digits (15+ on a 64 bit computer). Consider the machine epsilon.
Examples
>>> x = [.12, .23, .34, 45]
>>> scientific_format()(x)
['1.2e-01', '2.3e-01', '3.4e-01', '4.5e+01']
- __call__(x)
Call self as a function.
- class mizani.formatters.date_format(fmt='%Y-%m-%d', tz=None)
Datetime formatter
- Parameters
- fmt
python:str Format string. See strftime.
- tz
datetime.tzinfo, optional Time zone information. If none is specified, the time zone will be that of the first date. If the first date has no time information then a time zone is chosen by other means.
Examples
>>> from datetime import datetime
>>> x = [datetime(x, 1, 1) for x in [2010, 2014, 2018, 2022]]
>>> date_format()(x)
['2010-01-01', '2014-01-01', '2018-01-01', '2022-01-01']
>>> date_format('%Y')(x)
['2010', '2014', '2018', '2022']
Can format time
>>> x = [datetime(2017, 12, 1, 16, 5, 7)]
>>> date_format("%Y-%m-%d %H:%M:%S")(x)
['2017-12-01 16:05:07']
Time zones are respected
>>> from zoneinfo import ZoneInfo
>>> UTC = ZoneInfo('UTC')
>>> UG = ZoneInfo('Africa/Kampala')
>>> x = [datetime(2010, 1, 1, i) for i in [8, 15]]
>>> x_tz = [datetime(2010, 1, 1, i, tzinfo=UG) for i in [8, 15]]
>>> date_format('%Y-%m-%d %H:%M')(x)
['2010-01-01 08:00', '2010-01-01 15:00']
>>> date_format('%Y-%m-%d %H:%M')(x_tz)
['2010-01-01 08:00', '2010-01-01 15:00']
Format with a specific time zone
>>> date_format('%Y-%m-%d %H:%M', tz=UTC)(x_tz)
['2010-01-01 05:00', '2010-01-01 12:00']
>>> date_format('%Y-%m-%d %H:%M', tz='EST')(x_tz)
['2010-01-01 00:00', '2010-01-01 07:00']
- __call__(x)
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
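The effect of the tz parameter of date_format can be approximated with the standard library alone: convert each aware datetime to the target zone, then apply strftime. In the sketch below, format_dates is a hypothetical helper and a fixed UTC+3 offset stands in for the Africa/Kampala zone so that the code does not depend on a time zone database.

```python
from datetime import datetime, timedelta, timezone


def format_dates(dates, fmt="%Y-%m-%d", tz=None):
    """Format datetimes, converting aware values to ``tz`` when given."""
    out = []
    for d in dates:
        if tz is not None and d.tzinfo is not None:
            d = d.astimezone(tz)
        out.append(d.strftime(fmt))
    return out


# Assumption: a fixed UTC+3 offset used in place of Africa/Kampala.
EAT = timezone(timedelta(hours=3))
x_tz = [datetime(2010, 1, 1, hour, tzinfo=EAT) for hour in (8, 15)]
local_labels = format_dates(x_tz, "%Y-%m-%d %H:%M")
utc_labels = format_dates(x_tz, "%Y-%m-%d %H:%M", tz=timezone.utc)
```

Here local_labels keeps the wall-clock times 08:00 and 15:00, while utc_labels shifts them back by three hours.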
- class mizani.formatters.mpl_format
Format using MPL formatter for scalars
Examples
>>> mpl_format()([.654, .8963, .1])
['0.6540', '0.8963', '0.1000']
- __call__(x)
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
- class mizani.formatters.log_format(base=10, exponent_limits=(-4, 4), mathtex=False)
Log Formatter
- Parameters
- base
python:int Base of the logarithm. Default is 10.
- exponent_limits
python:tuple Limits (int, int). If any of the powers of the numbers falls outside these limits, the labels will be in exponent form. This only applies for base 10.
- mathtex
bool If True, return the labels in mathtex format as understood by Matplotlib.
Examples
>>> log_format()([0.001, 0.1, 100])
['0.001', '0.1', '100']
>>> log_format()([0.0001, 0.1, 10000])
['1e-4', '1e-1', '1e4']
>>> log_format(mathtex=True)([0.0001, 0.1, 10000])
['$10^{-4}$', '$10^{-1}$', '$10^{4}$']
- __call__(x)
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
- class mizani.formatters.timedelta_format(units=None, add_units=True, usetex=False)
Timedelta formatter
- Parameters
- units
python:str, optional The units in which the breaks will be computed. If None, they are decided automatically. Otherwise, the value should be one of:
'ns' # nanoseconds
'us' # microseconds
'ms' # milliseconds
's' # seconds
'm' # minute
'h' # hour
'd' # day
'w' # week
'M' # month
'y' # year
- add_units
bool Whether to append the units identifier string to the values.
- usetex
bool If True, the microseconds identifier string is rendered with the Greek letter mu. Default is False.
Examples
>>> from datetime import timedelta
>>> x = [timedelta(days=31*i) for i in range(5)]
>>> timedelta_format()(x)
['0', '1 month', '2 months', '3 months', '4 months']
>>> timedelta_format(units='d')(x)
['0', '31 days', '62 days', '93 days', '124 days']
>>> timedelta_format(units='d', add_units=False)(x)
['0', '31', '62', '93', '124']
- __call__(x)
Call self as a function.
- class mizani.formatters.pvalue_format(accuracy=0.001, add_p=False)
p-values Formatter
- Parameters
- accuracy
python:float Number to round to
- add_p
bool Whether to prepend "p=" or "p<" to the output
Examples
>>> x = [.90, .15, .015, .009, 0.0005]
>>> pvalue_format()(x)
['0.9', '0.15', '0.015', '0.009', '<0.001']
>>> pvalue_format(0.1)(x)
['0.9', '0.1', '<0.1', '<0.1', '<0.1']
>>> pvalue_format(0.1, True)(x)
['p=0.9', 'p=0.1', 'p<0.1', 'p<0.1', 'p<0.1']
- __call__(x)
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
- class mizani.formatters.ordinal_format(prefix='', suffix='', big_mark='')
Ordinal Formatter
- Parameters
- prefix
python:str What to put before the value.
- suffix
python:str What to put after the value.
- big_mark
python:str The thousands separator. This is usually a comma or a dot.
Examples
>>> ordinal_format()(range(8))
['0th', '1st', '2nd', '3rd', '4th', '5th', '6th', '7th']
>>> ordinal_format(suffix=' Number')(range(11, 15))
['11th Number', '12th Number', '13th Number', '14th Number']
- __call__(x)
Call self as a function.
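The English ordinal rule behind the ordinal_format examples is short: numbers ending in 11, 12 or 13 take "th"; otherwise the last digit selects "st", "nd", "rd" or "th". The helper below is a hypothetical pure-Python sketch for non-negative integers, not Mizani's implementation.

```python
def ordinal_label(n, prefix="", suffix="", big_mark=""):
    """Return ``n`` with its English ordinal ending, e.g. 2 -> '2nd'."""
    if 11 <= n % 100 <= 13:
        # 11th, 12th and 13th are exceptions to the last-digit rule.
        ending = "th"
    else:
        ending = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    # Format with commas, then swap in the requested thousands separator.
    number = f"{n:,}".replace(",", big_mark)
    return f"{prefix}{number}{ending}{suffix}"
```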
- class mizani.formatters.number_bytes_format(symbol='auto', units='binary', fmt='{:.0f} ')
Bytes Formatter
- Parameters
- symbol
python:str Valid symbols are "B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", and "YB" for SI units, and the "iB" variants for binary units. Default is "auto" where the symbol to be used is determined separately for each value of x.
- units
"binary" | "si" Which unit base to use, 1024 for "binary" or 1000 for "si".
- fmt
python:str, optional Format string. Default is '{:.0f} '.
Examples
>>> x = [1000, 1000000, 4e5]
>>> number_bytes_format()(x)
['1000 B', '977 KiB', '391 KiB']
>>> number_bytes_format(units='si')(x)
['1 kB', '1 MB', '400 kB']
- __call__(x)
Call self as a function.
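The automatic symbol selection of number_bytes_format can be pictured as repeated division by the unit base (1024 or 1000) until the value drops below it. The following sketch reproduces the examples under that assumption; bytes_label is a hypothetical name, the symbol lists are truncated for brevity, and this is not Mizani's implementation.

```python
def bytes_label(value, units="binary"):
    """Label a byte count with the largest unit that keeps the value >= 1."""
    if units == "binary":
        base, symbols = 1024, ["B", "KiB", "MiB", "GiB", "TiB"]
    else:
        base, symbols = 1000, ["B", "kB", "MB", "GB", "TB"]
    power = 0
    # Divide by the base until the value fits the current unit.
    while value >= base and power < len(symbols) - 1:
        value /= base
        power += 1
    return f"{value:.0f} {symbols[power]}"
```

For example, 1000000 bytes is 976.5625 KiB in binary units, which rounds to the '977 KiB' label shown above.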
palettes - Mapping values onto the domain of a scale
Palettes are the link between data values and the values along the dimension of a scale. Before a collection of values can be represented on a scale, they are transformed by a palette. This transformation is known as mapping. Values are mapped onto a scale by a palette.
Scales tend to have restrictions on the magnitude of quantities that they can intelligibly represent. For example, the size of a point should be significantly smaller than the plot panel onto which it is plotted or else it would be hard to compare two or more points. Therefore palettes must be created that enforce such restrictions. This is the reason for the *_pal functions that create and return the actual palette functions.
- mizani.palettes.hls_palette(n_colors=6, h=0.01, l=0.6, s=0.65)
Get a set of evenly spaced colors in HLS hue space.
h, l, and s should be between 0 and 1
- Parameters
- n_colors
python:int number of colors in the palette
- h
python:float first hue
- l
python:float lightness
- s
python:float saturation
- Returns
- palette
python:list List of colors as RGB hex strings.
SEE ALSO:
- husl_palette
Make a palette using evenly spaced circular hues in the HUSL system.
Examples
>>> len(hls_palette(2))
2
>>> len(hls_palette(9))
9
- mizani.palettes.husl_palette(n_colors=6, h=0.01, s=0.9, l=0.65)
Get a set of evenly spaced colors in HUSL hue space.
h, s, and l should be between 0 and 1
- Parameters
- n_colors
python:int number of colors in the palette
- h
python:float first hue
- s
python:float saturation
- l
python:float lightness
- Returns
- palette
python:list List of colors as RGB hex strings.
SEE ALSO:
- hls_palette
Make a palette using evenly spaced circular hues in the HSL system.
Examples
>>> len(husl_palette(3))
3
>>> len(husl_palette(11))
11
- mizani.palettes.rescale_pal(range=(0.1, 1))
Rescale the input to the specific output range.
Useful for alpha, size, and continuous position.
- Parameters
- range
python:tuple Range of the scale
- Returns
- out
function Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.
Examples
>>> palette = rescale_pal()
>>> palette([0, .2, .4, .6, .8, 1])
array([0.1 , 0.28, 0.46, 0.64, 0.82, 1. ])
The returned palette expects inputs in the [0, 1] range. Any value outside those limits is clipped to range[0] or range[1].
>>> palette([-2, -1, 0.2, .4, .8, 2, 3])
array([0.1 , 0.1 , 0.28, 0.46, 0.82, 1. , 1. ])
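The palette-factory pattern used by the *_pal functions is a closure: the outer function stores the range and returns the function that does the mapping. A minimal pure-Python sketch of rescale_pal's behaviour, with the hypothetical name rescale_pal_sketch and a list in place of an array, could be:

```python
def rescale_pal_sketch(limits=(0.1, 1)):
    """Return a palette mapping inputs in [0, 1] linearly onto ``limits``."""
    low, high = limits

    def palette(values):
        out = []
        for v in values:
            # Inputs outside [0, 1] are clipped before mapping.
            v = min(max(v, 0), 1)
            out.append(low + v * (high - low))
        return out

    return palette
```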
- mizani.palettes.area_pal(range=(1, 6))
Point area palette (continuous).
- Parameters
- range
python:tuple Numeric vector of length two, giving range of possible sizes. Should be greater than 0.
- Returns
- out
function Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.
Examples
>>> x = np.arange(0, .6, .1)**2
>>> palette = area_pal()
>>> palette(x)
array([1. , 1.5, 2. , 2.5, 3. , 3.5])
The results are equidistant because the input x is in area space, i.e. it is squared.
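The equidistant results follow from taking the square root of the input before rescaling it onto the range. The sketch below illustrates that relationship in pure Python for inputs in [0, 1]; area_pal_sketch is a hypothetical name and this is not Mizani's implementation.

```python
import math


def area_pal_sketch(limits=(1, 6)):
    """Return a palette mapping area-like inputs in [0, 1] onto ``limits``."""
    low, high = limits

    def palette(values):
        # The square root converts an area back into a linear size.
        return [low + math.sqrt(v) * (high - low) for v in values]

    return palette
```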
- mizani.palettes.abs_area(max)
Point area palette (continuous), with area proportional to value.
- Parameters
- max
python:float A number representing the maximum size
- Returns
- out
function Palette function that takes a sequence of values in the range [0, 1] and returns values in the range [0, max].
Examples
>>> x = np.arange(0, .8, .1)**2
>>> palette = abs_area(5)
>>> palette(x)
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5])
Compared to area_pal(), abs_area() will handle values in the range [-1, 0] without returning np.nan, and values whose absolute value is greater than 1 will be clipped to the maximum.
- mizani.palettes.grey_pal(start=0.2, end=0.8)
Utility for creating continuous grey scale palette
- Parameters
- start
python:float grey value at low end of palette
- end
python:float grey value at high end of palette
- Returns
- out
function Continuous color palette that takes a single int parameter n and returns n equally spaced colors.
Examples
>>> palette = grey_pal()
>>> palette(5)
['#333333', '#737373', '#989898', '#b5b5b5', '#cccccc']
- mizani.palettes.hue_pal(h=0.01, l=0.6, s=0.65, color_space='hls')
Utility for making hue palettes for color schemes.
- Parameters
- h
python:float first hue. In the [0, 1] range
- l
python:float lightness. In the [0, 1] range
- s
python:float saturation. In the [0, 1] range
- color_space
'hls' | 'husl' Color space to use for the palette
- Returns
- out
function A discrete color palette that takes a single int parameter n and returns n equally spaced colors. Though the palette is continuous, since it varies the hue it is good for categorical data. However, if n is large enough the colors show continuity.
Examples
>>> hue_pal()(5)
['#db5f57', '#b9db57', '#57db94', '#5784db', '#c957db']
>>> hue_pal(color_space='husl')(5)
['#e0697e', '#9b9054', '#569d79', '#5b98ab', '#b675d7']
- mizani.palettes.brewer_pal(type: ColorScheme | ColorSchemeShort = 'seq', palette: int = 1, direction: Literal[1, -1] = 1)
Utility for making a brewer palette
- Parameters
- type
'sequential' | 'qualitative' | 'diverging' Type of palette. Sequential, Qualitative or Diverging. The following abbreviations may be used, seq, qual or div.
- palette
python:int | python:str Which palette to choose from. If it is an integer, it must be in the range [0, m], where m depends on the number of sequential, qualitative or diverging palettes. If it is a string, then it is the name of the palette.
- direction
python:int The order of colours in the scale. If -1 the order of colors is reversed. The default is 1.
- Returns
- out
function A color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.
Examples
>>> brewer_pal()(5)
['#EFF3FF', '#BDD7E7', '#6BAED6', '#3182BD', '#08519C']
>>> brewer_pal('qual')(5)
['#7FC97F', '#BEAED4', '#FDC086', '#FFFF99', '#386CB0']
>>> brewer_pal('qual', 2)(5)
['#1B9E77', '#D95F02', '#7570B3', '#E7298A', '#66A61E']
>>> brewer_pal('seq', 'PuBuGn')(5)
['#F6EFF7', '#BDC9E1', '#67A9CF', '#1C9099', '#016C59']
The available color names for each palette type can be obtained using the following code:
from mizani.colors.brewer import get_palette_names
print(get_palette_names("sequential"))
print(get_palette_names("qualitative"))
print(get_palette_names("diverging"))
- mizani.palettes.gradient_n_pal(colors, values=None, name='gradientn')
Create an n color gradient palette
- Parameters
- colors
python:list list of colors
- values
python:list, optional list of points in the range [0, 1] at which to place each color. Must be the same size as colors. Defaults to evenly spaced colors.
- name
python:str Name to call the resultant MPL colormap
- Returns
- out
function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].
Examples
>>> palette = gradient_n_pal(['red', 'blue'])
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#bf0040', '#7f0080', '#3f00c0', '#0000ff']
>>> palette([-np.inf, 0, np.nan, 1, np.inf])
[nan, '#ff0000', nan, '#0000ff', nan]
- mizani.palettes.cmap_pal(name, lut=None)
Create a continuous palette using an MPL colormap
- Parameters
- name
python:str Name of colormap
- lut
python:None | python:int This is the number of entries desired in the lookup table. Default is None, which leaves it up to Matplotlib.
- Returns
- out
function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].
Examples
>>> palette = cmap_pal('viridis')
>>> palette([.1, .2, .3, .4, .5])
['#482475', '#414487', '#355f8d', '#2a788e', '#21918c']
- mizani.palettes.cmap_d_pal(name, lut=None)
Create a discrete palette using an MPL Listed colormap
- Parameters
- name
python:str Name of colormap
- lut
python:None | python:int This is the number of entries desired in the lookup table. Default is None, which leaves it up to Matplotlib.
- Returns
- out
function A discrete color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.
Examples
>>> palette = cmap_d_pal('viridis')
>>> palette(5)
['#440154', '#3b528b', '#21918c', '#5cc863', '#fde725']
- mizani.palettes.desaturate_pal(color, prop, reverse=False)
Create a palette that desaturates a color by some proportion
- Parameters
- color
matplotlib color hex, rgb-tuple, or html color name
- prop
python:float saturation channel of color will be multiplied by this value
- reverse
bool Whether to reverse the palette.
- Returns
- out
function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].
Examples
>>> palette = desaturate_pal('red', .1)
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#e21d1d', '#c53a3a', '#a95656', '#8c7373']
- mizani.palettes.manual_pal(values)
Create a palette from a list of values
- Parameters
- values
python:sequence Values that will be returned by the palette function.
- Returns
- out
function A function palette that takes a single int parameter n and returns n values.
Examples
>>> palette = manual_pal(['a', 'b', 'c', 'd', 'e'])
>>> palette(3)
['a', 'b', 'c']
- mizani.palettes.xkcd_palette(colors)
Make a palette with color names from the xkcd color survey.
See xkcd for the full list of colors: http://xkcd.com/color/rgb/
- Parameters
- colors
python:list of strings List of keys in the mizani.external.xkcd_rgb dictionary.
- Returns
- palette
python:list List of colors as RGB hex strings.
Examples
>>> palette = xkcd_palette(['red', 'green', 'blue'])
>>> palette
['#e50000', '#15b01a', '#0343df']
>>> from mizani.external import xkcd_rgb
>>> list(sorted(xkcd_rgb.keys()))[:5]
['acid green', 'adobe', 'algae', 'algae green', 'almost black']
- mizani.palettes.crayon_palette(colors)
Make a palette with color names from Crayola crayons.
The colors come from http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors
- Parameters
- colors
python:list of strings List of keys in the mizani.external.crayon_rgb dictionary.
- Returns
- palette
python:list List of colors as RGB hex strings.
Examples
>>> palette = crayon_palette(['almond', 'silver', 'yellow'])
>>> palette
['#eed9c4', '#c9c0bb', '#fbe870']
>>> from mizani.external import crayon_rgb
>>> list(sorted(crayon_rgb.keys()))[:5]
['almond', 'antique brass', 'apricot', 'aquamarine', 'asparagus']
- mizani.palettes.cubehelix_pal(start=0, rot=0.4, gamma=1.0, hue=0.8, light=0.85, dark=0.15, reverse=False)
Utility for creating continuous palette from the cubehelix system.
This produces a colormap with linearly-decreasing (or increasing) brightness. That means that information will be preserved if printed to black and white or viewed by someone who is colorblind.
- Parameters
- start
python:float (0 <= start <= 3) The hue at the start of the helix.
- rot
python:float Rotations around the hue wheel over the range of the palette.
- gamma
python:float (0 <= gamma) Gamma factor to emphasize darker (gamma < 1) or lighter (gamma > 1) colors.
- hue
python:float (0 <= hue <= 1) Saturation of the colors.
- dark
python:float (0 <= dark <= 1) Intensity of the darkest color in the palette.
- light
python:float (0 <= light <= 1) Intensity of the lightest color in the palette.
- reverse
bool If True, the palette will go from dark to light.
- Returns
- out
function Continuous color palette that takes a single int parameter n and returns n equally spaced colors.
References
Green, D. A. (2011). "A colour scheme for the display of astronomical intensity images". Bulletin of the Astronomical Society of India, Vol. 39, p. 289-295.
Examples
>>> palette = cubehelix_pal()
>>> palette(5)
['#edd1cb', '#d499a7', '#aa688f', '#6e4071', '#2d1e3e']
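The claim about linearly changing brightness can be checked with a small sketch of the cubehelix formula from Green (2011). The code below is an illustration only: it does not model the light and dark parameters, it does not clip colors to [0, 1], and the names cubehelix_rgb and perceived_brightness are hypothetical helpers rather than mizani functions.

```python
import math


def cubehelix_rgb(x, start=0.0, rot=0.4, gamma=1.0, hue=0.8):
    """Return the unclipped (r, g, b) cubehelix color at fraction x in [0, 1]."""
    xg = x ** gamma                             # gamma-corrected intensity
    amp = hue * xg * (1 - xg) / 2               # deviation from the grey diagonal
    phi = 2 * math.pi * (start / 3 + rot * x)   # angle around the helix
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    r = xg + amp * (-0.14861 * cos_phi + 1.78277 * sin_phi)
    g = xg + amp * (-0.29227 * cos_phi - 0.90649 * sin_phi)
    b = xg + amp * (1.97294 * cos_phi)
    return r, g, b


def perceived_brightness(rgb):
    """Weighted sum of the channels used by the cubehelix scheme."""
    r, g, b = rgb
    return 0.30 * r + 0.59 * g + 0.11 * b


# With gamma=1 the brightness tracks x almost exactly
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, round(perceived_brightness(cubehelix_rgb(x)), 3))
```

The coefficients are chosen so that the hue terms nearly cancel in the brightness sum, which is why a greyscale print of the palette keeps the ordering of the values.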
transforms - Transforming variables, scales and coordinates
"The Grammar of Graphics (2005)" by Wilkinson, Anand and Grossman describes three types of transformations.
- Variable transformations - Used to make statistical operations on variables appropriate and meaningful. They are also used to create new variables.
- Scale transformations - Used to make statistical objects displayed on dimensions appropriate and meaningful.
- Coordinate transformations - Used to manipulate the geometry of graphics to help perceive relationships and find meaningful structures for representing variations.
Variable and scale transformations are similar in that they lead to plotted objects that are indistinguishable. Typically, variable transformation is done outside the graphics system, so the system cannot provide transformation-specific guides and decorations for the plot. The trans class is aimed at being useful for scale and coordinate transformations.
- class mizani.transforms.asn_trans(**kwargs)
Arc-sin square-root Transformation
- static transform(x)
Transform of x
- static inverse(x)
Inverse of x
- class mizani.transforms.atanh_trans(**kwargs)
Inverse hyperbolic tangent (arctanh) Transformation
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'arctanh'>
inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'tanh'>
- mizani.transforms.boxcox_trans(p, offset=0, **kwargs)
Boxcox Transformation
The Box-Cox transformation is a flexible transformation, often used to transform data towards normality.
The Box-Cox power transformation (type 1) requires strictly positive values and takes the following form for y \gt 0:
y^{(\lambda)} = \frac{y^\lambda - 1}{\lambda}
When \lambda = 0, the natural log transform is used.
- Parameters
- p
python:float Transformation exponent \lambda.
- offset
python:int Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 0. modulus_trans() sets the default to 1.
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- SEE ALSO:
modulus_trans()
References
- Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
- John, J. A., & Draper, N. R. (1980). An alternative family of transformations. Applied Statistics, 190-197. http://www.jstor.org/stable/2986305
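The Box-Cox formula above can be written out directly. This is a scalar sketch using only the standard library, not the code behind boxcox_trans(); the function names and the tolerance used to detect a zero exponent are assumptions made for the illustration.

```python
import math


def boxcox(y, lam, offset=0.0):
    """Box-Cox transform of a single value; log is used when lam is ~0."""
    y = y + offset
    if y <= 0:
        raise ValueError("Box-Cox requires y + offset > 0")
    if abs(lam) < 1e-7:
        return math.log(y)
    return (y ** lam - 1) / lam


def boxcox_inverse(z, lam, offset=0.0):
    """Undo boxcox() for the same lam and offset."""
    if abs(lam) < 1e-7:
        return math.exp(z) - offset
    return (z * lam + 1) ** (1 / lam) - offset


z = boxcox(4.0, 0.5)            # (sqrt(4) - 1) / 0.5
print(z, boxcox_inverse(z, 0.5))
```

A non-zero offset shifts the data before the power is applied, which is the type 2 variant described under the offset parameter.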
- mizani.transforms.modulus_trans(p, offset=1, **kwargs)
Modulus Transformation
The modulus transformation generalises Box-Cox to work with both positive and negative values.
When \lambda \neq 0
y^{(\lambda)} = sign(y) * \frac{(|y| + 1)^\lambda - 1}{\lambda}
and when \lambda = 0
y^{(\lambda)} = sign(y) * \ln{(|y| + 1)}
- Parameters
- p
python:float Transformation exponent \lambda.
- offset
python:int Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 1. boxcox_trans() sets the default to 0.
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- SEE ALSO:
boxcox_trans()
References
- Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
- John, J. A., & Draper, N. R. (1980). An alternative family of transformations. Applied Statistics, 190-197. http://www.jstor.org/stable/2986305
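The modulus transformation can be sketched for scalars in the same way. The code below is an illustration with the standard library, not the implementation of modulus_trans(); the function names are hypothetical, and the inverse assumes offset >= 1 so that the transformed value keeps the sign of the input.

```python
import math


def _sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def modulus(y, lam, offset=1.0):
    """Modulus transform of a single value; log is used when lam is ~0."""
    shifted = abs(y) + offset
    if abs(lam) < 1e-7:
        return _sign(y) * math.log(shifted)
    return _sign(y) * (shifted ** lam - 1) / lam


def modulus_inverse(z, lam, offset=1.0):
    """Undo modulus() for the same lam and offset (assumes offset >= 1)."""
    if abs(lam) < 1e-7:
        return _sign(z) * (math.exp(abs(z)) - offset)
    return _sign(z) * ((abs(z) * lam + 1) ** (1 / lam) - offset)


for y in (-3.0, 0.0, 3.0):
    z = modulus(y, 0.5)
    print(y, z, modulus_inverse(z, 0.5))
```

Because the sign is factored out and the magnitude is shifted by the offset, negative values and zero are handled without leaving the domain of the power function.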
- class mizani.transforms.datetime_trans(tz=None, **kwargs)
Datetime Transformation
- Parameters
- tz
python:str | ZoneInfo Timezone information
Examples
>>> # from zoneinfo import ZoneInfo
>>> # from backports.zoneinfo import ZoneInfo  # for python < 3.9
>>> UTC = ZoneInfo("UTC")
>>> EST = ZoneInfo("EST")
>>> t = datetime_trans(EST)
>>> x = datetime.datetime(2022, 1, 20, tzinfo=UTC)
>>> x2 = t.inverse(t.transform(x))
>>> x == x2
True
>>> x.tzinfo == x2.tzinfo
False
>>> x.tzinfo.key
'UTC'
>>> x2.tzinfo.key
'EST'
- dataspace_is_numerical = False
Whether the untransformed data is numerical
- domain = (datetime.datetime(1, 1, 1, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')), datetime.datetime(9999, 12, 31, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')))
Limits of the transformed data
- breaks_ = <mizani.breaks.date_breaks object>
Callable to calculate breaks
- format = <mizani.formatters.date_format object>
Function to format breaks
- transform(x)
Transform from date to a numerical format
- inverse(x)
Transform to date from numerical format
- property tzinfo
Alias of tz
- mizani.transforms.exp_trans(base=None, **kwargs)
Create an exponential transform class for base
This is inverse of the log transform.
- Parameters
- base
python:float Base of the logarithm
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- Returns
- out
type Exponential transform class
- class mizani.transforms.identity_trans(**kwargs)
Identity Transformation
- class mizani.transforms.log10_trans(**kwargs)
Log 10 Transformation
- breaks_ = <mizani.breaks.log_breaks object>
Callable to calculate breaks
- domain = (2.2250738585072014e-308, inf)
Limits of the transformed data
- format = <mizani.formatters.log_format object>
Function to format breaks
- static inverse(x)
Inverse of x
- minor_breaks = <mizani.breaks.trans_minor_breaks object>
Callable to calculate minor_breaks
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log10'>
- class mizani.transforms.log1p_trans(**kwargs)
Log plus one Transformation
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log1p'>
inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'expm1'>
- class mizani.transforms.log2_trans(**kwargs)
Log 2 Transformation
- breaks_ = <mizani.breaks.log_breaks object>
Callable to calculate breaks
- domain = (2.2250738585072014e-308, inf)
Limits of the transformed data
- format = <mizani.formatters.log_format object>
Function to format breaks
- static inverse(x)
Inverse of x
- minor_breaks = <mizani.breaks.trans_minor_breaks object>
Callable to calculate minor_breaks
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log2'>
- mizani.transforms.log_trans(base=None, **kwargs)
Create a log transform class for base
- Parameters
- base
python:float Base for the logarithm. If None, then the natural log is used.
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- Returns
- out
type Log transform class
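The relationship between a log transform and the exponential transform of the same base can be sketched as a pair of plain functions. This is an illustration only; make_log_pair is a hypothetical helper and does not create a mizani transform class.

```python
import math


def make_log_pair(base=None):
    """Return (transform, inverse) for a logarithm of the given base.

    base=None selects the natural logarithm.
    """
    if base is None:
        return math.log, math.exp
    if base <= 0 or base == 1:
        raise ValueError("base must be positive and different from 1")

    def transform(x):
        return math.log(x, base)

    def inverse(x):
        return base ** x

    return transform, inverse


log10, exp10 = make_log_pair(10)
print(log10(1000), exp10(3))
```

Swapping the two functions of a pair gives the corresponding exponential transform, which is the sense in which exp_trans is the inverse of log_trans.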
- class mizani.transforms.logit_trans(**kwargs)
Logit Transformation
- domain = (0, 1)
Limits of the transformed data
- static inverse(x)
Inverse of x
- static transform(x)
Transform of x
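For reference, the logit function and its inverse can be written with the standard library as below. This is a scalar sketch whose function names are chosen for the illustration; it shows the mathematics rather than the code of logit_trans.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie in the open interval (0, 1)")
    return math.log(p / (1 - p))


def logistic(x):
    """Inverse of logit(): map a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-x))


print(logit(0.5), logistic(0.0))
```

The check on p mirrors the (0, 1) domain listed for the transform.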
- mizani.transforms.probability_trans(distribution, *args, **kwargs)
Probability Transformation
- Parameters
- distribution
python:str Name of the distribution. Valid distributions are listed at scipy.stats. Any of the continuous or discrete distributions.
- args
python:tuple Arguments passed to the distribution functions.
- kwargs
python:dict Keyword arguments passed to the distribution functions.
Notes
Make sure that the distribution is a good enough approximation for the data. When this is not the case, computations may run into errors. Absence of any errors does not imply that the distribution fits the data.
- mizani.transforms.probit_trans
alias of norm_trans
- class mizani.transforms.reverse_trans(**kwargs)
Reverse Transformation
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>
inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>
- class mizani.transforms.sqrt_trans(**kwargs)
Square-root Transformation
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'sqrt'>
inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'square'>
- domain = (0, inf)
Limits of the transformed data
- class mizani.transforms.timedelta_trans(**kwargs)
Timedelta Transformation
- dataspace_is_numerical = False
Whether the untransformed data is numerical
- domain = (datetime.timedelta(days=-999999999), datetime.timedelta(days=999999999, seconds=86399, microseconds=999999))
Limits of the transformed data
- breaks_ = <mizani.breaks.timedelta_breaks object>
Callable to calculate breaks
- format = <mizani.formatters.timedelta_format object>
Function to format breaks
- static transform(x)
Transform from Timedelta to numerical format
- static inverse(x)
Transform to Timedelta from numerical format
- class mizani.transforms.pd_timedelta_trans(**kwargs)
Pandas timedelta Transformation
- dataspace_is_numerical = False
Whether the untransformed data is numerical
- domain = (Timedelta('-106752 days +00:12:43.145224193'), Timedelta('106751 days 23:47:16.854775807'))
Limits of the transformed data
- breaks_ = <mizani.breaks.timedelta_breaks object>
Callable to calculate breaks
- format = <mizani.formatters.timedelta_format object>
Function to format breaks
- static transform(x)
Transform from Timedelta to numerical format
- static inverse(x)
Transform to Timedelta from numerical format
- mizani.transforms.pseudo_log_trans(sigma=1, base=None, **kwargs)
Pseudo-log transformation
A transformation mapping numbers to a signed logarithmic scale with a smooth transition to linear scale around 0.
- Parameters
- sigma
python:float Scaling factor for the linear part.
- base
python:int Approximate logarithm used. If None, then the natural log is used.
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
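The behaviour described above (linear near zero, logarithmic far from it, defined for negative numbers) can be illustrated with the inverse hyperbolic sine. The sketch below assumes the same definition as pseudo_log_trans in the R scales package; the function names are hypothetical.

```python
import math


def pseudo_log(x, sigma=1.0, base=None):
    """Signed pseudo-logarithm of x; base=None means the natural log."""
    log_base = 1.0 if base is None else math.log(base)
    return math.asinh(x / (2 * sigma)) / log_base


def pseudo_log_inverse(z, sigma=1.0, base=None):
    """Undo pseudo_log() for the same sigma and base."""
    log_base = 1.0 if base is None else math.log(base)
    return 2 * sigma * math.sinh(z * log_base)


# Zero and negative inputs are valid, unlike for an ordinary logarithm
for x in (-1000, -1, 0, 1, 1000):
    print(x, round(pseudo_log(x, base=10), 4))
```

For large |x| the result approaches the signed logarithm of |x| / sigma, while for small |x| it is close to x / (2 * sigma) divided by the log of the base.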
- class mizani.transforms.reciprocal_trans(**kwargs)
Reciprocal Transformation
- static transform(x)
Transform of x
- static inverse(x)
Inverse of x
- class mizani.transforms.trans(**kwargs)
Base class for all transforms
This class is used to transform data and also tell the x and y axes how to create and label the tick locations.
The key methods to override are trans.transform() and trans.inverse(). Alternately, you can quickly create a transform class using the trans_new() function.
- Parameters
- kwargs
python:dict Attributes of the class to set/override
Examples
By default trans returns one minor break between every pair of major breaks
>>> major = [0, 1, 2]
>>> t = trans()
>>> t.minor_breaks(major)
array([0.5, 1.5])
Create a trans that returns 4 minor breaks
>>> t = trans(minor_breaks=minor_breaks(4))
>>> t.minor_breaks(major)
array([0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8])
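The arithmetic behind these two results can be sketched in plain Python. The helper minor_between below is hypothetical and only covers the evenly spaced breaks between adjacent major breaks on a linear scale; it is not the callable used by trans.

```python
def minor_between(major, n=1):
    """Place n evenly spaced minor breaks between each pair of adjacent major breaks."""
    minors = []
    for lo, hi in zip(major, major[1:]):
        step = (hi - lo) / (n + 1)
        minors.extend(lo + step * k for k in range(1, n + 1))
    return minors


major = [0, 1, 2]
print(minor_between(major))                               # one break per interval
print([round(b, 2) for b in minor_between(major, n=4)])   # four breaks per interval
```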
- aesthetic = None
Aesthetic that the transform works on
- dataspace_is_numerical = True
Whether the untransformed data is numerical
- domain = (-inf, inf)
Limits of the transformed data
- format = <mizani.formatters.mpl_format object>
Function to format breaks
- breaks_ = None
Callable to calculate breaks
- minor_breaks = None
Callable to calculate minor_breaks
- static transform(x)
Transform of x
- static inverse(x)
Inverse of x
- breaks(limits)
Calculate breaks in data space and return them in transformed space.
Expects limits to be in transform space, this is the same space as that where the domain is specified.
This method wraps around breaks_() to ensure that the calculated breaks are within the domain of the transform. This is helpful in cases where an aesthetic requests breaks with limits expanded for some padding, yet the expansion goes beyond the domain of the transform. For example, for a probability transform the breaks will be in the domain [0, 1] even if the limits extend beyond it.
- Parameters
- limits
python:tuple The scale limits. Size 2.
- Returns
- out
numpy:array_like Major breaks
- mizani.transforms.trans_new(name, transform, inverse, breaks=None, minor_breaks=None, _format=None, domain=(-inf, inf), doc='', **kwargs)
Create a transformation class object
- Parameters
- name
python:str Name of the transformation
- transform
python:callable() f(x) A function (preferably a ufunc) that computes the transformation.
- inverse
python:callable() f(x) A function (preferably a ufunc) that computes the inverse of the transformation.
- breaks
python:callable() f(limits) Function to compute the breaks for this transform. If None, then a default good enough for a linear domain is used.
- minor_breaks
python:callable() f(major, limits) Function to compute the minor breaks for this transform. If None, then a default good enough for a linear domain is used.
- _format
python:callable() f(breaks) Function to format the generated breaks.
- domain
numpy:array_like Domain over which the transformation is valid. It should be of length 2.
- doc
python:str Docstring for the class.
- **kwargs
python:dict Attributes of the transform, e.g. if base is passed in kwargs, then t.base would be a valid attribute.
- Returns
- out
trans Transform class
- mizani.transforms.gettrans(t)
Return a trans object
- Parameters
- t
python:str | python:callable() | type | trans name of transformation function
- Returns
- out
trans
scale - Implementing a scale
According to "On the theory of scales of measurement" by S.S. Stevens, scales can be classified in four ways -- nominal, ordinal, interval and ratio. Using current (2016) terminology, nominal data is made up of unordered categories, ordinal data is made up of ordered categories, and the two can be classified as discrete. On the other hand, both interval and ratio data are continuous.
The scale classes below show how the rest of the Mizani package can be used to implement the two categories of scales. The key tasks are training and mapping and these correspond to the train and map methods.
To train a scale on data means to make the scale learn the limits of the data. This is worthy of a dedicated method for two reasons:
- Practical -- data may be split up across more than one object, yet all will be represented by a single scale.
- Conceptual -- training is a key action that may need to be inserted into multiple locations of the data processing pipeline before a graphic can be created.
To map data onto a scale means to associate data values with values (potential readings) on a scale. This is perhaps the most important concept underpinning a scale.
The apply methods are simple examples of how to put it all together.
- class mizani.scale.scale_continuous
Continuous scale
- classmethod apply(x, palette, na_value=None, trans=None)
Scale data continuously
- Parameters
- x
numpy:array_like Continuous values to scale
- palette
python:callable() f(x) Palette to use
- na_value
object Value to use for missing values.
- trans
trans How to transform the data before scaling. If None, no transformation is done.
- Returns
- out
numpy:array_like Scaled values
- classmethod train(new_data, old=None)
Train a continuous scale
- Parameters
- new_data
numpy:array_like New values
- old
numpy:array_like Old range. Most likely a tuple of length 2.
- Returns
- out
python:tuple Limits(range) of the scale
- classmethod map(x, palette, limits, na_value=None, oob=<function censor>)
Map values to a continuous palette
- Parameters
- x
numpy:array_like Continuous values to scale
- palette
python:callable() f(x) palette to use
- na_value
object Value to use for missing values.
- oob
python:callable() f(x) Function to deal with values that are beyond the limits
- Returns
- out
numpy:array_like Values mapped onto a palette
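The training and mapping steps for a continuous scale can be sketched without the library. The functions below are hypothetical stand-ins written for this illustration; they assume that out-of-bounds values are censored (replaced by na_value) and that values are rescaled linearly to [0, 1] before being passed to the palette.

```python
import math


def train_continuous(new_data, old=None):
    """Grow the (min, max) range so that it also covers new_data."""
    values = [v for v in new_data if v is not None and math.isfinite(v)]
    if old is not None:
        values.extend(old)
    if not values:
        return old
    return (min(values), max(values))


def map_continuous(x, palette, limits, na_value=None):
    """Rescale x from limits to [0, 1] and pass each value to palette."""
    lo, hi = limits
    if hi <= lo:
        raise ValueError("limits must span a non-empty range")
    out = []
    for v in x:
        if v is None or math.isnan(v) or not lo <= v <= hi:
            out.append(na_value)        # missing or out of bounds
        else:
            out.append(palette((v - lo) / (hi - lo)))
    return out


# Data arriving in two pieces is represented by one range
limits = train_continuous([3, 1, 2])
limits = train_continuous([5, 0.5], old=limits)
print(limits)
print(map_continuous([0.5, 2.75, 5, 9], lambda t: round(t * 100), limits))
```

Training in two calls shows the practical reason given earlier: data split across several objects still ends up on a single scale.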
- class mizani.scale.scale_discrete
Discrete scale
- classmethod apply(x, palette, na_value=None)
Scale data discretely
- Parameters
- x
numpy:array_like Discrete values to scale
- palette
python:callable() f(x) Palette to use
- na_value
object Value to use for missing values.
- Returns
- out
numpy:array_like Scaled values
- classmethod train(new_data, old=None, drop=False, na_rm=False)
Train a discrete scale
- Parameters
- new_data
numpy:array_like New values
- old
numpy:array_like Old range. List of values known to the scale.
- drop
bool Whether to drop(not include) unused categories
- na_rm
bool If True, remove missing values. Missing values are either NaN or None.
- Returns
- out
python:list Values covered by the scale
- classmethod map(x, palette, limits, na_value=None)
Map values to a discrete palette
- Parameters
- palette
python:callable() f(x) palette to use
- x
numpy:array_like Discrete values to scale
- na_value
object Value to use for missing values.
- Returns
- out
numpy:array_like Values mapped onto a palette
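The same two steps for a discrete scale can be sketched as follows. These functions are hypothetical stand-ins, not the library's implementation: categories are kept in order of first appearance (the real ordering rules may differ), only None is treated as missing, and the palette is assumed to be a function of n that returns n values.

```python
def train_discrete(new_data, old=None, na_rm=False):
    """Extend the list of known categories with unseen values from new_data."""
    known = list(old) if old is not None else []
    for v in new_data:
        if na_rm and v is None:
            continue
        if v not in known:
            known.append(v)
    return known


def map_discrete(x, palette, limits, na_value=None):
    """Give each category the palette value at its position in limits."""
    values = palette(len(limits))
    lookup = dict(zip(limits, values))
    return [lookup.get(v, na_value) for v in x]


limits = train_discrete(["b", "a", "b"])
limits = train_discrete(["c", "a"], old=limits)
print(limits)

colors = ["red", "green", "blue"]
print(map_discrete(["a", "c", "z"], lambda n: colors[:n], limits))
```

Values that were never seen during training have no position in limits, so they fall back to na_value.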
Installation
mizani can be installed in a couple of ways, depending on purpose.
Official release installation
For a normal user, it is recommended to install the official release.
$ pip install mizani
Development installation
To do any development you have to clone the mizani source repository and install the package in development mode. These commands do all of that:
$ git clone https://github.com/has2k1/mizani.git
$ cd mizani
$ pip install -e .
If you only want to use the latest development sources and do not care about having a cloned repository, e.g. if a bug you care about has been fixed but an official release has not come out yet, then use this command:
$ pip install git+https://github.com/has2k1/mizani.git
Changelog
v0.9.3
2023-09-01
Enhancements
- Removed FutureWarnings when using pandas 2.1.0
v0.9.2
2023-05-25
Bug Fixes
- Fixed regression bug in date_format where it could not deal with the UTC timezone from timezone. #30
v0.9.1
2023-05-19
Bug Fixes
- Fixed bug in date_format to handle datetime sequences within the same timezone but with a mixed daylight saving state. (plotnine #687)
v0.9.0
2023-04-15
API Changes
- palettable dropped as a dependency.
Bug Fixes
- Fixed bug in datetime_trans where a pandas series with an index that did not start at 0 could not be transformed.
- Install tzdata on pyodide/emscripten. #27
v0.8.1
2022-09-28
Bug Fixes
- Fixed regression bug in log_format where formatting for bases 2, 8 and 16 would fail if the values were float-integers.
Enhancements
- log_format now uses exponent notation for bases other than base 10.
v0.8.0
2022-09-26
API Changes
- The lut parameter of cmap_pal and cmap_d_pal has been deprecated and will be removed in a future version.
- datetime_trans gained parameter tz that controls the timezone of the transformation.
- log_format gained boolean parameter mathtex for TeX values as understood by matplotlib instead of values in scientific notation.
Bug Fixes
- Fixed bug in zero_range where uint64 values would cause a RuntimeError.
v0.7.4
2022-04-02
API Changes
- comma_format is now imported automatically when using *.
- Fixed issue with scale_discrete so that if you train on data with NaN and specify an old range that also has NaN, the resulting range does not include two NaN values.
v0.7.3
(2020-10-29)
Bug Fixes
- Fixed log_breaks for narrow range if base=2 (#76).
v0.7.2
(2020-10-29)
Bug Fixes
- Fixed bug in rescale_max() to properly handle values whose maximum is zero (#16).
v0.7.1
(2020-06-05)
Bug Fixes
- Fixed regression in mizani.scales.scale_discrete.train() when training on values with some categoricals that have common elements.
v0.7.0
(2020-06-04)
Bug Fixes
- Fixed issue with mizani.formatters.log_breaks where non-linear breaks could not be generated if the limits were greater than the largest integer sys.maxsize.
- Fixed mizani.palettes.gradient_n_pal() to return nan for nan values.
- Fixed mizani.scales.scale_discrete.train() when training categoricals to maintain the order. (plotnine #381)
v0.6.0
(2019-08-15)
New
- Added pvalue_format
- Added ordinal_format
- Added number_bytes_format
- Added pseudo_log_trans()
- Added reciprocal_trans
- Added modulus_trans()
Enhancements
- mizani.breaks.date_breaks now supports intervals in the order of seconds.
- mizani.palettes.brewer_pal now supports a direction argument to control the order of the returned colors.
API Changes
- boxcox_trans() now only accepts positive values. For both positive and negative values, modulus_trans() has been added.
v0.5.4
(2019-03-26)
Enhancements
- mizani.formatters.log_format now does a better job of approximating labels for numbers like 3.000000000000001e-05.
API Changes
- exponent_threshold parameter of mizani.formatters.log_format has been deprecated.
v0.5.3
(2018-12-24)
API Changes
- Log transforms now default to base - 2 minor breaks. So base 10 has 8 minor breaks and 9 partitions, base 8 has 6 minor breaks and 7 partitions, ..., base 2 has 0 minor breaks and a single partition.
v0.5.2
(2018-10-17)
Bug Fixes
- Fixed issue where some functions that took pandas series would return output where the index did not match that of the input.
v0.5.1
(2018-10-15)
Bug Fixes
- Fixed issue with log_breaks, so that it does not fail needlessly when the limits are in the (0, 1) range.
Enhancements
- Changed log_format to return better formatted breaks.
v0.5.0
(2018-11-10)
API Changes
- Support for python 2 has been removed.
- __call__() and mizani.breaks.trans_minor_breaks.__call__() now accept optional parameter n, which is the number of minor breaks between any two major breaks.
- The parameter nan_value has been renamed to na_value.
- The parameter nan_rm has been renamed to na_rm.
Enhancements
- Better support for handling missing values when training discrete scales.
- Changed the algorithm for log_breaks, it can now return breaks that do not fall on the integer powers of the base.
v0.4.6
(2018-03-20)
- Added squish
v0.4.5
(2018-03-09)
- Added identity_pal
- Added cmap_d_pal
v0.4.4
(2017-12-13)
- Fixed date_format to respect the timezones of the dates (#8).
v0.4.3
(2017-12-01)
- Changed date_breaks to have more variety in the spacing between the breaks.
- Fixed date_format to respect time part of the date (#7).
v0.4.2
(2017-11-06)
- Fixed (regression) break calculation for the non ordinal transforms.
v0.4.1
(2017-11-04)
- trans objects can now be instantiated with parameters that override attributes of the instance. The default methods for computing breaks and minor breaks on the transform instance are not class attributes, so they can be modified without global repercussions.
v0.4.0
(2017-10-24)
API Changes
- Breaks and formatter generating functions have been converted to classes, with a __call__ method. How they are used has not changed, but this makes them more flexible.
- ExtendedWilkson class has been removed. extended_breaks() now contains the implementation of the break calculating algorithm.
v0.3.4
(2017-09-12)
- Fixed issue where some formatter methods failed if passed an empty breaks argument.
- Fixed issue with log_breaks() where, if the limits were within the same order of magnitude, the calculated breaks were always the ends of the order of magnitude.
Now log_breaks()((35, 50)) returns [35, 40, 45, 50] as breaks instead of [1, 100].
v0.3.3
(2017-08-30)
- Fixed SettingWithCopyWarnings in squish_infinite().
- Added log_format().
API Changes
- log_trans now uses log_format() as the formatting method.
v0.3.2
(2017-07-14)
- Added expand_range_distinct()
v0.3.1
(2017-06-22)
- Fixed bug where using log_breaks() with Numpy 1.13.0 led to a ValueError.
v0.3.0
(2017-04-24)
- Added xkcd_palette(), a palette that selects from 954 named colors.
- Added crayon_palette(), a palette that selects from 163 named colors.
- Added cubehelix_pal(), a function that creates a continuous palette from the cubehelix system.
- Fixed bug where a color palette would raise an exception when passed a single scalar value instead of a list-like.
- extended_breaks() and mpl_breaks() now return a single break if the limits are equal. Previously, one ran into an overflow error and the other returned a sequence filled with n copies of the same limit.
API Changes
- mpl_breaks() now returns a function that (strictly) expects a tuple with the minimum and maximum values.
v0.2.0
(2017-01-27)
- Fixed bug in censor() where a sequence of values with an irregular index would lead to an exception.
- Fixed boundary issues due to internal loss of precision in ported function seq().
- Added mizani.breaks.extended_breaks() which computes breaks using a modified version of Wilkinson's tick algorithm.
- Changed the default function mizani.transforms.trans.breaks_() used by mizani.transforms.trans to compute breaks from mizani.breaks.mpl_breaks() to mizani.breaks.extended_breaks().
- mizani.breaks.timedelta_breaks() now uses mizani.breaks.extended_breaks() internally instead of mizani.breaks.mpl_breaks().
- Added manual palette function mizani.palettes.manual_pal().
- Requires pandas version 0.19.0 or higher.
v0.1.0
(2016-06-30)
First public release
Author
Hassan Kibirige
Copyright
2023, Hassan Kibirige