pybabel - Man Page

Name

babel — Babel Documentation

Babel is an integrated collection of utilities that assist in internationalizing and localizing Python applications, with an emphasis on web-based applications.

User Documentation

The user documentation explains some core concepts of the library and describes how it can be used.

Introduction

The functionality Babel provides for internationalization (I18N) and localization (L10N) can be separated into two different aspects:

  • tools to build and work with gettext message catalogs, and
  • a Python interface to the CLDR (Common Locale Data Repository), providing access to various locale display names, localized number and date formatting, etc.

Message Catalogs

While the Python standard library includes a gettext module that enables applications to use message catalogs, it requires developers to build these catalogs using GNU tools such as xgettext, msgmerge, and msgfmt.  And while xgettext does have support for extracting messages from Python files, it does not know how to deal with other kinds of files commonly found in Python web-applications, such as templates, nor does it provide an easy extensibility mechanism to add such support.

Babel addresses this by providing a framework where various extraction methods can be plugged in to a larger message extraction framework, and also removes the dependency on the GNU gettext tools for common tasks, as these aren’t necessarily available on all platforms. See Working with Message Catalogs for details on this aspect of Babel.

Locale Data

Furthermore, while the Python standard library does include support for basic localization with respect to the formatting of numbers and dates (the locale module, among others), this support is based on the assumption that only one locale is in use per process at any given time. It also doesn’t provide access to other kinds of locale data, such as the localized names of countries, languages, or time-zones, which are frequently needed in web-based applications.

For these requirements, Babel includes data extracted from the Common Locale Data Repository (CLDR), and provides a number of convenient methods for accessing and using this data. See Locale Data, Date and Time, and Number Formatting for more information on this aspect of Babel.

Installation

Babel is distributed as a standard Python package fully set up with all the dependencies it needs.  It primarily depends on the excellent pytz library for timezone handling.  To install it you can use pip.

virtualenv

Virtualenv is probably what you want to use during development, and if you have shell access to your production machines, you’ll probably want to use it there, too.  Use pip to install it:

$ sudo pip install virtualenv

If you’re on Windows, run it in a command-prompt window with administrator privileges, and leave out sudo.

Once you have virtualenv installed, just fire up a shell and create your own environment.  I usually create a project folder and a venv folder within:

$ mkdir myproject
$ cd myproject
$ virtualenv venv
New python executable in venv/bin/python
Installing distribute............done.

Now, whenever you want to work on a project, you only have to activate the corresponding environment.  On OS X and Linux, do the following:

$ . venv/bin/activate

If you are a Windows user, the following command is for you:

$ venv\scripts\activate

Either way, you should now be using your virtualenv (notice how the prompt of your shell has changed to show the active environment).

Now you can just enter the following command to get Babel installed in your virtualenv:

$ pip install Babel

A few seconds later and you are good to go.

System-Wide Installation

This is possible as well, though I do not recommend it.  Just run pip with root privileges:

$ sudo pip install Babel

(On Windows systems, run it in a command-prompt window with administrator privileges, and leave out sudo.)

Living on the Edge

If you want to work with the latest version of Babel, you will need to use a git checkout.

Get the git checkout in a new virtualenv and run in development mode:

$ git clone https://github.com/python-babel/babel
Initialized empty Git repository in ~/dev/babel/.git/
$ cd babel
$ virtualenv venv
New python executable in venv/bin/python
Installing distribute............done.
$ . venv/bin/activate
$ pip install pytz
$ python setup.py import_cldr
$ pip install --editable .
...
Finished processing dependencies for Babel

Make sure to not forget about the pip install pytz and import_cldr steps because otherwise you will be missing the locale data. The custom setup command will download the most appropriate CLDR release from the official website and convert it for Babel but will not work without pytz.

This will also pull in the dependencies and activate the git head as the current version inside the virtualenv.  Then all you have to do is run git pull origin to update to the latest version.  If the CLDR data changes, you will have to re-run python setup.py import_cldr.

Locale Data

While message catalogs allow you to localize any messages in your application, there are a number of strings that are used in many applications for which translations are readily available.

Imagine for example you have a list of countries that users can choose from, and you’d like to display the names of those countries in the language the user prefers. Instead of translating all those country names yourself in your application, you can make use of the translations provided by the locale data included with Babel, which is based on the Common Locale Data Repository (CLDR) developed and maintained by the Unicode Consortium.

The Locale Class

You normally access such locale data through the Locale class provided by Babel:

>>> from babel import Locale
>>> locale = Locale('en', 'US')
>>> locale.territories['US']
u'United States'
>>> locale = Locale('es', 'MX')
>>> locale.territories['US']
u'Estados Unidos'

In addition to country/territory names, the locale data also provides access to names of languages, scripts, variants, time zones, and more. Some of the data is closely related to number and date formatting.

Most of the corresponding Locale properties return dictionaries, where the key is a code such as the ISO country and language codes. Consult the API documentation for references to the relevant specifications.

Likely Subtags

When dealing with locales you can run into the situation where a locale tag is not fully descriptive.  For instance people commonly refer to zh_TW but that identifier does not resolve to a locale that the CLDR covers.  Babel’s locale identifier parser in that case will attempt to resolve the most likely subtag to end up with the intended locale:

>>> from babel import Locale
>>> Locale.parse('zh_TW')
Locale('zh', territory='TW', script='Hant')

This can also be used to find the most appropriate locale for a territory. In that case the territory code needs to be prefixed with und (unknown language identifier):

>>> Locale.parse('und_AZ')
Locale('az', territory='AZ', script='Latn')
>>> Locale.parse('und_DE')
Locale('de', territory='DE')

Babel currently cannot deal with fuzzy locales (a locale not fully backed by data files) so we only accept locales that are fully backed by CLDR data.  This will change in the future, but for the time being this restriction is in place.

Locale Display Names

A locale object can describe itself as well as other locales. Given a locale object, you can ask it for its canonical display name, the name of its language, and so on.  Since the locales cross-reference each other, you can ask for locale names in any language supported by the CLDR:

>>> l = Locale.parse('de_DE')
>>> l.get_display_name('en_US')
u'German (Germany)'
>>> l.get_display_name('fr_FR')
u'allemand (Allemagne)'

Display names include all the information to uniquely identify a locale (language, territory, script and variant) which is often not what you want.  You can also ask for the information in parts:

>>> l.get_language_name('de_DE')
u'Deutsch'
>>> l.get_language_name('it_IT')
u'tedesco'
>>> l.get_territory_name('it_IT')
u'Germania'
>>> l.get_territory_name('pt_PT')
u'Alemanha'

Calendar Display Names

The Locale class provides access to many locale display names related to calendar display, such as the names of weekdays or months.

These display names are of course used for date formatting, but can also be used, for example, to show a list of months to the user in their preferred language:

>>> locale = Locale('es')
>>> month_names = locale.months['format']['wide'].items()
>>> for idx, name in sorted(month_names):
...     print(name)
enero
febrero
marzo
abril
mayo
junio
julio
agosto
septiembre
octubre
noviembre
diciembre

Date and Time

When working with date and time information in Python, you commonly use the classes date, datetime and/or time from the datetime module. Babel provides functions for locale-specific formatting of those objects in its dates module:

>>> from datetime import date, datetime, time
>>> from babel.dates import format_date, format_datetime, format_time

>>> d = date(2007, 4, 1)
>>> format_date(d, locale='en')
u'Apr 1, 2007'
>>> format_date(d, locale='de_DE')
u'01.04.2007'

As this example demonstrates, Babel will automatically choose a date format that is appropriate for the requested locale.

The format_*() functions also accept an optional format argument, which allows you to choose between one of four format variations:

  • short,
  • medium (the default),
  • long, and
  • full.

For example:

>>> format_date(d, format='short', locale='en')
u'4/1/07'
>>> format_date(d, format='long', locale='en')
u'April 1, 2007'
>>> format_date(d, format='full', locale='en')
u'Sunday, April 1, 2007'

Core Time Concepts

Working with dates and time can be a complicated thing.  Babel attempts to simplify working with them by making some decisions for you.  Python’s datetime module has different ways to deal with times and dates: naive and timezone-aware datetime objects.

Babel generally recommends that you store all your times in naive datetime objects and treat them as UTC at all times.  This simplifies dealing with time a lot, because otherwise you can get into the hairy situation where you are dealing with datetime objects in different timezones.  That is tricky because there are situations where time can be ambiguous.  This is usually the case when dealing with dates around timezone transitions.  The most common timezone transition is the change between daylight saving time and standard time.

As such, we recommend always using UTC internally and only converting to local time when returning dates to users.  At that point the timezone the user has selected can usually be established, and Babel can automatically rebase the time for you.

To get the current time, use the utcnow() method of the datetime class.  It returns a naive datetime object in UTC.
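As a minimal sketch of this recommendation, the following uses only the standard library. The helper names utc_now_naive and to_user_time are hypothetical, and the fixed +02:00 offset merely stands in for a real user time zone; with Babel you would normally pass the zone to format_datetime instead. The sketch uses datetime.now(timezone.utc).replace(tzinfo=None), which yields the same naive UTC value as utcnow() and avoids the deprecation warning that recent Python versions emit for utcnow():

```python
from datetime import datetime, timedelta, timezone


def utc_now_naive():
    """Return the current time as a naive datetime that is understood to be UTC."""
    return datetime.now(timezone.utc).replace(tzinfo=None)


def to_user_time(naive_utc, user_tz):
    """Attach UTC to a stored naive value and convert it for display only."""
    return naive_utc.replace(tzinfo=timezone.utc).astimezone(user_tz)


# A value as it would be stored internally: naive, implicitly UTC.
stored = datetime(2007, 4, 1, 15, 30)

# Assumption: the user's zone is a fixed +02:00 offset labelled "CEST".
user_tz = timezone(timedelta(hours=2), "CEST")

shown = to_user_time(stored, user_tz)
print(shown.strftime("%H:%M %Z"))  # 17:30 CEST
```

The stored value is never modified; the conversion happens only at the point where the time is shown to the user.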

For more information about timezones see Time-zone Support.

Pattern Syntax

While Babel makes it simple to use the appropriate date/time format for a given locale, you can also force it to use custom patterns. Note that Babel uses different patterns for specifying number and date formats compared to the Python equivalents (such as time.strftime()), which have mostly been inherited from C and POSIX. The patterns used in Babel are based on the Locale Data Markup Language specification (LDML), which defines them as follows:

A date/time pattern is a string of characters, where specific strings of characters are replaced with date and time data from a calendar when formatting or used to generate data for a calendar when parsing. […]

Characters may be used multiple times. For example, if y is used for the year, yy might produce “99”, whereas yyyy produces “1999”. For most numerical fields, the number of characters specifies the field width. For example, if h is the hour, h might produce “5”, but hh produces “05”. For some characters, the count specifies whether an abbreviated or full form should be used […]

Two single quotes represent a literal single quote, either inside or outside single quotes. Text within single quotes is not interpreted in any way (except for two adjacent single quotes).

For example:

>>> d = date(2007, 4, 1)
>>> format_date(d, "EEE, MMM d, ''yy", locale='en')
u"Sun, Apr 1, '07"
>>> format_date(d, "EEEE, d.M.yyyy", locale='de')
u'Sonntag, 1.4.2007'

>>> t = time(15, 30)
>>> format_time(t, "hh 'o''clock' a", locale='en')
u"03 o'clock PM"
>>> format_time(t, 'H:mm a', locale='de')
u'15:30 nachm.'

>>> dt = datetime(2007, 4, 1, 15, 30)
>>> format_datetime(dt, "yyyyy.MMMM.dd GGG hh:mm a", locale='en')
u'02007.April.01 AD 03:30 PM'
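The quoting and repetition rules quoted above can be illustrated with a small tokenizer. This is a simplified sketch and not Babel's own pattern parser; tokenize_pattern is a hypothetical helper that treats a run of the same ASCII letter as one field and everything else as literal text:

```python
import string


def tokenize_pattern(pattern):
    """Split a date/time pattern into ('field', ...) and ('literal', ...) tokens."""
    tokens = []
    literal = []
    in_quote = False
    i = 0
    n = len(pattern)

    def flush_literal():
        # Emit the literal text collected so far, if any.
        if literal:
            tokens.append(("literal", "".join(literal)))
            del literal[:]

    while i < n:
        ch = pattern[i]
        if ch == "'":
            if i + 1 < n and pattern[i + 1] == "'":
                # Two single quotes are a literal quote, inside or outside quotes.
                literal.append("'")
                i += 2
            else:
                # A lone quote opens or closes a quoted section.
                in_quote = not in_quote
                i += 1
        elif in_quote or ch not in string.ascii_letters:
            literal.append(ch)
            i += 1
        else:
            # A run of the same letter forms one field; its length is the count.
            flush_literal()
            j = i
            while j < n and pattern[j] == ch:
                j += 1
            tokens.append(("field", pattern[i:j]))
            i = j
    flush_literal()
    return tokens


print(tokenize_pattern("hh 'o''clock' a"))
```

For the pattern above, the quoted text and the doubled quote end up in a single literal token, while hh and a are reported as fields.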

The syntax for custom datetime format patterns is described in detail in the Locale Data Markup Language specification. The following table is just a relatively brief overview.

Date Fields

Field     Symbol  Description
Era       G       Replaced with the era string for the current date. One to three letters for the abbreviated form, four letters for the long form, five for the narrow form.
Year      y       Replaced by the year. Normally the length specifies the padding, but for two letters it also specifies the maximum length.
          Y       Same as y but uses the ISO year-week calendar. ISO year-week increments after completing the last week of the year. Therefore it may change a few days before or after y. Recommend use with the w symbol.
          u       ??
Quarter   Q       Use one or two for the numerical quarter, three for the abbreviation, or four for the full name.
          q       Use one or two for the numerical quarter, three for the abbreviation, or four for the full name.
Month     M       Use one or two for the numerical month, three for the abbreviation, four for the full name, or five for the narrow name.
          L       Use one or two for the numerical month, three for the abbreviation, four for the full name, or five for the narrow name.
Week      w       Week of year according to the ISO year-week calendar. This may have 52 or 53 weeks depending on the year. Recommend use with the Y symbol.
          W       Week of month.
Day       d       Day of month.
          D       Day of year.
          F       Day of week in month.
          g       ??
Week day  E       Day of week. Use one through three letters for the short day, four for the full name, or five for the narrow name.
          e       Local day of week. Same as E except adds a numeric value that will depend on the local starting day of the week, using one or two letters.
          c       ??

Time Fields

Field     Symbol  Description
Period    a       AM or PM
Hour      h       Hour [1-12].
          H       Hour [0-23].
          K       Hour [0-11].
          k       Hour [1-24].
Minute    m       Use one or two for zero places padding.
Second    s       Use one or two for zero places padding.
          S       Fractional second, rounds to the count of letters.
          A       Milliseconds in day.
Timezone  z       Use one to three letters for the short timezone or four for the full name.
          Z       Use one to three letters for RFC 822, four letters for GMT format.
          v       Use one letter for short wall (generic) time, four for long wall time.
          V       Same as z, except that timezone abbreviations should be used regardless of whether they are in common use by the locale.

Time Delta Formatting

In addition to providing functions for formatting localized dates and times, the babel.dates module also provides a function to format the difference between two times, called a “time delta”. These are usually represented as datetime.timedelta objects in Python, and it’s also what you get when you subtract one datetime object from another.

The format_timedelta function takes a timedelta object and returns a human-readable representation. This happens at the cost of precision, as it chooses only the most significant unit (such as year, week, or hour) of the difference, and displays that:

>>> from datetime import timedelta
>>> from babel.dates import format_timedelta
>>> delta = timedelta(days=6)
>>> format_timedelta(delta, locale='en_US')
u'1 week'

The resulting strings are based on the CLDR data, and are properly pluralized depending on the plural rules of the locale and the calculated number of units.

The function provides parameters for you to influence how this most significant unit is chosen: with threshold you set the value after which the presentation switches to the next larger unit, and with granularity you can limit the smallest unit to display:

>>> delta = timedelta(days=6)
>>> format_timedelta(delta, threshold=1.2, locale='en_US')
u'6 days'
>>> format_timedelta(delta, granularity='month', locale='en_US')
u'1 month'
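To make the effect of threshold and granularity more concrete, here is a simplified standard-library sketch of how the most significant unit could be chosen. It is not Babel's implementation: the unit sizes (365-day year, 30-day month), the default threshold of 0.85, the function names, and the English-only pluralization are all assumptions made for illustration.

```python
from datetime import timedelta

# Assumed unit sizes in seconds, from largest to smallest.
UNITS = (
    ("year", 3600 * 24 * 365),
    ("month", 3600 * 24 * 30),
    ("week", 3600 * 24 * 7),
    ("day", 3600 * 24),
    ("hour", 3600),
    ("minute", 60),
    ("second", 1),
)


def pick_unit(delta, threshold=0.85, granularity="second"):
    """Return (count, unit) for the largest unit that reaches the threshold."""
    seconds = abs(delta.total_seconds())
    for unit, size in UNITS:
        value = seconds / size
        if value >= threshold or unit == granularity:
            if unit == granularity and value > 0:
                # Never report less than one of the smallest allowed unit.
                value = max(1, value)
            return int(round(value)), unit
    return 0, granularity


def format_timedelta_sketch(delta, threshold=0.85, granularity="second"):
    """English-only stand-in for the localized, pluralized output."""
    count, unit = pick_unit(delta, threshold, granularity)
    return "%d %s%s" % (count, unit, "" if count == 1 else "s")


delta = timedelta(days=6)
print(format_timedelta_sketch(delta))                       # 1 week
print(format_timedelta_sketch(delta, threshold=1.2))        # 6 days
print(format_timedelta_sketch(delta, granularity="month"))  # 1 month
```

Under these assumptions, six days is 0.857 weeks, which reaches the default threshold and is rounded to one week; raising the threshold to 1.2 pushes the choice down to days.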

Time-zone Support

Many of the verbose time formats include the time-zone, but time-zone information is not available by default for Python datetime and time objects. The standard library includes only the abstract tzinfo class, which needs a concrete implementation before you can actually use it in your application. Babel includes a tzinfo implementation for UTC (Universal Time).

Babel uses pytz for real timezone support, which includes the definitions of practically all of the time-zones used in the world, as well as important functions for reliably converting from UTC to local time, and vice versa.  The module is generally wrapped for you so you can directly interface with it from within Babel:

>>> from datetime import datetime
>>> from babel.dates import get_timezone, UTC
>>> dt = datetime(2007, 4, 1, 15, 30, tzinfo=UTC)
>>> eastern = get_timezone('US/Eastern')
>>> format_datetime(dt, 'H:mm Z', tzinfo=eastern, locale='en_US')
u'11:30 -0400'

The recommended approach to deal with different time-zones in a Python application is to always use UTC internally, and only convert from/to the user’s time-zone when accepting user input and displaying date/time data, respectively. You can use Babel together with pytz to apply a time-zone to any datetime or time object for display, leaving the original information unchanged:

>>> british = get_timezone('Europe/London')
>>> format_datetime(dt, 'H:mm zzzz', tzinfo=british, locale='en_US')
u'16:30 British Summer Time'

Here, the given UTC time is adjusted to the “Europe/London” time-zone, and daylight savings time is taken into account. Daylight savings time is also applied to format_time, but because the actual date is unknown in that case, the current day is assumed to determine whether DST or standard time should be used.

For many timezones it’s also possible to ask for the next timezone transition.  This for instance is useful to answer the question “when do I have to move the clock forward next”:

>>> from babel.dates import get_next_timezone_transition
>>> t = get_next_timezone_transition('Europe/Vienna', datetime(2011, 3, 2))
>>> t
<TimezoneTransition CET -> CEST (2011-03-27 01:00:00)>
>>> t.from_offset
3600.0
>>> t.to_offset
7200.0
>>> t.from_tz
'CET'
>>> t.to_tz
'CEST'

Lastly Babel also provides support for working with the local timezone of your operating system.  It’s provided through the LOCALTZ constant:

>>> from babel.dates import LOCALTZ, get_timezone_name
>>> LOCALTZ
<DstTzInfo 'Europe/Vienna' CET+1:00:00 STD>
>>> get_timezone_name(LOCALTZ)
u'Central European Time'

Localized Time-zone Names

While the Locale class provides access to various locale display names related to time-zones, the process of building a localized name of a time-zone is actually quite complicated. Babel implements it in separately usable functions in the babel.dates module, most importantly the get_timezone_name function:

>>> from babel import Locale
>>> from babel.dates import get_timezone_name, get_timezone

>>> tz = get_timezone('Europe/Berlin')
>>> get_timezone_name(tz, locale=Locale.parse('pt_PT'))
u'Hora da Europa Central'

You can pass the function either a datetime.tzinfo object, or a datetime.date or datetime.datetime object. If you pass an actual date, the function will be able to take daylight savings time into account. If you pass just the time-zone, Babel does not know whether daylight savings time is in effect, so it uses a generic representation, which is useful for example to display a list of time-zones to the user.

>>> from datetime import datetime

>>> dt = tz.localize(datetime(2007, 8, 15))
>>> get_timezone_name(dt, locale=Locale.parse('de_DE'))
u'Mitteleurop\xe4ische Sommerzeit'
>>> get_timezone_name(tz, locale=Locale.parse('de_DE'))
u'Mitteleurop\xe4ische Zeit'

Number Formatting

Support for locale-specific formatting and parsing of numbers is provided by the babel.numbers module:

>>> from babel.numbers import format_number, format_decimal, format_percent

Examples:

# Numbers with decimal places
>>> format_decimal(1.2345, locale='en_US')
u'1.234'
>>> format_decimal(1.2345, locale='sv_SE')
u'1,234'
# Integers with thousand grouping
>>> format_decimal(12345, locale='de_DE')
u'12.345'
>>> format_decimal(12345678, locale='de_DE')
u'12.345.678'

Pattern Syntax

While Babel makes it simple to use the appropriate number format for a given locale, you can also force it to use custom patterns. As with date/time formatting patterns, the patterns Babel supports for number formatting are based on the Locale Data Markup Language specification (LDML).

Examples:

>>> format_decimal(-1.2345, format='#,##0.##;-#', locale='en')
u'-1.23'
>>> format_decimal(-1.2345, format='#,##0.##;(#)', locale='en')
u'(1.23)'

The syntax for custom number format patterns is described in detail in the specification. The following table is just a relatively brief overview.

Symbol  Description
0       Digit
1-9     ‘1’ through ‘9’ indicate rounding.
@       Significant digit
#       Digit, zero shows as absent
.       Decimal separator or monetary decimal separator
-       Minus sign
,       Grouping separator
E       Separates mantissa and exponent in scientific notation
+       Prefix positive exponents with localized plus sign
;       Separates positive and negative subpatterns
%       Multiply by 100 and show as percentage
‰       Multiply by 1000 and show as per mille
¤       Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If tripled, uses the long form of the decimal symbol.
'       Used to quote special characters in a prefix or suffix
*       Pad escape, precedes pad character

Rounding Modes

Since Babel makes full use of Python’s Decimal type to perform number rounding before formatting, users have the chance to control the rounding mode and other configurable parameters through the active Context instance.

Python’s default rounding mode is ROUND_HALF_EVEN, which complies with UTS #35 section 3.3.  The caller can still adjust the current context before formatting a number or currency:

>>> from babel.numbers import decimal, format_decimal
>>> with decimal.localcontext(decimal.Context(rounding=decimal.ROUND_DOWN)):
...     txt = format_decimal(123.99, format='#', locale='en_US')
>>> txt
u'123'

It is also possible to use decimal.setcontext, or to modify the instance returned by decimal.getcontext directly.  However, a context manager is usually more convenient because it restores the previous context automatically and can be nested.
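The restoration and nesting behaviour can be seen in a short sketch that uses only the standard library decimal module. The helper round_to_int is hypothetical and stands in for a formatting call; in code that calls Babel, the decimal module exposed by babel.numbers should be used instead of a direct import.

```python
import decimal


def round_to_int(text):
    """Round to a whole number using the rounding mode of the active context."""
    return decimal.Decimal(text).quantize(decimal.Decimal("1"))


# Outside any block, the default ROUND_HALF_EVEN applies.
default = round_to_int("123.5")

with decimal.localcontext() as outer:
    outer.rounding = decimal.ROUND_DOWN
    down = round_to_int("123.99")

    # Contexts nest: the inner block overrides the outer one temporarily.
    with decimal.localcontext() as inner:
        inner.rounding = decimal.ROUND_UP
        up = round_to_int("123.01")

    # Leaving the inner block restores ROUND_DOWN.
    restored = round_to_int("123.99")

# Leaving the outer block restores the default mode.
after = round_to_int("124.5")

print(default, down, up, restored, after)  # 124 123 124 123 124
```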

Whatever mechanism is chosen, always make use of the decimal module imported from babel.numbers.  For efficiency reasons, Babel uses the fastest decimal implementation available, such as cdecimal.  These various implementations offer an identical API, but their types and instances do not interoperate with each other.

For example, the previous example can be slightly modified to generate unexpected results on Python 2.7, with the cdecimal module installed:

>>> from decimal import localcontext, Context, ROUND_DOWN
>>> from babel.numbers import format_decimal
>>> with localcontext(Context(rounding=ROUND_DOWN)):
...     txt = format_decimal(123.99, format='#', locale='en_US')
>>> txt
u'124'

Changing other parameters such as the precision may also alter the results of the number formatting functions.  Remember to test your code to make sure it behaves as desired.

Parsing Numbers

Babel can also parse numeric data in a locale-sensitive manner:

>>> from babel.numbers import parse_decimal, parse_number

Examples:

>>> parse_decimal('1,099.98', locale='en_US')
Decimal('1099.98')
>>> parse_decimal('1.099,98', locale='de')
Decimal('1099.98')
>>> parse_decimal('2,109,998', locale='de')
Traceback (most recent call last):
  ...
NumberFormatError: '2,109,998' is not a valid decimal number
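The reason the last call fails becomes clearer with a simplified sketch of locale-sensitive parsing. This is not Babel's code: the SYMBOLS table and the function name parse_decimal_sketch are hypothetical, and the two locales listed are hard-coded instead of being read from the CLDR.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical symbol table: (grouping separator, decimal separator).
SYMBOLS = {
    "en_US": (",", "."),
    "de": (".", ","),
}


def parse_decimal_sketch(text, locale):
    """Drop the grouping separator, normalize the decimal separator, then parse."""
    group, decimal_point = SYMBOLS[locale]
    normalized = text.replace(group, "").replace(decimal_point, ".")
    try:
        return Decimal(normalized)
    except InvalidOperation:
        raise ValueError("%r is not a valid decimal number" % text)


print(parse_decimal_sketch("1,099.98", "en_US"))  # 1099.98
print(parse_decimal_sketch("1.099,98", "de"))     # 1099.98

try:
    parse_decimal_sketch("2,109,998", "de")
except ValueError as exc:
    # In German the comma is the decimal separator, so two of them are invalid.
    print(exc)
```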

Note: as of version 2.8.0, the parse_number function has limited functionality. It can remove group symbols of certain locales from numeric strings, but may behave unexpectedly until its logic handles more encoding issues and other special cases.

Examples:

>>> parse_number('1,099', locale='en_US')
1099
>>> parse_number('1.099.024', locale='de')
1099024
>>> parse_number('123' + u'\xa0' + '4567', locale='ru')
1234567
>>> parse_number('123 4567', locale='ru')
Traceback (most recent call last):
  ...
NumberFormatError: '123 4567' is not a valid number

Working with Message Catalogs

Introduction

The gettext translation system enables you to mark any strings used in your application as subject to localization, by wrapping them in functions such as gettext(str) and ngettext(singular, plural, num). For brevity, the gettext function is often aliased to _(str), so you can write:

print(_("Hello"))

instead of just:

print("Hello")

to make the string “Hello” localizable.

Message catalogs are collections of translations for such localizable messages used in an application. They are commonly stored in PO (Portable Object) and MO (Machine Object) files, the formats of which are defined by the GNU gettext tools and the GNU translation project.

The general procedure for building message catalogs looks something like this:

  • use a tool (such as xgettext) to extract localizable strings from the code base and write them to a POT (PO Template) file.
  • make a copy of the POT file for a specific locale (for example, “en_US”) and start translating the messages
  • use a tool such as msgfmt to compile the locale PO file into a binary MO file
  • later, when code changes make it necessary to update the translations, you regenerate the POT file and merge the changes into the various locale-specific PO files, for example using msgmerge

Python provides the gettext module as part of the standard library, which enables applications to work with appropriately generated MO files.

As gettext provides a solid and well supported foundation for translating application messages, Babel does not reinvent the wheel, but rather reuses this infrastructure, and makes it easier to build message catalogs for Python applications.

Message Extraction

Babel provides functionality similar to that of the xgettext program, except that only extraction from Python source files is built-in, while support for other file formats can be added using a simple extension mechanism.

Unlike xgettext, which is usually invoked once for every file, the routines for message extraction in Babel operate on directories. While the per-file approach of xgettext works nicely with projects using a Makefile, Python projects rarely use make, and thus a different mechanism is needed for extracting messages from the heterogeneous collection of source files that many Python projects are composed of.

When message extraction is based on directories instead of individual files, there needs to be a way to configure which files should be treated in which manner. For example, while many projects may contain .html files, some of those files may be static HTML files that don’t contain localizable messages, while others may be Jinja2 templates, and still others may contain Genshi markup templates. Some projects may even mix HTML files for different template languages (for whatever reason). Therefore the way in which messages are extracted from source files cannot depend on the file extension alone, but needs to be controllable in a precise manner.

Babel accepts a configuration file to specify this mapping of files to extraction methods, which is described below.

Front-Ends

Babel provides two different front-ends to access its functionality for working with message catalogs:

  • A Command-Line Interface, and
  • Distutils/Setuptools Integration

Which one you choose depends on the nature of your project. For most modern Python projects, the distutils/setuptools integration is probably more convenient.

Extraction Method Mapping and Configuration

The mapping of extraction methods to files in Babel is done via a configuration file. This file maps extended glob patterns to the names of the extraction methods, and can also set various options for each pattern (which options are available depends on the specific extraction method).

For example, the following configuration adds extraction of messages from both Genshi markup templates and text templates:

# Extraction from Python source files

[python: **.py]

# Extraction from Genshi HTML and text templates

[genshi: **/templates/**.html]
ignore_tags = script,style
include_attrs = alt title summary

[genshi: **/templates/**.txt]
template_class = genshi.template:TextTemplate
encoding = ISO-8859-15

# Extraction from JavaScript files

[javascript: **.js]
extract_messages = $._, jQuery._

The configuration file syntax is based on the format commonly found in .INI files on Windows systems, and as supported by the ConfigParser module in the Python standard library. Section names (the strings enclosed in square brackets) specify both the name of the extraction method, and the extended glob pattern to specify the files that this extraction method should be used for, separated by a colon. The options in the sections are passed to the extraction method. Which options are available is specific to the extraction method used.
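As a rough illustration of how such a file breaks down into methods, patterns, and options, the following sketch reads a mapping with the standard library configparser module. It is not Babel's own parser; the function name read_mapping and the shape of the returned values are assumptions made for this example.

```python
from configparser import ConfigParser

MAPPING = """\
# Extraction from Python source files
[python: **.py]

# Extraction from Genshi HTML templates
[genshi: **/templates/**.html]
ignore_tags = script,style
include_attrs = alt title summary
"""


def read_mapping(text):
    """Return (method_map, options_map) parsed from INI-style mapping text."""
    parser = ConfigParser(interpolation=None)
    parser.read_string(text)
    method_map = []
    options_map = {}
    for section in parser.sections():
        # The section name holds "method: pattern", separated by a colon.
        method, pattern = [part.strip() for part in section.split(":", 1)]
        method_map.append((pattern, method))
        # Everything inside the section is handed to the extraction method.
        options_map[pattern] = dict(parser.items(section))
    return method_map, options_map


method_map, options_map = read_mapping(MAPPING)
for pattern, method in method_map:
    print(method, pattern, options_map[pattern])
```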

The extended glob patterns used in this configuration are similar to the glob patterns provided by most shells. A single asterisk (*) is a wildcard for any number of characters (except for the pathname component separator “/”), while a question mark (?) only matches a single character. In addition, two subsequent asterisk characters (**) can be used to make the wildcard match any directory level, so the pattern **.txt matches any file with the extension .txt in any directory.
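The matching rules just described can be approximated by translating a pattern into a regular expression. This is a simplified sketch under stated assumptions; glob_to_regex is a hypothetical helper, and Babel's real matcher may differ in edge cases.

```python
import re


def glob_to_regex(pattern):
    """Translate an extended glob pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            # "**/" matches any number of leading directories, including none.
            parts.append("(?:.+/)?")
            i += 3
        elif pattern.startswith("**", i):
            # "**" matches across directory separators.
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            # "*" matches any characters except the path separator.
            parts.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            # "?" matches exactly one character except the path separator.
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matches(pattern, path):
    return glob_to_regex(pattern).match(path) is not None


print(matches("**.txt", "docs/a/readme.txt"))                              # True
print(matches("*.txt", "docs/readme.txt"))                                 # False
print(matches("**/templates/**.html", "app/templates/admin/index.html"))  # True
```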

Lines that start with a # or ; character are ignored and can be used for comments. Empty lines are ignored, too.

NOTE:

If you’re performing message extraction using the command Babel provides for integration into setup.py scripts, you can also provide this configuration in a different way, namely as a keyword argument to the setup() function. See Distutils/Setuptools Integration for more information.

Default Extraction Methods

Babel comes with a few builtin extractors: python (which extracts messages from Python source files), javascript, and ignore (which extracts nothing).

The python extractor is by default mapped to the glob pattern **.py, meaning it’ll be applied to all files with the .py extension in any directory. If you specify your own mapping configuration, this default mapping is discarded, so you need to explicitly add it to your mapping (as shown in the example above).

Referencing Extraction Methods

To be able to use short extraction method names such as “genshi”, you need to have pkg_resources installed, and the package implementing that extraction method needs to have been installed with its meta data (the egg-info).

If this is not possible for some reason, you need to map the short names to fully qualified function names in an extractors section in the mapping configuration. For example:

# Some custom extraction method

[extractors]
custom = mypackage.module:extract_custom

[custom: **.ctm]
some_option = foo

Note that the builtin extraction methods python and ignore are available by default, even if pkg_resources is not installed. You should never need to explicitly define them in the [extractors] section.

Writing Extraction Methods

Adding new methods for extracting localizable messages is easy. First, you’ll need to implement a function that complies with the following interface:

def extract_xxx(fileobj, keywords, comment_tags, options):
    """Extract messages from XXX files.

    :param fileobj: the file-like object the messages should be extracted
                    from
    :param keywords: a list of keywords (i.e. function names) that should
                     be recognized as translation functions
    :param comment_tags: a list of translator tags to search for and
                         include in the results
    :param options: a dictionary of additional options (optional)
    :return: an iterator over ``(lineno, funcname, message, comments)``
             tuples
    :rtype: ``iterator``
    """
NOTE:

Any strings in the tuples produced by this function must be either unicode objects, or str objects using plain ASCII characters. That means that if sources contain strings using other encodings, it is the job of the extractor implementation to do the decoding to unicode objects.
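As an illustration of the interface, the sketch below implements an extractor for a made-up file format in which translatable strings appear as calls like _("text") and translator comments are lines starting with #. The format, the regular expression, and the "encoding" option are assumptions made for this example; escape sequences inside messages are not handled.

```python
import io
import re

# Hypothetical syntax: keyword("message") somewhere on a line.
CALL_RE = re.compile(r'(\w+)\(\s*"([^"]*)"\s*\)')


def extract_custom(fileobj, keywords, comment_tags, options):
    """Yield (lineno, funcname, message, comments) tuples from fileobj."""
    encoding = options.get("encoding", "utf-8")
    comments = []
    for lineno, raw in enumerate(fileobj, start=1):
        # Decode here so that only text strings leave the extractor.
        line = raw.decode(encoding).strip()
        if line.startswith("#"):
            text = line[1:].strip()
            # Start collecting at a tagged comment, then keep following lines.
            if comments or text.startswith(tuple(comment_tags)):
                comments.append(text)
            continue
        for funcname, message in CALL_RE.findall(line):
            if funcname in keywords:
                yield lineno, funcname, message, comments
        # Comments only apply to the line directly after them.
        comments = []


sample = io.BytesIO(
    b'# NOTE: shown in the main menu\n'
    b'_("Manual")\n'
    b'title = gettext("Caf\xc3\xa9")\n'
    b'log("not translated")\n'
)
results = list(extract_custom(sample, ["_", "gettext"], ["NOTE:"], {}))
for result in results:
    print(result)
```

The file object is read as bytes and decoded inside the extractor, in line with the note above that decoding is the extractor's job.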

Next, you should register that function as an entry point. This requires your setup.py script to use setuptools, and your package to be installed with the necessary metadata. If that’s taken care of, add something like the following to your setup.py script:

setup(...

    entry_points = """
    [babel.extractors]
    xxx = your.package:extract_xxx
    """,

That is, add your extraction method to the entry point group babel.extractors. The name of the entry point is the name that people will use to reference the extraction method, and the value is the module and the name of the function (separated by a colon) that implements the actual extraction.

NOTE:

As shown in Referencing Extraction Methods, declaring an entry point is not strictly required, as users can still reference the extraction function directly. But whenever possible, the entry point should be declared to make configuration more convenient.
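A reference of the form module:function, as used both in entry points and in the [extractors] section, can be resolved to a callable with a few lines of standard-library code. This is only a sketch of the idea; resolve_reference is a hypothetical helper, and json:dumps merely stands in for a real extraction function.

```python
from importlib import import_module


def resolve_reference(reference):
    """Resolve a "package.module:attribute" string to the object it names."""
    module_name, _, attr_path = reference.partition(":")
    obj = import_module(module_name)
    # Allow dotted attribute paths after the colon, e.g. "module:Class.method".
    for attr in attr_path.split("."):
        obj = getattr(obj, attr)
    return obj


func = resolve_reference("json:dumps")
print(func([1, 2, 3]))
```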

Translator Comments

First of all, what are comment tags? Comment tags are excerpts of text to search for in comments, and only in comments, that appear right before the Python gettext calls, as shown in the following example:

# NOTE: This is a comment about `Foo Bar`
_('Foo Bar')

The comment tag for the above example would be NOTE:, and the translator comment for that tag would be This is a comment about `Foo Bar`.

The resulting output in the catalog template would be something like:

#. This is a comment about `Foo Bar`
#: main.py:2
msgid "Foo Bar"
msgstr ""

Now, you might ask, why would I need that?

Consider this simple case: you have a menu item called “manual”. You know what it means, but a translator who sees it will wonder: did you mean

  1. a document or help manual, or
  2. a manual process?

This is the simplest case where a translation comment such as “The installation manual” helps to clarify the situation and makes a translator more productive.

NOTE:

Whether translator comments can be extracted depends on the extraction method in use. The Python extractor provided by Babel does implement this feature, but others may not.
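To show how a tagged comment ends up in the catalog template, the following sketch rebuilds the entry from the example earlier in this section. The function format_pot_entry is hypothetical and greatly simplified: it assumes the tag is stripped from the comment, and it does no escaping or line wrapping.

```python
def format_pot_entry(msgid, filename, lineno, comment, tag, strip_tag=True):
    """Build a minimal catalog template entry with one translator comment."""
    if strip_tag and comment.startswith(tag):
        comment = comment[len(tag):].strip()
    return "\n".join([
        "#. " + comment,                   # extracted translator comment
        "#: %s:%d" % (filename, lineno),   # location of the message
        'msgid "%s"' % msgid,              # no escaping in this sketch
        'msgstr ""',
    ])


entry = format_pot_entry(
    "Foo Bar", "main.py", 2, "NOTE: This is a comment about `Foo Bar`", "NOTE:"
)
print(entry)
```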

Command-Line Interface

Babel includes a command-line interface for working with message catalogs, similar to the various GNU gettext tools commonly available on Linux/Unix systems.

When properly installed, Babel provides a script called pybabel:

$ pybabel --help
Usage: pybabel command [options] [args]

Options:
  --version       show program's version number and exit
  -h, --help      show this help message and exit
  --list-locales  print all known locales and exit
  -v, --verbose   print as much as possible
  -q, --quiet     print as little as possible

commands:
  compile  compile message catalogs to MO files
  extract  extract messages from source files and generate a POT file
  init     create new message catalogs from a POT file
  update   update existing message catalogs from a POT file

The pybabel script provides a number of sub-commands that do the actual work. Those sub-commands are described below.

compile

The compile sub-command can be used to compile translation catalogs into binary MO files:

$ pybabel compile --help
Usage: pybabel compile [options]

compile message catalogs to MO files

Options:
  -h, --help            show this help message and exit
  -D DOMAIN, --domain=DOMAIN
                        domains of PO files (space separated list, default
                        'messages')
  -d DIRECTORY, --directory=DIRECTORY
                        path to base directory containing the catalogs
  -i INPUT_FILE, --input-file=INPUT_FILE
                        name of the input file
  -o OUTPUT_FILE, --output-file=OUTPUT_FILE
                        name of the output file (default
                        '<output_dir>/<locale>/LC_MESSAGES/<domain>.mo')
  -l LOCALE, --locale=LOCALE
                        locale of the catalog to compile
  -f, --use-fuzzy       also include fuzzy translations
  --statistics          print statistics about translations

If directory is specified, but output-file is not, the default filename of the output file will be:

<directory>/<locale>/LC_MESSAGES/<domain>.mo

If neither the input_file nor the locale option is set, this command looks for all catalog files in the base directory that match the given domain, and compiles each of them to MO files in the same directory.
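The directory layout that this lookup relies on can be sketched as follows. The code is an assumption-laden illustration, not the command's implementation: find_catalogs is a hypothetical helper that lists every <directory>/<locale>/LC_MESSAGES/<domain>.po file together with the MO path next to it, using a temporary directory as the base.

```python
import os
import tempfile


def find_catalogs(directory, domain="messages"):
    """Return (locale, po_path, mo_path) for every catalog of the domain."""
    found = []
    for locale in sorted(os.listdir(directory)):
        po_path = os.path.join(directory, locale, "LC_MESSAGES", domain + ".po")
        if os.path.isfile(po_path):
            # The MO file is written next to the PO file.
            mo_path = os.path.splitext(po_path)[0] + ".mo"
            found.append((locale, po_path, mo_path))
    return found


with tempfile.TemporaryDirectory() as base:
    # Create two empty catalogs to search for.
    for locale in ("fr", "pt_BR"):
        os.makedirs(os.path.join(base, locale, "LC_MESSAGES"))
        open(os.path.join(base, locale, "LC_MESSAGES", "messages.po"), "w").close()
    catalogs = find_catalogs(base)
    locales = [locale for locale, _, _ in catalogs]

print(locales)
```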

extract

The extract sub-command can be used to extract localizable messages from a collection of source files:

$ pybabel extract --help
Usage: pybabel extract [options] <input-paths>

extract messages from source files and generate a POT file

Options:
  -h, --help            show this help message and exit
  --charset=CHARSET     charset to use in the output file (default "utf-8")
  -k KEYWORDS, --keywords=KEYWORDS, --keyword=KEYWORDS
                        space-separated list of keywords to look for in
                        addition to the defaults (may be repeated multiple
                        times)
  --no-default-keywords
                        do not include the default keywords
  -F MAPPING_FILE, --mapping-file=MAPPING_FILE, --mapping=MAPPING_FILE
                        path to the mapping configuration file
  --no-location         do not include location comments with filename and
                        line number
  --add-location=ADD_LOCATION
                        location lines format. If it is not given or "full",
                        it generates the lines with both file name and line
                        number. If it is "file", the line number part is
                        omitted. If it is "never", it completely suppresses
                        the lines (same as --no-location).
  --omit-header         do not include msgid "" entry in header
  -o OUTPUT_FILE, --output-file=OUTPUT_FILE, --output=OUTPUT_FILE
                        name of the output file
  -w WIDTH, --width=WIDTH
                        set output line width (default 76)
  --no-wrap             do not break long message lines, longer than the
                        output line width, into several lines
  --sort-output         generate sorted output (default False)
  --sort-by-file        sort output by file location (default False)
  --msgid-bugs-address=MSGID_BUGS_ADDRESS
                        set report address for msgid
  --copyright-holder=COPYRIGHT_HOLDER
                        set copyright holder in output
  --project=PROJECT     set project name in output
  --version=VERSION     set project version in output
  -c ADD_COMMENTS, --add-comments=ADD_COMMENTS
                        place comment block with TAG (or those preceding
                        keyword lines) in output file. Separate multiple TAGs
                        with commas(,)
  -s, --strip-comments, --strip-comment-tags
                        strip the comment TAGs from the comments.
  --input-dirs=INPUT_DIRS
                        alias for input-paths (does allow files as well as
                        directories).

init

The init sub-command creates a new translations catalog based on a PO template file:

$ pybabel init --help
Usage: pybabel init [options]

create new message catalogs from a POT file

Options:
  -h, --help            show this help message and exit
  -D DOMAIN, --domain=DOMAIN
                        domain of PO file (default 'messages')
  -i INPUT_FILE, --input-file=INPUT_FILE
                        name of the input file
  -d OUTPUT_DIR, --output-dir=OUTPUT_DIR
                        path to output directory
  -o OUTPUT_FILE, --output-file=OUTPUT_FILE
                        name of the output file (default
                        '<output_dir>/<locale>/LC_MESSAGES/<domain>.po')
  -l LOCALE, --locale=LOCALE
                        locale for the new localized catalog
  -w WIDTH, --width=WIDTH
                        set output line width (default 76)
  --no-wrap             do not break long message lines, longer than the
                        output line width, into several lines

update

The update sub-command updates an existing translations catalog based on a PO template file:

$ pybabel update --help
Usage: pybabel update [options]

update existing message catalogs from a POT file

Options:
  -h, --help            show this help message and exit
  -D DOMAIN, --domain=DOMAIN
                        domain of PO file (default 'messages')
  -i INPUT_FILE, --input-file=INPUT_FILE
                        name of the input file
  -d OUTPUT_DIR, --output-dir=OUTPUT_DIR
                        path to base directory containing the catalogs
  -o OUTPUT_FILE, --output-file=OUTPUT_FILE
                        name of the output file (default
                        '<output_dir>/<locale>/LC_MESSAGES/<domain>.po')
  --omit-header         do not include msgid "" entry in header
  -l LOCALE, --locale=LOCALE
                        locale of the catalog to compile
  -w WIDTH, --width=WIDTH
                        set output line width (default 76)
  --no-wrap             do not break long message lines, longer than the
                        output line width, into several lines
  --ignore-obsolete     whether to omit obsolete messages from the output
  --init-missing        if any output files are missing, initialize them first
  -N, --no-fuzzy-matching
                        do not use fuzzy matching
  --update-header-comment
                        update target header comment
  --previous            keep previous msgids of translated messages

If output-dir is specified, but output-file is not, the default filename of the output file will be:

<output_dir>/<locale>/LC_MESSAGES/<domain>.po

If neither the output_file nor the locale option is set, this command looks for all catalog files in the base directory that match the given domain, and updates each of them.

Distutils/Setuptools Integration

Babel provides commands for integration into setup.py scripts, based on either the distutils package that is part of the Python standard library, or the third-party setuptools package.

These commands are available by default when Babel has been properly installed, and setup.py is using setuptools. For projects that use plain old distutils, the commands need to be registered explicitly, for example:

from distutils.core import setup
from babel.messages import frontend as babel

setup(
    ...
    cmdclass = {'compile_catalog': babel.compile_catalog,
                'extract_messages': babel.extract_messages,
                'init_catalog': babel.init_catalog,
                'update_catalog': babel.update_catalog}
)

compile_catalog

The compile_catalog command is similar to the GNU msgfmt tool, in that it takes a message catalog from a PO file and compiles it to a binary MO file.

If the command has been correctly installed or registered, a project’s setup.py script should allow you to use the command:

$ ./setup.py compile_catalog --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'compile_catalog' command:
   ...

Running the command will produce a binary MO file:

$ ./setup.py compile_catalog --directory foobar/locale --locale pt_BR
running compile_catalog
compiling catalog to foobar/locale/pt_BR/LC_MESSAGES/messages.mo

Options

The compile_catalog command accepts the following options:

Option              Description
--domain            domain of the PO file (defaults to lower-cased project name)
--directory (-d)    name of the base directory
--input-file (-i)   name of the input file
--output-file (-o)  name of the output file
--locale (-l)       locale for the new localized string
--use-fuzzy (-f)    also include “fuzzy” translations
--statistics        print statistics about translations

If directory is specified, but output-file is not, the default filename of the output file will be:

<directory>/<locale>/LC_MESSAGES/<domain>.mo

If neither the input_file nor the locale option is set, this command looks for all catalog files in the base directory that match the given domain, and compiles each of them to MO files in the same directory.

These options can either be specified on the command-line, or in the setup.cfg file.

extract_messages

The extract_messages command is comparable to the GNU xgettext program: it can extract localizable messages from a variety of different source files, and generate a PO (portable object) template file from the collected messages.

If the command has been correctly installed or registered, a project’s setup.py script should allow you to use the command:

$ ./setup.py extract_messages --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'extract_messages' command:
   ...

Running the command will produce a PO template file:

$ ./setup.py extract_messages --output-file foobar/locale/messages.pot
running extract_messages
extracting messages from foobar/__init__.py
extracting messages from foobar/core.py
...
writing PO template file to foobar/locale/messages.pot

Method Mapping

The mapping of file patterns to extraction methods (and options) can be specified using a configuration file that is pointed to using the --mapping-file option shown above. Alternatively, you can configure the mapping directly in setup.py using a keyword argument to the setup() function:

setup(...

    message_extractors = {
        'foobar': [
            ('**.py',                'python', None),
            ('**/templates/**.html', 'genshi', None),
            ('**/templates/**.txt',  'genshi', {
                'template_class': 'genshi.template:TextTemplate'
            })
        ],
    },

    ...
)

Options

The extract_messages command accepts the following options:

Option                 Description
--charset              charset to use in the output file
--keywords (-k)        space-separated list of keywords to look for in addition to the defaults
--no-default-keywords  do not include the default keywords
--mapping-file (-F)    path to the mapping configuration file
--no-location          do not include location comments with filename and line number
--omit-header          do not include msgid “” entry in header
--output-file (-o)     name of the output file
--width (-w)           set output line width (default 76)
--no-wrap              do not break long message lines, longer than the output line width, into several lines
--input-dirs           directories that should be scanned for messages
--sort-output          generate sorted output (default False)
--sort-by-file         sort output by file location (default False)
--msgid-bugs-address   set email address for message bug reports
--copyright-holder     set copyright holder in output
--add-comments (-c)    place comment block with TAG (or those preceding keyword lines) in output file. Separate multiple TAGs with commas(,)

These options can either be specified on the command-line, or in the setup.cfg file. In the latter case, the options above become entries of the section [extract_messages], and the option names are changed to use underscore characters instead of dashes, for example:

[extract_messages]
keywords = _ gettext ngettext
mapping_file = mapping.cfg
width = 80

This would be equivalent to invoking the command from the command-line as follows:

$ ./setup.py extract_messages -k _ -k gettext -k ngettext -F mapping.cfg -w 80

Any path names are interpreted relative to the location of the setup.py file. For boolean options, use “true” or “false” values.
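The naming rule for setup.cfg entries (underscores in the file, dashes on the command line) can be illustrated with a short sketch. It reads the [extract_messages] section shown above with configparser and prints the long-option form of each entry; section_to_argv is a hypothetical helper and is not part of Babel or setuptools.

```python
from configparser import ConfigParser

SETUP_CFG = """
[extract_messages]
keywords = _ gettext ngettext
mapping_file = mapping.cfg
width = 80
"""


def section_to_argv(text, command):
    """Turn the options of one setup.cfg section into long command-line options."""
    parser = ConfigParser()
    parser.read_string(text)
    argv = [command]
    for name, value in parser.items(command):
        # setup.cfg uses underscores where the command line uses dashes.
        argv.append("--%s=%s" % (name.replace("_", "-"), value))
    return argv


argv = section_to_argv(SETUP_CFG, "extract_messages")
print(argv)
```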

init_catalog

The init_catalog command is basically equivalent to the GNU msginit program: it creates a new translation catalog based on a PO template file (POT).

If the command has been correctly installed or registered, a project’s setup.py script should allow you to use the command:

$ ./setup.py init_catalog --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'init_catalog' command:
  ...

Running the command will produce a PO file:

$ ./setup.py init_catalog -l fr -i foobar/locales/messages.pot \
                         -o foobar/locales/fr/messages.po
running init_catalog
creating catalog 'foobar/locales/fr/messages.po' based on 'foobar/locales/messages.pot'

Options

The init_catalog command accepts the following options:

Option              Description
--domain            domain of the PO file (defaults to lower-cased project name)
--input-file (-i)   name of the input file
--output-dir (-d)   name of the output directory
--output-file (-o)  name of the output file
--locale            locale for the new localized string

If output-dir is specified, but output-file is not, the default filename of the output file will be:

<output_dir>/<locale>/LC_MESSAGES/<domain>.po

These options can either be specified on the command-line, or in the setup.cfg file.

update_catalog

The update_catalog command is basically equivalent to the GNU msgmerge program: it updates an existing translations catalog based on a PO template file (POT).

If the command has been correctly installed or registered, a project’s setup.py script should allow you to use the command:

$ ./setup.py update_catalog --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'update_catalog' command:
  ...

Running the command will update a PO file:

$ ./setup.py update_catalog -l fr -i foobar/locales/messages.pot \
                            -o foobar/locales/fr/messages.po
running update_catalog
updating catalog 'foobar/locales/fr/messages.po' based on 'foobar/locales/messages.pot'

Options

The update_catalog command accepts the following options:

Option                    Description
--domain                  domain of the PO file (defaults to lower-cased project name)
--input-file (-i)         name of the input file
--output-dir (-d)         name of the output directory
--output-file (-o)        name of the output file
--locale                  locale for the new localized string
--ignore-obsolete         do not include obsolete messages in the output
--no-fuzzy-matching (-N)  do not use fuzzy matching
--previous                keep previous msgids of translated messages

If output-dir is specified, but output-file is not, the default filename of the output file will be:

<output_dir>/<locale>/LC_MESSAGES/<domain>.po

If neither the output_file nor the locale option is set, this command looks for all catalog files in the base directory that match the given domain, and updates each of them.

These options can either be specified on the command-line, or in the setup.cfg file.

Support Classes and Functions

The babel.support module contains a number of classes and functions that can help with integrating Babel, and internationalization in general, into your application or framework. The code in this module is not used by Babel itself, but instead is provided to address common requirements of applications that should handle internationalization.

Lazy Evaluation

One such requirement is lazy evaluation of translations. Many web-based applications define some localizable message at the module level, or in general at some level where the locale of the remote user is not yet known. For such cases, web frameworks generally provide a “lazy” variant of the gettext functions, which basically translates the message not when the gettext function is invoked, but when the string is accessed in some manner.
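The idea of lazy evaluation can be sketched in a few lines of plain Python. The LazyString class, the CATALOGS dictionary, and the translate function below are all hypothetical stand-ins written for this illustration; they are not the helpers that babel.support provides.

```python
class LazyString:
    """Defer a function call until the value is needed as a string."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args

    def __str__(self):
        # The translation happens here, not when the object is created.
        return str(self._func(*self._args))


# Hypothetical per-request state and message catalogs.
state = {"locale": "en"}
CATALOGS = {"de": {"Hello": "Hallo"}}


def translate(message):
    """Look the message up in the catalog of the currently active locale."""
    return CATALOGS.get(state["locale"], {}).get(message, message)


greeting = LazyString(translate, "Hello")   # defined before the locale is known

before = str(greeting)
state["locale"] = "de"                      # locale becomes known later
after = str(greeting)
print(before, after)
```

Because the lookup runs each time the object is converted to a string, the same module-level object yields a different result once the active locale changes.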

Extended Translations Class

Many web-based applications are composed of a variety of different components (possibly using some kind of plugin system), and some of those components may provide their own message catalogs that need to be integrated into the larger system.

To support this usage pattern, Babel provides a Translations class that is derived from the GNUTranslations class in the gettext module. This class adds a merge() method that takes another Translations instance, and merges the content of the latter into the main catalog:

translations = Translations.load('main')
translations.merge(Translations.load('plugin1'))
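The merging behaviour can be imitated with the standard library for illustration. The sketch below is not Babel's Translations class: DictTranslations is a hypothetical subclass of gettext.NullTranslations backed by a plain dictionary, and it assumes that entries from the merged catalog replace entries with the same message ID.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy translations object backed by a dictionary of msgid -> msgstr."""

    def __init__(self, catalog=None):
        super().__init__()
        self._catalog = dict(catalog or {})

    def gettext(self, message):
        # Fall back to the untranslated message when there is no entry.
        return self._catalog.get(message, message)

    def merge(self, other):
        """Copy the entries of another catalog into this one."""
        self._catalog.update(other._catalog)
        return self


main = DictTranslations({"Save": "Speichern"})
plugin = DictTranslations({"Export": "Exportieren"})
main.merge(plugin)

print(main.gettext("Save"), main.gettext("Export"), main.gettext("Quit"))
```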

API Reference

The API reference lists the full public API that Babel provides.

API Reference

This part of the documentation contains the full API reference of the public API of Babel.

Core Functionality

The core API provides the basic core functionality.  Primarily it provides the Locale object and ways to create it.  This object encapsulates a locale and exposes all the data it contains.

All the core functionality is also directly importable from the babel module for convenience.

Basic Interface

class babel.core.Locale(language, territory=None, script=None, variant=None)

Representation of a specific locale.

>>> locale = Locale('en', 'US')
>>> repr(locale)
"Locale('en', territory='US')"
>>> locale.display_name
u'English (United States)'

A Locale object can also be instantiated from a raw locale string:

>>> locale = Locale.parse('en-US', sep='-')
>>> repr(locale)
"Locale('en', territory='US')"

Locale objects provide access to a collection of locale data, such as territory and language names, number and date format patterns, and more:

>>> locale.number_symbols['decimal']
u'.'

If a locale is requested for which no locale data is available, an UnknownLocaleError is raised:

>>> Locale.parse('en_XX')
Traceback (most recent call last):
    ...
UnknownLocaleError: unknown locale 'en_XX'

For more information see RFC 3066.

property character_order

The text direction for the language.

>>> Locale('de', 'DE').character_order
'left-to-right'
>>> Locale('ar', 'SA').character_order
'right-to-left'
property currencies

Mapping of currency codes to translated currency names.  This only returns the generic form of the currency name, not the count specific one.  If an actual number is requested use the babel.numbers.get_currency_name() function.

>>> Locale('en').currencies['COP']
u'Colombian Peso'
>>> Locale('de', 'DE').currencies['COP']
u'Kolumbianischer Peso'
property currency_formats

Locale patterns for currency number formatting.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').currency_formats['standard']
<NumberPattern u'\xa4#,##0.00'>
>>> Locale('en', 'US').currency_formats['accounting']
<NumberPattern u'\xa4#,##0.00;(\xa4#,##0.00)'>
property currency_symbols

Mapping of currency codes to symbols.

>>> Locale('en', 'US').currency_symbols['USD']
u'$'
>>> Locale('es', 'CO').currency_symbols['USD']
u'US$'
property date_formats

Locale patterns for date formatting.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').date_formats['short']
<DateTimePattern u'M/d/yy'>
>>> Locale('fr', 'FR').date_formats['long']
<DateTimePattern u'd MMMM y'>
property datetime_formats

Locale patterns for datetime formatting.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en').datetime_formats['full']
u"{1} 'at' {0}"
>>> Locale('th').datetime_formats['medium']
u'{1} {0}'
property datetime_skeletons

Locale patterns for formatting parts of a datetime.

>>> Locale('en').datetime_skeletons['MEd']
<DateTimePattern u'E, M/d'>
>>> Locale('fr').datetime_skeletons['MEd']
<DateTimePattern u'E dd/MM'>
>>> Locale('fr').datetime_skeletons['H']
<DateTimePattern u"HH 'h'">
property day_period_rules

Day period rules for the locale.  Used by get_period_id.

property day_periods

Locale display names for various day periods (not necessarily only AM/PM).

These are not meant to be used without the relevant day_period_rules.

property days

Locale display names for weekdays.

>>> Locale('de', 'DE').days['format']['wide'][3]
u'Donnerstag'
property decimal_formats

Locale patterns for decimal number formatting.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').decimal_formats[None]
<NumberPattern u'#,##0.###'>
classmethod default(category=None, aliases={'ar': 'ar_SY', 'bg': 'bg_BG', 'bs': 'bs_BA', 'ca': 'ca_ES', 'cs': 'cs_CZ', 'da': 'da_DK', 'de': 'de_DE', 'el': 'el_GR', 'en': 'en_US', 'es': 'es_ES', 'et': 'et_EE', 'fa': 'fa_IR', 'fi': 'fi_FI', 'fr': 'fr_FR', 'gl': 'gl_ES', 'he': 'he_IL', 'hu': 'hu_HU', 'id': 'id_ID', 'is': 'is_IS', 'it': 'it_IT', 'ja': 'ja_JP', 'km': 'km_KH', 'ko': 'ko_KR', 'lt': 'lt_LT', 'lv': 'lv_LV', 'mk': 'mk_MK', 'nl': 'nl_NL', 'nn': 'nn_NO', 'no': 'nb_NO', 'pl': 'pl_PL', 'pt': 'pt_PT', 'ro': 'ro_RO', 'ru': 'ru_RU', 'sk': 'sk_SK', 'sl': 'sl_SI', 'sv': 'sv_SE', 'th': 'th_TH', 'tr': 'tr_TR', 'uk': 'uk_UA'})

Return the system default locale for the specified category.

>>> for name in ['LANGUAGE', 'LC_ALL', 'LC_CTYPE', 'LC_MESSAGES']:
...     os.environ[name] = ''
>>> os.environ['LANG'] = 'fr_FR.UTF-8'
>>> Locale.default('LC_MESSAGES')
Locale('fr', territory='FR')

The following fallbacks to the variable are always considered:

  • LANGUAGE
  • LC_ALL
  • LC_CTYPE
  • LANG
Parameters
  • category – one of the LC_XXX environment variable names
  • aliases – a dictionary of aliases for locale identifiers
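The lookup order described for default() can be sketched as follows. This is a simplified illustration rather than Babel's code: default_locale_name is a hypothetical function that takes the environment as a dictionary, returns a plain identifier string instead of a Locale, and ignores aliases and modifiers.

```python
def default_locale_name(environ, category=None):
    """Return the first usable locale identifier from the environment."""
    for name in (category, "LANGUAGE", "LC_ALL", "LC_CTYPE", "LANG"):
        if not name:
            continue
        value = environ.get(name)
        if value:
            if name == "LANGUAGE":
                # LANGUAGE may hold a colon-separated list of locales.
                value = value.split(":")[0]
            # Drop an encoding suffix such as ".UTF-8".
            return value.split(".")[0]
    return None


environ = {
    "LANGUAGE": "", "LC_ALL": "", "LC_CTYPE": "", "LC_MESSAGES": "",
    "LANG": "fr_FR.UTF-8",
}
name = default_locale_name(environ, "LC_MESSAGES")
print(name)
```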
property display_name

The localized display name of the locale.

>>> Locale('en').display_name
u'English'
>>> Locale('en', 'US').display_name
u'English (United States)'
>>> Locale('sv').display_name
u'svenska'
Type

unicode

property english_name

The English display name of the locale.

>>> Locale('de').english_name
u'German'
>>> Locale('de', 'DE').english_name
u'German (Germany)'
Type

unicode

property eras

Locale display names for eras.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').eras['wide'][1]
u'Anno Domini'
>>> Locale('en', 'US').eras['abbreviated'][0]
u'BC'
property first_week_day

The first day of a week, with 0 being Monday.

>>> Locale('de', 'DE').first_week_day
0
>>> Locale('en', 'US').first_week_day
6
get_display_name(locale=None)

Return the display name of the locale using the given locale.

The display name will include the language, territory, script, and variant, if those are specified.

>>> Locale('zh', 'CN', script='Hans').get_display_name('en')
u'Chinese (Simplified, China)'
Parameters

locale – the locale to use

get_language_name(locale=None)

Return the language of this locale in the given locale.

>>> Locale('zh', 'CN', script='Hans').get_language_name('de')
u'Chinesisch'

New in version 1.0.

Parameters

locale – the locale to use

get_script_name(locale=None)

Return the script name in the given locale.

get_territory_name(locale=None)

Return the territory name in the given locale.

property interval_formats

Locale patterns for interval formatting.

NOTE:

The format of the value returned may change between Babel versions.

How to format date intervals in Finnish when the day is the smallest changing component:

>>> Locale('fi_FI').interval_formats['MEd']['d']
[u'E d. – ', u'E d.M.']
SEE ALSO:

The primary API to use this data is babel.dates.format_interval().

Return type

dict[str, dict[str, list[str]]]

language

the language code

property language_name

The localized language name of the locale.

>>> Locale('en', 'US').language_name
u'English'
property languages

Mapping of language codes to translated language names.

>>> Locale('de', 'DE').languages['ja']
u'Japanisch'

See ISO 639 for more information.

property list_patterns

Patterns for generating lists

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en').list_patterns['standard']['start']
u'{0}, {1}'
>>> Locale('en').list_patterns['standard']['end']
u'{0}, and {1}'
>>> Locale('en_GB').list_patterns['standard']['end']
u'{0} and {1}'
property measurement_systems

Localized names for various measurement systems.

>>> Locale('fr', 'FR').measurement_systems['US']
u'am\xe9ricain'
>>> Locale('en', 'US').measurement_systems['US']
u'US'
property meta_zones

Locale display names for meta time zones.

Meta time zones are basically groups of different Olson time zones that have the same GMT offset and daylight savings time.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').meta_zones['Europe_Central']['long']['daylight']
u'Central European Summer Time'

New in version 0.9.

property min_week_days

The minimum number of days in a week so that the week is counted as the first week of a year or month.

>>> Locale('de', 'DE').min_week_days
4
property months

Locale display names for months.

>>> Locale('de', 'DE').months['format']['wide'][10]
u'Oktober'
classmethod negotiate(preferred, available, sep='_', aliases={'ar': 'ar_SY', 'bg': 'bg_BG', 'bs': 'bs_BA', 'ca': 'ca_ES', 'cs': 'cs_CZ', 'da': 'da_DK', 'de': 'de_DE', 'el': 'el_GR', 'en': 'en_US', 'es': 'es_ES', 'et': 'et_EE', 'fa': 'fa_IR', 'fi': 'fi_FI', 'fr': 'fr_FR', 'gl': 'gl_ES', 'he': 'he_IL', 'hu': 'hu_HU', 'id': 'id_ID', 'is': 'is_IS', 'it': 'it_IT', 'ja': 'ja_JP', 'km': 'km_KH', 'ko': 'ko_KR', 'lt': 'lt_LT', 'lv': 'lv_LV', 'mk': 'mk_MK', 'nl': 'nl_NL', 'nn': 'nn_NO', 'no': 'nb_NO', 'pl': 'pl_PL', 'pt': 'pt_PT', 'ro': 'ro_RO', 'ru': 'ru_RU', 'sk': 'sk_SK', 'sl': 'sl_SI', 'sv': 'sv_SE', 'th': 'th_TH', 'tr': 'tr_TR', 'uk': 'uk_UA'})

Find the best match between available and requested locale strings.

>>> Locale.negotiate(['de_DE', 'en_US'], ['de_DE', 'de_AT'])
Locale('de', territory='DE')
>>> Locale.negotiate(['de_DE', 'en_US'], ['en', 'de'])
Locale('de')
>>> Locale.negotiate(['de_DE', 'de'], ['en_US'])

You can specify the character used in the locale identifiers to separate the different components. This separator is applied to both lists. Also, case is ignored in the comparison:

>>> Locale.negotiate(['de-DE', 'de'], ['en-us', 'de-de'], sep='-')
Locale('de', territory='DE')
Parameters
  • preferred – the list of locale identifiers preferred by the user
  • available – the list of locale identifiers available
  • aliases – a dictionary of aliases for locale identifiers
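The matching rules shown in the examples for negotiate() can be approximated in plain Python. The sketch below is an assumption about the general approach, not Babel's implementation: the function negotiate is hypothetical, works on identifier strings only, returns a string rather than a Locale, and leaves out aliases.

```python
def negotiate(preferred, available, sep="_"):
    """Return the best preferred identifier that is available, or None."""
    available_lower = [identifier.lower() for identifier in available if identifier]
    for identifier in preferred:
        # Exact match, ignoring case.
        if identifier.lower() in available_lower:
            return identifier
        # Otherwise fall back to the bare language part, e.g. "de" for "de_DE".
        language = identifier.split(sep)[0]
        if language != identifier and language.lower() in available_lower:
            return language
    return None


print(negotiate(["de_DE", "en_US"], ["de_DE", "de_AT"]))
print(negotiate(["de_DE", "en_US"], ["en", "de"]))
print(negotiate(["de_DE", "de"], ["en_US"]))
print(negotiate(["de-DE", "de"], ["en-us", "de-de"], sep="-"))
```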
property number_symbols

Symbols used in number formatting.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('fr', 'FR').number_symbols['decimal']
u','
property ordinal_form

Ordinal plural rules for the locale.

>>> Locale('en').ordinal_form(1)
'one'
>>> Locale('en').ordinal_form(2)
'two'
>>> Locale('en').ordinal_form(3)
'few'
>>> Locale('fr').ordinal_form(2)
'other'
>>> Locale('ru').ordinal_form(100)
'other'
classmethod parse(identifier, sep='_', resolve_likely_subtags=True)

Create a Locale instance for the given locale identifier.

>>> l = Locale.parse('de-DE', sep='-')
>>> l.display_name
u'Deutsch (Deutschland)'

If the identifier parameter is not a string, but actually a Locale object, that object is returned:

>>> Locale.parse(l)
Locale('de', territory='DE')

By default, this method also resolves likely subtags. This is useful, for instance, for figuring out the most likely locale for a territory. To do so, use 'und' as the language tag:

>>> Locale.parse('und_AT')
Locale('de', territory='AT')
Parameters
  • identifier – the locale identifier string
  • sep – optional component separator
  • resolve_likely_subtags – if true, a locale that does not exist as given has its likely subtags resolved. For instance, zh_TW by itself is not a locale that exists, but Babel can automatically expand it to the full form zh_hant_TW. This expansion takes place only if the locale does not exist otherwise; en, for instance, exists by itself and is not expanded.
Raises
  • ValueError – if the string does not appear to be a valid locale identifier
  • UnknownLocaleError – if no locale data is available for the requested locale
property percent_formats

Locale patterns for percent number formatting.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').percent_formats[None]
<NumberPattern u'#,##0%'>
property periods

Locale display names for day periods (AM/PM).

>>> Locale('en', 'US').periods['am']
u'AM'
property plural_form

Plural rules for the locale.

>>> Locale('en').plural_form(1)
'one'
>>> Locale('en').plural_form(0)
'other'
>>> Locale('fr').plural_form(0)
'one'
>>> Locale('ru').plural_form(100)
'many'
property quarters

Locale display names for quarters.

>>> Locale('de', 'DE').quarters['format']['wide'][1]
u'1. Quartal'
property scientific_formats

Locale patterns for scientific number formatting.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').scientific_formats[None]
<NumberPattern u'#E0'>
script

the script code

property script_name

The localized script name of the locale if available.

>>> Locale('sr', 'ME', script='Latn').script_name
u'latinica'
property scripts

Mapping of script codes to translated script names.

>>> Locale('en', 'US').scripts['Hira']
u'Hiragana'

See ISO 15924 for more information.

property territories

Mapping of territory codes to translated territory names.

>>> Locale('es', 'CO').territories['DE']
u'Alemania'

See ISO 3166 for more information.

territory

the territory (country or region) code

property territory_name

The localized territory name of the locale if available.

>>> Locale('de', 'DE').territory_name
u'Deutschland'
property text_direction

The text direction for the language in CSS short-hand form.

>>> Locale('de', 'DE').text_direction
'ltr'
>>> Locale('ar', 'SA').text_direction
'rtl'
property time_formats

Locale patterns for time formatting.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').time_formats['short']
<DateTimePattern u'h:mm a'>
>>> Locale('fr', 'FR').time_formats['long']
<DateTimePattern u'HH:mm:ss z'>
property time_zones

Locale display names for time zones.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').time_zones['Europe/London']['long']['daylight']
u'British Summer Time'
>>> Locale('en', 'US').time_zones['America/St_Johns']['city']
u'St. John’s'
property unit_display_names

Display names for units of measurement.

SEE ALSO:

You may want to use babel.units.get_unit_name() instead.

NOTE:

The format of the value returned may change between Babel versions.

variant

the variant code

property variants

Mapping of variant codes to translated variant names.

>>> Locale('de', 'DE').variants['1901']
u'Alte deutsche Rechtschreibung'
property weekend_end

The day the weekend ends, with 0 being Monday.

>>> Locale('de', 'DE').weekend_end
6
property weekend_start

The day the weekend starts, with 0 being Monday.

>>> Locale('de', 'DE').weekend_start
5
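
The weekend indices use the same convention as Python's date.weekday() (0 is Monday), so a weekend check can be sketched with the standard library alone. The helper name is_weekend is hypothetical and not part of Babel; the default bounds 5 and 6 are simply the de_DE values shown above and would normally be read from the Locale properties.

```python
from datetime import date

def is_weekend(day, weekend_start=5, weekend_end=6):
    """Hypothetical helper: True if `day` falls within the weekend range."""
    weekday = day.weekday()  # 0 is Monday, matching the convention above
    if weekend_start <= weekend_end:
        return weekend_start <= weekday <= weekend_end
    # The range wraps around the end of the week (e.g. start=6, end=0).
    return weekday >= weekend_start or weekday <= weekend_end

print(is_weekend(date(2007, 4, 1)))  # 1 April 2007 was a Sunday
print(is_weekend(date(2007, 4, 2)))  # the following Monday
```
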
property zone_formats

Patterns related to the formatting of time zones.

NOTE:

The format of the value returned may change between Babel versions.

>>> Locale('en', 'US').zone_formats['fallback']
u'%(1)s (%(0)s)'
>>> Locale('pt', 'BR').zone_formats['region']
u'Hor\xe1rio %s'

New in version 0.9.

babel.core.default_locale(category=None, aliases={'ar': 'ar_SY', 'bg': 'bg_BG', 'bs': 'bs_BA', 'ca': 'ca_ES', 'cs': 'cs_CZ', 'da': 'da_DK', 'de': 'de_DE', 'el': 'el_GR', 'en': 'en_US', 'es': 'es_ES', 'et': 'et_EE', 'fa': 'fa_IR', 'fi': 'fi_FI', 'fr': 'fr_FR', 'gl': 'gl_ES', 'he': 'he_IL', 'hu': 'hu_HU', 'id': 'id_ID', 'is': 'is_IS', 'it': 'it_IT', 'ja': 'ja_JP', 'km': 'km_KH', 'ko': 'ko_KR', 'lt': 'lt_LT', 'lv': 'lv_LV', 'mk': 'mk_MK', 'nl': 'nl_NL', 'nn': 'nn_NO', 'no': 'nb_NO', 'pl': 'pl_PL', 'pt': 'pt_PT', 'ro': 'ro_RO', 'ru': 'ru_RU', 'sk': 'sk_SK', 'sl': 'sl_SI', 'sv': 'sv_SE', 'th': 'th_TH', 'tr': 'tr_TR', 'uk': 'uk_UA'})

Returns the system default locale for a given category, based on environment variables.

>>> for name in ['LANGUAGE', 'LC_ALL', 'LC_CTYPE']:
...     os.environ[name] = ''
>>> os.environ['LANG'] = 'fr_FR.UTF-8'
>>> default_locale('LC_MESSAGES')
'fr_FR'

The “C” or “POSIX” pseudo-locales are treated as aliases for the “en_US_POSIX” locale:

>>> os.environ['LC_MESSAGES'] = 'POSIX'
>>> default_locale('LC_MESSAGES')
'en_US_POSIX'

In addition to the given category, the following environment variables are always considered as fallbacks:

  • LANGUAGE
  • LC_ALL
  • LC_CTYPE
  • LANG
Parameters
  • category – one of the LC_XXX environment variable names
  • aliases – a dictionary of aliases for locale identifiers
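
The lookup order described above can be sketched without Babel. This is a simplified illustration, not the library's implementation: the helper name first_locale_from_env is hypothetical, it takes the environment as a plain mapping so nothing global is modified, and it skips alias handling and identifier validation. Taking only the first entry of a colon-separated LANGUAGE value is an assumption of this sketch.

```python
import os

def first_locale_from_env(category=None, environ=None):
    """Hypothetical helper: first usable locale setting, or None."""
    environ = os.environ if environ is None else environ
    # The requested category is checked first, then the fixed fallbacks.
    for name in (category, 'LANGUAGE', 'LC_ALL', 'LC_CTYPE', 'LANG'):
        if not name:
            continue
        value = environ.get(name)
        if not value:
            continue  # unset or empty: try the next variable
        if name == 'LANGUAGE' and ':' in value:
            value = value.split(':')[0]  # assumption: use the first entry
        # Drop encoding ('.UTF-8') and modifier ('@euro') suffixes.
        value = value.split('.', 1)[0].split('@', 1)[0]
        if value in ('C', 'POSIX'):
            return 'en_US_POSIX'
        return value
    return None

env = {'LANGUAGE': '', 'LC_ALL': '', 'LC_CTYPE': '', 'LANG': 'fr_FR.UTF-8'}
print(first_locale_from_env('LC_MESSAGES', env))
```
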
babel.core.negotiate_locale(preferred, available, sep='_', aliases={'ar': 'ar_SY', 'bg': 'bg_BG', 'bs': 'bs_BA', 'ca': 'ca_ES', 'cs': 'cs_CZ', 'da': 'da_DK', 'de': 'de_DE', 'el': 'el_GR', 'en': 'en_US', 'es': 'es_ES', 'et': 'et_EE', 'fa': 'fa_IR', 'fi': 'fi_FI', 'fr': 'fr_FR', 'gl': 'gl_ES', 'he': 'he_IL', 'hu': 'hu_HU', 'id': 'id_ID', 'is': 'is_IS', 'it': 'it_IT', 'ja': 'ja_JP', 'km': 'km_KH', 'ko': 'ko_KR', 'lt': 'lt_LT', 'lv': 'lv_LV', 'mk': 'mk_MK', 'nl': 'nl_NL', 'nn': 'nn_NO', 'no': 'nb_NO', 'pl': 'pl_PL', 'pt': 'pt_PT', 'ro': 'ro_RO', 'ru': 'ru_RU', 'sk': 'sk_SK', 'sl': 'sl_SI', 'sv': 'sv_SE', 'th': 'th_TH', 'tr': 'tr_TR', 'uk': 'uk_UA'})

Find the best match between available and requested locale strings.

>>> negotiate_locale(['de_DE', 'en_US'], ['de_DE', 'de_AT'])
'de_DE'
>>> negotiate_locale(['de_DE', 'en_US'], ['en', 'de'])
'de'

Case is ignored by the algorithm; the result uses the case of the preferred locale identifier:

>>> negotiate_locale(['de_DE', 'en_US'], ['de_de', 'de_at'])
'de_DE'

Unfortunately, some web browsers by default do not include the territory in the locale identifier for many locales, and some don’t even allow the user to easily add the territory. So while you may prefer using qualified locale identifiers in your web application, they would not normally match the language-only locale sent by such browsers. To work around that, this function uses a default mapping of commonly used language-only locale identifiers to identifiers including the territory:

>>> negotiate_locale(['ja', 'en_US'], ['ja_JP', 'en_US'])
'ja_JP'

Some browsers even use an incorrect or outdated language code, such as “no” for Norwegian, where the correct locale identifier would actually be “nb_NO” (Bokmål) or “nn_NO” (Nynorsk). The aliases are intended to take care of such cases, too:

>>> negotiate_locale(['no', 'sv'], ['nb_NO', 'sv_SE'])
'nb_NO'

You can override this default mapping by passing a different aliases dictionary to this function, or you can bypass the behavior altogether by setting the aliases parameter to None.

Parameters
  • preferred – the list of locale strings preferred by the user
  • available – the list of locale strings available
  • sep – character that separates the different parts of the locale strings
  • aliases – a dictionary of aliases for locale identifiers
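
The matching rules described above (exact match ignoring case, then alias lookup, then language-only fallback) can be sketched in plain Python. This is a simplified illustration consistent with the examples in this entry, not Babel's source: negotiate_sketch and SAMPLE_ALIASES are hypothetical names, and the alias table is a small subset of the default mapping shown in the signature.

```python
# Subset of the default alias mapping, for illustration only.
SAMPLE_ALIASES = {'ja': 'ja_JP', 'no': 'nb_NO', 'sv': 'sv_SE'}

def negotiate_sketch(preferred, available, sep='_', aliases=SAMPLE_ALIASES):
    """Hypothetical helper: return the best match, or None."""
    available_lower = {a.lower() for a in available if a}
    for locale in preferred:
        lowered = locale.lower()
        # 1. Direct match, ignoring case; keep the preferred spelling.
        if lowered in available_lower:
            return locale
        # 2. Language-only identifier mapped to one with a territory.
        if aliases:
            alias = aliases.get(lowered)
            if alias:
                alias = alias.replace('_', sep)
                if alias.lower() in available_lower:
                    return alias
        # 3. Fall back from 'de_DE' to 'de' if only the language is offered.
        parts = locale.split(sep)
        if len(parts) > 1 and parts[0].lower() in available_lower:
            return parts[0]
    return None

print(negotiate_sketch(['no', 'sv'], ['nb_NO', 'sv_SE']))
print(negotiate_sketch(['no', 'sv'], ['nb_NO', 'sv_SE'], aliases=None))
```

The second call shows the effect of passing aliases=None: without the alias table, neither preferred identifier matches in this sketch.
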

Exceptions

exception babel.core.UnknownLocaleError(identifier)

Exception thrown when a locale is requested for which no locale data is available.

identifier

The identifier of the locale that could not be found.

Utility Functions

babel.core.get_global(key)

Return the dictionary for the given key in the global data.

The global data is stored in the babel/global.dat file and contains information independent of individual locales.

>>> get_global('zone_aliases')['UTC']
u'Etc/UTC'
>>> get_global('zone_territories')['Europe/Berlin']
u'DE'

The keys available are:

  • all_currencies
  • currency_fractions
  • language_aliases
  • likely_subtags
  • parent_exceptions
  • script_aliases
  • territory_aliases
  • territory_currencies
  • territory_languages
  • territory_zones
  • variant_aliases
  • windows_zone_mapping
  • zone_aliases
  • zone_territories
NOTE:

The internal structure of the data may change between versions.

New in version 0.9.

Parameters

key – the data key

babel.core.parse_locale(identifier, sep='_')

Parse a locale identifier into a tuple of the form (language, territory, script, variant).

>>> parse_locale('zh_CN')
('zh', 'CN', None, None)
>>> parse_locale('zh_Hans_CN')
('zh', 'CN', 'Hans', None)
>>> parse_locale('ca_es_valencia')
('ca', 'ES', None, 'VALENCIA')
>>> parse_locale('en_150')
('en', '150', None, None)
>>> parse_locale('en_us_posix')
('en', 'US', None, 'POSIX')

The default component separator is “_”, but a different separator can be specified using the sep parameter:

>>> parse_locale('zh-CN', sep='-')
('zh', 'CN', None, None)

If the identifier cannot be parsed into a locale, a ValueError exception is raised:

>>> parse_locale('not_a_LOCALE_String')
Traceback (most recent call last):
  ...
ValueError: 'not_a_LOCALE_String' is not a valid locale identifier

Encoding information and locale modifiers are removed from the identifier:

>>> parse_locale('it_IT@euro')
('it', 'IT', None, None)
>>> parse_locale('en_US.UTF-8')
('en', 'US', None, None)
>>> parse_locale('de_DE.iso885915@euro')
('de', 'DE', None, None)

See RFC 4646 for more information.

Parameters
  • identifier – the locale identifier string
  • sep – character that separates the different components of the locale identifier
Raises

ValueError – if the string does not appear to be a valid locale identifier

babel.core.get_locale_identifier(tup, sep='_')

The reverse of parse_locale().  It creates a locale identifier out of a (language, territory, script, variant) tuple.  Items can be set to None and trailing Nones can also be left out of the tuple.

>>> get_locale_identifier(('de', 'DE', None, '1999'))
'de_DE_1999'

New in version 1.0.

Parameters
  • tup – the tuple as returned by parse_locale().
  • sep – the separator for the identifier.
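
Note that the tuple order (language, territory, script, variant) differs from the order of the components in the identifier, where the script precedes the territory (compare the zh_Hans_CN example under parse_locale()). A minimal sketch of that reordering, using the hypothetical helper name build_identifier rather than Babel's own code:

```python
def build_identifier(parts, sep='_'):
    """Hypothetical helper: join a (language, territory, script, variant) tuple."""
    # Pad with None so that trailing items may be left out of the tuple.
    language, territory, script, variant = (tuple(parts) + (None,) * 4)[:4]
    # In the identifier, the script comes before the territory.
    ordered = (language, script, territory, variant)
    return sep.join(part for part in ordered if part)

print(build_identifier(('de', 'DE', None, '1999')))
print(build_identifier(('zh', 'CN', 'Hans')))
```
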

Date and Time

The date and time functionality provided by Babel lets you format standard Python datetime, date and time objects and work with timezones.

Date and Time Formatting

babel.dates.format_datetime(datetime=None, format='medium', tzinfo=None, locale=default_locale('LC_TIME'))

Return a date formatted according to the given pattern.

>>> dt = datetime(2007, 4, 1, 15, 30)
>>> format_datetime(dt, locale='en_US')
u'Apr 1, 2007, 3:30:00 PM'

For any pattern requiring the display of the time-zone, the third-party pytz package is needed to explicitly specify the time-zone:

>>> format_datetime(dt, 'full', tzinfo=get_timezone('Europe/Paris'),
...                 locale='fr_FR')
u'dimanche 1 avril 2007 \xe0 17:30:00 heure d\u2019\xe9t\xe9 d\u2019Europe centrale'
>>> format_datetime(dt, "yyyy.MM.dd G 'at' HH:mm:ss zzz",
...                 tzinfo=get_timezone('US/Eastern'), locale='en')
u'2007.04.01 AD at 11:30:00 EDT'
Parameters
  • datetime – the datetime object; if None, the current date and time is used
  • format – one of “full”, “long”, “medium”, or “short”, or a custom date/time pattern
  • tzinfo – the timezone to apply to the time for display
  • locale – a Locale object or a locale identifier
babel.dates.format_date(date=None, format='medium', locale=default_locale('LC_TIME'))

Return a date formatted according to the given pattern.

>>> d = date(2007, 4, 1)
>>> format_date(d, locale='en_US')
u'Apr 1, 2007'
>>> format_date(d, format='full', locale='de_DE')
u'Sonntag, 1. April 2007'

If you don’t want to use the locale default formats, you can specify a custom date pattern:

>>> format_date(d, "EEE, MMM d, ''yy", locale='en')
u"Sun, Apr 1, '07"
Parameters
  • date – the date or datetime object; if None, the current date is used
  • format – one of “full”, “long”, “medium”, or “short”, or a custom date/time pattern
  • locale – a Locale object or a locale identifier
babel.dates.format_time(time=None, format='medium', tzinfo=None, locale=default_locale('LC_TIME'))

Return a time formatted according to the given pattern.

>>> t = time(15, 30)
>>> format_time(t, locale='en_US')
u'3:30:00 PM'
>>> format_time(t, format='short', locale='de_DE')
u'15:30'

If you don’t want to use the locale default formats, you can specify a custom time pattern:

>>> format_time(t, "hh 'o''clock' a", locale='en')
u"03 o'clock PM"

For any pattern requiring the display of the time-zone, a timezone has to be specified explicitly:

>>> t = datetime(2007, 4, 1, 15, 30)
>>> tzinfo = get_timezone('Europe/Paris')
>>> t = tzinfo.localize(t)
>>> format_time(t, format='full', tzinfo=tzinfo, locale='fr_FR')
u'15:30:00 heure d\u2019\xe9t\xe9 d\u2019Europe centrale'
>>> format_time(t, "hh 'o''clock' a, zzzz", tzinfo=get_timezone('US/Eastern'),
...             locale='en')
u"09 o'clock AM, Eastern Daylight Time"

As that example shows, when this function gets passed a datetime.datetime value, the actual time in the formatted string is adjusted to the timezone specified by the tzinfo parameter. If the datetime is “naive” (i.e. it has no associated timezone information), it is assumed to be in UTC.

These timezone calculations are not performed if the value is of type datetime.time, as without date information there’s no way to determine what a given time would translate to in a different timezone without information about whether daylight savings time is in effect or not. This means that time values are left as-is, and the value of the tzinfo parameter is only used to display the timezone name if needed:

>>> t = time(15, 30)
>>> format_time(t, format='full', tzinfo=get_timezone('Europe/Paris'),
...             locale='fr_FR')
u'15:30:00 heure normale d\u2019Europe centrale'
>>> format_time(t, format='full', tzinfo=get_timezone('US/Eastern'),
...             locale='en_US')
u'3:30:00 PM Eastern Standard Time'
Parameters
  • time – the time or datetime object; if None, the current time in UTC is used
  • format – one of “full”, “long”, “medium”, or “short”, or a custom date/time pattern
  • tzinfo – the time-zone to apply to the time for display
  • locale – a Locale object or a locale identifier
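
The UTC assumption for naive datetime values described above can be illustrated with the standard library alone. This sketch does not call Babel; it uses a fixed UTC+2 offset as a stand-in for Europe/Paris in April 2007 (an assumption made to keep the example self-contained), and shows why 15:30 is displayed as 17:30 for that zone.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2007, 4, 1, 15, 30)          # no tzinfo attached
as_utc = naive.replace(tzinfo=timezone.utc)   # interpret the value as UTC

# Assumed fixed offset standing in for Europe/Paris during summer time.
paris_summer = timezone(timedelta(hours=2), 'CEST')
shifted = as_utc.astimezone(paris_summer)

print(shifted.strftime('%H:%M'))  # 17:30
```

A datetime.time value carries no date, so no such shift can be computed for it, which is why time values are left as they are.
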
babel.dates.format_timedelta(delta, granularity='second', threshold=.85, add_direction=False, format='long', locale=default_locale('LC_TIME'))

Return a time delta according to the rules of the given locale.

>>> format_timedelta(timedelta(weeks=12), locale='en_US')
u'3 months'
>>> format_timedelta(timedelta(seconds=1), locale='es')
u'1 segundo'

The granularity parameter can be provided to alter the lowest unit presented, which defaults to a second.

>>> format_timedelta(timedelta(hours=3), granularity='day',
...                  locale='en_US')
u'1 day'

The threshold parameter can be used to determine at which value the presentation switches to the next higher unit. A higher threshold factor means the presentation will switch later. For example:

>>> format_timedelta(timedelta(hours=23), threshold=0.9, locale='en_US')
u'1 day'
>>> format_timedelta(timedelta(hours=23), threshold=1.1, locale='en_US')
u'23 hours'

In addition, directional information can be provided that informs the user if the date is in the past or in the future:

>>> format_timedelta(timedelta(hours=1), add_direction=True, locale='en')
u'in 1 hour'
>>> format_timedelta(timedelta(hours=-1), add_direction=True, locale='en')
u'1 hour ago'

The format parameter controls how compact or wide the presentation is:

>>> format_timedelta(timedelta(hours=3), format='short', locale='en')
u'3 hr'
>>> format_timedelta(timedelta(hours=3), format='narrow', locale='en')
u'3h'
Parameters
  • delta – a timedelta object representing the time difference to format, or the delta in seconds as an int value
  • granularity – determines the smallest unit that should be displayed, the value can be one of “year”, “month”, “week”, “day”, “hour”, “minute” or “second”
  • threshold – factor that determines at which point the presentation switches to the next higher unit
  • add_direction – if this flag is set to True, the return value will include directional information.  For instance, a positive timedelta will be presented as being in the future, and a negative one as being in the past.
  • format – the format, can be “narrow”, “short” or “long” (“medium” is deprecated and currently converted to “long” to maintain compatibility)
  • locale – a Locale object or a locale identifier
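
The interplay of granularity and threshold can be sketched as a simple unit-selection loop. This is an illustration under stated assumptions rather than Babel's implementation: pick_unit and UNIT_SECONDS are hypothetical names, and the unit lengths (a 365-day year, a 30-day month) are assumed for the sketch. It reproduces the numeric part of the examples above; the localized wording is left out.

```python
from datetime import timedelta

# Assumed unit lengths in seconds, from largest to smallest.
UNIT_SECONDS = (
    ('year', 3600 * 24 * 365),
    ('month', 3600 * 24 * 30),
    ('week', 3600 * 24 * 7),
    ('day', 3600 * 24),
    ('hour', 3600),
    ('minute', 60),
    ('second', 1),
)

def pick_unit(delta, granularity='second', threshold=0.85):
    """Hypothetical helper: return (rounded value, unit name)."""
    seconds = abs(delta.total_seconds())
    for unit, secs_per_unit in UNIT_SECONDS:
        value = seconds / secs_per_unit
        # Use this unit once the value reaches the threshold, or stop
        # descending when the smallest permitted unit is reached.
        if value >= threshold or unit == granularity:
            if unit == granularity and value > 0:
                value = max(1, value)  # never show 0 for a non-zero delta
            return int(round(value)), unit
    return 0, 'second'

print(pick_unit(timedelta(hours=23), threshold=0.9))
print(pick_unit(timedelta(hours=23), threshold=1.1))
```
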
babel.dates.format_skeleton(skeleton, datetime=None, tzinfo=None, fuzzy=True, locale=default_locale('LC_TIME'))

Return a time and/or date formatted according to the given pattern.

The skeletons are defined in the CLDR data and provide more flexibility than the simple short/long/medium formats, but are a bit harder to use. They are defined using the date/time symbols without order or punctuation and map to a suitable format for the given locale.

>>> t = datetime(2007, 4, 1, 15, 30)
>>> format_skeleton('MMMEd', t, locale='fr')
u'dim. 1 avr.'
>>> format_skeleton('MMMEd', t, locale='en')
u'Sun, Apr 1'
>>> format_skeleton('yMMd', t, locale='fi')  # yMMd is not in the Finnish locale; yMd gets used
u'1.4.2007'
>>> format_skeleton('yMMd', t, fuzzy=False, locale='fi')  # yMMd is not in the Finnish locale, an error is thrown
Traceback (most recent call last):
    ...
KeyError: yMMd

After the skeleton is resolved to a pattern, format_datetime is called, so all timezone processing etc. is the same as for that function.

Parameters
  • skeleton – A date time skeleton as defined in the CLDR data.
  • datetime – the time or datetime object; if None, the current time in UTC is used
  • tzinfo – the time-zone to apply to the time for display
  • fuzzy – If the skeleton is not found, allow choosing a skeleton that’s close enough to it.
  • locale – a Locale object or a locale identifier
babel.dates.format_interval(start, end, skeleton=None, tzinfo=None, fuzzy=True, locale=default_locale('LC_TIME'))

Format an interval between two instants according to the locale’s rules.

>>> format_interval(date(2016, 1, 15), date(2016, 1, 17), "yMd", locale="fi")
u'15.–17.1.2016'
>>> format_interval(time(12, 12), time(16, 16), "Hm", locale="en_GB")
'12:12–16:16'
>>> format_interval(time(5, 12), time(16, 16), "hm", locale="en_US")
'5:12 AM – 4:16 PM'
>>> format_interval(time(16, 18), time(16, 24), "Hm", locale="it")
'16:18–16:24'

If the start instant equals the end instant, the interval is formatted like the instant.

>>> format_interval(time(16, 18), time(16, 18), "Hm", locale="it")
'16:18'

Unknown skeletons fall back to “default” formatting.

>>> format_interval(date(2015, 1, 1), date(2017, 1, 1), "wzq", locale="ja")
'2015/01/01~2017/01/01'
>>> format_interval(time(16, 18), time(16, 24), "xxx", locale="ja")
'16:18:00~16:24:00'
>>> format_interval(date(2016, 1, 15), date(2016, 1, 17), "xxx", locale="de")
'15.01.2016 – 17.01.2016'
Parameters
  • start – First instant (datetime/date/time)
  • end – Second instant (datetime/date/time)
  • skeleton – The “skeleton format” to use for formatting.
  • tzinfo – tzinfo to use (if none is already attached)
  • fuzzy – If the skeleton is not found, allow choosing a skeleton that’s close enough to it.
  • locale – A locale object or identifier.
Returns

Formatted interval

Timezone Functionality

babel.dates.get_timezone(zone=None)

Looks up a timezone by name and returns it.  The timezone object returned comes from pytz, conforms to the tzinfo interface, and can be used with all of the functions of Babel that operate with dates.

If a timezone is not known, a LookupError is raised.  If zone is None, a local zone object is returned.

Parameters

zone – the name of the timezone to look up.  If a timezone object itself is passed in, it’s returned unchanged.

babel.dates.get_timezone_gmt(datetime=None, width='long', locale='en_US_POSIX', return_z=False)

Return the timezone associated with the given datetime object formatted as a string indicating the offset from GMT.

>>> dt = datetime(2007, 4, 1, 15, 30)
>>> get_timezone_gmt(dt, locale='en')
u'GMT+00:00'
>>> get_timezone_gmt(dt, locale='en', return_z=True)
'Z'
>>> get_timezone_gmt(dt, locale='en', width='iso8601_short')
u'+00'
>>> tz = get_timezone('America/Los_Angeles')
>>> dt = tz.localize(datetime(2007, 4, 1, 15, 30))
>>> get_timezone_gmt(dt, locale='en')
u'GMT-07:00'
>>> get_timezone_gmt(dt, 'short', locale='en')
u'-0700'
>>> get_timezone_gmt(dt, locale='en', width='iso8601_short')
u'-07'

The long format depends on the locale; for example, in France the acronym UTC is used instead of GMT:

>>> get_timezone_gmt(dt, 'long', locale='fr_FR')
u'UTC-07:00'

New in version 0.9.

Parameters
  • datetime – the datetime object; if None, the current date and time in UTC is used
  • width – either “long” or “short” or “iso8601” or “iso8601_short”
  • locale – the Locale object, or a locale string
  • return_z – True or False; if True, the function returns the indicator “Z” when the local time offset is 0
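
How the width and return_z options shape the result can be sketched with plain string formatting. The helper format_offset below is hypothetical and independent of Babel: it takes the UTC offset directly as a timedelta, uses a prefix argument in place of the locale-dependent “GMT”/“UTC” text, and its output for the “iso8601” width (+HH:MM) is an assumption, since that case is not shown in the examples above.

```python
from datetime import timedelta

def format_offset(offset, width='long', return_z=False, prefix='GMT'):
    """Hypothetical helper: render a UTC offset in the widths listed above."""
    total = int(offset.total_seconds())
    if return_z and total == 0:
        return 'Z'
    sign = '-' if total < 0 else '+'
    hours, remainder = divmod(abs(total), 3600)
    minutes = remainder // 60
    if width == 'iso8601_short' and minutes == 0:
        return f'{sign}{hours:02d}'                # hours only
    if width in ('short', 'iso8601_short'):
        return f'{sign}{hours:02d}{minutes:02d}'
    if width == 'iso8601':
        return f'{sign}{hours:02d}:{minutes:02d}'  # assumed form
    return f'{prefix}{sign}{hours:02d}:{minutes:02d}'

pacific_summer = timedelta(hours=-7)
print(format_offset(pacific_summer))
print(format_offset(pacific_summer, 'short'))
print(format_offset(pacific_summer, prefix='UTC'))
```
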
babel.dates.get_timezone_location(dt_or_tzinfo=None, locale='en_US_POSIX', return_city=False)

Return a representation of the given timezone using “location format”.

The result depends on both the local display name of the country and the city associated with the time zone:

>>> tz = get_timezone('America/St_Johns')
>>> print(get_timezone_location(tz, locale='de_DE'))
Kanada (St. John’s) Zeit
>>> print(get_timezone_location(tz, locale='en'))
Canada (St. John’s) Time
>>> print(get_timezone_location(tz, locale='en', return_city=True))
St. John’s
>>> tz = get_timezone('America/Mexico_City')
>>> get_timezone_location(tz, locale='de_DE')
u'Mexiko (Mexiko-Stadt) Zeit'

If the timezone is associated with a country that uses only a single timezone, just the localized country name is returned:

>>> tz = get_timezone('Europe/Berlin')
>>> get_timezone_name(tz, locale='de_DE')
u'Mitteleurop\xe4ische Zeit'

New in version 0.9.

Parameters
  • dt_or_tzinfo – the datetime or tzinfo object that determines the timezone; if None, the current date and time in UTC is assumed
  • locale – the Locale object, or a locale string
  • return_city – True or False, if True then return exemplar city (location) for the time zone
Returns

the localized timezone name using location format

babel.dates.get_timezone_name(dt_or_tzinfo=None, width='long', uncommon=False, locale='en_US_POSIX', zone_variant=None, return_zone=False)

Return the localized display name for the given timezone. The timezone may be specified using a datetime or tzinfo object.

>>> dt = time(15, 30, tzinfo=get_timezone('America/Los_Angeles'))
>>> get_timezone_name(dt, locale='en_US')
u'Pacific Standard Time'
>>> get_timezone_name(dt, locale='en_US', return_zone=True)
'America/Los_Angeles'
>>> get_timezone_name(dt, width='short', locale='en_US')
u'PST'

If this function gets passed only a tzinfo object and no concrete datetime, the returned display name is independent of daylight savings time. This can be used for example for selecting timezones, or to set the time of events that recur across DST changes:

>>> tz = get_timezone('America/Los_Angeles')
>>> get_timezone_name(tz, locale='en_US')
u'Pacific Time'
>>> get_timezone_name(tz, 'short', locale='en_US')
u'PT'

If no localized display name for the timezone is available, and the timezone is associated with a country that uses only a single timezone, the name of that country is returned, formatted according to the locale:

>>> tz = get_timezone('Europe/Berlin')
>>> get_timezone_name(tz, locale='de_DE')
u'Mitteleurop\xe4ische Zeit'
>>> get_timezone_name(tz, locale='pt_BR')
u'Hor\xe1rio da Europa Central'

On the other hand, if the country uses multiple timezones, the city is also included in the representation:

>>> tz = get_timezone('America/St_Johns')
>>> get_timezone_name(tz, locale='de_DE')
u'Neufundland-Zeit'

Note that short format is currently not supported for all timezones and all locales.  This is partially because not every timezone has a short code in every locale.  In that case it currently falls back to the long format.

For more information see LDML Appendix J: Time Zone Display Names

New in version 0.9.

Changed in version 1.0: Added zone_variant support.

Parameters
  • dt_or_tzinfo – the datetime or tzinfo object that determines the timezone; if a tzinfo object is used, the resulting display name will be generic, i.e. independent of daylight savings time; if None, the current date in UTC is assumed
  • width – either “long” or “short”
  • uncommon – deprecated and ignored
  • zone_variant – defines the zone variation to return.  By default the variation is defined from the datetime object passed in.  If no datetime object is passed in, the 'generic' variation is assumed.  The following values are valid: 'generic', 'daylight' and 'standard'.
  • locale – the Locale object, or a locale string
  • return_zone – True or False; if True, the function returns the long time zone ID
babel.dates.get_next_timezone_transition(zone=None, dt=None)

Given a timezone, return a TimezoneTransition object that holds information about the next timezone transition.  For instance, this can be used to detect when the next DST change is going to happen and what it looks like.

The transition is calculated relative to the given datetime object.  The next transition that follows the date is used.  If a transition cannot be found the return value will be None.

Transition information can only be provided for timezones returned by the get_timezone() function.

This function is pending deprecation with no replacement planned in the Babel library.

Parameters
  • zone – the timezone for which the transition should be looked up. If not provided the local timezone is used.
  • dt – the date after which the next transition should be found. If not given the current time is assumed.
babel.dates.UTC

A timezone object for UTC.

babel.dates.LOCALTZ

A timezone object for the computer’s local timezone.

class babel.dates.TimezoneTransition(activates, from_tzinfo, to_tzinfo, reference_date=None)

A helper object that represents the return value from get_next_timezone_transition().

This class is pending deprecation with no replacement planned in the Babel library.

Field activates

The time of the activation of the timezone transition in UTC.

Field from_tzinfo

The timezone from where the transition starts.

Field to_tzinfo

The timezone for after the transition.

Field reference_date

The reference date that was provided.  This is the dt parameter passed to get_next_timezone_transition().

Data Access

babel.dates.get_period_names(width='wide', context='stand-alone', locale='en_US_POSIX')

Return the names for day periods (AM/PM) used by the locale.

>>> get_period_names(locale='en_US')['am']
u'AM'
Parameters
  • width – the width to use, one of “abbreviated”, “narrow”, or “wide”
  • context – the context, either “format” or “stand-alone”
  • locale – the Locale object, or a locale string
babel.dates.get_day_names(width='wide', context='format', locale='en_US_POSIX')

Return the day names used by the locale for the specified format.

>>> get_day_names('wide', locale='en_US')[1]
u'Tuesday'
>>> get_day_names('short', locale='en_US')[1]
u'Tu'
>>> get_day_names('abbreviated', locale='es')[1]
u'mar'
>>> get_day_names('narrow', context='stand-alone', locale='de_DE')[1]
u'D'
Parameters
  • width – the width to use, one of “wide”, “abbreviated”, “short” or “narrow”
  • context – the context, either “format” or “stand-alone”
  • locale – the Locale object, or a locale string
babel.dates.get_month_names(width='wide', context='format', locale='en_US_POSIX')

Return the month names used by the locale for the specified format.

>>> get_month_names('wide', locale='en_US')[1]
u'January'
>>> get_month_names('abbreviated', locale='es')[1]
u'ene'
>>> get_month_names('narrow', context='stand-alone', locale='de_DE')[1]
u'J'
Parameters
  • width – the width to use, one of “wide”, “abbreviated”, or “narrow”
  • context – the context, either “format” or “stand-alone”
  • locale – the Locale object, or a locale string
babel.dates.get_quarter_names(width='wide', context='format', locale='en_US_POSIX')

Return the quarter names used by the locale for the specified format.

>>> get_quarter_names('wide', locale='en_US')[1]
u'1st quarter'
>>> get_quarter_names('abbreviated', locale='de_DE')[1]
u'Q1'
>>> get_quarter_names('narrow', locale='de_DE')[1]
u'1'
Parameters
  • width – the width to use, one of “wide”, “abbreviated”, or “narrow”
  • context – the context, either “format” or “stand-alone”
  • locale – the Locale object, or a locale string
babel.dates.get_era_names(width='wide', locale='en_US_POSIX')

Return the era names used by the locale for the specified format.

>>> get_era_names('wide', locale='en_US')[1]
u'Anno Domini'
>>> get_era_names('abbreviated', locale='de_DE')[1]
u'n. Chr.'
Parameters
  • width – the width to use, either “wide”, “abbreviated”, or “narrow”
  • locale – the Locale object, or a locale string
babel.dates.get_date_format(format='medium', locale='en_US_POSIX')

Return the date formatting patterns used by the locale for the specified format.

>>> get_date_format(locale='en_US')
<DateTimePattern u'MMM d, y'>
>>> get_date_format('full', locale='de_DE')
<DateTimePattern u'EEEE, d. MMMM y'>
Parameters
  • format – the format to use, one of “full”, “long”, “medium”, or “short”
  • locale – the Locale object, or a locale string
babel.dates.get_datetime_format(format='medium', locale='en_US_POSIX')

Return the datetime formatting patterns used by the locale for the specified format.

>>> get_datetime_format(locale='en_US')
u'{1}, {0}'
Parameters
  • format – the format to use, one of “full”, “long”, “medium”, or “short”
  • locale – the Locale object, or a locale string
babel.dates.get_time_format(format='medium', locale='en_US_POSIX')

Return the time formatting patterns used by the locale for the specified format.

>>> get_time_format(locale='en_US')
<DateTimePattern u'h:mm:ss a'>
>>> get_time_format('full', locale='de_DE')
<DateTimePattern u'HH:mm:ss zzzz'>
Parameters
  • format – the format to use, one of “full”, “long”, “medium”, or “short”
  • locale – the Locale object, or a locale string

Basic Parsing

babel.dates.parse_date(string, locale='en_US_POSIX', format='medium')

Parse a date from a string.

This function uses the date format for the locale as a hint to determine the order in which the date fields appear in the string.

>>> parse_date('4/1/04', locale='en_US')
datetime.date(2004, 4, 1)
>>> parse_date('01.04.2004', locale='de_DE')
datetime.date(2004, 4, 1)
Parameters
  • string – the string containing the date
  • locale – a Locale object or a locale identifier
  • format – the format to use (see get_date_format)
babel.dates.parse_time(string, locale='en_US_POSIX', format='medium')

Parse a time from a string.

This function uses the time format for the locale as a hint to determine the order in which the time fields appear in the string.

>>> parse_time('15:30:00', locale='en_US')
datetime.time(15, 30)
Parameters
  • string – the string containing the time
  • locale – a Locale object or a locale identifier
  • format – the format to use (see get_time_format)
Returns

the parsed time

Return type

time

babel.dates.parse_pattern(pattern)

Parse date, time, and datetime format patterns.

>>> parse_pattern("MMMMd").format
u'%(MMMM)s%(d)s'
>>> parse_pattern("MMM d, yyyy").format
u'%(MMM)s %(d)s, %(yyyy)s'

Pattern can contain literal strings in single quotes:

>>> parse_pattern("H:mm' Uhr 'z").format
u'%(H)s:%(mm)s Uhr %(z)s'

An actual single quote can be used by using two adjacent single quote characters:

>>> parse_pattern("hh' o''clock'").format
u"%(hh)s o'clock"
Parameters

pattern – the formatting pattern to parse
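
The quoting rules described above can be sketched as a small converter that turns a pattern into the same kind of %-style format string shown in the examples. This is a simplified illustration, not Babel's parser: pattern_to_format is a hypothetical name, and the sketch treats every run of an identical ASCII letter as a field, whereas a real parser would only accept known pattern characters.

```python
import string

def pattern_to_format(pattern):
    """Hypothetical helper: convert a date/time pattern to a %-format string."""
    out = []
    in_quote = False
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "'":
            if i + 1 < n and pattern[i + 1] == "'":
                out.append("'")          # two adjacent quotes: literal quote
                i += 2
            else:
                in_quote = not in_quote  # start or end of a literal string
                i += 1
        elif not in_quote and ch in string.ascii_letters:
            j = i
            while j < n and pattern[j] == ch:
                j += 1                   # consume the whole run, e.g. 'MMMM'
            out.append('%(' + pattern[i:j] + ')s')
            i = j
        else:
            out.append(ch.replace('%', '%%'))  # literal text, '%' escaped
            i += 1
    return ''.join(out)

print(pattern_to_format("H:mm' Uhr 'z"))
print(pattern_to_format("hh' o''clock'"))
```
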

Languages

The languages module provides functionality to access data about languages that is not bound to a given locale.

Official Languages

babel.languages.get_official_languages(territory, regional=False, de_facto=False)

Get the official language(s) for the given territory.

The language codes, if any are known, are returned in order of descending popularity.

If the regional flag is set, then languages which are regionally official are also returned.

If the de_facto flag is set, then languages which are “de facto” official are also returned.

WARNING:

Note that the data is as up to date as the current version of the CLDR used by Babel.  If you need scientifically accurate information, use another source!

Parameters
  • territory (str) – Territory code
  • regional (bool) – Whether to return regionally official languages too
  • de_facto (bool) – Whether to return de-facto official languages too
Returns

Tuple of language codes

Return type

tuple[str]

babel.languages.get_territory_language_info(territory)

Get a dictionary of language information for a territory.

The dictionary is keyed by language code; the values are dicts with more information.

The following keys are currently known for the values:

  • population_percent: The percentage of the territory’s population speaking the language.
  • official_status: An optional string describing the officiality status of the language. Known values are “official”, “official_regional” and “de_facto_official”.

WARNING:

Note that the data is as up to date as the current version of the CLDR used by Babel.  If you need scientifically accurate information, use another source!

NOTE:

Note that the format of the dict returned may change between Babel versions.

See https://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html

Parameters

territory (str) – Territory code

Returns

Language information dictionary

Return type

dict[str, dict]
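
As a rough illustration of how the two functions in this module relate, the sketch below filters and ranks a dictionary shaped like the one described above. The helper official_languages and the data in sample_info (territory-independent placeholder codes "aa" to "ee") are invented for this example; they are not CLDR data, and the real dictionary format may change between Babel versions.

```python
def official_languages(info, regional=False, de_facto=False):
    """Return language codes with an accepted status, most widely spoken first."""
    allowed = {"official"}
    if regional:
        allowed.add("official_regional")
    if de_facto:
        allowed.add("de_facto_official")
    ranked = sorted(
        (
            (lang, data)
            for lang, data in info.items()
            if data.get("official_status") in allowed
        ),
        key=lambda pair: pair[1].get("population_percent", 0),
        reverse=True,
    )
    return tuple(lang for lang, _ in ranked)


# Hypothetical sample data, made up purely for illustration.
sample_info = {
    "aa": {"population_percent": 80, "official_status": "official"},
    "bb": {"population_percent": 5, "official_status": "official_regional"},
    "cc": {"population_percent": 40, "official_status": "de_facto_official"},
    "dd": {"population_percent": 60},  # no official status recorded
    "ee": {"population_percent": 10, "official_status": "official"},
}

print(official_languages(sample_info))
print(official_languages(sample_info, regional=True, de_facto=True))
```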

List Formatting

This module lets you format lists of items in a locale-dependent manner.

babel.lists.format_list(lst, style='standard', locale='en_US_POSIX')

Format the items in lst as a list.

>>> format_list(['apples', 'oranges', 'pears'], locale='en')
u'apples, oranges, and pears'
>>> format_list(['apples', 'oranges', 'pears'], locale='zh')
u'apples、oranges和pears'
>>> format_list(['omena', 'peruna', 'aplari'], style='or', locale='fi')
u'omena, peruna tai aplari'

These styles are defined, but not all are necessarily available in all locales. The following text is verbatim from the Unicode TR35-49 spec [1].

  • standard: A typical ‘and’ list for arbitrary placeholders. eg. “January, February, and March”
  • standard-short: A short version of a ‘and’ list, suitable for use with short or abbreviated placeholder values. eg. “Jan., Feb., and Mar.”
  • or: A typical ‘or’ list for arbitrary placeholders. eg. “January, February, or March”
  • or-short: A short version of an ‘or’ list. eg. “Jan., Feb., or Mar.”
  • unit: A list suitable for wide units. eg. “3 feet, 7 inches”
  • unit-short: A list suitable for short units eg. “3 ft, 7 in”
  • unit-narrow: A list suitable for narrow units, where space on the screen is very limited. eg. “3′ 7″”

[1]: https://www.unicode.org/reports/tr35/tr35-49/tr35-general.html#ListPatterns

Parameters
  • lst – a sequence of items to format into a list
  • style – the style to format the list with. See above for description.
  • locale – the locale
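
To show how a list style might turn into output, here is a minimal sketch of applying TR35-style list patterns. The function apply_list_patterns is a hypothetical helper, and ENGLISH_STANDARD is assumed sample data modeled on the English “standard” style rather than something read from Babel.

```python
# Assumed sample patterns, modeled on the English "standard" list style.
ENGLISH_STANDARD = {
    "2": "{0} and {1}",
    "start": "{0}, {1}",
    "middle": "{0}, {1}",
    "end": "{0}, and {1}",
}


def apply_list_patterns(items, patterns):
    """Join items using 'start', 'middle', 'end' and two-item patterns."""
    items = list(items)
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return patterns["2"].format(*items)
    # Combine the first two items, then fold in the rest one at a time.
    result = patterns["start"].format(items[0], items[1])
    for item in items[2:-1]:
        result = patterns["middle"].format(result, item)
    return patterns["end"].format(result, items[-1])


fruit = apply_list_patterns(["apples", "oranges", "pears"], ENGLISH_STANDARD)
print(fruit)
```

Because each locale supplies its own patterns, the same folding procedure can yield different separators and conjunctions without any locale-specific code.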

Messages and Catalogs

Babel provides functionality to work with message catalogs.  This part of the API documentation covers that functionality.

Messages and Catalogs

This module provides a basic interface to hold catalog and message information.  It is generally used to modify a gettext catalog, not to look up translations at runtime.

Catalogs

class babel.messages.catalog.Catalog(locale=None, domain=None, header_comment='# Translations template for PROJECT.\n# Copyright (C) YEAR ORGANIZATION\n# This file is distributed under the same license as the PROJECT project.\n# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.\n#', project=None, version=None, copyright_holder=None, msgid_bugs_address=None, creation_date=None, revision_date=None, last_translator=None, language_team=None, charset=None, fuzzy=True)

Representation of a message catalog.

__iter__()

Iterates through all the entries in the catalog, in the order they were added, yielding a Message object for every entry.

Return type

iterator

add(id, string=None, locations=(), flags=(), auto_comments=(), user_comments=(), previous_id=(), lineno=None, context=None)

Add or update the message with the specified ID.

>>> catalog = Catalog()
>>> catalog.add(u'foo')
<Message ...>
>>> catalog[u'foo']
<Message u'foo' (flags: [])>

This method simply constructs a Message object with the given arguments and invokes __setitem__ with that object.

Parameters
  • id – the message ID, or a (singular, plural) tuple for pluralizable messages
  • string – the translated message string, or a (singular, plural) tuple for pluralizable messages
  • locations – a sequence of (filename, lineno) tuples
  • flags – a set or sequence of flags
  • auto_comments – a sequence of automatic comments
  • user_comments – a sequence of user comments
  • previous_id – the previous message ID, or a (singular, plural) tuple for pluralizable messages
  • lineno – the line number on which the msgid line was found in the PO file, if any
  • context – the message context
check()

Run various validation checks on the translations in the catalog.

For every message that fails validation, this method yields a (message, errors) tuple, where message is the Message object and errors is a sequence of TranslationError objects.

Return type

iterator

delete(id, context=None)

Delete the message with the specified ID and context.

Parameters
  • id – the message ID
  • context – the message context, or None for no context
get(id, context=None)

Return the message with the specified ID and context.

Parameters
  • id – the message ID
  • context – the message context, or None for no context
property header_comment

The header comment for the catalog.

>>> catalog = Catalog(project='Foobar', version='1.0',
...                   copyright_holder='Foo Company')
>>> print(catalog.header_comment) 
# Translations template for Foobar.
# Copyright (C) ... Foo Company
# This file is distributed under the same license as the Foobar project.
# FIRST AUTHOR <EMAIL@ADDRESS>, ....
#

The header can also be set from a string. Any known upper-case variables will be replaced when the header is retrieved again:

>>> catalog = Catalog(project='Foobar', version='1.0',
...                   copyright_holder='Foo Company')
>>> catalog.header_comment = '''\
... # The POT for my really cool PROJECT project.
... # Copyright (C) 1990-2003 ORGANIZATION
... # This file is distributed under the same license as the PROJECT
... # project.
... #'''
>>> print(catalog.header_comment)
# The POT for my really cool Foobar project.
# Copyright (C) 1990-2003 Foo Company
# This file is distributed under the same license as the Foobar
# project.
#
Type

unicode

is_identical(other)

Checks if catalogs are identical, taking into account messages and headers.

language_team

Name and email address of the language team.

last_translator

Name and email address of the last translator.

property mime_headers

The MIME headers of the catalog, used for the special msgid "" entry.

The behavior of this property changes slightly depending on whether a locale is set or not, the latter indicating that the catalog is actually a template for actual translations.

Here’s an example of the output for such a catalog template:

>>> from datetime import datetime
>>> from babel.dates import UTC
>>> created = datetime(1990, 4, 1, 15, 30, tzinfo=UTC)
>>> catalog = Catalog(project='Foobar', version='1.0',
...                   creation_date=created)
>>> for name, value in catalog.mime_headers:
...     print('%s: %s' % (name, value))
Project-Id-Version: Foobar 1.0
Report-Msgid-Bugs-To: EMAIL@ADDRESS
POT-Creation-Date: 1990-04-01 15:30+0000
PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE
Last-Translator: FULL NAME <EMAIL@ADDRESS>
Language-Team: LANGUAGE <LL@li.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Generated-By: Babel ...

And here’s an example of the output when the locale is set:

>>> revised = datetime(1990, 8, 3, 12, 0, tzinfo=UTC)
>>> catalog = Catalog(locale='de_DE', project='Foobar', version='1.0',
...                   creation_date=created, revision_date=revised,
...                   last_translator='John Doe <jd@example.com>',
...                   language_team='de_DE <de@example.com>')
>>> for name, value in catalog.mime_headers:
...     print('%s: %s' % (name, value))
Project-Id-Version: Foobar 1.0
Report-Msgid-Bugs-To: EMAIL@ADDRESS
POT-Creation-Date: 1990-04-01 15:30+0000
PO-Revision-Date: 1990-08-03 12:00+0000
Last-Translator: John Doe <jd@example.com>
Language: de_DE
Language-Team: de_DE <de@example.com>
Plural-Forms: nplurals=2; plural=(n != 1);
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Generated-By: Babel ...
Type

list

property num_plurals

The number of plurals used by the catalog or locale.

>>> Catalog(locale='en').num_plurals
2
>>> Catalog(locale='ga').num_plurals
5
Type

int

property plural_expr

The plural expression used by the catalog or locale.

>>> Catalog(locale='en').plural_expr
'(n != 1)'
>>> Catalog(locale='ga').plural_expr
'(n==1 ? 0 : n==2 ? 1 : n>=3 && n<=6 ? 2 : n>=7 && n<=10 ? 3 : 4)'
>>> Catalog(locale='ding').plural_expr  # unknown locale
'(n != 1)'
Type

str

property plural_forms

Return the plural forms declaration for the locale.

>>> Catalog(locale='en').plural_forms
'nplurals=2; plural=(n != 1);'
>>> Catalog(locale='pt_BR').plural_forms
'nplurals=2; plural=(n > 1);'
Type

str
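
A plural forms declaration combines the values of num_plurals and plural_expr. The sketch below splits such a declaration apart and hand-translates the C-style expression shown for locale 'ga' into Python. Both parse_plural_forms and irish_plural_index are hypothetical helpers written for this illustration.

```python
import re


def parse_plural_forms(declaration):
    """Split 'nplurals=N; plural=EXPR;' into (N, EXPR)."""
    match = re.match(
        r"\s*nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*(.+?)\s*;?\s*$",
        declaration,
    )
    if match is None:
        raise ValueError("not a plural forms declaration: %r" % declaration)
    return int(match.group(1)), match.group(2)


def irish_plural_index(n):
    """Hand translation of the 'ga' expression shown above."""
    if n == 1:
        return 0
    if n == 2:
        return 1
    if 3 <= n <= 6:
        return 2
    if 7 <= n <= 10:
        return 3
    return 4


count, expr = parse_plural_forms("nplurals=2; plural=(n != 1);")
indexes = [irish_plural_index(n) for n in (1, 2, 5, 8, 11, 0)]
print(count, expr, indexes)
```

The index returned by the expression selects which msgstr[N] entry of a pluralizable message is used for a given count.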

update(template, no_fuzzy_matching=False, update_header_comment=False, keep_user_comments=True)

Update the catalog based on the given template catalog.

>>> from babel.messages import Catalog
>>> template = Catalog()
>>> template.add('green', locations=[('main.py', 99)])
<Message ...>
>>> template.add('blue', locations=[('main.py', 100)])
<Message ...>
>>> template.add(('salad', 'salads'), locations=[('util.py', 42)])
<Message ...>
>>> catalog = Catalog(locale='de_DE')
>>> catalog.add('blue', u'blau', locations=[('main.py', 98)])
<Message ...>
>>> catalog.add('head', u'Kopf', locations=[('util.py', 33)])
<Message ...>
>>> catalog.add(('salad', 'salads'), (u'Salat', u'Salate'),
...             locations=[('util.py', 38)])
<Message ...>
>>> catalog.update(template)
>>> len(catalog)
3
>>> msg1 = catalog['green']
>>> msg1.string
>>> msg1.locations
[('main.py', 99)]
>>> msg2 = catalog['blue']
>>> msg2.string
u'blau'
>>> msg2.locations
[('main.py', 100)]
>>> msg3 = catalog['salad']
>>> msg3.string
(u'Salat', u'Salate')
>>> msg3.locations
[('util.py', 42)]

Messages that are in the catalog but not in the template are removed from the main collection, but can still be accessed via the obsolete member:

>>> 'head' in catalog
False
>>> list(catalog.obsolete.values())
[<Message 'head' (flags: [])>]
Parameters
  • template – the reference catalog, usually read from a POT file
  • no_fuzzy_matching – whether to disable fuzzy matching of message IDs
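
The bookkeeping performed by an update can be pictured with plain dictionaries. This is only a sketch: merge_with_template is a hypothetical function, it ignores fuzzy matching, locations, and header handling, and it uses (singular, plural) tuples directly as keys for simplicity.

```python
def merge_with_template(translations, template_ids):
    """Return (merged, obsolete) dicts for the given template message IDs."""
    remaining = dict(translations)
    merged = {}
    for msgid in template_ids:
        # Keep an existing translation if there is one, otherwise start empty.
        merged[msgid] = remaining.pop(msgid, None)
    # Whatever the template no longer mentions becomes obsolete.
    return merged, remaining


existing = {
    "blue": "blau",
    "head": "Kopf",
    ("salad", "salads"): ("Salat", "Salate"),
}
template_ids = ["green", "blue", ("salad", "salads")]

merged, obsolete = merge_with_template(existing, template_ids)
print(merged)
print(obsolete)
```

With the sample data above, 'green' is new and untranslated, 'blue' and the plural message keep their translations, and 'head' moves to the obsolete collection, which mirrors the outcome of the doctest.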

Messages

class babel.messages.catalog.Message(id, string='', locations=(), flags=(), auto_comments=(), user_comments=(), previous_id=(), lineno=None, context=None)

Representation of a single message in a catalog.

check(catalog=None)

Run various validation checks on the message.  Some validations are only performed if the catalog is provided.  This method returns a sequence of TranslationError objects.

Return type

iterator

Parameters

catalog – A catalog instance that is passed to the checkers

See

Catalog.check for a way to perform checks for all messages in a catalog.

property fuzzy

Whether the translation is fuzzy.

>>> Message('foo').fuzzy
False
>>> msg = Message('foo', 'foo', flags=['fuzzy'])
>>> msg.fuzzy
True
>>> msg
<Message 'foo' (flags: ['fuzzy'])>
Type

bool

is_identical(other)

Checks whether messages are identical, taking into account all properties.

property pluralizable

Whether the message is pluralizable.

>>> Message('foo').pluralizable
False
>>> Message(('foo', 'bar')).pluralizable
True
Type

bool

property python_format

Whether the message contains Python-style parameters.

>>> Message('foo %(name)s bar').python_format
True
>>> Message(('foo %(name)s', 'foo %(name)s')).python_format
True
Type

bool

Exceptions

exception babel.messages.catalog.TranslationError

Exception thrown by translation checkers when invalid message translations are encountered.

Low-Level Extraction Interface

The low level extraction interface can be used to extract from directories or files directly.  Normally this is not needed as the command line tools can do that for you.

Extraction Functions

The extraction functions are what the command line tools use internally to extract strings.

babel.messages.extract.extract_from_dir(dirname=None, method_map=[('**.py', 'python')], options_map=None, keywords={'N_': None, '_': None, 'dgettext': (2,), 'dngettext': (2, 3), 'gettext': None, 'ngettext': (1, 2), 'npgettext': ((1, 'c'), 2, 3), 'pgettext': ((1, 'c'), 2), 'ugettext': None, 'ungettext': (1, 2)}, comment_tags=(), callback=None, strip_comment_tags=False, directory_filter=None)

Extract messages from any source files found in the given directory.

This function generates tuples of the form (filename, lineno, message, comments, context).

Which extraction method is used per file is determined by the method_map parameter, which maps extended glob patterns to extraction method names. For example, the following is the default mapping:

>>> method_map = [
...     ('**.py', 'python')
... ]

This basically says that files with the filename extension “.py” at any level inside the directory should be processed by the “python” extraction method. Files that don’t match any of the mapping patterns are ignored. See the documentation of the pathmatch function for details on the pattern syntax.

The following extended mapping would also use the “genshi” extraction method on any file in “templates” subdirectory:

>>> method_map = [
...     ('**/templates/**.*', 'genshi'),
...     ('**.py', 'python')
... ]

The dictionary provided by the optional options_map parameter augments these mappings. It uses extended glob patterns as keys, and the values are dictionaries mapping options names to option values (both strings).

The glob patterns of the options_map do not necessarily need to be the same as those used in the method mapping. For example, while all files in the templates folders in an application may be Genshi templates, the options for those files may differ based on extension:

>>> options_map = {
...     '**/templates/**.txt': {
...         'template_class': 'genshi.template:TextTemplate',
...         'encoding': 'latin-1'
...     },
...     '**/templates/**.html': {
...         'include_attrs': ''
...     }
... }
Parameters
  • dirname – the path to the directory to extract messages from.  If not given the current working directory is used.
  • method_map – a list of (pattern, method) tuples that map extended glob patterns to extraction method names
  • options_map – a dictionary of additional options (optional)
  • keywords – a dictionary mapping keywords (i.e. names of functions that should be recognized as translation functions) to tuples that specify which of their arguments contain localizable strings
  • comment_tags – a list of tags of translator comments to search for and include in the results
  • callback – a function that is called for every file that messages are extracted from, just before the extraction itself is performed; the function is passed the filename, the name of the extraction method, and the options dictionary as positional arguments, in that order
  • strip_comment_tags – a flag that if set to True causes all comment tags to be removed from the collected comments.
  • directory_filter – a callback to determine whether a directory should be recursed into. Receives the full directory path; should return True if the directory is valid.
See

pathmatch
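
The way a method_map selects an extraction method can be approximated as follows. This is a sketch, not the pathmatch implementation: glob_to_regex and pick_method are hypothetical helpers, the token translations are simplified, and the sketch assumes that the first matching pattern wins.

```python
import re

# Regex fragments for the extended glob tokens handled by this sketch.
_TOKENS = {
    "**/": "(?:.+/)*?",      # any number of leading directories
    "**": "(?:.+/)*?[^/]+",  # any directories, then part of one path segment
    "*": "[^/]+",            # one or more characters within a segment
    "?": "[^/]",             # exactly one character within a segment
}


def glob_to_regex(pattern):
    """Compile an extended glob pattern into a regular expression."""
    parts = re.split(r"(\*\*/|\*\*|\*|\?)", pattern)
    return re.compile("".join(_TOKENS.get(p, re.escape(p)) for p in parts) + "$")


def pick_method(filename, method_map):
    """Return the method of the first pattern that matches, or None."""
    for pattern, method in method_map:
        if glob_to_regex(pattern).match(filename):
            return method
    return None


method_map = [
    ("**/templates/**.*", "genshi"),
    ("**.py", "python"),
]

for name in ("app/templates/index.html", "app/main.py", "README.txt"):
    print(name, "->", pick_method(name, method_map))
```

Under the first-match assumption, the order of entries matters: a Python file inside a templates directory is claimed by the “genshi” entry because that entry comes first.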

babel.messages.extract.extract_from_file(method, filename, keywords={'N_': None, '_': None, 'dgettext': (2,), 'dngettext': (2, 3), 'gettext': None, 'ngettext': (1, 2), 'npgettext': ((1, 'c'), 2, 3), 'pgettext': ((1, 'c'), 2), 'ugettext': None, 'ungettext': (1, 2)}, comment_tags=(), options=None, strip_comment_tags=False)

Extract messages from a specific file.

This function returns a list of tuples of the form (lineno, message, comments, context).

Parameters
  • filename – the path to the file to extract messages from
  • method – a string specifying the extraction method (e.g. “python”)
  • keywords – a dictionary mapping keywords (i.e. names of functions that should be recognized as translation functions) to tuples that specify which of their arguments contain localizable strings
  • comment_tags – a list of translator tags to search for and include in the results
  • strip_comment_tags – a flag that if set to True causes all comment tags to be removed from the collected comments.
  • options – a dictionary of additional options (optional)
Returns

list of tuples of the form (lineno, message, comments, context)

Return type

list[tuple[int, str|tuple[str], list[str], str|None]]

babel.messages.extract.extract(method, fileobj, keywords={'N_': None, '_': None, 'dgettext': (2,), 'dngettext': (2, 3), 'gettext': None, 'ngettext': (1, 2), 'npgettext': ((1, 'c'), 2, 3), 'pgettext': ((1, 'c'), 2), 'ugettext': None, 'ungettext': (1, 2)}, comment_tags=(), options=None, strip_comment_tags=False)

Extract messages from the given file-like object using the specified extraction method.

This function returns tuples of the form (lineno, message, comments, context).

The implementation dispatches the actual extraction to plugins, based on the value of the method parameter.

>>> source = b'''# foo module
... def run(argv):
...    print(_('Hello, world!'))
... '''
>>> from io import BytesIO
>>> for message in extract('python', BytesIO(source)):
...     print(message)
(3, u'Hello, world!', [], None)
Parameters
  • method – an extraction method (a callable), or a string specifying the extraction method (e.g. “python”); if this is a simple name, the extraction function will be looked up by entry point; if it is an explicit reference to a function (of the form package.module:funcname or package.module.funcname), the corresponding function will be imported and used
  • fileobj – the file-like object the messages should be extracted from
  • keywords – a dictionary mapping keywords (i.e. names of functions that should be recognized as translation functions) to tuples that specify which of their arguments contain localizable strings
  • comment_tags – a list of translator tags to search for and include in the results
  • options – a dictionary of additional options (optional)
  • strip_comment_tags – a flag that if set to True causes all comment tags to be removed from the collected comments.
Raises

ValueError – if the extraction method is not registered

Returns

iterable of tuples of the form (lineno, message, comments, context)

Return type

Iterable[tuple[int, str|tuple[str], list[str], str|None]]
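
The keywords mapping is easier to read with a concrete interpretation in hand. The sketch below applies a subset of the default specification to Python source using the standard ast module. It swaps in AST walking for illustration and should not be read as how the “python” extraction method works; extract_sketch is a hypothetical name and translator comments are not handled.

```python
import ast

# A subset of the default keyword specification shown above.
KEYWORDS = {"_": None, "ngettext": (1, 2), "pgettext": ((1, "c"), 2)}


def extract_sketch(source, keywords=KEYWORDS):
    """Return (lineno, message, context) for calls to known translation functions."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        if node.func.id not in keywords:
            continue
        spec = keywords[node.func.id] or (1,)  # None means "the first argument"
        messages, context = [], None
        for entry in spec:
            # (index, 'c') marks the context argument; a bare index marks a message.
            if isinstance(entry, tuple):
                index, is_context = entry[0], True
            else:
                index, is_context = entry, False
            arg = node.args[index - 1] if index <= len(node.args) else None
            if not (isinstance(arg, ast.Constant) and isinstance(arg.value, str)):
                messages = []  # not a string literal: skip this call entirely
                break
            if is_context:
                context = arg.value
            else:
                messages.append(arg.value)
        if messages:
            message = messages[0] if len(messages) == 1 else tuple(messages)
            found.append((node.lineno, message, context))
    return sorted(found, key=lambda item: item[0])


source = '''\
print(_("Hello, world!"))
count_label = ngettext("%d file", "%d files", 3)
menu_label = pgettext("menu", "Open")
'''

results = extract_sketch(source)
for row in results:
    print(row)
```

Argument positions in the specification are 1-based, which is why the sketch subtracts one before indexing into the call's arguments.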

Language Parsing

The language parsing functions are used to extract strings from source files.  The extraction functions use them automatically, but they can also be invoked directly, which is useful when registering wrapper functions.

New functions can be registered through the setuptools entry point system.

babel.messages.extract.extract_python(fileobj, keywords, comment_tags, options)

Extract messages from Python source code.

It returns an iterator yielding tuples in the following form (lineno, funcname, message, comments).

Parameters
  • fileobj – the seekable, file-like object the messages should be extracted from
  • keywords – a list of keywords (i.e. function names) that should be recognized as translation functions
  • comment_tags – a list of translator tags to search for and include in the results
  • options – a dictionary of additional options (optional)
Return type

iterator

babel.messages.extract.extract_javascript(fileobj, keywords, comment_tags, options)

Extract messages from JavaScript source code.

Parameters
  • fileobj – the seekable, file-like object the messages should be extracted from
  • keywords – a list of keywords (i.e. function names) that should be recognized as translation functions
  • comment_tags – a list of translator tags to search for and include in the results
  • options – a dictionary of additional options (optional). Supported options are:
    • jsx – set to false to disable JSX/E4X support.
    • template_string – set to false to disable ES6 template string support.
babel.messages.extract.extract_nothing(fileobj, keywords, comment_tags, options)

Pseudo extractor that does not actually extract anything, but simply returns an empty list.

MO File Support

The MO file support can read and write MO files.  It reads them into Catalog objects and also writes catalogs out.

babel.messages.mofile.read_mo(fileobj)

Read a binary MO file from the given file-like object and return a corresponding Catalog object.

Parameters

fileobj – the file-like object to read the MO file from

Note

The implementation of this function is heavily based on the GNUTranslations._parse method of the gettext module in the standard library.

babel.messages.mofile.write_mo(fileobj, catalog, use_fuzzy=False)

Write a catalog to the specified file-like object using the GNU MO file format.

>>> import sys
>>> from babel.messages import Catalog
>>> from gettext import GNUTranslations
>>> from io import BytesIO
>>> catalog = Catalog(locale='en_US')
>>> catalog.add('foo', 'Voh')
<Message ...>
>>> catalog.add((u'bar', u'baz'), (u'Bahr', u'Batz'))
<Message ...>
>>> catalog.add('fuz', 'Futz', flags=['fuzzy'])
<Message ...>
>>> catalog.add('Fizz', '')
<Message ...>
>>> catalog.add(('Fuzz', 'Fuzzes'), ('', ''))
<Message ...>
>>> buf = BytesIO()
>>> write_mo(buf, catalog)
>>> x = buf.seek(0)
>>> translations = GNUTranslations(fp=buf)
>>> if sys.version_info[0] >= 3:
...     translations.ugettext = translations.gettext
...     translations.ungettext = translations.ngettext
>>> translations.ugettext('foo')
u'Voh'
>>> translations.ungettext('bar', 'baz', 1)
u'Bahr'
>>> translations.ungettext('bar', 'baz', 2)
u'Batz'
>>> translations.ugettext('fuz')
u'fuz'
>>> translations.ugettext('Fizz')
u'Fizz'
>>> translations.ugettext('Fuzz')
u'Fuzz'
>>> translations.ugettext('Fuzzes')
u'Fuzzes'
Parameters
  • fileobj – the file-like object to write to
  • catalog – the Catalog instance
  • use_fuzzy – whether translations marked as “fuzzy” should be included in the output
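
For readers curious about what ends up in the file, the following sketch assembles a minimal little-endian GNU MO file by hand and loads it with the standard library's gettext module. It is an illustration of the general layout only and is not how write_mo is implemented: build_mo is a hypothetical helper, the hash table is omitted, and plural entries are not handled.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize a {msgid: msgstr} dict into minimal MO file bytes."""
    # Entries are stored sorted by msgid; the empty msgid carries the headers.
    entries = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in messages.items())
    count = len(entries)
    orig_table = 7 * 4                     # right after the seven-word header
    trans_table = orig_table + count * 8   # one (length, offset) pair per entry
    offset = trans_table + count * 8       # string data starts after both tables

    out = struct.pack("<7I", 0x950412DE, 0, count, orig_table, trans_table, 0, 0)
    for key, _ in entries:
        out += struct.pack("<2I", len(key), offset)
        offset += len(key) + 1             # +1 for the NUL terminator
    for _, value in entries:
        out += struct.pack("<2I", len(value), offset)
        offset += len(value) + 1
    out += b"".join(key + b"\0" for key, _ in entries)
    out += b"".join(value + b"\0" for _, value in entries)
    return out


data = build_mo({
    "": "Content-Type: text/plain; charset=UTF-8\n",
    "foo": "Voh",
    "bar": "Bär",
})
translations = gettext.GNUTranslations(fp=io.BytesIO(data))
print(translations.gettext("foo"), translations.gettext("bar"))
```

Untranslated lookups fall back to the message ID itself, the same behavior the doctest above shows for 'Fizz' and 'Fuzz'.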

PO File Support

The PO file support can read and write PO and POT files.  It reads them into Catalog objects and also writes catalogs out.

babel.messages.pofile.read_po(fileobj, locale=None, domain=None, ignore_obsolete=False, charset=None, abort_invalid=False)

Read messages from a gettext PO (portable object) file from the given file-like object and return a Catalog.

>>> from datetime import datetime
>>> from io import StringIO
>>> buf = StringIO('''
... #: main.py:1
... #, fuzzy, python-format
... msgid "foo %(name)s"
... msgstr "quux %(name)s"
...
... # A user comment
... #. An auto comment
... #: main.py:3
... msgid "bar"
... msgid_plural "baz"
... msgstr[0] "bar"
... msgstr[1] "baaz"
... ''')
>>> catalog = read_po(buf)
>>> catalog.revision_date = datetime(2007, 4, 1)
>>> for message in catalog:
...     if message.id:
...         print((message.id, message.string))
...         print(' ', (message.locations, sorted(list(message.flags))))
...         print(' ', (message.user_comments, message.auto_comments))
(u'foo %(name)s', u'quux %(name)s')
  ([(u'main.py', 1)], [u'fuzzy', u'python-format'])
  ([], [])
((u'bar', u'baz'), (u'bar', u'baaz'))
  ([(u'main.py', 3)], [])
  ([u'A user comment'], [u'An auto comment'])

New in version 1.0: Added support for explicit charset argument.

Parameters
  • fileobj – the file-like object to read the PO file from
  • locale – the locale identifier or Locale object, or None if the catalog is not bound to a locale (which basically means it’s a template)
  • domain – the message domain
  • ignore_obsolete – whether to ignore obsolete messages in the input
  • charset – the character set of the catalog.
  • abort_invalid – abort read if po file is invalid
babel.messages.pofile.write_po(fileobj, catalog, width=76, no_location=False, omit_header=False, sort_output=False, sort_by_file=False, ignore_obsolete=False, include_previous=False, include_lineno=True)

Write a gettext PO (portable object) template file for a given message catalog to the provided file-like object.

>>> catalog = Catalog()
>>> catalog.add(u'foo %(name)s', locations=[('main.py', 1)],
...             flags=('fuzzy',))
<Message...>
>>> catalog.add((u'bar', u'baz'), locations=[('main.py', 3)])
<Message...>
>>> from io import BytesIO
>>> buf = BytesIO()
>>> write_po(buf, catalog, omit_header=True)
>>> print(buf.getvalue().decode("utf8"))
#: main.py:1
#, fuzzy, python-format
msgid "foo %(name)s"
msgstr ""

#: main.py:3
msgid "bar"
msgid_plural "baz"
msgstr[0] ""
msgstr[1] ""
Parameters
  • fileobj – the file-like object to write to
  • catalog – the Catalog instance
  • width – the maximum line width for the generated output; use None, 0, or a negative number to completely disable line wrapping
  • no_location – do not emit a location comment for every message
  • omit_header – do not include the msgid "" entry at the top of the output
  • sort_output – whether to sort the messages in the output by msgid
  • sort_by_file – whether to sort the messages in the output by their locations
  • ignore_obsolete – whether to ignore obsolete messages and not include them in the output; by default they are included as comments
  • include_previous – include the old msgid as a comment when updating the catalog
  • include_lineno – include line number in the location comment
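
The structure of a single PO entry can be sketched in a few lines. The helpers below, escape_po and format_po_entry, are hypothetical and far simpler than write_po: there is no line wrapping, no comment handling, and flags such as python-format must be passed in explicitly.

```python
def escape_po(text):
    """Quote a string the way PO files expect (simplified)."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\t", "\\t")
    )
    return '"%s"' % escaped


def format_po_entry(msgid, msgstr="", locations=(), flags=()):
    """Render one message as PO text; msgid may be a (singular, plural) tuple."""
    lines = []
    if locations:
        lines.append("#: " + " ".join("%s:%d" % (name, lineno) for name, lineno in locations))
    if flags:
        lines.append("#, " + ", ".join(sorted(flags)))
    if isinstance(msgid, tuple):
        singular, plural = msgid
        lines.append("msgid " + escape_po(singular))
        lines.append("msgid_plural " + escape_po(plural))
        strings = msgstr if isinstance(msgstr, tuple) else ("", "")
        for index, string in enumerate(strings):
            lines.append("msgstr[%d] %s" % (index, escape_po(string)))
    else:
        lines.append("msgid " + escape_po(msgid))
        lines.append("msgstr " + escape_po(msgstr))
    return "\n".join(lines)


first = format_po_entry(
    "foo %(name)s",
    locations=[("main.py", 1)],
    flags=("fuzzy", "python-format"),
)
second = format_po_entry(("bar", "baz"), locations=[("main.py", 3)])
print(first + "\n\n" + second)
```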

Numbers and Currencies

The number module provides functionality to format numbers for different locales.  This includes arbitrary numbers as well as currency.

Number Formatting

babel.numbers.format_number(number, locale='en_US_POSIX')

Return the given number formatted for a specific locale.

>>> format_number(1099, locale='en_US')
u'1,099'
>>> format_number(1099, locale='de_DE')
u'1.099'

Deprecated since version 2.6.0: Use babel.numbers.format_decimal() instead.

Parameters
  • number – the number to format
  • locale – the Locale object or locale identifier
babel.numbers.format_decimal(number, format=None, locale='en_US_POSIX', decimal_quantization=True, group_separator=True)

Return the given decimal number formatted for a specific locale.

>>> format_decimal(1.2345, locale='en_US')
u'1.234'
>>> format_decimal(1.2346, locale='en_US')
u'1.235'
>>> format_decimal(-1.2346, locale='en_US')
u'-1.235'
>>> format_decimal(1.2345, locale='sv_SE')
u'1,234'
>>> format_decimal(1.2345, locale='de')
u'1,234'

The appropriate thousands grouping and the decimal separator are used for each locale:

>>> format_decimal(12345.5, locale='en_US')
u'12,345.5'

By default the locale is allowed to truncate and round a high-precision number by forcing its format pattern onto the decimal part. You can bypass this behavior with the decimal_quantization parameter:

>>> format_decimal(1.2346, locale='en_US')
u'1.235'
>>> format_decimal(1.2346, locale='en_US', decimal_quantization=False)
u'1.2346'
>>> format_decimal(12345.67, locale='fr_CA', group_separator=False)
u'12345,67'
>>> format_decimal(12345.67, locale='en_US', group_separator=True)
u'12,345.67'
Parameters
  • number – the number to format
  • format – the format pattern to use; if None, the locale’s default decimal pattern is used
  • locale – the Locale object or locale identifier
  • decimal_quantization – Truncate and round high-precision numbers to the format pattern. Defaults to True.
  • group_separator – Boolean to switch group separator on/off in a locale’s number format.
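
The rounding and separator behavior in the examples above can be reproduced with the standard decimal module. This is a sketch under stated assumptions rather than Babel's algorithm: format_decimal_sketch is a hypothetical helper, it assumes at most three fraction digits with trailing zeros dropped, and it assumes round-half-even, which is consistent with 1.2345 being shown as 1.234.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def format_decimal_sketch(number, group=",", decimal_point=".", max_frac=3):
    """Format a number with grouping and at most max_frac fraction digits."""
    quantum = Decimal(1).scaleb(-max_frac)  # e.g. 0.001 for three digits
    value = Decimal(str(number)).quantize(quantum, rounding=ROUND_HALF_EVEN)
    text = format(value, ",f")  # US-style output such as '12,345.500'
    if "." in text:
        text = text.rstrip("0").rstrip(".")  # fraction digits are optional
    integer, _, fraction = text.partition(".")
    integer = integer.replace(",", group)  # swap in the locale's group separator
    return integer + (decimal_point + fraction if fraction else "")


us_style = format_decimal_sketch(12345.5)
de_style = format_decimal_sketch(1.2345, group=".", decimal_point=",")
ungrouped = format_decimal_sketch(12345.67, group="", decimal_point=",")
print(us_style, de_style, ungrouped)
```

Passing an empty group string plays the role of group_separator=False in this sketch.
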
babel.numbers.format_currency(number, currency, format=None, locale='en_US_POSIX', currency_digits=True, format_type='standard', decimal_quantization=True, group_separator=True)

Return formatted currency value.

>>> format_currency(1099.98, 'USD', locale='en_US')
u'$1,099.98'
>>> format_currency(1099.98, 'USD', locale='es_CO')
u'US$\xa01.099,98'
>>> format_currency(1099.98, 'EUR', locale='de_DE')
u'1.099,98\xa0\u20ac'

The format can also be specified explicitly.  The currency is placed with the ‘¤’ sign.  As the sign gets repeated the format expands (¤ being the symbol, ¤¤ is the currency abbreviation and ¤¤¤ is the full name of the currency):

>>> format_currency(1099.98, 'EUR', u'¤¤ #,##0.00', locale='en_US')
u'EUR 1,099.98'
>>> format_currency(1099.98, 'EUR', u'#,##0.00 ¤¤¤', locale='en_US')
u'1,099.98 euros'

Currencies usually have a specific number of decimal digits. This function favours that information over the given format:

>>> format_currency(1099.98, 'JPY', locale='en_US')
u'\xa51,100'
>>> format_currency(1099.98, 'COP', u'#,##0.00', locale='es_ES')
u'1.099,98'

However, the number of decimal digits taken from the currency information can be overridden by setting the currency_digits parameter to False:

>>> format_currency(1099.98, 'JPY', locale='en_US', currency_digits=False)
u'\xa51,099.98'
>>> format_currency(1099.98, 'COP', u'#,##0.00', locale='es_ES', currency_digits=False)
u'1.099,98'

If a format is not specified the type of currency format to use from the locale can be specified:

>>> format_currency(1099.98, 'EUR', locale='en_US', format_type='standard')
u'\u20ac1,099.98'

When the given currency format type is not available, an exception is raised:

>>> format_currency('1099.98', 'EUR', locale='root', format_type='unknown')
Traceback (most recent call last):
    ...
UnknownCurrencyFormatError: "'unknown' is not a known currency format type"
>>> format_currency(101299.98, 'USD', locale='en_US', group_separator=False)
u'$101299.98'
>>> format_currency(101299.98, 'USD', locale='en_US', group_separator=True)
u'$101,299.98'

You can also pass format_type='name' to use long display names. The order of the number and currency name, along with the correct localized plural form of the currency name, is chosen according to locale:

>>> format_currency(1, 'USD', locale='en_US', format_type='name')
u'1.00 US dollar'
>>> format_currency(1099.98, 'USD', locale='en_US', format_type='name')
u'1,099.98 US dollars'
>>> format_currency(1099.98, 'USD', locale='ee', format_type='name')
u'us ga dollar 1,099.98'

By default the locale is allowed to truncate and round a high-precision number by forcing its format pattern onto the decimal part. You can bypass this behavior with the decimal_quantization parameter:

>>> format_currency(1099.9876, 'USD', locale='en_US')
u'$1,099.99'
>>> format_currency(1099.9876, 'USD', locale='en_US', decimal_quantization=False)
u'$1,099.9876'
Parameters
  • number – the number to format
  • currency – the currency code
  • format – the format string to use
  • locale – the Locale object or locale identifier
  • currency_digits – use the currency’s natural number of decimal digits
  • format_type – the currency format type to use
  • decimal_quantization – Truncate and round high-precision numbers to the format pattern. Defaults to True.
  • group_separator – Boolean to switch group separator on/off in a locale’s number format.
babel.numbers.format_percent(number, format=None, locale='en_US_POSIX', decimal_quantization=True, group_separator=True)

Return formatted percent value for a specific locale.

>>> format_percent(0.34, locale='en_US')
u'34%'
>>> format_percent(25.1234, locale='en_US')
u'2,512%'
>>> format_percent(25.1234, locale='sv_SE')
u'2\xa0512\xa0%'

The format pattern can also be specified explicitly:

>>> format_percent(25.1234, u'#,##0‰', locale='en_US')
u'25,123‰'

By default the locale is allowed to truncate and round a high-precision number by forcing its format pattern onto the decimal part. You can bypass this behavior with the decimal_quantization parameter:

>>> format_percent(23.9876, locale='en_US')
u'2,399%'
>>> format_percent(23.9876, locale='en_US', decimal_quantization=False)
u'2,398.76%'
>>> format_percent(229291.1234, locale='pt_BR', group_separator=False)
u'22929112%'
>>> format_percent(229291.1234, locale='pt_BR', group_separator=True)
u'22.929.112%'
Parameters
  • number – the percent number to format
  • format
  • locale – the Locale object or locale identifier
  • decimal_quantization – Truncate and round high-precision numbers to the format pattern. Defaults to True.
  • group_separator – Boolean to switch group separator on/off in a locale’s number format.
babel.numbers.format_scientific(number, format=None, locale='en_US_POSIX', decimal_quantization=True)

Return value formatted in scientific notation for a specific locale.

>>> format_scientific(10000, locale='en_US')
u'1E4'

The format pattern can also be specified explicitly:

>>> format_scientific(1234567, u'##0.##E00', locale='en_US')
u'1.23E06'

By default the locale is allowed to truncate and round a high-precision number by forcing its format pattern onto the decimal part. You can bypass this behavior with the decimal_quantization parameter:

>>> format_scientific(1234.9876, u'#.##E0', locale='en_US')
u'1.23E3'
>>> format_scientific(1234.9876, u'#.##E0', locale='en_US', decimal_quantization=False)
u'1.2349876E3'
Parameters
  • number – the number to format
  • format
  • locale – the Locale object or locale identifier
  • decimal_quantization – Truncate and round high-precision numbers to the format pattern. Defaults to True.

Number Parsing

babel.numbers.parse_number(string, locale='en_US_POSIX')

Parse localized number string into an integer.

>>> parse_number('1,099', locale='en_US')
1099
>>> parse_number('1.099', locale='de_DE')
1099

When the given string cannot be parsed, an exception is raised:

>>> parse_number('1.099,98', locale='de')
Traceback (most recent call last):
    ...
NumberFormatError: '1.099,98' is not a valid number
Parameters
  • string – the string to parse
  • locale – the Locale object or locale identifier
Returns

the parsed number

Raises

NumberFormatError – if the string cannot be converted to a number

babel.numbers.parse_decimal(string, locale='en_US_POSIX', strict=False)

Parse localized decimal string into a decimal.

>>> parse_decimal('1,099.98', locale='en_US')
Decimal('1099.98')
>>> parse_decimal('1.099,98', locale='de')
Decimal('1099.98')
>>> parse_decimal('12 345,123', locale='ru')
Decimal('12345.123')

When the given string cannot be parsed, an exception is raised:

>>> parse_decimal('2,109,998', locale='de')
Traceback (most recent call last):
    ...
NumberFormatError: '2,109,998' is not a valid decimal number

If strict is set to True and the given string contains a number formatted in an irregular way, an exception is raised:

>>> parse_decimal('30.00', locale='de', strict=True)
Traceback (most recent call last):
    ...
NumberFormatError: '30.00' is not a properly formatted decimal number. Did you mean '3.000'? Or maybe '30,00'?
>>> parse_decimal('0.00', locale='de', strict=True)
Traceback (most recent call last):
    ...
NumberFormatError: '0.00' is not a properly formatted decimal number. Did you mean '0'?
Parameters
  • string – the string to parse
  • locale – the Locale object or locale identifier
  • strict – controls whether numbers formatted in a weird way are accepted or rejected
Raises

NumberFormatError – if the string cannot be converted to a decimal number

Exceptions

exception babel.numbers.NumberFormatError(message, suggestions=None)

Exception raised when a string cannot be parsed into a number.

suggestions

a list of properly formatted numbers derived from the invalid input

Data Access

babel.numbers.get_currency_name(currency, count=None, locale='en_US_POSIX')

Return the name used by the locale for the specified currency.

>>> get_currency_name('USD', locale='en_US')
u'US Dollar'

New in version 0.9.4.

Parameters
  • currency – the currency code.
  • count – the optional count.  If provided the currency name will be pluralized to that number if possible.
  • locale – the Locale object or locale identifier.
babel.numbers.get_currency_symbol(currency, locale='en_US_POSIX')

Return the symbol used by the locale for the specified currency.

>>> get_currency_symbol('USD', locale='en_US')
u'$'
Parameters
  • currency – the currency code.
  • locale – the Locale object or locale identifier.
babel.numbers.get_currency_unit_pattern(currency, count=None, locale='en_US_POSIX')

Return the unit pattern used for long display of a currency value for a given locale. This is a string containing {0} where the numeric part should be substituted and {1} where the currency long display name should be substituted.

>>> get_currency_unit_pattern('USD', locale='en_US', count=10)
u'{0} {1}'

New in version 2.7.0.

Parameters
  • currency – the currency code.
  • count – the optional count.  If provided the unit pattern for that number will be returned.
  • locale – the Locale object or locale identifier.
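
A unit pattern of this shape can be filled in with ordinary string formatting. The sketch below is only an illustration: apply_unit_pattern is a hypothetical helper, and the second pattern, u'{1} {0}', is an assumed example of a locale that places the currency name first, not a value read from the CLDR.

```python
def apply_unit_pattern(pattern, formatted_number, display_name):
    """Substitute the numeric part for {0} and the currency name for {1}."""
    return pattern.format(formatted_number, display_name)


# Pattern as returned in the example above for en_US.
number_first = apply_unit_pattern(u'{0} {1}', u'1,099.98', u'US dollars')
# -> u'1,099.98 US dollars'

# Assumed pattern for a locale that puts the name before the number.
name_first = apply_unit_pattern(u'{1} {0}', u'1,099.98', u'us ga dollar')
# -> u'us ga dollar 1,099.98'
```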
babel.numbers.get_decimal_symbol(locale='en_US_POSIX')

Return the symbol used by the locale to separate decimal fractions.

>>> get_decimal_symbol('en_US')
u'.'
Parameters

locale – the Locale object or locale identifier

babel.numbers.get_plus_sign_symbol(locale='en_US_POSIX')

Return the plus sign symbol used by the current locale.

>>> get_plus_sign_symbol('en_US')
u'+'
Parameters

locale – the Locale object or locale identifier

babel.numbers.get_minus_sign_symbol(locale='en_US_POSIX')

Return the minus sign symbol used by the current locale.

>>> get_minus_sign_symbol('en_US')
u'-'
Parameters

locale – the Locale object or locale identifier

babel.numbers.get_territory_currencies(territory, start_date=None, end_date=None, tender=True, non_tender=False, include_details=False)

Returns the list of currencies for the given territory that are valid for the given date range.  In addition to that the currency database distinguishes between tender and non-tender currencies.  By default only tender currencies are returned.

The return value is a list of currencies, roughly ordered by the date on which each currency became active: the longer a currency has been in use, the closer to the start of the list it appears.

The start date defaults to today.  If no end date is given it will be the same as the start date.  Otherwise a range can be defined.  For instance this can be used to find the currencies in use in Austria between 1995 and 2011:

>>> from datetime import date
>>> get_territory_currencies('AT', date(1995, 1, 1), date(2011, 1, 1))
['ATS', 'EUR']

Likewise it’s also possible to find all the currencies in use on a single date:

>>> get_territory_currencies('AT', date(1995, 1, 1))
['ATS']
>>> get_territory_currencies('AT', date(2011, 1, 1))
['EUR']

By default the return value only includes tender currencies.  This however can be changed:

>>> get_territory_currencies('US')
['USD']
>>> get_territory_currencies('US', tender=False, non_tender=True,
...                          start_date=date(2014, 1, 1))
['USN', 'USS']

New in version 2.0.

Parameters
  • territory – the name of the territory to find the currency for.
  • start_date – the start date.  If not given today is assumed.
  • end_date – the end date.  If not given the start date is assumed.
  • tender – controls whether tender currencies should be included.
  • non_tender – controls whether non-tender currencies should be included.
  • include_details – if set to True, instead of returning currency codes the return value will be dictionaries with detail information.  In that case each dictionary will have the keys 'currency', 'from', 'to', and 'tender'.
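
The date-range selection described above amounts to an interval overlap test. The following sketch works on records shaped like the include_details dictionaries; it assumes that 'from' and 'to' hold date objects or None, and the sample records, their dates, and the helper name active_between are illustrative only, not data or code taken from Babel.

```python
from datetime import date

# Illustrative records in the documented shape; the dates are sample values
# for this sketch, not data read from the CLDR.
records = [
    {'currency': 'ATS', 'from': date(1947, 12, 4), 'to': date(2002, 2, 28), 'tender': True},
    {'currency': 'EUR', 'from': date(1999, 1, 1), 'to': None, 'tender': True},
]


def active_between(records, start, end):
    """Return the codes whose validity period overlaps [start, end]."""
    codes = []
    for record in records:
        begins = record['from'] or date.min  # open start: always valid before
        ends = record['to'] or date.max      # open end: still in use
        if begins <= end and ends >= start:
            codes.append(record['currency'])
    return codes


in_range = active_between(records, date(1995, 1, 1), date(2011, 1, 1))
on_day = active_between(records, date(2011, 1, 1), date(2011, 1, 1))
```

Passing the same date as start and end corresponds to the single-date lookup, where the end date defaults to the start date.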

Pluralization Support

The pluralization support provides functionality around the CLDR pluralization rules.  It can parse and evaluate pluralization rules, as well as convert them to other formats such as gettext.

Basic Interface

class babel.plural.PluralRule(rules)

Represents a set of language pluralization rules.  The constructor accepts a list of (tag, expr) tuples or a dict of CLDR rules. The resulting object is callable: it takes a single number (positive or negative, integer or float) and returns the tag of the plural form that applies to it:

>>> rule = PluralRule({'one': 'n is 1'})
>>> rule(1)
'one'
>>> rule(2)
'other'

Currently the CLDR defines these tags: zero, one, two, few, many and other where other is an implicit default.  Rules should be mutually exclusive; for a given numeric value, only one rule should apply (i.e. the condition should only be true for one of the plural rule elements).
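
The role of the implicit other default can be sketched in plain Python. This is a simplified illustration and not how PluralRule is implemented: make_plural_function is a hypothetical helper, and the rules are given as Python predicates rather than CLDR expressions.

```python
def make_plural_function(rules):
    """Build a callable from (tag, predicate) pairs checked in order.

    If no predicate matches, the implicit default tag 'other' is returned.
    """
    def plural(n):
        for tag, predicate in rules:
            if predicate(n):
                return tag
        return 'other'
    return plural


plural = make_plural_function([
    ('one', lambda n: n == 1),
    ('few', lambda n: n in (2, 3, 4)),
])

tags = [plural(n) for n in (1, 3, 7, 2.5)]
# -> ['one', 'few', 'other', 'other']
```

Because the predicates here are mutually exclusive, the order in which they are checked does not change the result.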

classmethod parse(rules)

Create a PluralRule instance for the given rules.  If the rules are a PluralRule object, that object is returned.

Parameters

rules – the rules as list or dict, or a PluralRule object

Raises

RuleError – if the expression is malformed

property rules

The PluralRule as a dict of unicode plural rules.

>>> rule = PluralRule({'one': 'n is 1'})
>>> rule.rules
{'one': 'n is 1'}
property tags

A set of explicitly defined tags in this rule.  The implicit default 'other' rule is not part of this set unless there is an explicit rule for it.

Conversion Functionality

babel.plural.to_javascript(rule)

Convert a list/dict of rules or a PluralRule object into a JavaScript function.  This function depends on no external library:

>>> to_javascript({'one': 'n is 1'})
"(function(n) { return (n == 1) ? 'one' : 'other'; })"

Implementation detail: The generated function will probably evaluate expressions involved in range operations multiple times.  This has the advantage that external helper functions are not required, and it is not a big performance hit for these simple calculations.

Parameters

rule – the rules as list or dict, or a PluralRule object

Raises

RuleError – if the expression is malformed

babel.plural.to_python(rule)

Convert a list/dict of rules or a PluralRule object into a regular Python function.  This is useful in situations where you need a real function and don’t care about the actual rule object:

>>> func = to_python({'one': 'n is 1', 'few': 'n in 2..4'})
>>> func(1)
'one'
>>> func(3)
'few'
>>> func = to_python({'one': 'n in 1,11', 'few': 'n in 3..10,13..19'})
>>> func(11)
'one'
>>> func(15)
'few'
Parameters

rule – the rules as list or dict, or a PluralRule object

Raises

RuleError – if the expression is malformed

babel.plural.to_gettext(rule)

The plural rule as gettext expression.  The gettext expression is technically limited to integers and returns indices rather than tags.

>>> to_gettext({'one': 'n is 1', 'two': 'n is 2'})
'nplurals=3; plural=((n == 1) ? 0 : (n == 2) ? 1 : 2);'
Parameters

rule – the rules as list or dict, or a PluralRule object

Raises

RuleError – if the expression is malformed
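
To see what "indices rather than tags" means, the gettext expression from the example above can be transcribed into Python by hand. This is only a sketch: the names TAGS, gettext_plural_index and tag_for are hypothetical, and the tag order is inferred from the expression shown above.

```python
# Tag order matching the indices in the example: 0 -> one, 1 -> two, 2 -> other.
TAGS = ['one', 'two', 'other']


def gettext_plural_index(n):
    # Python rendering of: plural=((n == 1) ? 0 : (n == 2) ? 1 : 2)
    return 0 if n == 1 else 1 if n == 2 else 2


def tag_for(n):
    """Map the integer index back to the tag a PluralRule would return."""
    return TAGS[gettext_plural_index(n)]


indices = [gettext_plural_index(n) for n in (1, 2, 5)]  # -> [0, 1, 2]
```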

General Support Functionality

Babel ships a few general helpers that are not used by Babel itself but are useful in combination with the functionality it provides.

Convenience Helpers

class babel.support.Format(locale, tzinfo=None)

Wrapper class providing the various date and number formatting functions bound to a specific locale and time-zone.

>>> from babel.util import UTC
>>> from datetime import date
>>> fmt = Format('en_US', UTC)
>>> fmt.date(date(2007, 4, 1))
u'Apr 1, 2007'
>>> fmt.decimal(1.2345)
u'1.234'
currency(number, currency)

Return a number in the given currency formatted for the locale.

date(date=None, format='medium')

Return a date formatted according to the given pattern.

>>> from datetime import date
>>> fmt = Format('en_US')
>>> fmt.date(date(2007, 4, 1))
u'Apr 1, 2007'
datetime(datetime=None, format='medium')

Return a date and time formatted according to the given pattern.

>>> from datetime import datetime
>>> from pytz import timezone
>>> fmt = Format('en_US', tzinfo=timezone('US/Eastern'))
>>> fmt.datetime(datetime(2007, 4, 1, 15, 30))
u'Apr 1, 2007, 11:30:00 AM'
decimal(number, format=None)

Return a decimal number formatted for the locale.

>>> fmt = Format('en_US')
>>> fmt.decimal(1.2345)
u'1.234'
number(number)

Return an integer number formatted for the locale.

>>> fmt = Format('en_US')
>>> fmt.number(1099)
u'1,099'
percent(number, format=None)

Return a number formatted as percentage for the locale.

>>> fmt = Format('en_US')
>>> fmt.percent(0.34)
u'34%'
scientific(number)

Return a number formatted using scientific notation for the locale.

time(time=None, format='medium')

Return a time formatted according to the given pattern.

>>> from datetime import datetime
>>> from pytz import timezone
>>> fmt = Format('en_US', tzinfo=timezone('US/Eastern'))
>>> fmt.time(datetime(2007, 4, 1, 15, 30))
u'11:30:00 AM'
timedelta(delta, granularity='second', threshold=0.85, format='long', add_direction=False)

Return a time delta according to the rules of the given locale.

>>> from datetime import timedelta
>>> fmt = Format('en_US')
>>> fmt.timedelta(timedelta(weeks=11))
u'3 months'
class babel.support.LazyProxy(func, *args, **kwargs)

Class for proxy objects that delegate to a specified function to evaluate the actual object.

>>> def greeting(name='world'):
...     return 'Hello, %s!' % name
>>> lazy_greeting = LazyProxy(greeting, name='Joe')
>>> print(lazy_greeting)
Hello, Joe!
>>> u'  ' + lazy_greeting
u'  Hello, Joe!'
>>> u'(%s)' % lazy_greeting
u'(Hello, Joe!)'

This can be used, for example, to implement lazy translation functions that delay the actual translation until the string is actually used. The rationale for such behavior is that the locale of the user may not always be available. In web applications, you only know the locale when processing a request.

The proxy implementation attempts to be as complete as possible, so that the lazy objects should mostly work as expected, for example for sorting:

>>> greetings = [
...     LazyProxy(greeting, 'world'),
...     LazyProxy(greeting, 'Joe'),
...     LazyProxy(greeting, 'universe'),
... ]
>>> greetings.sort()
>>> for greeting in greetings:
...     print(greeting)
Hello, Joe!
Hello, universe!
Hello, world!
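
The lazy translation idea mentioned above can be sketched without Babel. The class below is a deliberately simplified stand-in, not LazyProxy itself: it only supports str() and re-evaluates on every use. The names LazyString, current_locale, CATALOGS and translate are all hypothetical.

```python
class LazyString(object):
    """Simplified stand-in for a lazy proxy: calls func each time it is rendered."""

    def __init__(self, func, *args, **kwargs):
        self._func = func
        self._args = args
        self._kwargs = kwargs

    def __str__(self):
        return self._func(*self._args, **self._kwargs)


# Hypothetical per-request state and message catalogs, for illustration only.
current_locale = {'value': 'en'}
CATALOGS = {'en': {'greeting': 'Hello'}, 'de': {'greeting': 'Hallo'}}


def translate(key):
    return CATALOGS[current_locale['value']][key]


# Created at import time, before any request (and thus any locale) is known.
label = LazyString(translate, 'greeting')

current_locale['value'] = 'de'   # a request from a German-speaking user
rendered_de = str(label)         # -> 'Hallo'
current_locale['value'] = 'en'   # a later request from an English-speaking user
rendered_en = str(label)         # -> 'Hello'
```

The same label object yields a different string per request because the lookup happens only when the value is rendered.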

Gettext Support

class babel.support.Translations(fp=None, domain=None)

An extended translation catalog class.

add(translations, merge=True)

Add the given translations to the catalog.

If the domain of the translations is different than that of the current catalog, they are added as a catalog that is only accessible by the various d*gettext functions.

Parameters
  • translations – the Translations instance with the messages to add
  • merge – whether translations for message domains that have already been added should be merged with the existing translations
classmethod load(dirname=None, locales=None, domain=None)

Load translations from the given directory.

Parameters
  • dirname – the directory containing the MO files
  • locales – the list of locales in order of preference (items in this list can be either Locale objects or locale strings)
  • domain – the message domain (default: ‘messages’)
merge(translations)

Merge the given translations into the catalog.

Message translations in the specified catalog override any messages with the same identifier in the existing catalog.

Parameters

translations – the Translations instance with the messages to merge

Units

The unit module provides functionality to format measurement units for different locales.

babel.units.format_unit(value, measurement_unit, length='long', format=None, locale='en_US_POSIX')

Format a value of a given unit.

Values are formatted according to the locale’s usual pluralization rules and number formats.

>>> format_unit(12, 'length-meter', locale='ro_RO')
u'12 metri'
>>> format_unit(15.5, 'length-mile', locale='fi_FI')
u'15,5 mailia'
>>> format_unit(1200, 'pressure-millimeter-ofhg', locale='nb')
u'1\xa0200 millimeter kvikks\xf8lv'
>>> format_unit(270, 'ton', locale='en')
u'270 tons'

Number formats may be overridden with the format parameter.

>>> import decimal
>>> format_unit(decimal.Decimal("-42.774"), 'temperature-celsius', 'short', format='#.0', locale='fr')
u'-42,8\u202f\xb0C'

The locale’s usual pluralization rules are respected.

>>> format_unit(1, 'length-meter', locale='ro_RO')
u'1 metru'
>>> format_unit(0, 'length-mile', locale='cy')
u'0 mi'
>>> format_unit(1, 'length-mile', locale='cy')
u'1 filltir'
>>> format_unit(3, 'length-mile', locale='cy')
u'3 milltir'
>>> format_unit(15, 'length-horse', locale='fi')
Traceback (most recent call last):
    ...
UnknownUnitError: length-horse is not a known unit in fi

New in version 2.2.0.

Parameters
  • value – the value to format. If this is a string, no number formatting will be attempted.
  • measurement_unit – the code of a measurement unit. Known units can be found in the CLDR Unit Validity XML file: https://unicode.org/repos/cldr/tags/latest/common/validity/unit.xml
  • length – “short”, “long” or “narrow”
  • format – An optional format, as accepted by format_decimal.
  • locale – the Locale object or locale identifier
babel.units.format_compound_unit(numerator_value, numerator_unit=None, denominator_value=1, denominator_unit=None, length='long', format=None, locale='en_US_POSIX')

Format a compound number value, e.g. “kilometers per hour” or similar.

Both unit specifiers are optional to allow for formatting of arbitrary values still according to the locale’s general “per” formatting specifier.

>>> format_compound_unit(7, denominator_value=11, length="short", locale="pt")
'7/11'
>>> format_compound_unit(150, "kilometer", denominator_unit="hour", locale="sv")
'150 kilometer per timme'
>>> format_compound_unit(150, "kilowatt", denominator_unit="year", locale="fi")
'150 kilowattia / vuosi'
>>> format_compound_unit(32.5, "ton", 15, denominator_unit="hour", locale="en")
'32.5 tons per 15 hours'
>>> format_compound_unit(160, denominator_unit="square-meter", locale="fr")
'160 par m\xe8tre carr\xe9'
>>> format_compound_unit(4, "meter", "ratakisko", length="short", locale="fi")
'4 m/ratakisko'
>>> format_compound_unit(35, "minute", denominator_unit="fathom", locale="sv")
'35 minuter per famn'
>>> from babel.numbers import format_currency
>>> format_compound_unit(format_currency(35, "JPY", locale="de"), denominator_unit="liter", locale="de")
'35\xa0\xa5 pro Liter'

See https://www.unicode.org/reports/tr35/tr35-general.html#perUnitPatterns

Parameters
  • numerator_value – The numerator value. This may be a string, in which case it is considered preformatted and the unit is ignored.
  • numerator_unit – The numerator unit. See format_unit.
  • denominator_value – The denominator value. This may be a string, in which case it is considered preformatted and the unit is ignored.
  • denominator_unit – The denominator unit. See format_unit.
  • length – The formatting length. “short”, “long” or “narrow”
  • format – An optional format, as accepted by format_decimal.
  • locale – the Locale object or locale identifier
Returns

A formatted compound value.

babel.units.get_unit_name(measurement_unit, length='long', locale='en_US_POSIX')

Get the display name for a measurement unit in the given locale.

>>> get_unit_name("radian", locale="en")
'radians'

Unknown units will raise exceptions:

>>> get_unit_name("battery", locale="fi")
Traceback (most recent call last):
    ...
UnknownUnitError: battery/long is not a known unit/length in fi
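Parameters
  • measurement_unit – the code of a measurement unit. See format_unit.
  • length – “short”, “long” or “narrow”
  • locale – the Locale object or locale identifier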
Returns

The unit display name, or None.

Additional Notes

Babel Development

Babel as a library has a long history that goes back to the Trac project. Since then it has evolved into an independently developed project that implements data access for the CLDR project.

This document tries to explain as best as possible the general rules of the project in case you want to help out developing.

Tracking the CLDR

Generally the goal of the project is to work as closely as possible with the CLDR data.  This has in the past caused some frustrating problems because the data is entirely out of our hands.  To minimize the frustration we generally deal with CLDR updates the following way:

  • bump the CLDR data only with a major release of Babel.
  • never perform custom bugfixes on the CLDR data.
  • never work around CLDR bugs within Babel.  If you find a problem in the data, report it upstream.
  • adjust the parsing of the data as soon as possible, otherwise this will spiral out of control later.  This is especially the case for bigger updates that change pluralization and more.
  • try not to test against specific CLDR data that is likely to change.

Python Versions

At the moment the following Python versions should be supported:

  • Python 2.7
  • Python 3.4 and up
  • PyPy tracking 2.7 and 3.2 and up

While PyPy does not currently support 3.3, it does support traditional unicode literals which simplifies the entire situation tremendously.

Documentation must build on Python 2; Python 3 support for the documentation is an optional goal.  Code examples in the docs are preferably written in a style that makes them work on both 2.x and 3.x, with preference to the former.

Unicode

Unicode is a big deal in Babel.  Here is how the rules are set up:

  • internally everything that makes sense to have as unicode is unicode. The exceptions to this rule are things which on Python 2 traditionally have been bytes.  For example file names on Python 2 should be treated as bytes wherever possible.
  • Encode / decode at boundaries explicitly.  Never assume an encoding in a way it cannot be overridden.  utf-8 should be generally considered the default encoding.
  • Do not use unicode_literals, instead use the u'' string syntax.  The reason for this is that the former introduces countless unicode problems by accidentally upgrading strings to unicode which should not be (docstrings, for instance).
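
The boundary rule above can be illustrated with two small helpers. This is a sketch under stated assumptions: decode_input and encode_output are hypothetical names, not functions from Babel. They default to utf-8 but always let the caller override the encoding.

```python
def decode_input(raw, encoding='utf-8'):
    """Turn bytes arriving from outside into unicode, at the boundary."""
    return raw.decode(encoding)


def encode_output(text, encoding='utf-8'):
    """Turn unicode back into bytes only when it leaves the program."""
    return text.encode(encoding)


# utf-8 is the default, but the caller can always override it.
text = decode_input(b'Gr\xc3\xbc\xc3\x9fe')          # -> u'Gr\xfc\xdfe'
latin1_bytes = encode_output(text, 'latin-1')        # -> b'Gr\xfc\xdfe'
```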

Dates and Timezones

Generally all timezone support in Babel is based on pytz, which Babel simply depends on.  Babel should assume that timezone objects are pytz based because those are the only ones with an API that actually works correctly (due to the API problems with non UTC based timezones).

Assumptions to make:

  • use UTC where possible.
  • be super careful with local time.  Do not use local time without knowing the exact timezone.
  • time without date is a very useless construct.  Do not try to support timezones for it.  If you do, assume the current local date, not the UTC date.
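
The first two assumptions can be sketched with the standard library alone. The helper to_utc is hypothetical, and the example uses a fixed-offset datetime.timezone (Python 3 only) purely for illustration; for pytz zones the timezone should be attached with the zone's localize() method rather than replace().

```python
from datetime import datetime, timedelta, timezone


def to_utc(value, assumed_tz=None):
    """Convert a datetime to UTC, refusing to guess the zone of naive values."""
    if value.tzinfo is None:
        if assumed_tz is None:
            raise ValueError('naive datetime: the exact timezone must be known')
        # Safe for fixed-offset zones only; see the note on pytz above.
        value = value.replace(tzinfo=assumed_tz)
    return value.astimezone(timezone.utc)


# Illustrative fixed offset of UTC-4, standing in for a known local zone.
local_zone = timezone(timedelta(hours=-4))
as_utc = to_utc(datetime(2007, 4, 1, 11, 30), local_zone)
# -> 2007-04-01 15:30:00+00:00
```

Raising an error for naive values without a known zone makes the "do not use local time without knowing the exact timezone" rule explicit at the call site.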

Babel Changelog

Version 2.10.3

This is a bugfix release for Babel 2.10.2, which was mistakenly packaged with outdated locale data.

Thanks to Michał Górny for pointing this out and Jun Omae for verifying.

This and future Babel PyPI packages will be built by a more automated process, which should make problems like this less likely to occur.

Version 2.10.2

This is a bugfix release for Babel 2.10.1.

  • Fallback count=”other” format in format_currency() (#872) - Jun Omae
  • Fix get_period_id() with dayPeriodRule across 0:00 (#871) - Jun Omae
  • Add support for b and B period symbols in time format (#869) - Jun Omae
  • chore(docs/typo): Fixes a minor typo in a function comment (#864) - Frank Harrison

Version 2.10.1

This is a bugfix release for Babel 2.10.0.

  • Messages: Fix distutils import. Regressed in #843. (#852) - Nehal J Wani
  • The wheel file is no longer marked as universal, since Babel only supports Python 3.

Version 2.10.0

Upcoming deprecation

  • The get_next_timezone_transition() function is marked deprecated in this version and will be removed likely as soon as Babel 2.11.  No replacement for this function is planned; based on discussion in #716, it’s likely the function is not used in any real code. (#852) - Aarni Koskela, Paul Ganssle

Improvements

  • CLDR: Upgrade to CLDR 41.0. (#853) - Aarni Koskela

    • The c and e plural form operands introduced in CLDR 40 are parsed, but otherwise unsupported. (#826)
    • Non-nominative forms of units are currently ignored.
  • Messages: Implement --init-missing option for pybabel update (#785) - ruro
  • Messages: For extract, you can now replace the built-in .* / _* ignored directory patterns with ones of your own. (#832) - Aarni Koskela, Kinshuk Dua
  • Messages: Add --check to verify if catalogs are up-to-date (#831) - Krzysztof Jagiełło
  • Messages: Add --header-comment to override default header comment (#720) - Mohamed Hafez Morsy, Aarni Koskela
  • Dates: parse_time now supports 12-hour clock, and is better at parsing partial times. (#834) - Aarni Koskela, David Bauer, Arthur Jovart
  • Dates: parse_date and parse_time now raise ParseError, a subclass of ValueError, in certain cases. (#834) - Aarni Koskela
  • Dates: parse_date and parse_time now accept the format parameter. (#834) - Juliette Monsel, Aarni Koskela

Infrastructure

  • The internal babel/_compat.py module is no more (#808) - Hugo van Kemenade
  • Python 3.10 is officially supported (#809) - Hugo van Kemenade
  • There’s now a friendly GitHub issue template. (#800) – Álvaro Mondéjar Rubio
  • Don’t use the deprecated format_number function internally or in tests - Aarni Koskela
  • Add GitHub URL for PyPi (#846) - Andrii Oriekhov
  • Python 3.12 compatibility: Prefer setuptools imports to distutils imports (#843) - Aarni Koskela
  • Python 3.11 compatibility: Add deprecations to l*gettext variants (#835) - Aarni Koskela
  • CI: Babel is now tested with PyPy 3.7. (#851) - Aarni Koskela

Bugfixes

  • Date formatting: Allow using other as fallback form (#827) - Aarni Koskela
  • Locales: Locale.parse() normalizes variant tags to upper case (#829) - Aarni Koskela
  • A typo in the plural format for Maltese is fixed. (#796) - Lukas Winkler
  • Messages: Catalog date parsing is now timezone independent. (#701) - rachele-collin
  • Messages: Fix duplicate locations when writing without lineno (#837) - Sigurd Ljødal
  • Messages: Fix missing trailing semicolon in plural form headers (#848) - farhan5900
  • CLI: Fix output of --list-locales to not be a bytes repr (#845) - Morgan Wahl

Documentation

  • Documentation is now correctly built again, and up to date (#830) - Aarni Koskela

Version 2.9.1

Bugfixes

  • The internal locale-data loading functions now validate the name of the locale file to be loaded and only allow files within Babel’s data directory.  Thank you to Chris Lyne of Tenable, Inc. for discovering the issue!

Version 2.9.0

Upcoming version support changes

  • This version, Babel 2.9, is the last version of Babel to support Python 2.7, Python 3.4, and Python 3.5.

Improvements

  • CLDR: Use CLDR 37 – Aarni Koskela (#734)
  • Dates: Handle ZoneInfo objects in get_timezone_location, get_timezone_name - Alessio Bogon (#741)
  • Numbers: Add group_separator feature in number formatting - Abdullah Javed Nesar (#726)

Bugfixes

  • Dates: Correct default Format().timedelta format to ‘long’ to mute deprecation warnings – Aarni Koskela
  • Import: Simplify iteration code in “import_cldr.py” – Felix Schwarz
  • Import: Stop using deprecated ElementTree methods “getchildren()” and “getiterator()” – Felix Schwarz
  • Messages: Fix unicode printing error on Python 2 without TTY. – Niklas Hambüchen
  • Messages: Introduce invariant that _invalid_pofile() takes unicode line. – Niklas Hambüchen
  • Tests: fix tests when using Python 3.9 – Felix Schwarz
  • Tests: Remove deprecated ‘sudo: false’ from Travis configuration – Jon Dufresne
  • Tests: Support Py.test 6.x – Aarni Koskela
  • Utilities: LazyProxy: Handle AttributeError in specified func – Nikiforov Konstantin (#724)
  • Utilities: Replace usage of parser.suite with ast.parse – Miro Hrončok

Documentation

  • Update parse_number comments – Brad Martin (#708)
  • Add __iter__ to Catalog documentation – @CyanNani123

Version 2.8.1

This is solely a patch release to make running tests on Py.test 6+ possible.

Bugfixes

  • Support Py.test 6 - Aarni Koskela (#747, #750, #752)

Version 2.8.0

Improvements

  • CLDR: Upgrade to CLDR 36.0 - Aarni Koskela (#679)
  • Messages: Don’t even open files with the “ignore” extraction method - @sebleblanc (#678)

Bugfixes

  • Numbers: Fix formatting very small decimals when quantization is disabled - Lev Lybin, @miluChen (#662)
  • Messages: Attempt to sort all messages – Mario Frasca (#651, #606)

Docs

  • Add years to changelog - Romuald Brunet
  • Note that installation requires pytz - Steve (Gadget) Barnes

Version 2.7.0

Possibly incompatible changes

These may be backward incompatible in some cases, as some more-or-less internal APIs have changed. Please feel free to file issues if you bump into anything strange and we’ll try to help!

  • General: Internal uses of babel.util.odict have been replaced with collections.OrderedDict from the Python standard library.

Improvements

  • CLDR: Upgrade to CLDR 35.1 - Alberto Mardegan, Aarni Koskela (#626, #643)
  • General: allow anchoring path patterns to the start of a string - Brian Cappello (#600)
  • General: Bumped version requirement on pytz - @chrisbrake (#592)
  • Messages: pybabel compile: exit with code 1 if errors were encountered - Aarni Koskela (#647)
  • Messages: Add omit-header to update_catalog - Cédric Krier (#633)
  • Messages: Catalog update: keep user comments from destination by default - Aarni Koskela (#648)
  • Messages: Skip empty message when writing mo file - Cédric Krier (#564)
  • Messages: Small fixes to avoid crashes on badly formatted .po files - Bryn Truscott (#597)
  • Numbers: parse_decimal() strict argument and suggestions - Charly C (#590)
  • Numbers: don’t repeat suggestions in parse_decimal strict - Serban Constantin (#599)
  • Numbers: implement currency formatting with long display names - Luke Plant (#585)
  • Numbers: parse_decimal(): assume spaces are equivalent to non-breaking spaces when not in strict mode - Aarni Koskela (#649)
  • Performance: Cache locale_identifiers() - Aarni Koskela (#644)

Bugfixes

  • CLDR: Skip alt=… for week data (minDays, firstDay, weekendStart, weekendEnd) - Aarni Koskela (#634)
  • Dates: Fix wrong weeknumber for 31.12.2018 - BT-sschmid (#621)
  • Locale: Avoid KeyError trying to get data on WindowsXP - mondeja (#604)
  • Locale: get_display_name(): Don’t attempt to concatenate variant information to None - Aarni Koskela (#645)
  • Messages: pofile: Add comparison operators to _NormalizedString - Aarni Koskela (#646)
  • Messages: pofile: don’t crash when message.locations can’t be sorted - Aarni Koskela (#646)

Tooling & docs

  • Docs: Remove all references to deprecated easy_install - Jon Dufresne (#610)
  • Docs: Switch print statement in docs to print function - NotAFile
  • Docs: Update all pypi.python.org URLs to pypi.org - Jon Dufresne (#587)
  • Docs: Use https URLs throughout project where available - Jon Dufresne (#588)
  • Support: Add testing and document support for Python 3.7 - Jon Dufresne (#611)
  • Support: Test on Python 3.8-dev - Aarni Koskela (#642)
  • Support: Using ABCs from collections instead of collections.abc is deprecated. - Julien Palard (#609)
  • Tests: Fix conftest.py compatibility with pytest 4.3 - Miro Hrončok (#635)
  • Tests: Update pytest and pytest-cov - Miro Hrončok (#635)

Version 2.6.0

Possibly incompatible changes

These may be backward incompatible in some cases, as some more-or-less internal APIs have changed. Please feel free to file issues if you bump into anything strange and we’ll try to help!

  • Numbers: Refactor decimal handling code and allow bypass of decimal quantization. (@kdeldycke) (PR #538)
  • Messages: allow processing files that are in locales unknown to Babel (@akx) (PR #557)
  • General: Drop support for EOL Python 2.6 and 3.3 (@hugovk) (PR #546)

Other changes

  • CLDR: Use CLDR 33 (@akx) (PR #581)
  • Lists: Add support for various list styles other than the default (@akx) (#552)
  • Messages: Add new PoFileError exception (@Bedrock02) (PR #532)
  • Times: Simplify Linux distro specific explicit timezone setting search (@scop) (PR #528)

Bugfixes

  • CLDR: avoid importing alt=narrow currency symbols (@akx) (PR #558)
  • CLDR: ignore non-Latin numbering systems (@akx) (PR #579)
  • Docs: Fix improper example for date formatting (@PTrottier) (PR #574)
  • Tooling: Fix some deprecation warnings (@akx) (PR #580)

Tooling & docs

  • Add explicit signatures to some date autofunctions (@xmo-odoo) (PR #554)
  • Include license file in the generated wheel package (@jdufresne) (PR #539)
  • Python 3.6 invalid escape sequence deprecation fixes (@scop) (PR #528)
  • Test and document all supported Python versions (@jdufresne) (PR #540)
  • Update copyright header years and authors file (@akx) (PR #559)

Version 2.5.3

This is a maintenance release that reverts undesired API-breaking changes that slipped into 2.5.2 (see #550).

It is based on v2.5.1 (f29eccd) with commits 7cedb84, 29da2d2 and edfb518 cherry-picked on top.

Version 2.5.2

Bugfixes

  • Revert the unnecessary PyInstaller fixes from 2.5.0 and 2.5.1 (#533) (@yagebu)

Version 2.5.1

Minor Improvements and bugfixes

  • Use a fixed datetime to avoid test failures (#520) (@narendravardi)
  • Parse multi-line __future__ imports better (#519) (@akx)
  • Fix validate_currency docstring (#522)
  • Allow normalize_locale and exists to handle various unexpected inputs (#523) (@suhojm)
  • Make PyInstaller support more robust (#525, #526) (@thijstriemstra, @akx)

Version 2.5.0

New Features

  • Numbers: Add currency utilities and helpers (#491) (@kdeldycke)
  • Support PyInstaller (#500, #505) (@wodo)

Minor Improvements and bugfixes

  • Dates: Add __str__ to DateTimePattern (#515) (@sfermigier)
  • Dates: Fix an invalid string to bytes comparison when parsing TZ files on Py3 (#498) (@rowillia)
  • Dates: Formatting zero-padded components of dates is faster (#517) (@akx)
  • Documentation: Fix “Good Commits” link in CONTRIBUTING.md (#511) (@naryanacharya6)
  • Documentation: Fix link to Python gettext module (#512) (@Linkid)
  • Messages: Allow both dash and underscore separated locale identifiers in pofiles (#489, #490) (@akx)
  • Messages: Extract Python messages in nested gettext calls (#488) (@sublee)
  • Messages: Fix in-place editing of dir list while iterating (#476, #492) (@MarcDufresne)
  • Messages: Stabilize sort order (#482) (@xavfernandez)
  • Time zones: Honor the no-inherit marker for metazone names (#405) (@akx)
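
The nested-call entry above (#488) can be illustrated with a small sketch. This is not Babel's extractor. It is a hypothetical helper, extract_messages, built on the standard library ast module, and the keyword names _ and gettext are assumed defaults chosen for the example. It only shows the idea that a gettext call inside the arguments of another gettext call should still be picked up.

```python
import ast


def extract_messages(source, keywords=("_", "gettext")):
    """Return sorted (line, message) pairs for keyword calls with a literal first argument.

    Hypothetical helper for illustration only; Babel's own Python
    extractor works differently and supports many more options.
    """
    found = []
    # ast.walk visits every node, so calls nested inside other calls are seen too.
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Name) and func.id in keywords):
            continue
        if (node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)


source = 'msg = _("Hello, {name}!", name=_("Foo Bar"))\n'
print(extract_messages(source))
# -> [(1, 'Foo Bar'), (1, 'Hello, {name}!')]
```

Both the outer and the inner message are reported, regardless of how deeply the inner call is nested.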

Version 2.4.0

New Features

Some of these changes might break your current code and/or tests.

  • CLDR: CLDR 29 is now used instead of CLDR 28 (#405) (@akx)
  • Messages: Add option ‘add_location’ for location line formatting (#438, #459) (@rrader, @alxpy)
  • Numbers: Allow full control of decimal behavior (#410) (@etanol)

Minor Improvements and bugfixes

  • Documentation: Improve Date Fields descriptions (#450) (@ldwoolley)
  • Documentation: Typo fixes and documentation improvements (#406, #412, #403, #440, #449, #463) (@zyegfryed, @adamchainz, @jwilk, @akx, @roramirez, @abhishekcs10)
  • Messages: Default to UTF-8 source encoding instead of ISO-8859-1 (#399) (@asottile)
  • Messages: Ensure messages are extracted in the order they were passed in (#424) (@ngrilly)
  • Messages: Message extraction for JSX files is improved (#392, #396, #425) (@karloskar, @georgschoelly)
  • Messages: PO file reading supports multi-line obsolete units (#429) (@mbirtwell)
  • Messages: Python message extractor respects unicode_literals in __future__ (#427) (@sublee)
  • Messages: Roundtrip Language headers (#420) (@kruton)
  • Messages: units before obsolete units are no longer erroneously marked obsolete (#452) (@mbirtwell)
  • Numbers: parse_pattern now preserves the full original pattern (#414) (@jtwang)
  • Numbers: Fix float conversion in extract_operands (#435) (@akx)
  • Plurals: Fix plural forms for Czech and Slovak locales (#373) (@ykshatroff)
  • Plurals: More plural form fixes based on Mozilla and CLDR references (#431) (@mshenfield)

Internal improvements

  • Local times are constructed correctly in tests (#411) (@etanol)
  • Miscellaneous small improvements (#437) (@scop)
  • Regex flags are extracted from the regex strings (#462) (@singingwolfboy)
  • The PO file reader is now a class and has seen some refactoring (#429, #452) (@mbirtwell)

Version 2.3.4

(Bugfix release, released on April 22nd 2016)

Bugfixes

  • CLDR: The lxml library is no longer used for CLDR importing, so it should not cause strange failures either. Thanks to @aronbierbaum for the bug report and @jtwang for the fix. (https://github.com/python-babel/babel/pull/393)
  • CLI: Every last single CLI usage regression should now be gone, and both distutils and stand-alone CLIs should work as they have in the past. Thanks to @paxswill and @ajaeger for bug reports. (https://github.com/python-babel/babel/pull/389)

Version 2.3.3

(Bugfix release, released on April 12th 2016)

Bugfixes

Version 2.3.2

(Bugfix release, released on April 9th 2016)

Bugfixes

  • Dates: Period (am/pm) formatting was broken in certain locales (namely zh_TW). Thanks to @jun66j5 for the bug report. (#378, #379)

Version 2.3.1

(Bugfix release because of deployment problems, released on April 8th 2016)

Version 2.3

(Feature release, released on April 8th 2016)

Internal improvements

Features

Bugfixes

Version 2.2

(Feature release, released on January 2nd 2016)

Bugfixes

  • General: Add __hash__ to Locale. (#303) (2aa8074)
  • General: Allow files with BOM if they’re UTF-8 (#189) (da87edd)
  • General: localedata directory is now locale-data (#109) (2d1882e)
  • General: odict: Fix pop method (0a9e97e)
  • General: Removed uses of datetime.date class from .dat files (#174) (94f6830)
  • Messages: Fix plural selection for Chinese (531f666)
  • Messages: Fix typo and add semicolon in plural_forms (5784501)
  • Messages: Flatten NullTranslations.files into a list (ad11101)
  • Times: FixedOffsetTimezone: fix display of negative offsets (d816803)
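
The FixedOffsetTimezone entry above does not say what caused the display problem. As a general illustration of why negative offsets are easy to get wrong, the sketch below formats an offset given in minutes. The function name format_utc_offset is hypothetical and not part of Babel. Python's divmod floors toward negative infinity, so the sign is best handled separately from the magnitude.

```python
def format_utc_offset(minutes):
    """Format a UTC offset in minutes as +HH:MM or -HH:MM (illustrative helper)."""
    sign = "-" if minutes < 0 else "+"
    # Split the absolute value; divmod(-330, 60) would give (-6, 30),
    # which is not the "5 hours 30 minutes" a reader expects to see.
    hours, mins = divmod(abs(minutes), 60)
    return f"{sign}{hours:02d}:{mins:02d}"


print(divmod(-330, 60))         # -> (-6, 30)
print(format_utc_offset(-330))  # -> -05:30
print(format_utc_offset(330))   # -> +05:30
```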

Features

  • CLDR: Update to CLDR 28 (#292) (9f7f4d0)
  • General: Add __copy__ and __deepcopy__ to LazyProxy. (a1cc3f1)
  • General: Add official support for Python 3.4 and 3.5
  • General: Improve odict performance by making key search O(1) (6822b7f)
  • Locale: Add an ordinal_form property to Locale (#270) (b3f3430)
  • Locale: Add support for list formatting (37ce4fa, be6e23d)
  • Locale: Check inheritance exceptions first (3ef0d6d)
  • Messages: Allow file locations without line numbers (#279) (79bc781)
  • Messages: Allow passing a callable to extract() (#289) (3f58516)
  • Messages: Support ‘Language’ header field of PO files (#76) (3ce842b)
  • Messages: Update catalog headers from templates (e0e7ef1)
  • Numbers: Properly load and expose currency format types (#201) (df676ab)
  • Numbers: Use cdecimal by default when available (b6169be)
  • Numbers: Use the CLDR’s suggested number of decimals for format_currency (#139) (201ed50)
  • Times: Add format_timedelta(format='narrow') support (edc5eb5)

Version 2.1

(Bugfix/minor feature release, released on September 25th 2015)

  • Parse and honour the locale inheritance exceptions (#97)
  • Fix Locale.parse using global.dat incompatible types (#174)
  • Fix display of negative offsets in FixedOffsetTimezone (#214)
  • Improved the performance of odict, which is used during the localization file build; this should improve compilation time for large projects
  • Add support for “narrow” format for format_timedelta
  • Add universal wheel support
  • Support ‘Language’ header field in .PO files (fixes #76)
  • Test suite enhancements (coverage, broken tests fixed, etc)
  • Documentation updated

Version 2.0

(Released on July 27th 2015, codename Second Coming)

  • Added support for looking up currencies that belong to a territory through the babel.numbers.get_territory_currencies() function.
  • Improved Python 3 support.
  • Fixed some broken tests for timezone behavior.
  • Improved various smaller things for dealing with dates.

Version 1.4

(bugfix release, release date to be decided)

  • Fixed a bug that caused deprecated territory codes not to be converted properly by the subtag resolving.  This showed up, for instance, when trying to use und_UK as a language code, which now properly resolves to en_GB.
  • Fixed a bug that made it impossible to import the CLDR data from scratch on Windows systems.

Version 1.3

(bugfix release, released on July 29th 2013)

  • Fixed a bug in likely-subtag resolving for some common locales. This primarily makes zh_CN work again which was broken due to how it was defined in the likely subtags combined with our broken resolving.  This fixes #37.
  • Fixed a bug that caused pybabel to break when writing to stdout on Python 3.
  • Removed a stray print that was causing issues when writing to stdout for message catalogs.

Version 1.2

(bugfix release, released on July 27th 2013)

  • Included all tests in the tarball.  Previously the include skipped past recursive folders.
  • Changed how tests are invoked and added separate standalone test command.  This simplifies testing of the package for linux distributors.

Version 1.1

(bugfix release, released on July 27th 2013)

  • added dummy version requirements for pytz so that it installs on pip 1.4.
  • Included tests in the tarball.

Version 1.0

(Released on July 26th 2013, codename Revival)

  • support python 2.6, 2.7, 3.3+ and pypy - drop all other versions
  • use tox for testing on different pythons
  • Added support for the locale plural rules defined by the CLDR.
  • Added format_timedelta function to support localized formatting of relative times with strings such as “2 days” or “1 month” (ticket #126).
  • Fixed negative offset handling of Catalog._set_mime_headers (ticket #165).
  • Fixed the case where messages containing square brackets would break with an unpack error.
  • updated to CLDR 23
  • Make the CLDR import script work with Python 2.7.
  • Fix various typos.
  • Sort output of list-locales.
  • Make the POT-Creation-Date of the catalog being updated equal to POT-Creation-Date of the template used to update (ticket #148).
  • Use a more explicit error message if no option or argument (command) is passed to pybabel (ticket #81).
  • Keep the PO-Revision-Date if it is not the default value (ticket #148).
  • Make --no-wrap work by reworking --width’s default and mimic xgettext’s behaviour of always wrapping comments (ticket #145).
  • Add --project and --version options for commandline (ticket #173).
  • Add a __ne__() method to the Locale class.
  • Explicitly sort instead of using sorted() and don’t assume ordering (Jython compatibility).
  • Removed ValueError raising for string formatting message checkers if the string does not contain any string formattings (ticket #150).
  • Fix Serbian plural forms (ticket #213).
  • Small speed improvement in format_date() (ticket #216).
  • Fix so frontend.CommandLineInterface.run does not accumulate logging handlers (ticket #227, reported with initial patch by dfraser)
  • Fix exception if environment contains an invalid locale setting (ticket #200)
  • use cPickle instead of pickle for better performance (ticket #225)
  • Only use the banker’s rounding algorithm as a tie breaker if there are two nearest numbers; round as usual if there is only one nearest number (ticket #267, patch by Martin)
  • Allow disabling cache behaviour in LazyProxy (ticket #208, initial patch from Pedro Algarvio)
  • Support for context-aware methods during message extraction (ticket #229, patch from David Rios)
  • “init” and “update” commands support “--no-wrap” option (ticket #289)
  • fix formatting of fraction in format_decimal() if the input value is a float with more than 7 significant digits (ticket #183)
  • fix format_date() with datetime parameter (ticket #282, patch from Xavier Morel)
  • fix format_decimal() with small Decimal values (ticket #214, patch from George Lund)
  • fix handling of messages containing ‘\n’ (ticket #198)
  • handle irregular multi-line msgstr (no “” as first line) gracefully (ticket #171)
  • parse_decimal() now returns Decimals not floats, API change (ticket #178)
  • no warnings when running setup.py without installed setuptools (ticket #262)
  • modified Locale.__eq__ method so Locales are only equal if all of their attributes (language, territory, script, variant) are equal
  • resort to hard-coded message extractors/checkers if pkg_resources is installed but no egg-info was found (ticket #230)
  • format_time() and format_datetime() now also accept floats (ticket #242)
  • add babel.support.NullTranslations class similar to gettext.NullTranslations but with all of Babel’s new gettext methods (ticket #277)
  • “init” and “update” commands support “--width” option (ticket #284)
  • fix ‘input_dirs’ option for setuptools integration (ticket #232, initial patch by Étienne Bersac)
  • ensure .mo file header contains the same information as the source .po file (ticket #199)
  • added support for get_language_name() on the locale objects.
  • added support for get_territory_name() on the locale objects.
  • added support for get_script_name() on the locale objects.
  • added pluralization support for currency names and added a ‘¤¤¤’ pattern for currencies that includes the full name.
  • depend on pytz now and wrap it nicer.  This gives us improved support for things like timezone transitions and an overall nicer API.
  • Added support for explicit charset to PO file reading.
  • Added experimental Python 3 support.
  • Added better support for returning timezone names.
  • Don’t throw away a Catalog’s obsolete messages when updating it.
  • Added basic likelySubtag resolving when doing locale parsing and no match can be found.
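
The parse_decimal() entry above (ticket #178) is an API change because Decimal values keep the digits the user typed, while floats store a binary approximation. The sketch below is a simplified stand-in, not Babel's parser. The function parse_simple_decimal and its group_symbol and decimal_symbol parameters are hypothetical, and the symbols are passed in by hand instead of being looked up from locale data.

```python
from decimal import Decimal


def parse_simple_decimal(text, group_symbol=",", decimal_symbol="."):
    """Parse a grouped decimal string into a Decimal (illustrative helper)."""
    # Drop grouping symbols first, then normalise the decimal symbol to ".".
    cleaned = text.replace(group_symbol, "").replace(decimal_symbol, ".")
    return Decimal(cleaned)


print(parse_simple_decimal("1,234.56"))            # -> 1234.56
print(parse_simple_decimal("1.234,56", ".", ","))  # -> 1234.56

# Decimal arithmetic stays exact where binary floats do not.
print(float("0.1") + float("0.2") == 0.3)                                             # -> False
print(parse_simple_decimal("0.1") + parse_simple_decimal("0.2") == Decimal("0.3"))    # -> True
```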

Version 0.9.6

(released on March 17th 2011)

  • Backport r493-494: documentation typo fixes.
  • Make the CLDR import script work with Python 2.7.
  • Fix various typos.
  • Fixed Python 2.3 compatibility (ticket #146, ticket #233).
  • Sort output of list-locales.
  • Make the POT-Creation-Date of the catalog being updated equal to POT-Creation-Date of the template used to update (ticket #148).
  • Use a more explicit error message if no option or argument (command) is passed to pybabel (ticket #81).
  • Keep the PO-Revision-Date if it is not the default value (ticket #148).
  • Make --no-wrap work by reworking --width’s default and mimic xgettext’s behaviour of always wrapping comments (ticket #145).
  • Fixed negative offset handling of Catalog._set_mime_headers (ticket #165).
  • Add --project and --version options for commandline (ticket #173).
  • Add a __ne__() method to the Locale class.
  • Explicitly sort instead of using sorted() and don’t assume ordering (Python 2.3 and Jython compatibility).
  • Removed ValueError raising for string formatting message checkers if the string does not contain any string formattings (ticket #150).
  • Fix Serbian plural forms (ticket #213).
  • Small speed improvement in format_date() (ticket #216).
  • Fix number formatting for locales where CLDR specifies alt or draft items (ticket #217)
  • Fix bad check in format_time (ticket #257, reported with patch and tests by jomae)
  • Fix so frontend.CommandLineInterface.run does not accumulate logging handlers (ticket #227, reported with initial patch by dfraser)
  • Fix exception if environment contains an invalid locale setting (ticket #200)

Version 0.9.5

(released on April 6th 2010)

  • Fixed the case where messages containing square brackets would break with an unpack error.
  • Backport of r467: Fuzzy matching regarding plurals should NOT be checked against len(message.id) because this is always 2; instead, it should be checked against catalog.num_plurals (ticket #212).
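
The plural check described in the r467 backport can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in Babel's catalog module. The function needs_fuzzy_flag is hypothetical, messages are modelled as plain tuples, and the sample translations assume a target language with three plural forms.

```python
def needs_fuzzy_flag(msgid, msgstr, num_plurals):
    """Return True when a plural message lacks the right number of translations."""
    if not isinstance(msgid, (list, tuple)):
        return False  # Singular message: there is no plural count to compare.
    if not isinstance(msgstr, (list, tuple)):
        return True   # Plural id, but only a single translation string.
    # len(msgid) is always 2 (singular, plural), so the comparison has to use
    # the number of plural forms of the target language instead.
    return len(msgstr) != num_plurals


msgid = ("%d file", "%d files")
two_forms = ("%d soubor", "%d soubory")
three_forms = ("%d soubor", "%d soubory", "%d souborů")

print(len(two_forms) == len(msgid))                         # -> True, the misleading check
print(needs_fuzzy_flag(msgid, two_forms, num_plurals=3))    # -> True
print(needs_fuzzy_flag(msgid, three_forms, num_plurals=3))  # -> False
```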

Version 0.9.4

(released on August 25th 2008)

  • Currency symbol definitions that are defined with choice patterns in the CLDR data are no longer imported, so the symbol code will be used instead.
  • Fixed quarter support in date formatting.
  • Fixed a serious memory leak that was introduced by the support for CLDR aliases in 0.9.3 (ticket #128).
  • Locale modifiers such as “@euro” are now stripped from locale identifiers when parsing (ticket #136).
  • The system locales “C” and “POSIX” are now treated as aliases for “en_US_POSIX”, for which the CLDR provides the appropriate data. Thanks to Manlio Perillo for the suggestion.
  • Fixed JavaScript extraction for regular expression literals (ticket #138) and concatenated strings.
  • The Translation class in babel.support can now manage catalogs with different message domains, and exposes the family of d*gettext functions (ticket #137).
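
The two locale-identifier entries above (modifier stripping and the “C”/“POSIX” aliases) can be pictured with a short sketch. The helper normalize_identifier and the alias table are hypothetical and only mirror what the changelog states; Babel's real parsing handles many more cases.

```python
# Aliases named in the changelog entry; the table itself is illustrative.
SYSTEM_ALIASES = {"C": "en_US_POSIX", "POSIX": "en_US_POSIX"}


def normalize_identifier(identifier):
    """Strip an "@modifier" suffix and map C/POSIX to en_US_POSIX (illustrative helper)."""
    # Everything after the first "@" is a modifier such as "@euro".
    base = identifier.split("@", 1)[0]
    return SYSTEM_ALIASES.get(base, base)


print(normalize_identifier("de_DE@euro"))  # -> de_DE
print(normalize_identifier("C"))           # -> en_US_POSIX
print(normalize_identifier("POSIX"))       # -> en_US_POSIX
```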

Version 0.9.3

(released on July 9th 2008)

  • Fixed invalid message extraction methods causing an UnboundLocalError.
  • Extraction method specification can now use a dot instead of the colon to separate module and function name (ticket #105).
  • Fixed message catalog compilation for locales with more than two plural forms (ticket #95).
  • Fixed compilation of message catalogs for locales with more than two plural forms where the translations were empty (ticket #97).
  • The stripping of the comment tags in comments is optional now and is done for each line in a comment.
  • Added a JavaScript message extractor.
  • Updated to CLDR 1.6.
  • Fixed timezone calculations when formatting datetime and time values.
  • Added a get_plural function into the plurals module that returns the correct plural forms for a locale as tuple.
  • Added support for alias definitions in the CLDR data files, meaning that the chance for items missing in certain locales should be greatly reduced (ticket #68).

Version 0.9.2

(released on February 4th 2008)

  • Fixed catalogs’ charset values not being recognized (ticket #66).
  • Numerous improvements to the default plural forms.
  • Fixed fuzzy matching when updating message catalogs (ticket #82).
  • Fixed bug in catalog updating, that in some cases pulled in translations from different catalogs based on the same template.
  • Location lines in PO files are no longer wrapped at hyphens in file names (ticket #79).
  • Fixed division by zero error in catalog compilation on empty catalogs (ticket #60).

Version 0.9.1

(released on September 7th 2007)

  • Fixed catalog updating when a message is merged that was previously simple but now has a plural form, for example by moving from gettext to ngettext, or vice versa.
  • Fixed time formatting for 12 am and 12 pm.
  • Fixed output encoding of the pybabel --list-locales command.
  • MO files are now written in binary mode on Windows (ticket #61).

Version 0.9

(released on August 20th 2007)

  • The new_catalog distutils command has been renamed to init_catalog for consistency with the command-line frontend.
  • Added compilation of message catalogs to MO files (ticket #21).
  • Added updating of message catalogs from POT files (ticket #22).
  • Support for significant digits in number formatting.
  • Apply proper “banker’s rounding” in number formatting in a cross-platform manner.
  • The number formatting functions now also work with numbers represented by Python Decimal objects (ticket #53).
  • Added extensible infrastructure for validating translation catalogs.
  • Fixed the extractor not filtering out messages that didn’t validate against the keyword’s specification (ticket #39).
  • Fixed the extractor raising an exception when encountering an empty string msgid. It now emits a warning to stderr.
  • Numerous Python message extractor fixes: it now handles nested function calls within a gettext function call correctly, uses the correct line number for multi-line function calls, and other small fixes (tickets #38 and #39).
  • Improved support for detecting Python string formatting fields in message strings (ticket #57).
  • CLDR upgraded to the 1.5 release.
  • Improved timezone formatting.
  • Implemented scientific number formatting.
  • Added mechanism to look up locales by alias, for cases where browsers insist on including only the language code in the Accept-Language header, and sometimes even the incorrect language code.
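
The banker's rounding entry above refers to rounding ties to the nearest even digit. The sketch below shows that rule with the standard library decimal module. It is not Babel's implementation, and the helper round_half_even is a hypothetical name. Working on Decimal values built from strings avoids the binary float representation, which is one way to get the same result on every platform.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(text, places=0):
    """Round a decimal string to `places` digits, sending ties to the even digit."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal("0.01")
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_EVEN)


print(round_half_even("1.5"))       # -> 2
print(round_half_even("2.5"))       # -> 2  (tie, goes to the even neighbour)
print(round_half_even("2.51"))      # -> 3  (not a tie, rounds as usual)
print(round_half_even("2.675", 2))  # -> 2.68
print(round_half_even("2.665", 2))  # -> 2.66
```

Only exact ties are affected; any value with a single nearest neighbour rounds the usual way.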

Version 0.8.1

(released on July 2nd 2007)

  • default_locale() would fail when the value of the LANGUAGE environment variable contained multiple language codes separated by colons, as is explicitly allowed by the GNU gettext tools. As the default_locale() function is called at the module level in some modules, this bug would completely break importing these modules on systems where LANGUAGE is set that way.
  • The character set specified in PO template files is now respected when creating new catalog files based on that template. This allows the use of characters outside the ASCII range in POT files (ticket #17).
  • The default ordering of messages in generated POT files, which is based on the order those messages are found when walking the source tree, is no longer subject to differences between platforms; directory and file names are now always sorted alphabetically.
  • The Python message extractor now respects the special encoding comment to be able to handle files containing non-ASCII characters (ticket #23).
  • Added N_ (gettext noop) to the extractor’s default keywords.
  • Made locale string parsing more robust, and also take the script part into account (ticket #27).
  • Added a function to list all locales for which locale data is available.
  • Added a command-line option to the pybabel command which prints out all available locales (ticket #24).
  • The name of the command-line script has been changed from just babel to pybabel to avoid a conflict with the OpenBabel project (ticket #34).
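
The LANGUAGE entry above concerns values such as de_AT:de:en, where several language codes are listed in priority order. The sketch below only illustrates that format. The helper languages_from_env is hypothetical, takes the environment as a plain mapping, and does not reproduce the other variables or fallbacks that default_locale() considers.

```python
def languages_from_env(environ):
    """Return the colon-separated codes from LANGUAGE in priority order (illustrative helper)."""
    value = environ.get("LANGUAGE", "")
    # Skip empty segments so values like "de::en" or "" do not yield blank codes.
    return [code for code in value.split(":") if code]


print(languages_from_env({"LANGUAGE": "de_AT:de:en"}))  # -> ['de_AT', 'de', 'en']
print(languages_from_env({}))                           # -> []
```

A caller would typically try the codes in order and use the first one for which locale data is available.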

Version 0.8

(released on June 20th 2007)

  • First public release

License

Babel is licensed under a three-clause BSD License.  It basically means: do whatever you want with it as long as the copyright in Babel sticks around, the conditions are not modified and the disclaimer is present. Furthermore, you must not use the names of the authors to promote derivatives of the software without written consent.

The full license text can be found below (Babel License).

Authors

Babel is written and maintained by the Babel team and various contributors:

  • Aarni Koskela
  • Christopher Lenz
  • Armin Ronacher
  • Alex Morega
  • Lasse Schuirmann
  • Felix Schwarz
  • Pedro Algarvio
  • Jeroen Ruigrok van der Werven
  • Philip Jenvey
  • benselme
  • Isaac Jurado
  • Tobias Bieniek
  • Erick Wilder
  • Michael Birtwell
  • Jonas Borgström
  • Kevin Deldycke
  • Jon Dufresne
  • Ville Skyttä
  • Jun Omae
  • Hugo
  • Heungsub Lee
  • Jakob Schnitzer
  • Sachin Paliwal
  • Alex Willmer
  • Daniel Neuhäuser
  • Hugo van Kemenade
  • Miro Hrončok
  • Cédric Krier
  • Luke Plant
  • Jennifer Wang
  • Lukas Balaga
  • sudheesh001
  • Niklas Hambüchen
  • Changaco
  • Xavier Fernandez
  • KO. Mattsson
  • Sébastien Diemer
  • alexbodn@gmail.com
  • saurabhiiit
  • srisankethu
  • Erik Romijn
  • Lukas B
  • Ryan J Ollos
  • Arturas Moskvinas
  • Leonardo Pistone
  • Hyunjun Kim
  • Frank Harrison
  • Nehal J Wani
  • Mohamed Morsy
  • Krzysztof Jagiełło
  • Morgan Wahl
  • farhan5900
  • Sigurd Ljødal
  • Andrii Oriekhov
  • rachele-collin
  • Lukas Winkler
  • Juliette Monsel
  • Álvaro Mondéjar Rubio
  • ruro
  • Alessio Bogon
  • Nikiforov Konstantin
  • Abdullah Javed Nesar
  • Brad Martin
  • Tyler Kennedy
  • CyanNani123
  • sebleblanc
  • He Chen
  • Steve (Gadget) Barnes
  • Romuald Brunet
  • Mario Frasca
  • BT-sschmid
  • Alberto Mardegan
  • mondeja
  • NotAFile
  • Julien Palard
  • Brian Cappello
  • Serban Constantin
  • Bryn Truscott
  • Chris
  • Charly C
  • PTrottier
  • xmo-odoo
  • StevenJ
  • Jungmo Ku
  • Simeon Visser
  • Narendra Vardi
  • Stefane Fermigier
  • Narayan Acharya
  • François Magimel
  • Wolfgang Doll
  • Roy Williams
  • Marc-André Dufresne
  • Abhishek Tiwari
  • David Baumgold
  • Alex Kuzmenko
  • Georg Schölly
  • ldwoolley
  • Rodrigo Ramírez Norambuena
  • Jakub Wilk
  • Roman Rader
  • Max Shenfield
  • Nicolas Grilly
  • Kenny Root
  • Adam Chainz
  • Sébastien Fievet
  • Anthony Sottile
  • Yuriy Shatrov
  • iamshubh22
  • Sven Anderson
  • Eoin Nugent
  • Roman Imankulov
  • David Stanek
  • Roy Wellington Ⅳ
  • Florian Schulze
  • Todd M. Guerra
  • Joseph Breihan
  • Craig Loftus
  • The Gitter Badger
  • Régis Behmo
  • Julen Ruiz Aizpuru
  • astaric
  • Felix Yan
  • Philip_Tzou
  • Jesús Espino
  • Jeremy Weinstein
  • James Page
  • masklinn
  • Sjoerd Langkemper
  • Matt Iversen
  • Alexander A. Dyshev
  • Dirkjan Ochtman
  • Nick Retallack
  • Thomas Waldmann
  • xen

Babel was previously developed under the Copyright of Edgewall Software.  The following copyright notice holds true for releases before 2013: “Copyright (c) 2007 - 2011 by Edgewall Software”

In addition to the regular contributions Babel includes a fork of Lennart Regebro’s tzlocal that originally was licensed under the CC0 license.  The original copyright of that project is “Copyright 2013 by Lennart Regebro”.

General License Definitions

The following section contains the full license texts for Babel and the documentation.

  • “Authors” hereby refers to all the authors listed in the Authors section.
  • The “Babel License” applies to all the source code shipped as part of Babel (Babel itself as well as the examples and the unit tests) as well as documentation.

Babel License

Copyright (c) 2013-2022 by the Babel Team, see Authors for more information.

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  3. The name of the author may not be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Author

The Babel Team

Info

Jul 20, 2022 2.10 Babel