gff2xml - Man Page

BioWare GFF to XML converter


Synopsis

gff2xml [options] input_file [output_file]


Description

gff2xml converts BioWare's GFF files (versions V3.2/V3.3 and V4.0/V4.1) into human-readable XML. GFF files are hierarchical data files, similar to XML in concept, but stored in binary. As such, they are used as the basis for many of the file formats found in the BioWare games. For example, a UTC file is a GFF holding a template for a creature, while a GUI file is a GFF describing an in-game menu.

Both version 3 of the format (V3.2/V3.3) and version 4 (V4.0/V4.1) are supported. While they are similar, version 4 carries several changes to make the files more efficient to read in-game. This includes replacing the string field names (which map to XML tags) with numerical identifiers, resulting in converted XML files that are stripped of their meaning. To compensate, this tool adds readable aliases to many of these numerical identifiers, giving them back their meaning. Unfortunately, not all of them are known. Most notably, the identifiers introduced in Sonic Chronicles: The Dark Brotherhood and Dragon Age II are still missing.

The changes in the minor versions (V3.2 vs. V3.3 and V4.0 vs. V4.1) are less significant. V3.3 simply changes which languages are supported, and V4.1 adds a common string table at the start of the file. Both of these additions are handled transparently.

LocStrings found in GFF V3.2 and V3.3 contain localized string data, which, depending on the game and the language, can be encoded in various ways. There is no way to autodetect the specific encoding. gff2xml employs a simple heuristic to combat this, but it may fail for certain strings and files. However, there are options to explicitly specify the game this GFF file is from. gff2xml will then use the correct game-specific encoding tables.

Unfortunately, even these tables might not be completely correct in all cases. Neverwinter Nights, for example, treated many strings as being encoded in the native encoding used for the language of the game installation. This led to many people putting non-English strings into fields tagged as language ID 0, nominally reserved for English. To read these files correctly, gff2xml provides an --encoding parameter to override the encoding used for a specific language ID.
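To illustrate why the encoding cannot be detected from the data alone, the following Python sketch decodes the same bytes with two single-byte Windows codepages. The sample word is made up for this illustration and is not taken from any game file, and the sketch does not use gff2xml itself. Both decodings succeed without an error, so only outside knowledge (the game, or an --encoding override) can tell which result is the intended one.

```python
# Hypothetical LocString payload: a Polish word stored as Windows CP-1250,
# but sitting in a field tagged as language ID 0 (nominally English).
raw = "żółw".encode("cp1250")

as_cp1250 = raw.decode("cp1250")  # the encoding the author actually used
as_cp1252 = raw.decode("cp1252")  # what the language ID alone would suggest

# Neither call raises; the bytes are valid in both codepages.
print(as_cp1250)
print(as_cp1252)
```

In this situation, an override such as --encoding 0=cp1250 is what selects the first interpretation.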



Options

-h, --help

Show a help text and exit.

--version

Show version information and exit.


--cp1252

Read GFF4 strings as Windows CP-1252. Usually, strings in version 4 of the GFF format are encoded in little-endian UTF-16, but some files store them as Windows CP-1252 instead. Since there is no clean way to autodetect which encoding is in use, this switch manually selects Windows CP-1252. This option only concerns strings embedded in GFF4 files, not GFF3 LocStrings.


--nwnpremium

The GFF files found in the encrypted HAK files of Neverwinter Nights premium modules are deliberately broken. This option tells gff2xml to work around the brokenness.


--nwn

Read LocStrings in an encoding appropriate for Neverwinter Nights.

--nwn2

Read LocStrings in an encoding appropriate for Neverwinter Nights 2.

--kotor

Read LocStrings in an encoding appropriate for Knights of the Old Republic.

--kotor2

Read LocStrings in an encoding appropriate for Knights of the Old Republic II.

--jade

Read LocStrings in an encoding appropriate for Jade Empire.

--witcher

Read LocStrings in an encoding appropriate for The Witcher.

--dragonage

Read LocStrings in an encoding appropriate for Dragon Age: Origins.

--dragonage2

Read LocStrings in an encoding appropriate for Dragon Age II.

--encoding str

Override an encoding. The string has to be of the form n=encoding, for example 0=cp-1252 to override the encoding of the (ungendered) language ID 0 to be Windows codepage 1252. To override several encodings, specify the --encoding parameter multiple times.


input_file

The GFF file to convert.


output_file

The file the XML is written to. If no output file is specified, the XML data is written to stdout. The encoding of the XML stream is always UTF-8.
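Because the XML goes to stdout as UTF-8 when no output file is given, the result can be consumed directly by another program. The following Python sketch shows one way to do that. It assumes that gff2xml is on the PATH and that it exits with a non-zero status on failure; the function names are hypothetical and not part of xoreos-tools.

```python
import subprocess
import xml.etree.ElementTree as ET


def parse_xml_output(data: bytes) -> ET.Element:
    """Parse UTF-8 encoded XML bytes and return the root element."""
    return ET.fromstring(data)


def gff_to_tree(input_file: str) -> ET.Element:
    """Run gff2xml without an output file and parse what it writes to stdout."""
    result = subprocess.run(
        ["gff2xml", input_file],
        check=True,              # assumption: failure is signalled by exit status
        stdout=subprocess.PIPE,  # capture the raw UTF-8 bytes
    )
    return parse_xml_output(result.stdout)
```

Passing the raw bytes to the parser, rather than decoding them first, lets the XML parser honor the encoding declared in the stream.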


Examples

Convert the GFF file1.utc into the XML file file2.xml:

$ gff2xml file1.utc file2.xml

Convert the GFF file1.utc into an XML file on stdout:

$ gff2xml file1.utc

Convert the GFF file1.utc, which uses Windows CP-1252 strings:

$ gff2xml --cp1252 file1.utc file2.xml

Convert the GFF file1.utc, which encodes language ID 0 in LocStrings as Windows CP-1250:

$ gff2xml --encoding 0=cp1250 file1.utc file2.xml
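When gff2xml is called from a script, the command lines above can be assembled programmatically. The following Python sketch builds the argument list from the options shown in these examples, repeating --encoding once per override. The function name and its parameters are hypothetical, and the language ID and encoding values are only placeholders.

```python
from typing import Dict, List, Optional


def build_command(
    input_file: str,
    output_file: Optional[str] = None,
    cp1252: bool = False,
    encodings: Optional[Dict[int, str]] = None,
) -> List[str]:
    """Assemble a gff2xml argument list (nothing is executed here)."""
    cmd = ["gff2xml"]
    if cp1252:
        cmd.append("--cp1252")
    # One --encoding parameter per language ID, each in the form n=encoding.
    for lang_id, encoding in sorted((encodings or {}).items()):
        cmd += ["--encoding", f"{lang_id}={encoding}"]
    cmd.append(input_file)
    if output_file is not None:
        cmd.append(output_file)  # omitted: the XML goes to stdout
    return cmd


print(build_command("file1.utc", "file2.xml", encodings={0: "cp1250"}))
```

The resulting list can be handed to subprocess.run() as is, which avoids shell quoting issues with unusual file names.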

See Also

convert2da(1), fixpremiumgff(1), tlk2xml(1), ssf2xml(1)

More information about the xoreos project can be found on its website.


Authors

This program is part of the xoreos-tools package, which in turn is part of the xoreos project, and was written by the xoreos team. Please see the AUTHORS file for details.


September 1, 2016