gxter, a GXT file utility

I have recently developed a small library and command-line utility in Rust that opens and creates GXT files from older Grand Theft Auto games. GXT files (description of format) are binary-based lists of localizable text strings that are used both by the game’s executable and all the game/mission scripts. Since the games are meant to be released in different languages, using a separate format for storing strings makes sense.

The program supports, and was tested on, files from GTA III, Vice City and San Andreas, and should also work with the “Stories” games, which are based on Vice City’s engine.

Packaging and installation

Right now, the best way to get the library or the program is to install it via cargo, Rust’s package manager. To install the library into your own Rust project, run the following command in the project’s folder:

cargo add gxter

And to install the command-line utility as an executable:

cargo install gxter-cli

gxter-cli usage

The program has two main modes of operation. The default mode is to “compile” a GXT file from a text file listing all the strings. If the -d command-line parameter is used, it switches to the “decompile” mode, in which a GXT file is read and a text file is printed or written as a result. The text file is based on the TOML format and looks somewhat like this:

format = "Three"

[main_table]
1000 = "YOU ARE DEAD"
1001 = "YOU ARE DEAD"
1002 = "YOU ARE DEAD"
1003 = "YOU ARE DEAD"
1004 = "YOU ARE DEAD"
1005 = "BUSTED"
1006 = "BUSTED"
...

The first line indicates which format the GXT file uses. The loading and saving procedure differs depending on the format, and some limits change. Four format strings are supported: “Three”, “Vice”, “San8” and “San16”.

After that follows the [main_table] section, which contains strings listed by their name and value. Each string’s name in the “Three” and “Vice” formats is a string of up to 8 bytes, but in practice, names are never longer than 7 bytes. (Whether 8-byte string names are actually supported by the games remains to be tested, but they are possible in the file format.)

If the GXT file has a “Vice”, “San8” or “San16” format, it may also contain auxiliary tables, which have separate sections, listed as [aux_tables.NAME]. NAME, in this case, is the table’s actual name, and can also be up to 8 bytes long (but in practice, at most 7).

If the GXT file has a “San8” or “San16” format, then instead of names, each string is associated with a CRC32 checksum. These hashes will be rendered as strings starting with a hash sign and the checksum in hexadecimal format (e.g. "#01234567"). If a string ever needs to have an actual name that starts with a hash, that hash should be duplicated (e.g. ##NAME).
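As a rough illustration, the key convention described above could be sketched in Rust as follows. The Key type and the render_key and parse_key functions are made up for this example and are not gxter’s actual API. The sketch also assumes that a checksum is always written as exactly eight hexadecimal digits and that only the leading hash sign of a name is doubled.

```rust
/// A string key as it might appear in the text file: either a plain name
/// or a CRC32 checksum. Hypothetical type, not part of gxter's API.
#[derive(Debug, PartialEq, Eq)]
enum Key {
    Name(String),
    Hash(u32),
}

/// Render a key for the text file.
/// Checksums become "#XXXXXXXX"; a name that itself starts with '#'
/// gets that hash sign doubled so it cannot be mistaken for a checksum.
fn render_key(key: &Key) -> String {
    match key {
        Key::Hash(h) => format!("#{:08X}", h),
        Key::Name(n) if n.starts_with('#') => format!("#{}", n),
        Key::Name(n) => n.clone(),
    }
}

/// Parse a key read back from the text file (inverse of `render_key`).
fn parse_key(text: &str) -> Result<Key, String> {
    // "##NAME" is an escaped name that really starts with a single '#'.
    if let Some(rest) = text.strip_prefix("##") {
        return Ok(Key::Name(format!("#{}", rest)));
    }
    // "#01234567" is a checksum (assumed: exactly eight hex digits).
    if let Some(hex) = text.strip_prefix('#') {
        if hex.len() != 8 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
            return Err(format!("invalid checksum key: {}", text));
        }
        return u32::from_str_radix(hex, 16)
            .map(Key::Hash)
            .map_err(|e| e.to_string());
    }
    // Anything else is an ordinary name.
    Ok(Key::Name(text.to_string()))
}
```

With this convention, rendering a key and parsing it back yields the original key, whether it is a name or a checksum.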

By default, strings from a GXT file are listed in their “key” ordering, which is a simple ASCIIbetical sort of their names or hashes. However, it is also possible to use an “offset” ordering, which provides interesting results. It seems like the offsets correspond roughly to when the strings were first added into the game, or otherwise have a category-based ordering. For example, the last strings in the GTA III file all concern features added in the game’s PC port. Here is what the first few lines look like when ordered by offset:

format = "Three"

[main_table]
LETTER1 = """abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ 0123456789"$,.'-?!!SDBF"""
DEFNAM = "Claude----------------------"
IN_VEH = "~g~Hey! Get back in the vehicle!"
IN_VEH2 = "~g~You need some wheels for this job!"
IN_BOAT = "~g~You need a boat for this job!"
HEY = "~g~Don't go solo, keep your posse together!"
HEY2 = "~g~Don't split up, keep the group together!"
HEY3 = "~g~You've dropped your main man, go back and get 8-Ball!"

(It is worth noting that the DEFNAM string is not used by the final game. GTA 1 and 2 both allowed the player to name the protagonist, whereas GTA III did not. The fact that GTA III’s main character is named Claude was only officially revealed in GTA: San Andreas.)
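The difference between the two orderings can be sketched in a few lines of Rust. The Entry struct and its field names are assumptions made for this example and do not reflect how gxter stores entries internally; the offsets in any test data would likewise be invented.

```rust
/// One string entry with the offset at which its text is stored in the
/// GXT file. Hypothetical layout, for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Entry {
    key: String,
    offset: u32,
    value: String,
}

/// "Key" ordering: a plain byte-wise (ASCIIbetical) sort of the keys.
/// Rust compares `String`s byte by byte, so no custom comparator is needed.
fn sort_by_key(entries: &mut [Entry]) {
    entries.sort_by(|a, b| a.key.cmp(&b.key));
}

/// "Offset" ordering: sort by where each string's data sits in the file.
fn sort_by_offset(entries: &mut [Entry]) {
    entries.sort_by_key(|e| e.offset);
}
```

Sorting by key puts DEFNAM before IN_VEH and LETTER1, while sorting by offset restores the order in which the strings are laid out in the file.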

gxter has built-in character tables for the EFIGS (English/French/Italian/German/Spanish) versions of all the supported games, but also supports using a custom character table for any other translations. The source code bundles two tables for popular bootleg Russian translations of GTA: Vice City and GTA: San Andreas. These reassign certain characters to Cyrillic letters instead of Latin ones, so that strings can be written in a language the game did not originally support. Anyone wishing to translate a GTA game into another language would likewise need to write a custom character table (in addition to changing the fonts). Here is an example of a Russian string being read from a bootleg version of GTA: San Andreas:

"#C0EBC586" = " ~z~ОНА - СО МНОЙ, УГЛЕПЛАСТИК. ТАК ОХЛАДИТЕ ТРАХАНИЕ. Я РАССМАТРИВАЮ ЕЕ ПОЛЬЗУ."

To make working on “San” format files easier, the program can also accept a “name table” file, which contains a list of known string names. When a GXT file is loaded with such a table, the checksums of the names in it are pre-calculated, and any checksum in the GXT file that matches one of them is replaced by the corresponding name.
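The idea behind the name table can be sketched as a reverse lookup from checksum to name. Everything here is an assumption made for illustration: the function names are invented, and the checksum is the common CRC-32 (IEEE) variant, which may not be the exact variant or name normalisation that the games and gxter use.

```rust
use std::collections::HashMap;

/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320).
/// Assumption: used here only as a stand-in; the exact CRC32 variant
/// applied to string names by the games is not specified in this post.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones if the lowest bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Pre-calculate the checksum of every known name (hypothetical helper).
fn build_lookup(names: &[&str]) -> HashMap<u32, String> {
    names
        .iter()
        .map(|n| (crc32(n.as_bytes()), n.to_string()))
        .collect()
}

/// Replace a checksum with its name when known; otherwise fall back to
/// the "#XXXXXXXX" rendering used in the text file.
fn resolve(hash: u32, lookup: &HashMap<u32, String>) -> String {
    match lookup.get(&hash) {
        Some(name) => name.clone(),
        None => format!("#{:08X}", hash),
    }
}
```

Because a checksum cannot be reversed directly, the table only helps for names that are already known; everything else stays in its hexadecimal form.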

gxter usage

The library’s documentation is available on docs.rs. The most important structure is GXTFile, as it stores the actual strings. It supports loading data from actual GXT files, saving into those files, as well as reading and writing TOML-based text files, as described above. The data in the GXTFile structure, once read, is stored in an easy-to-access format, with the main table and each auxiliary table being a string-indexed [IndexMap](https://docs.rs/indexmap) (a variant of HashMap that preserves the objects’ order).

Links

Source code and detailed instructions