gxter, a GXT file utility
I have recently developed a small library and command-line utility in Rust that reads and writes GXT files from the older Grand Theft Auto games. GXT files (description of format) are binary lists of localizable text strings used by both the game’s executable and all the game/mission scripts. Since the games were meant to be released in several languages, keeping the strings in a separate format makes sense.
The program supports, and was tested on, files from GTA III, Vice City and San Andreas, and should also work with the “Stories” games, which are based on Vice City’s engine.
Packaging and installation
Right now, the best way to get the library or the program is through
cargo, Rust’s package manager. To add the library as a dependency of your own
Rust project, run the following command in the project’s folder:
cargo add gxter
And to install the command-line utility as an executable:
cargo install gxter-cli
gxter-cli usage
The program has two main modes of operation. The default mode is to “compile” a
GXT file from a text file listing all the strings. If the -d command-line
parameter is used, it switches to the “decompile” mode, in which a GXT file is
read and a text file is printed or written as a result. The text file is based
on the TOML format and looks somewhat like this:
format = "Three"
[main_table]
1000 = "YOU ARE DEAD"
1001 = "YOU ARE DEAD"
1002 = "YOU ARE DEAD"
1003 = "YOU ARE DEAD"
1004 = "YOU ARE DEAD"
1005 = "BUSTED"
1006 = "BUSTED"
...
The first line indicates which format the GXT file uses. The loading and saving procedures differ between formats, as do some limits. Four format strings are supported:
- “Three”: GTA III (and GTA VC on Xbox)
- “Vice”: GTA: Vice City, LCS, VCS
- “San8”: GTA: San Andreas
- “San16”: (untested, but may work with GTA IV)
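As a small illustration of how the format line could be interpreted, here is a sketch that maps the four strings above to an enum. The `Format` type and its variant names are made up for this example and are not taken from gxter's actual API; treating the match as case-sensitive is also an assumption.

```rust
use std::str::FromStr;

/// The four format strings listed above.
/// Hypothetical type: gxter's real API may look different.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Three,
    Vice,
    San8,
    San16,
}

impl FromStr for Format {
    type Err = String;

    /// Parses the value of the `format = "..."` line.
    /// Assumption: the comparison is case-sensitive.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Three" => Ok(Format::Three),
            "Vice" => Ok(Format::Vice),
            "San8" => Ok(Format::San8),
            "San16" => Ok(Format::San16),
            other => Err(format!("unknown GXT format: {:?}", other)),
        }
    }
}
```

With this in place, `"Vice".parse::<Format>()` yields `Ok(Format::Vice)`, while an unlisted string produces an error instead of silently picking a default.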
After that follows the [main_table] section, which lists strings by their name
and value. In the “Three” and “Vice” formats, each string’s name can be up to
8 bytes long, but in practice, names are never longer than 7 bytes. (Whether
8-byte names are actually supported by the games remains to be tested, but the
file format allows them.)
If the GXT file has a “Vice”, “San8” or “San16” format, it may also contain
auxiliary tables, which have separate sections, listed as [aux_tables.NAME].
NAME, in this case, is the table’s actual name, and can also be up to 8 bytes
long (but in practice, at most 7).
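The 8-byte limit on string and table names can be pictured as a fixed-width field. The sketch below packs a name into such a field and reads it back. The helper names are invented for this example, and padding short names with NUL bytes is an assumption; the text above only states the size limit.

```rust
/// Width of a name field, per the 8-byte limit described above.
const NAME_LEN: usize = 8;

/// Packs a name into a fixed-width field.
/// Assumption: unused trailing bytes are filled with NUL.
fn pack_name(name: &str) -> Result<[u8; NAME_LEN], String> {
    let bytes = name.as_bytes();
    if bytes.len() > NAME_LEN {
        return Err(format!(
            "name {:?} is {} bytes long, the limit is {}",
            name,
            bytes.len(),
            NAME_LEN
        ));
    }
    let mut field = [0u8; NAME_LEN];
    field[..bytes.len()].copy_from_slice(bytes);
    Ok(field)
}

/// Reads a name back, stopping at the first NUL byte,
/// or using all 8 bytes if there is none.
fn unpack_name(field: &[u8; NAME_LEN]) -> String {
    let end = field.iter().position(|&b| b == 0).unwrap_or(NAME_LEN);
    String::from_utf8_lossy(&field[..end]).into_owned()
}
```

Under that padding assumption, a 7-byte name always leaves room for a terminator, which may be why 8-byte names do not show up in practice.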
If the GXT file has a “San8” or “San16” format, then instead of a name, each
string is associated with a CRC32 checksum. These checksums are rendered as a
hash sign followed by the checksum in hexadecimal (e.g. "#01234567"). If a
string ever needs an actual name that starts with a hash sign, that hash sign
should be doubled (e.g. "##NAME").
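The key notation just described can be sketched as a small parser and formatter. The `Key` type and both function names are hypothetical, and requiring exactly eight hexadecimal digits after a single hash sign is an assumption based on the example above.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Key {
    /// A key known only by its 32-bit checksum, written as "#XXXXXXXX".
    Hash(u32),
    /// A key with an actual name; a leading '#' is written doubled.
    Name(String),
}

/// Parses a key as it appears in the text file.
fn parse_key(text: &str) -> Result<Key, String> {
    if let Some(rest) = text.strip_prefix("##") {
        // Doubled hash sign: a literal name that starts with '#'.
        return Ok(Key::Name(format!("#{}", rest)));
    }
    if let Some(hex) = text.strip_prefix('#') {
        // Assumption: a checksum is always written as 8 hex digits.
        if hex.len() != 8 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(format!("invalid checksum key: {:?}", text));
        }
        return u32::from_str_radix(hex, 16)
            .map(Key::Hash)
            .map_err(|e| e.to_string());
    }
    Ok(Key::Name(text.to_string()))
}

/// Renders a key back into its text form.
fn format_key(key: &Key) -> String {
    match key {
        Key::Hash(h) => format!("#{:08X}", h),
        // Escape a leading hash sign by doubling it.
        Key::Name(n) if n.starts_with('#') => format!("#{}", n),
        Key::Name(n) => n.clone(),
    }
}
```

Checking for the doubled hash sign first is what keeps the two cases unambiguous: anything starting with a single hash sign is then treated as a checksum.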
By default, strings from a GXT file are listed in “key” order, which is a simple ASCIIbetical sort of their names or hashes. It is also possible to use “offset” order, which gives interesting results: the offsets seem to correspond roughly to when the strings were first added to the game, or otherwise follow a category-based ordering. For example, the last strings in the GTA III file all concern features added in the game’s PC port. Here is what the first few lines look like when ordered by offset:
format = "Three"
[main_table]
LETTER1 = """abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ 0123456789"$,.'-?!!SDBF"""
DEFNAM = "Claude----------------------"
IN_VEH = "~g~Hey! Get back in the vehicle!"
IN_VEH2 = "~g~You need some wheels for this job!"
IN_BOAT = "~g~You need a boat for this job!"
HEY = "~g~Don't go solo, keep your posse together!"
HEY2 = "~g~Don't split up, keep the group together!"
HEY3 = "~g~You've dropped your main man, go back and get 8-Ball!"
(It is worth noting that the DEFNAM string is not used by the final game. GTA
1 and 2 both allowed the player to name the protagonist, whereas GTA III did
not. The fact that GTA III’s main character is named Claude was only officially
revealed in GTA: San Andreas.)
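The difference between the “key” and “offset” orderings can be sketched in a few lines. The `Entry` struct, its `offset` field and the `SortOrder` enum are all invented for this example; it only assumes that each string records where its data sits in the file.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    key: String,
    /// Position of the string's data within the file (hypothetical field).
    offset: u32,
    value: String,
}

#[derive(Debug, Clone, Copy)]
enum SortOrder {
    Key,
    Offset,
}

/// Sorts entries in place according to the requested ordering.
fn sort_entries(entries: &mut [Entry], order: SortOrder) {
    match order {
        // Byte-wise comparison of the names: the "ASCIIbetical" sort.
        SortOrder::Key => {
            entries.sort_by(|a, b| a.key.as_bytes().cmp(b.key.as_bytes()))
        }
        // Order in which the string data is laid out in the file.
        SortOrder::Offset => entries.sort_by_key(|e| e.offset),
    }
}
```

Both sorts are stable, so entries that compare equal keep their relative order.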
gxter has built-in character tables for the EFIGS
(English/French/Italian/German/Spanish) versions of all the supported games, and
also supports custom character tables for any other translation. The source code
bundles two such tables, for popular bootleg Russian translations of GTA: Vice
City and GTA: San Andreas. These reassign certain characters from Latin letters
to Cyrillic ones, so that strings in a language the game never supported can
still fit. Anyone translating a GTA game into another language would likewise
need to write a custom character table (in addition to changing the fonts).
Here is a Russian string read from a bootleg version of GTA: San Andreas:
"#C0EBC586" = " ~z~ОНА - СО МНОЙ, УГЛЕПЛАСТИК. ТАК ОХЛАДИТЕ ТРАХАНИЕ. Я РАССМАТРИВАЮ ЕЕ ПОЛЬЗУ."
To make working on “San” format files easier, the program can also accept a “name table” file, which contains a list of known string names. When a GXT file is loaded with such a table, the checksum of every name in the table is pre-calculated, and any checksum in the GXT file that matches one of them is replaced with the corresponding name.
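The name-table lookup can be sketched as a reverse map from checksum to name. Everything here is illustrative rather than gxter's actual code: the function names are made up, and the checksum is the common CRC-32 (the one used by zlib and PNG). Whether the games use exactly this variant, or normalize the case of names before hashing, is an assumption that would need checking.

```rust
use std::collections::HashMap;

/// Common CRC-32 (reflected, polynomial 0xEDB88320), computed bit by bit.
/// Assumption: the exact variant used by the games may differ.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones if the lowest bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Pre-calculates the checksum of every known name.
fn build_name_table(names: &[&str]) -> HashMap<u32, String> {
    names
        .iter()
        .map(|name| (crc32(name.as_bytes()), (*name).to_string()))
        .collect()
}

/// Returns the known name for a checksum, or the "#XXXXXXXX" form otherwise.
fn resolve(hash: u32, table: &HashMap<u32, String>) -> String {
    match table.get(&hash) {
        Some(name) => name.clone(),
        None => format!("#{:08X}", hash),
    }
}
```

Since a checksum cannot be reversed directly, the table has to be built from candidate names up front; any checksum without a matching name simply stays in its hexadecimal form.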
gxter usage
The library’s documentation is available on
docs.rs. The most important structure is
GXTFile, which stores the actual strings. It supports loading data from
GXT files, saving back to them, as well as reading and writing the
TOML-based text files described above. Once read, the data in the GXTFile
structure is kept in an easy-to-access form: the main table and each
auxiliary table is a string-indexed [IndexMap](https://docs.rs/indexmap) (a
variant of HashMap that preserves insertion order).