From CloudModding TWW Wiki

ARC files are archives that store information about files and directories, and are used to organize assets within the games. While the file extension is .arc, the FourCC for the format is RARC. Following other observed naming conventions, the addition of R to the beginning of the FourCC suggests that the format was refactored at some point from an earlier version.

ARC is used in nearly every first-party Nintendo game in the GameCube era, including The Wind Waker and Twilight Princess. Luigi's Mansion typically used YAY0-compressed archives with the extension .szp, and Super Mario Sunshine used YAZ0-compressed archives with the extension .szs.

Header

The header contains data about the archive itself, such as the total size of the file and the counts and offsets of the nodes and file entries. It is 0x40/64 bytes long in total and is split into two 0x20/32-byte parts, the RARC header and the data header, which have the following structure:

RARC Header

Offset  Size  Name                         Description
0x00    4     FourCC                       The file type, the four characters 'RARC'
0x04    4     File Size                    The size of the entire ARC file
0x08    4     Data Header Offset           Offset from the start of the file to the data header. In practice this is always 0x20, the size of the RARC header. However, the game does properly read this, so it would have no issues loading RARCs with headers larger than 0x20 bytes as long as this value was updated to match.
0x0C    4     File Data Offset             The offset to the file data (relative to the start of the data header, so add 0x20)
0x10    4     Total File Data Size         The combined size of all the file data, including both the MRAM and ARAM blocks as well as any files loaded directly from the DVD that are not in either block.
0x14    4     MRAM Preload File Data Size  The size of the first block of file data, which has files to be preloaded into MRAM as soon as the RARC is loaded. Can be 0 if no files have the MRAM flag set.
0x18    4     ARAM Preload File Data Size  The size of the second block of file data, which has files to be preloaded into ARAM as soon as the RARC is loaded. Can be 0 if no files have the ARAM flag set.
0x1C    4     Padding                      Pads the header to 0x20/32 bytes
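
As an illustration of the layout above, here is a minimal Python sketch that unpacks the RARC header. It assumes big-endian byte order (the GameCube's native order, which this page does not state explicitly). The names RarcHeader, parse_rarc_header and absolute_file_data_offset are hypothetical and are not taken from any existing tool.

```python
import struct
from typing import NamedTuple

# Assumption: big-endian. 4-byte magic, six 32-bit fields, 4 bytes of padding.
RARC_HEADER_FORMAT = ">4s6I4x"


class RarcHeader(NamedTuple):
    fourcc: bytes
    file_size: int
    data_header_offset: int
    file_data_offset: int
    total_file_data_size: int
    mram_preload_size: int
    aram_preload_size: int


def parse_rarc_header(data: bytes) -> RarcHeader:
    """Unpack the 0x20-byte RARC header at the start of the archive."""
    header = RarcHeader(*struct.unpack_from(RARC_HEADER_FORMAT, data, 0))
    if header.fourcc != b"RARC":
        raise ValueError(f"unexpected FourCC: {header.fourcc!r}")
    return header


def absolute_file_data_offset(header: RarcHeader) -> int:
    """Turn the data-header-relative file data offset into a file offset."""
    # In practice data_header_offset is 0x20, matching "add 0x20" in the table.
    return header.data_header_offset + header.file_data_offset
```

Reading the data header offset from the file instead of hard-coding 0x20 follows the remark that the game itself reads this field.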

Data Header

The data header always starts at offset 0x20 into the RARC in practice.

Offset  Size  Name                       Description
0x00    4     Node Count                 The number of Nodes in the archive
0x04    4     Node Offset                The offset to the start of the Node data (relative to the start of the data header, so add 0x20)
0x08    4     File Entry Count           The number of File Entries in the archive
0x0C    4     File Entry Offset          The offset of the File Entry data (relative to the start of the data header, so add 0x20)
0x10    4     String Table Length        The length of the String Table
0x14    4     String Table Offset        The offset of the String Table (relative to the start of the data header, so add 0x20)
0x18    2     Next Free File ID          If a new file is added to this ARC, this is the ID that it should be given. Then this value should be incremented by 1. Not read by the game itself.
0x1A    1     Sync File IDs and Indexes  Seems to be a boolean that, when true, indicates all the files will have file IDs equal to their index in the list of all the files in the ARC. When false, the file IDs will be out of sync with their indexes. This doesn't appear to be read by the game itself.
0x1B    1     Padding                    Always 0x00
0x1C    4     Padding                    Pads the data header to 0x20/32 bytes, which brings the combined headers to 0x40/64 bytes
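
The data header can be unpacked the same way. The sketch below again assumes big-endian byte order. DataHeader, parse_data_header and to_file_offset are hypothetical names chosen for this example.

```python
import struct
from typing import NamedTuple

# Assumption: big-endian. Six 32-bit fields, a 16-bit ID, a 1-byte flag,
# 1 byte of padding, then 4 more bytes of padding (0x20 bytes in total).
DATA_HEADER_FORMAT = ">6IHBx4x"


class DataHeader(NamedTuple):
    node_count: int
    node_offset: int
    file_entry_count: int
    file_entry_offset: int
    string_table_length: int
    string_table_offset: int
    next_free_file_id: int
    sync_file_ids: int  # 1 when file IDs equal file indexes, else 0


def parse_data_header(data: bytes, data_header_offset: int = 0x20) -> DataHeader:
    """Unpack the data header located at data_header_offset in the archive."""
    fields = struct.unpack_from(DATA_HEADER_FORMAT, data, data_header_offset)
    return DataHeader(*fields)


def to_file_offset(relative_offset: int, data_header_offset: int = 0x20) -> int:
    """Convert an offset relative to the data header into a file offset."""
    return data_header_offset + relative_offset
```

For example, a Node Offset of 0x20 read from the data header corresponds to file offset 0x40 when the data header starts at 0x20.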

Node

Nodes represent the directories stored within the archive. They are 0x10/16 bytes long, and are laid out like this:

Offset  Size  Name                    Description
0x00    4     Type                    Typically the file type contained in this archive (BMD, BDL, STB, etc)
0x04    4     Name Offset             The offset to the directory's actual name
0x08    2     Name Hash               A hash of the directory's name. See "Filename Hashing Algorithm" below
0x0A    2     File Entry Count        The number of file entries that belong to this directory
0x0C    4     First File Entry Index  The index of the first file entry that belongs to this directory
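
A minimal sketch of reading the node list follows, assuming big-endian byte order and assuming that a directory's file entries are stored contiguously starting at First File Entry Index. Node, parse_nodes and entry_indexes are hypothetical names.

```python
import struct
from typing import List, NamedTuple

NODE_FORMAT = ">4sIHHI"  # assumption: big-endian
NODE_SIZE = struct.calcsize(NODE_FORMAT)  # 0x10 bytes per node


class Node(NamedTuple):
    type: bytes
    name_offset: int
    name_hash: int
    file_entry_count: int
    first_file_entry_index: int


def parse_nodes(data: bytes, nodes_start: int, node_count: int) -> List[Node]:
    """Read node_count consecutive nodes starting at file offset nodes_start."""
    return [
        Node(*struct.unpack_from(NODE_FORMAT, data, nodes_start + i * NODE_SIZE))
        for i in range(node_count)
    ]


def entry_indexes(node: Node) -> range:
    """Indexes of the file entries that belong to this directory."""
    start = node.first_file_entry_index
    return range(start, start + node.file_entry_count)
```

Here nodes_start is an absolute file offset, that is, the Node Offset from the data header with the data header's own offset already added.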

File Entry

These entries represent the actual hierarchy of files and directories. They are 0x14/20 bytes long and have the following structure:

Offset  Size  Name         Description
0x00    2     File ID      A unique identifier for each file. Directories don't use this; they usually have it set to 0xFFFF, but may have it set to 0x0000 instead in rare cases.
0x02    2     Name Hash    A hash of the file or directory's name. See "Filename Hashing Algorithm" below
0x04    1     Type Flags   A bitfield of various flags indicating what type of entry this is. See below for details.
0x05    3     Name Offset  The offset of this entry's name in the string table
0x08    4     Data Offset  If this is a directory, this is the index of the directory's node. If it's a file, it's the offset to the file's data.
0x0C    4     Data Size    If this is a directory, this is the size of a directory node, 0x10/16. If it's a file, it's the size of the file's data.
0x10    4     Padding      Four bytes of padding to round out the entry to 0x14/20 bytes long.
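
The 1-byte Type Flags and 3-byte Name Offset fields share one 32-bit word, so one way to read them is to unpack the word and split it. The sketch below assumes big-endian byte order. FileEntry and parse_file_entry are hypothetical names.

```python
import struct
from typing import NamedTuple

# Assumption: big-endian. The 32-bit word at 0x04 holds the type flags in its
# top byte and the name offset in its low 24 bits.
FILE_ENTRY_FORMAT = ">HHIII4x"
FILE_ENTRY_SIZE = struct.calcsize(FILE_ENTRY_FORMAT)  # 0x14 bytes per entry


class FileEntry(NamedTuple):
    file_id: int
    name_hash: int
    type_flags: int
    name_offset: int
    data_offset: int  # node index for directories, data offset for files
    data_size: int    # 0x10 for directories, data size for files

    @property
    def is_directory(self) -> bool:
        return bool(self.type_flags & 0x02)


def parse_file_entry(data: bytes, offset: int) -> FileEntry:
    """Unpack one 0x14-byte file entry at the given file offset."""
    file_id, name_hash, flags_and_name, data_offset, data_size = struct.unpack_from(
        FILE_ENTRY_FORMAT, data, offset
    )
    return FileEntry(
        file_id,
        name_hash,
        flags_and_name >> 24,        # top byte: type flags
        flags_and_name & 0xFFFFFF,   # low three bytes: name offset
        data_offset,
        data_size,
    )
```

Entry number i of the archive would then be read at the absolute file entry offset plus i * FILE_ENTRY_SIZE.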

Type Flags Bitfield

Mask  Field Name            Description
0x01  File                  This entry is a file, not a directory.
0x02  Directory             This entry is a directory, not a file.
0x04  Compressed File       This entry's data is compressed, either with Yaz0 or Yay0.
0x10  Preload to MRAM       This file should be preloaded into MRAM. In practice all non-REL files have this set.
0x20  Preload to ARAM       This file should be preloaded into ARAM, and then transferred to MRAM when it is needed. In practice only REL files have this set.
0x40  Load from DVD         This file should not be preloaded, it should be loaded directly from the DVD to MRAM when it is needed.
0x80  Yaz0 Compressed File  This entry's data is Yaz0-compressed, not Yay0 compressed. Bit 0x04 should be set as well.
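
The masks above map naturally onto a flag enumeration. The following Python sketch shows one way to decode them. EntryFlags and compression_of are hypothetical names, and treating a set 0x04 bit without 0x80 as Yay0 is an inference from the descriptions above rather than confirmed game behavior.

```python
from enum import IntFlag
from typing import Optional


class EntryFlags(IntFlag):
    FILE = 0x01
    DIRECTORY = 0x02
    COMPRESSED = 0x04
    PRELOAD_TO_MRAM = 0x10
    PRELOAD_TO_ARAM = 0x20
    LOAD_FROM_DVD = 0x40
    YAZ0_COMPRESSED = 0x80


def compression_of(type_flags: int) -> Optional[str]:
    """Return "Yaz0", "Yay0", or None for an uncompressed entry."""
    if not type_flags & EntryFlags.COMPRESSED:
        return None
    # Assumption: 0x80 distinguishes Yaz0 from Yay0 once 0x04 is set.
    return "Yaz0" if type_flags & EntryFlags.YAZ0_COMPRESSED else "Yay0"


# A typical non-REL file: a file that is preloaded into MRAM.
typical = EntryFlags(0x11)
assert EntryFlags.FILE in typical and EntryFlags.PRELOAD_TO_MRAM in typical
```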

Note: Every directory contains two extra file entries in addition to the entries that represent the actual files it contains. These two entries are named "." and "..", which may stem from how Unix-like systems and Windows handle directories. On those systems, "." represents a link to the current directory, and ".." represents a link to the current directory's parent.

Note: The game can load files by file index, file ID, or file name. File index is the most common method and is used by almost all objects and a number of NPCs. File ID is used by most NPCs, as well as STB cutscenes. File name is used by only a handful of things, such as loading stages or dungeon doors.
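
To make the difference between the three lookup methods concrete, here is a small Python sketch over an already-parsed entry list. It is only an illustration and does not reproduce the game's own lookup code. The Entry type and the find_by_* functions are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Entry(NamedTuple):  # hypothetical, simplified parsed file entry
    file_id: int
    name: str
    is_directory: bool


def find_by_index(entries: Sequence[Entry], index: int) -> Entry:
    """Look up an entry by its position in the archive's file entry list."""
    return entries[index]


def find_by_id(entries: Sequence[Entry], file_id: int) -> Optional[Entry]:
    """Look up a file by its File ID, skipping directories (which don't use IDs)."""
    return next(
        (e for e in entries if not e.is_directory and e.file_id == file_id), None
    )


def find_by_name(entries: Sequence[Entry], name: str) -> Optional[Entry]:
    """Look up a file by its name."""
    return next((e for e in entries if not e.is_directory and e.name == name), None)


# When file IDs are out of sync with indexes, the two lookups can differ.
entries = [
    Entry(1, "b.bmd", False),
    Entry(0, "a.bmd", False),
    Entry(0xFFFF, ".", True),
]
assert find_by_index(entries, 0).name == "b.bmd"
assert find_by_id(entries, 0).name == "a.bmd"
```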

String Table

The string table is simply a series of null-terminated strings.
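
Given a Name Offset from a node or file entry, a name can be read by scanning to the next null byte. A minimal Python sketch follows. read_name is a hypothetical helper, and the ASCII default encoding is an assumption, since this page does not say which encoding the names use.

```python
def read_name(string_table: bytes, name_offset: int, encoding: str = "ascii") -> str:
    """Return the null-terminated string starting at name_offset."""
    # bytes.index raises ValueError if no terminator follows the offset.
    end = string_table.index(b"\x00", name_offset)
    return string_table[name_offset:end].decode(encoding)


# A tiny string table holding ".", ".." and one file name.
table = b".\x00..\x00model.bmd\x00"
assert read_name(table, 0) == "."
assert read_name(table, 2) == ".."
assert read_name(table, 5) == "model.bmd"
```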

Filename Hashing Algorithm

Nodes and File Entries have a field for a hash of their names. An example function for creating this hash from the name string is found below. It uses C# syntax.

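private ushort HashName(string name)
{
    // The hash is accumulated in 16 bits, and overflow is expected to wrap.
    short hash = 0;
    short multiplier = 1;

    // The multiplier is chosen from the name's length plus one
    // (the + 1 presumably accounts for the null terminator).
    if (name.Length + 1 == 2)
    {
        multiplier = 2;
    }

    if (name.Length + 1 >= 3)
    {
        multiplier = 3;
    }

    // unchecked makes the wrap-around explicit even if the project
    // is compiled with overflow checking enabled.
    unchecked
    {
        foreach (char c in name)
        {
            // Scale the running hash, then add the character's code.
            hash = (short)(hash * multiplier);
            hash += (short)c;
        }
    }

    return (ushort)hash;
}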
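
For readers working in Python, the C# function above can be ported as in the sketch below. It assumes names contain only characters below U+10000, so that one Python character matches one C# char. The function name hash_name is hypothetical.

```python
def hash_name(name: str) -> int:
    """Port of the C# HashName function; returns an unsigned 16-bit hash."""
    length_with_terminator = len(name) + 1
    if length_with_terminator == 2:
        multiplier = 2
    elif length_with_terminator >= 3:
        multiplier = 3
    else:
        multiplier = 1

    value = 0
    for ch in name:
        # Masking with 0xFFFF reproduces the 16-bit wrap-around of the C# code.
        value = (value * multiplier + ord(ch)) & 0xFFFF
    return value


# The two entries present in every directory:
assert hash_name(".") == 0x2E
assert hash_name("..") == 0xB8
```

Because the running hash starts at 0, the multiplier has no effect on names of one character or fewer, so for any name the result equals repeatedly computing hash * 3 + character in 16 bits.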

See Also

Tools that can extract and repack ARC files: