Series of explanations #1: Hashlists

The word ‘hashlist’ has come up a lot recently, but most people don’t know much about it apart from ‘we need it for modding’. The concept is pretty straightforward and comes up frequently in computing. It centres on ‘hashing’; you can find a video explaining this here. Hashing is the process of taking a piece of data, such as a file or a series of characters (a string), and producing a fixed-size number that represents it. The process is irreversible, and although two different inputs can in theory produce the same number (a collision), a good algorithm makes that rare enough that the hash is treated as unique in practice. A common form of this is the md5 checksum, which we actually use here on the site. Checksums like this are often used to verify that the data you have downloaded hasn’t been corrupted or tampered with.
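
To make the checksum idea concrete, here is a minimal Python sketch using the standard hashlib module. The sample data and the helper name md5_hex are made up for illustration; this is not how the site itself computes its checksums, only the general technique.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


# Pretend this is the checksum published next to a download (sample data).
original = b"hello world"
published_checksum = md5_hex(original)

# An intact copy hashes to the same value...
intact_ok = md5_hex(b"hello world") == published_checksum
# ...while a copy with even one extra byte does not.
corrupt_ok = md5_hex(b"hello world!") == published_checksum

print(published_checksum)
print("intact copy matches:", intact_ok)
print("corrupted copy matches:", corrupt_ok)
```

Note that the checksum only tells you whether the data matches; it cannot be turned back into the data.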

This concept applies to Payday 2 because a large part of the assets use hashes of the path, language and extension to define and reference files. So if you’re looking for the file with the path ‘units/payday2/masks/msk_dallas/msk_dallas’, you’ll find it stored as 9e8cd8d332833dcf in the bundles/packages where it is located. This is done to reduce the time taken to look up files in the database. It also saves space: the path I just mentioned is 41 characters (41 bytes), while the hashing algorithm used in PD2 produces a 64-bit number that is stored as 8 bytes. In certain cases, such as extensions shorter than 8 characters, the hash uses more space than the text, but overall it appears that more is saved than lost. It is not only the bundles/packages that use hashes; they are also used in the localization files for the ids, in model files for the names of materials, objects, etc., and for sound ids (under a different algorithm).
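
The size difference can be sketched in Python. Important assumption: standin_hash64 below is NOT the algorithm PD2 uses; it is just 8 bytes of BLAKE2b used as a stand-in 64-bit hash, so it will not produce 9e8cd8d332833dcf. The byte order used for packing is also an arbitrary choice for the example.

```python
import hashlib
import struct


def standin_hash64(text: str) -> int:
    """Stand-in 64-bit hash (8-byte BLAKE2b). NOT the game's real algorithm."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


path = "units/payday2/masks/msk_dallas/msk_dallas"
text_size = len(path.encode("utf-8"))          # 41 bytes as plain text

hashed = standin_hash64(path)
packed = struct.pack("<Q", hashed)             # one unsigned 64-bit integer
hash_size = len(packed)                        # always 8 bytes

print(f"path as text: {text_size} bytes")
print(f"path as hash: {hash_size} bytes ({hashed:016x})")

# A short extension is the opposite case: 4 bytes of text become 8 bytes.
ext = "unit"
print(f"'{ext}' as text: {len(ext)} bytes, as hash: {hash_size} bytes")
```

Whatever the input length, the stored value is always 8 bytes, which is where both the saving on long paths and the loss on short extensions come from.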

Historically the game included the file ‘idstring_lookup.idstring_lookup’, which is what we call the hashlist. This file is very important because the only way to ‘undo’ a hash is to know what the original data was, hash it and compare. The problem we currently have is that the hashlist has been removed from the game’s assets, which restricts us to what we had previously: everything up to and including U100. It was removed because it occasionally contained entries for unreleased content, which led to leaks. That is how we were aware that all the currently released classic heists were in development before they were ever announced or released. Without the hashlist, files that were added to the game post U100 will, when extracted, be given a filename like 9e8cd8d332833dcf.unit, as we just don’t know the actual path. This causes a major issue when it comes to modifying files: if we don’t know the path/name of a file, it is hard to know what it is for and you cannot mod override it.
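
As a rough sketch of how a hashlist gets used during extraction, the Python below hashes every known string once and then looks hashes up in the resulting table. Assumptions: standin_hash64 is a stand-in (8-byte BLAKE2b), not the game's algorithm; the function names are invented for this example; and ‘units/some/newer/asset’ is a hypothetical path standing in for post-U100 content.

```python
import hashlib


def standin_hash64(text: str) -> int:
    """Stand-in 64-bit hash (8-byte BLAKE2b). NOT the game's real algorithm."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def build_lookup(known_strings):
    """Hash every known string once so hashes can be mapped back to text."""
    return {standin_hash64(s): s for s in known_strings}


def resolve_name(path_hash, ext_hash, lookup):
    """Return 'path.ext', falling back to the hex hash for unknown parts."""
    path = lookup.get(path_hash, f"{path_hash:016x}")
    ext = lookup.get(ext_hash, f"{ext_hash:016x}")
    return f"{path}.{ext}"


# An old hashlist: it knows the Dallas mask and the 'unit' extension only.
hashlist = ["units/payday2/masks/msk_dallas/msk_dallas", "unit"]
lookup = build_lookup(hashlist)

known = resolve_name(standin_hash64(hashlist[0]), standin_hash64("unit"), lookup)
# Hypothetical newer file whose path is missing from the old hashlist.
unknown = resolve_name(
    standin_hash64("units/some/newer/asset"), standin_hash64("unit"), lookup
)

print(known)    # full path, because the hashlist contains it
print(unknown)  # 16 hex digits plus '.unit', because the path is unknown
```

The second name is all the extractor can produce: the extension resolves from the old list, but the path stays a bare hash.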

What can we do about it?

  • Get Overkill to return the ‘idstring_lookup.idstring_lookup’. This option would be ideal as it would resolve all the issues we currently have. However, I am unsure whether this will happen soon. Joakim assured us at the beginning of September that it would return with the 2nd update post-talk and would be manually updated to prevent leaks, but it has not yet been added and we haven’t heard any news about it (potentially because of the Hoxton Housewarming event).
  • Obtain paths from internal asset references. As most modders are probably aware at this stage, individual assets commonly reference each other using the real path that we need. This option is helpful and I have an implementation of it set up in the in-dev modding tools (an announcement regarding this will be coming soon). What I’ve got works pretty well for what it can get. However, there is one major issue with this idea: a large number of paths are not referenced in the assets. These are primarily sound files, heist core files and gui textures. Some of these are only referenced in the Lua code, and could potentially be obtained through a script running in-game. That presents its own issues, but done well it should be able to eliminate another large subset of the missing paths.
  • Brute forcing hashes. Hashes are often reversed by hashing possible original data until the result is equal to the hash. In theory this could work, as you could set up decent patterns for generating possible paths. However, the amount this would be able to crack would be limited, as many of the longer paths would take an extremely long time to find and the patterns would need to be manually defined to even have a hope. You can find a good Computerphile video on this here.
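
The last two options both boil down to ‘collect candidate paths, hash them, compare’. Here is a minimal Python sketch of that loop. Assumptions: standin_hash64 is a stand-in (8-byte BLAKE2b), not the game's algorithm; the XML-like asset snippet, the regular expression, the function names and the non-Dallas mask paths are all hypothetical and only there for illustration. This is not the implementation in the in-dev tools.

```python
import hashlib
import itertools
import re


def standin_hash64(text: str) -> int:
    """Stand-in 64-bit hash (8-byte BLAKE2b). NOT the game's real algorithm."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


# Anything shaped like segment/segment/... is treated as a candidate path.
PATH_PATTERN = re.compile(r"[a-z0-9_]+(?:/[a-z0-9_]+)+")


def harvest_paths(asset_text):
    """Collect path-like strings referenced inside another asset."""
    return set(PATH_PATTERN.findall(asset_text))


def pattern_candidates(template, **choices):
    """Generate guesses by filling a template with every combination."""
    keys = list(choices)
    for combo in itertools.product(*(choices[k] for k in keys)):
        yield template.format(**dict(zip(keys, combo)))


def recover(unknown_hashes, candidates):
    """Hash each candidate and keep the ones that match an unknown hash."""
    found = {}
    for candidate in candidates:
        h = standin_hash64(candidate)
        if h in unknown_hashes:
            found[h] = candidate
    return found


dallas = "units/payday2/masks/msk_dallas/msk_dallas"
hoxton = "units/payday2/masks/msk_hoxton/msk_hoxton"   # hypothetical path
never_guessed = "units/never/guessed/path"             # hypothetical path
unknown_hashes = {standin_hash64(p) for p in (dallas, hoxton, never_guessed)}

# Option 2: a hypothetical asset that references the Dallas mask by path.
asset_text = '<unit><object file="units/payday2/masks/msk_dallas/msk_dallas" /></unit>'
found = recover(unknown_hashes, harvest_paths(asset_text))

# Option 3: guesses generated from a manually defined pattern.
guesses = pattern_candidates(
    "units/payday2/masks/msk_{name}/msk_{name}",
    name=["dallas", "hoxton", "wolf", "chains"],
)
found.update(recover(unknown_hashes, guesses))

for h, path in sorted(found.items()):
    print(f"{h:016x} -> {path}")
print("still unknown:", len(unknown_hashes) - len(found))
```

The third hash stays unresolved because nothing references it and no pattern generates it, which is exactly the limitation described for both approaches.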

In summary, a file with the path ‘units/payday2/masks/msk_dallas/msk_dallas’ will be stored as 9e8cd8d332833dcf, and the only way to get the actual path back is to have a list that contains it. Without the hashlist in the game, we don’t know what the path of 9e8cd8d332833dcf is, so determining its purpose is difficult and you can’t mod override it. However, some of the solutions mentioned above could solve the issue partly or fully.

If you have any questions about anything mentioned here or you wish to request what I should talk about next, feel free to reply to this thread.

I asked about adding the hashlist back in during one of the housewarming streams on Twitch. Joakim said they would make a manual hashlist soon.

Hopefully the hashlist will return soon so we can see more mods for the newest content in the game!
