This paper outlines the technical implementation, benefits, and performance considerations of using compressed wordlists with Hashcat, the industry-standard password recovery tool.
Efficient Password Cracking with Compressed Wordlists in Hashcat

1. Introduction
Modern password cracking often requires wordlists (dictionaries) exceeding several terabytes in size, such as the Weakpass collections. Storing and processing these massive files uncompressed creates significant storage overhead and I/O bottlenecks. Since version 6.0.0, Hashcat natively supports on-the-fly decompression for specific formats, allowing researchers to make better use of their hardware resources.

2. Supported Formats and Usage
Hashcat automatically detects and decompresses wordlists in the following formats during execution:

Gzip (.gz)
ZIP (.zip)

Standard Implementation
To use a compressed wordlist, reference the file directly in a Straight Attack (-a 0) command:

    hashcat -a 0 -m [mode] [hash] wordlist.gz

Limitations
7-Zip (.7z): Not natively supported for direct wordlist reading. If provided, Hashcat may treat the binary compressed data as the wordlist itself, leading to failed cracks.
Decompression Delay: For very large files (e.g., 250GB compressed), Hashcat may require significant startup time (sometimes hours) to index the file and build the dictionary cache before the GPU begins cracking.

3. Legacy and Alternative Methods (Piping)
For versions prior to 6.0.0 or for unsupported formats like .zst, users must pipe the decompressed stream into Hashcat.
Syntax: gunzip -cd wordlist.gz | hashcat -a 0 -m [mode] [hash]
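The piping approach can be generalized to several formats with a small wrapper. The following is a minimal sketch, not an official Hashcat feature: `wordlist_stream` is a hypothetical helper name, and it assumes the matching decompressor (gzip, bzip2, xz, zstd, or 7z) is installed for whichever extension is used.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hashcat): write a wordlist to stdout,
# choosing a decompressor from the file extension.
# Assumes the relevant tool is installed for the format in use.
wordlist_stream() {
    local file="$1"
    case "$file" in
        *.gz)  gzip -cd -- "$file" ;;    # gzip stream
        *.bz2) bzip2 -cd -- "$file" ;;   # bzip2 stream
        *.xz)  xz -cd -- "$file" ;;      # xz stream
        *.zst) zstd -cdq -- "$file" ;;   # zstd stream, quiet mode
        *.7z)  7z e -so "$file" ;;       # extract archive contents to stdout
        *)     cat -- "$file" ;;         # assume plain text
    esac
}

# Usage (mode and hash are placeholders, as in the syntax line above):
#   wordlist_stream wordlist.zst | hashcat -a 0 -m [mode] [hash]
```

Dispatching on the extension keeps the attack command identical regardless of how the wordlist is stored, at the cost of trusting that the extension matches the actual file contents.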
Critical Drawback: Piping prevents Hashcat from building its dictionary cache. Because the tool doesn't know the full length of the input, it cannot provide an accurate ETA, and status features that depend on a known position in the wordlist (such as skipping or restoring) do not work efficiently.

4. Performance Considerations
I/O vs. CPU: Compressed wordlists reduce disk read time (I/O) but increase CPU load for decompression. In most high-speed GPU cracking scenarios, the CPU overhead is negligible compared to the benefits of reduced disk activity.
Caching: Native support (.gz/.zip) allows Hashcat to record dictionary statistics in its cache file (hashcat.dictstat2), which speeds up subsequent runs using the same wordlist.
Memory: Very large compressed files may require substantial system RAM for indexing during the initial load phase.

5. Conclusion
Native compressed wordlist support in Hashcat is a vital feature for handling modern "leak" databases. For optimal results, researchers should prioritize Gzip (.gz) compression and use Hashcat 6.0+ to maintain full status-tracking and caching capabilities.

Sources: Hashcat Forum; Hashcat Wiki; Super User, "Using Hashcat to load a compressed wordlist".
Using compressed wordlists in Hashcat is an efficient way to manage massive password dictionaries without exhausting your local storage. Modern versions of Hashcat read certain compressed formats directly, so you can run attacks on the fly without manually decompressing hundreds of gigabytes of text.

Supported Formats and Usage

Hashcat can natively handle wordlists compressed with:

Gzip (.gz)
ZIP (.zip)

Standard Syntax: Point Hashcat to the compressed file as you would with an uncompressed wordlist:

    hashcat -a 0 -m [hash_type] target_hashes.txt wordlist.gz

ZIP Specifics: If using a .zip archive, ensure the wordlist is the only file inside and that it was compressed with a method Hashcat can read.

Performance Considerations

On-the-Fly Decompression: Hashcat decompresses the data in memory as it processes it. This means you don't lose cracking speed during the actual attack, though there may be a delay at the start while Hashcat builds its dictionary cache.

RAM Limits: While compression saves disk space, Hashcat still needs to analyze the file for statistics. For extremely large files (e.g., 100GB+ compressed), you may see a long "Dictionary cache building" phase where the system appears to hang before the crack begins.

Comparison of Formats: Many users report that .gz is more reliable than .zip for exceptionally large wordlists (terabyte-scale uncompressed), as it avoids certain internal ZIP file size limits.

Advanced Piping (The "Zcat" Method)

If you encounter a format Hashcat doesn't natively support, you can pipe the output from a decompression tool directly into Hashcat using standard input (stdin):

    zcat wordlist.gz | hashcat -a 0 -m [hash_type] target_hashes.txt

Note: Using piping may prevent Hashcat from showing an accurate progress bar or ETA, as it doesn't know the total size of the incoming stream.

Best Practices

Avoid Subfolders: When zipping a wordlist, do not include any subfolders in the archive; Hashcat expects the raw dictionary file to be at the root.

Prioritize Rules: Instead of using a 500GB compressed wordlist, it is often more efficient to use a smaller, high-quality list combined with Hashcat rules to generate permutations on the fly.
Starting with Hashcat 6.0, the tool supports native on-the-fly decompression of wordlists, allowing you to use compressed files directly in your attack commands without pre-extracting them. This is particularly useful for massive wordlists that would otherwise consume significant disk space.

Supported Formats

Hashcat natively detects and decompresses the following formats during the initial loading phase:

Gzip (.gz): Widely used for standard wordlists like rockyou.txt.gz.
ZIP (.zip): Supported for individual wordlist files contained within an archive.
Note on .7z: Native support for .7z is generally not available; pointing Hashcat to a .7z file may result in it attempting to read the compressed binary data as plaintext, which will fail.

How to Use Compressed Wordlists

You can supply a compressed wordlist just as you would a standard text file:

    # Direct usage in Hashcat 6.0+
    hashcat -a 0 hash.txt wordlist.txt.gz

Manual Decompression (Piping)

If you are using an older version of Hashcat or a format it doesn't natively support (like .7z), you can pipe the decompressed output directly into Hashcat's standard input (stdin):

    # Using gunzip for .gz files
    gunzip -c wordlist.txt.gz | hashcat -a 0 hash.txt
    # Using 7z for .7z files
    7z e -so wordlist.7z | hashcat -a 0 hash.txt

Performance & Trade-offs

Disk vs. CPU: Compressed wordlists save massive amounts of disk space but require a small amount of CPU overhead for real-time decompression.

When using native support, Hashcat still needs to decompress the file once to build a dictionary cache (to analyze statistics like password counts). This may cause a delay at the start of the attack.

Piping Limitations: If you use the piping method, Hashcat cannot build a dictionary cache because it doesn't know the full size of the input. This means you will not see an accurate ETA or progress bar for the overall wordlist.

Alternative Tools: For generating and automatically compressing massive custom wordlists, high-performance wordlist generators can feed their output directly into a compressor.

Source: "Using Hashcat to load a compressed wordlist", Super User, 23 Dec 2018.
Wordlists (dictionaries) for password cracking can be huge, sometimes tens or hundreds of gigabytes. Compressed formats like .gz, .bz2, .xz, or .7z save disk space and bandwidth. However, Hashcat versions before 6.0 do not read compressed files directly, and later versions read only some of these formats natively.
You have two main options:
crunch 8 8 abc123 | gzip > custom_8char.gz
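Before starting a long attack against a freshly generated archive, it may be worth confirming that the file is a complete gzip stream, since an interrupted generator can leave a truncated file behind. A minimal sketch, where `verify_gzip` is a hypothetical helper name wrapping gzip's built-in integrity test:

```shell
#!/usr/bin/env bash
# Hypothetical helper: return success only if the file passes
# gzip's integrity test (gzip -t reads the whole stream and checks it).
verify_gzip() {
    gzip -t -- "$1" 2>/dev/null
}

# Usage:
#   verify_gzip custom_8char.gz && echo "archive OK"
```

The check reads the entire file, so on very large wordlists it takes roughly as long as one full decompression pass.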
Later, use it with Hashcat:
zcat custom_8char.gz | hashcat -a 0 -m 1800 hash.txt
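Because piped input has no known length, one possible workaround is to count the candidates beforehand and keep that number as a rough denominator for whatever progress figure is available during the run. A minimal sketch, where `count_candidates` is a hypothetical helper name; it assumes one candidate per line and a trailing newline on the last line:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count lines in a gzip-compressed wordlist
# without writing the decompressed data to disk.
# Note: wc -l counts newline characters, so a final line without a
# trailing newline is not included in the total.
count_candidates() {
    gzip -cd -- "$1" | wc -l | tr -d '[:space:]'
}

# Usage:
#   total=$(count_candidates custom_8char.gz)
#   echo "Candidates in wordlist: $total"
```

Counting requires a full decompression pass of its own, so it is only worthwhile when the same wordlist will be piped more than once or when the attack itself is expected to run far longer than the count.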