
hashcat Forum

On-the-fly loading of gz wordlists
Quote:
* changes v5.1.0 -> v6.0.0
(...)
- Support on-the-fly loading of compressed wordlists in zip and gzip format
Seen in hashcat beta 1803.
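For context, this means a compressed wordlist can now be passed directly as the dictionary argument, with no manual decompression step. A minimal invocation would look something like this (file names are placeholders):

Code:
$ hashcat -a 0 -m 0 hashes.txt wordlist.txt.gz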

Just tried with a gz file:

Code:
$ wc -l  dic*
314'265 dic.gz
19'487'556 dic
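(Side note on the comparison above: wc -l on the .gz itself only counts newline bytes that happen to occur in the compressed stream, so 314'265 is not a number of dictionary entries. Counting through a pipe gives the real figure, assuming dic.gz really is a compressed copy of dic:)

Code:
$ gunzip -c dic.gz | wc -l    # should match the 19'487'556 lines of the plain dic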

Hashcat sees:
Code:
Dictionary cache built:
* Filename..: dic.gz
* Passwords.: 314266
* Keyspace..: 205218

Progress.........: 205218/205218 (100.00%)

Edit:
Code:
$ file dic.gz
dic.gz: gzip compressed data, last modified: Sun Apr 12 05:01:34 2020, from Unix


Looks like the file has not been unzipped?

Thanks.
WFM:

Code:
$ hashcat --version
v5.1.0-1789-gc7da6357

$ file rockyou.txt.gz
rockyou.txt.gz: gzip compressed data, was "rockyou.txt", last modified: Thu Jul  7 14:27:39 2016, from Unix

$ wc -l rockyou.txt.gz
213432 rockyou.txt.gz
$ gunzip -cd rockyou.txt.gz | wc -l
14344391

$ cat brandon.hash
fc275ac3498d6ab0f0b4389f8e94422c

$ hashcat -m 0 -w 4 -O -D 2 brandon.hash rockyou.txt.gz --potfile-disable
[...]
Dictionary cache built:
* Filename..: rockyou.txt.gz
* Passwords.: 14344391
* Bytes.....: 139921497
* Keyspace..: 14344384
* Runtime...: 3 secs
[...]
fc275ac3498d6ab0f0b4389f8e94422c:brandon
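The changelog quoted at the top mentions zip as well as gzip, so the same check should also work with a zipped wordlist; something along these lines (archive name is a placeholder):

Code:
$ zip rockyou.zip rockyou.txt
$ hashcat -m 0 -w 4 -O brandon.hash rockyou.zip --potfile-disable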
Weird... I may have messed something up in the compression step.
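In case the compression step really was the problem: recreating the archive is just a plain gzip of the wordlist, and gzip -t will confirm the result is a valid archive (-k needs GNU gzip 1.6 or newer):

Code:
$ gzip -k dic        # writes dic.gz next to dic, keeps the original
$ gzip -t dic.gz     # integrity test; no output means the archive is fine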
From a performance perspective I can't see any loss of speed. How does hashcat handle gigabytes of compressed wordlists without spending minutes or hours on decompression?
I'm not sure, but I think the decompressed data can be used "on the fly": each chunk is consumed as soon as it's decompressed, without waiting for the entire file to be decompressed.
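As an illustration of the same streaming idea outside hashcat (big_wordlist.gz and hashes.txt are placeholders): a downstream consumer starts receiving candidates as soon as the first blocks are decompressed, and the plaintext is never written to disk:

Code:
$ gunzip -c big_wordlist.gz | head -n 3                  # returns immediately, even for a multi-GB archive
$ gunzip -c big_wordlist.gz | hashcat -m 0 hashes.txt    # the usual workaround before this feature: feed hashcat via stdin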
OK, but what about "Dictionary cache building"? Doesn't hashcat have to decompress the whole file first to gather the statistics, number of passwords, etc.?
Yes, that seems right. It would have to decompress the whole thing once, enough to analyze the statistics, and then cache them. So I assume there is some duplicated work, just like dictionary cache building for an uncompressed wordlist.
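A rough way to see the caching in practice (reusing rockyou.txt.gz and brandon.hash from the earlier reply) is to run the same attack twice; the first run should print the "Dictionary cache built" block and pay the counting cost, while the second should report a cache hit and skip straight to cracking:

Code:
$ hashcat -m 0 brandon.hash rockyou.txt.gz --potfile-disable    # builds the dictionary cache
$ hashcat -m 0 brandon.hash rockyou.txt.gz --potfile-disable    # reuses the cached statistics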