Tough nut to crack
To exercise my (almost nonexistent) Python skillz, I wrote a script to generate passwords and passphrases from common words. The script (on GitHub) also computes the bit entropy of the generated passwords and compares it against that of the plain words (without obfuscation).
As a bonus, I also wrote a generator for XKCD's passphrase scheme: choose four random words from a word list, resulting in a higher entropy and better recall.
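For illustration, here is a minimal sketch of that idea. It is not the code from my script: the function name `xkcd_passphrase`, its parameters, and the tiny sample list are made up for this example, and it uses the standard `secrets` module so that each word is drawn uniformly at random.

```python
import secrets


def xkcd_passphrase(words, count=4, sep=""):
    """Join `count` words, each drawn uniformly at random from `words`.

    Hypothetical helper for illustration only.
    """
    return sep.join(secrets.choice(words) for _ in range(count))


# Tiny stand-in list; a real run would load a large dictionary file.
sample_words = ["correct", "horse", "battery", "staple", "bubble", "squad"]
print(xkcd_passphrase(sample_words, sep="-"))
```

Words are drawn with replacement, so the same word can appear twice. With a large word list that is rare, and it keeps the entropy arithmetic simple.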
The script uses four schemes (well, three, if you don't count the plain-text version, which is the control): simple character substitutions (e.g. "@" for "a", "3" for "e", similar to "l337 $p3a|<") with padding between words; padding short passwords with random characters; and the XKCD algorithm.
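The substitution and padding steps could look roughly like the sketch below. The substitution table `LEET`, the function names, and the choice of digits as the default padding alphabet are all assumptions for this example; the table in the actual script may well differ.

```python
import secrets
import string

# Hypothetical substitution table, not necessarily the one the script uses.
LEET = {"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"}


def substitute(word, table=LEET):
    """Replace every character that has an entry in `table`."""
    return "".join(table.get(ch, ch) for ch in word)


def pad(word, target_len, alphabet=string.digits):
    """Append random characters from `alphabet` until `word` reaches `target_len`."""
    missing = max(0, target_len - len(word))
    return word + "".join(secrets.choice(alphabet) for _ in range(missing))


print(substitute("password"))  # p@$$w0rd
print(pad("apyrene", 16))      # "apyrene" followed by 9 random digits
```

A word that is already at least `target_len` characters long is returned unchanged.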
Here's a test run using the word list from Linux, /usr/share/dict/words. Each line shows the password, its length, and its entropy in bits:
Cryptogamia738314396617973004177668 35 208.40
0nl3pYUG%/50$?9&(.)+.70:01<[=_: 31 203.19
resoaked16166235184403414647125208 34 202.44
flophouses37070490830801046024 30 178.63
nontaxonomic1968304520123681 28 166.72
ferrocerium42507400944791500 28 166.72
^#fILLIPIng/@-2/1;. 19 124.54
?&p5y(HROM37rI(2l 17 111.43
]NigHt-cELlAr\]4? 17 111.43
squads-leftpertestself-regulatedecarburized 43 75.68
arienzomiscounselingtritheocracyreverberantly 45 75.68
pseudepigraphic 15 70.51
bubble-and-squeakMariandDorkus 30 56.76
Lyonaisquadriannulatebumblepuppy 32 56.76
re-excitelactoidununderstandable 32 56.76
subthreshold 12 56.41
bioscientist 12 56.41
francisca 9 42.30
apyrene 7 32.90
The padding scheme produces passwords with higher bit entropy, understandably, because of the length. Complexity-wise, though, they won't cut it since they don't have enough character classes (mix of upper- and lower-case, punctuation marks, and digits). The character-substitution-plus-padding scheme does a better job, while XKCD-style is good enough.
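To make the character-class point concrete, here is a small sketch that reports which classes a password contains. The function name `character_classes` and the four class labels are my own for this example, and it only looks at ASCII characters.

```python
import string


def character_classes(password):
    """Return the set of ASCII character classes present in `password`."""
    classes = set()
    for ch in password:
        if ch in string.ascii_lowercase:
            classes.add("lower")
        elif ch in string.ascii_uppercase:
            classes.add("upper")
        elif ch in string.digits:
            classes.add("digit")
        elif ch in string.punctuation:
            classes.add("punctuation")
    return classes


# Two passwords from the test run above.
print(sorted(character_classes("flophouses37070490830801046024")))
# ['digit', 'lower']
print(sorted(character_classes("]NigHt-cELlAr\\]4?")))
# ['digit', 'lower', 'punctuation', 'upper']
```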
(Caveat: I'm not a mathematician. This is a purely "pedestrian", hence loose, interpretation of entropy based on my limited understanding of the concept, but I'd like to think that the math is sound. :-)
Note that I calculated the entropy differently for XKCD than for the other schemes. For XKCD, I assumed a fixed-size data set from which to derive the passphrases, so the calculation is:
$$E = l \cdot \log_2 R$$
where $E$ = entropy
$l$ = number of words in the passphrase
$R$ = total number of words in the data set
For the other schemes, I use the classic definition for calculating entropy, which is the same as above, but $l$ is the length of the password in characters, and $R$ is the number of possible characters.
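Both calculations boil down to the same one-liner. The sketch below is an illustration, not the script itself, and the name `entropy_bits` is made up for it. The pool sizes 26, 62, and 94 are my inference from the test run: they reproduce the printed numbers for the plain, padded, and substituted passwords respectively.

```python
import math


def entropy_bits(length, pool_size):
    """E = l * log2(R): `length` symbols, each drawn from `pool_size` options."""
    return length * math.log2(pool_size)


# Plain word "apyrene": 7 characters, 26 lowercase letters.
print(f"{entropy_bits(7, 26):.2f}")   # 32.90
# Padded password of 35 characters, 62 letters and digits.
print(f"{entropy_bits(35, 62):.2f}")  # 208.40
# Substituted password of 31 characters, 94 printable ASCII characters.
print(f"{entropy_bits(31, 94):.2f}")  # 203.19
```

For an XKCD passphrase the same function applies, with `length` set to the number of words (4) and `pool_size` set to the number of words in the list.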
I also tested the generated passwords against several online strength testers. Interestingly, Wolfram Alpha provides a great password strength tester. It also suggests passwords, based on your input.
Next step is to write a generator for the Schneier scheme: take a memorable sentence, add some memorable tricks (character substitutions, padding), then turn it into a password. That should be fun.