Wednesday, 18 January 2012

Passphrase generator in bash

[EDIT]

The script below has been improved upon and can be found here: http://blog.0x10.co.uk/2013/04/passphrase-generators-part-ii.html

Passwords are dead. Long live passphrases. Or something like that anyway. As perfectly summarized by XKCD here http://xkcd.com/936/, passwords are hard to remember and easy to guess. Passphrases on the other hand are easy to remember and hard to guess.
To help generate passphrases, I've put together a little bash script.
Firstly, download the dictionary containing 10000 most common English words. If you have a 4 word passphrase, thats 10^16 possible combinations.
$ wget http://wortschatz.uni-leipzig.de/Papers/top10000en.txt
Now save the following script:
 
#!/bin/bash

dict=$1
script=$2

chars="a-zA-Z0-9 "
size=$(cat $dict | wc -l)
echo "#!/bin/bash" > $script
( echo -n "dict=( "; i=0; cat $dict | while read word; do echo -n "$word " | tr -c -d "$chars"; if [ "$i" -eq "10" ]; then i=1; echo ""; fi; ((i++)); done; echo ")" ) >> $script

echo "size=$size" >> $script
echo 'count=${1:-10}
words=${2:-4}

for i in $(seq 1 $count); do
    first=1
    for j in $(seq 1 $words); do
        r=$RANDOM
        let "r %= $size"
        word=${dict[$r]}
        if [ "$first" == "0" ]; then
            echo -n " "
                fi
        echo -n "$word"
        first=0
    done
    echo ""
done
' >> $script
chmod +x $script
Using modulo on a random number? tut tut tut. Anyway, I've called this script make.sh.
Now run it, giving it the dictionary file and an output script name:
$ ./make.sh top10000en.txt ppgen.sh
What it basically does is embed an array of words into ppgen.sh plus the code necessary to show passphrases.
It will look something like:
#!/bin/bash
dict=( the of to and a in for is The that on 
said with be was by as are at from 
it has an have will or its he not 
were which this but can more his been would 
...
definite fragile rewards antiabortion respects careers backers seize inefficient 
conceptual densities EPS Me Sparc spirits experimentally shallow )
size=10000
count=${1:-10}
words=${2:-4}

for i in $(seq 1 $count); do
    first=1
    for j in $(seq 1 $words); do
        r=$RANDOM
        let "r %= $size"
        word=${dict[$r]}
        if [ "$first" == "0" ]; then
            echo -n " "
                fi
        echo -n "$word"
        first=0
    done
    echo ""
done
We can now run it:
$ ./ppgen.sh 
Does Brazil quotas message
amount losers sing Oregon
patient misleading rebels smoking
refuse subordinated clerk wrote
have handle Rockwell strong
Committee Insurance consumer else
tariffs Unit mouth loved
V singled graduate kidney
uranium Hungary scientists Mar
fined Hercules Kemp Indian
Or, if we wanted to create just one 5 word phrase:
$ ./ppgen.sh 1 5
Commodity dioxide Boris unit drop

4 comments:

  1. How about:

    read n; i="1"; while [ $i -le $n ]; do cat /etc/dictionaries-common/words | shuf | head -1; i=$(( $i + 1 )); done

    ?

    Especially considering:

    $ wc -l /etc/dictionaries-common/words
    98569 /etc/dictionaries-common/words

    :)

    ReplyDelete
  2. r, thats also a valid approach. I wanted to embed the dictionary into the script so that the script could be moved around easily.

    ReplyDelete
  3. Great idea, but insufficient because $RANDOM is an internal Bash function (not a constant) that returns a pseudorandom [1] integer in the range 0 - 32767. It should not be used to generate an encryption keys. In other words the 'periodicity' which is the length of the repetition-freeness of the sequence is completely insufficient for your purposes.

    You will find that the repetition of random numbers gives you far less than your expected 10^16 which assumes a far better random number generator such the Mersenne twister algorithm or the random number generators in R. Of course I realise the choice of $RANDOM was portability, but it maybe worth modelling the differences in R.

    $RANDOM = 1:32767
    Mersenne Twister = 1:4.3x10^6,001

    ReplyDelete
  4. Hey Dennis, thanks for the feedback. It was done originally as a quick and dirty alternative to trying to make up 4 random words. However, as you pointed out, using $RANDOM isn't secure at all. I have improved it here http://blog.0x10.co.uk/2013/04/passphrase-generators-part-ii.html

    ReplyDelete