Not worth creating a project for, and it might be interesting to see what changes people would make.
Non-standard dependencies: zsh, GNU sed (for \U in the replacement), and the American English word list at /usr/share/dict/american-english.
#!/usr/bin/zsh
# Author: @[email protected]
# 2025-02-23
final=(xargs echo)
count=6

while getopts d opt; do
    case $opt in
        d)
            final=(tr 'A-Z' 'a-z')
            ;;
        *)
            printf "Password generator based on the correcthorse algorithm from http://xkcd.com/936/\n\n"
            printf "USAGE: %s [-d] [#]\n" "$0"
            printf "  -d   make the result all lower case; otherwise, each word will be capitalized.\n"
            printf "  #    the number of words to include. Defaults to 6.\n"
            exit 1
            ;;
    esac
done
shift $((OPTIND - 1))
[[ $# -gt 0 ]] && count=$1

# Pull 2N words, strip apostrophe suffixes, capitalize, dedupe,
# pick N of the survivors, and join them into a single string.
shuf -n $((count * 2)) /usr/share/dict/american-english | \
    sed 's/'"'"'.*//; s/^\(\w\)/\U\1/' | \
    sort | uniq | shuf -n $count | xargs echo | \
    tr -d ' ' | $final
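Assuming the script is saved on your PATH as pony (my name for it, borrowed from the alias below), usage looks like:

$ pony         # six capitalized words run together
$ pony -d 4    # four words, all lower case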
What’s going on here:
Nearly 30% of the American dictionary (34,242 words) contains apostrophes. They could be left in to help satisfy password requirements that demand “special characters,” but correcthorse isn’t an algorithm that handles idiot “password best practices” well anyway. So, since every word with an apostrophe has a pair word without one, we pull 2·N words to make sure we have enough. Then we strip out the plural/possessive suffixes and capitalize every word. Then we remove duplicates and select our N words from the result. Finally, we compact that into a space-less string of words, and if the user passed the -d option, we downcase the entire thing.
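For example, the sed stage strips everything from the apostrophe onward and upper-cases the first letter (the \U is a GNU sed extension):

$ printf "dad's\nhorse\n" | sed 's/'"'"'.*//; s/^\(\w\)/\U\1/'
Dad
Horse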
Without the user options, this really could be a 1-liner; that’s how it started:
alias pony="shuf -n 12 /usr/share/dict/american-english | sed 's/'\"'\"'.*//; s/^\(\w\)/\U\1/' | sort | uniq | shuf -n 6 | xargs echo | tr -d ' '"
Finally got around to reviewing this and it’s surprisingly efficient. I’ve considered myself a pretty advanced Basher for a while and admit learning better technique from this. More specifically, I was unaware of shuf after years and years of Bash scripting. Cheers!

That’s the one that made me make this script. I thought, surely this is a one-liner? And indeed, except that I kept adding features. I think I’m going to change my shell function back to the one-liner though. All of that extra complexity is unnecessary.
Looks nice - though I don’t feel great about the 2n solution to apostrophes. You could just as well end up with 2n words with apostrophes, no? It’s not particularly robust.
With n=6, and only grabbing n words, you have roughly an 88.24% chance of getting at least one word with an apostrophe (1 - 0.7^6 ≈ 0.8824, at a 30% apostrophe rate), i.e. you can’t generate a valid passphrase. With n=6, and grabbing 2n words, you have roughly a 3.86% chance of getting at least 7 words with an apostrophe (the binomial tail for 12 draws at p = 0.3), i.e. you can’t generate a valid passphrase. That’s more than a 1-in-30 failure rate!
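A quick sanity check of those figures, treating each draw as an independent 30% chance of an apostrophe:

awk 'BEGIN { printf "%.4f\n", 1 - 0.7^6 }'   # P(at least 1 of 6) = 0.8824
awk 'BEGIN {
    p = 0.3; n = 12; tail = 0
    for (k = 7; k <= n; k++) {
        c = 1
        for (i = 0; i < k; i++) c *= (n - i) / (i + 1)   # C(n,k)
        tail += c * p^k * (1 - p)^(n - k)
    }
    printf "%.4f\n", tail   # P(at least 7 of 12) = 0.0386
}'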
If every apostrophed word has a non-apostrophe pair, as you say, then perhaps better practice would be to keep the dictionary in order and generate a (somewhat) random number as an index to grab. If the word at that index has an apostrophe, grab the next/previous word (i.e., its pair).
Might be trickier to fit into exactly 27 lines, but at least it would be robust?
Edit: in hindsight, since the dictionary is required for this to function in the first place, you could just pre-prune it to remove the apostrophe words.
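That pre-pruning is a one-time grep (a sketch; the cache path here is arbitrary):

grep -v "'" /usr/share/dict/american-english > ~/.local/share/pony-words
shuf -n 6 ~/.local/share/pony-words | sed 's/^\(\w\)/\U\1/' | xargs echo | tr -d ' '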
I don’t feel great about the 2n solution to apostrophes. You could just as well end up with 2n words with apostrophes, no? It’s not particularly robust.
It doesn’t matter - the algorithm takes the stems, it doesn’t drop the words. “Dad’s” becomes “Dad”. If you get both “Dad’s” and “Dad”, you might indeed get a passphrase containing “DadDad” - but that’s not a weakness. Good randomness doesn’t include a guarantee of no duplicates. In fact, the uniq call reduces the quality of the passphrase: “DadDadDadDadDadDad” is a perfectly good phrase.

But it’s a good catch in another way: I’d considered only plurals and possessives, but the American dictionary word file does indeed include many words with more than one apostrophe suffix. No word of more than one letter appears more than 5 times, so 5n would guarantee enough different words. But the best thing about your comment is that it exposes another weakness: the dictionary contains several 1-letter “words”, and one of them - “O” - has 25 variations with apostrophes. They’re all names: “O’Connell”, “O’Keefe”, etc. The next largest is “L” with 8 variations: all borrowed words from French, such as “L’Amour”.
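You can check those counts yourself by tallying the stems of the apostrophed entries (a quick sketch; the top of the list should be “O” with its 25 variations):

sed -n "s/'.*//p" /usr/share/dict/american-english | sort | uniq -c | sort -rn | head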
I don’t see a simple solution to excluding names, although a tweak could ensure that we get no single letter words. However, maybe simplifying the algorithm would be better: simply grab N words and delete any apostrophes. You might end up with mush like “OBrianMustveHed”, but perhaps that’s not a bad thing.
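A sketch of that simplification:

shuf -n 6 /usr/share/dict/american-english | sed 's/^\(\w\)/\U\1/' | tr -d "'" | xargs echo | tr -d ' '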
Perhaps the best implementation would be the simplest:
alias pony="shuf -n 6 /usr/share/dict/american-english | xargs echo | tr -d ' '"
Leave in the apostrophes; more random bits. Leave in the spaces, if they’re legal characters in the authentication program, and you get even more.
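Keeping the spaces is just a matter of dropping the final tr:

alias pony="shuf -n 6 /usr/share/dict/american-english | xargs echo"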
Aaaaah I totally misunderstood why you were taking 2n. You were taking 2n in case the truncated string was the same as one you already had. Makes more sense now.
I opened this in a browser tab on my phone so that I can remember to review on my computer when I get home. Cheers for sharing.