Most Ledger Nano owners wonder at some point whether they can customize their seed. Some go even further and try entering randomly chosen words in the recovery option. In that case, after confirming the last word, they almost certainly see the message:
"Recovery phrase is invalid, retry"
Does this mean that creating a new wallet from your own set of words is impossible?
Here comes a nice surprise: it is possible, though with some limitations. But let's start from the beginning.
The first restriction is that all the words must come from the BIP39 standard word list
(BIP stands for Bitcoin Improvement Proposal, and this one is number 39). You can find out what exactly Bitcoin Improvement Proposals are
here.
At its core, BIP39 defines a list of 2048 words, chosen so that the first 4 letters
of every word are unique. This rule holds even for 3-letter words.
This property makes the words easier to remember when you combine several of them into a sequence, which is exactly what your seed is.
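The uniqueness property is easy to check for yourself. The following is a minimal Python sketch: the function name `prefixes_are_unique` is made up for this illustration, and the short sample stands in for the full 2048-word list, which you would need to load from your own copy of the word list.

```python
def prefixes_are_unique(words, prefix_len=4):
    """Return True if no two words share the same first `prefix_len` letters."""
    # A 3-letter word such as "act" is its own prefix ("act"),
    # which still differs from "acti" (action) and "acto" (actor).
    prefixes = [word[:prefix_len] for word in words]
    return len(set(prefixes)) == len(prefixes)


# A handful of BIP39 words used as a stand-in for the full list.
sample = ["abandon", "ability", "able", "about", "act", "action", "actor"]
print(prefixes_are_unique(sample))  # True
```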
The second restriction concerns the last word of your sequence. It carries a checksum determined
by all the other words in the sequence, so you can't just choose it freely. This is exactly why we see
an error message whenever we enter randomly chosen words as a seed on our Ledger Nano.
Usually more than one word satisfies the checksum condition, but the chance of finding
a valid one by simple random drawing is small. The number of words satisfying the checksum condition
depends strictly on the number of seed words. For a 24-word seed (the most secure one) there are always exactly
8 words that satisfy the checksum.
Thus the chance of drawing a valid one from the 2048 BIP39 words is exactly 8 in 2048.
So the probability is as little as 0.00390625, that is 1 in 256.
Well, it is no wonder that you saw an error message.
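The arithmetic behind these numbers can be sketched in a few lines of Python. The helper names below are invented for this example. The sketch assumes the BIP39 rule that every word encodes 11 bits and that the checksum takes 1 bit for every 32 bits of entropy, which works out to one checksum bit per 3 words.

```python
from fractions import Fraction


def valid_last_words(word_count):
    """Number of last words that satisfy the checksum for a given seed length."""
    total_bits = word_count * 11      # every word encodes 11 bits
    checksum_bits = total_bits // 33  # 1 checksum bit per 32 entropy bits
    free_bits = 11 - checksum_bits    # bits of the last word you may choose
    return 2 ** free_bits


def chance_of_random_hit(word_count):
    """Probability that a randomly drawn last word is valid."""
    return Fraction(valid_last_words(word_count), 2048)


print(valid_last_words(24))             # 8
print(float(chance_of_random_hit(24)))  # 0.00390625
```

For a 24-word seed the last word holds 3 freely chosen bits and 8 checksum bits, which is where the 8 valid words come from.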
So how can we find a valid checksum word for a sequence of our own choice? As an example, let's take a random set of 23 words from the BIP39 list:
Every word has an index, a number given by its position in the alphabetically ordered list. It is important to start numbering the indexes from 0. Each index should be written as a binary value. As the BIP39 list contains 2048 words, every index must be written with exactly 11 bits, padded with leading zeros where needed. Thus we find:
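The conversion from a zero-based index to its 11-bit form can be sketched in Python as follows. The function name `index_to_bits` is an assumption made for this illustration.

```python
def index_to_bits(index):
    """Write a zero-based BIP39 word index as an 11-character bit string."""
    if not 0 <= index < 2048:
        raise ValueError("a BIP39 index must be between 0 and 2047")
    # "011b" means: binary, padded with leading zeros to a width of 11.
    return format(index, "011b")


print(index_to_bits(0))     # 00000000000
print(index_to_bits(39))    # 00000100111
print(index_to_bits(2047))  # 11111111111
```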
In the next step we concatenate all the indexes, preserving their order of appearance. This gives a 253-bit binary word (23 × 11 = 253):
We extend the result with 3 additional bits of our own choice.
Remember that we said a 24-word seed has exactly 8 possible checksum words?
The number 8 is simply a consequence of those 3 bits, since 3 bits can take 8 different values.
We get all 8 results by putting 000, 001, 010, 011, 100, 101, 110 and 111 here in turn.
For this example we choose: 000.
We append it at the end, which results in a 256-bit binary word:
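A minimal Python sketch of this concatenation step is shown below. The name `build_entropy` is invented for the example, and the inputs used here are placeholders, not the 23 words of the article. The function returns the 256 bits packed into 32 bytes, which is the form needed for hashing.

```python
def build_entropy(indices, extra_bits):
    """Join 23 eleven-bit indexes and 3 chosen bits into 32 bytes."""
    if len(indices) != 23 or any(not 0 <= i < 2048 for i in indices):
        raise ValueError("expected 23 indexes between 0 and 2047")
    if len(extra_bits) != 3 or set(extra_bits) - {"0", "1"}:
        raise ValueError("extra_bits must be 3 characters, each 0 or 1")
    # 23 * 11 = 253 bits from the words, plus the 3 chosen bits = 256 bits.
    bits = "".join(format(i, "011b") for i in indices) + extra_bits
    # Pack the bit string into 32 raw bytes, most significant bit first.
    return int(bits, 2).to_bytes(32, "big")


# Placeholder input: index 0 repeated 23 times, with 000 appended.
entropy = build_entropy([0] * 23, "000")
print(len(entropy) * 8)  # 256
```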
The next step is to calculate the SHA256 hash of this binary data. Note that the hash is taken over the 32 raw bytes that the 256 bits represent, not over a text string of 0 and 1 characters.
The SHA256 algorithm (Secure Hash Algorithm, 256-bit output) is an essential part of blockchain technology.
You can find more about what SHA256 is here.
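The hashing step can be sketched with Python's standard `hashlib` module. The function name `sha256_of_entropy` is an assumption for this example, and the input is a placeholder of 32 zero bytes rather than the data from the article. The second hash shows a common mistake: hashing the text of zeros and ones gives a different, unusable result.

```python
import hashlib


def sha256_of_entropy(entropy):
    """Return the SHA256 digest of 32 raw entropy bytes as a hex string."""
    if len(entropy) != 32:
        raise ValueError("expected exactly 32 bytes (256 bits)")
    return hashlib.sha256(entropy).hexdigest()


entropy = bytes(32)  # placeholder: 256 zero bits packed into 32 bytes
digest = sha256_of_entropy(entropy)

# Hashing the 256-character text "000...0" instead of the raw bytes
# produces a different digest and therefore a wrong checksum.
wrong = hashlib.sha256(("0" * 256).encode("ascii")).hexdigest()

print(digest)
print(digest != wrong)  # True
```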
As a result of this calculation we get the following hexadecimal word:
The next move is to extract the first 8 bits of the result above. So we take the first 2 hex characters: 27.
They correspond to:
00100111
Once again we use the same 3 bits we chose before. This time we prepend them to the extracted binary string. As a result, we get:
00000100111
which is the binary value of 39. This is the BIP39 index of the checksum word we are looking for. Don't forget that the index numbering starts from 0. So if you use a BIP39 word list numbered from 1, the actual index number will be 40. We look up this index on the BIP39 list and get the word: agent
Finally, the 24-word seed containing 23 words of our own choice is:
For the remaining 3-bit combinations, the last checksum words are:
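The whole procedure can be put together in a short Python sketch. This is an illustration under stated assumptions, not a wallet tool: the function name `last_word_indices` is invented here, it works on zero-based indexes only, and looking the results up in the word list is left to you. The placeholder input is index 0 repeated 23 times, not the words used in this article.

```python
import hashlib


def last_word_indices(first_23_indices):
    """Return the 8 zero-based indexes that are valid as the 24th word."""
    if len(first_23_indices) != 23:
        raise ValueError("expected exactly 23 word indexes")
    if any(not 0 <= i < 2048 for i in first_23_indices):
        raise ValueError("every index must be between 0 and 2047")

    # Concatenate the 23 indexes, 11 bits each, into one 253-bit number.
    prefix = 0
    for index in first_23_indices:
        prefix = (prefix << 11) | index

    results = []
    for extra in range(8):  # the 3 chosen bits: 000 through 111
        # Append the 3 bits and pack the 256 bits into 32 raw bytes.
        entropy = ((prefix << 3) | extra).to_bytes(32, "big")
        # The checksum is the first byte (8 bits) of the SHA256 digest.
        checksum = hashlib.sha256(entropy).digest()[0]
        # Last word index = the 3 chosen bits followed by the 8 checksum bits.
        results.append((extra << 8) | checksum)
    return results


candidates = last_word_indices([0] * 23)
print(len(candidates))  # 8
```

With the chosen bits 000 and a first digest byte of hex 27, the expression `(0 << 8) | 0x27` gives 39, matching the index found by hand above.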