Partial Passwords - How?
If a system asks users for two letters, the database of the system will store all possible pairs of letters that appear in a user password - all "obfuscated" (e.g., with a cryptographic hash function). We say obfuscated because brute-force attack on data with about 10 bits of entropy (the number of all possible options being 1,000) is trivial.
Those systems can not ask for more than two letters because more letters would require a lot more database space. A two-letter partial password system (with eight characters' passwords) requires 56 strings (hash values), while a three-letter system would require 336 strings - easily taking more than 5kB of data for a single password.
It appears that most companies indeed store all possible combination of letters a user may enter in a database. The second option is to store the password unencrypted or encrypted with a symmetric algorithm (e.g. 3DES, AES). Passwords cannot be stored as hash values as the system needs to have access to plaint text passwords to compare the letters entered by users.
Better SolutionWe have suggested a solution based on Shamir secret sharing scheme. This saves database space and it is also fast as only a polynomial computation (in the simplest form) is needed for verification.
A. Global Parameters
At the beginning, someone has to define global parameters of the system. Well, there is actually just one - how many letters will users have to select. Let us call the parameter N. The maximum length of password (L) is important only from the database point of view - you will need to store a 32b long number for each character.
B. adding New User
- User chooses his/her password P. It consists of letters p1, p2, p3, ... pk.
- The system will generate at least 32 bits' long secret key - K - unique for each user. 32 bits is enough if N is 3, for larger N (the number of letters users have to correctly select each time) you may want to increase the length of K.
- The system will also generate N-1 32 bits' long random numbers R1, R2, ... R(N-1)
- The next step is a computation of k points (k being the length of the password) on a polynomial: y=K+R1*x + R2*x^2 + ... + R(N-1)*x^(N-1), for x = 1, 2, ... k. Let us denote the results as y1, y2, ..., y(N-1).
- Values s1=(y1-p1), s2=(y2-p2), ... sk=(yk-pk) is stored in the database. Each number takes 32 bits. One will also need to store K, or the hash of K.
The next part of the system is user authentication, which is very simple and fast.
- The system selects N positions in the password - i1, i2, ... iN.
- A user selects N letters from her/his password at specified positions so that we have pairs (p'1, i1), (p'2, i2), ..., (p'N, iN).
- The system recovers yi values for indices i selected in step 1 - simply just adding stored values (see step 5 above) to values p'i entered by the users.
- Now we have to solve the polynomial equation to obtain K'. The equation for that looks horribly but it is quick to compute and can even be partially pre-computed as it uses indices (positions of letters): K' = \sum_i [ yi * [ (\PI_j (j) ) / (\PI_j (i-j)) ] ], where i and j run over i1, i2, ..., iN (step 1), and j skips the actual selected i.
Example: let's say that user selected 2nd, 3rd, and 4th letter, the solution will be: K'=y2*( 3*4/[(2-3)*(2-4)] )+y3*( 2*4/[(3-2)*(3-4)] ) + y4*( 2*3/[(4-2)*(4-3)] )
- The last step is to compare K and K'. If they are equal, user entered correct values and is logged in.
One note here - the secret is not K, but values yi reconstructed when user enters his/her letters.
Pros and Cons
The highlights of this solution are:
- The number of letters for authentication and the length of passwords is not really limited.
- The database space that is needed to store all necessary information is linear with the length of a password, not quadratic.
- Secrets are not stored in plain-text and need certain computation.
- All the data can be still encrypted for storage if needed.
- Faster to compute - verification is several multiplications of 32 bit numbers that is much much faster than computing a hash value.
The low points are:
- It is not a straight forward solution.
- The security is still not increased beyond the difficulty of finding the correct K letters.