This is an old revision of this page, as edited by 81.156.74.241 (talk) at 23:44, 18 September 2006 (→Padding schemes). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
This article is about an algorithm for public-key encryption. For other meanings, including the company that first marketed the algorithm, see RSA (disambiguation).
In cryptology, RSA is an algorithm for public-key encryption. It was the first algorithm known to be suitable for signing as well as encryption, and one of the first great advances in public key cryptography. RSA is still widely used in electronic commerce protocols, and is believed to be secure given sufficiently long keys.
History
The algorithm was publicly described in 1977 by Ron Rivest, Adi Shamir and Len Adleman at MIT; the letters RSA are the initials of their surnames. Apocryphally, it was invented at a Passover seder in Schenectady, N.Y.
Clifford Cocks, a British mathematician working for the UK intelligence agency GCHQ, described an equivalent system in an internal document in 1973, but given the relatively expensive computers needed to implement it at the time, it was mostly considered a curiosity and, as far as is publicly known, was never deployed. His discovery, however, was not revealed until 1997 due to its top-secret classification, and Rivest, Shamir, and Adleman appear to have devised RSA independently of Cocks' work.
The algorithm was patented by MIT in 1983 in the United States as U.S. patent 4,405,829. The patent expired on 21 September 2000. Since the algorithm had been published prior to filing the patent application, regulations in much of the rest of the world precluded patents elsewhere. Had Cocks' work been publicly known, a patent in the US would not have been possible either.
Operation
Intuition
RSA involves two keys: a public key and a private key (a key is a constant number later used in the encryption formula). The public key can be known to everyone and is used to encrypt messages. These messages can only be decrypted by use of the private key. In other words, anybody can encrypt a message, but only the holder of the private key can actually decrypt the message and read it. Intuitive example: Bob wants to send Alice a secret message that only she can read. To do this, Alice sends Bob a box with an open lock, for which only Alice has the key. Bob receives the box, writes the message in plain English, puts it in the box and locks it with Alice's lock (now Bob can no longer read the message). Bob sends the box to Alice and she opens it with her key. In this example, the box with the lock is Alice's public key, and the key to the lock is her private key.
Key generation
Suppose Alice and Bob are communicating over an insecure (open) channel, and Alice wants Bob to send her a private (or secure) message. Using RSA, Alice will take the following steps to generate a public key and a private key:
1. Choose two large prime numbers $p$ and $q$ such that $p \neq q$, randomly and independently of each other.
2. Compute $n = pq$.
3. Compute the totient $\phi(n) = (p-1)(q-1)$.
4. Choose an integer $e$ with $1 < e < \phi(n)$ that is coprime to $\phi(n)$.
5. Compute $d$ such that $de \equiv 1 \pmod{\phi(n)}$.
- The prime numbers can be probabilistically tested for primality.
- A popular choice for the public exponent is $e = 2^{16} + 1 = 65537$. Some applications choose smaller values such as $e = 3$ instead. This is done in order to make implementations on small devices (e.g. smart cards) easier, i.e. encryption and signature verification are faster. But choosing small public exponents may lead to greater security risks.
- Steps 4 and 5 can be performed with the extended Euclidean algorithm; see modular arithmetic.
- Step 3 changed in PKCS#1 v2.0 to $\lambda(n) = \operatorname{lcm}(p-1, q-1)$ instead of $\phi(n) = (p-1)(q-1)$.
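The five steps above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `generate_keypair` is hypothetical, the primes are passed in rather than generated, and no padding or side-channel protection is involved. It assumes Python 3.8 or later, where `pow(e, -1, phi)` computes a modular inverse.

```python
from math import gcd


def generate_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA key generation from two distinct primes p and q.

    Returns ((n, e), (n, d)). Illustrative only: p and q are assumed
    to be prime and are not checked here.
    """
    if p == q:
        raise ValueError("p and q must be distinct")
    n = p * q                       # step 2: the modulus
    phi = (p - 1) * (q - 1)         # step 3: the totient of n
    if not (1 < e < phi) or gcd(e, phi) != 1:
        # step 4: e must be coprime to phi, otherwise no inverse exists
        raise ValueError("e must satisfy 1 < e < phi and gcd(e, phi) == 1")
    d = pow(e, -1, phi)             # step 5: d * e = 1 (mod phi)
    return (n, e), (n, d)


# Toy parameters, far too small for real use.
public_key, private_key = generate_keypair(61, 53, e=17)
print(public_key, private_key)      # (3233, 17) (3233, 2753)
```

Passing the primes in explicitly keeps the sketch deterministic; a real implementation would draw them from a cryptographically secure random source.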
The public key consists of
- n, the modulus, and
- e, the public exponent (sometimes encryption exponent).
The private key consists of
- n, the modulus, which is public and appears in the public key, and
- d, the private exponent (sometimes decryption exponent), which must be kept secret.
For reasons of efficiency sometimes a different form of the private key (including CRT parameters) is stored:
- p and q, the primes from the key generation,
- d mod (p-1) and d mod (q-1) (often known as dmp1 and dmq1)
- (1/q) mod p (often known as iqmp)
Though this form allows faster decryption and signing using the Chinese Remainder Theorem (CRT), it considerably lowers the security. In this form, all of the parts of the private key must be kept secret. Yet it is a bad idea to use it, since it enables side channel attacks, in particular if implemented on smart cards, which would most benefit from the efficiency gain. (Start with $y = x^e \bmod n$ and let the card decrypt that. It computes $y^d \bmod p$ and $y^d \bmod q$, whose results are combined to give some value $z$. Now, induce an error in one of the computations. Then $\gcd(z - x, n)$ will reveal $p$ or $q$.)
Alice transmits the public key to Bob, and keeps the private key secret. p and q are sensitive since they are the factors of n, and allow computation of d given e. If p and q are not stored in the CRT form of the private key, they are securely deleted along with the other intermediate values from the key generation.
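A rough sketch of CRT-based decryption, and of the fault attack described above, might look as follows. The function names and the `fault` flag are hypothetical, and the "fault" is simulated by adding 1 to the half-result modulo p; a real attack would induce a hardware error instead. The toy values p = 61, q = 53, e = 17, d = 2753 are far too small for real use.

```python
from math import gcd


def crt_params(p: int, q: int, d: int):
    """Return (dmp1, dmq1, iqmp) as listed in the CRT form of the key."""
    return d % (p - 1), d % (q - 1), pow(q, -1, p)


def crt_decrypt(c, p, q, dmp1, dmq1, iqmp, fault=False):
    """Decrypt c with two half-size exponentiations, then recombine."""
    m1 = pow(c, dmp1, p)            # c^d mod p
    if fault:
        m1 = (m1 + 1) % p           # simulated computation error (assumption)
    m2 = pow(c, dmq1, q)            # c^d mod q
    h = (iqmp * (m1 - m2)) % p      # Garner's recombination step
    return m2 + h * q


p, q, e, d = 61, 53, 17, 2753
n = p * q
params = crt_params(p, q, d)

x = 123
y = pow(x, e, n)                    # the attacker knows x and submits y
assert crt_decrypt(y, p, q, *params) == x

z = crt_decrypt(y, p, q, *params, fault=True)
leaked = gcd(z - x, n)              # z is still correct mod q, wrong mod p
print(leaked)                       # 53
```

Because the faulty result still agrees with x modulo q but not modulo p, the difference z - x shares exactly the factor q with n. Implementations commonly guard against this by verifying the result before releasing it.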
Encrypting messages
Suppose Bob wishes to send a message M to Alice. He turns M into a number m < n, using some previously agreed-upon reversible protocol known as a padding scheme.
Bob now has m, and knows n and e, which Alice has announced. He then computes the ciphertext c corresponding to m:
- $c = m^e \bmod n$.
This can be done quickly using the method of exponentiation by squaring. Bob then transmits c to Alice.
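Exponentiation by squaring can be sketched as below. The names `modexp` and `encrypt` are illustrative; Python's built-in three-argument `pow` does the same job and would normally be used instead. The example assumes the message has already been converted to an integer m with 0 <= m < n.

```python
def modexp(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent % modulus by square-and-multiply."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % modulus            # handles modulus == 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:            # current low bit set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus   # square for the next bit
        exponent >>= 1
    return result


def encrypt(m: int, e: int, n: int) -> int:
    """Textbook RSA encryption c = m^e mod n (no padding)."""
    if not 0 <= m < n:
        raise ValueError("message representative must satisfy 0 <= m < n")
    return modexp(m, e, n)


print(encrypt(123, 17, 3233))       # 855
```

The loop runs once per bit of the exponent, so the number of multiplications grows with the bit length of e rather than with e itself.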
Decrypting messages
Alice receives c from Bob, and knows her private key d. She can recover m from c by the following procedure:
- $m = c^d \bmod n$.
Given m, she can recover the original message M. The decryption procedure works because
- $c^d \equiv (m^e)^d \equiv m^{ed} \pmod{n}$.
Now, since ed ≡ 1 (mod p-1) and ed ≡ 1 (mod q-1), Fermat's little theorem yields
- $m^{ed} \equiv m \pmod{p}$
and
- $m^{ed} \equiv m \pmod{q}$.
Since p and q are distinct prime numbers, applying the Chinese remainder theorem to these two congruences yields
- $m^{ed} \equiv m \pmod{pq}$.
Thus,
- $c^d \equiv m \pmod{n}$.
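The argument can be checked numerically on a toy key. The sketch below, which assumes the small illustrative values p = 61, q = 53, e = 17, d = 2753, confirms the two congruences on ed and then tests that decryption inverts encryption for every m below n, including values of m divisible by p or q.

```python
p, q, e, d = 61, 53, 17, 2753       # toy values, for illustration only
n = p * q

# The premises of the proof: ed = 1 (mod p-1) and ed = 1 (mod q-1).
assert (e * d) % (p - 1) == 1
assert (e * d) % (q - 1) == 1


def roundtrip_ok(m: int) -> bool:
    """True if decrypting the encryption of m gives back m."""
    c = pow(m, e, n)
    return pow(c, d, n) == m


# Exhaustive check over all residues, feasible only because n is tiny.
all_ok = all(roundtrip_ok(m) for m in range(n))
print(all_ok)                       # True
```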
A working example
Here is an example of RSA encryption and decryption. The parameters used here are artificially small, but you can also use OpenSSL to generate and examine a real keypair.
We let
p = 61 | first prime number (to be kept secret or deleted securely)
q = 53 | second prime number (to be kept secret or deleted securely)
n = pq = 3233 | modulus (to be made public)
e = 17 | public exponent (to be made public)
d = 2753 | private exponent (to be kept secret)
The public key is (e, n). The private key is d. The encryption function is:
- $\mathrm{encrypt}(m) = m^e \bmod n = m^{17} \bmod 3233$
where m is the plaintext. The decryption function is:
- $\mathrm{decrypt}(c) = c^d \bmod n = c^{2753} \bmod 3233$
where c is the ciphertext.
To encrypt the plaintext value 123, we calculate
- $\mathrm{encrypt}(123) = 123^{17} \bmod 3233 = 855$
To decrypt the ciphertext value 855, we calculate
- $\mathrm{decrypt}(855) = 855^{2753} \bmod 3233 = 123$
Both of these computations can be done efficiently using the square-and-multiply algorithm for modular exponentiation.
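The numbers in this example can be reproduced with a short Python session. This is only a check of the arithmetic, using the built-in `pow` (Python 3.8 or later for the modular inverse); the variable names are chosen here for illustration.

```python
p, q = 61, 53
n = p * q                           # modulus
phi = (p - 1) * (q - 1)             # totient, 60 * 52
e = 17                              # public exponent
d = pow(e, -1, phi)                 # private exponent

plaintext = 123
ciphertext = pow(plaintext, e, n)   # 123^17 mod 3233
recovered = pow(ciphertext, d, n)   # 855^2753 mod 3233

print(n, phi, d)                    # 3233 3120 2753
print(ciphertext, recovered)        # 855 123
```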
Padding schemes
When used in practice, RSA must be combined with some form of padding scheme, so that no values of M result in insecure ciphertexts. RSA used without padding may suffer from a number of potential problems:
- The values m = 0 or m = 1 always produce ciphertexts equal to 0 or 1 respectively, due to the properties of exponentiation.
- When encrypting with low encryption exponents (e.g., e = 3) and small values of m, the (non-modular) result of $m^e$ may be strictly less than the modulus n. In this case, ciphertexts may be easily decrypted by taking the eth root of the ciphertext with no regard to the modulus.
- Because RSA encryption is a deterministic encryption algorithm – i.e., has no random component – an attacker can successfully launch a chosen plaintext attack against the cryptosystem, building a dictionary by encrypting likely plaintexts under the public key, and storing the resulting ciphertexts. When matching ciphertexts are observed on a communication channel, the attacker can use this dictionary in order to learn the content of the message.
In practice, the first two problems might arise when sending short ASCII messages, where m is the concatenation of one or more ASCII-encoded character(s). A message consisting of a single ASCII NUL character (whose numeric value is 0) would be encoded as m = 0, which produces a ciphertext of 0 regardless of what e and n are used. Likewise, a single ASCII SOH (whose numeric value is 1) would always produce a ciphertext of 1. For systems which conventionally use small values of e, such as 3, all single character ASCII messages encoded using this scheme would be insecure, since the largest m would have a value of 255, and $255^3$ is less than any reasonable modulus. Such plaintexts could be recovered by simply taking the cube root of the ciphertext.
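The cube-root problem is easy to reproduce. In the sketch below, the primes 1019 and 1013 are hypothetical toy values chosen so that e = 3 is a valid exponent, and the message is the single ASCII character "A" with no padding. The attacker uses only the ciphertext and never touches the private key.

```python
def integer_cube_root(x: int) -> int:
    """Largest integer r with r**3 <= x, found by binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= x:             # grow hi until hi^3 exceeds x
        hi *= 2
    while hi - lo > 1:              # invariant: lo^3 <= x < hi^3
        mid = (lo + hi) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid
    return lo


p, q, e = 1019, 1013, 3             # toy primes, both = 2 (mod 3)
n = p * q

m = ord("A")                        # unpadded one-character message, m = 65
c = pow(m, e, n)                    # m^3 < n, so no modular reduction occurs

recovered = integer_cube_root(c)    # attacker's computation
print(chr(recovered))               # A

# The degenerate plaintexts 0 and 1 encrypt to themselves.
print(pow(0, e, n), pow(1, e, n))   # 0 1
```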
To avoid these problems, practical RSA implementations typically embed some form of structured, randomized padding into the value m before encrypting it. This padding ensures that m does not fall into the range of insecure plaintexts, and that a given message, once padded, will encrypt to one of a large number of different possible ciphertexts. The latter property can increase the cost of a dictionary attack beyond the capabilities of a reasonable attacker.
Standards such as PKCS have been carefully designed to securely pad messages prior to RSA encryption. Because these schemes pad the plaintext m with some number of additional bits, the size of the un-padded message M must be somewhat smaller. RSA padding schemes must be carefully designed so as to prevent sophisticated attacks which may be facilitated by a predictable message structure. Early versions of the PKCS standard used ad-hoc constructions, which were later found vulnerable to a practical adaptive chosen ciphertext attack. Modern constructions use secure techniques such as Optimal Asymmetric Encryption Padding (OAEP) to protect messages while preventing these attacks. The PKCS standard also incorporates processing schemes designed to provide additional security for RSA signatures, e.g., the Probabilistic Signature Scheme for RSA (RSA-PSS).
Signing messages
Suppose Alice uses Bob's public key to send him an encrypted message. In the message, she can claim to be Alice but Bob has no way of verifying that the message was actually from Alice since anyone can use Bob's public key to send him encrypted messages. So, in order to verify the origin of a message, RSA can also be used to sign a message.
Suppose Alice wishes to send a signed message to Bob. She produces a hash value of the message, raises it to the power of d mod n (as she does when decrypting a message), and attaches it as a "signature" to the message. When Bob receives the signed message, he raises the signature to the power of e mod n (as he does when encrypting a message), and compares the resulting hash value with the message's actual hash value. If the two agree, he knows that the author of the message was in possession of Alice's secret key, and that the message has not been tampered with since.
Note that secure padding schemes such as RSA-PSS are as essential for the security of message signing as they are for message encryption, and that the same key should never be used for both encryption and signing purposes.
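As a rough sketch of the hash-then-sign procedure described above, the following uses SHA-256 from Python's standard library and the toy key n = 3233, e = 17, d = 2753. It deliberately omits padding, and it reduces the hash modulo the tiny n purely so that the arithmetic works at this size; both are assumptions made for illustration and would be insecure in practice. The function names are hypothetical.

```python
import hashlib

N, E, D = 3233, 17, 2753            # toy key, for illustration only


def digest_as_int(message: bytes, n: int) -> int:
    """Hash the message and reduce it into the range [0, n)."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") % n


def sign(message: bytes, d: int, n: int) -> int:
    """Raise the hash value to the private exponent d modulo n."""
    return pow(digest_as_int(message, n), d, n)


def verify(message: bytes, signature: int, e: int, n: int) -> bool:
    """Raise the signature to e and compare with the message's hash."""
    return pow(signature, e, n) == digest_as_int(message, n)


signature = sign(b"attack at dawn", D, N)
print(verify(b"attack at dawn", signature, E, N))            # True
print(verify(b"attack at dawn", (signature + 1) % N, E, N))  # False
```

A production system would use a standardized scheme such as RSA-PSS from a vetted library rather than anything resembling this sketch.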
Security
The security of the RSA cryptosystem is based on two mathematical problems: the problem of factoring large numbers and the RSA problem. Full decryption of an RSA ciphertext is thought to be infeasible on the assumption that both of these problems are hard, i.e., no efficient algorithm exists for solving them. Providing security against partial decryption may require the addition of a secure padding scheme.
The RSA problem is defined as the task of taking eth roots modulo a composite n: recovering a value m such that $m^e \equiv c \pmod{n}$, where (e, n) is an RSA public key and c is an RSA ciphertext. Currently the most promising approach to solving the RSA problem is to factor the modulus n. With the ability to recover prime factors, an attacker can compute the secret exponent d from a public key (e, n), then decrypt c using the standard procedure. To accomplish this, an attacker factors n into p and q, and computes (p-1)(q-1) which allows the determination of d from e. No polynomial-time method for factoring large integers on a classical computer has yet been found, but it has not been proven that none exists. See integer factorization for a discussion of this problem.
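The chain from factoring to decryption can be illustrated on a toy modulus. The sketch below uses naive trial division, which is only feasible because n = 3233 is tiny; it stands in for the much more sophisticated factoring methods an attacker would need against a real key. The function names are hypothetical.

```python
def factor_semiprime(n: int):
    """Factor n = p * q by trial division (feasible only for tiny n)."""
    if n % 2 == 0:
        return 2, n // 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 2
    raise ValueError("n has no non-trivial factor")


def recover_private_exponent(e: int, n: int) -> int:
    """Derive d from the public key alone, once n has been factored."""
    p, q = factor_semiprime(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)          # inverse of e modulo (p-1)(q-1)


e, n = 17, 3233                     # public key only
d = recover_private_exponent(e, n)
print(d)                            # 2753
print(pow(855, d, n))               # 123, the decrypted ciphertext
```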
As of 2005, the largest number factored by general-purpose methods was 663 bits long, using state-of-the-art distributed methods. RSA keys are typically 1024–2048 bits long. Some experts believe that 1024-bit keys may become breakable in the near term (though this is disputed); few see any way that 4096-bit keys could be broken in the foreseeable future. Therefore, it is generally presumed that RSA is secure if n is sufficiently large. If n is 256 bits or shorter, it can be factored in a few hours on a personal computer, using software already freely available. If n is 512 bits or shorter, it can be factored by several hundred computers as of 1999. A theoretical hardware device named TWIRL and described by Shamir and Tromer in 2003 called into question the security of 1024 bit keys. It is currently recommended that n be at least 2048 bits long.
In 1994, Peter Shor published Shor's algorithm, showing that a quantum computer could in principle perform the factorization in polynomial time, rendering RSA and related algorithms obsolete. However, quantum computation is not expected to be developed to such a level for many years.
- See also: RSA Factoring Challenge
Practical considerations
Key generation
Finding the large primes p and q is usually done by testing random numbers of the right size with probabilistic primality tests which quickly eliminate virtually all non-primes.
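One widely used probabilistic test is Miller-Rabin; the text does not say which test a given implementation uses, so the sketch below is just one possible choice. The function names, the default of 20 rounds, and the use of the `secrets` module for candidates and witnesses are assumptions made for this example.

```python
import secrets


def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:          # cheap trial division first
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:               # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = 2 + secrets.randbelow(n - 3)    # random witness in [2, n-2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False            # a proves that n is composite
    return True


def random_prime(bits: int) -> int:
    """Test random odd candidates of the requested bit length."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate


print(is_probable_prime(65537), is_probable_prime(3233))
print(random_prime(64).bit_length())
```

Each round lets a composite number slip through with probability at most 1/4, so 20 rounds leave an error probability below 4^-20 for any fixed composite input.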
p and q should not be 'too close', lest the Fermat factorization for n be successful. Furthermore, if either p-1 or q-1 has only small prime factors, n can be factored quickly and these values of p or q should therefore be discarded as well.
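Fermat factorization looks for a representation n = a^2 - b^2 = (a - b)(a + b), starting from a near the square root of n. A minimal sketch follows; the function name and the `max_steps` cut-off are hypothetical. It succeeds quickly exactly when the two prime factors are close together.

```python
from math import isqrt


def fermat_factor(n: int, max_steps: int = 1_000_000):
    """Try to write odd n as a^2 - b^2; return (a - b, a + b) or None."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    a = isqrt(n)
    if a * a < n:
        a += 1                      # start at ceil(sqrt(n))
    for _ in range(max_steps):
        b_squared = a * a - n
        b = isqrt(b_squared)
        if b * b == b_squared:      # perfect square: factors found
            return a - b, a + b
        a += 1
    return None                     # gave up: factors are not close enough


# For n = 3233 the very first candidate a = 57 works: 57^2 - 3233 = 4^2.
print(fermat_factor(3233))          # (53, 61)
```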
One should not employ a prime search method which gives any information whatsoever about the primes to the attacker. In particular, a good random number generator for the start value needs to be employed. Note that the requirement here is both 'random' and 'unpredictable'. These are not the same criteria; a number may have been chosen by a random process (i.e., no pattern in the results), but if it is predictable in any manner (or even partially predictable), the method used will result in loss of security. For example, the random number table published by the RAND Corporation in the 1950s might very well be truly random, but it has been published and thus can serve an attacker as well. If the attacker can guess half of the digits of p or q, they can quickly compute the other half (shown by Coppersmith in 1997).
It is important that the secret key d be large enough. Wiener showed in 1990 that if p is between q and 2q (which is quite typical) and $d < n^{1/4}/3$, then d can be computed efficiently from n and e. There is no known attack against small public exponents such as e=3, provided that proper padding is used. However, when no padding is used or when the padding is improperly implemented, small public exponents have a greater risk of leading to an attack, such as the unpadded plaintext vulnerability listed above. 65537 is a commonly used value for e. This value can be regarded as a compromise between avoiding potential small exponent attacks and still allowing efficient encryptions (or signature verification). The NIST draft FIPS PUB 186-3 (March 2006) does not allow public exponents e smaller than 65537, but does not state a reason for this restriction.
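Wiener's result can be illustrated with a continued-fraction search: when d is small, the fraction k/d, where ed - 1 = k(p-1)(q-1), appears among the convergents of e/n. The sketch below is a simplified rendering of that idea, not a complete treatment. The function names are hypothetical, and the toy key n = 90581 = 239 * 379 with e = 17993 and d = 5 was picked for this example.

```python
from math import isqrt


def convergents(num: int, den: int):
    """Yield the continued-fraction convergents (k, d) of num/den."""
    k_prev, k_cur = 0, 1            # numerator recurrence seeds
    d_prev, d_cur = 1, 0            # denominator recurrence seeds
    while den:
        a, rem = divmod(num, den)
        k_prev, k_cur = k_cur, a * k_cur + k_prev
        d_prev, d_cur = d_cur, a * d_cur + d_prev
        yield k_cur, d_cur
        num, den = den, rem


def wiener_attack(e: int, n: int):
    """Return a small private exponent d if one is found, else None."""
    for k, d in convergents(e, n):
        if k == 0 or (e * d - 1) % k != 0:
            continue
        phi = (e * d - 1) // k      # candidate value of (p-1)(q-1)
        s = n - phi + 1             # would equal p + q
        disc = s * s - 4 * n        # discriminant of x^2 - s*x + n
        if disc < 0:
            continue
        root = isqrt(disc)
        if root * root == disc and (s + root) % 2 == 0:
            return d                # p and q are integers: d is confirmed
    return None


print(wiener_attack(17993, 90581))  # 5
```

The check on the discriminant is what confirms a guess: only the correct candidate for (p-1)(q-1) makes the quadratic with roots p and q have integer solutions.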
Speed
RSA is much slower than DES and other symmetric cryptosystems. In practice, Bob typically encrypts a secret message with a symmetric algorithm, encrypts the (comparatively short) symmetric key with RSA, and transmits both the RSA-encrypted symmetric key and the symmetrically-encrypted message to Alice.
This procedure raises additional security issues. For instance, it is of utmost importance to use a strong random number generator for the symmetric key, because otherwise Eve (an eavesdropper wanting to see what was sent) could bypass RSA by guessing the symmetric key.
Key distribution
As with all ciphers, how RSA public keys are distributed is important to security. Key distribution must be secured against a man-in-the-middle attack. Suppose Eve has some way to give Bob arbitrary keys and make him believe they belong to Alice. Suppose further that Eve can intercept transmissions between Alice and Bob. Eve sends Bob her own public key, which Bob believes to be Alice's. Eve can then intercept any ciphertext sent by Bob, decrypt it with her own secret key, keep a copy of the message, encrypt the message with Alice's public key, and send the new ciphertext to Alice. In principle, neither Alice nor Bob would be able to detect Eve's presence. Defenses against such attacks are often based on digital certificates or other components of a public key infrastructure.
Timing attacks
Kocher described a new attack on RSA in 1995: if the attacker Eve knows Alice's hardware in sufficient detail and is able to measure the decryption times for several known ciphertexts, she can deduce the decryption key d quickly. This attack can also be applied against the RSA signature scheme. In 2003, Boneh and Brumley demonstrated a more practical attack capable of recovering RSA factorizations over a network connection (e.g., from a Secure Socket Layer (SSL)-enabled webserver). This attack takes advantage of information leaked by the Chinese remainder theorem optimization used by many RSA implementations.
One way to thwart these attacks is to ensure that the decryption operation takes a constant amount of time for every ciphertext. However, this approach can significantly reduce performance. Instead, most RSA implementations use an alternate technique known as cryptographic blinding. RSA blinding makes use of the multiplicative property of RSA. Instead of computing $c^d \bmod n$, Alice first chooses a secret random value r and computes $(r^e c)^d \bmod n$. The result of this computation is $r m \bmod n$ and so the effect of r can be removed by multiplying by its inverse. A new value of r is chosen for each ciphertext. With blinding applied, the decryption time is no longer correlated to the value of the input ciphertext and so the timing attack fails.
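The blinding step can be sketched as follows. Instead of exponentiating c directly, the ciphertext is first multiplied by r^e for a fresh random r, so the private-key operation yields r * m mod n, and r is then divided out. The function name is hypothetical, the toy key n = 3233, e = 17, d = 2753 is for illustration only, and Python integer arithmetic is not constant-time, so this shows the algebra rather than a hardened implementation.

```python
import secrets
from math import gcd


def blinded_decrypt(c: int, e: int, d: int, n: int) -> int:
    """Decrypt c with RSA blinding: the exponentiation never sees c itself."""
    while True:
        r = 2 + secrets.randbelow(n - 2)    # fresh random r in [2, n-1]
        if gcd(r, n) == 1:                  # r must be invertible mod n
            break
    blinded_c = (pow(r, e, n) * c) % n      # r^e * c mod n
    blinded_m = pow(blinded_c, d, n)        # equals r * m mod n
    return (blinded_m * pow(r, -1, n)) % n  # remove r with its inverse


print(blinded_decrypt(855, 17, 2753, 3233))  # 123
```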
Adaptive chosen ciphertext attacks
In 1998, Daniel Bleichenbacher described the first practical adaptive chosen ciphertext attack, against RSA-encrypted messages using the PKCS #1 v1 padding scheme (a padding scheme randomizes and adds structure to an RSA-encrypted message, so it is possible to determine whether a decrypted message is valid.) Due to flaws with the PKCS #1 scheme, Bleichenbacher was able to mount a practical attack against RSA implementations of the Secure Socket Layer protocol, and to recover session keys. As a result of this work, cryptographers now recommend the use of provably secure padding schemes such as Optimal Asymmetric Encryption Padding, and RSA Laboratories has released new versions of PKCS #1 that are not vulnerable to these attacks.
See also
- Clifford Cocks
- Quantum cryptography
- Cryptographic key length
- Computational complexity theory
- Diffie-Hellman
References
- R. Rivest, A. Shamir, L. Adleman. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, Vol. 21 (2), pp.120–126. 1978. Previously released as an MIT "Technical Memo" in April 1977. Initial publication of the RSA scheme.
- Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 31.7: The RSA public-key cryptosystem, pp.881–887.
External links
- PKCS #1: RSA Cryptography Standard (RSA Laboratories website)
- The PKCS #1 standard "provides recommendations for the implementation of public-key cryptography based on the RSA algorithm, covering the following aspects: cryptographic primitives; encryption schemes; signature schemes with appendix; ASN.1 syntax for representing keys and for identifying the schemes".
- Thorough walk through of RSA