This library provides bindings to functionality of OpenSSL that is related to cryptography and authentication, not necessarily involving connections, sockets or streams.
A basic design principle of this library is that its default algorithms are cryptographically secure at the time of this writing. We will change the default algorithms if an attack on them becomes known, and replace them by new defaults that are deemed appropriate at that time.
This may mean, for example, that where sha256 is currently the default algorithm, blake2s256 or some other algorithm may become the default in the future.
To preserve interoperability and compatibility and at the same time allow us to transparently update default algorithms of this library, the following conventions are used:
This allows application programmers to inspect which algorithm was actually used, and store it for later reference.
For example:
?- crypto_data_hash(test, Hash, [algorithm(A)]).
Hash = '9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08',
A = sha256.
This shows that at the time of this writing, sha256 was deemed sufficiently secure, and was used as the default algorithm for hashing.
You therefore must not rely on which concrete algorithm is being used by default. However, you can rely on the fact that the default algorithms are secure. In other words, if they are not secure, then this is a mistake in this library, and we ask you to please report such a situation as an urgent security issue.
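The convention above can be sketched as follows: store the algorithm next to the hash, so that the hash can be recomputed later even if the default has changed in the meantime. This is only a minimal sketch. The predicate names hash_record/2 and record_matches/2 and the term hash(Algorithm, Hash) are hypothetical and not part of this library.

```prolog
:- use_module(library(crypto)).

% hash_record(+Data, -Record): hash Data with the current default
% algorithm and remember which algorithm was actually used.
% (hypothetical helper, for illustration only)
hash_record(Data, hash(Algorithm, Hash)) :-
    crypto_data_hash(Data, Hash, [algorithm(Algorithm)]).

% record_matches(+Data, +Record): recompute the hash with the stored
% algorithm and compare it to the stored hash.
% (hypothetical helper, for illustration only)
record_matches(Data, hash(Algorithm, Hash)) :-
    crypto_data_hash(Data, Hash1, [algorithm(Algorithm)]),
    Hash1 == Hash.
```

Because the record names the algorithm explicitly, verification does not depend on whatever the default happens to be at verification time.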
In the context of this library, bytes can be represented as lists of integers between 0 and 255. Such lists can be converted to and from hexadecimal notation with the following bidirectional relation:
Example:
?- hex_bytes('501ACE', Bs).
Bs = [80, 26, 206].
Almost all cryptographic applications require the availability of numbers that are sufficiently unpredictable. Examples are the creation of keys, nonces and salts. With this library, you can generate cryptographically strong pseudo-random numbers for such use cases:
One way to relate such a list of bytes to an integer is to use CLP(FD) constraints as follows:
:- use_module(library(clpfd)).

bytes_integer(Bs, N) :-
    foldl(pow, Bs, 0-0, N-_).

pow(B, N0-I0, N-I) :-
    B in 0..255,
    N #= N0 + B*256^I0,
    I #= I0 + 1.
With this definition, you can generate a random 256-bit integer from a list of 32 random bytes:
?- crypto_n_random_bytes(32, Bs),
   bytes_integer(Bs, I).
Bs = [98, 9, 35, 100, 126, 174, 48, 176, 246|...],
I = 109798276762338328820827...(53 digits omitted).
The above relation also works in the other direction, letting you translate an integer to a list of bytes. In addition, you can use hex_bytes/2 to convert bytes to tokens that can be easily exchanged in your applications. This also works if you have compiled SWI-Prolog without support for large integers.
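As a minimal sketch of the token idea mentioned above, random bytes can be turned into a hexadecimal token with hex_bytes/2. The predicate name random_token/2 is hypothetical and only serves as an illustration.

```prolog
:- use_module(library(crypto)).

% random_token(+N, -Hex): Hex is the hexadecimal encoding of N
% cryptographically strong random bytes, i.e., 2*N hex digits.
% (hypothetical helper, for illustration only)
random_token(N, Hex) :-
    crypto_n_random_bytes(N, Bytes),
    hex_bytes(Hex, Bytes).
```

The same relation converts the token back: calling hex_bytes(Hex, Bytes) with Hex instantiated yields the original list of bytes.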
A hash, also called digest, is a way to verify the integrity of data. In typical cases, a hash is significantly shorter than the data itself, and even minuscule changes in the data lead to different hashes.
The hash functionality of this library subsumes and extends that of library(sha), library(hash_stream) and library(md5) by providing a unified interface to all available digest algorithms.
The underlying OpenSSL library (libcrypto) is dynamically loaded if either library(crypto) or library(ssl) is loaded. Therefore, if your application uses library(ssl), you can use library(crypto) for hashing without increasing the memory footprint of your application. In other cases, the specialised hashing libraries are more lightweight but less general alternatives to library(crypto).
The most important predicates to compute hashes are:
One of md5 (insecure), sha1 (insecure), ripemd160, sha224, sha256, sha384, sha512, sha3_224, sha3_256, sha3_384, sha3_512, blake2s256 or blake2b512. The BLAKE digest algorithms require OpenSSL 1.1.0 or greater, and the SHA-3 algorithms require OpenSSL 1.1.1 or greater. The default is a cryptographically secure algorithm. If you specify a variable, then that variable is unified with the algorithm that was used.
The default encoding is utf8. The other meaningful value is octet, claiming that Data contains raw bytes.
Data | is either an atom, string or code-list |
Hash | is an atom that represents the hash in hexadecimal encoding. |
For the important case of deriving hashes from passwords, the following specialised predicates are provided:
crypto_password_hash(Password, Hash, []) and computes a password-based hash using the default options.
Another important distinction is that equal passwords must yield, with very high probability, different hashes. For this reason, cryptographically strong random numbers are automatically added to the password before a hash is derived.
Hash is unified with an atom that contains the computed hash and all parameters that were used, except for the password. Instead of storing passwords, store these hashes. Later, you can verify the validity of a password with crypto_password_hash/2, comparing the then entered password to the stored hash. If you need to export this atom, you should treat it as opaque ASCII data with up to 255 bytes of length. The maximal length may increase in the future.
Admissible options are:
pbkdf2-sha512 (the default) and bcrypt.
The number of iterations is 2^C. Currently, the default is 17, and thus more than one hundred thousand iterations. You should set this option as high as your server and users can tolerate. The default is subject to change and will likely increase in the future or adapt to new algorithms.
Currently, PBKDF2 with SHA-512 is used as the hash derivation function, using 128 bits of salt. All default parameters, including the algorithm, are subject to change, and other algorithms will also become available in the future. Since computed hashes store all parameters that were used during their derivation, such changes will not affect the operation of existing deployments. Note though that new hashes will then be computed with the new default parameters.
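The store-then-verify workflow described above can be sketched as follows. The predicate names register_password/2 and check_password/2 are hypothetical wrappers introduced for illustration. Only crypto_password_hash/3 and crypto_password_hash/2 are taken from this library.

```prolog
:- use_module(library(crypto)).

% register_password(+Password, -Hash): derive the hash that is stored
% instead of the password, using the default options.
% (hypothetical wrapper, for illustration only)
register_password(Password, Hash) :-
    crypto_password_hash(Password, Hash, []).

% check_password(+Password, +StoredHash): succeeds iff Password
% matches the previously stored hash.
% (hypothetical wrapper, for illustration only)
check_password(Password, StoredHash) :-
    crypto_password_hash(Password, StoredHash).
```

Since the salt is random, calling register_password/2 twice with the same password is expected to yield two different hashes, both of which verify against that password.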
The following predicate implements the Hashed Message Authentication Code (HMAC)-based key derivation function, abbreviated as HKDF. It supports a wide range of applications and requirements by concentrating possibly dispersed entropy of the input keying material and then expanding it to the desired length. The number and lengths of the output keys depend on the specific cryptographic algorithms for which the keys are needed.
Admissible options are:
utf8 (default) or octet, denoting the representation of Data as in crypto_data_hash/3.
The info/1 option can be used to generate multiple keys from a single master key, using for example values such as key and iv, or the name of a file that is to be encrypted.
This predicate requires OpenSSL 1.1.0 or greater.
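A minimal sketch of how the info/1 option might be used to derive a key and an IV from one shared secret. The predicate name derive_key_iv/3 and the output lengths (32 and 12 bytes, as needed by ChaCha20-Poly1305) are assumptions made for this illustration.

```prolog
:- use_module(library(crypto)).

% derive_key_iv(+Secret, -Key, -IV): Key is a list of 32 bytes and
% IV is a list of 12 bytes, both derived from the same input keying
% material but with different info/1 values.
% (hypothetical helper, for illustration only)
derive_key_iv(Secret, Key, IV) :-
    crypto_data_hkdf(Secret, 32, Key, [info("key")]),
    crypto_data_hkdf(Secret, 12, IV, [info("iv")]).
```

Note that the derivation is deterministic: the same Secret always yields the same Key and IV. This sketch is therefore only appropriate if Secret is freshly established for each session, so that no combination of key and IV is ever reused.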
The following predicates are provided for building hashes incrementally. This works by first creating a context with crypto_context_new/2, then using this context with crypto_data_context/3 to incrementally obtain further contexts, and finally extract the resulting hash with crypto_context_hash/2.
Context | is an opaque pure Prolog term that is subject to garbage collection. |
This predicate allows a hash to be computed in chunks, which may be important while working with Metalink (RFC 5854), BitTorrent or similar technologies, or simply with big files.
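The three steps can be sketched as follows, assuming the data arrives as a list of chunks. The predicate name chunks_hash/2 is hypothetical, and sha256 is fixed here only to make the result reproducible.

```prolog
:- use_module(library(crypto)).
:- use_module(library(apply)).

% chunks_hash(+Chunks, -Hash): Hash is the hash of the concatenation
% of all Chunks, computed one chunk at a time.
% (hypothetical helper, for illustration only)
chunks_hash(Chunks, Hash) :-
    crypto_context_new(Context0, [algorithm(sha256)]),
    % thread the context through all chunks
    foldl(crypto_data_context, Chunks, Context0, Context),
    crypto_context_hash(Context, Hash).
```

Hashing the chunks te and st in this way is expected to give the same result as hashing test in one step with crypto_data_hash/3.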
The following hashing predicates work over streams:
If true (default), closing the filter stream also closes the original (parent) stream.
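As a sketch of stream-based hashing, the following computes the hash of a file by copying its contents through a hash filter stream into a null stream. The predicate name file_sha256/2 is hypothetical. The option close_parent(false) is used so that the file stream can be closed separately by the outer cleanup handler.

```prolog
:- use_module(library(crypto)).

% file_sha256(+File, -Hash): Hash is the SHA-256 hash of the contents
% of File. (hypothetical helper, for illustration only)
file_sha256(File, Hash) :-
    setup_call_cleanup(
        open(File, read, In0, [type(binary)]),
        setup_call_cleanup(
            crypto_open_hash_stream(In0, In,
                                    [ algorithm(sha256),
                                      close_parent(false)
                                    ]),
            (   % read everything through the filter, discarding it
                setup_call_cleanup(
                    open_null_stream(Null),
                    copy_stream_data(In, Null),
                    close(Null)),
                crypto_stream_hash(In, Hash)
            ),
            close(In)),
        close(In0)).
```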
A digital signature is a relation between a key and data that only someone who knows the key can compute.
Signing uses a private key, and verifying a signature uses the corresponding public key of the signing entity. This library supports both RSA and ECDSA signatures. You can use load_private_key/3 and load_public_key/2 to load keys from files and streams.
In typical cases, we use this mechanism to sign the hash of data. See hashing (section 3.5). For this reason, the following predicates work on the hexadecimal representation of hashes that is also used by crypto_data_hash/3 and related predicates.
Signatures are also represented in hexadecimal notation, and you can use hex_bytes/2 to convert them to and from lists of bytes (integers).
The default encoding (hex) assumes that Data is an atom, string, character list or code list representing the data in hexadecimal notation. See rsa_sign/4 for an example.
Options:
Default is hex. Alternatives are octet, utf8 and text.
Options:
Default is hex. Alternatives are octet, utf8 and text.
One of sha1, sha224, sha256, sha384 or sha512. The default is a cryptographically secure algorithm. If you specify a variable, then it is unified with the algorithm that was used.
Default is hex. Alternatives are octet, utf8 and text.
This predicate can be used to compute a sha256WithRSAEncryption signature as follows:
sha256_with_rsa(PemKeyFile, Password, Data, Signature) :-
    Algorithm = sha256,
    read_key(PemKeyFile, Password, Key),
    crypto_data_hash(Data, Hash, [algorithm(Algorithm), encoding(octet)]),
    rsa_sign(Key, Hash, Signature, [type(Algorithm)]).

read_key(File, Password, Key) :-
    setup_call_cleanup(
        open(File, read, In, [type(binary)]),
        load_private_key(In, Password, Key),
        close(In)).
Note that a hash that is computed by crypto_data_hash/3 can be directly used in rsa_sign/4 as well as ecdsa_sign/4.
Options:
One of sha1, sha224, sha256, sha384 or sha512. The default is the same as for rsa_sign/4. This option must match the algorithm that was used for signing. When operating with different parties, the used algorithm must be communicated over an authenticated channel.
Default is hex. Alternatives are octet, utf8 and text.
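A verification counterpart to a sha256WithRSAEncryption signature could look as follows. This is only a sketch: the predicate name sha256_with_rsa_verify/3 is hypothetical, and it assumes that the public key is available in a PEM file and that the signature was created over the SHA-256 hash of Data.

```prolog
:- use_module(library(crypto)).
:- use_module(library(ssl)).

% sha256_with_rsa_verify(+PemPublicKeyFile, +Data, +Signature):
% succeeds iff Signature is a valid RSA signature of the SHA-256
% hash of Data. (hypothetical helper, for illustration only)
sha256_with_rsa_verify(PemKeyFile, Data, Signature) :-
    Algorithm = sha256,
    setup_call_cleanup(
        open(PemKeyFile, read, In, [type(binary)]),
        load_public_key(In, Key),
        close(In)),
    crypto_data_hash(Data, Hash, [algorithm(Algorithm), encoding(octet)]),
    % the type/1 option must match the algorithm used for signing
    rsa_verify(Key, Hash, Signature, [type(Algorithm)]).
```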
The following predicates provide asymmetric RSA encryption and decryption. This means that the key that is used for encryption is different from the one used to decrypt the data:
Options:
Default is utf8. Alternatives are utf8 and octet.
Default is pkcs1. Alternatives are pkcs1_oaep, sslv23 and none. Note that none should only be used if you implement cryptographically sound padding modes in your application code, as encrypting unpadded data with RSA is insecure.
ssl_error(Code, LibName, FuncName, Reason) is raised if there is an error, e.g., if the text is too long for the key.
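A minimal sketch of a round trip with a key pair, assuming that PublicKey and PrivateKey were already loaded, for example with load_public_key/2 and load_private_key/3. The predicate name rsa_roundtrip/4 is hypothetical, and the choice of pkcs1_oaep padding is an assumption made for this illustration, not a recommendation stated by this library.

```prolog
:- use_module(library(crypto)).

% rsa_roundtrip(+PublicKey, +PrivateKey, +PlainText, -Recovered):
% encrypt PlainText with the public key and decrypt the result with
% the corresponding private key.
% (hypothetical helper, for illustration only)
rsa_roundtrip(PublicKey, PrivateKey, PlainText, Recovered) :-
    Options = [padding(pkcs1_oaep)],   % same padding on both sides
    rsa_public_encrypt(PublicKey, PlainText, CipherText, Options),
    rsa_private_decrypt(PrivateKey, CipherText, Recovered, Options).
```

If PlainText is too long for the key, an ssl_error/4 exception as described above is to be expected.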
The following predicates provide symmetric encryption and decryption. This means that the same key is used in both cases.
PlainText must be a string, atom or list of codes or characters, and CipherText is created as a string. Key and IV are typically lists of bytes, though atoms and strings are also permitted. Algorithm must be an algorithm which your copy of OpenSSL knows about.
Keys and IVs can be chosen at random (using for example crypto_n_random_bytes/2) or derived from input keying material (IKM) using for example crypto_data_hkdf/4. This input is often a shared secret, such as a negotiated point on an elliptic curve, or the hash that was computed from a password via crypto_password_hash/3 with a freshly generated and specified salt.
Reusing the same combination of Key and IV typically leaks at least some information about the plaintext. For example, identical plaintexts will then correspond to identical ciphertexts. For some algorithms, reusing an IV with the same Key has disastrous results and can cause the loss of all properties that are otherwise guaranteed. Especially in such cases, an IV is also called a nonce (number used once). If an IV is not needed for your algorithm (such as 'aes-128-ecb') then any value can be provided as it will be ignored by the underlying implementation. Note that such algorithms do not provide semantic security and are thus insecure. You should use stronger algorithms instead.
It is safe to store and transfer the used initialization vector (or nonce) in plain text, but the key must be kept secret.
Commonly used algorithms include:
'chacha20-poly1305'
'aes-128-gcm'
'aes-128-cbc'
Options:
Default is utf8. Alternatives are utf8 and octet.
Default is block. You can disable padding by supplying none here. If padding is disabled for block ciphers, then the length of the ciphertext must be a multiple of the block size.
For example, with OpenSSL 1.1.0 and greater, we can use the ChaCha20 stream cipher with the Poly1305 authenticator. This cipher uses a 256-bit key and a 96-bit nonce, i.e., 32 and 12 bytes, respectively:
?- Algorithm = 'chacha20-poly1305',
   crypto_n_random_bytes(32, Key),
   crypto_n_random_bytes(12, IV),
   crypto_data_encrypt("this is some input", Algorithm,
                       Key, IV, CipherText, [tag(Tag)]),
   crypto_data_decrypt(CipherText, Algorithm,
                       Key, IV, RecoveredText, [tag(Tag)]).
Algorithm = 'chacha20-poly1305',
Key = [65, 147, 140, 197, 27, 60, 198, 50, 218|...],
IV = [253, 232, 174, 84, 168, 208, 218, 168, 228|...],
CipherText = <binary string>,
Tag = [248, 220, 46, 62, 255, 9, 178, 130, 250|...],
RecoveredText = "this is some input".
In this example, we use crypto_n_random_bytes/2 to generate a key and nonce from cryptographically secure random numbers. For repeated applications, you must ensure that a nonce is only used once together with the same key. Note that for authenticated encryption schemes, the tag that was computed during encryption is necessary for decryption. It is safe to store and transfer the tag in plain text.
Default is utf8. Alternatives are utf8 and octet.
Default is block. You can disable padding by supplying none here.
This library provides operations from number theory that frequently arise in cryptographic applications, complementing the existing built-ins and GMP bindings:
If true (default is false), then a safe prime is generated. This means that P is of the form 2*Q + 1 where Q is also prime.
2^(-80).
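The following sketch illustrates the number-theoretic predicates crypto_generate_prime/3, crypto_is_prime/2 and crypto_modular_inverse/3. The predicate names safe_prime/2 and inverse_check/2 are hypothetical helpers introduced for this illustration.

```prolog
:- use_module(library(crypto)).

% safe_prime(+Bits, -P): P is a freshly generated safe prime with
% Bits bits. The check that Q = (P-1)/2 is also prime merely
% restates the definition of a safe prime.
% (hypothetical helper, for illustration only)
safe_prime(Bits, P) :-
    crypto_generate_prime(Bits, P, [safe(true)]),
    Q is (P - 1) // 2,
    crypto_is_prime(Q, []).

% inverse_check(+X, +M): succeeds iff X has a multiplicative inverse
% Y modulo M that satisfies X*Y mod M =:= 1.
% (hypothetical helper, for illustration only)
inverse_check(X, M) :-
    crypto_modular_inverse(X, M, Y),
    1 =:= (X * Y) mod M.
```

For example, the inverse of 3 modulo 7 is 5, since 3*5 = 15 and 15 mod 7 = 1.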
This library provides functionality for reasoning over elliptic curves. Elliptic curves are represented as opaque objects. You acquire a handle for an elliptic curve via crypto_name_curve/2.
A point on a curve is represented by the Prolog term point(X, Y), where X and Y are integers that represent the point's affine coordinates.
The following predicates are provided for reasoning over elliptic curves:
prime256v1 and secp256k1.
If you have OpenSSL installed, you can get a list of supported curves via:
$ openssl ecparam -list_curves
As one example that involves most predicates of this library, we explain a way to establish a shared secret over an insecure channel. We shall use elliptic curves for this purpose.
Suppose Alice wants to establish an encrypted connection with Bob. To achieve this even over a channel that may be subject to eavesdropping and man-in-the-middle attacks, Bob performs the following steps:
This mechanism hinges on a way for Alice to establish the authenticity of the signed message (using predicates like rsa_verify/4 and ecdsa_verify/4), for example by means of a public key that was previously exchanged or is signed by a trusted party in such a way that Alice can be sufficiently certain that it belongs to Bob. However, none of these steps requires any encryption!
Alice in turn performs the following steps:
Bob receives j*G in plain text and can arrive at the same shared secret by performing the calculation k*(j*G), which is - by associativity and commutativity of scalar multiplication - identical to the point j*(k*G), which is again Q from which the shared secret can be derived, and the message can be decrypted with crypto_data_decrypt/6.
This method is known as Diffie-Hellman-Merkle key exchange over elliptic curves, abbreviated as ECDH. It provides forward secrecy (FS): Even if the private key that was used to establish the authenticity of Bob is later compromised, the encrypted messages cannot be decrypted with it.
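The arithmetic of the exchange can be sketched in a single process as follows. This is only an illustration of the calculation, not a protocol implementation: signing, transmission and key derivation are omitted, and the predicate names byte_acc/3, random_scalar/2 and ecdh_demo/1 are hypothetical. The curve prime256v1 is chosen as an example.

```prolog
:- use_module(library(crypto)).
:- use_module(library(apply)).

% byte_acc(+Byte, +N0, -N): accumulate one more byte into an integer.
byte_acc(B, N0, N) :- N is N0*256 + B.

% random_scalar(+Curve, -K): K is a random integer with 0 < K < Order.
% 40 random bytes are reduced modulo Order-1, which leaves only a
% negligible bias for a 256-bit curve (assumption of this sketch).
random_scalar(Curve, K) :-
    crypto_curve_order(Curve, Order),
    crypto_n_random_bytes(40, Bytes),
    foldl(byte_acc, Bytes, 0, N),
    K is 1 + N mod (Order - 1).

% ecdh_demo(-Q): Q is the point that both parties arrive at.
ecdh_demo(Q) :-
    crypto_name_curve(prime256v1, Curve),
    crypto_curve_generator(Curve, G),
    random_scalar(Curve, K),                     % Bob's secret k
    random_scalar(Curve, J),                     % Alice's secret j
    crypto_curve_scalar_mult(Curve, K, G, KG),   % Bob publishes k*G
    crypto_curve_scalar_mult(Curve, J, G, JG),   % Alice publishes j*G
    crypto_curve_scalar_mult(Curve, J, KG, Q),   % Alice: j*(k*G)
    crypto_curve_scalar_mult(Curve, K, JG, Q1),  % Bob: k*(j*G)
    Q == Q1.                                     % same shared point
```

In an actual exchange, only k*G and j*G travel over the channel, and each party keeps its own scalar secret.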
A major attraction of using elliptic curves for this purpose is found in the comparatively small key size that suffices to make any attacks unrealistic as far as we currently know. In particular, given any point on the curve, we currently have no efficient way to determine by which scalar the generator was multiplied to obtain that point. The method described above relies on the hardness of this so-called elliptic curve discrete logarithm problem (ECDLP). On the other hand, some of the named curves have been suspected to be chosen in such a way that they could be prone to attacks that are not publicly known.
As an alternative to ECDH, you can use the original DH key exchange scheme, where the prime field GF(p) is used instead of an elliptic curve, and exponentiation of a suitable generator is used instead of scalar multiplication. You can use crypto_generate_prime/3 to generate a sufficiently large prime for this purpose.