How are cryptographic hash functions developed

Cryptographic hash functions

Cryptographic hash functions are an important cryptographic tool and form a separate area of cryptography. A cryptographic hash function generates a character string of fixed length (specified in bits) from a data record of any length. A data record can be a word, a sentence, a longer text or an entire file.
The generated string is called a digital fingerprint, cryptographic checksum or message digest (MD). Most often it is simply called the hash value or hash. It is the digital code that results from applying the cryptographic hash function to the data. A Message Authentication Code (MAC) is a related value that is additionally computed with a secret key.
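
The fixed output length can be illustrated with a minimal sketch. It assumes Python's standard `hashlib` module and SHA-256 as the hash function; the helper name `fingerprint` and the sample records are made up for this illustration.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash value of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Illustrative data records of very different lengths
records = [
    b"word",
    b"A complete sentence is also a valid data record.",
    b"x" * 1_000_000,  # stands in for a longer text or an entire file
]

for record in records:
    digest = fingerprint(record)
    # Each hexadecimal character encodes 4 bits, so 64 characters = 256 bits
    print(len(record), "bytes ->", len(digest) * 4, "bits:", digest[:16] + "...")
```

Every record, whether 4 bytes or one million bytes long, yields a hash value of the same length.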

Calculating a hash value initially has nothing to do with cryptography, because not every hash function is a cryptographic hash function.
Many different terms are used for cryptographic hash functions, alongside product names and feature descriptions, but none of them says anything about whether a function actually meets cryptographic requirements.

  • Fingerprint function
  • secure hash function
  • Manipulation Detection Code (MDC)
  • Message Integrity Code (MIC)
  • Checksum method

Requirements for cryptographic hash functions

  • Determinism: The same input string must always produce the same hash value.
  • Irreversibility (one-way property): It must not be feasible to calculate the original character string back from the hash value.
  • Collision resistance: It must not be feasible to find two different strings that produce the same hash value.

Not all hash functions meet all of these requirements. Therefore, not all hash functions are suitable for cryptographic applications such as authentication and encryption.

How a cryptographic hash function works


In principle, a hash function takes a data record, called the pre-image, and generates a binary number from it. This number is usually written in hexadecimal notation and is called the hash value.
A cryptographic hash function is based on a one-way function: it is very easy to calculate, but inverting it is very complex or even impossible. The aim is to prevent anyone from inferring the pre-image from the hash value.
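
The relationship between the binary number and its hexadecimal notation can be sketched as follows. This is only an illustration, assuming SHA-256 from Python's `hashlib`; the variable names are chosen for this example.

```python
import hashlib

digest = hashlib.sha256(b"pre-image").digest()   # raw hash value as 32 bytes

as_number = int.from_bytes(digest, "big")        # the same value as an integer
as_binary = format(as_number, "0256b")           # written out as 256 binary digits
as_hex = digest.hex()                            # the usual hexadecimal notation

print(as_binary[:32] + "...")
print(as_hex)
```

Both strings describe the same number. The hexadecimal form is simply four times shorter.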

Irreversibility

In principle, it should not be possible to recalculate the original data from a hash value. However, new methods of analysis are found over time and computing power increases, so the methods for calculating original data back from a hash value keep getting better. This is why hash functions that were once considered secure turn out, time and again, to be weaker than assumed.

Collision resistance

In principle, a pre-image can have any length and any content. A hash value, however, is limited to a certain length. It is therefore unavoidable that some hash values correspond to several different pre-images. This is called a collision. With a good hash function, there should be as few collisions as possible.
Take simple checksums as an example. Here the same checksum can belong to several different inputs. From the point of view of cryptography, calculating such a checksum is therefore not a cryptographic hash function.
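
A small sketch makes the checksum problem concrete. The toy checksum below (sum of all byte values modulo 256) and the two sample messages are assumptions made for this illustration, not a checksum method named in the text. SHA-256 from `hashlib` serves as the comparison.

```python
import hashlib

def byte_sum_checksum(data: bytes) -> int:
    """Toy checksum: sum of all byte values modulo 256."""
    return sum(data) % 256

a = b"transfer 12 to account 34"
b = b"transfer 21 to account 34"   # two digits swapped

# The sum ignores the order of the bytes, so both messages collide
print(byte_sum_checksum(a) == byte_sum_checksum(b))                      # True
# A cryptographic hash function yields different values for the two messages
print(hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest())    # False
```

An attacker can produce such checksum collisions at will, simply by rearranging bytes.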
Cryptography places higher demands on hash functions and their applications. It should be practically impossible for an attacker to generate collisions deliberately.

  • Statistically, each hash value should occur about equally often.
  • Even a small change to the pre-image should produce a completely different hash value.
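
The second property can be observed with a minimal sketch, assuming SHA-256 from Python's `hashlib`. The helper `differing_bits` and the sample inputs are made up for this illustration.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the SHA-256 values of a and b differ."""
    ha = int.from_bytes(hashlib.sha256(a).digest(), "big")
    hb = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(ha ^ hb).count("1")

# The two inputs differ only in the case of the last letter
changed = differing_bits(b"hash function", b"hash functioN")
print(changed, "of 256 bits differ")
```

For a good hash function, roughly half of the output bits can be expected to change, even though the inputs are almost identical.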

To reduce the likelihood of collisions, better and better methods are used, which usually generate longer hash values.
For example, the well-known and popular hash functions MD5 and SHA1 are vulnerable to collision attacks. This means that an attacker can construct two different data records that produce the same hash value, so an MD5 or SHA1 hash no longer reliably identifies one data record. It is better to use SHA256 or SHA512.
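
The hash lengths of the functions mentioned can be compared with a short sketch. It assumes a standard Python build in which `hashlib` provides all four algorithms; the dictionary name `bits` is chosen for this example.

```python
import hashlib

# Length of the hash value in bits for each function
bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("md5", "sha1", "sha256", "sha512")
}

for name, length in bits.items():
    print(f"{name:7} {length:4} bits  {hashlib.new(name, b'example').hexdigest()}")
```

MD5 yields 128 bits and SHA1 160 bits, while SHA256 and SHA512 yield 256 and 512 bits.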

Attacks on cryptographic hash functions

An attack on a cryptographic hash function aims to obtain the pre-image or to find a collision. Several approaches are known.

  • Substitution attack
  • Birthday attack
  • Dictionary attack
  • Rainbow table

You can only protect yourself against the first two attacks with sufficiently long hash values. Because a birthday attack on a hash value of n bits needs only about 2^(n/2) attempts, the hash value should be about twice as long as the recommended key length for symmetric methods. 160 bits are considered the absolute minimum.
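
Why short hash values are dangerous can be tried out with a deliberately weakened hash. The sketch below is only an illustration: it truncates SHA-256 to 24 bits (an assumption made so that the search finishes quickly) and looks for any two inputs with the same truncated value. The function names are hypothetical.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, bits: int) -> int:
    """First `bits` bits of SHA-256, used here as a deliberately weak hash."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def find_collision(bits: int):
    """Hash the inputs "0", "1", "2", ... until two of them share a hash value."""
    seen = {}                      # hash value -> input that produced it
    for i in count():
        data = str(i).encode()
        h = truncated_hash(data, bits)
        if h in seen:
            return seen[h], data, i + 1
        seen[h] = data

first, second, attempts = find_collision(bits=24)
print(first, second, "collide after", attempts, "attempts")
```

With 24 bits there are about 16.7 million possible hash values, yet a collision typically appears after only a few thousand attempts, in the region of 2^12. Each additional bit of hash length increases this effort by a factor of about 1.4.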

By far the most common attack is the dictionary attack. It is also the most successful. It works best when the pre-image consists of a single word, perhaps even a term found in a dictionary.
Dictionary attacks are applied in particular to passwords that are stored as hash values in a user database.
The attacker's work can be made harder by hashing the generated hash value again, several times over. Even several hundred rounds rarely cause a performance problem for a legitimate login. An attacker, however, has to run just as many rounds for every candidate, and that for the entire dictionary.
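
The dictionary attack and the effect of repeated hashing can be sketched as follows. The word list, the password and the round count are invented for this illustration, and SHA-256 is an assumed choice of hash function.

```python
import hashlib

def iterated_hash(password: str, rounds: int) -> str:
    """Hash the password, then hash each result again, `rounds` times in total."""
    value = password.encode()
    for _ in range(rounds):
        value = hashlib.sha256(value).digest()
    return value.hex()

def dictionary_attack(target: str, words, rounds: int):
    """Hash every dictionary word and compare it with the stored hash value."""
    for word in words:
        if iterated_hash(word, rounds) == target:
            return word
    return None

ROUNDS = 500                                   # illustrative round count
stored = iterated_hash("sunshine", ROUNDS)     # what a user database would hold
wordlist = ["password", "letmein", "sunshine", "dragon"]

print(dictionary_attack(stored, wordlist, ROUNDS))   # sunshine
print(dictionary_attack(stored, wordlist, 1))        # None: wrong round count
```

The attack still succeeds against a weak password, but each guess now costs 500 hash calculations instead of one. In practice, a per-user random value (salt) and a purpose-built function such as `hashlib.pbkdf2_hmac` are commonly used for the same purpose.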

Key-dependent hash functions

Typically, cryptographic hash functions work without a key. As a rule, it does not matter if the attacker knows which hash function is being used. Most applications are about creating a kind of checksum or about avoiding the storage or transmission of data in plain text.

However, there are also applications in which it is advantageous to include a secret key, unknown to the attacker, in the calculation of the hash value. This is the case, for example, when two communication partners have already exchanged a secret session key. One then speaks of a key-dependent hash function. Its result is often called a Message Authentication Code (MAC).
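
A key-dependent hash function can be sketched with HMAC from Python's standard `hmac` module. The session key, the messages and the helper `verify` are assumptions made for this illustration.

```python
import hashlib
import hmac

session_key = b"hypothetical shared session key"   # known to both partners only
message = b"pay 100 to Alice"

# The sender computes the MAC over the message with the shared key
tag = hmac.new(session_key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, msg: bytes, received_tag: str) -> bool:
    """Recompute the MAC and compare it with the received one in constant time."""
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_tag)

print(verify(session_key, message, tag))               # True
print(verify(session_key, b"pay 900 to Alice", tag))   # False: message altered
print(verify(b"wrong key", message, tag))              # False: key not known
```

Without the key, an attacker can neither check nor recompute the value for a modified message.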

The question now is why a key-dependent hash function is needed when signature methods exist, since a signature also shows that a message is authentic and unchanged. The answer is that a key-dependent hash function requires significantly less computing time and manages with shorter keys. This does not make the digital signature superfluous. On the contrary: when non-repudiation is required, the digital signature is indispensable, because it can only be created by someone who holds the private key of the key pair. If, on the other hand, sender and recipient share a single secret key and use a key-dependent hash function, the recipient cannot prove to a third party that a message came from the sender, because the recipient holds the same key and could have calculated the value as well.

The difference between a key-dependent hash function and a normal hash function lies in the security goal. With a key-dependent hash function, the attacker's preferred goal is to recover the key. With a normal hash function, the attacker wants to find collisions.
The attacks on a key-dependent hash function are therefore similar to those on symmetric encryption, for example a known pre-image or chosen pre-image attack.

Applications of cryptographic hash functions

  • Pseudo random generator
  • Mixer for random sources
  • Derive the session key from the master key
  • One-time password generator
  • Procedure for authentication (digital signature)
  • Saving passwords
  • Formation of cryptographic checksums
  • Integrity check
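
As one example from this list, an integrity check can be sketched as follows. The file name, its contents and the helper `file_sha256` are invented for this illustration; the sketch writes to a temporary directory so that it is self-contained.

```python
import hashlib
import os
import tempfile

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large files need not fit into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "download.bin")
    with open(path, "wb") as f:
        f.write(b"original contents")
    published = file_sha256(path)        # hash value published by the provider

    with open(path, "ab") as f:          # simulate a manipulation in transit
        f.write(b"!")
    intact = file_sha256(path) == published
    print("file unchanged:", intact)     # file unchanged: False
```

The recipient only has to compare the recalculated hash value with the published one. Any change to the file, however small, shows up as a mismatch.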

Overview: cryptographic hash functions

Cryptographic hash functions form a separate area of cryptography. Well-known cryptographers, familiar from other cryptographic methods, were often involved in their development.

Overview: Cryptographic key-dependent hash functions

The common key-dependent hash functions are based on other cryptographic methods. Either a cryptographic hash function or a symmetric encryption method forms the basis. Another, rarer option is a stream cipher that also generates a hash value.

  • HMAC
  • CBC-MAC
  • UMAC
  • EMAC
  • TTMAC
  • COMP128
