PRACTICAL IOT CRYPTOGRAPHY ON THE ESPRESSIF ESP8266

The Espressif ESP8266 chipset makes three-dollar ‘Internet of Things’ development boards an economic reality. According to the popular automatic firmware-building site nodeMCU-builds, in the last 60 days there have been 13,341 custom firmware builds for that platform. Of those, only 19% have SSL support, and 10% include the cryptography module.

We’re often critical of the lack of security in the IoT sector, and frequently cover botnets and other attacks, but will we hold our projects to the same standards we demand? Will we stop at identifying the problem, or can we be part of the solution?

This article will focus on applying AES encryption and hash authorization functions to the MQTT protocol using the popular ESP8266 chip running NodeMCU firmware. Our purpose is not to provide a copy/paste panacea, but to go through the process step by step, identifying challenges and solutions along the way. The result is a system that’s end-to-end encrypted and authenticated, preventing eavesdropping along the way, and spoofing of valid data, without relying on SSL.

We’re aware that there are also more powerful platforms that can easily support SSL (e.g. Raspberry Pi, Orange Pi, FriendlyARM), but let’s start with the cheapest hardware most of us have lying around, and a protocol suitable for many of our projects. AES is something you could implement on an AVR if you needed to.

Theory

MQTT is a lightweight messaging protocol that runs on top of TCP/IP and is frequently used for IoT projects. Client devices subscribe or publish to topics (e.g. sensors/temperature/kitchen), and these messages are relayed by an MQTT broker. More information on MQTT is available on their webpage or in our own getting-started series.

The MQTT protocol doesn’t have any built-in security features beyond username/password authentication, so it’s common to encrypt and authenticate across a network with SSL. However, SSL can be rather demanding for the ESP8266 and when enabled, you’re left with much less memory for your application. As a lightweight alternative, you can encrypt only the data payload being sent, and use a session ID and hash function for authentication.

A straightforward way to do this is using Lua and the NodeMCU Crypto module, which includes support for the AES algorithm in CBC mode as well as the HMAC hash function. Using AES encryption correctly requires three things to produce ciphertext: a message, a key, and an initialization vector (IV). Messages and keys are straightforward concepts, but the initialization vector is worth some discussion.

When you encode a message in AES with a static key, it will always produce the same output. For example, the message “usernamepassword” encrypted with key “1234567890ABCDEF” might produce a result like “E40D86C04D723AFF”. If you run the encryption again with the same key and message, you will get the same result. This opens you to several common types of attack, especially pattern analysis and replay attacks.

In a pattern analysis attack, you use the knowledge that a given piece of data will always produce the same ciphertext to guess what the purpose or content of different messages are without actually knowing the secret key. For example, if the message “E40D86C04D723AFF” is sent prior to all other communications, one might quickly guess it is a login. In short, if the login system is simplistic, sending that packet (a replay attack) might be enough to identify yourself as an authorized user, and chaos ensues.

IVs make pattern analysis more difficult. An IV is a piece of data sent along with the key that modifies the end ciphertext result. As the name suggests, it initializes the state of the encryption algorithm before the data enters. The IV needs to be different for each message sent so that repeated data encrypts into different ciphertext, and some ciphers (like AES-CBC) require it to be unpredictable – a practical way to accomplish this is just to randomize it each time. IVs do not have to be kept secret, but it’s typical to obfuscate them in some way.
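To make the role of the IV concrete, here is a minimal Python sketch of only the first step of CBC mode, where the IV is XORed into the first plaintext block before the block cipher runs. The AES step itself is left out, and the helper names (new_iv, cbc_first_block_input) are made up for illustration rather than taken from any library.

```python
import os

BLOCK_SIZE = 16  # AES block size in bytes


def new_iv():
    """Return a fresh, unpredictable 16-byte IV from the OS random source."""
    return os.urandom(BLOCK_SIZE)


def cbc_first_block_input(plaintext_block, iv):
    """XOR the IV into the first plaintext block, as CBC does before encrypting."""
    if len(plaintext_block) != BLOCK_SIZE or len(iv) != BLOCK_SIZE:
        raise ValueError("block and IV must both be 16 bytes")
    return bytes(p ^ v for p, v in zip(plaintext_block, iv))


message = b"usernamepassword"  # exactly one 16-byte block
iv_a, iv_b = new_iv(), new_iv()
input_a = cbc_first_block_input(message, iv_a)
input_b = cbc_first_block_input(message, iv_b)

# With different IVs the cipher is fed different inputs for the same message,
# which is why the resulting ciphertexts differ as well.
print(input_a != input_b)
```

Because XOR is its own inverse, the receiver can undo this step once it knows the IV, which is why the IV has to travel with the message.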

While this protects against pattern analysis, it doesn’t help with replay attacks. For example, retransmitting a given set of encrypted data will still duplicate the result. To prevent that, we need to authenticate the sender. We will use a public, pseudorandomly generated session ID for each message. This session ID can be generated by the receiving device by posting to an MQTT topic.
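As a rough sketch of the idea, the receiving side could generate session IDs along these lines in Python. The function names, the 16-byte ID length, and the sample message are assumptions chosen for illustration; the point is only that an authentication tag computed under one session ID does not match the tag expected under the next one.

```python
import binascii
import hashlib
import hmac
import os

PASSPHRASE = b"mypassphrase"  # shared secret, placeholder value


def new_session_id(num_bytes=16):
    """Return a public, random session ID as a hex string."""
    return binascii.hexlify(os.urandom(num_bytes)).decode("ascii")


def tag(message, session_id):
    """HMAC-SHA1 over the message plus the session ID it was sent under."""
    data = message + session_id.encode("ascii")
    return hmac.new(PASSPHRASE, data, hashlib.sha1).hexdigest()


# The sender authenticates a command under the current session ID.
session_1 = new_session_id()
captured_tag = tag(b"stove=on", session_1)

# The receiver then issues a new session ID, so a captured tag stops verifying.
session_2 = new_session_id()
replay_accepted = captured_tag == tag(b"stove=on", session_2)
print(replay_accepted)
```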

Preventing these types of attacks is important in a couple of common use cases. Internet controlled stoves exist, and questionable utility aside, it would be nice if they didn’t use insecure commands. Secondly, if I’m datalogging from a hundred sensors, I don’t want anyone filling my database with garbage.

Practical Encryption

Implementing the above on the NodeMCU requires some effort. You will need firmware compiled to include the ‘crypto’ module in addition to any others you require for your application. SSL support is not required.

First, let’s assume you’re connected to an MQTT broker with something like the following. You can implement this as a separate function from the cryptography to keep things clean. The client subscribes to a sessionID channel, which publishes suitably long, pseudorandom session IDs. You could encrypt them, but it’s not necessary.

m = mqtt.Client("clientid", 120)

-- connect to the broker, then subscribe to the session ID topic
m:connect("myserver.com", 1883, 0,
  function(client)
    print("connected")
    client:subscribe("mytopic/sessionID", 0,
      function(client) print("subscribe success") end
    )
  end,
  function(client, reason)
    print("failed reason: " .. reason)
  end
)

-- keep the most recent session ID so it can be used in the HMAC later
m:on("message", function(client, topic, data) sessionID = data end)

Moving on, the node ID is a convenient way to help identify data sources. You can use any string you wish though: nodeid = node.chipid().

Then, we set up a static initialization vector and a key. These are only used to obfuscate the randomized initialization vector sent with each message, NOT for any data. We also choose a separate key for the data. These keys are 16-character strings (128 bits); just replace them with yours.

Finally we’ll need a passphrase for a hash function we’ll be using later. A string of reasonable length is fine.

staticiv = "abcdef2345678901"
ivkey = "2345678901abcdef"
datakey = "0123456789abcdef"
passphrase = "mypassphrase"
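AES-128 in CBC mode needs a key and an IV of exactly 16 bytes each, so it can be worth checking your own replacement values before flashing them. The following Python sketch does that on the desktop side; check_aes128_material is a hypothetical helper name, and the values are the placeholders from this article.

```python
# Placeholder values copied from the example; replace them with your own.
staticiv = "abcdef2345678901"
ivkey = "2345678901abcdef"
datakey = "0123456789abcdef"


def check_aes128_material(name, value):
    """Return the size in bits, or raise if the value is not exactly 16 bytes."""
    raw = value.encode("ascii")
    if len(raw) != 16:
        raise ValueError("%s must be 16 bytes, got %d" % (name, len(raw)))
    return len(raw) * 8


for name, value in [("staticiv", staticiv), ("ivkey", ivkey), ("datakey", datakey)]:
    print(name, check_aes128_material(name, value), "bits")
```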

We’ll also assume you have some source of data. For this example it will be a value read from the ADC. data = adc.read(0)

Now, we generate a pseudorandom initialization vector. A 16-digit hex number is too large for the pseudorandom number function, so we generate it in two halves (16^8 minus 1) and concatenate them.

half1 = node.random(4294967295)
half2 = node.random(4294967295)
-- %08x zero-pads each half to exactly eight hex digits
I = string.format("%08x", half1)
V = string.format("%08x", half2)
iv = I .. V
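For comparison, the same two-halves construction can be sketched in Python, for instance when generating test vectors on a desktop machine. This version uses the standard secrets module as the random source, which is an assumption and not what the NodeMCU uses, and it zero-pads each half with "%08x" so the result is always 16 hex digits.

```python
import secrets


def random_hex_iv():
    """Build a 16-character hex IV from two 32-bit random halves."""
    half1 = secrets.randbelow(2**32)  # 0 .. 4294967295
    half2 = secrets.randbelow(2**32)
    # "%08x" zero-pads each half to eight hex digits.
    return ("%08x" % half1) + ("%08x" % half2)


iv = random_hex_iv()
print(iv, len(iv))

# Padding matters for small values: "%8x" pads with spaces, "%08x" with zeros.
print(repr("%8x" % 255))   # '      ff'
print(repr("%08x" % 255))  # '000000ff'
```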

We can now run the actual encryption. Here we are encrypting the current initialization vector, the node ID, and one piece of sensor data.

encrypted_iv = crypto.encrypt("AES-CBC", ivkey, iv, staticiv)
encrypted_nodeid = crypto.encrypt("AES-CBC", datakey, nodeid, iv)
encrypted_data = crypto.encrypt("AES-CBC", datakey, data, iv)

Now we apply the hash function for authentication. First we combine the nodeid, iv, data, and session ID into a single message, then compute an HMAC-SHA1 hash using the passphrase we defined earlier. We convert it to hex to make it a bit more human-readable for any debugging.

fullmessage = nodeid .. iv .. data .. sessionID
hmac = crypto.toHex(crypto.hmac("sha1", fullmessage, passphrase))
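The same computation can be sketched with Python's standard hmac module, which is handy for checking what a node should be sending. The function name and all sample values below are made up for illustration; only the field order (nodeid, iv, data, session ID) and the use of HMAC-SHA1 keyed with the passphrase follow the Lua above.

```python
import hashlib
import hmac


def message_hmac(nodeid, iv, data, session_id, passphrase):
    """Return the hex HMAC-SHA1 of nodeid + iv + data + session_id."""
    fullmessage = "".join([str(nodeid), iv, str(data), session_id])
    mac = hmac.new(passphrase.encode("ascii"),
                   fullmessage.encode("ascii"),
                   hashlib.sha1)
    return mac.hexdigest()


# All values below are placeholders, not real device output.
digest = message_hmac(1234567, "0123abcd4567ef89", 512, "a1b2c3d4", "mypassphrase")
print(digest, len(digest))
```

A SHA1-based HMAC is 20 bytes long, so the hex form is always 40 characters, matching the last field of the sample output in this article.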

Now that both encryption and authentication checks are in place, we can place all this information in some structure and send it. Here, we’ll use comma separated values as it’s convenient:

payload = table.concat({encrypted_iv, encrypted_nodeid, encrypted_data, hmac}, ",")
m:publish("yourMQTTtopic", payload, 2, 1, function(client) print("Sent") end)

Running the above code on an actual NodeMCU produces output something like this:

1d54dd1af0f75a91a00d4dcd8f4ad28d,
d1a0b14d187c5adfc948dfd77c2b2ee5,
564633a4a053153bcbd6ed25370346d5,
c66697df7e7d467112757c841bfb6bce051d6289
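One thing to keep in mind with comma-separated fields is that raw ciphertext can contain any byte value, including a comma, so hex-encoding the encrypted fields keeps the format unambiguous. The Python sketch below shows one way to build and split such a payload; build_payload and parse_payload are hypothetical helpers, and dummy bytes stand in for real ciphertext.

```python
import binascii


def build_payload(encrypted_iv, encrypted_nodeid, encrypted_data, hmac_hex):
    """Hex-encode the three binary fields and join all four with commas."""
    fields = [binascii.hexlify(f).decode("ascii")
              for f in (encrypted_iv, encrypted_nodeid, encrypted_data)]
    fields.append(hmac_hex)
    return ",".join(fields)


def parse_payload(payload):
    """Split a payload into three binary fields plus the hex HMAC string."""
    parts = payload.split(",")
    if len(parts) != 4:
        raise ValueError("expected 4 comma-separated fields, got %d" % len(parts))
    iv, nodeid, data = (binascii.unhexlify(p) for p in parts[:3])
    return iv, nodeid, data, parts[3]


# Dummy 16-byte values stand in for real ciphertext here.
payload = build_payload(b"\x01" * 16, b"\x02" * 16, b"\x03" * 16, "ab" * 20)
print(payload)
print(parse_payload(payload)[3] == "ab" * 20)
```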

All together, the encryption program is as follows (MQTT sections excluded for clarity):

nodeid = node.chipid()
staticiv = "abcdef2345678901"
ivkey = "2345678901abcdef"
datakey = "0123456789abcdef"
passphrase = "mypassphrase"

-- read the sensor and build a fresh 16-character hex IV
data = adc.read(0)
half1 = node.random(4294967295)
half2 = node.random(4294967295)
I = string.format("%08x", half1)
V = string.format("%08x", half2)
iv = I .. V

-- encrypt the IV with the static IV/key, and the data with the fresh IV
encrypted_iv = crypto.encrypt("AES-CBC", ivkey, iv, staticiv)
encrypted_nodeid = crypto.encrypt("AES-CBC", datakey, nodeid, iv)
encrypted_data = crypto.encrypt("AES-CBC", datakey, data, iv)

-- authenticate the plaintext fields together with the current session ID
fullmessage = nodeid .. iv .. data .. sessionID
hmac = crypto.toHex(crypto.hmac("sha1", fullmessage, passphrase))
payload = table.concat({encrypted_iv, encrypted_nodeid, encrypted_data, hmac}, ",")

Decryption

Now, your MQTT broker doesn’t know or care that the data is encrypted; it just passes it on. So, your other MQTT clients subscribed to the topic will need to know how to decrypt the data. On NodeMCU this is rather easy. Just split the received data into strings via the commas, and do something like the below. Note that this end will have generated the session ID, so it already knows it.

staticiv = "abcdef2345678901"
ivkey = "2345678901abcdef"
datakey = "0123456789abcdef"
passphrase = "mypassphrase"

iv = crypto.decrypt("AES-CBC", ivkey, encrypted_iv, staticiv)
nodeid = crypto.decrypt("AES-CBC", datakey, encrypted_nodeid, iv)
data = crypto.decrypt("AES-CBC", datakey, encrypted_data, iv)
-- note: decrypted strings may carry trailing zero-byte padding; strip it before hashing
fullmessage = nodeid .. iv .. data .. sessionID
hmac = crypto.toHex(crypto.hmac("sha1", fullmessage, passphrase))

Then compare the received and computed HMAC, and regardless of the result, invalidate that session ID by generating a new one.
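The check-then-invalidate step might look roughly like this in Python. SessionVerifier is a hypothetical class written for this sketch, not part of any MQTT or crypto library, and publishing the new session ID to the broker is left out. It assumes the HMAC covers the plaintext fields followed by the session ID, as in the Lua code.

```python
import binascii
import hashlib
import hmac
import os


class SessionVerifier:
    """Checks one message per session ID, then replaces the ID either way."""

    def __init__(self, passphrase):
        self.passphrase = passphrase
        self.session_id = self._new_id()

    @staticmethod
    def _new_id():
        return binascii.hexlify(os.urandom(16)).decode("ascii")

    def verify(self, fields, received_hmac_hex):
        """Return True if the HMAC matches; always rotate the session ID."""
        data = fields + self.session_id.encode("ascii")
        expected = hmac.new(self.passphrase, data, hashlib.sha1).hexdigest()
        ok = hmac.compare_digest(expected, received_hmac_hex)
        self.session_id = self._new_id()  # invalidate regardless of the result
        return ok


verifier = SessionVerifier(b"mypassphrase")
sid = verifier.session_id
good = hmac.new(b"mypassphrase", b"node1iv42" + sid.encode("ascii"),
                hashlib.sha1).hexdigest()
first = verifier.verify(b"node1iv42", good)   # genuine message, current session
second = verifier.verify(b"node1iv42", good)  # same bytes replayed afterwards
print(first, second)
```

Rotating the ID even after a failed check means an attacker cannot keep guessing against the same session ID.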

Once More, in Python

For a little variety, consider how we would handle decryption in Python, if we had an MQTT client on the same virtual machine as the broker that was analysing the data or storing it in a database. Let’s assume you’ve received the data as a string “payload”, from something like the excellent Paho MQTT Client for Python.

In this case it’s convenient to hex encode the encrypted data on the NodeMCU before transmitting. So on the NodeMCU we convert all encrypted data to hex, for example: encrypted_iv = crypto.toHex(crypto.encrypt("AES-CBC", ivkey, iv, staticiv))

Publishing a randomized session ID is not discussed below, but is easy enough using os.urandom() and the Paho MQTT Client. The decryption is handled as follows:

from Crypto.Cipher import AES
import binascii
from Crypto.Hash import SHA, HMAC

# define all keys (byte strings, 16 bytes each for the AES key and IV)
ivkey = b'2345678901abcdef'
datakey = b'0123456789abcdef'
staticiv = b'abcdef2345678901'
passphrase = b'mypassphrase'

# convert the received string to a list
data = payload.split(",")

# extract list items
encrypted_iv = binascii.unhexlify(data[0])
encrypted_nodeid = binascii.unhexlify(data[1])
encrypted_data = binascii.unhexlify(data[2])
received_hash = data[3]  # left as hex so it can be compared with hexdigest() below

# decrypt the initialization vector
iv_decryption_suite = AES.new(ivkey, AES.MODE_CBC, staticiv)
iv = iv_decryption_suite.decrypt(encrypted_iv)

# decrypt the data using the initialization vector
# (decrypted values may carry trailing zero-byte padding from the sender)
id_decryption_suite = AES.new(datakey, AES.MODE_CBC, iv)
nodeid = id_decryption_suite.decrypt(encrypted_nodeid)
data_decryption_suite = AES.new(datakey, AES.MODE_CBC, iv)
sensordata = data_decryption_suite.decrypt(encrypted_data)

# compute hash function to compare to received_hash
# (sessionID is the byte string this client published earlier)
fullmessage = b"".join([nodeid, iv, sensordata, sessionID])
hmac = HMAC.new(passphrase, fullmessage, SHA)
computed_hash = hmac.hexdigest()

# see docs.python.org/2/library/hmac.html for how to compare hashes securely
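One detail worth handling before the hashes are compared: as far as we understand the NodeMCU crypto module, it zero-pads plaintext up to the 16-byte block size, so the decrypted node ID and sensor value arrive with trailing zero bytes that were not part of the string the node hashed. The sketch below, written for Python 3, shows the effect using stand-in values; strip_zero_padding is a hypothetical helper, and it assumes the real data never legitimately ends in a zero byte.

```python
import hashlib
import hmac


def strip_zero_padding(decrypted):
    """Drop trailing zero bytes that were added to fill the last 16-byte block."""
    return decrypted.rstrip(b"\x00")


# Stand-ins for values returned by the decrypt() calls.
nodeid_padded = b"1234567" + b"\x00" * 9      # 16 bytes after decryption
sensordata_padded = b"512" + b"\x00" * 13     # 16 bytes after decryption

nodeid = strip_zero_padding(nodeid_padded)
sensordata = strip_zero_padding(sensordata_padded)

key = b"mypassphrase"
with_padding = hmac.new(key, nodeid_padded + sensordata_padded, hashlib.sha1).hexdigest()
without_padding = hmac.new(key, nodeid + sensordata, hashlib.sha1).hexdigest()

print(nodeid, sensordata)
print(with_padding == without_padding)
```

Since the two digests differ, hashing the padded values would make every genuine message fail verification.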

The End, the Beginning

We now have a system that sends encrypted, authenticated messages through an MQTT server to either another ESP8266 client or a larger system running Python. There are still important loose ends for you to tie up if you implement this yourself. The keys are all stored in the ESP8266s’ flash memory, so you will want to control access to these devices to prevent reverse engineering. The keys are also stored in the code on the computer receiving the data, here running Python. Further, you probably want each client to have a different key and passphrase. That’s a lot of secret material to keep safe and potentially update when necessary. Solving the key distribution problem is left as an exercise for the motivated reader.

And on a closing note, one of the dreadful things about writing an article involving cryptography is the possibility of being wrong on the Internet. This is a fairly straightforward application of the tested-and-true AES-CBC mode with HMAC, so it should be pretty solid. Nonetheless, if you find any interesting shortcomings in the above, please let us know in the comments.
