Google CTF 2018: Crypto - MITM

Posted on 01 Jul 2018, 10:00 in WriteUps

Challenge description

The challenge was called MITM and as confirmed by the small description given ("Man in the Middle the communication between the client and the server.") we will need to execute a Man-In-The-Middle (MITM) attack There are two more informations given: a zip file containing a python source code and a netcat command line (nc mitm.ctfcompetition.com 1337) to reach the challenge-server where the code is running.


Before trying to find where is the flaw, the general approach used is to answer the following questions:

  • What kind of python modules are used?
  • What is the role of each function?
  • How does the interaction with the challenge-server work?

Python modules used

The source code starts by importing a bunch of modules. Some are just helpers to manage input and outputs (not really interesting):

from binascii import hexlify
from binascii import unhexlify
import logging
import sys
import os

And some are modules implementing cryptographic primitives:

from curve25519 import Private, Public
import nacl.secret
import hmac
import hashlib

After a little bit of research in the source code, hmac and hashlib will be used to define a HMAC based on SHA256, nacl.secret to ensure confidentiality of the data sent between the server and the client (based on a share secret), and curve25519to established this shared secret.

Roles of functions

Apart from helper functions (ReadLine, WriteLine, ReadBin, WriteBin) that are used to manage input and outputs, their are 6 main functions.

ComputeProof

Python source code

def ComputeProof(key, data):
  return hmac.new(key, data, digestmod=hashlib.sha256).digest()

Takes a key and some data as input. Returns the HMAC of the data under the key.

VerifyProof

Python source code

def VerifyProof(key, data, proof):
  return hmac.compare_digest(ComputeProof(key, data), proof)

Takes a key, some data and a proof as input. Returns the result of the verification proof (checks that the proof and the HMAC of the data under the key are equals).

Handshake

Python source code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
def Handshake(password, reader, writer):
  myPrivateKey = Private()
  myNonce = os.urandom(32)

  WriteBin(writer, myPrivateKey.get_public().serialize())
  WriteBin(writer, myNonce)

  theirPublicKey = ReadBin(reader)
  theirNonce = ReadBin(reader)

  if myNonce == theirNonce:
    return None
  if theirPublicKey in (b'\x00'*32, b'\x01' + (b'\x00' * 31)):
    return None

  theirPublicKey = Public(theirPublicKey)

  sharedKey = myPrivateKey.get_shared_key(theirPublicKey)
  myProof = ComputeProof(sharedKey, theirNonce + password)

  WriteBin(writer, myProof)
  theirProof = ReadBin(reader)

  if not VerifyProof(sharedKey, myNonce + password, theirProof):
    return None

  return sharedKey

This function takes a pre-shared password and establish a new shared secret between the caller of the function and another entity (server or client). This handshake is made through a supposed insecure channel and is based on Curve25519. Curve25519's goal is to provide a Diffie-Hellman protocol based on an elliptic curve: each entity generate a random private key PrivA (respectively, PrivB) and derives a public key PubA (respectively, PubB) that are used to generate a secret S = f(PubA, PrivB) = f(PubB, PrivA). This way, only PubA and PubB are send through the insecure channel.

The handshake function used can be described by this diagram:

Handshake diagram

As seen on this diagram, the Handshake function is completely symmetric. Particularly, the challenge-server executing the Handshake function (either as a client or a server) checks the public key and the nonce sent to him (see the source code, line 11 and 13). The Handshake function fails if:

  • The nonce receive is the same as the nonce sent, or;
  • The public key received is all null bytes or is 1 represent in little endian.

As we will see in the solution of the challenge, the check of the public key received is a big hint.

Server

Python source code

def Server(password, flag, reader, writer):
  sharedKey = Handshake(password, reader, writer)
  if sharedKey is None:
    WriteLine(writer, b'Error: nope.')
    return 1

  mySecretBox = nacl.secret.SecretBox(sharedKey)
  WriteBin(writer, mySecretBox.encrypt(b"AUTHENTICATED"))

  while 1:
    cmd = mySecretBox.decrypt(ReadBin(reader))
    if cmd == b'help':
      rsp = b'help|exit|whoami|getflag'
    elif cmd == b'exit':
      return 0
    elif cmd == b'whoami':
      rsp = b'root'
    elif cmd == b'getflag':
      rsp = flag
    else:
      return 1
    WriteBin(writer, mySecretBox.encrypt(rsp))

After negotating a new shared secret with a client using Handshake (with the pre-shared password), every communication to this client is encrypted using this shared secret. If the handshake fails (i.e., returns None), it sends an error message in clear.

The server sends 'AUTHENTICATED' to the client to notify him that the handshake ended successfully. It then accepts some basic commands from the client, including getflag that tells him to send the flag.

Client

Python source code

def Client(password, reader, writer):
  sharedKey = Handshake(password, reader, writer)
  if sharedKey is None:
    WriteLine(writer, b'Error: nope.')
    return 1

  mySecretBox = nacl.secret.SecretBox(sharedKey)
  line = mySecretBox.decrypt(ReadBin(reader))
  if line != b"AUTHENTICATED":
    WriteLine(writer, b'Error: nope.')
    return 1

  WriteBin(writer, mySecretBox.encrypt(b"whoami"))
  line = mySecretBox.decrypt(ReadBin(reader))

  if line != b'root':
    return 1

  WriteBin(writer, mySecretBox.encrypt(b"exit"))
  return 0

After negotating a new shared secret with a server using Handshake(with the pre-shared password), every communication to this server is encrypted using this shared secret. If the handshake fails (i.e., returns None), it sends an error message in clear.

The client expect to receive 'AUTHENTICATED' by the server, which is the sign that the handshake ended successfully. Then it sends some commands to the server and checks the output.

Challenge

Python source code

def Challenge(password, flag, reader, writer):
  try:
    server_or_client = ReadLine(reader)
    is_server = server_or_client[0] in b'sS'
    is_client = server_or_client[0] in b'cC'

    if is_server:
      return Server(password, flag, reader, writer)
    elif is_client:
      return Client(password, reader, writer)
    else:
      WriteLine(writer, b'Error: Select if you want to speak to the (s)erver or (c)lient.')
      return 1
  except Exception as e:
    WriteLine(writer, b'Error')
    return 1

This function is used to communicate with the challenge-server. It is this function that is called by the challenge-server when a TCP connection is established. First, the challenge-server expects an input 's' or 'S' (respectively, 'c' or 'C') representing the fact that the player want to talk to the server (respectively, to the client) and be in the role of the client (respectively, the server). According to this input, the challenge-server calls the corresponding function (complementary role to the one chosen by the ctf-player).

Proposed solution

Now we will see how I manage to capture the flag for this challenge as well as some explanations on how I found this solution.


Basic and unsuccessful attempts

As often with CTF challenges, the biggest hint is the title and description of the challenge. Here we know that we have to make a man-in-the-middle attack on this protocol. Previously we identified the protocol as being a Diffie-Hellman key exchange. The simple Diffie-Hellman key exchange can be quickly summarize by this simple diagram:

Classic Diffie-Hellman

The Diffie-Hellman protocol is vulnerable to a simple man-in-the-middle attack if it is not properly protected by an authentication mechanism. This diagram illustrates this attack:

Man-in-the-middle on classic Diffie-Hellman

At the end, Alex (respectively, Billie) think that the communication with Billie (respectively, Alex) is secure but Charlie can decrypt, read and re-encrypt the messages at will.

But in our case the proof mechanism (secured by a state-of-the-art HMAC) is used to authenticate Alex toward Billie and Billie toward Alex using a pre-shared secret (a password) and using a nonce to avoid replays (sending an intercepted previous proof as a proof for the current session). Thus, the classic man-in-the-middle attack on Diffie-Hellman cannot be used.

The handshake being perfectly symmetric, the attacker (as a client or as a server) can wait that the challenge-server sends him data before continuing. Thus an attack like the one describe in the following diagram can theoretically be made.

Man-in-the-middle based on a replay attack

But because the nonce receive is checked to be different than the one receiveed, this attack will not work in our case.


Even if these attempts were unsuccessful, they are useful to understand why the challenge is not trivial and to come up with an attack that actually works.

Successful attack

One thing was odd in the source code of the Handshake function: during the handshake, the public key received is checked. I started to wonder why is that, so I juste ran the following code to check what was the shared secret generated by the Curve25519 algorithm when used with the public key rejected by Handshake.

from curve25519 import Public, Private
from binascii import hexlify

weird_pubkey1 = b'\x00'*32
weird_pubkey2 = b'\x01' + b'\x00'*31

def sharedsecret(pubkey):
    pubkey_obj = Public(pubkey)
    privkey_obj = Private() # Generate a random new private key each time
    return privkey_obj.get_shared_key(pubkey_obj)

for i in range(5):
    print(hexlify(sharedsecret(weird_pubkey1)))
    print(hexlify(sharedsecret(weird_pubkey2)))

The result was surprising:

$ python2.7 test.py 
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a
68b59f127c671255346e099c3b9ea067a5595ba2adf26daa5e69d6a8a29d191a

Even if the private key changes every time, the computed shared secret is the same. That's a good thing that Handshake reject them! But wait, there is more: according to the website of Curve25519 algorithm, the public key checked are not the only 'bad' public key. And these other 'bad' public keys are not rejected by Handshake. We have our vulnerability: we can force the shared secret to be equal to a known value, independently from the value of the private key.

Thus, the following attack is possible:

Diagram of the actual attack

We establish with both the server and the client the same known shared secret by using a 'bad' public key with both of them. Then, we use Alex to provide a correct proof using the unknown password and the nonce chosen by the server. Then we send to the server the valid proof (because the shared secret between us and the client and between us and the server are the same). Thus, we can now send commands to the server as if we were the client.

By reusing some input-output functions, it is easy to come up with a python program implementing the attack.

Here is mine, letting me interact with the server (and send him arbitrary commands) as if I was the client and without knowing the pre-shared secret (password).

Python source code of solver.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
from challenge import WriteBin, ReadBin
from curve25519 import Private, Public
import os
from binascii import hexlify, unhexlify
import nacl.secret
import socket


weakPublic =  Public(b'\xe0\xebz|;A\xb8\xae\x16V\xe3\xfa\xf1\x9f\xc4j\xda\t\x8d\xeb\x9c2\xb1\xfd\x86b\x05\x16_I\xb8\x00')

class Reader():
    def __init__(self, socket):
        self.s = socket

    def read(self, size):
        return self.s.recv(size)

class Writer():
    def __init__(self, socket):
        self.s = socket

    def write(self, text):
        return self.s.send(text)

    def flush(self):
        None

def InitSocket(toWho):
    s = socket.socket()
    s.connect(("mitm.ctfcompetition.com",1337))
    s.send(toWho+'\n')

    return (Writer(s), Reader(s))

def Attack():

    (writerA, readerA) = InitSocket('c')
    #(writerA, readerA) = InitSocket('s')
    (writerB, readerB) = InitSocket('s')


    myNonce = os.urandom(32)

    WriteBin(writerB, weakPublic.serialize())
    WriteBin(writerB, myNonce)

    pubkeyB = ReadBin(readerB)
    nonceB = ReadBin(readerB)
    proofFromB = ReadBin(readerB)

    WriteBin(writerA, weakPublic.serialize())
    WriteBin(writerA, nonceB)

    pubkeyA = ReadBin(readerA)
    nonceA = ReadBin(readerA)
    proofFromA = ReadBin(readerA)

    proofToB = proofFromA

    WriteBin(writerB, proofToB)

    sharedSecret = Private().get_shared_key(weakPublic)
    mySecretBox = nacl.secret.SecretBox(sharedSecret)

    line = mySecretBox.decrypt(ReadBin(readerB))
    print(line)

    while True:
        cmd = raw_input('Command to send: ')
        WriteBin(writerB, mySecretBox.encrypt(cmd))
        line = mySecretBox.decrypt(ReadBin(readerB))
        print(line)

Attack() 

After running the code:

$ python2.7 solver.py
AUTHENTICATED
Command to send: whoami
root
Command to send: getflag
CTF{kae3eebav8Ac7Mi0RKgh6eeLisuut9oP}
Command to send: exit

Remarks

  • For Curve25519, the public keys that are considered 'bad' are the one corresponding to points with the x-coordinate equal to 0, 1, -1, 325606250916557431795983626356110631294008115727848805560023387167927233504, or 39382357235489614581723060781553021112529911719440698176882885853963445705823. The x-coordinate is taken modulo 2^555 - 19 and every 256-bits strings corresponds to a point, there are in fact 12 different 'bad' public key strings;

  • Even if here it had no impacts, it is generally a bad practice to use the same value as key to compute the HMAC and to encrypt/decrypt further communications;

  • The attack only uses the client as a way to compute the needed proof. This proof is computed in the Handshake function, which is exactly the same when used as a server or as a client. Thus, it doesn't matter if we use the server or the client to compute this proof as shown by uncommenting line 38 of the solver program.