Blog | G5 Cyber Security

Password Cracking with Markov Chains & HMMs

TL;DR

Markov chains and Hidden Markov Models (HMMs) can be surprisingly effective for password auditing and cracking, especially against weaker passwords or those following common patterns. They work by learning the probability of character sequences from a sample dataset (e.g., previously cracked passwords). This guide explains how to use them practically.

1. Understanding the Basics

Before we start, let’s quickly cover what these models are:

2. Gathering Password Data

You’ll need a dataset of cracked or known passwords to train your model. Larger, more diverse datasets are better. Sources include:

3. Building a Markov Chain

Here’s how you can build a simple Markov chain in Python:

import collections

def build_markov_chain(passwords):
    chain = collections.defaultdict(list)
    for password in passwords:
        for i in range(len(password) - 1):
            chain[password[i]].append(password[i+1])
    return chain

# Example Usage
passwords = ['password', 'passw0rd', 'p@ssword']
markov_chain = build_markov_chain(passwords)
print(markov_chain)

This code creates a dictionary where keys are characters and values are lists of characters that follow them in the training data. The more passwords you feed it, the better the probabilities will be.

4. Generating Passwords from the Markov Chain

Now let’s generate potential passwords:

import random

def generate_password(chain, length):
    start_char = random.choice(list(chain.keys())) # Start with a random character
    password = [start_char]
    for _ in range(length - 1):
        next_chars = chain[password[-1]]
        if next_chars:
            next_char = random.choice(next_chars)
            password.append(next_char)
        else:
            break # Stop if no possible next characters
    return ''.join(password)

# Example Usage
length = 8
generated_password = generate_password(markov_chain, length)
print(generated_password)

This function starts with a random character and iteratively adds characters based on the probabilities learned from the chain. It stops when it can’t find a valid next character.

5. Using Hidden Markov Models (HMMs)

HMMs are more complex to implement but offer better results for structured passwords. You’ll need a library like hmmlearn:

A full HMM example is beyond the scope of this guide due to its complexity, but the hmmlearn documentation provides excellent tutorials.

6. Password Auditing Tools

Several tools integrate Markov chain-based password cracking:

7. Important Considerations

Exit mobile version