SHA1 and AES

As noted in the previous post on big numbers, common protocols use expensive public-key cryptography just to establish a symmetric session key; the payload of the connection is protected via cheaper symmetric-key cryptography. Thus, the next logical thing to implement is a symmetric-key algorithm.

Whether it's protecting a connection or just encrypting some local data, nothing pairs better with a cipher than a hash. Hashes are used to detect accidental corruption or intentional tampering. As you've probably guessed, I'm starting with SHA1 and AES as my hash and cipher. The two algorithms aren't related, but they're equally simple to implement, so I've just lumped them together.

Because both algorithms are already specified, the only real design choice is the API. Both SHA1 and AES operate on fixed-size blocks of data rather than individual bytes. I prefer to hide that detail with some internal buffering (i.e., storing the last incomplete block). The benefit is obvious: callers can pass in any amount of data, rather than remembering the block size for each algorithm and buffering or segmenting data manually.
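The buffering idea can be sketched roughly as follows. This is an illustration, not the code from my crate: `BlockBuffer`, `update`, and the `process` callback are hypothetical names, and the callback stands in for whatever the hash or cipher does with one complete block.

```rust
/// Hypothetical helper: accepts arbitrary amounts of data and hands out
/// complete blocks, keeping the last incomplete block for the next call.
struct BlockBuffer {
    block_size: usize,
    pending: Vec<u8>, // the last incomplete block
}

impl BlockBuffer {
    fn new(block_size: usize) -> Self {
        BlockBuffer { block_size, pending: Vec::with_capacity(block_size) }
    }

    /// Calls `process` once for every complete block that becomes available.
    fn update<F: FnMut(&[u8])>(&mut self, mut data: &[u8], mut process: F) {
        // Top up a previously stored partial block first.
        if !self.pending.is_empty() {
            let needed = self.block_size - self.pending.len();
            let take = needed.min(data.len());
            self.pending.extend_from_slice(&data[..take]);
            data = &data[take..];
            if self.pending.len() < self.block_size {
                return; // still incomplete; wait for more data
            }
            process(&self.pending);
            self.pending.clear();
        }
        // Process whole blocks straight from the caller's slice.
        let mut chunks = data.chunks_exact(self.block_size);
        for block in &mut chunks {
            process(block);
        }
        // Store whatever is left over for the next call.
        self.pending.extend_from_slice(chunks.remainder());
    }
}

fn main() {
    let mut buffer = BlockBuffer::new(16);
    let mut blocks: Vec<Vec<u8>> = Vec::new();
    // Three calls of awkward sizes: 10 + 10 + 28 = 48 bytes in total.
    buffer.update(&[1u8; 10], |b| blocks.push(b.to_vec()));
    buffer.update(&[2u8; 10], |b| blocks.push(b.to_vec()));
    buffer.update(&[3u8; 28], |b| blocks.push(b.to_vec()));
    println!("{} full blocks, {} bytes pending", blocks.len(), buffer.pending.len());
}
```

The caller never sees the block size; it only decides when the stream is finished, at which point the algorithm pads whatever is still pending.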

SHA1 is widely documented, including pseudocode in RFCs and on Wikipedia. Converting this to Rust posed no significant challenge, as no ownership issues arise from processing a blob of bytes. SHA1 will come in handy for various protocols (SSH, Git, etc.) in the future, but for now it's just a simple command-line tool.

    > cat hello.txt
    hello
    > cargo run ./hello.txt
        Finished dev [unoptimized + debuginfo] target(s) in 0.01s
         Running `target\debug\sha1.exe ./hello.txt`
    f572d396fae9206628714fb2ce00f72e94f2258f
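For reference, the published pseudocode translates into Rust along these lines. This is a compact one-shot sketch rather than my streaming implementation: the function names `sha1` and `to_hex` are made up for the example, and the whole message is padded in memory instead of being buffered block by block.

```rust
/// One-shot SHA-1 following the standard pseudocode (sketch, not streaming).
fn sha1(data: &[u8]) -> [u8; 20] {
    let mut h: [u32; 5] = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0];

    // Padding: a 0x80 byte, zeros up to 56 mod 64, then the bit length as a big-endian u64.
    let mut msg = data.to_vec();
    msg.push(0x80);
    while msg.len() % 64 != 56 {
        msg.push(0);
    }
    msg.extend_from_slice(&((data.len() as u64) * 8).to_be_bytes());

    for block in msg.chunks_exact(64) {
        // Expand the 16 big-endian input words into an 80-word schedule.
        let mut w = [0u32; 80];
        for i in 0..16 {
            w[i] = u32::from_be_bytes([block[4 * i], block[4 * i + 1], block[4 * i + 2], block[4 * i + 3]]);
        }
        for i in 16..80 {
            w[i] = (w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]).rotate_left(1);
        }

        let [mut a, mut b, mut c, mut d, mut e] = h;
        for i in 0..80 {
            // Round function and constant change every 20 rounds.
            let (f, k) = match i {
                0..=19 => ((b & c) | (!b & d), 0x5A827999u32),
                20..=39 => (b ^ c ^ d, 0x6ED9EBA1),
                40..=59 => ((b & c) | (b & d) | (c & d), 0x8F1BBCDC),
                _ => (b ^ c ^ d, 0xCA62C1D6),
            };
            let temp = a
                .rotate_left(5)
                .wrapping_add(f)
                .wrapping_add(e)
                .wrapping_add(k)
                .wrapping_add(w[i]);
            e = d;
            d = c;
            c = b.rotate_left(30);
            b = a;
            a = temp;
        }

        // Fold this block's result into the running state.
        h[0] = h[0].wrapping_add(a);
        h[1] = h[1].wrapping_add(b);
        h[2] = h[2].wrapping_add(c);
        h[3] = h[3].wrapping_add(d);
        h[4] = h[4].wrapping_add(e);
    }

    let mut digest = [0u8; 20];
    for (i, word) in h.iter().enumerate() {
        digest[4 * i..4 * i + 4].copy_from_slice(&word.to_be_bytes());
    }
    digest
}

/// Formats bytes as lowercase hex.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

fn main() {
    // Assumption: hello.txt holds "hello" followed by a single newline.
    println!("{}", to_hex(&sha1(b"hello\n")));
}
```

Under that assumption about the file's contents, this should print the same digest as the run above. The standard test vectors (the empty string and "abc") are a quick sanity check for any SHA1 implementation.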

AES is just as well documented, so the implementation involves little guesswork. AES differs from SHA1 in that data flows “through” it: SHA1 consumes data and updates a fixed-size internal state (the hash), while AES consumes data and emits all of it again, plus padding (the padded ciphertext). This makes AES a good candidate for implementing the std::io::Write and std::io::Read traits for encrypting and decrypting.

Wrapping the cipher in a stream provided a nice place to buffer incomplete blocks, but also presented my first Rust ownership question: should the reader/writer own the underlying stream, or merely reference it? Both ways make it possible to re-use the underlying stream: if it's borrowed, the mutable reference is released when the reader/writer disappears; if it's owned, a method can be exposed to move the underlying stream out of the reader/writer before it disappears.
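To make the two options concrete, here is a small sketch of both. `OwningWriter` and `BorrowingWriter` are hypothetical names, and both simply pass bytes through; the pass-through stands in for the encryption step so that only the ownership difference is visible. The `into_inner` method is similar in spirit to the one on the standard library's `BufWriter`.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper that owns the underlying stream.
struct OwningWriter<T: Write> {
    destination: T,
}

impl<T: Write> OwningWriter<T> {
    fn new(destination: T) -> Self {
        OwningWriter { destination }
    }

    /// Moves the underlying stream back out, consuming the wrapper.
    fn into_inner(self) -> T {
        self.destination
    }
}

impl<T: Write> Write for OwningWriter<T> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.destination.write(buf) // pass-through stands in for encryption
    }
    fn flush(&mut self) -> io::Result<()> {
        self.destination.flush()
    }
}

/// Hypothetical wrapper that only borrows the underlying stream.
struct BorrowingWriter<'a, T: Write> {
    destination: &'a mut T,
}

impl<'a, T: Write> Write for BorrowingWriter<'a, T> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.destination.write(buf) // pass-through stands in for encryption
    }
    fn flush(&mut self) -> io::Result<()> {
        self.destination.flush()
    }
}

fn main() -> io::Result<()> {
    // Owned: the stream moves in, and is moved back out explicitly.
    let mut owned = OwningWriter::new(Vec::new());
    owned.write_all(b"abc")?;
    let mut stream = owned.into_inner();

    // Borrowed: the mutable reference is released when the wrapper goes away.
    {
        let mut borrowed = BorrowingWriter { destination: &mut stream };
        borrowed.write_all(b"def")?;
    }

    // Either way, the underlying stream is usable again afterwards.
    stream.write_all(b"ghi")?;
    println!("{}", String::from_utf8_lossy(&stream));
    Ok(())
}
```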

Because both ways work, and I don't have any other criteria to narrow my choices, I chose the way that felt right: the reader/writer uses the underlying stream but doesn't own it. For example, when reading an encrypted network connection, the reader doesn't logically own the connection to the server; it merely transforms a subset of the data.

pub struct EncryptingWriter<'a, T: Write> {
    cipher: Cipher,
    block: Vec<u8>,
    destination: &'a mut T,
}

Some later real-world use will reveal if I chose incorrectly.
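A rough sketch of how a struct with these fields could implement `Write` is below. Several things here are assumptions made for the example and not taken from my code: the `Cipher` is a XOR stand-in that is not AES and offers no security, `new`, `emit_block`, and `finish` are hypothetical method names, and PKCS#7 is assumed as the padding scheme.

```rust
use std::io::{self, Write};

const BLOCK_SIZE: usize = 16; // AES block size in bytes

/// Stand-in for the real cipher: XOR is NOT AES and offers no security.
pub struct Cipher {
    key: u8,
}

impl Cipher {
    fn encrypt_block(&self, block: &mut [u8]) {
        for byte in block.iter_mut() {
            *byte ^= self.key;
        }
    }
}

pub struct EncryptingWriter<'a, T: Write> {
    cipher: Cipher,
    block: Vec<u8>,
    destination: &'a mut T,
}

impl<'a, T: Write> EncryptingWriter<'a, T> {
    pub fn new(cipher: Cipher, destination: &'a mut T) -> Self {
        EncryptingWriter { cipher, block: Vec::with_capacity(BLOCK_SIZE), destination }
    }

    /// Encrypts the buffered block in place and forwards it downstream.
    fn emit_block(&mut self) -> io::Result<()> {
        self.cipher.encrypt_block(&mut self.block);
        self.destination.write_all(&self.block)?;
        self.block.clear();
        Ok(())
    }

    /// Pads the final block (PKCS#7 assumed) and writes it out.
    pub fn finish(mut self) -> io::Result<()> {
        let pad = BLOCK_SIZE - self.block.len(); // always 1..=16
        self.block.resize(BLOCK_SIZE, pad as u8);
        self.emit_block()?;
        self.destination.flush()
    }
}

impl<'a, T: Write> Write for EncryptingWriter<'a, T> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Take only what fits in the current block; `write_all` loops for the rest.
        let take = (BLOCK_SIZE - self.block.len()).min(buf.len());
        self.block.extend_from_slice(&buf[..take]);
        if self.block.len() == BLOCK_SIZE {
            self.emit_block()?;
        }
        Ok(take)
    }

    fn flush(&mut self) -> io::Result<()> {
        // An incomplete block cannot be emitted until `finish` pads it.
        self.destination.flush()
    }
}

fn main() -> io::Result<()> {
    let mut output: Vec<u8> = Vec::new();
    let mut writer = EncryptingWriter::new(Cipher { key: 0x5A }, &mut output);
    writer.write_all(b"twenty bytes of data")?;
    writer.finish()?;
    // The borrow ended with `finish`, so `output` is usable again.
    println!("20 bytes in, {} bytes out", output.len());
    Ok(())
}
```

Because `finish` takes the writer by value, the mutable borrow of the destination ends there, which is what lets the caller read `output` afterwards without any explicit hand-back step.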

I didn't bother making a CLI for this one; I often want to compute the SHA1 of a file, but I never really want to perform bare AES from the command line. With that, the code seems to work, and the tests pass, so I'll just have to wait for some real-world use cases to see if it all holds up.

Rust is proving to be quite straightforward so far. The most significant part of the learning curve — ownership — seems to work quite intuitively for data-processing libraries like AES, SHA, and the big number implementation. My use of ownership so far just seems like familiar old RAII with most lifetimes inferred automatically. I suspect more complex ownership challenges will arise when I start constructing applications with events and state.