Hallvard B Furuseth wrote:
> We can drop the new CRC code in back-ldif in favor of the stronger MD5 from liblutil. They have the same speed on my host: CRC is simpler, but memory-bound. CRC is guaranteed to catch certain transmission errors, such as a few wrong bits, but this comes at the expense of its quality as a general hash function.
> Or we could use a slower SHA function, but I figure MD5 is already stronger than we need.
I had considered MD5 before (especially since we already had code for it) but it was slower, and we're not looking for cryptographic assurances or hash distribution anyway. Basically all of these crypto hash functions are overkill, in terms of hash size and computation. We're only looking to detect casual misuse or corruption, not malicious deception.
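To make the size and purpose argument concrete, here is a minimal sketch in Python rather than the back-ldif C code. It assumes zlib's CRC-32 as a stand-in for the CRC in back-ldif (which may use a different polynomial or width) and hashlib's MD5 as a stand-in for the liblutil implementation; the LDIF text is made up. It flips one bit in an entry and shows that both checksums notice, while the CRC needs 4 bytes and MD5 needs 16.

```python
import hashlib
import zlib

# Made-up LDIF text, purely for illustration.
entry = b"dn: cn=example,dc=example,dc=com\ncn: example\n"

# Flip a single bit to simulate accidental corruption.
damaged = bytearray(entry)
damaged[10] ^= 0x01
damaged = bytes(damaged)

crc_ok = zlib.crc32(entry)
crc_bad = zlib.crc32(damaged)
md5_ok = hashlib.md5(entry).digest()
md5_bad = hashlib.md5(damaged).digest()

crc_size = 4            # CRC-32 is a 32-bit value
md5_size = len(md5_ok)  # MD5 digests are 16 bytes

print("CRC-32 detects the flip:", crc_ok != crc_bad, "using", crc_size, "bytes")
print("MD5 detects the flip:   ", md5_ok != md5_bad, "using", md5_size, "bytes")
```

A single flipped bit is one of the error classes a CRC always detects, so for casual corruption the smaller checksum loses nothing here.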
I didn't really spend a lot of time comparing the two functions' speed. But even with the memory access bottleneck, I would guess that on a loaded system with many threads running, the algorithm with fewer instructions is the better choice. Have you measured the throughput when multiple threads are executing?
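One rough way to check this is sketched below, in Python rather than against the real slapd code. The helper names (`crc32_sum`, `md5_sum`, `throughput`), the buffer size, and the thread counts are all assumptions for illustration, and zlib/hashlib stand in for the back-ldif CRC and liblutil MD5. As far as I know, CPython releases its global lock inside both calls for buffers this large, so the threads should run concurrently, but the absolute numbers will not match what the C implementations do.

```python
import hashlib
import threading
import time
import zlib


def crc32_sum(buf):
    """Checksum a buffer with zlib's CRC-32 (stand-in for the back-ldif CRC)."""
    return zlib.crc32(buf)


def md5_sum(buf):
    """Digest a buffer with MD5 (stand-in for the liblutil MD5)."""
    return hashlib.md5(buf).digest()


def throughput(func, nthreads, buf, rounds):
    """Run func over buf `rounds` times in each of nthreads threads.

    Returns the aggregate rate in MB/s (10**6 bytes per second).
    """
    def worker():
        for _ in range(rounds):
            func(buf)

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return nthreads * rounds * len(buf) / elapsed / 1e6


if __name__ == "__main__":
    data = bytes(1 << 20)  # 1 MiB of zeros; real entries are far smaller
    for n in (1, 2, 4, 8):  # arbitrary thread counts
        crc_rate = throughput(crc32_sum, n, data, 50)
        md5_rate = throughput(md5_sum, n, data, 50)
        print(f"{n} threads: CRC-32 {crc_rate:.0f} MB/s, MD5 {md5_rate:.0f} MB/s")
```

Comparing how each rate scales as the thread count grows, rather than the single-thread figures, is what would show whether the memory bottleneck or the instruction count dominates under load.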