Monday, November 26, 2007

OpenAniDB: Dead Week

So, first things first: Justin and I have reached a very real roadblock on OpenSmash; we need artwork to proceed. We're looking into a couple sources, but for now it's been placed on the back burner.

I got hashing working. Lemme talk about hashes. The hash primarily used by AniDB is called ED2K, after the filesharing program (eDonkey 2000) that it was designed for. The hash is pretty simple; it's a tree hash using MD4 and one layer of leaves. My algorithm is pretty similar to the original.
  1. Read in up to one chunk (9,728,000 bytes, which is 9,500 KiB or about 9.28 MiB), and hash it using MD4. Repeat until end-of-file.
  2. Once end-of-file is reached, concatenate the hashes of the file pieces, and hash the result using MD4.
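For anyone who wants to poke at it, here is a minimal Python sketch of those two steps. It is not the OpenAniDB code. The names `md4`, `ed2k_hash`, and `ED2K_CHUNK` are made up for illustration, and the pure-Python MD4 (following RFC 1320) is only there because `hashlib` does not always ship MD4. Two assumptions go beyond the steps above: a file that fits in a single chunk just uses that chunk's MD4 as its hash, and no extra empty-chunk hash is appended when the file size is an exact multiple of the chunk size (implementations are known to differ on that point).

```python
import io
import struct

ED2K_CHUNK = 9_728_000  # bytes per piece (9,500 KiB)
MASK = 0xFFFFFFFF

# (mixing function, additive constant, rotate amounts, message word order)
_ROUNDS = (
    (lambda b, c, d: (b & c) | (~b & d), 0x00000000, (3, 7, 11, 19),
     tuple(range(16))),
    (lambda b, c, d: (b & c) | (b & d) | (c & d), 0x5A827999, (3, 5, 9, 13),
     (0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15)),
    (lambda b, c, d: b ^ c ^ d, 0x6ED9EBA1, (3, 9, 11, 15),
     (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)),
)


def md4(data: bytes) -> bytes:
    """Pure-Python MD4 (RFC 1320). Slow, but dependency-free."""
    # Pad with 0x80, then zeros up to 56 mod 64, then the bit length (64-bit LE).
    msg = (data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
           + struct.pack("<Q", (8 * len(data)) & 0xFFFFFFFFFFFFFFFF))
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    for off in range(0, len(msg), 64):
        x = struct.unpack("<16I", msg[off:off + 64])
        a, b, c, d = state
        for fn, const, shifts, order in _ROUNDS:
            for i, k in enumerate(order):
                s = shifts[i % 4]
                t = (a + fn(b, c, d) + x[k] + const) & MASK
                a = ((t << s) | (t >> (32 - s))) & MASK
                # Rotate register roles: ABCD -> DABC -> CDAB -> BCDA
                a, b, c, d = d, a, b, c
        state = [(s0 + v) & MASK for s0, v in zip(state, (a, b, c, d))]
    return struct.pack("<4I", *state)


def ed2k_hash(stream, chunk_size: int = ED2K_CHUNK) -> bytes:
    """Hash a binary stream piece by piece, then hash the piece hashes."""
    digests = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digests.append(md4(chunk))
    if len(digests) == 1:
        # Assumption: single-piece files use the piece hash directly.
        return digests[0]
    return md4(b"".join(digests))


if __name__ == "__main__":
    # Tiny chunk size so the two-level structure is visible on short input.
    print(ed2k_hash(io.BytesIO(b"abcdef"), chunk_size=4).hex())
```

On a real file you would pass something like `open(path, "rb")` and leave `chunk_size` at its default; the small chunk size in the demo is only for experimenting.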
My bright idea was to use a fixed-size, chunk-sized buffer (a bit over 9 MiB) on the stack to hold the piece currently being hashed; that way, there would be no memory allocation problems. However, as soon as I declared it, I started getting segmentation faults. Turns out that a buffer of that size is bigger than the default stack limit (commonly around 8 MiB on Linux and 1 MiB on Windows), so the function overflows the stack and segfaults as soon as it touches anything in that frame. Needless to say, that large buffer is now dynamically allocated at the start of hashing and released at the end. Still, that puts our memory usage at about 80 MB less than AOM, so I'm okay with it.
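The "allocate once, reuse for every piece, release at the end" pattern looks roughly like this. It is a Python analogue, not the actual code (Python has no stack buffers to trip over), and `iter_chunks` is a hypothetical helper name. It assumes the stream supports `readinto`, which binary file objects and `io.BytesIO` do.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield successive pieces of a binary stream using one reusable buffer.

    Each yielded memoryview is only valid until the next iteration,
    because the same underlying bytearray is overwritten every time.
    """
    buf = bytearray(chunk_size)  # the single heap allocation
    view = memoryview(buf)
    while True:
        filled = 0
        # readinto may return fewer bytes than asked for, so keep filling
        # until the buffer is full or the stream is exhausted.
        while filled < chunk_size:
            n = stream.readinto(view[filled:])
            if not n:
                break
            filled += n
        if filled == 0:
            return
        yield view[:filled]
        if filled < chunk_size:
            return  # short piece means end-of-file


if __name__ == "__main__":
    for piece in iter_chunks(io.BytesIO(b"abcdefghij"), 4):
        print(bytes(piece))
```

The buffer goes away when the generator is finished or garbage-collected, which plays the role of the release at the end of hashing.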

SQLite internals are getting worked on, bit by bit. The API is completely different from anything else I've ever used, including APSW. Massive amounts of callbacks. It feels like Windows API programming.

Have fun with finals, guys!
~ C.
