Archive for the "Software Engineering" Category

Reading CHM manuals on Mac OS X

Tuesday, November 24th, 2009

Using Mac OS X as my primary development platform is great since I can emulate my LAMP production environment a lot more closely than I ever could developing on Windows.  One thing I missed for a while is being able to load the CHM (Compiled HTML Help, the help format used on Windows) manuals for PHP and MySQL quickly and easily (especially for those great disconnected development sessions while traveling, when looking at the online documentation is not an option).

However, today I finally went searching and discovered several CHM viewers for Mac OS X.

Chmox does a pretty decent job, but the search function wasn’t working for me.

Another option is xCHM with a Mac OS X version available on VersionTracker.  It has the Search functionality and looks pretty much like the Windows CHM Viewer, but is slower and for some really bizarre reason, xCHM only lets you load a single CHM file at a time.

iCHM is an all-around good app, and has a pretty nice tabbed viewing feature, but didn’t seem to be compatible with PHP’s CHM manual (the searching didn’t work) – worked fine with the MySQL CHM manual though.

Finally, there’s ArCHMock which is fast, has a decent search function, and seems to be the winner at least for my needs.

Now I have my CHM files happily sitting on the desktop.  The trip home for Thanksgiving will now be productive, unless the train is so packed that I don’t have room to take out my laptop. :-/

NextJump’s Overwhelming Offers Blackberry App

Monday, November 16th, 2009

My brother Konstantin has been working on a BlackBerry app for NextJump’s Overwhelming Offers site.  Looks like the app got some press coverage today, on PCMag’s TechSaver site.  Way to go Konst, too bad the article doesn’t mention all the hours you’ve poured into that application.  Oh well, the developers behind the product rarely get mentioned ;)

For those of you who don’t know, Overwhelming Offers has daily offers of 50% off or more every Monday through Friday at noon (and some days more than one offer per day).  Definitely check it out for holiday shopping if you haven’t already.  Here’s a tidbit from a press release NextJump did for the app:

Every Monday through Friday at Noon (Eastern Daylight Time), posts one extraordinary offer from Next Jump’s more than 28,000 well-known retail and brand partners. However, on occasion, will surprise shoppers with “Super OO Days” where multiple offers are featured, such as 11 OOs this past November 11. Next Jump is planning many more Super OO Days in November and December. Shoppers can get a hint as to the day’s OO – such as “Mii+Mii=…” (Answer: Wii) – or they can sign up to get a sneak peek in advance via a daily email.

Funny, back when Konstantin was working on the actual OO site, he used to have to write those hints.  Anyways, kudos to Konst for the app getting mentioned.

Explanation of a Flash chip’s storage management routine

Saturday, August 8th, 2009

Hats off to Louis Gerbarg on his excellent write-up of how a Flash chip manages reads, writes, and most interestingly – deletes.  I have a relatively comprehensive understanding of how file systems work from classes back at CMU.  But back when I was taking those classes, Flash storage was basically non-existent in a file system context.  This post is definitely eye-opening and if you’re at all interested in how this stuff works, I highly recommend checking it out.  Here’s a quote that’ll get you interested in this stuff:

Okay, so lets assume for a second we have a 1MB flash device with 2 512KB blocks. This would be sold to the consumer as a 512KB flash drive, because some amount of the internal storage needs to be used for bookkeeping as we shall see.

And another quote:

Flash is a relatively complicated storage medium, and has its own view of the world. It works in terms of pages and blocks. Usually a page is the smallest amount of space you can reasonably read or write to a a flash chip (for our discussion, 4K), and a block is the smallest chunk of space you can erase at a time (for our discussion 128 blocks). With a fresh (unwritten block) all the bits are set to “1”, and during a write they can only be transition to “0.” That means in order to rewrite a page you must erase it first. This is a super important point, you can’t just go and erase a page of the flash, you need to erase the 128 contiguous pages contained in a whole block at a the same time.

And I thought tuning a filesystem to provide the best contiguous access to large files was interesting.  By comparison Flash garbage collection and bookkeeping algorithms must be fascinating.  In any case, I guess I’m just a little nostalgic reminiscing about CMU’s 15-412 Operating Systems course.
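The page-and-block rule from the quote is easier to see in a toy model.  This is only a sketch of the constraint, not how any real flash controller works: the FlashBlock class and its method names are made up for illustration, and I'm assuming 4K pages with 128 pages per block, as in the quote's last sentence.

```python
PAGE_SIZE = 4096        # bytes per page (the quote's 4K)
PAGES_PER_BLOCK = 128   # pages erased together (assumed from the quote)

ERASED_PAGE = b"\xff" * PAGE_SIZE  # a fresh page has every bit set to 1


class FlashBlock:
    """Hypothetical toy model of a single erase block."""

    def __init__(self):
        self.pages = [ERASED_PAGE for _ in range(PAGES_PER_BLOCK)]

    def program(self, page_index, data):
        """Write one page. Bits may only go from 1 to 0."""
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page long")
        current = self.pages[page_index]
        # A bit that is 1 in the new data but 0 on the chip would need a
        # 0 -> 1 transition, which only an erase can provide.
        if any(new & ~cur for cur, new in zip(current, data)):
            raise ValueError("page must be erased before this write")
        self.pages[page_index] = bytes(data)

    def erase(self):
        """Erase is all-or-nothing: every page in the block goes back to 1s."""
        self.pages = [ERASED_PAGE for _ in range(PAGES_PER_BLOCK)]
```

Rewriting page 0 with arbitrary data means calling erase(), which also wipes the other 127 pages.  That is why the chip has to copy live pages elsewhere first, and where all the bookkeeping comes from.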

A different kind of Twitter timeline

Sunday, June 7th, 2009

It’s been about a month since I started using Twitter, and I have to say the thing I’m most disappointed with is the fact that it’s so easy to miss interesting information.  Depending on how busy I am, I may have the time to spend half a day catching up on stuff on Twitter, or I may have the chance to load my Twitter client for just 5 minutes to look for new direct messages or @replies.  Regardless of how much time I put into it, I always wind up missing something, I’m absolutely sure of it.

This is the difference between RSS and Twitter – RSS clients tend to collect everything from a particular feed and preserve it until I either mark things read and ignore them or actually read them.  This is what I want from a Twitter client, except I think there’s a way to make this sort of thing even nicer with Twitter.

I want to be able to see the most relevant tweets in my Friends Timeline.  For instance, for those friends that announce new blog posts, I want to be able to see those – all the time.  I want to prioritize retweets and @replies lower than original posts.  Moreover, I want to prioritize certain friends’ posts higher than others.

Basically, once you follow a certain number of people, the signal-to-noise ratio becomes so low that the chances of you missing out on something really interesting and relevant are just horrendous.

I’m proposing an alternate view of your Twitter feed – a prioritized timeline.  First you’ll need to assign a numerical priority to each of the people you follow.  After that, tweets will be shown in this prioritized timeline ordered by a relevance value.  Let’s say you rank @ivantumanov as 50 and @engadget as 5.  Then a new post from me (posted at the same time as the post from @engadget) will show up first.  After that, the two posts’ positions in this prioritized timeline will decay according to how old they are.  However, the relevance value for my post will decay 10 times slower than the relevance value for @engadget’s post.  In this way, it’ll stay closer to the top of this prioritized timeline longer and thus I’ll be more likely to read it.

Because posts may move around as time goes by (@engadget’s posts will continue falling in relevance and moving down the list while my post will fall 10 times slower), a “seen” flag will need to be shown.  Additionally, a filing system (To Read, To Save, For Reference folders) would be nice.  And finally, thumbtacks that keep a post at a particular place in the timeline – basically freezing its relevance value so it doesn’t decay with time.
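Here is a rough sketch of how that relevance value might be computed.  Everything specific in it is an assumption on my part: the exponential decay, the HOURS_PER_PRIORITY_POINT constant, and the Tweet fields are made up for illustration and have nothing to do with the Twitter API.  The one property it is built to preserve is that a priority-50 author starts higher and decays 10 times slower than a priority-5 author.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed tuning knob: each priority point buys this many hours of half-life.
HOURS_PER_PRIORITY_POINT = 0.5


@dataclass
class Tweet:
    author: str
    text: str
    posted_at: float                   # hours since some arbitrary epoch
    pinned_at: Optional[float] = None  # set when a thumbtack is applied


def relevance(tweet: Tweet, priorities: Dict[str, int], now: float) -> float:
    """Start at the author's priority and halve once per half-life."""
    priority = priorities.get(tweet.author, 1)
    # A thumbtack freezes the clock at the moment the post was pinned.
    clock = tweet.pinned_at if tweet.pinned_at is not None else now
    age = max(0.0, clock - tweet.posted_at)
    half_life = priority * HOURS_PER_PRIORITY_POINT
    return priority * 0.5 ** (age / half_life)


def prioritized_timeline(tweets: List[Tweet], priorities: Dict[str, int],
                         now: float) -> List[Tweet]:
    """Most relevant first."""
    return sorted(tweets, key=lambda t: relevance(t, priorities, now),
                  reverse=True)
```

With priorities of 50 and 5, the half-lives come out to 25 hours and 2.5 hours, which is the 10x difference described above.  A linear decay would work too; exponential just keeps the value from going negative.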

The source of the tweet is obviously the easiest thing to use for determining relevance.  But how about the content of the tweets?  Anything that starts with RT or @ can be assigned a penalty or a promotion that adjusts the relevance value of that post.  Things with “New Blog Post”, for instance, or any other specific string can be assigned a penalty or bonus value.  And for those of us who use Tweetie, anything with “(via …” can be prioritized in a specific way as well.
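The content rules could be as simple as a list of (test, adjustment) pairs.  This is a minimal sketch: the rule list, the content_adjustment name, and every number in it are placeholders I picked for illustration, and which patterns count as a penalty versus a bonus would be up to each user.

```python
# Each rule is (predicate on the tweet text, points added to its relevance).
# All point values are arbitrary placeholders.
CONTENT_RULES = [
    (lambda text: text.startswith("RT"), -10),           # retweets
    (lambda text: text.startswith("@"), -5),             # @replies
    (lambda text: "new blog post" in text.lower(), 20),  # blog announcements
    (lambda text: "(via " in text, -3),                  # Tweetie-style retweets
]


def content_adjustment(text: str) -> int:
    """Sum the adjustments of every rule that matches the tweet text."""
    stripped = text.lstrip()
    return sum(points for matches, points in CONTENT_RULES if matches(stripped))
```

The result would simply be added to (or multiplied into) whatever relevance value the author's priority produces.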

Is there something like this out there already?  If not, I’m gonna have to get my hands dirty and fiddle with the Twitter API some this coming week to make this happen.

A case of mistaken (Kember) Identity

Wednesday, May 6th, 2009

I just got pointed to @elliottkember’s page about a challenge he calls “The Kember Identity” – basically a search for a 32-character string which, when passed through the MD5 function, returns a 128-bit value – which, when converted to its hexadecimal string representation, is identical to the original string.
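In code, the check is a one-liner with Python's hashlib.  The function names below are my own, and I'm assuming the challenge means the lowercase hex digest of the string's ASCII bytes.  The random search is there only to show the shape of the problem: with 2^128 candidates, it is not going to find anything.

```python
import hashlib
import secrets
from typing import Optional


def is_kember_identity(candidate: str) -> bool:
    """True if the MD5 hex digest of the string equals the string itself."""
    return hashlib.md5(candidate.encode("ascii")).hexdigest() == candidate


def random_search(attempts: int) -> Optional[str]:
    """Try random 32-character lowercase hex strings; return a hit or None."""
    for _ in range(attempts):
        candidate = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
        if is_kember_identity(candidate):
            return candidate
    return None
```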

I’ve been trying to wrap my head around exactly WHY Elliott Kember is trying to find such a value, except perhaps for the lucrative naming rights to such a weird trivia bit.

The funny thing is that the 32-character input string, as input to the MD5 function, is a 256-bit input.  The output, before it’s encoded as a hexadecimal string, is 128 bits.  So while the effort being thrown by various people at this challenge is admirable, it’s not really a search for a true MD5 identity value.

Additionally, if I remember my Applied Cryptography correctly (and if I don’t and by some chance Bruce Schneier happens upon this blog post, I am sure I’ll be turning red even if I never find out about it)… the MD5 function processes its input in 512-bit chunks.  So any input that’s less than 512 bits essentially gets padded to 512 bits to make all the gears spin.  If I think about it this way, then there’s really no such thing as a possible MD5 function “identity value” given that in a strict interpretation, the domain and the range can’t overlap.
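The sizes are easy to confirm.  The sketch below asks hashlib for MD5's digest and block sizes and reproduces the padding step as RFC 1321 describes it (a single 1 bit, zeros up to 448 bits mod 512, then the 64-bit message length, low-order byte first).  The md5_pad helper is my own illustration; hashlib does this internally and doesn't expose it.

```python
import hashlib
import struct


def md5_pad(message: bytes) -> bytes:
    """Pad a message the way MD5 does before processing 512-bit chunks."""
    bit_length = (len(message) * 8) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80"                        # a 1 bit, then seven 0 bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)  # zeros up to 448 mod 512
    return padded + struct.pack("<Q", bit_length)     # 64-bit length, little-endian


candidate = "0" * 32                                  # any 32-character string
input_bits = len(candidate.encode("ascii")) * 8       # 256
digest_bits = hashlib.md5().digest_size * 8           # 128
chunk_bits = hashlib.md5().block_size * 8             # 512
padded_bits = len(md5_pad(candidate.encode("ascii"))) * 8  # 512
```

So a 32-character candidate goes in as 256 bits, is padded out to one 512-bit chunk, and comes back as 128 bits that only look like the input after hex encoding.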

Elliott states that the exercise is a “proof of concept” on the front page of his site.  I suppose it’s an interesting exercise – perhaps in seeing how quickly a problem can be implemented in as many languages as possible with the help of Twitter, the web, etc, etc.  I applaud Elliott for thinking of it.  I just hope it doesn’t land him an entry in Urban Dictionary that defines a Kember Identity as a case of mistaken identity.  Oh, and let me know if any geniuses out there introduce similar challenges for SHA, RIPEMD, etc, etc.  Also I haven’t considered the possibility that this might all be British humor of some weird variety.  Clue me in if it is.