When Santhosh Thottingal sent out the task to create an English-Malayalam/Malayalam-English dictionary conforming to the Dict Protocol, I didn’t care much. I just took a look and left it there. But later, when he pinged me and urged me to take it up, providing many of the required resources, I thought I’d take a proper look at it. And thus I started scratching another itch.
The Govt of Kerala is well known for its support for Free/Open Source Software. And they’ve been doing a pretty good job. But I was surprised when I got the link to an English-Malayalam Dictionary with a Python frontend. And the best part is this – it is GPL’ed.
And I set out to convert the data found inside to suit the Dict Protocol [RFC2229]. An ugly shell script turned into a nice one after 3 days of carving and craving.
This is how it is done:
- Format the input file as {headword\n\tdefinitions}: each headword on its own line, followed by its definitions on tab-indented lines.
- Use dictfmt to convert to Dict format, then compress the result with dictzip: dictfmt -f --utf8 -s Dict-English-Malayalam -u smc.org.in dict-en-ml < <input_file> && dictzip dict-en-ml.dict
- This will create two files dict-en-ml.dict.dz & dict-en-ml.index.
- Install "dictd".
- Create the folder "/usr/share/dictd" if it doesn’t exist.
- Copy dict-en-ml.dict.dz and dict-en-ml.index to "/usr/share/dictd".
- Create the file "/etc/dict.conf" and edit it. Put "server localhost" and save.
- Create the file "/etc/dictd.conf" and edit it. Put: database Eng-Mal {data "/usr/share/dictd/dict-en-ml.dict.dz" \n\t index "/usr/share/dictd/dict-en-ml.index"}
- Start the dictd service with "/etc/rc.d/init.d/dictd start".
- Use your favourite dictionary frontend and lookup!
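The first step, getting the data into the {headword\n\tdefinitions} layout, is the one most likely to trip people up, so here is a minimal sketch of it in Python. It is not the shell script I actually used. The function name `to_dictfmt_input` and the sample entries are made up for illustration, and it assumes you have already parsed the source data into (headword, list of definitions) pairs.

```python
def to_dictfmt_input(entries):
    """Render (headword, definitions) pairs as text for `dictfmt -f`.

    Each headword starts at column 0; every definition goes on its own
    line, indented with a single tab.
    """
    lines = []
    for headword, definitions in entries:
        lines.append(headword.strip())
        for definition in definitions:
            lines.append("\t" + definition.strip())
    # End with a newline so the last definition is a complete line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample entries, only to show the output shape.
    sample = [
        ("apple", ["ആപ്പിൾ"]),
        ("book", ["പുസ്തകം", "ഗ്രന്ഥം"]),
    ]
    print(to_dictfmt_input(sample), end="")
```

Redirect the output to a UTF-8 file and feed that file to the dictfmt command above as <input_file>. It is worth checking the dictfmt man page for how -f treats the very first lines of the file, since it may read leading column-0 lines as database information rather than headwords.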
And, here’s a preview as well:
There’s still some more work to do, viz. incorporating the grammatical components (like Noun, Verb etc).
We at Swathanthra Malayalam Computing hope to release it soon, along with an RPM package as proposed by Sankharshan Mukhopadyay.
Stay tuned.
4 responses to “English-Malayalam Dict [RFC2229]”
Great work Rajeesh! Can’t wait to see it open sourced!
[…] January 17, 2009 Posted by Rajeesh in malayalam. Tags: malayalam, smc trackback As mentioned earlier, we are ready with the beta release of English-Malayalam dictionary in DICT format. The RPM, DEB […]
[…] based English-Malayalam dictionary is in development and we are ready for a beta release. Rajeesh did a wonderful job in preparing […]
yes it is a good application …………….