On data encoding and complex text shaping

As part of the Janayugom newspaper’s historic migration to a completely libre software based workflow, Kerala Media Academy organized a summit on self-reliant publishing on 31-Oct-2019. I was invited to speak about Malayalam Unicode fonts.

The summit was inaugurated by Fahad Al-Saidi of Scribus fame, who was instrumental in implementing complex text layout (CTL). Prior to the talks, I got to meet the team who made it possible to switch Janayugom’s entire publishing process to a free software platform: Kubuntu based ThengOS, Scribus for page layout, Inkscape for vector graphics, GIMP for raster graphics, CMYK color profiling for print, new Malayalam Unicode fonts with traditional orthography, etc. It was impressive to see that the entire production fleet was transformed, the team was trained, and the newspaper is printed every day without delay.

I also met Fahad later and was pleasantly surprised to learn that he already knew me from my open source contributions. We had a productive discussion about Scribus.

My talk was on data encoding and text shaping in Unicode Malayalam. The Malayalam publishing industry is still largely trapped in ASCII, which now causes numerous issues, and many are still not aware of Unicode and its advantages. I tried to address that in my presentation with examples, so the preface of my talk filled half of the session, while the second half focused on font shaping. Many in the industry seem to be aware that Unicode and traditional Malayalam orthography can now be used on computers, but many in academia still have not realized it, as was evident from the talk of the moderator of the discussion, who is the director of the school of Indian languages. There was a lively discussion with the audience in the Q&A session. After the talk, a number of people gave me feedback and requested that the slides be made available.
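To make the encoding point concrete, here is a minimal Python sketch. It is not taken from the slides; the sample word and the helper names are chosen only for illustration. It shows that Unicode text stores actual Malayalam code points, which plain ASCII cannot represent at all.

```python
import unicodedata

# The word "Malayalam" written in Malayalam script, spelled out as code points.
SAMPLE = "\u0d2e\u0d32\u0d2f\u0d3e\u0d33\u0d02"


def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def fits_in_ascii(text):
    """Return True if the text can be encoded as 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for code_point, name in describe(SAMPLE):
        print(code_point, name)
    print(len(SAMPLE.encode("utf-8")), "bytes in UTF-8")
    print("fits in ASCII:", fits_in_ascii(SAMPLE))
```

Because every character carries its own identity, such text stays searchable and sortable whichever Unicode font renders it.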

Slides on data encoding and complex text shaping are available under CC-BY-NC license here.

WatchData PROXKey digital signature using emSigner in Fedora 30

TL;DR: go to the “How to” section to make WatchData PROXKey work with emSigner on a GNU/Linux system.

Introduction

Hardware tokens with digital signatures are used for filing various financial documents in Govt of India portals. The major tokens supported by eMudhra are WatchData ProxKey, ePass 2003, Aladdin, Safenet, TrustKey, etc. Many of these hardware tokens come (in CDROM image mode) with drivers and utilities to manage the signatures, but unfortunately only for the Windows platform.

Failed attempts

Sometime in 2017, I tried to make these tokens work for signing GST returns under GNU/Linux, using the de facto pcsc tool. I got a WatchData PROXKey, which doesn’t work out of the box with pcsc. Digging further brought up this report; it seems the driver is a spin-off of the upstream one (LGPL licensed), but no source code has been made available, so there is no hope of using these hardware tokens with upstream tools. Unfortunately, the only option is to depend on vendor-provided drivers. There are some instructions by a retailer to get this working under Ubuntu.

Once you download and install that driver (ProxKey_Redhat.rpm), it does a few things: it installs a separate pcsc daemon named pcscd_wd, the driver CCID bundles, and certain supporting binaries/libraries. The drawback of such custom driver implementations is that different drivers clash with each other, as each one provides a different pcscd_wd binary and their installation scripts silently overwrite existing files! To avoid any clash with this pcscd_wd daemon, stop the standard pcscd daemon with systemctl stop pcscd.service.

Plug in the USB hardware token and, to your dismay, observe that it spews the following error messages in journalctl:

Oct 06 09:16:51 athena pcscd_wd[2408]: ifdhandler.c:134:IFDHCreateChannelByName() failed
Oct 06 09:16:51 athena pcscd_wd[2408]: readerfactory.c:1043:RFInitializeReader() Open Port 0x200001 Failed (usb:163c/0417:libhal:/org/freedesktop/Hal/devices/usb_device_163c_0417_serialnotneeded_if1)
Oct 06 09:16:51 athena pcscd_wd[2408]: readerfactory.c:335:RFAddReader() WD CCID UTL init failed.

This prompted me to try different drivers, mostly from the eMudhra repository, including eMudhra Watchdata, Trust Key and even ePass (there were no *New* drivers at this time), but none of them seemed to work. Many references pointed to Ubuntu, so I tried various Ubuntu versions from 14.04 to 18.10, but they didn’t yield a different result either. At this point, I put the endeavour on the back burner.

A renewed interest

Around September 2019, KITE announced that they would start supporting government officials using digital signatures under GNU/Linux, as most Kerala government offices now run on libre software. KITE have made the necessary drivers, signing tools and manuals available.

I tried this on a (recommended) Ubuntu 18.04 system, but the pcscd_wd errors persisted and the NICDSign tool couldn’t recognize the PROXKey digital token. Their installation methods, however, gave me a better idea of how these drivers are supposed to work with the signing middleware.

A couple of days ago, with a better understanding of how these drivers work, I thought that they should also work on a Fedora 30 system (which is my main OS), so I set out for another attempt.

How to

  1. Remove all the wdtokentool-proxkey, wdtokentool-trustkey, wdtokentool-eMudhra, ProxKey_Redhat and similar drivers, if installed, to start from a clean slate.
  2. Download the WatchData ProxKey (Linux) *New* driver from eMudhra.
  3. Unzip and install the wdtokentool-ProxKey-1.1.1 RPM/DEB package. Note that this package installs the TRUSTKEY driver (/usr/lib/WatchData/TRUSTKEY/lib/libwdpkcs_TRUSTKEY.so), not the ProxKey driver (/usr/lib/WatchData/ProxKey/lib/libwdpkcs_SignatureP11.so), and it seems the ProxKey token only works with the TRUSTKEY driver!
  4. Start pcscd_wd.service with systemctl start pcscd_wd.service (only if it is not auto-started).
  5. Plug in your PROXKey token. (journalctl -f will still show the error messages but, lesson learned, they can be safely ignored!)
  6. Download emSigner from the GST website and unzip it into your ~/Documents or another directory (say ~/Documents/emSigner).
  7. Ensure port 1585 is open in the firewall settings: firewall-cmd --add-port=1585/tcp --zone=FedoraWorkstation (adjust the firewall zone if necessary). Repeat the same command with --permanent added to make the change persist across reboots.
  8. Go to ~/Documents/emSigner in a shell and run ./startserver.sh (make sure to chmod 0755 startserver.sh first), or double-click on this script from a file browser.
  9. Log in to the GST portal and try to file your return with DSC.
  10. If you get the error “Failed to establish connection to the server. Kindly restart the Emsigner” when trying to sign, open another tab in the browser window, go to https://localhost:1585 and try signing again.
  11. You should be prompted for the digital signature PIN and signing should succeed.

It is also possible to use this digital token in Firefox (via Preferences → Privacy & Security → Certificates → Security Devices → Load, with the module filename set to /usr/lib/WatchData/TRUSTKEY/lib/libwdpkcs_TRUSTKEY.so) as long as the key is plugged in. Here again, you can ignore the error message “unable to load the module”.
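If signing keeps failing with the connection error, it can help to first confirm that something is actually listening on the emSigner port. The following is a minimal sketch, assuming emSigner runs on the same machine and uses port 1585 as in the steps above; the helper name is hypothetical and not part of emSigner.

```python
import socket


def is_port_open(host="localhost", port=1585, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 1585")
    else:
        print("Nothing is listening on port 1585; is startserver.sh running?")
```

This only tests TCP reachability. It says nothing about whether the token or its driver is working.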

SMC Malayalam fonts updated in Fedora 30

The Fedora package smc-fonts has a set of Malayalam fonts (AnjaliOldLipi, Kalyani, Meera, Rachana, RaghuMalayalamSans and Suruma) maintained by SMC. We used to package all these fonts as a single zip file hosted at https://savannah.nongnu.org/projects/smc. These fonts were last updated in 2014 for Fedora, leaving them at version 6.1.

Since then, a lot of improvements were made to these fonts: glyph additions/corrections, OpenType layout changes, a fontTools based build system, a separate source repository for each font, etc. There were lengthy discussions on the release management of the fonts, which was partially the reason the fonts were not updated in Fedora. Once it was agreed to follow a different version number for each font, and a continuous build+release system was put in place at GitLab, we could ensure that fonts downloaded from the SMC website were always the latest version.

To reflect the updates in Fedora, we had to decide how to handle the monolithic source package at version 6.1 versus the new individual releases (e.g. Rachana is at version 7.0.1 as of this writing). In a discussion with Pravin Satpute, we agreed to obsolete the existing fonts package and give each font its own package.

Vishal Vijayaraghavan kindly stepped up and did the heavy lifting of creating the new packages, and we now even build the ttf font file from the source. See RHBZ#1648825 for details.

With all that in place, all these fonts are at their latest versions in Fedora 30; for instance, see the Rachana package. The old package smc-fonts no longer exists; instead, each individual package such as smc-rachana-fonts or smc-meera-fonts can be installed. Our users will now be able to enjoy the improvements made over the years, including updated Unicode coverage, new glyphs, improved existing glyphs, and much better OpenType shaping.

Okular: another improvement to annotation

Continuing with the addition of line terminating style for the Straight Line annotation tool, I have added the ability to select the line start style also. The required code changes are committed today.

Line annotation with circled start and closed arrow ending.

Currently it is supported only for PDF documents (and poppler version ≥ 0.72), but that will change soon — thanks to another change by Tobias Deiminger under review to extend the functionality for other documents supported by Okular.

Okular: improved PDF annotation tool

Okular, KDE’s document viewer, has very good support for annotating/reviewing/commenting documents. Okular supports a wide variety of annotation tools out of the box (enable the ‘Review’ tool [F6] and see for yourself), and even more can be configured (such as the ‘Strikeout’ tool): right click on the annotation tool bar and click ‘Configure Annotations’.

One of the annotation tools my colleagues and I frequently wanted to use is a line with an arrow to mark an indent. Many PDF annotation programs have this tool, but Okular was lacking it.

So a couple of weeks ago I started looking into the source code of Okular and poppler (which is the PDF library used by Okular) and noticed that both of them already have support for the ‘Line Ending Style’ for the ‘Straight Line’ annotation tool (internally called the TermStyle). After skimming through the source code for a few hours and adding a few hooks in the code, I could add an option to configure the line ending style for the ‘Straight Line’ annotation tool. Many line end styles are provided out of the box, such as open and closed arrows, circle, diamond, etc.

An option is added to the ‘Straight Line’ tool configuration to choose the line ending style:

New ‘Line Ending Style’ for the ‘Straight Line’ annotation tool.

Here’s the review tool with ‘Open Arrow’ ending in action:

‘Arrow’ annotation tool in action.

Once happy with the outcome, I created a review request to upstream the improvement. A number of helpful people reviewed and commented. One of the suggestions was to add an icon/shape of the line ending style in the configuration options so that users can quickly preview what the shape will look like without having to try each one. The first attempt to implement this feature was by adding Unicode symbols (instead of an SVG or internally drawn graphics) and it looked okay. Here’s a screenshot:

‘Line End’ with symbols preview.

But it had various issues: some symbols are not available in Unicode, and the localization of these strings without some context would be difficult. So, for now, it was decided to drop the symbols.

For now, this feature works only on PDF documents. The patch is committed today and will be available in the next version of Okular.

Meera font updated to fix issue with InDesign

I have worked to make sure that fonts maintained at SMC work with the mlym (Pango/Qt4/Windows XP era) opentype specification as well as the mlm2 (Harfbuzz/Windows Vista+ era) specification, in the same font. These have also been tested in the past (2016ish) with Adobe software, which uses its own shaping engine (it uses neither Harfbuzz nor Uniscribe, but the internet tells me there are plans to use Harfbuzz in the future).

Some time ago, I received reports that typesetting articles in Adobe InDesign using Meera font has some serious issues with Chandrakkala/Halant positioning in combination with conjuncts.

When the Samvruthokaram/Chandrakkala ് (U+0D4D) follows a consonant or conjunct, it should be placed at the ‘right shoulder’ of the consonant/conjunct. But in InDesign (CC 2019), it appears incorrectly on the ‘left shoulder’. This incorrect rendering is highlighted in the figure below.

Wrong chandrakkala position before consonant in InDesign.

The correct rendering should have the Chandrakkala appearing at the right of the consonant, as in the figure below.

Correct chandrakkala position after consonant.

This issue manifested only in Meera, but not in other fonts like Rachana or Uroob. Digging deeper, I found that only Meera has a Mark-to-Base positioning GPOS lookup rule for Chandrakkala. This was done (instead of adjusting the left bearing of the Chandrakkala glyph) to make it appear correctly on the ‘right shoulder’ of the consonant. Unfortunately, InDesign seems to get this wrong.

To verify, I checked the shaping involving the Dot Reph ൎ (U+0D4E), which is also opentype engineered as a Mark-to-Base GPOS lookup. And sure enough, InDesign gets it wrong as well.

Dot Reph position (InDesign on left, Harfbuzz/Uniscribe on right)

The issue has been worked around by removing the GPOS lookup rules for Chandrakkala, and the fix was tested with Harfbuzz, Uniscribe and InDesign. I have tagged a new version 7.0.2 of Meera and it is available for download from the SMC website. As this issue has affected many users of InDesign, hopefully this update lets them enjoy using Meera again. Windows/InDesign users should make sure that previous versions of the font are uninstalled before installing this version.
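One way to double-check which positioning lookups a given font build carries is to list its GPOS lookup types with fontTools. The following is a minimal sketch, not the script used for the Meera fix: the file name Meera-Regular.ttf is a placeholder, the helper names are hypothetical, and fontTools has to be installed separately. In the OpenType specification, GPOS lookup type 4 is Mark-to-Base attachment and type 9 is an extension wrapper around another lookup type.

```python
MARK_TO_BASE = 4  # GPOS lookup type: Mark-to-Base attachment
EXTENSION = 9     # GPOS lookup type: extension wrapper around another type


def gpos_lookup_types(font_path):
    """Return the effective GPOS lookup type of every lookup in the font."""
    from fontTools.ttLib import TTFont  # third-party, imported lazily

    font = TTFont(font_path)
    if "GPOS" not in font:
        return []
    lookup_list = font["GPOS"].table.LookupList
    if lookup_list is None:
        return []
    types = []
    for lookup in lookup_list.Lookup:
        lookup_type = lookup.LookupType
        if lookup_type == EXTENSION and lookup.SubTable:
            # An extension subtable records the type of the lookup it wraps.
            lookup_type = lookup.SubTable[0].ExtensionLookupType
        types.append(lookup_type)
    return types


def mark_to_base_indices(lookup_types):
    """Return the indices of the lookups that are Mark-to-Base attachments."""
    return [i for i, t in enumerate(lookup_types) if t == MARK_TO_BASE]


if __name__ == "__main__":
    import sys

    # Placeholder path; pass the real font file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "Meera-Regular.ttf"
    print("Mark-to-Base lookups:", mark_to_base_indices(gpos_lookup_types(path)))
```

Note that this lists every Mark-to-Base lookup in the font, not just those for Chandrakkala. Finding out which glyphs a lookup covers would mean inspecting its subtables.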

New package in Fedora: python-xlsxwriter

XlsxWriter is a Python module for creating files in xlsx (MS Excel 2007+) format. It is used by certain Python modules that some of our customers needed (such as the OCA report_xlsx module).

This module is available on PyPI, but it was not packaged for Fedora. I decided to maintain it in Fedora and created a package review request, which was helpfully reviewed by Robert-André Mauchin.

The package, providing a Python 3 compatible module, is available for Fedora 28 onwards.
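For readers who have not used the module, here is a minimal sketch of writing a small spreadsheet with it. The file name, the sample rows and the two helper functions are made up for illustration; only Workbook, add_worksheet, write and close are XlsxWriter calls.

```python
def iter_cells(rows):
    """Yield (row, column, value) triples for a list of rows."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            yield r, c, value


def write_xlsx(path, rows):
    """Write rows (a list of lists) to the first worksheet of a new xlsx file."""
    import xlsxwriter  # third-party XlsxWriter module, imported lazily

    workbook = xlsxwriter.Workbook(path)
    worksheet = workbook.add_worksheet()
    for r, c, value in iter_cells(rows):
        worksheet.write(r, c, value)  # zero-based row and column indices
    workbook.close()  # the file is written to disk on close


if __name__ == "__main__":
    # Hypothetical sample data and output file name.
    write_xlsx("report.xlsx", [["Item", "Quantity"], ["Paper", 10], ["Ink", 3]])
```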