A new set of OpenType shaping rules for Malayalam script

TL;DR: research and development of a completely new set of OpenType layout rules for Malayalam traditional orthography.

Writing OpenType shaping rules is hard. Writing OpenType shaping rules for advanced (complex) scripts is harder. Writing OpenType shaping rules without causing any undesired ligature formations is even harder.

Background

The shaping rules for SMC fonts, which follow v2 of the Malayalam OpenType specification (mlm2 script tag), were written and polished in large part by me over many years, fixing shaping errors and undesired ligature formations along the way. Some hard-to-fix bugs still remained. Driven by the desire to fix such difficult bugs in RIT fonts, and by the copyright fiasco, I set out to write a simplified set of OpenType shaping rules for Malayalam from scratch. Two major references helped in that quest: (1) a radically different approach that I tried a few years ago, but which failed with the mlym script tag (aka Windows XP era shaping); (2) a manuscript by R. Chithrajakumar of Rachana Aksharavedi, who culled and compiled the ‘definitive character set’ for the Malayalam script. The idea of a ‘definitive character set’ is that it contains all the valid characters in a script and no (invalid) characters that are not in the script. Following that definition, I wanted to write the new shaping rules in such a way that they do not generate any invalid characters (e.g. one with a detached u-kar). In short: it shouldn’t be possible to accidentally generate broken reformed orthography forms.

Fig. 1. Samples of Malayalam definitive character set listing by R. Chithrajakumar, circa 1999. Source: K.H. Hussain.

“Simplify, simplify, simplify!”

Henry David Thoreau

In my opinion, much of the complexity in Malayalam shaping comes from the fact that the Indic OpenType shaping specification largely follows Devanagari, which in turn was adapted from ISCII, which has (in my limited understanding) its roots in the component-wise metal type design of ligature glyphs. Many half, postbase and other shaping rules have their lineage there. I have also heard similar concerns about complexity expressed by others, including Behdad Esfahbod and the FreeFont maintainer.

Implementation

As K.H. Hussain once rightly noted, the shaping rules were creating many undesired/unnecessary ligature glyphs by default, and additional shaping rules (complex contextual lookups) had to be written to avoid or undo those. A better, alternative approach would be: simply don’t generate undesired ligatures in the first place.

“Invert, always invert.”

Carl Gustav Jacob Jacobi

Around December 2019, I set out to write a definitive set of OpenType shaping rules for the traditional script set of Malayalam. Instead of relying on many different lookup types such as pref, pstf, blwf, pres, psts and a myriad of complex contextual substitutions, the only type of lookup required was akhn. This is because the definitive character set contains all ligatures of Malayalam, and each of those is designed in the font as a single glyph, with no component-based design.

The draft rules were written in tandem with the RIT-Rachana redesign effort and tested against different shaping engines such as HarfBuzz, Allsorts, XeTeX, LuaHBTeX and DirectWrite/Uniscribe for Windows. Windows, being Windows (whose maker also maintains the OpenType specification), did not behave as the specification would lead one to expect: its implementation clearly special-cased the pstf forms of യ (Ya, 0D2F) and വ (Va, 0D35). To make a single set of shaping rules work with all these shaping engines, the draft rules were slightly amended, et voilà: it worked in all applications and OSen that use any of these shaping engines. It was decided to drop support for the mlym script tag, which was deprecated many years ago, and to support only the mlm2 specification, which fixed many irreparable shortcomings of mlym. One notable shaping engine that doesn’t work with these rules is the Adobe text engine (Lipika?), but Adobe has recently switched to HarfBuzz. That covers all major typesetting applications.

Testing fonts developed using this new set of shaping rules for Malayalam showed that they indeed do not generate any undesired ligatures in the first place. In addition, compared to the previous shaping rules, the new set gets rid of 70+ lines of complex contextual substitutions and other rules, while remaining easy to read and maintain.
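One way to spot-check this kind of claim is with HarfBuzz's hb-shape command-line tool, which prints the glyph sequence a font produces for a given string. The sketch below is not the test procedure used for the RIT fonts. The font path and the helper names shape and count_glyphs are assumptions made for illustration, and it assumes the font under test contains the conjunct ക്ത as a single glyph, in which case shaping that string should yield exactly one glyph.

```shell
#!/bin/sh
# Hypothetical font path; point this at the font under test.
FONT="${FONT:-./RIT-Rachana.ttf}"

# Print the shaped glyph sequence for a string, e.g. "[glyphA|glyphB]".
# --no-positions and --no-clusters leave only the glyph names in the output.
shape() {
    hb-shape --no-positions --no-clusters "$FONT" "$1"
}

# Count the glyphs in a line of hb-shape output by counting the
# fields separated by "|".
count_glyphs() {
    printf '%s' "$1" | awk -F'|' '{ print NF }'
}

# Usage (requires hb-shape and the font file):
#   out=$(shape "ക്ത")
#   [ "$(count_glyphs "$out")" -eq 1 ] || echo "unexpected split: $out"
```

Running the same check over a word list and comparing the glyph counts between the old and the new rules would be one way to look for ligatures that should not have formed.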

Fig. 3. Old vs new shaping rules in RIT Rachana.

Application support

This new set of OpenType layout rules for Malayalam has been tested and works fully with the following shaping engines:

  1. HarfBuzz
  2. Allsorts
  3. DirectWrite/Uniscribe (Windows shaping engine)

And GUI toolkits/applications:

  1. Qt (KDE applications)
  2. Pango/GTK (GNOME applications)
  3. LibreOffice
  4. Microsoft Office
  5. XeTeX
  6. LuaHBTeX
  7. Emacs
  8. Adobe InDesign (with HarfBuzz shaping engine)
  9. Adobe Photoshop
  10. Firefox, Chrome/Chromium, Edge browsers

Advantages

The advantages of the new shaping rules are:

  1. Adheres completely to the concept of the ‘definitive character set’ of the language/script: generates all valid conjunct characters and does not generate any invalid conjunct character.
  2. The same set of rules works without adjustments/reprogramming for ‘limited character set’ fonts. A ‘limited character set’ may not contain conjunct characters as extensively as the ‘definitive character set’, yet it would always have characters with reph and u/uu-kars formed correctly.
  3. Reduced complexity and maintenance (no complex contextual lookups, reverse chaining etc.). Write once, use in any font.
  4. Open source, libre software.

This new OpenType shaping rules program was released to the public along with RIT Rachana a few months ago, and is also used in all other fonts developed by RIT. It is licensed under the Open Font License for anyone to use and integrate into their fonts; please ensure that the copyright statements are preserved. The shaping rules are maintained in the RIT GitLab repository. Please create an issue in the tracker if you find any bugs, or send a merge request if you make any improvements.

Letsencrypt certificate renewal: Nginx with reverse-proxy

Let’s Encrypt revolutionized SSL certificate management for websites in a short span of time. It directly improved the security of users of the world wide web by (1) making it very simple for administrators to deploy SSL certificates to websites and (2) making the certificates available free of cost. To appreciate their efforts, compare this with the hoops one had to jump through to obtain a certificate from a certificate authority (CA), and how much money and energy one had to spend on it.

I make use of letsencrypt on all the servers I maintain(ed), and in the past used the certbot tool to obtain & renew certificates. Recent versions of certbot are only available as a snap package, which is not something I’d want to, or be able to, set up in many cases.

Enter acme. It is a shell script that works great. Installing acme also sets up a cron job, which automatically renews the certificate for the domain(s) near its expiration. I recently set up dict.sayahna.org using nginx as a reverse proxy to a lexonomy service, with acme for certificate management. The cron job is supposed to renew the certificate on time.

Except it didn’t. A few days ago I received a notification about the imminent expiry of the certificate. I searched the interweb quite a bit, but didn’t find a simple enough solution (“make the proxy service redirect the request”…). What follows is the troubleshooting and a solution; maybe someone else will find it useful.

Problem

acme was unable to renew the certificate because the HTTP-01 authentication challenge requests were not answered by the proxy server to which all traffic was being redirected. In short: how to renew letsencrypt certificates on an nginx reverse-proxy server?

Certificate renewal attempt by acme would result in errors like:

# .acme.sh/acme.sh --cron --home "/root/.acme.sh" -w /var/www/html/
[Sat 08 May 2021 07:28:17 AM UTC] ===Starting cron===
[Sat 08 May 2021 07:28:17 AM UTC] Renew: 'my.domain.org'
[Sat 08 May 2021 07:28:18 AM UTC] Using CA: https://acme-v02.api.letsencrypt.org/directory
[Sat 08 May 2021 07:28:18 AM UTC] Single domain='my.domain.org'
[Sat 08 May 2021 07:28:18 AM UTC] Getting domain auth token for each domain
[Sat 08 May 2021 07:28:20 AM UTC] Getting webroot for domain='my.domain.org'
[Sat 08 May 2021 07:28:21 AM UTC] Verifying: my.domain.org
[Sat 08 May 2021 07:28:24 AM UTC] my.domain.org:Verify error:Invalid response from https://my.domain.org/.well-known/acme-challenge/Iyx9vzzPWv8iRrl3OkXjQkXTsnWwN49N5aTyFbweJiA [NNN.NNN.NNN.NNN]:
[Sat 08 May 2021 07:28:24 AM UTC] Please add '--debug' or '--log' to check more details.
[Sat 08 May 2021 07:28:24 AM UTC] See: https://github.com/acmesh-official/acme.sh/wiki/How-to-debug-acme.sh
[Sat 08 May 2021 07:28:25 AM UTC] Error renew my.domain.org.

Troubleshooting

The key error to notice is

Verify error:Invalid response from https://my.domain.org/.well-known/acme-challenge/Iyx9vzzPWv8iRrl3OkXjQkXTsnWwN49N5aTyFbweJiA [NNN.NNN.NNN.NNN]

Sure enough, the resource .well-known/acme-challenge/… is not accessible. Let us make it accessible without going through the proxy server.
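A quick way to confirm this is to request a path under the challenge directory and look at the HTTP status that comes back. The following is a minimal sketch: the helper names and the token probe are made up for illustration, and my.domain.org stands in for the real domain.

```shell
#!/bin/sh
# Build the HTTP-01 challenge URL for a domain ($1) and a token ($2).
challenge_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Print only the final HTTP status code for that URL, following
# redirects (-L) the way the certificate authority's validator does.
challenge_status() {
    curl -s -L -o /dev/null -w '%{http_code}\n' "$(challenge_url "$1" "$2")"
}

# Usage: challenge_status my.domain.org probe
```

If the status comes from the proxied backend rather than from nginx itself, the challenge requests are being forwarded, which matches the error above.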

Solution

First, create the directory if it doesn’t exist. Assuming the web root as /var/www/html:

# mkdir -p /var/www/html/.well-known/acme-challenge

Then, edit /etc/nginx/sites-enabled/my.domain.org and before the proxy_pass directive, add the .well-known/acme-challenge/ location and point it to the correct location in web root. Do this on both HTTPS and HTTP server blocks (otherwise it didn’t work for me).

server {
  listen 443 default_server ssl;
...
  server_name my.domain.org;
  location /.well-known/acme-challenge/ {
    root /var/www/html/;
  }

  location / {
    proxy_pass http://myproxyserver;
    proxy_redirect off;
  }
...
server {
  listen 80;
  listen [::]:80;

  server_name my.domain.org;

  location /.well-known/acme-challenge/ {
    root /var/www/html/;
  }

  # Redirect to HTTPS
  return 301 https://$server_name$request_uri;
}


Make sure the configuration is valid and reload nginx:

nginx -t && systemctl reload nginx.service
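Before retrying the renewal, it may be worth checking that nginx now serves files from the challenge directory itself. This is a rough sketch, assuming the web root /var/www/html used above; the file name probe.txt and the function names are hypothetical.

```shell
#!/bin/sh
# Web root assumed from the nginx configuration; override if yours differs.
WEBROOT="${WEBROOT:-/var/www/html}"

# Drop a small probe file into the challenge directory and print its path.
make_probe() {
    dir="$WEBROOT/.well-known/acme-challenge"
    mkdir -p "$dir" && printf 'ok\n' > "$dir/probe.txt" && printf '%s\n' "$dir/probe.txt"
}

# Fetch the probe over HTTP for domain $1; succeeds only if the
# response body is exactly "ok".
check_probe() {
    [ "$(curl -s -L "http://$1/.well-known/acme-challenge/probe.txt")" = "ok" ]
}

# Usage (as root on the server):
#   f=$(make_probe) && check_probe my.domain.org && echo "served by nginx"; rm -f "$f"
```

If check_probe fails, the request is most likely still reaching the proxied service instead of the web root.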

Now, try to renew the certificate again:

# .acme.sh/acme.sh --cron --home "/root/.acme.sh" -w /var/www/html/
...
[Sat 08 May 2021 07:45:01 AM UTC] Your cert is in  /root/.acme.sh/my.domain.org/my.domain.org.cer
[Sat 08 May 2021 07:45:01 AM UTC] Your cert key is in  /root/.acme.sh/my.domain.org/my.domain.org.key 
[Sat 08 May 2021 07:45:01 AM UTC] v2 chain.
[Sat 08 May 2021 07:45:01 AM UTC] The intermediate CA cert is in  /root/.acme.sh/my.domain.org/ca.cer 
[Sat 08 May 2021 07:45:01 AM UTC] And the full chain certs is there:  /root/.acme.sh/my.domain.org/fullchain.cer 
[Sat 08 May 2021 07:45:02 AM UTC] _on_issue_success

Success.

Panmana: new Malayalam body text font

Rachana Institute of Typography starts the new year 2021 with the release of a new body-text Malayalam Unicode font named ‘Panmana’.

Fig. 1: ‘Panmana’ font specimen.

The font is named after and dedicated to Prof. Panmana Ramachandran Nair, who steadfastly spoke up for the original script of Malayalam. It is designed by K.H. Hussain with inputs from Ashok Kumar and CVR, with font engineering by Rajeesh (your correspondent), and is maintained by RIT.

‘Panmana’ is released under the Open Font License, free to use and share. TrueType and web fonts can be downloaded from the website. A flyer about the font is available. If you spot any issues, please report them in the source repository.