MathJax processing on the server-side

This is a post about rendering MathJax expressions on the server side.

Background

MathJax is an excellent library that makes it very easy to include mathematical expressions in a site; you may have found some in this blog too. But it always bugged me that it requires client-side JavaScript in order to work, and that it breaks the site in article readers (like Pocket). Part of MathJax’s simplicity is that you just drop a single line of JS into the head section, add some configuration if needed, and it converts the expressions to nice-looking HTML. Let’s see if we can move this to the server side!

The first attempt

At its core, MathJax converts MathML to HTML elements, adds some fonts, and the browser shows the expression. The first thought is to make this happen on the server and include the generated HTML in the site. When searching for it, I found some existing solutions (like this), although none of them seems mature and reliable. Luckily, there are tools for running JavaScript on the server side, and one of them is PhantomJS. It might seem like overkill to use an actual browser to render something meant to be rendered by the client (and it is), but it gives a very solid and reliable result.

Setup

First, we need the MathJax files as a Ruby dependency. There is an excellent gem source called Rails Assets, which essentially brings Bower to Ruby. By adding it as a source and then rails-assets-MathJax as a dependency, we automatically get all the needed files.
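As a rough sketch, the Gemfile entries could look like the following. The source URL is the one Rails Assets documented at the time of writing; treat it as an assumption and check that it is still current before relying on it.

```ruby
# Gemfile
source 'https://rubygems.org'

# Rails Assets exposes Bower packages as gems with a rails-assets- prefix.
# Scoping the gem to this source keeps all other gems on rubygems.org.
source 'https://rails-assets.org' do
  gem 'rails-assets-MathJax'
end
```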

The next thing we need is an HTTP server that our PhantomJS browser will fetch the page from. For this I’ve used WEBrick, as it comes with standard Ruby. Basically, we need to start the server, mount the MathJax assets to a directory, then serve an HTML page that contains our MathML expression. The server has to run in a separate thread so that it keeps running in the background. After starting it, we poll until it responds.

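Thread.start do
  require 'webrick'
  mathjax_dir = Gem::Specification.find_by_name("rails-assets-MathJax").gem_dir

  # Port 0 lets the OS pick a free port; the chosen port is read back
  # later through server.config[:Port]. All logging is discarded.
  self.server = WEBrick::HTTPServer.new(
    :Port => 0,
    :DocumentRoot => "#{mathjax_dir}/app/assets",
    :AccessLog => [],
    :Logger => WEBrick::Log.new(File::NULL, 7)
  )

  # MathJax requests its fonts relative to the script location,
  # so expose the gem's font directory under that path.
  server.mount '/javascripts/MathJax/fonts',
               WEBrick::HTTPServlet::FileHandler,
               "#{mathjax_dir}/app/assets/fonts/MathJax/fonts"

  begin
    server.start # blocks until the server is shut down
  ensure
    server.shutdown
  end
end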

sleep 0.1 until server_started?

And we can have server_started? like this:

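require 'net/http'

def port
  server.config[:Port]
end

# Returns true once the server answers a request for MathJax.js.
# Any error (server not assigned yet, connection refused, timeout)
# simply counts as "not started yet".
def server_started?
  uri = URI("http://localhost:#{port}/javascripts/MathJax/MathJax.js")

  req = Net::HTTP::Get.new(uri)
  res = Net::HTTP.start(uri.hostname, uri.port) do |http|
    http.read_timeout = 1
    http.request(req)
  end

  res.is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end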

Lastly, we need to mount an HTML file with the expression, like this:

def response(content)
  # MathJax reads its inline configuration only from a script whose
  # type is text/x-mathjax-config.
  "
  <html>
    <head>
      <script type='text/x-mathjax-config'>
          MathJax.Hub.Config({
            messageStyle: 'none',
            showMathMenu: false
          });
      </script>
      <script type='text/javascript'
            src='/javascripts/MathJax/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
    </head>
    <body>#{content}</body>
  </html>"
end

# Serve the page for one expression; path and mathml come from the caller.
server.mount_proc path do |_, res|
  res.body = response(mathml)
end
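The snippet above leaves `path` and `mathml` to the caller. One possible way to choose them, shown here only as a sketch, is to derive the path from a digest of the expression so that every expression gets its own stable URL on the local server. The helper name `page_path_for` and the sample expression are made up for illustration.

```ruby
require 'digest'

# Hypothetical helper (not from the original code): builds a stable,
# URL-safe path from the MathML source, so the same expression always
# maps to the same page.
def page_path_for(mathml)
  "/#{Digest::SHA256.hexdigest(mathml)}.html"
end

# A small sample expression (x = 1) in MathML.
mathml = '<math xmlns="http://www.w3.org/1998/Math/MathML">' \
         '<mi>x</mi><mo>=</mo><mn>1</mn></math>'

path = page_path_for(mathml)
```

A side effect of this choice is that the path can double as a cache key if you later decide to store rendered results.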

Getting the HTML

After we have the server ready, we need to actually fetch the page, wait for MathJax to render, then grab the HTML. I’ve used Capybara; after the setup, the relevant parts are:

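visit url

# MathJax is considered done when rendered output exists and no element
# is still marked as being processed.
def mathjax_ready?(page)
  html = Nokogiri::HTML(page.html)
  !html.css('.MathJax').empty? &&
    html.css('.MathJax_Processing').empty? &&
    html.css('.MathJax_Processed').empty?
end

sleep 0.1 until mathjax_ready?(page)

# The first rendered expression on the page.
result = Nokogiri::HTML(page.html).css('.MathJax')[0]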

This loads the page, waits for MathJax, then extracts the results.
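Both polling loops (`sleep 0.1 until ...`) wait forever if the server or MathJax never becomes ready. A minimal sketch of a bounded wait is shown below; `wait_until`, `WaitTimeout`, and the default values are illustrative names and numbers, not part of Capybara or the original code.

```ruby
# Raised when the condition does not become true in time (hypothetical name).
class WaitTimeout < StandardError; end

# Calls the block every `interval` seconds until it returns a truthy value,
# and raises WaitTimeout once `timeout` seconds have passed.
def wait_until(timeout: 10, interval: 0.1)
  # The monotonic clock is not affected by system clock adjustments.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end
```

With this in place, the loops could be written as `wait_until { server_started? }` and `wait_until(timeout: 30) { mathjax_ready?(page) }`, so a broken setup fails with an error instead of hanging.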

Using the result

The resulting HTML can be included directly in the page, but don’t forget to include the fonts too. You can use the MathJax CDN, but since you already have the fonts as part of the rails-assets-MathJax gem, it’s best to use those instead. For the CDN, you might want to use something similar to this:

@font-face {font-family: MathJax_Main; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Main-Regular.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Main-Regular.otf?rev=2.5.0') format('opentype')}
@font-face {font-family: MathJax_Main-bold; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Main-Bold.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Main-Bold.otf?rev=2.5.0') format('opentype')}
@font-face {font-family: MathJax_Main-italic; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Main-Italic.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Main-Italic.otf?rev=2.5.0') format('opentype')}
@font-face {font-family: MathJax_Math-italic; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Math-Italic.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Math-Italic.otf?rev=2.5.0') format('opentype')}
@font-face {font-family: MathJax_Caligraphic; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Caligraphic-Regular.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Caligraphic-Regular.otf?rev=2.5.0') format('opentype')}
@font-face {font-family: MathJax_Size1; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Size1-Regular.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Size1-Regular.otf?rev=2.5.0') format('opentype')}
@font-face {font-family: MathJax_Size2; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Size2-Regular.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Size2-Regular.otf?rev=2.5.0') format('opentype')}
@font-face {font-family: MathJax_Size3; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Size3-Regular.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Size3-Regular.otf?rev=2.5.0') format('opentype')}
@font-face {font-family: MathJax_Size4; src: url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/woff/MathJax_Size4-Regular.woff?rev=2.5.0') format('woff'), url('http://cdn.mathjax.org/mathjax/latest/fonts/HTML-CSS/TeX/otf/MathJax_Size4-Regular.otf?rev=2.5.0') format('opentype')}
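If you serve the fonts yourself, the same rules can be generated for any base URL. The sketch below reuses the family and file names from the CDN rules above; the `base_url` value and the assumption that your server exposes the same `woff/` and `otf/` layout are mine, so adjust them to wherever the gem's fonts end up in your app. The `?rev=` query string is left out.

```ruby
# CSS font-family => font file name, as used in the CDN rules above.
MATHJAX_FONTS = {
  'MathJax_Main'        => 'MathJax_Main-Regular',
  'MathJax_Main-bold'   => 'MathJax_Main-Bold',
  'MathJax_Main-italic' => 'MathJax_Main-Italic',
  'MathJax_Math-italic' => 'MathJax_Math-Italic',
  'MathJax_Caligraphic' => 'MathJax_Caligraphic-Regular',
  'MathJax_Size1'       => 'MathJax_Size1-Regular',
  'MathJax_Size2'       => 'MathJax_Size2-Regular',
  'MathJax_Size3'       => 'MathJax_Size3-Regular',
  'MathJax_Size4'       => 'MathJax_Size4-Regular'
}.freeze

# Builds one @font-face rule per font, pointing at base_url.
# base_url is a placeholder for the directory that contains woff/ and otf/.
def font_face_rules(base_url)
  MATHJAX_FONTS.map do |family, file|
    "@font-face {font-family: #{family}; " \
      "src: url('#{base_url}/woff/#{file}.woff') format('woff'), " \
      "url('#{base_url}/otf/#{file}.otf') format('opentype')}"
  end.join("\n")
end

puts font_face_rules('/assets/MathJax/fonts/HTML-CSS/TeX')
```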

Conclusion of the first attempt

The described solution works quite well; it really does the job. After everything was set up, I noticed that the expression in a browser looks the same as before (that’s the good part), but Pocket still doesn’t show anything (and that’s the bad part). So in conclusion, although it successfully moved the processing to the server side (at the cost of some of MathJax’s compatibility with older browsers), it did not solve the original problem.

The second attempt

The second thought is that instead of generating the HTML content, we capture an image of the rendered expression. This will surely work in every article reader, and it will look the same everywhere, so there will be no cross-browser issues. The first thing I tried was PhantomJS, but sadly it does not support custom fonts (here’s a GitHub issue) as of the current version. Version 2 will support them, but it is not available for all platforms at the moment.

The stack will therefore use Chrome in Xvfb, so it will not actually pop up a browser window, but it will behave exactly like one. We also need some PNG processing to crop the screenshot; Chunky PNG will do the job well.

Setup

We will reuse most of the code from the first attempt; there are only a few additions to make. First, we need to start Xvfb. Fortunately there is an excellent gem for this, reducing it to a Headless.ly block. Then we need to reconfigure Capybara to use Chrome instead of PhantomJS.

Taking and cropping the screenshot

Generating the actual image is quite easy; we just need to ask the browser for the actual position and dimensions of the expression. The following code shows how to do it:

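# Take a screenshot of the whole page.
driver.save_screenshot(image_path)

# The underlying Selenium element reports its position and size.
el = page.find('.MathJax .math').native

image = ChunkyPNG::Image.from_file(image_path)

x = el.location.x
y = el.location.y
width = el.size.width
height = el.size.height

# Crop the screenshot down to the expression and overwrite the file.
image.crop!(x, y, width, height)
image.save(image_path)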

And that’s it, we have the image of the expression and we can substitute all MathML with it.
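The crop rectangle above hugs the element exactly. If you want some breathing room or a minimum width, the arithmetic can be kept in a small pure function. This is only a sketch: `crop_box` and all of its parameter names are hypothetical, and it clamps the box to the image because cropping outside the screenshot would likely fail.

```ruby
# Computes [x, y, width, height] for a crop around an element.
# The box is grown by `padding` pixels on every side, widened to at least
# `min_width` where the image allows it, and clamped to the image bounds.
def crop_box(x:, y:, width:, height:, image_width:, image_height:,
             padding: 0, min_width: 0)
  left   = [x - padding, 0].max
  top    = [y - padding, 0].max
  right  = [x + width + padding, image_width].min
  bottom = [y + height + padding, image_height].min

  # Extend to the right if the box is narrower than the requested minimum.
  right = [left + min_width, image_width].min if right - left < min_width

  [left, top, right - left, bottom - top]
end
```

The result can be splatted into the crop call, for example `image.crop!(*crop_box(x: x, y: y, width: width, height: height, image_width: image.width, image_height: image.height, padding: 5))`.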

Conclusion of the second attempt

This imaging solution works quite well; you can check the earlier posts in this blog. I needed some tweaks like a minimal width, some padding, and custom CSS; you can find the documentation and the end result on GitHub. The downside is that it requires some software that is unlikely to be present on every machine, and it’s kinda slow. When PhantomJS 2 becomes generally available, I’ll look into whether it can substitute for a real browser.

17 March 2015
