Towards LaTeX in the Browser

hackernoon.com

110 points by jxxcarlson 8 years ago · 67 comments

jdleesmiller 8 years ago

(I'm a co-founder at Overleaf.com, which does collaborative 'LaTeX in the browser' in a different sense.)

I like the idea of a 'sane' subset of LaTeX that is easy to publish to the web. There are tools like LaTeXML and TeX4ht that try to convert general LaTeX documents to (X)HTML, but it's a very hard problem.

Some difficulties arise from the fact that TeX is just very hard to parse in general. Even the first stage of parsing TeX is Turing complete [1]. This makes it hard to write tooling e.g. for linting (though tools exist, e.g. chktex) or creating a WYSIWYG editor backed by LaTeX [2]. (edit: or creating a good LaTeX auto-complete [4])

Others arise from TeX's extensibility --- there are many thousands of packages that define their own commands and environments for different types of documents and different disciplines. This extensibility is on the one hand one of the main reasons that TeX and LaTeX are still actively used some 40 years after TeX's initial release, but on the other hand a major challenge for conversion to HTML. The LaTeXML project has many custom bindings [3] for these packages, but it's far from complete.

I guess the main question is whether we can find the right subset, and this project looks like a great start.

[1] https://tex.stackexchange.com/questions/4201/is-there-a-bnf-...

[2] https://www.overleaf.com/blog/81 --- my first attempt at rich text on Overleaf, many years ago

[3] http://dlmf.nist.gov/LaTeXML/manual/customization/customizat...

[4] https://www.overleaf.com/blog/523-a-data-driven-approach-to-...

  • moultano 8 years ago

    I really like the idea of adding math support to Unicode via combining characters. It's more complicated than anything Unicode currently deals with, but not that much more complicated, and the idea of being able to put math into anything that currently accepts strings is just so enticing. We should treat math as its own language, and render it as we would any other human language with an unusual way of laying out characters.

    • Bromskloss 8 years ago

      It's an interesting idea. At what point, though, do we draw the line between what a character set (like Unicode) should handle, and what should be handled by a higher-level layer? I'm thinking that things like boldness, italicisation, and superscript aren't really the job of a character set.

      • yorwba 8 years ago

        Unicode already has 𝐛𝐨𝐥𝐝, 𝘪𝘵𝘢𝘭𝘪𝘤 and ˢᵘᵖᵉʳˢᶜʳⁱᵖᵗ variants of the Latin alphabet.
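To make the point about styled Latin variants concrete, here is a minimal Python sketch that maps ASCII letters and digits into the bold range of the Mathematical Alphanumeric Symbols block. The function names `to_math_bold` and `fold_for_search` are made up for illustration; only the code point offsets and the standard `unicodedata` module come from Unicode and Python themselves.

```python
import unicodedata

# Starting code points in the Mathematical Alphanumeric Symbols block.
BOLD_UPPER_A = 0x1D400  # MATHEMATICAL BOLD CAPITAL A
BOLD_LOWER_A = 0x1D41A  # MATHEMATICAL BOLD SMALL A
BOLD_DIGIT_0 = 0x1D7CE  # MATHEMATICAL BOLD DIGIT ZERO


def to_math_bold(text: str) -> str:
    """Map ASCII letters and digits to their mathematical bold code points."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(BOLD_UPPER_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(BOLD_LOWER_A + ord(ch) - ord("a")))
        elif "0" <= ch <= "9":
            out.append(chr(BOLD_DIGIT_0 + ord(ch) - ord("0")))
        else:
            out.append(ch)  # leave punctuation, spaces, etc. untouched
    return "".join(out)


def fold_for_search(text: str) -> str:
    """Undo the styling with NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", text)
```

The bold string is a different sequence of code points from plain "bold", so a naive search will not match it; `fold_for_search` shows that NFKC normalization maps it back to plain letters, which bears on the question of whether such styling belongs in the character set.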

      • moultano 8 years ago

        I'd say if the formatting changes the meaning of the language, Unicode should support it. So if you are searching through text, any change to your query string that you would like to constrain the text that matches should be supported by Unicode. Unicode should at least support anything that affects the semantic equality of strings.

        • Bromskloss 8 years ago

          I'm thinking of Unicode as a character set, and that text exists on an abstraction level above characters.

          • moultano 8 years ago

            I was thinking that it's analogous to ก็็็็็็็็็็็็็็็็็็็็ where the character dictates how the surrounding characters are rendered.

    • cryptonector 8 years ago

      Hmmm, certainly Unicode ought to be able to represent mathematics as a script like any other. However, the complexity involved is non-trivial. To make things easier, whatever Unicode might do for math should have a mapping to and from TeX or MathAJAX. In any case, Unicode is rather complex as it is; I'm not sure I look forward to this extra level of complexity :(

  • jxxcarlsonOP 8 years ago

    Hi, I am the developer of MiniLatex. Overleaf looks like a fantastic tool.

  • bhrgunatha 8 years ago

    > I like the idea of a 'sane' subset of LaTeX that is easy to publish to the web.

    That's probably the only approach that really makes sense:

    During the past decade I was surprised to learn that the writing of programs for TeX and Metafont proved to be much more difficult than all the other things I had done (like proving theorems or writing books). The creation of good software demands a significantly higher standard of accuracy than those other things do, and it requires a longer attention span than other intellectual tasks.

kovariance 8 years ago

I have found KaTeX to be the best currently-available solution. In particular, it can be rendered without client-side javascript.

marvy 8 years ago

Just a bit of historical correction. The article/post says:

"Ten years later, in 1978, his work bore fruit"

This gets things pretty wrong. He got the idea in 1977, and his estimate of "this will take 6 months" was pretty close, in that the initial version was finished sometime in 1978. It then took about another ten years to be "actually done". (Rewrite, add features, fix bugs, create Metafont, create WEB, etc...)

DavidSJ 8 years ago

Completely beside the point, but that integral evaluates to sqrt(2pi), not sqrt(pi).
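The thread does not quote the integral itself, so the following is only a sketch under the assumption that the article shows the Gaussian integral with exponent $-x^2/2$. Under that assumption, the substitution $x = \sqrt{2}\,u$ reduces it to the standard result $\int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}$:

```latex
\int_{-\infty}^{\infty} e^{-x^2/2}\,dx
  = \sqrt{2}\int_{-\infty}^{\infty} e^{-u^2}\,du
  = \sqrt{2}\cdot\sqrt{\pi}
  = \sqrt{2\pi}
```

If the exponent were $-x^2$ instead, the value would be $\sqrt{\pi}$, which is presumably where the mix-up comes from.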

applecrazy 8 years ago

I wonder if somebody has taken TeX and compiled to the browser in wasm using emscripten. That would be easier to port but heavy on load times.

Edit: it exists! https://github.com/manuels/texlive.js/ is a limited port of LaTeX to JS, rendered to PDF

  • TheRealPomax 8 years ago

    Classic TeX would be damn near useless in the age of Unicode, so you're looking at something like XeLaTeX or LuaTeX. The problem is that it's really easy to implement a really basic form of TeX, but unless you already planned for the really hard cases, maintaining your implementation is going to become intractable. TeX's real text typesetting is almost always woefully ignored even though _everything_ has to typeset beautifully, not just maths, and in a modern version of TeX that has to happen without insane syntax: we should be able to "just write" a Unicode character rather than needing all kinds of dedicated macros just for diacritics, or for something as simple as mixing two writing scripts that necessitate two entirely different fonts.

    • badsectoracula 8 years ago

      > Classic TeX would be damn near useless in the age of Unicode

      Unless you want to write in English, which i am going to bet still has a somewhat large audience :-P

      • PeterisP 8 years ago

        As soon as you want to mention names of people, English text often requires Unicode characters. Looking up some examples, the first random paper I took from arxiv mentioned three surnames that needed Unicode, the second needed four, including the name of one of the authors herself.

        Even if you're talking purely about people in the USA - for example, a page of MIT faculty https://www.eecs.mit.edu/people/faculty-advisors includes names like Jesús, Corbató and Tomás.

        • badsectoracula 8 years ago

          Doesn't TeX already handle that? A quick search shows http://vjimc.osu.cz/TeXform.html

          FWIW my own name would need Unicode too (Κώστας Μιχαλόπουλος) but i always use its romanized form (Kostas Michalopoulos) in English. I think that is common when writing English text and names from languages that do not use the Latin (or derived) alphabet.

          • TheRealPomax 8 years ago

            An answer here would be way too long, but the short answer is "no". The technologies that were available at the time of TeX meant that TeX had to do all kinds of things that in today's world are bizarre.

            TeX has seen a lot of improvements over the last 30 years, and modern TeX engines such as XeTeX and LuaTeX have removed a lot of the insane pain points that came with traditional TeX, which worked well only because there was literally nothing better at the time.

            A modern TeX engine will let you just write what you want to write, using all of Unicode as your playground, using modern OpenType fonts, and with real vector graphics. None of those things can be done with original TeX, not just "it's hard to", it's literally impossible without rewriting it from the ground up. Which is why we HAVE modern TeX engines: just because it worked, doesn't mean it was good. It was merely the best available at the time.

            Time moved on.

      • TheRealPomax 8 years ago

        There is no "you". If the idea is to make a thing for the web, the audience is everyone there, not that one guy who insists they will only ever use English.

  • yodon 8 years ago

    Compiling to wasm certainly seems like the right place to start, though it might be non-trivial to get the output to render into a canvas.

svat 8 years ago

I wonder whether this is the right approach. TeX itself is one of the most heavily documented programs in existence. Not only are its workings documented in detail in The TeXbook (and a host of other books by other authors, such as Eijkhout's TeX by Topic) but even the program itself has been written in a “literate programming” style, with pretty formatted source code (with profuse comments) available in print (Vol B of Computers and Typesetting) and as a PDF (http://texdoc.net/texmf-dist/doc/generic/knuth/tex/tex.pdf), there's a detailed history/retrospective and log of every change that went into the program (see Chapters 10 and 11 of the book Literate Programming, though the log without explanation is also available online http://texdoc.net/texmf-dist/doc/generic/knuth/errata/errorl...), and there are even 12 hours of video of Knuth talking about the internals of the program (https://www.youtube.com/watch?v=bbqY1mTwrj8&index=12&list=PL...).

So when the article says:

> To reproduce all of LaTeX in the browser is too much to ask

I wonder why? The file tex.web is less than 25000 lines long, much of it comments, so I'd estimate that TeX itself is only about 20000 sloc (in fact tangle on tex.web generates a Pascal file tex.p which is only 6115 lines long). This is not a lot IMO, and it would be a lot better to actually re-implement this, with additional support for things like getting the parse tree etc.

patte 8 years ago

I was wondering recently if/how it would be possible to piggyback on LaTeX's gorgeous typesetting (placing the letters) to bring justified text to websites. I want to do a PoC that absolutely positions all letters of a basic document, as placed by TeX, for my screen size.

Did anyone ever see such an approach?

gravypod 8 years ago

Are there any other solutions to document typesetting with latex-like features? TeX is very obtuse for someone who hasn't been using it for a long time.

  • flother 8 years ago

    A common solution is to use LaTeX, but to use it indirectly: write in Markdown and convert to PDF using Pandoc [1], which uses LaTeX in the background. This is (part of) the process used in RMarkdown [2], for example. That way, you get all the benefits of TeX and LaTeX but without most of the pain.

    [1]: https://pandoc.org/index.html [2]: http://rmarkdown.rstudio.com/
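The Markdown-to-PDF route described above can be sketched in a few lines of Python that shell out to Pandoc. This is only an illustration: the function names `build_pandoc_command` and `markdown_to_pdf` and the file names are hypothetical, and choosing `xelatex` as the engine is an assumption (it requires a TeX installation that provides it). The `-o` and `--pdf-engine` options are Pandoc's own.

```python
import shutil
import subprocess


def build_pandoc_command(source: str, output: str, pdf_engine: str = "xelatex") -> list:
    """Build the argument list for converting a Markdown file to PDF via LaTeX."""
    return ["pandoc", source, "-o", output, f"--pdf-engine={pdf_engine}"]


def markdown_to_pdf(source: str, output: str) -> None:
    """Run Pandoc; fails loudly if Pandoc is missing or the conversion errors out."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(build_pandoc_command(source, output), check=True)
```

Calling `markdown_to_pdf("paper.md", "paper.pdf")` would then produce the PDF, with LaTeX doing the typesetting behind the scenes.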

    • curiousgal 8 years ago

      I just use Atom with the markdown-preview-plus package for live preview.

    • gravypod 8 years ago

      I've seen some people do org-mode -> TeX -> research paper. It's very impressive. I just wish there was something like that with a more GUI/polished feel.

  • globuous 8 years ago

    I've been using org-mode and exporting it to HTML. Then making an @media(print) style sheet and exporting the HTML/printCSS to PDF through princeXML.

    It's been amazing. Latex equations are exported as pngs (for PDF export because I don't think prince does Mathjax, but org mode can export to mathjax). I have my bibliography with bibtex2html. And templating my pdfs becomes so much easier than with latex. It's just HTML CSS !! My figures are numbered and captioned and referenced throughout the text, same for tables. And my table of contents is generated. And code is highlighted. And I have access to ditaa for ascii flow charts and a bunch of other stuff (for making uml in ascii with png export for the PDF for example). It also handles excel like tables with formulae (possible to have lisp formulae !! So cool !!) in text mode !!. And of course, you can plot your table through gnuplot inside your org file. You tell it which columns and rows, the type of graph etc :)

    It's also easy to include other org files, or to go down to raw HTML for the export (rather than org mode->HTML) if need be (for a picture that spans over 2 pages for instance).

    Give it a try, you might like it ;) In the end it's just an org mode export to HTML to PDF with the print CSS media query. But it works remarkably well and you have all the org mode features.

    • lorenzhs 8 years ago

      Any particular reason why you don't use org-mode's latex export (org-latex-export-to-pdf / C-c C-e l p) directly? It will render math nicely, not as embedded images, etc.

      • globuous 8 years ago

        It's really because of theming. I was trying to theme my latex document, but I don't know tex well. I do know CSS well though. So theming my header, my margins, my blockquotes, my images etc is very easy in CSS. I have no idea how to achieve this easily with tex.

    • gsnedders 8 years ago

      > Latex equations are exported as pngs (for PDF export because I don't think prince does Mathjax, but org mode can export to mathjax).

      Prince does MathML, at least, if you want to avoid images.

  • pdm55 8 years ago

    Geogebra, https://www.geogebra.org/, might suit your purpose. See it in action https://www.youtube.com/watch?v=GjPakjpEAXs You can use it to produce docs, such as https://www.geogebra.org/m/M4nBYbbG#material/c3wwdgD5

  • applecrazy 8 years ago

    For lightweight stuff, there's vanilla Markdown, but you have no control over formatting. For more serious work using markdown, you can try out Ulysses[0] or Scribus[1].

    And, if you feel like spending an obscene amount of money, on the order of $10k, there's Arbortext APP[2]. (I don't know why this even exists?)

    [0]: https://ulyssesapp.com/ [1]: https://www.scribus.net/ [2]: https://en.wikipedia.org/wiki/Arbortext_Advanced_Print_Publi...

  • funkaster 8 years ago

    There was Lout[1], but it seems to be abandoned. I really liked it, especially the simplified syntax (compared to latex). It was also unicode-safe by design.

    [1]: https://en.m.wikipedia.org/wiki/Lout_(software)

  • beefhash 8 years ago

    UNIX has been doing that for the past 40 years until AT&T ripped troff out of standard UNIX installations.

    Look into groff and possibly heirloom doctools. It's fairly difficult to learn and the default macro packages on most installations may be somewhat difficult to come to terms with/adjust for your own needs. You're definitely expected to learn basic troff macros to hack up a macro package if needed. See also: http://www.schaffter.ca/mom/ and https://utroff.org/

  • ufo 8 years ago

    You might want to check out LyX. It is a GUI editor that generates beautiful TeX documents but it is designed to be a user-friendly document processor instead of just a TeX GUI.

    http://www.lyx.org/

martyalain 8 years ago

What do you think of this project {lambda way} as an alternative to LaTeX in a browser: http://lambdaway.free.fr

For instance, from this wiki page http://lambdaway.free.fr/workshop/?view=oxford I could directly generate a PDF paper, http://lambdaway.free.fr/workshop/data/lambdatalk_20170728.p..., and slides, http://lambdaway.free.fr/workshop/?view=oxford_slides

Some other pages in this workshop: http://lambdaway.free.fr/workshop/?view=factory http://lambdaway.free.fr/workshop/?view=NIL http://lambdaway.free.fr/workshop/?view=teaching http://lambdaway.free.fr/workshop/?view=lambdacode

Your comments are welcome.

Alain Marty

etaioinshrdlu 8 years ago

I used https://github.com/phfaist/pylatexenc to convert LaTeX to unicode text, with math symbols and superscripts etc.

It's of course never going to be as good looking as MathJax or something like that -- but it may be more appropriate to be able to treat it as plain Unicode text in some cases.

For instance, it works in title fields across the web and search engines will understand it better than anything else.
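To illustrate the general idea of turning LaTeX into plain Unicode text, here is a tiny standard-library sketch. It is not how pylatexenc works and does not use its API; the symbol table, the `latex_to_unicode` name, and the handful of supported constructs are assumptions chosen only for the example.

```python
import re

# A deliberately tiny, hypothetical symbol table; a real tool covers far more.
SYMBOLS = {
    r"\alpha": "\u03b1",
    r"\beta": "\u03b2",
    r"\pi": "\u03c0",
    r"\infty": "\u221e",
    r"\leq": "\u2264",
    r"\times": "\u00d7",
}

# Characters that have dedicated Unicode superscript forms.
SUPERSCRIPTS = str.maketrans(
    "0123456789+-n",
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079\u207a\u207b\u207f",
)


def latex_to_unicode(src: str) -> str:
    """Convert a very small subset of LaTeX math to plain Unicode text."""
    # Replace known \commands with symbols; unknown commands are left as-is.
    text = re.sub(r"\\[A-Za-z]+", lambda m: SYMBOLS.get(m.group(0), m.group(0)), src)

    # Convert ^{...} or ^x when every character has a superscript form.
    def superscript(match):
        body = match.group(1) or match.group(2)
        return body.translate(SUPERSCRIPTS)

    text = re.sub(r"\^(?:\{([0-9+\-n]+)\}|([0-9n]))", superscript, text)

    # Drop the math-mode dollar fences.
    return text.replace("$", "")
```

For example, `latex_to_unicode(r"$e^{-2} \leq \pi^2$")` yields a plain string that can go into a title field, at the cost of handling only what the tables above know about.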

emeryberger 8 years ago

There is not really a need to modify LaTeX at all to make it run in the browser. It already exists. Without modifying a single line of code, we have implemented a full browser-based port of LaTeX as part of our Browsix project, which makes it possible to run full, unmodified Unix applications inside the browser. See http://browsertex.org and http://browsix.org (and http://bpowers.net and https://jvilk.com/ and http://plasma.cs.umass.edu).

djuerges 8 years ago

I actually did 'LaTeX in the browser' as a master's thesis in 2014, but never went on to develop it further, whether as an open-source project or with commercial intent. At the time, though, I thought it was at least on par with the few solutions that were out there, and it handled instant updates and real-time collaborative work on a document pretty gracefully.

Some neat improvements would have been versioning and so on, but you know, I never made it that far after picking up a job. Kind of a shame...

https://github.com/djuerges/cotex

jessriedel 8 years ago

I read the post but I still don't understand: is it possible to define new commands using \def or \newcommand? At first I thought these are what the author meant by "macro", but later he says

> We are exploring ways for users to define non-default environment behaviors in the browser. The same goes for macros used outside the dollar and double-dollar fences.

But I can't use \def or \newcommand to define things that appear inside dollar signs either.

  • jxxcarlsonOP 8 years ago

    Here is an example:

    $$ \newcommand{\bra}{\left<} \newcommand{\ket}{\right>} $$

    $$ \bra a | b \ket $$

    If you go to https://jxxcarlson.github.io/app/minilatex/src/index.html, press the "Clear" button, then paste the above text, then press "Render", you should see the macros \bra and \ket properly rendered.

    • jessriedel 8 years ago

      Oh I see, thanks. For what it's worth, I would definitely include this example in the demo; it's basically the first thing I wanted to use. Given your pipeline, it makes sense that the \newcommand definitions themselves have to appear inside dollar signs (not just when they are used), but for people with a TeX background it's pretty unintuitive.

      Also, you should definitely use \langle and \rangle in place of < and > for bra-ket notation :)
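As a sketch of that suggestion, the earlier example could be rewritten with the angle-bracket delimiters as follows. It keeps the same double-dollar fencing as the example in the thread; whether MiniLatex renders it identically has not been checked here.

```latex
$$ \newcommand{\bra}{\left\langle} \newcommand{\ket}{\right\rangle} $$

$$ \bra a | b \ket $$
```

With `\left\langle` and `\right\rangle` the brackets are proper delimiters that scale with their contents, whereas `<` and `>` are relation symbols and get relation spacing.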

angarg12 8 years ago

Just for fun here is a little web game I made to look like a maths paper using MathJax.

https://angarg12.github.io/TrueExponential/

jimhefferon 8 years ago

PreTeXt from http://mathbook.pugetsound.edu/ has gotten some mindshare.

abritinthebay 8 years ago

I love the output of LaTeX but the language itself (and its dependencies and packages) are an absolute horror show.

I’ve never understood how people can bear to learn it: writing it is painful, its tooling is abysmal, and it rarely seems to work except on the machine of the person who wrote it.

We’ve got to be able to do better.

  • mkl 8 years ago

    > its tooling is abysmal

    It seems like you haven't tried many editors. Have you tried TeXStudio (https://www.texstudio.org/)?

    > it rarely seems to work except on the machine of the person who wrote it.

    I and many others edit the same documents at the university where I work, without significant issues. Distributions like TeXLive (https://www.tug.org/texlive/) provide a consistent all-inclusive cross-platform solution.

    • abritinthebay 8 years ago

      TeXStudio would be a perfect example of its abysmal tooling. It’s better than the CLI tools but it’s an awful editor and highlights how incompatible with a good writing experience LaTeX is.

      Yes, many people produce good work in it - its output is fantastic after all - but an editor that would have been a substandard user experience in the 90s is the best LaTeX has in tooling.

      That’s exactly what I mean!

      • diffeomorphism 8 years ago

        Can you try to phrase that more precisely/constructively by including a reason why it is "awful" or give an example of "good tooling"?

        As far as editing goes, latexmk, syntax highlighting and good shortcuts are all I ever use and am perfectly happy with (emacs+auctex). It is a different paradigm than WYSIWYG, but different does not say anything about good or bad.

        Now writing new latex classes, I agree. That is very unintuitive and would greatly benefit from simplification, templates and tools.

        • abritinthebay 8 years ago

          I could go into a long, detailed, breakdown of how bad TeXStudio is but, frankly, if they want UI/UX work they should pay for it. Which they clearly don’t.

          It’s... decent enough in the pack of “open source UI” but that isn’t a high bar.

          Here’s the thing about that (oft repeated) line about WYSIWYG vs WYSIWYW: it’s bullshit.

          There’s no justification for it other than the deficiencies of the tooling and tool chain. It’s an excuse.

  • notthemessiah 8 years ago

    A task easier said than done.

    Also, it should be considered that it's impossible to make breaking changes in the LaTeX language, otherwise you lose the ability to compile a paper from 30 years ago.

    But if you're trying to do something simple, I would say go for pandoc and use whatever format you're comfortable with, then convert it to TeX: https://pandoc.org/
