Jon Sterling 2025 5 27 2025 5 30 https://www.forester-notes.org/QHXS/

QHXS

/QHXS/ Intellectual junkyards I was first inspired to create Forester after exploring the Stacks Project and realising that what I wanted most dearly was the ability and digital space to create my own, and feeling the pain when I tried to install the essentially defunct Gerby software that runs it. Paolo Brasolin was kind enough to help me with this, by building Sheafy on top of Jekyll. I eventually reached the limits of that tool and created Forester on my own (first as a Hugo plugin, and finally as standalone OCaml software). Later I was joined by Kento Okura who has become a strong partner in designing and developing Forester. Whilst exploring the space of tools for thought, I became aware of a cluster of related and deeply inspiring ideas, starting with Andy Matuschak’s evergreen notes which led me to Niklas Luhmann’s Zettelkasten and various kinds of digital gardening. Although all these trends are slightly different and shouldn’t be conflated, the common thread is to accumulate a bunch of small topical notes over many years in hypertext. (Luhmann’s Zettelkasten was of course a physical hypertext, where hyperlinks are hand-written and addressed to locations in his card box, whereas many Zettelkästen today are digital.) Matuschak’s “evergreen” concept refers to the practice of having notes evolve and remain up to date, as opposed to scratch notes that lose salience with time. I was interested enough in the idea that I spent some time thinking about how it might be adapted to the mathematical sciences, and I wrote a manifesto of sorts on the topic: Designing tools for scientific thought. This manifesto guided the design of Forester, whose name was itself inspired by Matuschak’s writings on evergreen notes. For some time now, however, I have been noticing a great deal of friction in my own use of evergreen notes as a tool for advancing science; for this reason, I have been using my own forest in a different way.
Jon Sterling 2025 5 27 2025 5 30 https://www.forester-notes.org/QHXT/

QHXT

/QHXT/ Life in the junkyard after a few years Let me describe how someone like me predictably starts building up their forest. Because I’m interested in category theory, I start by creating a tree entitled Category with taxon Definition. I start doing the same for other important notions, like functors, natural transformations, etc. Eventually I prove the Yoneda lemma! I spend about a year building up a glorious forest full of category theory. Only then do I realise that ordinary category theory, in which a category is defined to have a set (or a class) of objects, is a mess; I want to switch to univalent category theory which smooths over many of the rough edges. Unfortunately, univalent foundations is very different from set theoretic foundations in subtle ways. I have a plan to refactor my entire forest to account for this, but obviously I have to get my actual work done too. I’ll just hold off on that epic refactoring... I discover an interesting theorem or insight, and I want to write it down in my forest. But my motivation dies when I realise that it would mean adding to the old-and-busted notes, which would increase my future burden when I will definitely (not kidding) update my hundreds of notes. With my motivation killed, I forget about the insight and move on. By the way, not only do I want to switch to univalent categories, I also want to re-do everything in terms of displayed categories. Ouch. This happens enough times, and the forest dies and becomes a junkyard. The End. I suspect this is a somewhat common negative experience with Zettelkästen, but not many people talk about it. In one brave exception, Christian Tietze writes about a similar recent experience with his Zettelkasten, in which he has written a ton of now-outdated notes about subtleties of Apple’s SwiftUI and Observation frameworks. Keeping such things “evergreen” is almost impossible because Apple frameworks are a rapidly moving target.
The point I want to make is that mathematics and the sciences are also rapidly moving targets, and if your outlook on them slows its roll long enough for it to become practical to keep a sizable forest evergreen, it may indicate intellectual stagnation more than intellectual wealth. I must also point out that keeping a forest evergreen in this sense does not only amount to updating various trees periodically. One’s mathematical development over time tends to involve not only evolving definitions and proofs, but also an evolution of the entire ontology—which means that the very decision of what to put in one tree and what to put in another tree is difficult to commit to in the long term. Jon Sterling 2025 5 27 2025 5 30 https://www.forester-notes.org/QHXU/

QHXU

/QHXU/ Tree-rings and the dialectics of knowledge production My own experience with the long-term maintenance of a forest or Zettelkasten is that the level of activity therein is uneven over time—there are periods of expansion followed by periods of quiet. In the former, trees tend to cluster around not only a group of related topics but also a particular approach to ontologising those topics. When you encounter one of those trees later on, you can kind of remember when it was written because of the way it was written and the particular relationships you chose to reify in hypertext. Seeing these distinct “rings of growth” emerging in my own forest brought me to the realisation that my viewpoint on evergreen notes might have been overly simplistic. The question seems to be: should we resist the distinctness of intellectual growth rings by hewing to a homogeneity of the present, or should we embrace time and change as fundamental aspects of knowledge production? Jon Sterling 2025 5 27 2025 5 30 https://www.forester-notes.org/QHXV/

QHXV

/QHXV/ Resist ontology and embrace time Coming back to the difficulty of maintaining a forest in a state of intellectual evolution, I have found that a good approach is to organise my notes around “tracker” trees—which pertain to a particular topic or project but have little content of their own, serving mainly as backlink accumulators for notes written in my private journal, my weeknotes, and my blog. For example, one thing I am working on is Project Pterosaur—which will be a new proof assistant that combines dependent type theory with ideas from Isabelle/HOL. As you can see, there is very little content to the Project Pterosaur tree, but in the private version of my forest there are many backlinks coming from journal entries. I can reconstruct the entire project in my mind by reading through these entries via the backlinks panel, and that to me is far more useful than a (high-churn) “evergreen” note that explains my vision for the project and its status, etc. Obviously this is not a new idea—many people use tools like Obsidian, Logseq, Roam, etc. in the way I am describing. Jon Sterling 2025 5 27 2025 5 30 https://www.forester-notes.org/QHXX/

QHXX

/QHXX/ Blogging is the accretion of temporal insight A few months ago, I had a very enlightening conversation with my colleague Anil Madhavapeddy about blogging. Anil’s research group at Cambridge has a practice of internal blogging and weeknotes: pretty much everyone is writing about their work in blogs that are then syndicated into a Matrix channel. Then, discussion proceeds organically in the chat and in person. These blog posts are not the transient/fleeting notes discussed by Matuschak, because they are not scraps—they are evergreen in the “free” sense of being permanent discourses on a moment in time. For more than a year, I had already been maintaining a personal journal of daily notes in my private forest, but there was always some tension in my writing practices there because I was perfunctorily accumulating “records” of what I did rather than insight. What had not clicked for me at the time was that I really ought to have been blogging (for myself). Journalling in the most progressive sense is simply blogging with a restricted audience, facilitating the accretion of insight more than the mere memory of what happened when. Since my chat with Anil, I have shifted my practice to emphasise semi-public weeknotes whose goal is to provoke and continue discussions that are unfolding within the community that reads them (mainly my colleagues in the Computer Laboratory). Of course, this was only possible after I had added support for syndication to Forester! I still write private entries as well, of course, but not as many. Currently, Forester creates a lot of friction around access control and one of the goals in the coming year or two is to decrease this friction so that one does not have to think so carefully about where to put something that you want to write. Jon Sterling 2025 5 27 2025 5 30 https://www.forester-notes.org/QHXW/

QHXW

/QHXW/ Sprouting shoots, and the Hyperbook I have written above about a forestry practice that embeds time and the evolution of ideas at its core; this is achieved by resisting the packrat’s urge to ontologise every aspect of their intellectual life. This does not mean that static ontologisation has no place; for example, a textbook should not present an evolution of its own ideas (except insofar as it describes history), but should rather present a single unified viewpoint on a topic. Indeed, the original goal of Forester was to simplify the creation and deployment of resources like the Stacks Project and Kerodon, which I will hereafter refer to as hyperbooks. In the hyperbook, we can and must give full play to our instinct for ontologisation, and we are freed from the crushing obligation of permanence by treating a hyperbook in the same way that we treat a blog post: as a unified accretion of insight that pertains to a specific moment in time, making no pretensions as to the universality of the specific ontological approach. This should not surprise anyone who has written books or lecture notes before, but it does represent a retreat from what might now be viewed as an “extremist deviation” in the design of Forester. Prior to Forester 5.0, each person was expected to have just a single forest and all their writings (including lecture notes and hyperbooks!) were expected to be part of that forest. I found that this approach, however, intensifies all the kinds of mental stress that kill motivation and hasten descent into the intellectual junkyard. A better way forward is to liberally split off hyperbooks as independent forests, which can then be federated with your own forest (please note that this functionality is experimental and subject to change). The ability to federate forests is important, because otherwise one would have to choose between bidirectional linking and modularity; with federation, there is no need to make this choice.
Splitting off forests in Forester 5.0 still involves some annoying points of friction; the biggest one is the need for global identity that spans all forests. A solution to this problem will not ship with 5.0, but I am preparing to work on it for a subsequent release. Jon Sterling 2025 5 27 2025 5 30 https://www.forester-notes.org/QHXY/

QHXY

/QHXY/ Renew life, have a forest fire! The goal of the Forester project has always been to reduce mental stress and release untapped stores of creativity in the struggle for outlook and insight. In some sense, every iteration of Forester has been an experiment from which we are taking both positive and negative lessons. Some users of Forester have adopted extreme ontological practices (abetted, perhaps regrettably, by our dangerously powerful Datalog query engine). It is not my place to prescribe how people should use Forester, but I do care for people’s well-being and I strongly believe that these kinds of practices can destroy the mind over time. If you are feeling the weight of ontology in a heavy-laden forest, there is nothing wrong with a little forest fire. When the smoke clears, start a blog. References Jacob Lurie 2023 https://www.forester-notes.org/kerodon/

kerodon

/kerodon/ Kerodon Reference https://kerodon.net Kerodon is an online textbook on categorical homotopy theory and related mathematics. It currently consists of a handful of chapters, but should grow (slowly) over time. It is modeled on the Stacks project, and is maintained by Jacob Lurie. Its infrastructure uses Gerby, which is maintained by Pieter Belmans. The design was created by Opus Design. The Stacks Project Authors 2018 https://www.forester-notes.org/stacks-project/

stacks-project

/stacks-project/ The Stacks project Reference https://stacks.math.columbia.edu The Stacks project is an ever growing open source textbook and reference work on algebraic stacks and the algebraic geometry needed to define them. Here are some quick facts: The Stacks project is not an introductory text. It is written for graduate students and researchers in algebraic geometry. The aim is to build algebraic geometry and use this in laying the foundations for algebraic stacks. The theory of commutative algebra, schemes, varieties, and algebraic spaces forms an integral part of the Stacks project. The Stacks project has a maintainer (currently Aise Johan de Jong) who accepts changes etc. proposed by contributors. Everyone is encouraged to participate. The Stacks project is meant to be read online. Consequently we do not worry about length of the chapters, etc. With hyperlinks and the search function it is possible to quickly browse through the chapters to find the lemmas, theorems, etc. that a given result depends on. We use tags to identify results, which are permanent identifiers for a result. You can read more about this on the tags explained page. For a longer discussion, please read the blog post What is the stacks project?. Pieter Belmans Raymond Cheng Aise Johan de Jong https://www.forester-notes.org/gerby/

gerby

/gerby/ Gerby Reference https://gerby-project.github.io/ If you have a LaTeX document which is large (probably several hundreds of pages at least), is regularly updated, and needs to be externally referenced often, you will run into the problem that large PDFs are not easily navigable, PDFs of any size are not very searchable, and the internal numbering changes often, making external references outdated. Gerby addresses these problems by providing an online tag-based view, instead of just having a big PDF. Gerby is tailored towards making large online textbooks and reference works more accessible. In case you were wondering, a gerbe is a kind of stack (in the mathematical sense), and the software was originally meant for the Stacks project. Context Jon Sterling 2024 10 23 https://www.forester-notes.org/30FM/

30FM

/30FM/ Forester Blog Jon Sterling 2025 6 15 https://www.forester-notes.org/VNQ9/

VNQ9

/VNQ9/ Rewriting is where the magic happens One question I repeatedly get about Forester is: How do I export a paper written in Forester to a format that I can submit to a journal? It is technically possible to convert Forester’s output XML to LaTeX, using XSLT. There is an outdated stylesheet for that purpose that could be brought up to date. I think this misses something important, though. First of all, what you’ve written in your forest is probably, in form and structure, not a good fit for a journal or conference publication. Second of all, typesetting a publication requires a lot of special-casing and even rewording to avoid things like bad line breaks, etc.; none of that is reflected in the naïve idea of just “exporting your forest to a journal”, but that stuff is very important. I recommend writing in your forest to develop the ideas and explore the space. Then when it is time to fork off a publication, write that story from scratch using a tool that is suited for old-fashioned publication typesetting, like LaTeX. Writing from scratch is not a waste of time; it is the most important part, because you now get to weave together all your insights into a coherent and self-contained narrative that speaks directly to the intended audience of the work in their language; in this process you may adopt notations that are good specifically for that publication, but which you would not wish to expose to (e.g.) a more technical audience, etc. The goals of publishing a traditional article are in total contrast to the goals of writing a forest, which is to leverage the accumulation of interconnected insight (in a necessarily non-self-contained way). There is nothing that I love more about my job as a researcher than taking something I’ve written extensive technical notes on, and then writing the story out linearly from scratch, choosing what to keep and what to omit, how to explain, in what order, and with what notations, for a specific audience. 
This is where the magic happens. Jon Sterling 2025 3 25 https://www.forester-notes.org/JVIT/

JVIT

/JVIT/ Towards Forester 5.0 II: a design for canonical URLs One of the goals of Forester 5.0 is lightweight federation—the ability to have two forests participate in the same graph and therefore provide backlinks, etc. In a previous post (Towards Forester 5.0: a design for global identity), I talked about some of the difficulties that arise when dealing with identities of people and references that have global scope but could nonetheless be described by trees in many forests. I proposed that such things should be addressed by canonical URIs (e.g. DIDs, DOIs, etc.) and that Forester should grow the ability to bind a canonical URI to multiple trees, which are then gathered into a disambiguation page. Today I want to broaden the discussion to cover the difficulties of addressing trees themselves (as opposed to the global entities they may describe). This is a proposal and I welcome feedback. Jon Sterling 2025 3 25 https://www.forester-notes.org/JVIU/

JVIU

/JVIU/ Forester must become part of the Web I have been working on developing the prerequisites for Forester to emit RSS and Atom feeds for blogs, and I realised that the problem I was trying to solve earlier this month is more multifaceted than I originally thought. It comes down to analysing what is needed for Forester to be a good citizen of the World Wide Web: in particular, if we emit an RSS feed that has hyperlinks to some trees in it, those links must refer to an actual page on the actual web rather than something specific to Forester’s ontology. This may seem downright obvious in hindsight, but you must understand that for the longest time I was not thinking of Forester as a tool for progressively enhancing the Web, but rather as a tool for building fully-local life-wikis or Zettelkästen; I no longer believe that my former viewpoint is reasonable, and I have concluded that we must integrate Forester into the Web or else we will be buried under friction. This post is the start of a design for how to do this. Forget what you know about how either Forester 5.0 or previous versions currently work; in order to solve these problems in a reasonable way, we cannot be bound by the past versions of an experimental tool. What we are bound by is the architecture of the World Wide Web, and that will be reflected in the design. Jon Sterling 2025 3 25 https://www.forester-notes.org/JVIV/

JVIV

/JVIV/ What is the proposal? Here is the essence of the proposal: We get rid of the forest://host/addr scheme. Instead, trees are globally addressed by a canonical URL. The canonical URL of a tree can in principle be arbitrary, but in practice you will want it to be a place where that tree can be viewed — e.g. the place to which it will be uploaded and served via HTTP(S). Indeed, a default scheme will be provided so as to enable files to be rendered with names and relative locations consistent with the intended global addressing scheme; it is also possible to imagine customisation of this without disturbing the overall design. The canonical URLs are now the vertices of the graph. In Forester source code, a hyperlink like would be resolved right away to or something, using information supplied in the user’s forest; the same goes for transclusion. Links to trees in foreign forests must, for now, be totally explicit (but we can imagine relaxing this in the future). Importantly, this approach does not require knowing what is in the forest at evaluation-time. Jon Sterling 2025 3 25 https://www.forester-notes.org/JVIW/
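The resolution step described under “What is the proposal?” can be sketched roughly as follows. This is only an illustration in Python, not Forester's implementation: the configuration name `FOREST_BASE_URL` and the functions `canonical_url` and `link_edge` are hypothetical, and the URL layout simply mirrors the addresses visible in this feed.

```python
from typing import Tuple
from urllib.parse import urljoin

# Hypothetical forest configuration: the place where this forest's trees
# are published. The name is an assumption made for this sketch.
FOREST_BASE_URL = "https://www.forester-notes.org/"


def canonical_url(address: str, base_url: str = FOREST_BASE_URL) -> str:
    """Resolve a link target to a canonical URL.

    A bare local address such as "JVIV" is resolved against the forest's
    base URL. An explicit URL (required for trees in foreign forests) is
    returned unchanged.
    """
    if "://" in address:
        return address
    return urljoin(base_url, address.strip("/") + "/")


def link_edge(source: str, target: str) -> Tuple[str, str]:
    """An edge of the graph is a pair of canonical URLs.

    Nothing here inspects the contents of the forest, which is the point:
    resolution does not need to know which trees exist at evaluation time.
    """
    return (canonical_url(source), canonical_url(target))
```

For instance, `canonical_url("JVIV")` yields `https://www.forester-notes.org/JVIV/`, whereas a link written as a full URL passes through untouched.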

JVIW

/JVIW/ What about replication and mirroring? It may seem annoying to have canonical URLs. For example, a forest that contains vital information might need to be published in multiple places. That much is true, but the fact that the physical publication of a forest is replicated should not be allowed to impact the graph or fill it with redundant vertices and edges (e.g. should two mirrors become federated). So the only problem with replication is that hyperlinks might take you to the original forest instead of keeping you in the mirror, but I think this should be resolved by some kind of middleware that rewrites links, just as the Wayback Machine rewrites links in its snapshots. That can be handled outside of Forester. Jon Sterling 2025 3 25 https://www.forester-notes.org/JVIX/
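To make the link-rewriting middleware mentioned under “What about replication and mirroring?” a little more concrete, here is a minimal sketch in Python. The function name `rewrite_links` and the mirror address are made up for illustration, and a real middleware would likely use an HTML parser rather than a regular expression.

```python
import re


def rewrite_links(html: str, canonical_base: str, mirror_base: str) -> str:
    """Point href attributes at a mirror instead of the canonical site.

    Only links that start with canonical_base are touched; links to other
    sites are left alone. The graph itself is unaffected, since this runs
    on already-rendered pages outside of Forester.
    """
    pattern = re.compile(r'href="' + re.escape(canonical_base))
    # A function is used as the replacement so that mirror_base is inserted
    # literally, without backslash escapes being interpreted.
    return pattern.sub(lambda match: 'href="' + mirror_base, html)
```

A mirror at the (hypothetical) address `https://mirror.example/` would call `rewrite_links(page, "https://www.forester-notes.org/", "https://mirror.example/")` on each page it serves.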

JVIX

/JVIX/ What about viewing my forest locally? Most of the time, an author is working with their forest on their own machine rather than on the web. It is important that links and transclusions point to the local content rather than whatever (if anything) is stored in the “global” canonical URL. I believe this is not actually a problem: although things like RSS feeds and perhaps even published websites would have all the hyperlinks point to the canonical URLs, there is no reason that this should be required for all renderers. It is easy to imagine making this a configurable flag for the default renderer, and for the upcoming “dynamic”/interactive HTML server we would emit links back to the local server rather than to the canonical URLs. Similarly, there may be projects where there is no intention at all of online publication. In such cases, the scheme for assigning canonical URLs can be arbitrary. Jon Sterling 2025 3 25 https://www.forester-notes.org/JVIY/
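The configurable behaviour suggested under “What about viewing my forest locally?” might look something like the sketch below. It is an assumption-laden illustration in Python: `render_href`, `forest_base`, and `local_base` are hypothetical names, not Forester options.

```python
from typing import Optional


def render_href(canonical: str, forest_base: str,
                local_base: Optional[str] = None) -> str:
    """Choose the href that a renderer emits for a tree.

    With local_base unset (e.g. for RSS feeds or a published site), the
    canonical URL is emitted. With local_base set (e.g. a local preview
    server), links into this forest are redirected to the local copy,
    while links to foreign forests keep their canonical URLs.
    """
    if local_base is None or not canonical.startswith(forest_base):
        return canonical
    relative = canonical[len(forest_base):].lstrip("/")
    return local_base.rstrip("/") + "/" + relative
```

The vertices of the graph remain canonical URLs in both modes; only the rendered hyperlinks differ.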

JVIY

/JVIY/ What about access control? Forester does not currently support any kind of access control, but this is indeed an important area that we are considering carefully in order to enable institutional use of Forester, and ease the burden of collaboration in the usual case of a forest that contains a mixture of data with varying levels of confidentiality. I believe that the current design is compatible with essentially any approach to access control that we might adopt, but I am interested in feedback to the contrary. Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOJ/

OYOJ

/OYOJ/ Towards Forester 5.0: a design for global identity As we move closer to Forester 5.0, which introduces rudimentary federation capabilities, we must address new problems that did not arise in the days when no two forests interacted or linked to each other. The most immediate issue is that trees describing entities with “global” identity (including actual people as well as bibliographic references) will naturally be duplicated across many forests. For example, this happens when one person authors trees in multiple forests, and it happens even more often with bibliographic entries (both for the entries themselves and their author attributions). It is very important to handle this problem properly now in a way that (1) minimises friction and (2) enables us to quietly evolve toward more Web-centric approaches to identity as they emerge. Below, I survey some existing approaches to identity that we would hope to be compatible with at some level. If you want to skip to my concrete proposal, see § . Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOL/

OYOL

/OYOL/ Survey of global identification schemes There are several extant schemes for identifying individuals, organisations, and artefacts. Some are centralised, and others are decentralised. Centralisation of identity is not necessarily a bad thing, but it is most viable when nearly everyone agrees on the central authority; on the other hand, decentralisation can help in situations where a single central authority has not accumulated enough trust or prestige to be viable. Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOM/

OYOM

/OYOM/ Centralised identification via DOIs and ORCIDs Nearly every scholarly paper and book published has a Digital Object Identifier (DOI) assigned to it; DOIs are managed by a single authority (The DOI Foundation), and this applies to both traditional publishers and eprint servers like the arXiv. Services like Zenodo allow individuals to mint their own DOIs and pin resources and artefacts to them. Due to their widespread adoption, DOIs are a completely viable way to identify published papers and books, and I would argue that any attempt to replace DOIs with a decentralised identifier is likely to be counterproductive, as the goal should not be decentralisation per se but rather to have a reliable, universal way to refer to scholarly content and artefacts. What DOIs do for artefacts, the Open Researcher and Contributor ID (ORCID) aims to do for people acting within the framework of open science. ORCIDs seem to do their job well, but not everyone has or should have an ORCID, nor would every person who does have one voluntarily choose to pin their entire identity to it. Therefore, although I happily use them, I think ORCIDs are likely to face more of an uphill battle than the DOI, which needed buy-in only from major publishers and eprint servers to reach hegemony. Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOK/

OYOK

/OYOK/ Informal decentralised identification via web addresses A particularly simple way to identify a single person or organisation is by means of a web domain or an email address. Although not everyone has a domain name, many people have email addresses. On the other hand, people often have many domain names and their email address may change over time; and when people die, their presence on the web is often erased or lost. Therefore, although widespread, this approach may create difficulties with longevity and stability. Jon Sterling 2025 3 8 https://www.forester-notes.org/OYON/

OYON

/OYON/ General-purpose decentralised identification via DIDs When reading about the paper of Kleppmann et al. outlining Bluesky’s AT Protocol, I learned of Decentralised Identifiers (DIDs). In essence, DIDs are URIs of the form did:method:path where method identifies how the DID is intended to be resolved and path is a colon-separated path that should be resolved by means of that method. In either case, a DID is intended to be resolved to a JSON document that contains information about the resource or entity being described, as well as various methods (like public keys) for verifying the integrity of that information. The methods are somewhat open-ended, but two important methods have emerged. Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOO/
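The did:method:path shape described in the DID overview can be illustrated with a small parser. This is a sketch in Python that follows only the simplified description given there, not the full DID syntax; the names `DID` and `parse_did` are introduced here for illustration.

```python
from typing import NamedTuple, Tuple


class DID(NamedTuple):
    method: str             # how the identifier is meant to be resolved
    path: Tuple[str, ...]   # colon-separated segments handed to that method


def parse_did(uri: str) -> DID:
    """Split a URI of the form did:method:path into its parts."""
    scheme, _, rest = uri.partition(":")
    method, _, path = rest.partition(":")
    if scheme != "did" or not method or not path:
        raise ValueError("not of the form did:method:path: " + repr(uri))
    return DID(method, tuple(path.split(":")))
```

Applied to the examples from these posts, `parse_did("did:web:jonmsterling.com")` has method `web` and a one-segment path, and `did:plc:asdlkfh9q8034baliufhbcailurb` has method `plc`.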

OYOO

/OYOO/ W3C’s did:web method W3C have specified the did:web method, in which the path is intended to be a web domain. Simplifying somewhat, a DID like did:web:jonmsterling.com would be substantiated by responding to the HTTPS request https://jonmsterling.com/.well-known/did.json with a document in the appropriate format. The upside is that the owner of a web domain is their own identity authority; in this sense did:web is a truly decentralised identification scheme. The downside is that you have to have a web domain, and you also can never change it ever, the same disadvantage of informal decentralised identification which we have discussed. Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOP/
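The mapping from a did:web identifier to the HTTPS request described above can be sketched as follows. The bare-domain case is exactly the one given in the text; the handling of extra path segments and of a percent-encoded port reflects my reading of the did:web specification and should be checked against it. The function name `did_web_to_url` is hypothetical.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Compute the URL at which a did:web document is expected to be served."""
    prefix = "did:web:"
    if not did.startswith(prefix) or len(did) == len(prefix):
        raise ValueError("not a did:web identifier: " + repr(did))
    segments = did[len(prefix):].split(":")
    # Assumption from the did:web specification: a port is written with a
    # percent-encoded colon (%3A), so the host segment is decoded here.
    host = unquote(segments[0])
    path = segments[1:]
    if not path:
        # Bare domain: the document lives under /.well-known/.
        return "https://" + host + "/.well-known/did.json"
    # Assumption: further colon-separated segments become URL path segments.
    return "https://" + host + "/" + "/".join(path) + "/did.json"
```

This only computes the address; fetching the document and checking its contents are separate steps not shown here.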

OYOP

/OYOP/ Bluesky’s did:plc method For a social network like Bluesky, it is critical that users be able to migrate their identity from one domain to another. Obviously users may change or lose their domain over time, but it is important to keep in mind that the vast majority of users will never have their own domain and so they will over the course of their lives jump from one subsidiary domain that they don’t control to the next, just as Mastodon users are constantly migrating from instance to instance, driven to wander endlessly by either the petty tyranny of instance maintainers who think they know best, or by the natural quiescence of instances caused by lack of funds or time, or (in many cases) a combination of the two. It seems that there is no way to address this problem without introducing some central authority: a directory of permanent identifiers that are then resolved to documents that establish cryptographically verified bidirectional links with more ephemeral and human-readable forms of identification (such as web domains). This is essentially the design of Bluesky’s did:plc method, as explained by Kleppmann et al.: On your own domain, which serves as your (ephemeral) handle, you place a DNS TXT record or file that contains a DID like did:plc:asdlkfh9q8034baliufhbcailurb. Someone resolves this DID to a document by querying a central directory server (such as Bluesky’s own). This document contains a link back to the domain; signatures are used to ensure that every update to the document has been authorised by whoever signed it when it was first minted. Although some centralisation is required here, the use of cryptographic proof ensures that the central authority does not need to be trusted (to a certain extent). Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOQ/
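The bidirectional link at the heart of the did:plc description can be illustrated with a toy check. Everything below is a stand-in: the dictionaries replace a DNS TXT lookup and a directory query, the `"handle"` field is a made-up document layout rather than the real did:plc format, and signature verification (the part that makes the directory untrusted) is omitted entirely.

```python
from typing import Dict

# Hypothetical stand-in for DNS TXT records: domain -> DID placed there.
HANDLE_RECORDS: Dict[str, str] = {
    "example.com": "did:plc:asdlkfh9q8034baliufhbcailurb",
}

# Hypothetical stand-in for the central directory: DID -> document.
DIRECTORY: Dict[str, Dict[str, str]] = {
    "did:plc:asdlkfh9q8034baliufhbcailurb": {"handle": "example.com"},
}


def verify_handle(domain: str,
                  handle_records: Dict[str, str],
                  directory: Dict[str, Dict[str, str]]) -> bool:
    """Check that a domain and a DID document point at each other."""
    did = handle_records.get(domain)      # step 1: domain names a DID
    if did is None:
        return False
    document = directory.get(did)         # step 2: directory resolves the DID
    if document is None:
        return False
    return document.get("handle") == domain  # step 3: document links back
```

Migrating to a new domain then amounts to updating the document's back-link and placing the same DID on the new domain, after which the old domain no longer verifies.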

OYOQ

/OYOQ/ Analysis of global identity in Forester Although Forester aims to become a better citizen of the Web and integrate with emerging protocols, it is a non-negotiable design constraint that Forester still remain usable by people who don’t control a domain name, cannot run software on their web host, cannot set DNS records, and could not care less what a DID is. I also have a feeling that there will not be a single protocol that fits all use cases; what I am noticing, however, is that there are commonalities to all the protocols, and that we ought to be informed by these commonalities. For example, in every case an identity is resolved from a URI of some kind; DIDs and DOIs and ORCIDs all have canonical URIs. Therefore, it strikes me that Forester’s approach to global identity must rest on the axiom that an identity is nothing more or less than a URI; we can place no constraints whatsoever on what form this URI takes, and we should also remain flexible as to compatibility with future replacements of URIs (whether in the form of IRIs, or the URL non-“standard”, etc.). If we start from that point of view, there are some problems to address: Even if an identity is not canonically addressed by a tree in a forest, an identity still often needs to have a tree in the forest. One wants to store biographical and bibliographic information, and maybe even personal notes, etc., and at the very least it is very important to be able to browse backlinks on a biographical page even if the page itself has no content of its own. Not only must we be able to attach a tree to an identity: we must be able to attach many trees to an identity. This is a requirement of federation. Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOR/

OYOR

/OYOR/ A plan for global identity in Forester Building on my analysis, I propose that Forester allow any tree to declare that it “describes” a given global identity in the form of a URI. At a first cut this can be done via datalog (but we would probably hide this behind something): Now it remains to explain how we shall surface the fact that a given entity is described by some tree. For any identity in this relation, we should automatically create a “disambiguation page” that transcludes all the attached trees. When a hyperlink points to a URI that lies in this relation, it should be directed to the disambiguation page. There are further implications for such a feature—for instance, in the future we might automatically populate bibliographic information, etc. (But we have to be careful due to the near-universal unusably low quality of bibliographic databases keyed by DOIs, etc.) We will need to provide guidance as to how identities should be assigned to (e.g.) people who don’t control an online identity, etc. The rule of thumb should be that we always defer to the preferences of the described person, and to the version of record in the case of an artefact. When there is no canonical choice, users of Forester should do what they like, but they should be willing to update their references in the future should a canonical global entity emerge. Jon Sterling 2025 3 8 https://www.forester-notes.org/OYOS/
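To make the proposed “describes” relation concrete, here is a minimal sketch in Python rather than datalog. The function names, the tree addresses, and the example URI are all hypothetical and exist only for illustration; nothing here reflects how Forester actually stores or queries the relation.

```python
from collections import defaultdict


def build_disambiguation(describes):
    """Group trees by the identity URI they describe.

    describes: iterable of (tree_address, identity_uri) pairs, i.e. the
    "describes" relation. Returns a dict mapping each URI to the ordered
    list of trees that a disambiguation page would transclude.
    """
    pages = defaultdict(list)
    for tree, uri in describes:
        if tree not in pages[uri]:
            pages[uri].append(tree)
    return dict(pages)


def resolve_link(target, pages):
    """Decide where a hyperlink to `target` should be directed."""
    if target in pages:
        # The URI lies in the relation: send the reader to its disambiguation page.
        return ("disambiguation", target, pages[target])
    # Otherwise the link is left alone and treated as an ordinary URL.
    return ("external", target, [])


# Hypothetical data: two forests' trees describe the same identity.
DESCRIBES = [
    ("people/alice", "https://example.org/id/alice"),
    ("other-forest/alice-notes", "https://example.org/id/alice"),
]
PAGES = build_disambiguation(DESCRIBES)
```

Because the relation is many-to-one from trees to identities, a single identity can collect trees from several contributors, which is the federation requirement stated in the analysis.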

OYOS

/OYOS/ Request for comment I am hoping to hear other people’s thoughts on this proposal, including any constructive criticisms or suggestions for how we might go about implementing it. You can write to me or the mailing list with your feedback. Jon Sterling 2023 5 14 2024 4 25 2025 5 26 https://www.forester-notes.org/0052/

0052

/0052/ Build your own Stacks Project in 10 minutes The Stacks project is the most successful scientific hypertext project in history. Its goal is to lay the foundations for the theory of algebraic stacks; to facilitate its scalable and sustainable development, several important innovations have been introduced, with the tags system being the most striking. Each tag refers to a unique item (section, lemma, theorem, etc.) in order for this project to be referenceable. These tags don't change even if the item moves within the text. (Tags explained, The Stacks Project). Many working scientists, students, and hobbyists have wished to create their own tag-based hypertext knowledge base, but the combination of tools historically required to make this happen is extremely daunting. Both the Stacks project and Kerodon use a cluster of software called Gerby, but bitrot has set in and it is no longer possible to build its dependencies on a modern environment without significant difficulty, raising questions of longevity. Moreover, Gerby’s deployment involves running a database on a server (in spite of the fact that almost the entire functionality is static HTML), an architecture that is incompatible with the constraints of the everyday working scientist or student who knows at most how to upload static files to their university-provided public storage. The recent experience of the nLab’s pandemic-era hiatus and near-death experience has demonstrated with some urgency the precarity faced by any project relying heavily on volunteer system administrators. Jon Sterling 2023 5 14 https://www.forester-notes.org/0053/

0053

/0053/ Introducing Forester: a tool for scientific thought After spending two years exploring the design of tools for scientific thought that meet the unique needs of real, scalable scientific writing in hypertext, I have created a tool called Forester which has the following benefits: Forester is tag-based like Gerby, and can therefore power large-scale generational projects like Stacks and Kerodon. Forester produces static content that can be uploaded to any web hosting service without needing to run or install any serverside software. Forester is easy to install on your own machine. To prevent bitrot, Forester is a single tool rather than a composition of several tools. Forester satisfies all the requirements of serious scientific writing, including sophisticated notational macros, typesetting of diagrams, etc. Forester combines associative and hierarchical networks of evergreen notes (called “trees”) into hypertext sites called “forests”. Jon Sterling 2023 3 4 https://www.forester-notes.org/tfmt-000R/

tfmt-000R

/tfmt-000R/ Forests and trees of evergreen notes Definition A forest of evergreen notes (or a forest for short) is loosely defined to be a collection of evergreen notes in which multiple hierarchical structures are allowed to emerge and evolve over time. Concretely, one note may contextualize several other notes via transclusion within its textual structure; in the context of a forest, we refer to an individual note as a tree. Of course, a tree can be viewed as a forest that has a root node. Trees correspond roughly to what are referred to as “tags” in the Stacks Project. In this article, I will show you how to set up your own forest using the Forester software. These instructions pertain to the Forester 5.0 version. Jon Sterling 2023 8 13 https://www.forester-notes.org/006R/

006R

/006R/ Preparing to run the Forester software In this section, we will walk through the installation of the Forester software. Jon Sterling 2023 8 13 https://www.forester-notes.org/006S/

006S

/006S/ System requirements of Forester Jon Sterling 2023 8 13 https://www.forester-notes.org/006T/

006T

/006T/ A unix-based system Requirement Forester requires a unix-based system to run; it has been tested on both macOS and Linux. Windows support is desirable, but there are no concrete plans to implement it at this time. Jon Sterling 2023 8 13 https://www.forester-notes.org/006U/

006U

/006U/ A working OCaml 5 installation Requirement Forester is written in the OCaml programming language, and makes use of the latest features of OCaml 5. Most users should install Forester through OCaml's opam package manager; instructions to install opam and OCaml simultaneously can be found here. Jon Sterling 2023 8 13 https://www.forester-notes.org/006V/

006V

/006V/ A working LaTeX installation Requirement If you intend to embed LaTeX-rendered diagrams in your forest, you will need a working installation of LaTeX, such as TeX Live. If all your mathematical expressions are supported by KaTeX, this is not necessary. Jon Sterling 2023 8 14 https://www.forester-notes.org/006Y/

006Y

/006Y/ The git distributed version control system Requirement It is best practice to maintain your forest inside of distributed version control. This serves not only as a way to prevent data loss (because you will be pushing frequently to a remote repository); it also allows you to easily roll back to an earlier version of your forest, or to create “branches” in which you prepare trees that are not yet ready to be integrated into the forest. The recommended distributed version control system is git, which comes preinstalled on many unix-based systems and is easy to install otherwise. Git is not the most user-friendly piece of software, unfortunately, but it is ubiquitous. It is possible (but not recommended) to use Forester without version control, but note that the simplest way to initialize your own forest involves cloning a git repository. Jon Sterling 2023 8 13 https://www.forester-notes.org/006W/

006W

/006W/ Installing the Forester software Once you have met the system requirements, installing Forester requires only a single shell command: To verify that Forester is installed, please run forester --version in your shell. Jon Sterling 2023 8 14 2024 4 25 2024 6 17 https://www.forester-notes.org/006X/

006X

/006X/ Setting up your forest from the template Now that you have installed the Forester software, it is time to set up your forest. Forester provides a simple command to initialise a fresh forest within a folder. We’ll call our folder forest, but you can call it anything you want. Now that we are inside our new directory, we can instruct Forester to initialise a forest. forester init This command initialises a git repository with the skeleton of a forest, which contains a configuration file named forest.toml; this file specifies the locations of your trees, assets, etc. There is also a git submodule bound to the theme/ directory (pointing to the base theme repository) that contains the stylesheets that web browsers will need in order to render your forest as HTML. Jon Sterling 2023 8 14 2024 6 17 https://www.forester-notes.org/0073/

0073

/0073/ Tree addresses in a forest A tree in Forester is usually associated to an address of the form NNNN, where NNNN is a four-digit base-36 number. The purpose of the base-36 code is to uniquely identify a tree within your forest in such a way that you are not tempted to rename it, as you might be when titles or dates are embedded into filenames. A tree with address NNNN is stored in a file named NNNN.tree (unless it is emitted from inside another tree by means of the inline subtrees feature). Note that the format of tree addresses is purely a matter of convention, and is not forced by the Forester tool. Users are free to use their own naming convention for tree addresses, and in some cases alternative (human-readable) formats may be desirable: this includes trees representing bibliographic references, as well as biographical trees. If you don’t like numerical tree addresses, nobody is forcing you to use them. The use of numerical addresses is suitable for projects like The Stacks Project, but it may not be appropriate for your own use-case. Addresses in Forester are not hierarchical: all resources are rendered at the root of the forest, no matter where their source files are kept. This limitation is intentional, but we might revisit it in the future. Jon Sterling 2023 8 15 2024 4 25 https://www.forester-notes.org/007D/
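For readers unfamiliar with base-36, the following sketch shows how four-digit addresses such as 006R or 007D can be decoded, encoded, and incremented. The helper names are hypothetical, and the choice to start numbering at 0000 is an assumption of this sketch; it does not describe how Forester itself allocates addresses.

```python
import string

# Digits 0-9 followed by A-Z give the 36 symbols of base-36.
DIGITS = string.digits + string.ascii_uppercase


def decode(address):
    """Interpret a base-36 address such as '007D' as an integer."""
    return int(address, 36)


def encode(number, width=4):
    """Render an integer as a zero-padded base-36 address of the given width."""
    if number < 0 or number >= 36 ** width:
        raise ValueError("number does not fit in the requested width")
    digits = []
    for _ in range(width):
        number, remainder = divmod(number, 36)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def next_address(existing):
    """Return the address following the highest one in `existing`.

    Assumption of this sketch: an empty forest starts at 0000.
    """
    highest = max((decode(a) for a in existing), default=-1)
    return encode(highest + 1)
```

Four base-36 digits give 36^4 = 1,679,616 distinct addresses, which is why such short names suffice even for large forests.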

007D

/007D/ Building and viewing your forest for the first time To build your forest, you can run the following command of Forester's executable in your shell: forester build forest.toml The --dev flag is optional, and when activated supplies metadata to the generated website to support an “edit button” on each tree; this flag is meant to be used when developing your forest locally, and should not be used when building the forest to be uploaded to your public web host. Jon Sterling 2023 8 15 https://www.forester-notes.org/007G/

007G

/007G/ Forester renders each tree to an XML document Forester renders your forest to some XML files in the output/ directory; XML is, like HTML, a format for structured documents that can be displayed by web browsers. The forest template comes equipped with a built-in XSLT stylesheet (theme/default.xsl) which is used to instruct web browsers how to render your forest into a pleasing and readable format. Jon Sterling 2023 8 15 https://www.forester-notes.org/007I/

007I

/007I/ Serving and viewing your forest from a local web server To view your forest while editing it, you must serve it from a local web server. To do this, first ensure that you have Python 3 correctly installed. Then run the following command from the root directory of your forest: python3 -m http.server 1313 -d output (You could replace 1313 with whatever port you prefer.) While this command is running, you will be able to access your forest by navigating to localhost:1313 in your preferred web browser. In the future, Forester may be able to run its own local server to avoid the dependency on external tools like Python. Jon Sterling 2023 8 15 https://www.forester-notes.org/007K/

007K

/007K/ Creating your personal biographical tree The first tree that you should create is a biographical tree to represent your own identity; ultimately you will link to this tree when you set the authors of other trees that you create later on. It is convenient to simply use a person’s full name to address a biographical tree. My own biographical tree is located at trees/people/jonmsterling.tree and contains the following source code: Let’s break this code down to understand what it does. The \title{...} declaration sets the title of the tree to my name. The \taxon{person} declaration informs Forester that the tree is biographical. Not every tree needs to have a taxon; common taxa include Person, Theorem, Definition, Lemma, etc. You are free to use whatever you want, but some taxa are treated specially by Forester. The subsequent declarations attach additional information to the tree that can be used during rendering. These declarations are optional, and you are free to put whatever metadata you want. Like in HTML, paragraphs must be wrapped explicitly, here in \p{...}. Do not hard-wrap your text, as this can have visible impact on how trees are rendered; it is recommended that you use a text editor with good support for soft-wrapping, like Visual Studio Code. You can see that the concrete syntax of Forester's trees looks superficially like a combination of LaTeX and Markdown; Markdown-style links are used both for links to other trees and for links to external URLs. Forester's concrete syntax is not fully documented, but it is less ambiguous than both LaTeX and Markdown. Jon Sterling 2023 8 15 2024 4 25 https://www.forester-notes.org/007H/

007H

/007H/ Creating a new tree using forester new Creating a new tree in your forest is as simple as adding a .tree file to the trees folder. Because it is hard to manually choose the next incremental tree address, Forester provides a command to do this automatically: forester new forest.toml --dest=trees In return, Forester should output the location of the new tree, e.g. trees/0002.tree. If we look at the contents of this new file, we will see that it is empty except for metadata assigning a date to the tree: You may prefer to use randomised addresses over sequential addresses; this can be particularly useful if multiple people are contributing to a forest. In that case, pass the --random option to forester new. Most trees should have a \date annotation; this date is meant to be the date of the tree's creation, and you can have more than one date if you like to keep track of when a tree has been updated. You should proceed by adding further metadata: the title and the author; for the latter, you will use the address of your personal biographical tree. Tree titles should be given in lower case (except for proper names, etc.); these titles will be rendered by Forester in sentence case. A tree can have as many \author declarations as it has authors; these will be rendered in their order of appearance. Now you can begin to populate the tree with its content, written in the Forester markup language. Think carefully about keeping each tree relatively independent and atomic. Jon Sterling 2023 8 15 https://www.forester-notes.org/007L/

007L

/007L/ Bottom-up hierarchy via transclusion You may be used to writing LaTeX documents, where you work from the top down: you create some section headings, put some text under those headings, make some deeper section headings, put more text, etc. Forests work in the opposite way, from the bottom up: you start by writing independent, atomic notes/trees and then only later start to (sparingly) assemble these into a hierarchy in order to reify the emerging structure. Forester’s bottom-up approach to section hierarchy works via something called transclusion. The idea is that at any time, you can include (“transclude”) the full contents of another tree into the current tree as a subsection by adding the following code: This is kind of like LaTeX’s \input command, but much better behaved: for instance, section levels are computed on the fly depending on the position in the hierarchy. This entire tutorial is cobbled together by transcluding many smaller trees, each with their own independent existence. For example, the following two sections are transcluded from an entirely different part of my forest: Jon Sterling 2022 12 27 https://www.forester-notes.org/tfmt-0009/

tfmt-0009

/tfmt-0009/ The best structure to impose is relatively flat It is easy to make the mistake of prematurely imposing a complex hierarchical structure on a network of notes, which leads to excessive refactoring. Hierarchy should be used sparingly, and its strength is for the large-scale organization of ideas. The best structure to impose on a network of many small related ideas is a relatively flat one. I believe that this is one of the mistakes made in the writing of the foundations of relative category theory, whose hierarchical nesting was too complex and quite beholden to my experience with pre-hypertext media. One of the immediate impacts and strengths of Forester’s transclusion model is that a given tree has no canonical “geographic” location in the forest. One tree can appear as a child of many other trees, which allows the same content to be incorporated into different textual and intellectual narratives. Jon Sterling 2022 12 26 https://www.forester-notes.org/tfmt-0006/

tfmt-0006

/tfmt-0006/ Hierarchical structure as non-unique narrative Multiple hierarchical structures can be imposed on the same associative network of nodes; a hierarchical structure amounts to a “narrative” that contextualizes a given subgraph of the network. One example could be the construction of lecture notes; another example could be a homework sheet; a further example could be a book chapter or scientific article. Although these may draw from the same body of definitions, theorems, examples, and exercises, these objects are contextualized within a different narrative, often toward fundamentally different ends. As a result, any interface for navigating the neighbor-relation in hierarchically organized notes would need to take account of the multiplicity of parent nodes. Most hypertext tools assume that the position of a node in the hierarchy is unique, and therefore have a single “next/previous” navigation interface; we must investigate the design of interfaces that surface all parent/neighbor relations. Jon Sterling 2023 8 16 https://www.forester-notes.org/007N/
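The multiplicity of parents can be made concrete with a small sketch. It assumes, purely for illustration, that each narrative is recorded as an ordered list of child notes; the function name and the note names are hypothetical. Instead of a single “next/previous” pair, a node gets one such pair per narrative in which it appears.

```python
def neighbors(node, outlines):
    """Return the (previous, next) neighbors of `node` in every narrative.

    outlines: dict mapping a parent note to the ordered list of its children.
    The result maps each parent that contains `node` to a pair of neighbors,
    using None where the node is first or last in that narrative.
    """
    result = {}
    for parent, children in outlines.items():
        if node in children:
            i = children.index(node)
            previous = children[i - 1] if i > 0 else None
            following = children[i + 1] if i + 1 < len(children) else None
            result[parent] = (previous, following)
    return result


# Hypothetical forest: one theorem appears in two different narratives.
OUTLINES = {
    "lecture-notes": ["def-a", "thm-b", "ex-c"],
    "homework": ["thm-b", "ex-d"],
}
```

Here thm-b has different neighbors depending on whether the reader arrived through the lecture notes or the homework sheet, which is exactly the information a navigation interface would need to surface.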

007N

/007N/ The Forester markup language A tree in Forester is a single file written in a markup language designed specifically for scientific writing with bottom-up hierarchy via transclusion. A tree has two components: the frontmatter and the mainmatter. Jon Sterling 2023 8 16 https://www.forester-notes.org/007P/

007P

/007P/ Forester markup: frontmatter The frontmatter of a Forester tree is a sequence of declarations that we summarize below.

Declaration: Meaning
\title{...}: sets the title of the tree; can contain mainmatter markup
\author{name}: sets the author of the tree to be the biographical tree at address name
\date{YYYY-MM-DD}: sets the creation date of the tree; full ISO 8601 date-times are supported
\taxon{taxon}: sets the taxon of the tree; example taxa include lemma, theorem, person, reference; the latter two taxa are treated specially by Forester for tracking biographical and bibliographical trees respectively
\def\ident[x][y]{body}: defines and exports from the current tree a function named \ident with two arguments; subsequently, the expression \ident{u}{v} would expand to body with the values of u,v substituted for \x,\y
\import{address}: brings the functions exported by the tree address into scope
\export{address}: brings the functions exported by the tree address into scope, and exports them from the current tree

Jon Sterling 2023 8 16 https://www.forester-notes.org/007O/

007O

/007O/ Forester markup: mainmatter Below we summarize the concrete syntax of the mainmatter in a Forester tree.

Function: Meaning
\p{...}: creates a paragraph containing ...; unlike Markdown, it is mandatory to annotate paragraphs explicitly
\em{...}: typesets the content in italics
\strong{...}: typesets the content in boldface
\ol{...}: creates an ordered list
\ul{...}: creates an unordered list
\li{...}: creates a list item
#{...}: typesets the content in (inline) math mode using KaTeX; note that math mode is idempotent in Forester
##{...}: typesets the content in (display) math mode using KaTeX
\transclude{address}: transcludes the tree at address address as a subsection
[title](address): formats the text title as a hyperlink to address address; if address is the address of a tree, the link will point to that tree, and otherwise it is treated as a URL
\let\ident[x][y]{body}: defines a local function named \ident with two arguments; subsequently, the expression \ident{u}{v} would expand to body with the values of u,v substituted for \x,\y
\code{...}: typesets the content in monospace
\tex{preamble}{body}: typesets the body externally using LaTeX, with preamble as preamble code (e.g. to set up tikz packages, etc.). It can be useful to wrap this in your own macro in order to insert your preamble code automatically.

Jon Sterling 2023 8 16 https://www.forester-notes.org/007Q/

007Q

/007Q/ A complete worked example tree in Forester Example An example of a complete tree in the Forester markup language can be seen below. The code above results in the following tree: Jon Sterling 2023 2 11 https://www.forester-notes.org/001H/

001H

/001H/ Creation of (co)limits Definition Let $U : A \to B$ be a functor and let $I$ be a category. The functor $U$ is said to create (co)limits of $I$-figures when, for any diagram $D : I \to A$ such that $U \circ D$ has a (co)limit, $D$ has a (co)limit that is both preserved and reflected by $U$. Jon Sterling 2023 8 16 https://www.forester-notes.org/007R/

007R

/007R/ Deploying your forest to a web host Now that you have created your forest and added a few trees of your own, it is time to upload it to your web host. Many users of Forester will have university-supplied static web hosting, and others may prefer to use GitHub Pages; deploying a forest works the same way in either case. First, make sure your forest is built using the earlier instructions. Then take the entire contents of your output directory and upload them to your preferred web host. Jon Sterling 2023 8 16 https://www.forester-notes.org/007S/

007S

/007S/ Let a hundred forests bloom! I am eager to see the new forests that people create using Forester. I am happy to offer personal assistance via the mailing list. Many aspects of Forester are in flux and not fully documented; it will often be instructive to consult the source of existing forests, such as this one. Have fun, and be sure to send me links to your forests when you have made them! Backlinks Related Jon Sterling 2023 3 5 https://www.forester-notes.org/tfmt-000W/

tfmt-000W

/tfmt-000W/ Evergreen notes in the sciences Jon Sterling 2022 12 26 https://www.forester-notes.org/tfmt-0003/

tfmt-0003

/tfmt-0003/ Evergreen notes Definition The phrase evergreen note is due to Andy Matuschak, who has written extensively about it in his public Zettelkasten. Evergreen notes are permanent notes that evolve and accumulate over time, cutting across different projects. Jon Sterling 2022 12 27 https://www.forester-notes.org/tfmt-0007/

tfmt-0007

/tfmt-0007/ Atomicity of scientific notes One of the design principles for evergreen notes described by Matuschak is atomicity (Evergreen notes should be atomic): a note should capture just one thing, and if possible, all of that thing. A related point is that it should be possible to understand a note by (1) reading it, and (2) traversing the notes that it links to and recursively understanding those notes. Traditional mathematical writing does not achieve this kind of atomicity: understanding the meaning of a particular node (e.g. a theorem or definition) usually requires understanding everything that came (textually) before it. In the context of the hierarchical organization of evergreen notes, this would translate to needing to go upward in the hierarchy in order to understand the meaning of a leaf node. I regard this property of traditional notes as a defect: we should prefer explicit context over implicit context. High-quality scientific notes should make sense with minimal context; hierarchical context is imposed in order to tell a story, but consumers of scientific notes should not be forced into a particular narrative. Indeed, as many different hierarchical structures can be imposed, many different narratives can be explored. My first exploration of hypertext science was the lecture notes on relative category theory; in hindsight, these lecture notes are very much traditional lecture notes, not written with the atomicity principle in mind. As a result, it is often difficult to understand a given node without ascending upward in the hierarchy. Jon Sterling 2022 12 27 https://www.forester-notes.org/tfmt-0008/

tfmt-0008

/tfmt-0008/ Achieving atomicity Atomicity in evergreen notes is enhanced by adhering to the following principles: no free variables: do not rely on one-off objects that are defined incidentally upwards in the hierarchy; turn them into atomic nodes that can be linked; favor explicit dependency: whenever using a terminology or construction that has been defined elsewhere, link it; notation should be decodable: all notations (except the most basic) should be recalled via a link. It can be a bit excessive to link every word, but the pertinent links could be added to a “related pages” section. Jon Sterling 2022 12 26 https://www.forester-notes.org/tfmt-0005/

tfmt-0005

/tfmt-0005/ Hierarchy in evergreen notes Matuschak describes a number of organizing principles for evergreen notes, which are quite compelling; one design principle (Prefer associative ontologies to hierarchical taxonomies) deserves additional discussion in the context of mathematical thought. In particular, the problem of circular reference must be grappled with immediately rather than incidentally: in ordinary knowledge management, circularity represents the completion of a train of thought, whereas in mathematical thinking it remains very important to distinguish assumptions from consequences. Hence a purely associative organization of mathematical knowledge is not viable (although it often happens by accident), and so the hierarchical organization of mathematics must be taken seriously from the start. We find that Matuschak has in fact already grappled with the need for hierarchy in his note It’s hard to navigate to unlinked “neighbors” in associative note systems, where he discusses the difficulty of traversing the “neighbor”-relationship between notes that are related by another note’s context, but are not related on their own. Matuschak proposes to solve the problem by grafting hierarchy onto the associative ontology after the fact through “outline notes”: “Outline notes” can create pseudo-hierarchies with order and structure by linking to many child notes. Then we need the UI to support navigating between neighbors “through” these outline notes. The viewpoint of outline hierarchy as a structure imposed on the existing associative ontology is a convenient organizing principle for evergreen notes in the sense of Matuschak, but it is a necessary principle for the design of tools for scientific thought. Jon Sterling 2022 12 26 https://www.forester-notes.org/tfmt-0006/

tfmt-0006

tfmt-0009

/tfmt-0009/ The best structure to impose is relatively flat It is easy to make the mistake of prematurely imposing a complex hierarchical structure on a network of notes, which leads to excessive refactoring. Hierarchy should be used sparingly, and its strength is for the large-scale organization of ideas. The best structure to impose on a network of many small related ideas is a relatively flat one. I believe that this is one of the mistakes made in the writing of the foundations of relative category theory, whose hierarchical nesting was too complex and quite beholden to my experience with pre-hypertext media. There are many ways to model hierarchy, but there are two salient orthogonal distinctions in the different designs. Jon Sterling 2022 12 29 https://www.forester-notes.org/tfmt-000B/

tfmt-000B

/tfmt-000B/ Absolute vs. relative hierarchy in document markup languages Both HTML and LaTeX support a form of hierarchical organization with “absolute” heading levels, i.e. levels that count upward from a single root. In HTML, there is <h1>, <h2>, <h3>..., and in LaTeX there is \part, \chapter, \section, \subsection, \paragraph, ..., depending on the document class. This is in contrast to a relative model of hierarchy, in which there is a single command to introduce a section heading at the “current” level, and there are other commands to introduce hierarchical nesting. The absolute sectioning model is completely inadequate for the hierarchical organization of ideas, for the simple reason that it is the context of a node that determines what its level in the hierarchy is, not the node itself. When this is mixed up, it makes re-contextualization an extremely painful and time-consuming process: you must recursively increment or decrement all section levels that occur underneath a given node, as anyone who has used LaTeX for any significant writing project can attest. In traditional texts, re-contextualization occurs when you want to move a section from one place in the hierarchy to another; in the more fluid media I am pursuing, there may be many orthogonal hierarchical structures imposed on the network, so re-contextualization ceases to be a refactoring task and is elevated as a basic unit of scientific activity. In either case, we are drawn to prefer relative hierarchy over absolute hierarchy. See existing implementations of this idea. This is similar to the relationship between De Bruijn levels (global levels) and De Bruijn indices (local levels) in type theory: conventional section headings are like De Bruijn levels in that they count from the root node, whereas what we would want are section headings that count from the present node. Jon Sterling 2022 12 29 https://www.forester-notes.org/tfmt-000D/
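A short sketch may help to show why relative hierarchy makes re-contextualization free. The Node type and render function below are hypothetical illustrations, not part of any existing tool: a node never records its own level, and the heading level is computed from the position at which the node happens to be rendered. Titles are emitted without HTML escaping to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A section that knows its title and children, but not its level."""
    title: str
    children: list = field(default_factory=list)


def render(node, level=1):
    """Render `node` as HTML headings, deriving each level from the context.

    HTML only has six heading levels, so deeper nesting is clamped to h6.
    """
    tag = f"h{min(level, 6)}"
    lines = [f"<{tag}>{node.title}</{tag}>"]
    for child in node.children:
        lines.extend(render(child, level + 1))
    return lines


# The same subtree rendered on its own and inside a larger narrative.
bar = Node("Bar", [Node("Baz")])
standalone = render(bar)
nested = render(Node("Foo", [bar]))
```

Placing bar under a new parent shifts every heading beneath it by one level without touching bar itself, whereas under absolute sectioning each heading would have to be rewritten by hand.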

tfmt-000D

/tfmt-000D/ Implicit vs. explicit hierarchy in document markup languages Many document markup languages, including LaTeX and HTML, use sectioning commands that evince an implicit hierarchical structure: for instance, consider the following HTML code:

<h1>Foo</h1>
<h2>Bar</h2>
<h3>Baz</h3>
<h3>Qux</h3>
<h2>Boo</h2>

The above corresponds to the tree Foo > [Bar > [Baz, Qux], Boo]. On the other hand, it is also possible to consider a model in which the hierarchy is made explicit through the syntactical tree structure of the markup language. This style is also supported (but not forced) in HTML:

<section>
  <h1>Foo</h1>
  <section>
    <h2>Bar</h2>
    <section><h3>Baz</h3></section>
    <section><h3>Qux</h3></section>
  </section>
  <section><h2>Boo</h2></section>
</section>

We greatly prefer the combination of (relative, explicit) hierarchy. Jon Sterling 2022 12 29 https://www.forester-notes.org/tfmt-000C/

tfmt-000C

/tfmt-000C/ Relative hierarchy in existing tools There are a few LaTeX packages that implement relative hierarchy for sectioning as an alternative to the backward model of absolute hierarchy. The coseoul package implements relative sectioning commands; similar to the existing sectioning commands, an implicit hierarchy model is employed, leading to an imperative feel with commands like \levelup{...} and \leveldown{...}. The modular package builds on coseoul to behave better under the transclusion of LaTeX documents, introducing a command that is to be used instead of LaTeX’s usual file-inclusion commands. On the other hand, the dieudonne package implements a form of relative sectioning with an explicit hierarchical model, i.e. one in which the syntactical nesting of LaTeX environments induces the hierarchy. There are some attempts to impose a (relative, explicit) hierarchical model in HTML by using <section> and only the <h1> heading command. In the HTML5 spec, this behavior was initially endorsed as part of the “outline” algorithm, but unfortunately almost no vendors of browsers or assistive technology have correctly implemented this behavior. Jon Sterling 2023 3 4 https://www.forester-notes.org/tfmt-000V/

tfmt-000V

/tfmt-000V/ Forests of evergreen notes Jon Sterling 2023 3 4 https://www.forester-notes.org/tfmt-000R/

tfmt-000R

/tfmt-000R/ Forests and trees of evergreen notes Definition A forest of evergreen notes (or a forest for short) is loosely defined to be a collection of evergreen notes in which multiple hierarchical structures are allowed to emerge and evolve over time. Concretely, one note may contextualize several other notes via transclusion within its textual structure; in the context of a forest, we refer to an individual note as a tree. Of course, a tree can be viewed as a forest that has a root node. Jon Sterling 2023 3 4 https://www.forester-notes.org/tfmt-000U/

tfmt-000U

/tfmt-000U/ The extent of a tree in a forest Definition The extent of a tree T within a forest is the smallest set of trees closed under the following rules: (1) T lies within the extent of T. (2) If U is transcluded by T, then any tree in the extent of U lies also in the extent of T. Jon Sterling 2023 3 4 https://www.forester-notes.org/tfmt-000Q/
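The extent of a tree is a reflexive, transitive closure over the transclusion relation, so it can be computed by a simple graph traversal. The following is a minimal sketch, assuming the forest is given as a dictionary from tree identifiers to the identifiers they transclude; the function name `extent` and the sample identifiers are hypothetical and are not part of Forester.

```python
from collections import deque


def extent(tree, transcludes):
    """Return the extent of `tree`: the tree itself together with every
    tree reachable from it by following transclusions."""
    seen = {tree}            # rule 1: a tree lies within its own extent
    queue = deque([tree])
    while queue:
        current = queue.popleft()
        # rule 2: the extent of anything transcluded is included as well
        for child in transcludes.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical forest: "a" transcludes "b" and "c"; "c" transcludes "d".
forest = {"a": ["b", "c"], "c": ["d"], "e": ["a"]}
print(sorted(extent("a", forest)))
```

The `seen` set also keeps the traversal finite if two trees happen to transclude each other.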

tfmt-000Q

/tfmt-000Q/ Authorship and responsibility in forests A forest of evergreen notes may in general contain the work of many authors who have contributed in different ways. However, the correct attribution of authorship to a given tree is more subtle than one might at first think. To understand this subtlety, we first consider that each individual tree may contain both textual content and transcluded subtrees. Thinking inductively, a simple model of tree authorship would be to take the union of the authors of the immediate textual content and the authors of all trees within its extent. This model is incorrect, however, as authorship is usually taken to imply responsibility and endorsement, as can be seen by way of example from the ACM Policy on Authorship, Peer Review, Readership, and Conference Publication: They agree to be held accountable for any issues relating to the correctness or integrity of the work with the understanding that depending on the circumstances not all authors are necessarily held “equally” accountable. In the case of publications-related misconduct, it may be the case that penalties may vary for co-authors listed on a single publication. In particular, although one person may be aware of and responsible for the content of a given tree, it would be unreasonable to require them to be responsible for any subsequent (and potentially erroneous!) re-contextualization of that tree in the forest. For this reason, authors must be distinguished from contributors in forests of evergreen notes. Jon Sterling 2023 3 4 https://www.forester-notes.org/tfmt-000S/

tfmt-000S

/tfmt-000S/ Author of a tree Definition An author of a tree within a forest is someone who satisfies the following conditions: They contributed intellectually to the immediate textual content of the tree, i.e. the non-transcluded content. They can be held responsible for all the content within the tree, i.e. both the immediate textual content as well as the content of all the subtrees. Jon Sterling 2023 3 4 https://www.forester-notes.org/tfmt-000T/

tfmt-000T

/tfmt-000T/ Contributor to a tree Definition A direct contributor to a tree within a forest is either an author of the tree, or someone who has contributed intellectually to the immediate content of the tree but cannot be held responsible for it. A contributor to a tree is someone who is a direct contributor to at least one tree lying within its extent. Jon Sterling 2022 12 26 https://www.forester-notes.org/tfmt-0003/
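The distinction between authors and contributors can be made concrete with a small sketch. It assumes that each tree records its authors and its other direct contributors explicitly, and that the transclusion relation is given as a dictionary; all names and data below are hypothetical.

```python
def extent(tree, transcludes):
    """The tree itself plus everything reachable through transclusion."""
    seen, stack = {tree}, [tree]
    while stack:
        for child in transcludes.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def direct_contributors(tree, authors, helpers):
    """Authors of the tree, plus people who contributed to its immediate
    content without taking responsibility for it."""
    return authors.get(tree, set()) | helpers.get(tree, set())


def contributors(tree, transcludes, authors, helpers):
    """Everyone who is a direct contributor to some tree in the extent."""
    result = set()
    for member in extent(tree, transcludes):
        result |= direct_contributors(member, authors, helpers)
    return result


# Hypothetical forest: "survey" transcludes "lemma".
transcludes = {"survey": ["lemma"]}
authors = {"survey": {"alice"}, "lemma": {"bob"}}
helpers = {"lemma": {"carol"}}

print(sorted(authors["survey"]))
print(sorted(contributors("survey", transcludes, authors, helpers)))
```

Authorship is read off the tree itself and never propagates through transclusion, whereas contributorship is accumulated over the whole extent: here bob and carol contribute to the survey without being its authors.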

tfmt-0003

tfmt-0001

/tfmt-0001/ Designing tools for scientific thought This document records what I have learned about the design of “tools for scientific thought” over the past couple years, with an emphasis on the mathematical sciences. One of my goals in writing this is to set out both the unique requirements of an information data model that is needed to record and facilitate scientific thought, as well as the technical requirements for tools that can be used for mathematics. Jon Sterling 2022 12 26 https://www.forester-notes.org/tfmt-0002/

tfmt-0002

/tfmt-0002/ Tool for scientific thought Definition A “tool for scientific thought” could be many things, but it must be a tool for the development and interlinking of scientific ideas in a way that facilitates authoring, publishing, teaching, learning, and the maintenance of evergreen notes. A tool for scientific thought could be a piece of software, or it could be an organizing principle for physical notes on paper. In these notes, we will primarily explore the design of computerized tools for scientific thought. Jon Sterling 2022 12 26 https://www.forester-notes.org/tfmt-0004/

tfmt-0004

/tfmt-0004/ Existing tools for scientific thought The existing tools for scientific thought can be divided into two main categories: interactive proof assistants and textual authoring and publishing tools (including LaTeX, as well as the Gerby software that runs the Stacks Project). Jon Sterling 2023 3 5 https://www.forester-notes.org/tfmt-000W/

tfmt-000W

/tfmt-000W/ Evergreen notes in the sciences Jon Sterling 2022 12 26 https://www.forester-notes.org/tfmt-0003/

tfmt-0003

tfmt-0007

tfmt-0008

tfmt-0005

tfmt-0006

tfmt-0009

/tfmt-0009/ The best structure to impose is relatively flat It is easy to make the mistake of prematurely imposing a complex hierarchical structure on a network of notes, which leads to excessive refactoring. Hierarchy should be used sparingly, and its strength is for the large-scale organization of ideas. The best structure to impose on a network of many small related ideas is a relatively flat one. I believe that this is one of the mistakes made in the writing of the foundations of relative category theory, whose hierarchical nesting was too complex and quite beholden to my experience with pre-hypertext media. There are many ways to model hierarchy, but there are two salient orthogonal distinctions in the different designs. Jon Sterling 2022 12 29 https://www.forester-notes.org/tfmt-000B/

tfmt-000B

tfmt-000D


tfmt-000U

tfmt-000Q

tfmt-000S

tfmt-000T

tfmt-000E

/tfmt-000E/ Requirements for typesetting mathematics Many non-LaTeX hypertext tools boast some compatibility with mathematical typesetting: in any HTML-based tool it is possible to use MathML or, for better cross-browser support and easier authoring, import KaTeX or MathJax. For instance, Logseq, Obsidian, and Notion all support rendering of LaTeX math code using either KaTeX or MathJax. Unfortunately, the “support” provided is so limited that it is not usable for a working mathematician, so it is somewhat puzzling why the support is present in the first place. Here we will discuss some fundamental requirements for any tool that aims to support mathematical notes, without which it is not applicable for use by professionals. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000F/

tfmt-000F

/tfmt-000F/ Notational macros in mathematical authoring Mathematical writing tends to involve a variety of notations which (1) can be difficult to typeset by hand, and (2) will likely change over time. The difficulty of hand-typesetting is somewhat less important than the propensity of notation to change over time: when we change notations within a given mathematical work, we must update every occurrence of the notation; but when the representation of the notation is unstructured, it is not in fact possible for a tool (e.g. find-and-replace) to detect every instance that needs to be updated. Therefore, it is mandatory that the representation of mathematical notations be structured. LaTeX allows authors to structure their notations very simply using macros, which can be introduced using commands such as \newcommand or \def. It is trivial to update all occurrences of a notation by simply changing the definition of the corresponding macro. Unfortunately, most tools that purport to support the inclusion of mathematical expressions do not have adequate support for macros. Both KaTeX and MathJax have excellent support for configuring macros, but these configuration options are not available in most of the tools that build on KaTeX and MathJax: for instance, Logseq and Obsidian and Notion all support embedding mathematics, but they do not support configuring macros. In fact there is a community plugin for Obsidian that adds this functionality, but it only supports imposing a global macro library on the entire “vault”, which is inadequate. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000H/
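To illustrate what structured notation buys, here is a small LaTeX sketch. The macro name \Hom and both of its renderings are hypothetical and chosen only for the example.

```latex
% Hypothetical notational macro: hom-sets written as Hom(A, B).
\newcommand{\Hom}[2]{\mathrm{Hom}(#1, #2)}

% Every use goes through the macro rather than the raw notation.
A morphism $f \in \Hom{A}{B}$ composes with $g \in \Hom{B}{C}$.

% Switching the whole document to bracket notation later means editing
% one line only:
%   \renewcommand{\Hom}[2]{[#1, #2]}
```

Had the notation been typed out by hand as \mathrm{Hom}(A, B), a find-and-replace could not reliably tell these hom-sets apart from any other parenthesized pair.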

tfmt-000H

/tfmt-000H/ Notational macros are local, not global In LaTeX, macros are organized into packages that are then globally imported into a single document. Because a LaTeX document comprises just one project and thus any transclusions (via \input or \include) are of components local to that one project, this model is adequate, although experienced users of LaTeX are nonetheless all too aware of the difficulties of namespacing macro commands when interacting with multiple packages or document classes. The requirements for a tool that aims to bring together multiple projects over a very long period of time are somewhat different: many distinct packages of notation will be used across the body of work, and it is not possible to fix a single global notation package. Indeed, it is not reasonable to expect that all notes within a person’s mathematical life shall share the same notations, and in many cases, it would moreover be necessary for the names of the macros associated to these notations to clash. This can happen because two projects are orthogonal, or it can happen as the author’s tastes change over time; it is not reasonable for such a tool to force enormous and weighty refactorings (touching thousands or tens of thousands of notes) every time the author’s taste changes. The need for arduous refactorings of this kind is one of the main ways that large mathematical projects tend to collapse under their own weight. It follows that any tool for thought whose support for mathematical notations involves a globally-defined macro package is inadequate for mathematical uses. On the other hand, it is also not reasonable to require the author to define all their macros in each note: notes tend to be small, and there will always be large clusters of notes that share the same notations. For such clusters, the small refactoring tasks involved when notations change are a positive feature rather than a negative one, as one of the goals of a cluster is to accumulate cohesion. 
Therefore, the precise requirement for macro library support is as follows: The author must be able to define (in their own files) multiple notational macro libraries. A given note must be able to specify which macro libraries (if any) it employs. Finally, careful attention must be paid to the interaction between the above requirements and the transclusion of mathematical notes: a transcluded note must be rendered with respect to its own macro library, and not that of the parent note. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000G/
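These requirements can be sketched in a few lines. The example assumes a toy model in which macros take no arguments and a note lists the libraries it uses by name; the `Note` class, the library names, and the `render` function are hypothetical and do not describe how Forester or any other tool actually works.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Note:
    body: str
    libraries: list = field(default_factory=list)      # names of macro libraries
    transclusions: list = field(default_factory=list)  # ids of transcluded notes


def expand(body, macros):
    """Replace each control word \\name by its definition, if it has one."""
    return re.sub(r"\\([A-Za-z]+)",
                  lambda m: macros.get(m.group(1), m.group(0)),
                  body)


def render(note_id, notes, libraries):
    """Render a note with its own libraries, then its transcluded notes
    with theirs; the parent's macros are never passed down."""
    note = notes[note_id]
    macros = {}
    for name in note.libraries:
        macros.update(libraries[name])
    parts = [expand(note.body, macros)]
    parts += [render(child, notes, libraries) for child in note.transclusions]
    return "\n".join(parts)


# Two libraries that deliberately clash on the macro name \C.
libraries = {
    "categories": {"C": r"\mathcal{C}"},
    "analysis": {"C": r"\mathbb{C}"},
}
notes = {
    "parent": Note(r"Let $\C$ be a category.", ["categories"], ["child"]),
    "child": Note(r"Let $z \in \C$.", ["analysis"]),
}
print(render("parent", notes, libraries))
```

Because each call to `render` builds its macro table from the note being rendered, the clash on \C is harmless: the transcluded note keeps its own meaning wherever it appears.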

tfmt-000G

/tfmt-000G/ Mathematical diagrams and macro support A basic requirement of tools for scientific thought is to support the rendering of mathematical diagrams. What kinds of diagrams are needed depends, of course, on the problem domain: for my own work, the main diagram-forms needed are commutative diagrams and string diagrams. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000P/

tfmt-000P

/tfmt-000P/ Mathematical expressions and diagrams are tightly coupled Although diagramming may seem to non-mathematicians to be somewhat orthogonal to notational macro support, in reality any solution to the diagramming problem must be tightly and natively integrated with the rendering of mathematical expressions, simply because most diagrams involve mathematical expressions, and these invariably involve notational macros. The reason PGF/TikZ has been so successful is that it respects this tight coupling. The situation for hypertext mathematical tools is somewhat less advanced than that of LaTeX and PGF/TikZ, but there are several options which we discuss below. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000J/

tfmt-000J

/tfmt-000J/ Commutative diagrams in KaTeX KaTeX has very rudimentary built-in support for commutative diagrams, by emulating the amscd package. Unfortunately, this support is completely inadequate for usage by mathematicians today: (1) Only square diagram shapes are supported: commutative diagrams in general have diagonal and curved lines, but these are not supported. (2) The rendering of the limited gamut of supported commutative diagrams is broken in most browsers (at least Safari and Firefox). In particular, lines are jagged as they are pieced together from pipes and arrows that are subtly misaligned. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000K/

tfmt-000K

/tfmt-000K/ Commutative diagrams in MathJax Like KaTeX, MathJax supports the amscd commands for rudimentary square-shaped commutative diagrams. Unlike the implementation of KaTeX, the supported diagrams are rendered correctly without jagged lines; this means that for the vanishingly small population of mathematicians whose needs are limited to square-shaped diagrams, MathJax’s builtin support is viable. On the other hand, there is a more advanced option available for users of MathJax: the XyJax-v3 plugin, which adds support for the full gamut of xypic diagrams to MathJax. Notably, this plugin is used by the Stacks Project. The downsides of the xypic support are that it interacts poorly with accessibility features (but no worse than any other solution to rendering non-trivial commutative diagrams), and that diagrams created using xypic look considerably less professional than those created using tikz or quiver. Both KaTeX and MathJax have the benefit that diagrams created using them will respect the ambient macro package with which the tool has been configured; therefore, if one looks past the rudimentary nature of the support for commutative diagrams, our main requirement is indeed satisfied. Another tool worth discussing is quiver. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000I/

tfmt-000I

/tfmt-000I/ The quiver interactive diagramming tool The quiver application is an excellent graphical interface for interactively constructing commutative diagrams, with very high-quality rendering. One positive aspect of quiver is that it is possible to load it with your own macro library, so that diagrams involving custom notations render correctly in the graphical interface. The downside of the approach here is that the macro library must be located on a publicly accessible URL that can be pasted into the quiver interface. Quiver also offers excellent support for embedding the resulting diagrams in existing LaTeX documents: after creating your diagram, you can request a LaTeX snippet that includes a URL which allows you to resume editing your diagram. For example, one such snippet corresponds to the URL https://q.uiver.app/?q=WzAsMixbMCwwLCJBIl0sWzEsMCwiQiJdLFswLDFdXQ==. Unfortunately, the support for embedding quiver diagrams in HTML documents is currently inadequate for professional use. The HTML embed code provided simply produces an <iframe>, and it is not possible to style the interior of the embedded frame (e.g. to change the background color, or decrease the margins). Therefore, we must conclude that although quiver is an excellent tool for authors of traditional LaTeX documents, it is not currently a candidate for inclusion in tools for hypertext mathematical authoring. Because of the currently inadequate support of quiver for embedding diagrams in hypertext settings, we cannot consider it any further. There is a final option that turns out to be the most used in practice: generating SVG images statically from embedded LaTeX code. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000L/
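The reason a quiver URL allows editing to be resumed is that the whole diagram is carried in its query string. The sketch below decodes the example URL using only the Python standard library; the function name `decode_quiver_url` is hypothetical, and the reading of the decoded fields given in the comments is an assumption based on inspecting this one value, not a statement of quiver's documented format.

```python
import base64
import json
from urllib.parse import parse_qs, urlparse


def decode_quiver_url(url):
    """Decode the Base64-encoded JSON payload carried in the `q` parameter."""
    payload = parse_qs(urlparse(url).query)["q"][0]
    return json.loads(base64.b64decode(payload))


url = "https://q.uiver.app/?q=WzAsMixbMCwwLCJBIl0sWzEsMCwiQiJdLFswLDFdXQ=="
data = decode_quiver_url(url)
print(data)

# Assumed reading of this payload: two leading numbers, then two cells
# placed at columns 0 and 1 of row 0 with labels "A" and "B", then one
# arrow from cell 0 to cell 1.
```

On this reading, the example URL describes a diagram consisting of a single arrow from A to B.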

tfmt-000L

/tfmt-000L/ Generating images statically using LaTeX Because of the general inadequacy of the other available tools, most authors of hypertext mathematics with diagramming needs tend to rely on the static generation of images from LaTeX code using a local LaTeX toolchain. It is not difficult to instrument pandoc with a Lua filter to render tikz code to SVG images. There are also a variety of other tools that do something similar, which tend to be employed in static site generation: antex by Paolo Brasolin is used by Krater as well as jekyll-sheafy, both via jekyll-antex. Forester by Jonathan Sterling is used by the present web site. The basic architecture of such a tool is to scan for LaTeX blocks, and then identify them by a hash of their contents. This hash is used as a filename for .tex files, which are compiled to DVI and thence to SVG using the dvisvgm tool; the resulting SVG file is then embedded in HTML using an <img> tag. Alternatively, it is also possible to transclude the resulting <svg> element directly, but then one must be careful to rename all identifiers in the <svg> element uniquely, as it is possible for two different <svg> elements on a single page to interfere with each other. Both antex and Forester support passing a macro library to be used when rendering. Both jekyll-sheafy and Forester set their macro libraries on a page-local basis. A serious downside of generating images from LaTeX code is the negative impact on accessibility tools. This seems only slightly mitigated by the transclusion of the <svg> element as opposed to using <img>. Ultimately accessibility for mathematical diagrams remains an unsolved problem, and it does not seem that the existing discussion on accessibility of hypertext mathematics has much to say about this problem. Finally, we comment on more principled approaches using web standards such as SVG and MathML that we hope will take form in the future. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000N/
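The scan-and-hash architecture described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used by antex or Forester: it recognizes only tikzcd environments, uses SHA-256 as the content hash, and merely plans the compilation jobs rather than invoking a LaTeX toolchain. The names `TIKZ_BLOCK`, `svg_name`, and `plan` are hypothetical.

```python
import hashlib
import re

# Assumption: diagrams are delimited by a tikzcd environment.
TIKZ_BLOCK = re.compile(r"\\begin\{tikzcd\}.*?\\end\{tikzcd\}", re.DOTALL)


def svg_name(latex_block):
    """Name the output image after a hash of the block's contents."""
    digest = hashlib.sha256(latex_block.encode("utf-8")).hexdigest()
    return digest + ".svg"


def plan(source):
    """Replace each LaTeX block by an image reference and collect the
    distinct blocks that need compiling, keyed by output filename."""
    jobs = {}

    def replace(match):
        block = match.group(0)
        name = svg_name(block)
        jobs[name] = block  # identical blocks collapse into one job
        return f'<img src="{name}">'

    return TIKZ_BLOCK.sub(replace, source), jobs


square = r"\begin{tikzcd} A \arrow[r] & B \end{tikzcd}"
other = r"\begin{tikzcd} X \arrow[r] & Y \end{tikzcd}"
html, jobs = plan(f"First: {square} Again: {square} Then: {other}")
print(html)
print(len(jobs), "distinct diagrams to compile")
```

Naming images by content hash means an unchanged diagram is never recompiled and a repeated diagram is compiled once. A tool that also passes a macro library to the renderer would presumably need to account for that library when deciding whether a cached image is still valid.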

tfmt-000N

/tfmt-000N/ SVG is not an authoring language SVG is an extremely powerful low-level language for vector images and diagrams with a variety of applications. Unfortunately, it is not reasonable to compose such diagrams directly in SVG as an author: in contrast to programmatic tools like PGF/TikZ, all positions in SVG are fixed, and there is no possibility to impose important abstractions (e.g. the concept of a line that is “glued” to a pair of nodes). On the other hand, there are many advantages to SVG, including the possibility to intermix SVG with other formats such as MathML. Because of the low level of abstraction, SVG images that appear in practice today are nearly always produced by a tool or compiler from an input that is defined at a much higher level of abstraction. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000O/

tfmt-000O

/tfmt-000O/ MathML is not an authoring language Despite some preliminary support for structured representation of high-level mathematical idioms via Content MathML, MathML is not intended to be an authoring language: it is a target language for other tools. Moreover, the content dictionaries (collections of basic elements) of Content MathML are chosen to pertain to the needs of grade-school and secondary-school mathematics and not at all to the needs of professional mathematics: The base set of content elements is chosen to be adequate for simple coding of most of the formulas used from kindergarten to the end of high school in the United States, and probably beyond through the first two years of college, that is up to A-Level or Baccalaureate level in Europe. Nonetheless, it seems that the goal was for the content dictionaries of Content MathML to be extended by the individual “communities of practice” to meet their specific needs: Hence, it is not in general possible for a user agent to automatically determine the proper interpretation for values without further information about the context and community of practice in which the MathML instance occurs. However, in contexts where highly precise semantics are required (e.g. communication between computer algebra systems, within formal systems such as theorem provers, etc.) it is the responsibility of the relevant community of practice to verify, extend or replace definitions provided by OpenMath CDs as appropriate. It seems that there is a possibility to use XSLT to define your own semantic notational macros, and this certainly bears further investigation. Due to the mutually reinforcing combination of historically poor vendor support and near-absolute isolation from actual communities of practice, i.e. working mathematicians, sophisticated direct use of MathML has never caught on. 
On the other hand, there is a great deal of MathML on the web today in the form of MathJax and KaTeX output: tools which are not only currently necessary for obtaining consistent (and professional-quality) rendering of mathematics across browsers, but also are necessary for authoring due to their more succinct markup and easy support for macros. It seems that the future of MathML is brighter than it was in the past, as we are finally seeing a vital project to improve vendor support led by Igalia. Currently, even browsers that support the MathML standard do so with completely inadequate and unprofessional rendering quality, which means that tools like MathJax and KaTeX may remain necessary for some time even after vendors finally support MathML. But we hope that with improved vendor support comes new and productive experiments with using semantic tools like XSLT to handle macros, etc. Unfortunately, given the tight coupling between the authoring of mathematical expressions and of mathematical diagrams, this transformation will not take place unless high-level hypertext-compatible tools for drawing diagrams are simultaneously developed. Jon Sterling 2023 1 7 https://www.forester-notes.org/tfmt-000M/

tfmt-000M

/tfmt-000M/ Towards mixing SVG and MathML in hypertext mathematics The W3C MathML Core Working Draft points out that MathML can be embedded into <svg> elements using the <foreignObject> element. This is a great strength of the modularity of the model, and I believe that in the future, we will be able to use this as a way to render accessible mathematical diagrams in hypertext. What is missing? Essentially the current issue preventing widespread use of this method is the fact that neither SVG nor MathML is an authoring language: they are both currently too low-level to be seriously used by authors. For exactly so long as diagrams must be drawn using LaTeX-based tools rather than something MathML-compatible, it would be non-negotiable for the support of notational macros to itself be based in LaTeX syntax (e.g. as in both KaTeX and MathJax). But it is worth imagining a future in which mathematical diagrams are drawn using a high-level interface to SVG, and then a pure MathML approach to notational macros becomes quite viable. This is not currently the world we live in, but it is something to hope for. https://www.forester-notes.org/andymatuschak/
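To make the embedding concrete, the following sketch builds a tiny SVG document whose single label is MathML wrapped in a foreignObject element, using only the Python standard library. The function name `labelled_node`, the coordinates, and the sizes are hypothetical; the two namespace URIs are the standard ones for SVG and MathML.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def labelled_node(x, y, identifier):
    """Build an SVG document containing one MathML identifier as a label."""
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "120", "height": "60"})
    # foreignObject is the SVG element that admits content from another
    # XML namespace, here MathML.
    holder = ET.SubElement(
        svg,
        f"{{{SVG_NS}}}foreignObject",
        {"x": str(x), "y": str(y), "width": "40", "height": "30"},
    )
    math = ET.SubElement(holder, f"{{{MATHML_NS}}}math")
    mi = ET.SubElement(math, f"{{{MATHML_NS}}}mi")
    mi.text = identifier
    return ET.tostring(svg, encoding="unicode")


document = labelled_node(10, 10, "A")
print(document)
```

Writing even one label this way takes a dozen lines, which illustrates why neither format is practical to author by hand and why a higher-level interface would be needed.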

andymatuschak

/andymatuschak/ Andy Matuschak Person https://andymatuschak.org/ Independent Researcher I’m an applied researcher, focused on creating user interfaces that expand what people can think and do. My current focus is an augmented book which actively helps people understand, remember, and use what they read. I believe personal computers can enable transformative tools for thought: environments that radically transform what people can think and do, so much so that we expand the set of thoughts it’s possible to think. I want to produce alien cognitive and creative powers—as wondrous and magical to us today as a modern visual effects artist might seem to a cave painter. https://www.forester-notes.org/index/

index

/index/ Forester Forester is a tool for authoring, exploring, and sharing scientific and mathematical hypertexts. It is your lab notebook, your journal, your blackboard, and the home of your lecture notes. Forester is maintained by Jon Sterling and Kento Okura. Project information Forester blog Presentations Release notes https://www.forester-notes.org/kentookura/

kentookura

/kentookura/ Kento Okura Person https://github.com/kentookura https://www.forester-notes.org/paolobrasolin/

paolobrasolin

/paolobrasolin/ Paolo Brasolin Person https://paolobrasolin.github.io/ 0000-0003-2471-7797 Contributions