A tale of two path separators

alexwlchan.net

66 points by dbaupp 9 days ago · 37 comments

VorpalWay 5 days ago

Old Macs (which I grew up with) had even more baroque path handling than mentioned in the blog post:

Double colon (::) meant the same as .. on Unix/DOS, that is "go up one level". So you have to be careful when concatenating paths to not get double separators.

Paths starting with : were relative. If a path didn't start with the separator, the first component was the volume name (disk partition). Again, quite unlike Unix.

Also, remember it was common to have spaces in names on Mac; even the default hard drive on Macs was named "Macintosh HD". So an absolute path like "Macintosh HD:Programs:MacWrite" would have been common. (I grew up with Macs in Swedish, so I'm back-translating the names here; they may have been slightly different in English.)
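
The rules above (a leading colon means relative, each extra colon in a run goes up one level, otherwise the first component is the volume) can be sketched in Python. This is only an illustration of the description in this comment, not code checked against a real classic Mac OS. The name `parse_classic_mac_path` is made up, and treating a bare name with no colon as relative is an assumption.

```python
import re


def parse_classic_mac_path(path):
    """Split a classic Mac OS style path into (volume, parts).

    volume is None for relative paths; ".." in parts means "go up one level".
    """
    # Tokens alternate between runs of colons and names.
    tokens = re.findall(r":+|[^:]+", path)
    volume = None
    parts = []
    for index, token in enumerate(tokens):
        if token.startswith(":"):
            # A run of n colons is one separator plus n - 1 "go up" steps.
            parts.extend([".."] * (len(token) - 1))
        elif index == 0 and len(tokens) > 1:
            # The path did not start with a colon: first component is the volume.
            volume = token
        else:
            # Assumption: a bare name with no colon at all is relative.
            parts.append(token)
    return volume, parts
```

With this sketch, `parse_classic_mac_path("Macintosh HD:Programs:MacWrite")` gives `("Macintosh HD", ["Programs", "MacWrite"])`, and `parse_classic_mac_path(":Docs::Notes")` gives `(None, ["Docs", "..", "Notes"])`.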

  • BoingBoomTschak 5 days ago

    Fun thing is, I first encountered that when using Clozure CL (https://ccl.clozure.com/), which quotes colons when converting paths to strings, even on Linux:

      $ cat <<'EOF' >x.lisp
      heredoc> (require :uiop)
      heredoc> (let ((p (make-pathname :name "foo:bar")))
      heredoc>   (format t "~@{~A~%~}" (namestring p) (uiop:native-namestring p)))
      heredoc> EOF
      $ ccl -b -Q -l x.lisp </dev/null
      foo\:bar
      foo:bar
      $ sbcl --script x.lisp
      foo:bar
      foo:bar

  • fragmede 5 days ago

    Current macOS Finder lets you name files with a slash in them, rendered as a : in the terminal.

    • VorpalWay 5 days ago

      Isn't that just what the original article that we all commented on described? I don't understand what you are trying to add to the conversation here.

    • microtonal 5 days ago

      I think that the slash might be the rendering, though? If you run

          $ touch "foo:bar"

      in the Terminal, then Finder renders it as foo/bar.

      So who is lying? How is it stored in the directory entry in APFS itself?

      • kps 4 days ago

        APFS stores it as "foo:bar". (I don't think the filesystem itself cares what's in a name.)

windowliker 5 days ago

It took me a long time to understand why colon wasn't a valid character for file names on Mac, and I still find the colon separator to be the least visible these days. Finder can display paths with the forward slash separator (defaults write com.apple.finder _FXShowPosixPathInTitle -bool YES). Yet a forward slash may be used in a file name created through Finder, as noted in the post, while a colon cannot (which is not addressed). Creating a file named with a colon is possible in the terminal, though, and the shell will escape it correctly in use. That file then shows up with a slash in place of the colon when viewed in Finder, and conversely the file with a slash in the name shows up in Terminal with a colon!
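
The two-way swap described here can be modelled in a few lines of Python. This is a sketch that assumes the mapping is a plain character-for-character exchange of ":" and "/" within a single file name; the function names are hypothetical and nothing here calls a real macOS API.

```python
# Assumption: Finder and the POSIX layer differ only by swapping these two characters.
_SWAP = str.maketrans({":": "/", "/": ":"})


def finder_display_name(posix_name):
    """Name as Finder would show it, given the name seen in the terminal."""
    return posix_name.translate(_SWAP)


def posix_name_for(finder_name):
    """Name as the terminal would show it, given the name typed in Finder."""
    return finder_name.translate(_SWAP)
```

Under that assumption, `finder_display_name("foo:bar")` is `"foo/bar"`, and applying the swap twice returns the original name.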

  • zahlman 4 days ago

    You still have to worry about colons on Linux; while they're valid in file and folder names, they prevent folders from being put on PATH and understood properly.

    (Other characters of course cause usability problems and are potentially even a security vulnerability depending on the terminal. But they're still "valid".)
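
A quick Python sketch of why a colon in a directory name breaks PATH: the variable is split on ":" with no escaping mechanism, so one directory becomes two bogus entries. The directory `/opt/my:tools/bin` and the helper name are made up for illustration.

```python
def split_search_path(value, separator=":"):
    """Split a PATH-style string the way a POSIX shell does: on every separator."""
    return value.split(separator)


# A directory literally named "my:tools" cannot survive the split.
entries = split_search_path("/opt/my:tools/bin:/usr/bin")
```

Here `entries` ends up as `["/opt/my", "tools/bin", "/usr/bin"]`, so the intended directory is never searched.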

    • windowliker 4 days ago

      Stock macOS uses a bash or zsh shell in the terminal (depending on the OS version), so the same points of caution hold there as well.

kevin_thibedeau 4 days ago

> Windows is weird for using the backwards slash

Windows handles the forward slash as well, which was also part of a unification with UNIX-style paths intended for XEDOS.

pedromlsreis 5 days ago

I'm curious how much of this behaviour is still intentional design vs. just inertia. Are the modern filesystems still constrained by these older choices, or is it mostly for compatibility?

  • p_l 5 days ago

    Filesystems usually do not see path separators at all; that's something handled at the VFS level.

    • Kwpolska 5 days ago

      But they do see / or : in file names, and the interesting question is which one it is today on which filesystem.

      • syncsynchalt 4 days ago

        The filesystems of macOS are particularly opinionated, much more than most Unices which tend toward "anything is allowed [and usually preserved] except \0 and /".

        macOS supports case-insensitivity[0] and performs Unicode normalization[1] on filenames, and decomposes name data to an extent that the question "what does the fs see" is a bit moot.

        With that said, the internal storage of a filename in APFS is a nul-terminated UTF-8 string[2], with (I'm pretty sure) colons as colons, which the Finder displays as slashes.

        [0] if you make a file named "Makefile" then touch a file named "makefile", it'll touch the first file, instead of making a second file.

        [1] if you make a file named "schön" (s-c-h-combining¨-o-n) and then search for (s-c-h-ö-n), you can find it, or vice versa. The particular normalization/canonicalization used is NFD.

        [2] j_drec_key_t description in https://developer.apple.com/support/downloads/Apple-File-Sys...

        • zahlman 4 days ago

          > a file named "schön" (s-c-h-combining¨-o-n)

          Combining marks come after the character they modify, btw. (Presumably thanks to support from things like harfbuzz, modern systems will happily put two dots above an h.)
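
Both points (NFD decomposition, and the combining mark following its base character) can be checked with Python's standard `unicodedata` module. This only demonstrates Unicode normalization itself; the `same_name` helper is a hypothetical stand-in and not a claim about how APFS compares names internally.

```python
import unicodedata

composed = "sch\u00f6n"  # "ö" stored as one precomposed code point
decomposed = unicodedata.normalize("NFD", composed)

# NFD splits "ö" into "o" followed by U+0308 COMBINING DIAERESIS.
code_points = [f"U+{ord(ch):04X}" for ch in decomposed]


def same_name(a, b):
    """Hypothetical normalization-insensitive comparison of two file names."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)
```

The decomposed string has six code points, with U+0308 directly after the "o", and `same_name(composed, decomposed)` is true.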

        • skissane 4 days ago

          > macOS supports case-insensitivity

          Well, strictly speaking Linux does too, since it supports mounting local or remote filesystems with this feature

          For a long time, the real distinction was that “native” Linux filesystems didn’t support it, but “foreign” ones did. However, nowadays even some of the “native” filesystems have optional support for case-insensitivity (e.g. casefold feature on ext4)

          The real difference now: on macOS, it is normal to have this feature turned on, exceptional to have it disabled; on Linux, it is the other way around

      • p_l 3 days ago

        You can imagine many internal APIs as taking "look up this <name> from directory <handle>". Surprisingly often the only practical limit is that the name does not contain NUL bytes.

        Path separators, whether to accept directory entries with a path separator in them, etc. are usually handled a layer above.
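
A toy Python model of that layering, under the assumption that directories are plain dicts: the filesystem class only answers "look up this name in this directory", and a separate resolver owns the separator. `ToyFilesystem` and `resolve` are hypothetical names, not any real kernel interface.

```python
class ToyFilesystem:
    """Hypothetical filesystem layer: maps (directory, name) to an entry."""

    def __init__(self, root):
        self.root = root

    def lookup(self, directory, name):
        # The only restriction enforced at this layer is "no NUL in the name".
        if "\0" in name:
            raise ValueError("name contains a NUL character")
        return directory[name]


def resolve(fs, path, separator="/"):
    """Hypothetical VFS-style layer: splits the path and walks one name at a time."""
    node = fs.root
    for name in path.split(separator):
        if name:  # skip empty components from leading or doubled separators
            node = fs.lookup(node, name)
    return node


fs = ToyFilesystem({"usr": {"bin": {"foo:bar": "entry"}}})
```

In this model `resolve(fs, "/usr/bin/foo:bar")` returns `"entry"`: the colon is just another character to the lower layer, and swapping the separator only changes the resolver.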

breppp 5 days ago

I was expecting the story of the magical ¥ path separator

watersb 4 days ago

Classic Mac OS aliases are similar to shortcuts on Windows; they are not symbolic links but rather actual files that record the path to the target.

I want to call such aliases "normal" files, as opposed to a link, but the path description is saved in the Resource Fork of the file, not the Data fork.

Resolving an alias can involve network path traversal. You can make an alias of a file on an AFP volume and save it locally, and the next time you use the alias the volume will be auto mounted if necessary. I think you can get similar behavior from other OS configurations.

I seem to recall that if you move or rename a file, the system will update the alias for you. It can't always figure this out. But it will try. That's something you might not see elsewhere...

I've forgotten why AppleScript returns alias objects instead of strings.

momoraul 5 days ago

Another Windows oddity: each drive letter has its own current directory. D: doesn't mean the root of D:, it means "wherever you last were on D:". Same with C:foo, which is relative to C:'s current directory. DOS baggage that's still around.
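
Python's `ntpath` module (the Windows flavour of `os.path`, importable on any platform) makes the drive-relative form visible. This is a small sketch; `describe` is a made-up helper.

```python
import ntpath


def describe(path):
    """Return (drive, rest, is_absolute) for a Windows-style path string."""
    drive, rest = ntpath.splitdrive(path)
    return drive, rest, ntpath.isabs(path)


relative_on_c = describe("C:foo")      # drive given, but no root after it
absolute_on_c = describe("C:\\foo")    # drive plus root
bare_drive = describe("D:")            # just the drive
```

`C:foo` and `D:` both carry a drive but are not absolute, which is exactly the "wherever you last were on that drive" case.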

  • skissane 4 days ago

    > Another Windows oddity: each drive letter has its own current directory

    For NT-based Windows: only in cmd.exe, and other apps which choose to support the same convention. The NT/Win32 API only supports a single per-process current directory.

    There is actually space in NT data structures to store per-drive current directory, but no released version has ever used it. I think they planned to implement the idea in NT itself (or NT’s implementation of Win32), but then settled on just having a single current directory per-process, and faking the old behaviour in cmd.exe using environment variables

    By contrast, Windows 1.x/2.x/3.x/9x/Me retained the old DOS behaviour of per-drive current directories, so Win32 does actually have them if you mean the Win32s or 9x/Me implementations of Win32.

    Separately, both Linux and macOS support per-thread current directories separate from the per-process current directory, although by default all threads use the process-wide current directory. Last I checked, the macOS implementation was a bit more sophisticated, in that on Linux once the link between process and thread current directory was severed, it was gone for the lifetime of the thread; by contrast, macOS has an API to re-establish it.

    • momoraul 3 days ago

      Ah, fair. On modern Windows it's really cmd.exe faking it with env vars, not the API. Didn't know NT reserved space for per-drive CWDs and then never used it.

      • skissane 2 days ago

        > Didn't know NT reserved space for per-drive CWDs and then never used it.

        RTL_USER_PROCESS_PARAMETERS has a field “RTL_DRIVE_LETTER_CURDIR CurrentDirectores[32]” (note the misspelling). And then RTL_DRIVE_LETTER_CURDIR is defined as:

            typedef struct _RTL_DRIVE_LETTER_CURDIR
            {
                 WORD Flags;
                 WORD Length;
                 ULONG TimeStamp;
                 STRING DosPath;
            } RTL_DRIVE_LETTER_CURDIR, *PRTL_DRIVE_LETTER_CURDIR;
        
        But, AFAIK, Microsoft has never shipped anything that uses it. My own impression is this was the original design for handling compatibility with the DOS current directory behaviour, but they ended up deciding on doing it in cmd.exe instead. And of course there is NTVDM and 16-bit Windows app support, but I think that just used the 16-bit DOS code and its associated data structures.

        https://www.geoffchappell.com/studies/windows/km/ntoskrnl/in...

  • chrismorgan 4 days ago

    And you need `cd /d` to switch drives. This was how I rendered a Windows computer non-bootable for the first time. Ran Command Prompt as admin (because I was logged in as a user that didn’t have write access to D:\backups), and it starts in a rather important directory, then:

      C:\WINDOWS\system32>cd D:\backups\some-huge-directory
      C:\WINDOWS\system32>del /s *
    
    Oops. I learned to look twice before running a big dangerous command. And to use /d.

    • ChrisSD 4 days ago

      Or you could use powershell and avoid the issue ;).

      Though nowadays system files should be protected even from admin and even if you do manage to delete them, Windows can restore them.

    • momoraul 3 days ago

      Oof, cd silently staying on the wrong drive and then del /s is the worst possible version of this. /d the hard way.
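
A toy Python model of the `cd` behaviour in this subthread: without `/d`, a `cd` aimed at another drive only updates that drive's remembered directory, and the shell stays put. `ToyCmd` is a hypothetical simulation built on `ntpath`, not how cmd.exe is actually implemented; the dict stands in for the environment variables mentioned above.

```python
import ntpath


class ToyCmd:
    """Hypothetical model of cmd.exe's `cd`; not the real implementation."""

    def __init__(self, cwd):
        self.cwd = cwd
        drive = ntpath.splitdrive(cwd)[0].upper()
        # Remembered directory per drive (a stand-in for cmd.exe's env vars).
        self.per_drive = {drive: cwd}

    def cd(self, target, switch_drive=False):
        current_drive = ntpath.splitdrive(self.cwd)[0].upper()
        drive = ntpath.splitdrive(target)[0].upper() or current_drive
        base = self.per_drive.get(drive, drive + "\\")
        new_dir = ntpath.normpath(ntpath.join(base, target))
        self.per_drive[drive] = new_dir
        # Only move the shell if the target is on the current drive,
        # or if the caller asked for the equivalent of `cd /d`.
        if switch_drive or drive == current_drive:
            self.cwd = new_dir


shell = ToyCmd("C:\\WINDOWS\\system32")
shell.cd("D:\\backups\\some-huge-directory")
cwd_without_d = shell.cwd
shell.cd("D:\\backups\\some-huge-directory", switch_drive=True)
cwd_with_d = shell.cwd
```

In this model `cwd_without_d` is still `C:\WINDOWS\system32`, which is where the `del /s *` would run; only the second call moves the shell to D:.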

blamestross 4 days ago

Someday ASCII 28 thru 31 will be loved.

bebe83939 5 days ago

Now imagine an operating system that has no directories (and no path separators), or no filesystem at all.

  • rswail 5 days ago

    Like CP/M, DOS v1, RT-11 etc.

    VMS used:

        node::device:[dir1.dir2.dir3]filename.extension;version
    
    From memory, you could have up to 15 nested directories.

    The versioning was cool as long as you remembered to clean them up.
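
The file specification shape above can be picked apart with a regular expression. This Python sketch follows only the layout quoted in this comment (which is itself from memory); it does not try to cover real VMS syntax, and `parse_vms_spec` is a hypothetical name.

```python
import re

# Each part is optional except the file name; the layout follows
# node::device:[dir1.dir2.dir3]filename.extension;version as quoted above.
_VMS_SPEC = re.compile(
    r"^(?:(?P<node>[^:]+)::)?"
    r"(?:(?P<device>[^:\[]+):)?"
    r"(?:\[(?P<dirs>[^\]]*)\])?"
    r"(?P<name>[^.;]*)"
    r"(?:\.(?P<extension>[^;]*))?"
    r"(?:;(?P<version>\d+))?$"
)


def parse_vms_spec(spec):
    """Return a dict of the parts of a VMS-style file specification, or None."""
    match = _VMS_SPEC.match(spec)
    if match is None:
        return None
    parts = match.groupdict()
    # Directories are dot-separated inside the square brackets.
    parts["dirs"] = parts["dirs"].split(".") if parts["dirs"] else []
    return parts
```

For example, `parse_vms_spec("node::device:[dir1.dir2.dir3]filename.extension;3")` yields the node, the device, a three-element directory list, and version `"3"`.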
