Sed, a powerful mini-language from the 70s

julienlargetpiet.tech

19 points by random__duck 6 days ago · 32 comments

supriyo-biswas 6 days ago

This would have been a good article if the author had actually taken the time to write it themselves, perhaps in their native language, rather than using an LLM to write it.

  • TheChaplain 6 days ago

    This one feels more authentic.

    https://www.grymoire.com/Unix/Sed.html

    • supriyo-biswas 6 days ago

      I actually prefer the "organization" of the original article, but could not continue past the LLMisms.

  • jlp__inf 5 days ago

    Hi, I'm the author of the article, and sorry if it felt overly LLM-generated.

    To clarify: I did spend quite a bit of time diving into `sed` myself (experiments, tests, etc.) and wrote an initial draft in French.

    I then used ChatGPT to help me structure and refine the article in English to reach more people. But the core ideas and understanding come from my own exploration.

    That said, I understand the concern, and for future articles I'll try to rely less on that kind of tooling for structure.

    Happy to clarify or go deeper on any part of the article if needed.

    • IAmBroom 5 days ago

      Try believing in yourself.

      Don't use beauty filters to display what you wish you looked like. Don't use AI to imitate your own voice. Just be.

  • surgical_fire 6 days ago

    Man, I don't even hate LLMs for the sake of it, but this LLM language and formatting is really grating after a while.

  • adelks 6 days ago

    Indeed.

evanjrowley 6 days ago

For a while my home had a Raspberry Pi 2B running FreeBSD 10 acting as a router-on-a-stick. When I lost SSH connectivity to it, I would instead use GNU screen with a serial cable. For whatever reason, I could never get "full screen" TUI apps like vi and nano to display properly. To edit files (like pf.conf for firewall rules), I had to use sed to edit lines of the file. It was an interesting learning experience and perhaps a glimpse of how things used to be with green screen terminals. Shortly thereafter I switched over to a Beaglebone Green with OpenBSD 5 and never needed that workaround again.

  • jlp__inf 5 days ago

    Yep, interesting. Do you remember the initial text structure and the operations you would perform on it?

    • evanjrowley 3 days ago

      Unfortunately my notes from that time are long gone. It was over 10 years ago.

      There were three specific files that needed to be edited. Firewall rules were set up in pf.conf. IP-related configuration was controlled in both rc.conf and sysctl.conf.

      There was also some issue with the difference between GNU sed and BSD sed having to do with the ability to make in-place edits with the -i option.

phplovesong 6 days ago

For those of us using vim/neovim, sed is something everyone had to learn, even if they did not realize it.

This is the true power of vim. Even now decades later the unix toolbelt holds up, and is still unmatched for productivity.

Vim is in the end just a small piece of the puzzle. You use small tools for your problem, mix and match.

It's kind of like functional programming. Compose and reduce.

  • jlp__inf 5 days ago

    Yup, notably the famous %s/PATTERN1/PATTERN2/g, which I suppose comes from sed

    • piekvorst 4 days ago

      vi and sed took the s/ syntax from ed, independently of each other.

      • zvr 3 days ago

        Right! ed was the first one, and its ideas and commands influenced sed (a "stream" ed), and ex (an "extended" editor, which also had a "vi"sual mode).

jasonpeacock 6 days ago

Long ago, I bought the O'Reilly "Sed & Awk" book with plans to become a true unix guru.

Then I realized I already knew Perl (and Perl one-liners), so there it sat unused on the shelf.

  • stvltvs 6 days ago

    Mostly it's useful in my experience on systems without Perl installed, but that doesn't often come up in my world.

aperrien 6 days ago

Thanks for the blast from the past. SED led me to AWK, which led me to Perl, which led me to Python. An interesting chain that brought me back to the interpreted languages like BASIC that I programmed in when I was a kid, even though my formal training in college was Pascal and C.

  • jlp__inf 5 days ago

    Ha, I went the Python → AWK → sed route, just for the sake of learning what was behind those shamanic expressions. Glad I could bring back those memories!

0xfaded 6 days ago

I learned sed back in the day to show off. I wish I'd invested that effort in learning Perl one-liners instead. For whatever reason I picked up enough awk along the way, and now that's what I tend to use if I ever need something beyond a simple substitution.

kg08854 5 days ago

Just read this

https://sed.sourceforge.io/local/scripts/dc.sed.html

piekvorst 6 days ago

I prefer sam [1]. Unlike sed, it's not Turing complete, but it is far more elegant to my taste - consider this example from the article:

    :loop N; s/\n[[:space:]]\+/ /g; t loop; p
In sam, the equivalent is:

    x/(.+\n)*/ x/\n */ c/ /
It reads like this: loop over paragraphs of non-empty lines, loop over newline followed by spaces, replace with a single space. It's surprisingly close to SQL in its eloquence.
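
A rough Python analogy of that nested-selection reading, for readers who do not have sam installed. It is a sketch, not a sam emulation: `x` and `c` are hypothetical helpers named after the sam commands, and the patterns are tightened to `(.+\n)+` and `\n +` so that only indented continuation lines are joined, which is an assumption about the intent rather than a transcription of the command above.

```python
import re


def x(pattern, text, command):
    """Loop over every match of `pattern` in `text`, replacing each match
    with whatever `command` returns for it (loosely modelled on sam's x)."""
    return re.sub(pattern, lambda m: command(m.group(0)), text)


def c(replacement):
    """Build a command that changes any selection to `replacement`."""
    return lambda _selection: replacement


def join_continuations(text):
    # Outer loop: paragraphs of non-empty lines.
    # Inner loop: a newline followed by spaces, changed to a single space.
    return x(r"(?:.+\n)+", text, lambda para: x(r"\n +", para, c(" ")))


print(join_continuations("first\n   second\n\nthird\n"), end="")
```

The inner `x` only ever sees the text selected by the outer one, which is the part that resembles sam's nesting.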

Another example:

    N; h; s/\n/->/g ;p; g; D
In sam, an equivalent would be

    {
        1,$-2 x/./ {
            a/->/
            /./ t .
        }
        $-1 d
    }
Again, it's readable from top to bottom: from the first line to the second from the end, loop over each symbol, put "->" after it and copy the next symbol next to it; delete the last line.
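
To see why the sed one-liner yields sliding pairs, here is a hand-rolled Python trace of its pattern space and hold space. It is only a sketch: `sed_pairs` is a made-up name, the input is assumed to be a list of lines without trailing newlines, and the script is assumed to run under `sed -n`, so that only `p` prints and `N` on the last line ends the run quietly.

```python
def sed_pairs(lines):
    """Trace `N; h; s/\\n/->/g; p; g; D` over `lines`, returning what `p` prints."""
    pending = list(lines)
    printed = []
    if not pending:
        return printed
    pattern = pending.pop(0)                    # the first cycle reads the first line
    while pending:                              # N ends the script once input runs out
        pattern += "\n" + pending.pop(0)        # N: append the next input line
        hold = pattern                          # h: copy pattern space to hold space
        pattern = pattern.replace("\n", "->")   # s/\n/->/g
        printed.append(pattern)                 # p: print the joined pair
        pattern = hold                          # g: restore the two-line window
        pattern = pattern.split("\n", 1)[1]     # D: drop the first line, restart the cycle
    return printed


print(sed_pairs(["A", "B", "C", "D", "E"]))
```

For the five lines A to E this prints `['A->B', 'B->C', 'C->D', 'D->E']`: the hold space exists only to undo the substitution so that `D` can slide the window by one line.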

Let's see how far we can get. Another example:

    N; h; s/\n/->/g; p; G; D
In sam, an equivalent would be:

    {
        d
        1,$-2 x/./ {
            1,/./ x/.|\n/ {
                g/./ t $
                g/\n/ $ c/->/
            }
            $ c/\n/
        }
    }
It reads like this: delete the whole thing; from the first line to the second from the end, loop over each character; on each iteration, from the first line to the next character, run an inner loop over each character or newline; if it's a character, put it at the end; otherwise, put "->" at the end; once the inner loop is done, put a newline at the end.
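
The same reading, transcribed into Python as a sanity check. This is a sketch under assumptions: `cumulative_chains` is a hypothetical name, the input is a list of lines, and each slice `lines[: index + 2]` stands in for the address "from the first line to the next character".

```python
def cumulative_chains(lines):
    """For each line except the last, emit every line from the first
    through the following one, joined with '->'."""
    output = []
    # Outer loop: one iteration per line, from the first to the second from the end.
    for index in range(len(lines) - 1):
        chain = []
        # Inner loop: from the first line up to and including the next line.
        for line in lines[: index + 2]:
            chain.append(line)
        # Newlines inside the range become '->'; each chain ends its own output line.
        output.append("->".join(chain))
    return output


print("\n".join(cumulative_chains(["A", "B", "C", "D"])))
```

For the four lines A to D this prints `A->B`, `A->B->C` and `A->B->C->D`, one per line.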

The final example from the post is too long to include here (15 lines). Here's just the sam equivalent for it:

    x/(.+\n)+|\n+/ {
        g/./ x/\n/ c/ /
        v/./ c/\n/
    }
It reads like this: loop over paragraphs of non-empty lines or sequences of newline characters; if it has any symbol (that is, it's a paragraph), replace each newline symbol with a space; if it doesn't have a symbol (that is, it's a sequence of newline symbols), replace it with a single newline.
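
In Python terms, the `g`/`v` pair behaves like an if/else on each selection. The sketch below assumes that reading; `unwrap_paragraphs` is a hypothetical name, and Python's `re` module stands in for sam's regular expressions (in both, `.` does not match a newline by default).

```python
import re


def unwrap_paragraphs(text):
    def change(match):
        selection = match.group(0)
        if re.search(r".", selection):
            # g/./ branch: a paragraph, so each newline becomes a space.
            return selection.replace("\n", " ")
        # v/./ branch: a run of newlines, collapsed to a single newline.
        return "\n"

    # x/(.+\n)+|\n+/ : loop over paragraphs or runs of newlines.
    return re.sub(r"(?:.+\n)+|\n+", change, text)


print(repr(unwrap_paragraphs("one\ntwo\n\n\n\nthree\nfour\n")))
```

This prints `'one two \nthree four '`. Note that in this sketch a paragraph's final newline is also turned into a space, so each joined paragraph ends with a trailing space.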

What I have learned from this is that a tool with limited but well-chosen primitives is more convenient than a universal state machine.

(My examples may not be the exact same algorithms, since I do not understand (or need) sed concepts, but they do produce the same output.)

[1]: https://9p.io/sys/doc/sam/sam.html

  • jlp__inf 5 days ago

    Thanks for sharing this, I’m learning a lot from it.

    So if I understand correctly, it’s selection-based, right? You first select a region, and then you can apply further selections inside it?

    So you kind of build nested selections to match exactly what you want (almost like a conditional tree).

    And then you apply operations to the current selection.

    For example, substitution with something like c/PATTERN/, is that correct?

    • piekvorst 5 days ago

      That's exactly right. A few unmentioned details:

      . The dot (".") never matches newlines, which keeps line-oriented idioms from accidentally spanning newlines [1]

      . Changes must be sequential and non-overlapping (that's why I deleted the whole thing before processing in the third example).

      . Sam matches only the original input, not past changes.

      . Addresses (expressions before commands) select a single range; x commands select multiple ranges and loop over them.

      [1]: https://p9f.org/sys/doc/sam/sam.html, Regular expressions

      • jlp__inf 5 days ago

        Thanks for the additional information.

        Also, while I'm learning it, maybe the sam translation of the first example could be simpler, like:

        ,x/\n +/ c/ /

        • jlp__inf 5 days ago

          Also, I think I managed to simplify the second example to:

              1,$-2 x/./ {
                  a/->/
                  /./ t .
              }
              1,$-2 { p }
              q

          Which taught me a lot, because at first I did not understand why we would stop two lines before the end ($-2). In fact, the last line is the very last "\n", so we stop at $-2 to avoid producing "E->" followed by something weird.

          I'm starting to truly love sam

        • jlp__inf 5 days ago

          And yes, g/PATTERN/ and v/PATTERN/ are super powerful, creating conditional branches based on whether or not the selection matches PATTERN.

          Based on what you wrote, I just added trailing \n normalization at the end:

              , x/(.+\n)+|\n+/ {
                  g/./ x/\n/ c/ /
                  v/./ c/\n/
              }
              $ a/\n/
              , x/\n+$/ c/\n/
              w file_out.txt
              q

          • piekvorst 4 days ago

            Yeah, sam is not universal, but it can solve complex tasks more simply by just running it a few times. The final example already normalizes \n (v/./ c/\n/) in a single pass, but we can make it even simpler by just writing:

                , x/(.+\n)+/ .,+#0-#1 x/\n/ c/ /
                , x/\n+/ c/\n/
            
            (I haven’t tested it.)

            I’m glad that sam worked out for you. You can learn more from [1] and [2]. If you need any help, you’re always welcome to ask me.

            [1]: https://ratfactor.com/papers/sam_tut.pdf

            [2]: https://9p.io/sources/contrib/steve/other-docs/struct-regex....

            • jlp__inf 4 days ago

              Thanks, I published an article draft about sam inspired by your examples (referencing the sources and this comment section).

              Feel free to point out possible improvements.

  • sanjayjc 5 days ago

    That looks interesting. I thought sam was an editor (which I've only read about, never used.) Good to see it can be used on the command line.

    Is there a port to Apple silicon?
