Ctypes.sh – A foreign function interface for bash
Neat...and on a somewhat similar note (bash `enable -f` hacks), some bash FUSE bindings I wrote a few years back: https://github.com/zevweiss/booze
I initially wrote it basically just for giggles, but was rather pleased a few months ago when I realized I could use it for something I actually needed, and it was just the right thing (presenting a mount point that acted as a view of the differences between two rsnapshot-style hard-link trees -- only took a few dozen lines of very simple code).
The problem with this project is that somebody might use it. I argue that if you need to do something in bash that involves more than a sequence of commands, including a simple if statement, then you should switch from bash to a proper scripting language (python, perl, ruby, nodejs, take your pick.)
I disagree. Shell scripts are great for automating system tasks, sysadminy type stuff, including fairly complex tasks with nested loops and conditionals.
A scripting language like those you mentioned is great if you need to run your script on systems you don't control, if you need it to be cross-platform, or if you'd describe what you're doing as creating software rather than automating tasks.
No scripting language can ever integrate with the host system quite as well as shell scripts can. Sure, shell scripts make it easy to shoot yourself in the foot, but then can't you say the same of at least perl and ruby?
I don't really see many good use cases for this library though. By the time you'd use it you should probably move on to something else.
> No scripting language can ever integrate with the host system quite as well as shell scripts can.
What is the basis for this? Shell script is a scripting language (more precisely, a set of scripting languages with similar features.) The difference between shell scripting languages and other scripting languages is that the former are optimized around the need to scale down to a convenient line-by-line way to work with the system in a REPL; while the others may support work in a REPL it is not what they are optimized for.
There's no real reason why other scripting languages can't integrate with the system as well as shell languages.
>> No scripting language can ever integrate with the host system quite as well as shell scripts can.
> What is the basis for this? Shell script is a scripting language (more precisely, a set of scripting languages with similar features.) The difference between shell scripting languages and other scripting languages is that the former are optimized around the need to scale down to a convenient line-by-line way to work with the system in a REPL; while the others may support work in a REPL it is not what they are optimized for.
Disagree. The defining characteristic of shell scripting languages is that they are a shell that can be scripted. What's a shell? It's a program designed to be a layer around the OS, exposing all of the capabilities of the OS to the user in a convenient form.
So the only way a non-shell scripting language can become that powerful is to become a shell scripting language.
>What's a shell? It's a program designed to be a layer around the OS, exposing all of the capabilities of the OS to the user in a convenient form.
I think that's what they were getting at when they said:
>optimized around the need to scale down to a convenient line-by-line way to work with the system in a REPL
The only real difference is optimization for the REPL. Many of the things you might consider part of the experience aren't even shell builtins; they're utilities maintained separately from your shell. PowerShell is a good example of a shell scripting language that has a lot more going on than your traditional shell: native object pipelines, .NET libraries, etc. But the most important thing about it is that it's optimized for the REPL.
In addition to AnimalMuppet's point, shell scripts make no attempt to be cross platform, so you don't get any awkward leaky abstractions when trying to interact with low-level system facilities.
The primitives in shell scripts are the primitives of the operating system. The fundamental building blocks of your language are strings and files and processes, which makes it really convenient to work with strings and files and processes, AKA the operating system.
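To illustrate the point (a minimal sketch; `/tmp/lines.txt` is just an arbitrary scratch path): the building blocks really are files and processes, so transforming a file with a pipeline of cooperating processes takes no imports at all.

```shell
# Write a file, then find its most common line with a pipeline of
# processes: sort groups duplicates, uniq -c counts them, sort -rn
# ranks the counts, head takes the winner.
printf 'b\na\nb\n' > /tmp/lines.txt
sort /tmp/lines.txt | uniq -c | sort -rn | head -n1
```

Every stage is a separate process wired together with pipes, which is exactly the kind of composition the parent comment is describing.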
You can easily invoke platform-specific binaries from within a scripting language. You can even invoke them in a shell if you want. Check out this API: https://docs.python.org/3/library/subprocess.html#subprocess...
If most of your scripting language code is calls to other programs and to the shell, there's a simpler way...
Well of course you can. Scripting languages aren't useless stunted toys. The point is that if that's what most of your program is doing, you'll have a lot more success using a language where that functionality is first class.
Can you give a shell script example of something that perl/python/ruby/nodejs can't do?
No such example; after all, you can spawn a shell subprocess in any of those.
There are just certain types of tasks where you can more clearly express your intent in a shell script. If, for example, you need to spawn a ton of subprocesses, you can do that in any scripting language, but shell is designed from the ground up to launch subprocesses; its most basic purpose is to launch programs.
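As a tiny sketch of what "first class" means here: backgrounding and joining a batch of jobs is a built-in part of the syntax, not a library call.

```shell
# Launching subprocesses is the shell's native idiom: fork three
# background jobs with `&`, then join all of them with `wait`.
for i in 1 2 3; do
    sleep 0 &    # `sleep 0` stands in for real work
done
wait             # blocks until every background job has exited
echo "all jobs finished"
```

The equivalent in a general-purpose scripting language needs a process or threading module and explicit bookkeeping for the handles.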
And so while I can't give an example of something that can only be done in a shell script, here's a shell script that I wrote a few weeks ago that would have been a pain in the neck to express in any other language:
    #!/bin/sh

    cd "$(dirname "$0")"

    for test in *.in; do
        test="$(basename "$test" .in)"
        infile="$test.in"
        outfile="$test.out"

        output="$(./whofrom "$infile" 2>&1)"
        expected="$(cat "$outfile")"
        if [ "$output" != "$expected" ]; then
            echo "Failed test $test"
            echo "Expected: $expected"
            echo "Actual: $output"
            echo
        fi

        if ! valgrind --error-exitcode=1 --leak-check=full ./whofrom "$infile" 2>/dev/null >/dev/null; then
            echo "Failed test $test"
            valgrind -q --leak-check=full ./whofrom "$infile"
            echo
        fi
    done

I don't think it's a big pain in python. I imagine ruby wouldn't be terrible either:
(since it's a quick script, I'm ignoring stuff like closing files - they'll get GCd on each iteration anyway)

    #!/usr/bin/env python
    import sys, os, subprocess, glob

    os.chdir(os.path.dirname(sys.argv[0]))

    for test in glob.glob("*.in"):
        test = os.path.splitext(test)[0]
        infile = test + ".in"
        outfile = test + ".out"

        output = subprocess.check_output(["./whofrom", infile],
                                         stderr=subprocess.STDOUT)
        expected = open(outfile, 'r').read()
        if output != expected:
            print("Failed test", test)

        try:
            devnull = open(os.devnull, 'w')
            subprocess.check_call(["valgrind", "--error-exitcode=1",
                                   "--leak-check=full", "./whofrom", infile],
                                  stderr=devnull, stdout=devnull)
        except subprocess.CalledProcessError:
            print("Failed test", test)
            subprocess.check_call(["valgrind", "-q", "--leak-check=full",
                                   "./whofrom", infile])

Apart from the header, the whole script is pretty much the same when comparing line-by-line.
One issue is the problem of indirect documentation, and it comes down to this one line: "import sys, os, subprocess, glob".
How do you find which module you need? Search on the web may be the best answer. In shell at least you have "man -k".
Once I know the module, how do I get its documentation? Here at least there is an answer: import glob; help(glob);
But how good is this documentation? If I do it I get:
    glob(pathname)
        Return a list of paths matching a pathname pattern.

        The pattern may contain simple shell-style wildcards a la fnmatch.

Already python is telling me that the "man" documentation is going to be better :-).

> How do you find which module you need? Search on the web may be the best answer.
Experience, stdlib reference, web searches. Same as with bash.
> In shell at least you have "man -k".
I really disagree here. Since you took `glob` as an example, how do you get to the explanation of `for test in *.in`? Go on, try that with "man -k".
> Already python is telling me that the "man" documentation is going to be better
I'm not trying to say python is good here (well, the stdlib documentation on the web is actually pretty good, it's just not easily available from the console). But the idea that bash/man is more discoverable is just wrong... You can find the glob under "Pattern Matching" section which is all right, but you need to understand most of the expansion mechanism of shell to know it applies to "for". Then again "for" itself has a definition that belongs more to a CS material, than to a usage guide.
Since you took `glob` as an example, how do you get to the explanation of `for test in *.in`?
`man -k wildcard` points to `man 3am fnmatch` which points to `man 3 fnmatch`. Now the process for python and shell converge, and I still have to know to look at the "See Also" section of the manpage to find `man 7 glob` which finally gives me useful information.
The python workflow involved one fewer discrete step, but the user would have to know both help() in the python REPL and how to navigate man pages, while for shell scripts the user only needed to know how to navigate man pages.
> I'm not trying to say python is good here (well, the stdlib documentation on the web is actually pretty good, it's just not easily available from the console). But the idea that bash/man is more discoverable is just wrong... You can find the glob under "Pattern Matching" section which is all right, but you need to understand most of the expansion mechanism of shell to know it applies to "for". Then again "for" itself has a definition that belongs more to a CS material, than to a usage guide.
I do very much agree with you here. Man pages are not very accessible. Some man pages (most that I work with, but I understand that's not everyone's experience) are very good, complete, and understandable. I'd be comfortable saying that python's and the shell's documentation features are roughly equally useful, ± some small amount.
Well, I agree that glob was not the best example. For sure there are things you have to learn about shell scripting as well. One of them is: if "man for" doesn't work, try "help for" or "man bash". In fact there is no good reason for this and it should be improved. (Not to mention that you already had to know what *.in does, but you could argue that this is fundamental shell syntax: man bash / Pattern Matching.)
I'd argue that that's significantly worse at conveying the intent than the shell script version.
The shell script started as a workflow that was executed repeatedly, manually, from the shell, and then automated. Naturally, the shell script closely resembles the commands typed in at the shell, with some added code to encode the part of the routine that was executed in the user's head.
The python script doesn't look anything like the routine that was previously entered at the shell. That makes it harder to tell at a glance that this script does the same thing.
Is it horrible? No, absolutely not. If you're working on a team where everyone knows python but not everyone is comfortable with shell scripts, then writing the script in python is the clear best option. But if you're working on a team where everyone is comfortable with both python and shell scripts, the shell script probably wins out.
> Can you give a shell script example of something that perl/python/ruby/nodejs can't do?
Install perl/python/ruby and nodejs -- requiring just the shell to be installed?
Snark aside (it's not only snark - pretty much every system will have something like a posix shell), proper posix portable shell is hard - it's an old gnarly language -- but it's what we've got. And with a bit of discipline and good practices -- it doesn't have to be bad. That said, a lot of real-world shell scripts are bad.
I tend to agree with the overall sentiment; shell isn't a great language. As soon as you start to mix awk (which awk is that, do you need GNU awk?), sed and perhaps a bit of egrep (or grep -E -- are both available? Does it accept only BSD-style parameters?) -- one should consider moving "up".
And for eg: setting up a python package/program -- I'd generally prefer a python script -- hopefully one that handles different file-paths (eg: / vs \ ) and other cross-platform stuff. If you already depend on python, why add dependency on shell?
> Install perl/python/ruby and nodejs -- requiring just the shell to be installed?
I just meant using any one of them, not all of them. And you can pretty much always rely on python and perl being installed.
But my point is that if you're doing something other than a one-off thing, then it belongs to some project and you're probably going to commit that script to that project's codebase. That means it has to be maintained. Do future you and whoever else has to maintain the project a favor and use one of the popular scripting languages that has reasonable syntax and semantics.
Many minimal distributions have only the shell installed - which make it (still) relevant for provisioning/bootstrap etc.
Personally I'd much rather maintain a shell script than a perl script - but that's just because I know shell better. Maybe shell is the first language people would program without learning it (js being the second)?
>Install perl/python/ruby and nodejs -- requiring just the shell to be installed?
You can run several of those languages (python for sure, but also perl IIRC) as a shell.
As a system shell? In theory, perhaps. But migrating a typical Linux/BSD distribution away from having any dependency on the shell would be a major undertaking. Meanwhile, some distributions already ship with only shell/busybox.
Set environment variables in the calling shell, because you cannot run Python etc. via "source".
Some earlier discussion, people porting to FreeBSD and what not: https://news.ycombinator.com/item?id=9959628
What could go wrong? /s
All sarcasm aside, a very interesting idea. I'm not sure what the proper use case is, but I'm sure someone will love this.
There are a couple of weird APIs like unshare(2) where it matters that you do things in the shell itself, not in a process spawned by a shell. (chdir(2) might be an even better example, come to think of it.) I once wrote a bash plugin for playing with unshare(2) specifically.
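chdir(2) makes the point concretely: it only affects the calling process, which is exactly why `cd` has to be a shell builtin rather than an external program. A minimal sketch (`/tmp/cd-demo` is an arbitrary directory):

```shell
mkdir -p /tmp/cd-demo
start=$(pwd)

# In a subshell, the chdir happens in a child process and is lost on exit:
( cd /tmp/cd-demo )
[ "$(pwd)" = "$start" ] && echo "subshell cd did not stick"

# As a builtin running in the current shell, the chdir actually moves us:
cd /tmp/cd-demo
echo "now in $(pwd)"
```

An external `cd` binary would behave like the subshell case every time: it would chdir itself, exit, and leave its parent exactly where it started. unshare(2) has the same shape, which is why it needed a bash plugin rather than a wrapper binary.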
That said, unshare(1) now supports `-r` with `-U`, which was the thing I needed.
The best use-case I can think of is using bash as a REPL for C libraries. Many times in the past, I've made library calls that either misinterpreted the parameters or the result value. I would have loved the ability to prototype those calls in bash until I understood them enough to call them properly.
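As a sketch of that workflow (this assumes ctypes.sh is installed; `dlcall` is the command its README documents for invoking a foreign function, here libc's puts):

```
$ source ctypes.sh
$ dlcall puts "hello, world"
hello, world
```

From there you can iterate on the parameters interactively at the prompt until the call behaves the way you expect, before committing anything to C.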
You can use gdb as an REPL for C.
My gdb-fu is weak, but don't you have to have a binary that you're debugging to do that? I'm thinking of this as more of an exploratory thing: poking around the edges of an API until I feel comfortable enough with it to start writing real code.
Sure, but you could compile:
    #include <my_api.h>

    int main() { return 0; }
This blew my mind the first time some one showed me, and is one of the reasons I currently prefer C to other mainstream compiled languages.
Edit: I suppose it is possible to get some version of a repl with C++ (Cling?), Java (Beanshell, Java 9?), C# (Mono, but not currently .Net?). But each of those seems generally a bit harder to get access to than simply calling gdb on a binary with debugging symbols enabled.
C# has REPL in the immediate window when the debugger is running; so it's kind of like using GDB as a REPL for C. Although it is somewhat limited in what you can do.
That's actually really good to know about.
Visual Studio's a little bit heavyweight, but if you're programming C#, there's a good chance that's what you're using anyway.
beanshell is quite painful. I like using the repl of a jvm language to explore classes and methods. Like slime+clojure or groovysh or jruby.
Reminds me of vxworks: the main shell for the operating system is the debugger.
When I need a C repl, I often use python and ctypes.
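For example (a sketch assuming a POSIX system with python3 on the path; `ctypes.CDLL(None)` opens the running process's global symbol namespace, which includes libc):

```shell
# A throwaway C "REPL" call from the shell, via python's ctypes:
# load libc symbols and call strlen() on a byte string.
python3 -c '
import ctypes
libc = ctypes.CDLL(None)      # this process + its loaded libs, incl. libc
print(libc.strlen(b"hello"))  # a real C strlen call
'
```

It is one step removed from a bash-native FFI like ctypes.sh, but it covers the same "poke at a C function until you understand it" use case.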