Linux Commands for Developers
blog.jayfields.com

I was expecting to see the first comment in here complaining about his use of 'cat', since in all of his examples the second command could easily have taken a filename argument:
sort order.*
is surely more elegant than cat order.* | sort
which is fair enough. However, as it happens, I generally do end up using 'cat' the way he's used it. For such small jobs nobody can be genuinely worried about the overhead, and it comes down to a matter of taste. Personally, I find that using 'cat output | $command' helps to separate out the 'logic' of what I'm doing, if that makes sense.
also, again, purely as a matter of taste
i'd prefer
egrep 'Hardcover|Kindle'
over grep "\(Kindle\|Hardcover\)"
EDIT: (as alexfoo has pointed out, this isn't a proper AND since it cares about the order.. my bad, still useful though :D )

And as a sidenote, something I only found recently, but which is quite useful: a logical AND with egrep looks like
egrep 'Hardcover.*Kindle'

[ EDIT - Two replies as the original post had been edited by the time I posted the first. ]
> and as a sidenote, something i only found recently, but which is quite useful, a logical AND with egrep looks like
>
> egrep 'Hardcover.*Kindle'
That's not a true logical AND since it won't pick up an entry with the text "Kindle Hardcover". Only entries with the word "Hardcover" eventually followed by "Kindle". To cover both cases you'd need:-
egrep 'Hardcover.*Kindle|Kindle.*Hardcover'

(Of course, someone will now show how this can be done in even fewer characters.)

How about:

grep Hardcover | grep Kindle

Yup, but I had meant in one command though, i.e.
sed -n '/Kindle/{/Hardcover/p}'
awk '/Kindle/ && /Hardcover/'
awk doesn't have to be complicated ;)
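The difference is easy to see in a small sandbox (the sample file and its contents below are made up for illustration):

```shell
# Build a throwaway sample file; two lines contain both words, in either order.
printf '%s\n' \
  'Joy of Clojure, Hardcover' \
  'Patterns of Enterprise Architecture, Kindle edition' \
  'Bundle: Hardcover and Kindle' \
  'Bundle: Kindle and Hardcover' > /tmp/orders.txt

# Order-sensitive: only matches "Hardcover ... Kindle", missing the reversed line.
grep -E 'Hardcover.*Kindle' /tmp/orders.txt

# True logical AND: matches regardless of the order of the two words.
awk '/Kindle/ && /Hardcover/' /tmp/orders.txt
```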
If you really prefer to see things in order then

<foo sort | ...

is an alternative to

cat foo | sort | ...

though I wouldn't particularly recommend it. Instead the overhead of cat(1) should be omitted and it written in the normally accepted form of

sort foo | ...

Providing a filename rather than re-directing stdin allows the program more choice over its method of access.

I find the "cat foo | .." in the beginning and "| cat > bar" at the end form more regular.
While iterating on a command line, it keeps things uniform, rather than switching between "sort foo" and "tai64nlocal < foo" and "ffmpeg -i foo", by which I mean: different programs take their input in different ways. You can normalize by making each take standard input, and feed the chain with a "cat".
I can understand liking the regularity but in production code or web examples it shouldn't be done because of the overhead. However, your example doesn't make sense.

If sort, tai64nlocal, and ffmpeg are all happy to read stdin so you can do

cat foo | sort ...
cat foo | tai64nlocal ...
cat foo | ffmpeg ...

then they can all have their stdin redirected instead by the shell:

<foo sort ...

Similarly with stdout:

... | sort | cat >foo

becomes

... | sort >foo

In both cases the regularity of having the filename at the start and end is preserved.
Minor nitpick from the man page: "egrep is the same as grep -E. fgrep is the same as grep -F. Direct invocation as either egrep or fgrep is deprecated, but is provided to allow historical applications that rely on them to run unmodified."
Indeed, I don't see why people get so upset (or pedantic) about what are, effectively, NOPs in command-lines.
However, things change if you start adding certain options to sort:-

sort -m order.*

and

cat order.* | sort -m

are definitely not the same thing (for most input files at least).

Perhaps because sending GiBs through read(2) and write(2) unnecessarily isn't a NOP?
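To see concretely why `sort -m order.*` and `cat order.* | sort -m` differ: `sort -m` merges inputs that are each already sorted, so handing it one concatenated (and no longer sorted) stream changes the result. A minimal sketch with made-up files:

```shell
# Two files, each individually sorted.
printf 'a\nc\n' > /tmp/order.1
printf 'b\nd\n' > /tmp/order.2

# Merging the two sorted files interleaves them correctly: a b c d
sort -m /tmp/order.1 /tmp/order.2

# Concatenating first gives sort -m a single (unsorted) input,
# which it passes through unchanged: a c b d
cat /tmp/order.1 /tmp/order.2 | sort -m
```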
You can do that sending while you're waiting for the disk to provide said GiBs. I also believe that useless uses of cat are often acceptable for readability (many novices are not familiar with redirecting standard input, particularly not as the first thing on a command line).
Who said the GiBs need to be fetched from disk; they could already be in RAM. Even if not, it's still adding many system calls and context switches when the CPU could be doing other things; the machine isn't running just this one thing.
Who said NOPs are free? NOPs still take at least one clock cycle.
It's not that simple anymore... In modern CPUs, NOPs are discarded by the decoding units so they never occupy the execution units. If decoding bandwidth is not saturated (and most often, it's not), NOPs are indeed "free".
Moreover if you use cat, it can be easily replaced with e.g. zcat to do whatever you want with gzipped files.
One guideline to keep in mind:
> If the original title begins with a number or number + gratuitous
> adjective, we'd appreciate it if you'd crop it. E.g. translate
> "10 Ways To Do X" to "How To Do X," and "14 Amazing Ys" to "Ys."
> Exception: when the number is meaningful, e.g. "The 5 Platonic Solids."

One I learned for the first time the other day is 'paste'. Good for people like me who never fully grokked awk: it joins lines from separate files into a single file.
Say you have two files, one with lines of numbers:
1
2
3
... and one with letters:

A
B
C
$ paste numbers letters
1 A
2 B
3 C
Want CSV?

$ paste -d, numbers letters
1,A
2,B
3,C
Or, with '-s' you can join lines from inside the same file. For instance, you can sum numbers:

$ paste -sd+ numbers
1+2+3
$ paste -sd+ numbers | bc
6
(Thanks to a Stack Overflow post somewhere for suggesting that one!)

Useful example: the total resident memory size of all chromium processes:
$ ps --no-headers -o rss -C chromium | paste -sd+ | bc
793180

`paste` is also very useful together with stdin redirection and subshells. E.g.:
paste <(ping 8.8.8.8) <(while true; do iwconfig wlan0 | grep "Bit Rate"; sleep 1; done)

The two streams could drift apart. Better to do a one-packet ping and run iwconfig once, both inside the same loop.
Hmmm, is this HN worthy? These commands are so basic that I don't expect any HN reader who uses Unix systems not to know them.
What about some slightly less basic ones that I'd think would be useful for many people.
# less with syntax highlighting
alias less="/usr/share/vim/vimcurrent/macros/less.sh"
tailf # tail -f, but better & shorter
mtr # check your connection (ie., traceroute with more info)
htop # nice, colorful process listing (better than top)
locate # search files by name -- that late-night disk thrashing is useful after all!

I ask the programmers on my team, some of whom are quite junior and inexperienced, to send weekly status reports with an overview of what they did in the previous week, what they expect to do in the next, and other. Other is usually made up of areas of concern and interesting things that have made it onto one's radar. (You can put basically whatever you like in the other section. I'm still hoping somebody will send a joke.)
Since I wouldn't ever ask them to do something I wouldn't do myself, I send these reports too. This link will go in my miscellany section this week. I think it'll be useful, and I wouldn't have ever found it or even considered including something like it had it not shown up on HN.
Yes, it is HN worthy. Reviewing the basics is important and not all readers use Unix systems (frequently enough to remember the basics).
> tailf # tail -f, but better & shorter
What's better about tailf, and why would I use it instead of the (far superior to tail) less +F?
I found it useful. I was unaware of less.
If you are going to use vim, why bother pretending it is less? Just use vim.
Because the vim macro changes all settings so it behaves just like less, including making the file effectively read-only, and without the overhead of loading all the plugins.
You should really try ack (http://betterthangrep.com) as a replacement for grep.
I think it's obligatory that someone writes 'strace' in threads about articles like this, so here goes. strace is fantastic for debugging certain categories of problem.
Absolutely, especially for those who do network programming under Unix/Linux. Combined with lsof, it can help a lot in troubleshooting issues such as deadlocks.
I've yet to find a decent introduction to strace (or dtrace). I'd appreciate it if someone could point me to one...
Here's a post outlining how strace can be used to solve problems:
https://blogs.oracle.com/ksplice/entry/strace_the_sysadmin_s...
More from a sysadmin POV than a dev, but you'd probably still find it useful.
I was surprised and a little disappointed that `join(1)` didn't join the list!
With the files in the example (order.out.log, order.in.log), to join every record on its id, you would do something like:
$ join -j 2 order.out.log order.in.log
111, 8:22:19 1, Patterns of Enterprise Architecture, Kindle edition, 39.99 8:22:20 Order Complete
112, 8:23:45 1, Joy of Clojure, Hardcover, 29.99 8:23:50 Order sent to fulfillment
113, 8:24:19 -1, Patterns of Enterprise Architecture, Kindle edition, 39.99 8:24:20 Refund sent to processing

I expected to learn something useful for programming (debugging, program runtime analysis, etc.). But instead the article is just about the generic commands cat, sort, grep, cut, sed, uniq, find and less. It is not really development related.
I disagree. They may not be obviously related to coding, though they'll probably end up being useful at some point anyway... But they're definitely useful for working with logs, working with datasets, working with config files, and a host of other development-related tasks. Just yesterday I was on a Windows machine and dearly felt the loss of sed and uniq. I have a task lined up for today to either find Windows alternatives or install msys.
Totally agree! Wasted time. Everybody knows these commands.
You and I do, but surely you know a web dev or two that struggles to read log files meaningfully or doesn't know how to grep their html source for something?
Send the article along, it's not always about you...
A better post + thread:
"What are some time-saving tips that every Linux user should know?"
http://www.quora.com/Linux/What-are-some-time-saving-tips-th...
I came across something in fortune some time ago which you can find at:
http://motd.ambians.com/quotes.php/name/linux_songs_poems/to...
I have found it to be surprisingly useful. Nobody uses all the commands but remembering that something like zcat exists can be extremely useful. Also remembering to pipe through sort before sending through uniq is helpful as well.
In most cases you can replace "sort | uniq" with "sort -u".
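For example (throwaway file for illustration):

```shell
printf 'b\na\nb\na\n' > /tmp/dupes.txt

# uniq only removes ADJACENT duplicates, which is why sort comes first.
sort /tmp/dupes.txt | uniq   # prints: a b
sort -u /tmp/dupes.txt       # same output, one process and one pipe fewer
```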
tail -f filename should definitely make the list, essential for watching what's appended to log files in real time.
I prefer:-
tail -F filename
as most of the logfiles I need to watch tend to wrap at some point and 'tail -f' doesn't check for inode changes.
(-F is a non-standard option, it's there on GNU's [EDIT] tail binary and OS-X but not Solaris for example).
Thanks, very useful - didn't know that!
less +F filename does the same thing, but loses the following when the logfile gets rotated. Does the same thing happen to tail -f? (edit: alexfoo answered that tail -F tracks the file properly)
one benefit to less +F is that you can cancel the following and read normally in less.
> one benefit to less +F is that you can cancel the following and read normally in less.
Yeah, including doing backwards and forward searches, that is very handy
Those are, indeed, 8 commands every developer should know. Calling them "Linux Commands" is a little weird - I don't think there's anything there not specified in POSIX, and I think they all appear on OS X and other unix systems.
Agreed. Especially since he used 'find /Users -name "order*"', which means it was probably written on OS X.
A better title:
8 Linux Commands Every Developer Should Know [for log analysis]
Unless you want to show before/after context (-A/-B flags), one should grep before sort, not after. Less work to do.
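Both orders give the same lines; filtering first just means sort sees fewer of them. A tiny sketch with a made-up log file:

```shell
printf '%s\n' 'c ERROR' 'b ok' 'a ERROR' > /tmp/app.log

# Filter first: sort only handles the matching lines (less work).
grep ERROR /tmp/app.log | sort

# Sort first: same final output, but sort processes every line.
sort /tmp/app.log | grep ERROR
```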
*sigh* And I thought we were through with collections of trivial shell commands.
might be better titled as "8 Unix commands every developer should know".
They are all generic unix commands.
And to be really pedantic, they aren't "unix commands" they are utility programs.
Even worse, they are GNU. :)
To continue reading, subscribe? seriously?
Regardless of the actual content, I applaud the "storytelling" way of presenting content. With a flow of examples that tie into each other the reader is given background context and can say to him/herself "I've had that problem before!", as I found myself doing.
lsof -i always gets left out of these lists. It's really handy for quickly seeing what daemons are running (under which user) and on what interfaces they're bound to.
It says sed has basic stream editing capabilities. Basic. This guy should check, sed has very complex and powerful editing capabilities.
how do I execute the output of something like this:
grep -i 'pattern' file | awk '{print $5}' | sed 's/^/cmd/g'
I end up sending to a file, chmod, then run it at the shell.
In bash...

$(grep -i 'pattern' file | awk '{print $5}' | sed 's/^/cmd/g')

Or surround in backticks:

`command which outputs text you want to run as a command`

Or have I misunderstood your question?

I prefer $() as they nest better.
That's it $() works. thanks.
Wrap it in $() to execute. And even if you pipe it into a file, no chmod is required; "sh file" works as it should.
or use eval
~$ help eval
eval: eval [arg ...]
    Execute arguments as a shell command.

    Combine ARGs into a single string, use the result as input to the shell,
    and execute the resulting commands.

    Exit Status:
    Returns exit status of command or success if command is null.
xargs instead of that sed (I assume you're prepending your command to run there).
grep -i 'pattern' file | awk '{print $5}' | xargs cmd
comm is another very useful command
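Worth noting that comm expects both inputs to be sorted. A quick sketch with made-up files:

```shell
# comm walks two SORTED files and splits lines into three columns:
# only-in-file1, only-in-file2, and common to both.
printf 'a\nb\nc\n' > /tmp/f1
printf 'b\nc\nd\n' > /tmp/f2

comm -12 /tmp/f1 /tmp/f2   # suppress columns 1 and 2: lines in both (b, c)
comm -23 /tmp/f1 /tmp/f2   # suppress columns 2 and 3: lines only in f1 (a)
```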