Linux Commands for Developers
blog.jayfields.com

I was expecting to see the first comment in here complaining about his use of 'cat', since in all of his examples the second command could easily have taken a filename argument:
sort order.*
is surely more elegant than cat order.* | sort
which is fair enough. However, as it happens, I generally do end up using 'cat' the way he's used it. For such small jobs nobody can be genuinely worried about the overhead, and it comes down to a matter of taste. Personally, I find that using 'cat output | $command' helps to separate out the 'logic' of what I'm doing, if that makes sense.
also, again, purely as a matter of taste
i'd prefer
egrep 'Hardcover|Kindle'
over grep "\(Kindle\|Hardcover\)"
EDIT: (as alexfoo has pointed out, this isn't a proper AND since it cares about the order.. my bad, still useful though :D )

And as a sidenote, something I only found recently, but which is quite useful: a logical AND with egrep looks like
egrep 'Hardcover.*Kindle'

[ EDIT - Two replies as the original post had been edited by the time I posted the first. ]
> and as a sidenote, something i only found recently, but which is quite useful, a logical AND with egrep looks like
>
> egrep 'Hardcover.*Kindle'
That's not a true logical AND since it won't pick up an entry with the text "Kindle Hardcover". Only entries with the word "Hardcover" eventually followed by "Kindle". To cover both cases you'd need:-
egrep 'Hardcover.*Kindle|Kindle.*Hardcover'

(Of course, someone will now show how this can be done in even fewer characters.)

How about:

grep Hardcover | grep Kindle

Yup, but I had meant in one command though, i.e.
sed -n '/Kindle/{/Hardcover/p}'
awk '/Kindle/ && /Hardcover/'
awk doesn't have to be complicated ;)
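The difference is easy to see in a small sandbox (the sample file and its contents below are made up for illustration):

```shell
# Build a throwaway sample file; two lines contain both words, in either order.
printf '%s\n' \
  'Joy of Clojure, Hardcover' \
  'Patterns of Enterprise Architecture, Kindle edition' \
  'Bundle: Hardcover and Kindle' \
  'Bundle: Kindle and Hardcover' > /tmp/orders.txt

# Order-sensitive: only matches "Hardcover ... Kindle", missing the reversed line.
grep -E 'Hardcover.*Kindle' /tmp/orders.txt

# True logical AND: matches regardless of the order of the two words.
awk '/Kindle/ && /Hardcover/' /tmp/orders.txt
```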
If you really prefer to see things in order then

<foo sort | ...

is an alternative to

cat foo | sort | ...

though I wouldn't particularly recommend it. Instead the overhead of cat(1) should be omitted and it written in the normally accepted form of

sort foo | ...

Providing a filename rather than re-directing stdin allows the program more choice over its method of access.

I find the "cat foo | .." in the beginning and "| cat > bar" at the end form more regular.
While iterating on a command line, it keeps things uniform, rather than switching between "sort foo" and "tai64nlocal < foo" and "ffmpeg -i foo", by which I mean: different programs take their input in different ways. You can normalize by making each take standard input, and feed the chain with a "cat".
I can understand liking the regularity but in production code or web examples it shouldn't be done because of the overhead. However, your example doesn't make sense.

If sort, tai64nlocal, and ffmpeg are all happy to read stdin so you can do

cat foo | sort ...
cat foo | tai64nlocal ...
cat foo | ffmpeg ...

then they can all have their stdin redirected instead by the shell:

<foo sort ...

Similarly with stdout:

... | sort | cat >foo

becomes

... | sort >foo

In both cases the regularity of having the filename at the start and end is preserved.
Minor nitpick from the man page: "egrep is the same as grep -E. fgrep is the same as grep -F. Direct invocation as either egrep or fgrep is deprecated, but is provided to allow historical applications that rely on them to run unmodified."
Indeed, I don't see why people get so upset (or pedantic) about what are, effectively, NOPs in command-lines.
However, things change if you start adding certain options to sort:-

sort -m order.*

and

cat order.* | sort -m

are definitely not the same thing (for most input files at least).

Perhaps because sending GiBs through read(2) and write(2) unnecessarily isn't a NOP?
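To see concretely why `sort -m order.*` and `cat order.* | sort -m` differ: `sort -m` merges inputs that are each already sorted, so handing it one concatenated (and no longer sorted) stream changes the result. A minimal sketch with made-up files:

```shell
# Two files, each individually sorted.
printf 'a\nc\n' > /tmp/order.1
printf 'b\nd\n' > /tmp/order.2

# Merging the two sorted files interleaves them correctly: a b c d
sort -m /tmp/order.1 /tmp/order.2

# Concatenating first gives sort -m a single (unsorted) input,
# which it passes through unchanged: a c b d
cat /tmp/order.1 /tmp/order.2 | sort -m
```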
You can do that sending while you're waiting for the disk to provide said GiBs. I also believe that useless uses of cat are often acceptable for readability (many novices are not familiar with redirecting standard input, particularly not as the first thing on a command line).
Who said the GiBs need to be fetched from disk; they could already be in RAM. Even if not, it's still adding many system calls and context switches when the CPU could be doing other things; the machine isn't running just this one thing.
Who said NOPs are free? NOPs still take at least one clock cycle.
It's not that simple anymore... In modern CPUs, NOPs are discarded by the decoding units so they never occupy the execution units. If decoding bandwidth is not saturated (and most often, it's not), NOPs are indeed "free".
Moreover if you use cat, it can be easily replaced with e.g. zcat to do whatever you want with gzipped files.
One guideline to keep in mind:
> If the original title begins with a number or number + gratuitous
> adjective, we'd appreciate it if you'd crop it. E.g. translate
> "10 Ways To Do X" to "How To Do X," and "14 Amazing Ys" to "Ys."
> Exception: when the number is meaningful, e.g. "The 5 Platonic Solids."

One I learned for the first time the other day is 'paste'. Good for people like me who never fully grokked awk: it joins lines from separate files into a single file.
Say you have two files, one with lines of numbers:
1
2
3
... and one with letters:

A
B
C
$ paste numbers letters
1 A
2 B
3 C
Want CSV?

$ paste -d, numbers letters
1,A
2,B
3,C
Or, with '-s' you can join lines from inside the same file. For instance, you can sum numbers:

$ paste -sd+ numbers
1+2+3
$ paste -sd+ numbers | bc
6
(Thanks to a Stack Overflow post somewhere for suggesting that one!)

Useful example: the total resident memory size of all chromium processes:
$ ps --no-headers -o rss -C chromium | paste -sd+ | bc
793180

`paste` is also very useful together with stdin redirection and subshells. E.g.:
paste <(ping 8.8.8.8) <(while true; do iwconfig wlan0 | grep "Bit Rate"; sleep 1; done)

The two streams could drift apart. Better to do a one-packet ping and run iwconfig once, both inside the same loop.
Hmmm, is this HN worthy? These commands are so basic that I don't expect any HN reader who uses Unix systems not to know them.
What about some slightly less basic ones that I'd think would be useful for many people.
# less with syntax highlighting
alias less="/usr/share/vim/vimcurrent/macros/less.sh"
tailf # tail -f, but better & shorter
mtr # check your connection (ie., traceroute with more info)
htop # nice, colorful process listing (better than top)
locate # search files by name -- that late-night disk thrashing is useful after all!

I ask the programmers on my team, some of whom are quite junior and inexperienced, to send weekly status reports with an overview of what they did in the previous week, what they expect to do in the next, and other. Other is usually made up of areas of concern and interesting things that have made it onto one's radar. (You can put basically whatever you like in the other section. I'm still hoping somebody will send a joke.)
Since I wouldn't ever ask them to do something I wouldn't do myself, I send these reports too. This link will go in my miscellany section this week. I think it'll be useful, and I wouldn't have ever found it or even considered including something like it had it not shown up on HN.
Yes, it is HN worthy. Reviewing the basics is important and not all readers use Unix systems (frequently enough to remember the basics).
> tailf # tail -f, but better & shorter
What's better about tailf, and why would I use it instead of the (far superior to tail) less +F?
I found it useful. I was unaware of less.
If you are going to use vim, why bother pretending it is less? Just use vim.
Because the vim macro changes all settings so it behaves just like less, including making the file effectively read-only, and without the overhead of loading all the plugins.
You should really try ack (http://betterthangrep.com) as a replacement for grep.
I think it's obligatory that someone writes 'strace' in threads about articles like this, so here goes. strace is fantastic for debugging certain categories of problem.
Absolutely, especially for those who do network programming under Unix/Linux. Combined with lsof, it can help a lot in troubleshooting issues such as deadlocks.
I've yet to find a decent introduction to strace (or dtrace). I'd appreciate it if someone could point me to one...
Here's a post outlining how strace can be used to solve problems:
https://blogs.oracle.com/ksplice/entry/strace_the_sysadmin_s...
More from a sysadmin POV than a dev, but you'd probably still find it useful.
I was surprised and a little disappointed that `join(1)` didn't join the list!
With the files in the example (order.out.log, order.in.log), to join every record on its id, you would do something like:
$ join -j 2 order.out.log order.in.log
111, 8:22:19 1, Patterns of Enterprise Architecture, Kindle edition, 39.99 8:22:20 Order Complete
112, 8:23:45 1, Joy of Clojure, Hardcover, 29.99 8:23:50 Order sent to fulfillment
113, 8:24:19 -1, Patterns of Enterprise Architecture, Kindle edition, 39.99 8:24:20 Refund sent to processing

I expected to learn something useful for programming (debugging, program runtime analysis, etc.). But instead the article is just about the generic commands cat, sort, grep, cut, sed, uniq, find and less. It is not really development related.
I disagree. They may not be obviously related to coding, though they'll probably end up being useful at some point anyway... But they're definitely useful for working with logs, working with datasets, working with config files, and a host of other development-related tasks. Just yesterday I was on a Windows machine and dearly felt the loss of sed and uniq. I have a task lined up for today to either find Windows alternatives or install msys.
Totally agree! Wasted time. Everybody knows these commands.
You and I do, but surely you know a web dev or two that struggles to read log files meaningfully or doesn't know how to grep their html source for something?
Send the article along, it's not always about you...
A better post + thread:
"What are some time-saving tips that every Linux user should know?"
http://www.quora.com/Linux/What-are-some-time-saving-tips-th...
I came across something in fortune some time ago which you can find at:
http://motd.ambians.com/quotes.php/name/linux_songs_poems/to...
I have found it to be surprisingly useful. Nobody uses all the commands but remembering that something like zcat exists can be extremely useful. Also remembering to pipe through sort before sending through uniq is helpful as well.
In most cases you can replace "sort | uniq" with "sort -u".
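For example (throwaway file for illustration):

```shell
printf 'b\na\nb\na\n' > /tmp/dupes.txt

# uniq only removes ADJACENT duplicates, which is why sort comes first.
sort /tmp/dupes.txt | uniq   # prints: a b
sort -u /tmp/dupes.txt       # same output, one process and one pipe fewer
```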
tail -f filename should definitely make the list, essential for watching what's appended to log files in real time.
I prefer:-
tail -F filename
as most of the logfiles I need to watch tend to wrap at some point and 'tail -f' doesn't check for inode changes.
(-F is a non-standard option, it's there on GNU's [EDIT] tail binary and OS-X but not Solaris for example).
Thanks, very useful - didn't know that!
less +F filename does the same thing, but loses the following when the logfile gets rotated. Does the same thing happen to tail -f? (edit: alexfoo answered that tail -F tracks the file properly)
one benefit to less +F is that you can cancel the following and read normally in less.
> one benefit to less +F is that you can cancel the following and read normally in less.
Yeah, including doing backwards and forward searches, that is very handy
Those are, indeed, 8 commands every developer should know. Calling them "Linux Commands" is a little weird - I don't think there's anything there not specified in POSIX, and I think they all appear on OS X and other unix systems.
Agreed. Especially since he used 'find /Users -name "order*"', which means it was probably written on OS X.
A better title:
8 Linux Commands Every Developer Should Know [for log analysis]
Unless you want to show before/after context (-A/-B flags), one should grep before sort, not after. Less work to do.
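Both orders give the same lines; filtering first just means sort sees fewer of them. A tiny sketch with a made-up log file:

```shell
printf '%s\n' 'c ERROR' 'b ok' 'a ERROR' > /tmp/app.log

# Filter first: sort only handles the matching lines (less work).
grep ERROR /tmp/app.log | sort

# Sort first: same final output, but sort processes every line.
sort /tmp/app.log | grep ERROR
```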
*sigh* And I thought we were through with collections of trivial shell commands.
might be better titled as "8 Unix commands every developer should know".
They are all generic unix commands.
And to be really pedantic, they aren't "unix commands" they are utility programs.
Even worse, they are GNU. :)
To continue reading, subscribe? seriously?
Regardless of the actual content, I applaud the "storytelling" way of presenting content. With a flow of examples that tie into each other the reader is given background context and can say to him/herself "I've had that problem before!", as I found myself doing.
lsof -i always gets left out of these lists. It's really handy for quickly seeing what daemons are running (under which user) and on what interfaces they're bound to.
It says sed has basic stream editing capabilities. Basic. This guy should check, sed has very complex and powerful editing capabilities.
how do I execute the output of something like this:
grep -i 'pattern' file | awk '{print $5}' | sed 's/^/cmd/g'
I end up sending to a file, chmod, then run it at the shell.
In bash...

$(grep -i 'pattern' file | awk '{print $5}' | sed 's/^/cmd/g')

Or surround in backticks:

`command which outputs text you want to run as a command`

Or have I misunderstood your question?

I prefer $() as they nest better.
That's it $() works. thanks.
Wrap it in $() to execute. And even if you pipe it into a file, no chmod is required; "sh file" works as it should.
or use eval
~$ help eval
eval: eval [arg ...]
    Execute arguments as a shell command.

    Combine ARGs into a single string, use the result as input to the shell,
    and execute the resulting commands.

    Exit Status:
    Returns exit status of command or success if command is null.
xargs instead of that sed (I assume you're prepending your command to run there).
grep -i 'pattern' file | awk '{print $5}' | xargs cmd
comm is another very useful command
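Worth noting that comm expects both inputs to be sorted. A quick sketch with made-up files:

```shell
# comm walks two SORTED files and splits lines into three columns:
# only-in-file1, only-in-file2, and common to both.
printf 'a\nb\nc\n' > /tmp/f1
printf 'b\nc\nd\n' > /tmp/f2

comm -12 /tmp/f1 /tmp/f2   # suppress columns 1 and 2: lines in both (b, c)
comm -23 /tmp/f1 /tmp/f2   # suppress columns 2 and 3: lines only in f1 (a)
```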