Perl6: Unary Sort

perl6advent.wordpress.com

66 points by patrickas 12 years ago · 44 comments

Reader

I really need to take a proper looks at Perl 6 -- it has some nice characteristics. Maybe that's my new years resolution.

stesch 12 years ago

And a dozen ways to write a very simple function.
- Renaud 12 years ago
  
  Old debate, it's a core philosophy or the language. It's a bit like complaining that there are a 100 ways to cook chicken.
  Personally, I like the freedom that Perl offers. Perl 5.x is still widely used in some industries, finance and banking for instance, and it's still the glue that holds 'nix systems together (have a look at the scripts in the bin folders of any distro).
  Following Modern Perl[1] best practices, you can write powerful, meaningful and expressive Perl 5 without shooting yourself in the foot.
  Perl 6 is another beast, it removes many of the ambiguities present in Perl 5 and introduces more functional paradigms.
  At its core, Perl remains a multipurpose tool. The fact that there are multiple ways to do a thing is not bad, whether it suits you or not is a matter of personal preference really.
  So, if someone is curious about Perl, they should be encouraged to try it and find out for themselves if they like it or not.
  [1] http://modernperlbooks.com/books/modern_perl/
- kamaal 12 years ago
  
  That's a wrong way of looking at things. The right way is to see how a language adapts per your thinking for writing complex functions. When you look at things from such an angle, its highly restrictive if a language expects you to always bend towards its one true ideology.
  A nice language will offer you enough creative ammunition and will do as you wish while you are busy at more important things.
- rmc 12 years ago
  
  It's a feature, not a bug!

yxhuvud 12 years ago

Somehow I prefer the Ruby solution to add another method to do the unary variant. Compare

  ["A","b","C"].sort {|a, b| a.downcase <=> b.downcase }

  ["A","b","C"].sort_by {|k| k.downcase }

(or equivalently)

  ["A","b","C"].sort_by &:downcase

raiph 12 years ago
The unary sort is useful for two reasons. It simplifies the code AND it does key caching.
But sometimes you need BOTH a key extraction closure (to get key caching for performance) AND a comparison closure (to specify a custom sort). In P6 you just specify both closures (P6 figures out which is which because one has one arg and the other has two).
Here's the P6 equivalent of your last couple lines:
```
    ["A","b","C"].sort: { .lc } # A, b, C
    ["A","b","C"].sort:  *.lc   # same thing
```
Here's a custom sort:
```
    ["A","b","C"].sort: { $^b.lc leg $^a.lc } # C, b, A
```
Now combining them:
```
    ["A","b","C"].sort: *.lc, { $^b leg $^a } # C, b, A
```
The latter will run faster.
- colomon 12 years ago
  
  The latter doesn't appear to be implemented in any version of p6.
  - raiph 12 years ago
    
    colomon is right. Thanks for pointing this out colomon.
    Sorry HNers and P6ers.
    I feel so embarrassed. I'm usually almost pathologically careful. I have an explanation if anyone wants it but I suspect an excuse is uninteresting so will skip that for now.
    Instead I'll say I am now committed to trying to implement this (it's in the spec, just not yet implemented) by updating Rakudo's sort method:
    https://github.com/rakudo/rakudo/blob/874e358fa5b09f94243003...
    The good news is that it's (currently) about 10 lines of code. The bad news is they look very gutsy to me. Oh well. Open mouth, take responsibility, do your best and all that.

adrianmsmith 12 years ago

Great stuff, nevertheless the following is a surely a regression:

> Note that in Perl 6, cmp is smart enough to compare strings with string semantics and numbers with number semantics, so producing numbers in the transformation code generally does what you want.

Perl 5 had a clear distinction between "cmp" (string comparison) and "<=>" (numeric comparison). Trying to work out the data type, and thus the comparison approach, from the actual data itself, is surely going to create subtle bugs, that don't appear in testing, but do appear with live data.

colomon 12 years ago

Perl 6 has three comparisons: "<=>" for numeric, "leg" for string (stands for less equal greater), and "cmp" as described above.
- colomon 12 years ago
  
  I should have also said that a great deal of thought went into making the default behaviors for "cmp" robust. But if (as in the article's example) you're producing numbers in the transformation code, "cmp" is guaranteed to do the right thing for you. Likewise if you produce strings in the transformation.
  - adrianmsmith 12 years ago
    
    "Likewise if you produce strings in the transformation" What if you produce strings that look like numbers? Where, for example, extra leading zeros differ in significance between string and numeric comparison? Surely trying to use heuristics for this stuff is always a recipe for disaster, as with the == operator in PHP?
    
    colomon 12 years ago
    
    Perl 6 has proper types. If you produce a string, you are producing a Str object (well, technically it could be an object that does the Stringy role, but I don't think there are any other Stringy types implemented yet) and it will compare as a string, not a number, no matter what it looks like.
    Likewise, if you produce a number, you have an object which does the Numeric role, and it will compare as a number.
    
    perlgeek 12 years ago
    
    If you produce strings, they are compared as strings, even if they look like numbers.
orblivion 12 years ago

This caught my eye as well. As a Python programmer who never learned Perl, this validates some stereotypes. Or maybe it's the budding Haskell programmer in me. "smart enough"? Sounds pretty dangerous to me.
- laumars 12 years ago
  
  I don't know what stereotypes you are referring to, but Perl is actually one of the better loosely typed languages for not causing the aforementioned bugs. And also one of the best languages for not overloading operators too (which is one of the reasons it looks like executable line noise)
- ugexe 12 years ago
  
  So don't use the comparison operator that attempts to detects the correct type and use the operator made for the specific type you know you are comparing?
  - adrianmsmith 12 years ago
    
    If an operator has no use (i.e. you should always use a type-specific operator) then why is it in the language at all? If it's there, someone will use it, and if it shouldn't be used, then that's bad that someone uses it.
    
    ugexe 12 years ago
    
    Why would you use a type specific operator in a quick script where you don't need to really think about typing and all your perl 6 variables are typed as 'Any' as they aren't specifically typed?

rmc 12 years ago

Python has this with the "key" argument to "sort".

    list.sort(key=lambda x: x.lower())

maxerickson 12 years ago
You can also pass in the method from the str type:
```
    key=str.lower
```
It will be faster, but it will break on elements that aren't strings (which could be good or bad).
- koenigdavidmj 12 years ago
  Then you can use
  key=operator.methodcaller("lower")
  and handle any type.
_ZeD_ 12 years ago

witch is a little different thing.
python can sort according to a function (using the "key" parameter) or using a custom comparator function (using the "cmp" parameter)
as a side note, Python offers also a sorted[0] function, applyable to any sequence
[0] http://docs.python.org/2/library/functions.html#sorted
- orblivion 12 years ago
  
  I don't get the difference.
  - _ZeD_ 12 years ago
    
    They resolve differenti problems: say you have a list of complex object, and you wanna sort by an attribute, you can use
    mylist.sort(key=lambda element: element.attribute)
    Elsewere, say you want to sort a non-omogeneous list, or according to a custom order (attribute1 descending, attribute2 ascending) cmp make this easy
    
    philh 12 years ago
    
    I think ve was saying, ve doesn't see the difference between python's `sort(key=...)` and perl6's `sort { one-arg block }`. (At any rate, I don't see that difference.)
    
    rmc 12 years ago
    
    Elsewere, say you want to sort a non-omogeneous list, or according to a custom order (attribute1 descending, attribute2 ascending) cmp make this easy
    You can do this in python with the key function returning a list/tuple:
    mylist.sort(key=lambda element: (element.attribute1, -element.attribute2))
    
    _ZeD_ 12 years ago
    
    I'm not sure you can do -"string" :)
  - dded 12 years ago
    
    The Python .sort() method sorts in-place. The sorted() function returns a sorted copy, leaving the original untouched.
    
    raiph 12 years ago
    
    I realize this is drifting off topic, and I could well believe the current Rakudo implementation doesn't actually save the memory (I could believe it does, I just don't know) but P6, for any method, including sort:
    my $a = [2,1]; say $a.sort; say $a # 1 2 2 1 my $a = [2,1]; say $a.=sort; say $a # 1 2 1 2
    This riffs off the general op= syntax:
    my $a = 1; say $a + 1; say $a # 2 1 my $a = 1; say $a += 1; say $a # 2 2
    
    orblivion 12 years ago
    
    I meant I didn't understand the difference between Python's sort(key=..) and what's described here for Perl.
    
    _ZeD_ 12 years ago
    
    what is described for perl is python's sort(cmp=...)
    
    orblivion 12 years ago
    
    Hmm? I thought the whole point of the article was "Most people know about sort(cmp=), which takes two parameters. But check out sort(key=), it only takes one parameter, which means the key function only has to run once per item", but for Perl.
    
    raiph 12 years ago
    
    That's my understanding too.
    Lisp had a nice idiom that Randal Schwartz translated in to Perl 5 in 1994 which became known as the Schwartzian Transform: http://en.wikipedia.org/wiki/Schwartzian_Transform
    It seems from the comments here that Python's key and cmp args are Python's equivalent to the ST. I'm surprised by the comment that Python 3 removed cmp. Perhaps that's a temporary omission.
    This HN is about an advent article for Perl 6, which covers the same territory. Based on a look at the Rakudo code, it only implements a single custom sort closure (equivalent to Python's cmp) OR a single key closure (equivalent to Python's key).
    
    orblivion 12 years ago
    
    > I'm surprised by the comment that Python 3 removed cmp. Perhaps that's a temporary omission.
    Guido and company insist that you can (almost?) always find a way supplying a key function. For better or worse, they decided it was the best way, and going by the Python philosophy they removed anything that isn't the one obvious right way to do it. Perhaps it's available in a library somehow, that's what they did to other things that are unnecessary or useful 99% of the time but may still be 1% of the time.
    Similarly, Guido once almost removed lambda, but he faced an insurrection.
    
    jasomill 12 years ago
    
    From the Python 3.2 standard library functools.py:
    def cmp_to_key(mycmp): """Convert a cmp= function into a key= function""" class K(object): __slots__ = ['obj'] def __init__(self, obj, *args): self.obj = obj def __lt__(self, other): return mycmp(self.obj, other.obj) < 0 def __gt__(self, other): return mycmp(self.obj, other.obj) > 0 def __eq__(self, other): return mycmp(self.obj, other.obj) == 0 def __le__(self, other): return mycmp(self.obj, other.obj) <= 0 def __ge__(self, other): return mycmp(self.obj, other.obj) >= 0 def __ne__(self, other): return mycmp(self.obj, other.obj) != 0 __hash__ = None return K
- comex 12 years ago
  
  And incomprehensibly, Python 3 removed the cmp argument, leaving only unwary sort.

orblivion 12 years ago

One thing this article does not cover is that, even if a comparison function doesn't do anything slow, there is still the fact that it still has to do a Perl function call O(n log n) times. At least in the equivalent with Python, I think it is advised to use the key function rather than comparison funtion for this reason, for speed. Though I guess a key function needs to take more space if it is to memoize.

raiph 12 years ago

A one arg closure to the P6 sort builtin (eg { .lc }) is a key function, not a comparison function. Imo the P6 sort builtin is an elegant rethink of P5's Schwartzian Transform, which was invented in 1994 to address precisely the point you make.
- orblivion 12 years ago
  
  I agree, this wasn't a point about Python vs Perl.
  All I'm saying is that the author bills it as not wanting to run { .lc } twice per comparison. What I'm adding is that the comparison function itself, even if trivial, arguably has a bit of an overhead just from calling it. Thus, having O(n) calls to a key function { .lc } may be better than O(n log n) calls to a comparison function {.lc <=> .lc}.
  At least that's how people convinced me to use key= instead of cmp= in Python.

Grue3 12 years ago

So, it's a feature that has been available in every high-order language since forever, except with more line noise. Typical Perl!

http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec...

tarpden 12 years ago

While it appears that Perl 6 has many impressive features, I'm much more interested in practical matters such as: module versioning, installation, & removal; the state of MoarVM; and the state of the tutorial book.

raiph 12 years ago

After years of the spec being nice but the implementations being very basic, the module versioning and installation implementation in Rakudo is finally shaping up. I anticipate it getting pretty robust over the next 6 months. Imo FROGGS' day 11 advent article is poorly written but might be useful: http://perl6advent.wordpress.com/2013/12/11/day-11-installin...
Rakudo/MoarVM began running this month. As Larry Wall said a few days ago "failed 179 test files ... which ain't too shabby at this point" http://irclog.perlgeek.de/moarvm/2013-12-21#i_8029074
Afaik the Perl 6 book is not really being updated. Maybe this info is of interest to you: http://www.perlmonks.org/?node_id=1033899

Settings

Perl6: Unary Sort

Keyboard Shortcuts