I can’t believe that I can prove that it can sort

blog.adacore.com

270 points by Raphael_Amiard 4 years ago · 123 comments

drdrek 4 years ago

He starts from the wrong axiom that it's hard to prove, and creates a lot of nonsense over that.

It requires just two induction proofs:

- One, that for I=1, after N comparisons the largest number is at position 1 (proven with induction); it's the base case.

- The other, that for any I=n+1, if we assume that the first n slots are ordered, we can treat n+1 as a new array of length N-n and solve using the base case proof.

Talking about computer science and not doing the bare minimum of computer science is peak software engineering :)

  • phkahler 4 years ago

    I thought his goal was to get the prover to prove it without understanding it himself.

    By realizing the low-indexed portion is always sorted, you've already proved the algorithm yourself and the prover is just checking for bugs in your logic.

    I'm not saying the proof isn't valuable, just that it's not magical and actually requires the user to understand the majority of the proof already.

    • leereeves 4 years ago

      This algorithm is surprising and interesting because it doesn't work at all like it first seems to.

      The low-indexed portion is sorted, but isn't guaranteed to contain the lowest or the highest i elements of the list (except when i=1), and the list is ultimately sorted in decreasing, not increasing order. The final sort doesn't occur until the last iteration of the outer loop when the inequality is reversed (the interesting variable, j, is on the right).

      Because of that, the proof outline discussed here doesn't work.

      Consider what happens if the unique smallest element starts at position n. It is placed at the start of the list (the correct final position) in the final iteration of the outer loop (i=n), and not before.

      Proof (for simplicity the list is indexed 1 to n):

      Let A[n] < A[k] for all k != n in the initial list.

      Elements are only swapped when A[i] < A[j]. If i != n and j = n, then A[i] > A[n] = A[j], so A[n] is not swapped.

      Then, when i = n and j = 1, A[i] = A[n] < A[1] = A[j], so A[1] and A[n] are swapped, placing the smallest element in position 1 at last.
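
      For readers following along, here is a minimal C rendering of that case (my addition for illustration, not from the original comment; the paper's 1-based loops mapped to 0-based C). It puts the unique minimum last and prints the array after each outer pass, so you can watch the minimum sit still until the very last iteration:

          #include <stdio.h>

          #define N 5

          int main(void) {
              int a[N] = {4, 3, 6, 9, 0};  /* unique minimum starts at the last position */
              for (int i = 0; i < N; i++) {
                  for (int j = 0; j < N; j++)
                      if (a[i] < a[j]) {   /* the paper's swap condition */
                          int t = a[i]; a[i] = a[j]; a[j] = t;
                      }
                  printf("after i=%d:", i + 1);
                  for (int k = 0; k < N; k++) printf(" %d", a[k]);
                  printf("\n");
              }
              return 0;
          }

      The 0 stays in the last slot through the first four passes and moves to the front only on the fifth, exactly as argued above.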

      • umanwizard 4 years ago

        It’s not true that it’s sorted in decreasing order. See my comment here: https://news.ycombinator.com/item?id=31978942

        • leereeves 4 years ago

          Quite right, that was a silly mistake on my part. I did say the smallest element was placed first, but wrote "the list is ultimately sorted in decreasing, not increasing order" backwards.

          I meant to contradict the top post: "the largest number is at position 1...we can treat n+1 as a new array of length N-n and solve using the base case proof", which would result in decreasing order (largest first).

          • phkahler 4 years ago

            The largest number is at position 1 after the first pass of the inner j loop. In fact it's at the ith position after each completion of the j loop, and it serves to separate the sorted and unsorted parts of the array.

            Beyond that it's just inserting the ith element into the list by finding its place and shifting everything over one position.

  • puffoflogic 4 years ago

    That is an elegant proof... for a totally different algorithm. J starts at 1, not at I. Every element of the array is subject to being moved in every single outer iteration. Besides, it gets the comparison backwards.

    • drdrek 4 years ago

      You are absolutely correct, I was gung ho and got punished with internet shame; below is the correct proof. But the point still stands.

      Solution is even simpler:

      Empty case: after 1 iteration (I=1) the largest number is at position 1.

      Base case: after 2 iterations (I=2) the first 2 elements are ordered, and the largest number is at position 2.

      Assume N case: after N iterations the first N numbers are ordered (within the sub list, not for the entire array) and the largest number is at position N

      N+1 case (I=N+1):

      - For every J<N+1 with A[J] >= A[I], nothing will happen.

      - From the first J where J<N+1 and A[J] < A[I] (denote it x), each cell will swap, as A[J] < A[J+1].

      - At J=N+1, the largest number will move from A[N] to A[I].

      - For every J>N+1, nothing will happen, as the largest number is at A[I].

      Not part of the proof, but to make it clear, we get:

      - for J<x: A[J] < A[I]

      - for J=x: A[J] = A[I]

      - for J>x: A[J] >= A[I], and the list is ordered for the first J elements

    • umanwizard 4 years ago

      No, it works, there's just a bit of an unstated step in the proof.

      After j ranges from 1 to i, it will still be the case that the values from 1 to i are sorted. So you can assume that j starts at i without disturbing the induction condition.

      The reason this is true is... If A[i] is >= every element A[j] for j < i, then it is clearly true. Otherwise, there is some smallest j in 1 to i-1 inclusive such that A[i] < A[j]. Then A[i] will swap with that A[j], then it will swap again with the next A[j+1] (because the original A[j] was < A[j+1] by the induction hypothesis), then with the next element... swapping with every element until j reaches i.

      After this is done we have maintained the condition that A[1] .. A[i] are sorted after j ranges from 1 to i.
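
      That inductive step is easy to spot-check at runtime. A quick C sketch (mine, not from the comment): run the two loops on a random array and assert, after every outer pass, that the prefix a[0..i] is sorted and that a[i] holds the global maximum:

          #include <assert.h>
          #include <stdio.h>
          #include <stdlib.h>

          #define N 100

          int main(void) {
              int a[N], max = 0;
              srand(42);
              for (int k = 0; k < N; k++) {
                  a[k] = rand() % 1000;
                  if (a[k] > max) max = a[k];
              }
              for (int i = 0; i < N; i++) {
                  for (int j = 0; j < N; j++)
                      if (a[i] < a[j]) { int t = a[i]; a[i] = a[j]; a[j] = t; }
                  /* invariant: the prefix a[0..i] is sorted... */
                  for (int k = 1; k <= i; k++) assert(a[k - 1] <= a[k]);
                  /* ...and a[i] holds the maximum of the whole array */
                  assert(a[i] == max);
              }
              printf("invariant held on every outer pass\n");
              return 0;
          }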

      • leereeves 4 years ago

        That is the correct inductive step all by itself.

        That's all we need to know: during each iteration after the first, the algorithm inserts the ith element into the previously sorted list from A[1] to A[i-1], giving a new sorted list from A[1] to A[i], and doesn't touch the rest because A[i] contains the maximum element.

        Then when i=n the whole list is sorted.

  • tromp 4 years ago

    If the (i+1)'st outer loop starts with the first i elements in sorted order, then it ends with the first i+1 elements in sorted order.

    In fact if k of the first i elements are <= A[i+1], then the first k iterations of the inner loop will not swap, and the next i-k iterations will swap. The latter can be seen to maintain order.

    The last n-i iterations of the inner loop (where j>=i) do not affect this reasoning and can be omitted, as can the first iteration of the outer loop.
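
    In code, that trimming gives the following (my sketch of the observation, using 0-based indices): drop the first outer pass and stop the inner loop at j = i, and it still sorts ascending.

        #include <stdio.h>

        /* Trimmed variant: skip the first outer pass and the inner
           iterations with j >= i, per the reasoning above. */
        static void trimmed_sort(int *a, int n) {
            for (int i = 1; i < n; i++)
                for (int j = 0; j < i; j++)
                    if (a[i] < a[j]) { int t = a[i]; a[i] = a[j]; a[j] = t; }
        }

        int main(void) {
            int a[] = {4, 3, 6, 9, 1};
            trimmed_sort(a, 5);
            for (int k = 0; k < 5; k++) printf("%d ", a[k]);  /* 1 3 4 6 9 */
            printf("\n");
            return 0;
        }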

  • tonmoy 4 years ago

    Note that the comparison is less-than, but the algorithm sorts in descending order.

    • umanwizard 4 years ago

      Not true.

      Clojure code:

          (defn swap [items i j]
            (assoc items
                   i (items j)
                   j (items i)))
          
          (defn weirdsort [items]
            (let [ct (count items)
                  indices (for [i (range ct)
                                j (range ct)]
                            [i j])]
              (reduce (fn [items [i j]]
                        (let [ith (nth items i)
                              jth (nth items j)]
                          (if (< ith jth)
                            (swap items i j)
                            items)))
                  items indices)))
      
      Then in the repl:

          my-namespace> (def items (shuffle (range 20)))
          #'my-namespace/items
          my-namespace> items
          [19 0 13 16 14 18 15 7 10 17 9 11 6 1 3 4 12 2 5 8]
          my-namespace> (weirdsort items)
          [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
      • tonmoy 4 years ago

        Sorry, I meant to write ascending. It is weird that it swaps when A[i] < A[j], but the list ends up ascending.

smusamashah 4 years ago

Here is how it looks. https://xosh.org/VisualizingSorts/sorting.html#IYZwngdgxgBAZ...

If you compare it with both insertion sort and bubble sort, you can see it looks more like insertion sort than bubble sort.

  • iib 4 years ago

    I guess it can be thought of as an unoptimized insertion or bubble sort.

    I think it is very possible to write this algorithm by mistake in intro compsci classes when you try to code a bubble sort from memory. I would think TAs have seen many such instances in their students' homework.

    • naniwaduni 4 years ago

      There's a surprisingly large class of "sorts people accidentally write while intending to write a bubble sort".

      This one is kind of special, though, since it's somehow more offensive to intuition than bubble sort itself.

      • Dylan16807 4 years ago

        Bubble sort is offensive to intuition? I would have said it was the most intuitive, because each step is very simple and you only have to remember one numeric variable in your core loop.

        • naniwaduni 4 years ago

          Bubble sort's inner loop is so hilariously pessimal that it's incredibly easy to accidentally write an insertion sort, because your intuition tells you it can't possibly be intended to be that bad.

    • mmcgaha 4 years ago

      I am guilty. I wrote this sort for a gnu screen session menu years ago and even named my function bubsort.

  • bjterry 4 years ago

    The video from the article (https://www.youtube.com/watch?app=desktop&v=bydMm4cJDeU) is much better because it highlights the index of the outer loop, which is unclear from the cascading visualization there. By seeing the indexes it becomes clear that (1) in the area before the outer index, every value gets swapped in and out of the outer loop location to put them in order, and (2) at the end of the first outer loop iteration, the largest element will be at the location of the outer loop index, and so everything to the right is unchanged in each iteration.

    • smusamashah 4 years ago

      This is nice. I had a way to highlight indexes at one point in this tool. Got rid of it when I was trying to add a way to visualize any custom algo. I should add it back. It adds more value/info to the visualization.

  • calmingsolitude 4 years ago

    And here's a version I wrote that visualizes it on your terminal https://github.com/radiantly/simplest-sort

  • raldi 4 years ago

    Is the visualization supposed to animate? I can’t figure out how to make it start.

twawaaay 4 years ago

I actually used this algorithm a decade ago to implement a log-based, transactional database system for an embedded system with a very small amount of memory and a requirement that all memory be statically allocated.

To the frustration of the rest of the development team, who first called me an idiot (I was new) and then could not make quicksort run as fast on inputs that were capped at something like 500 items. Apparently, algorithmic complexity isn't everything.

  • dahart 4 years ago

    You sure it was this algorithm and not Bubble Sort?

    > algorithmic complexity isn’t everything

    Yeah very true. Or at least hopefully everyone knows that complexity analysis only applies to large n, and small inputs can change everything. In console video games it was very common to avoid dynamic allocation and to use bubble sort on small arrays. Also extremely common to avoid a sort completely and just do a linear search on the (small) array while querying; that can end up being much faster than sorting and binary searching, especially when the total number of queries or the number of queries per frame is also low.
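
    As a concrete sketch of the query-without-sorting idea (my code, not from the comment; linear_find is a hypothetical helper): for a handful of elements, one short predictable scan avoids paying for a sort at all.

        #include <stddef.h>
        #include <stdio.h>

        /* Linear scan of a small, unsorted array. For small n (and few
           queries) this is often faster than sorting once and binary
           searching, because the loop is short and branch-friendly. */
        static ptrdiff_t linear_find(const int *a, size_t n, int key) {
            for (size_t k = 0; k < n; k++)
                if (a[k] == key) return (ptrdiff_t)k;
            return -1;  /* not found */
        }

        int main(void) {
            int a[] = {19, 0, 13, 16, 14};  /* small and unsorted */
            printf("%td\n", linear_find(a, 5, 16));  /* prints 3 */
            return 0;
        }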

    • twawaaay 4 years ago

      Oh yeah, I did linear search a lot, to the same consternation of the other devs.

      The issue was that the device was so low on memory (2MB of unified NV and flash for code, files, and operating memory) that there simply wasn't enough space to hold enough data to be any problem for the 20MHz ARM7 controller, as long as you did not do anything stupid. 600kB of it was used by OpenSSL and a further 300kB by the operating system anyway (I am aware of how funny it sounds that OpenSSL takes twice as much space as the OS).

      • eranation 4 years ago

        That is pretty awesome, if you came up with this exact algorithm a decade ago (it was published in 2021, but apparently it was in the wild in 2015 [1], and I assume discovered earlier, but no one thought it was interesting enough to publish). Why didn’t you use bubble sort / insertion sort? (The algorithm in the paper looks like bubble sort at first glance but is basically a clever insertion sort.) What was the benefit of using it this way?

        [1] https://cs.stackexchange.com/questions/45348/why-does-this-s...

        • twawaaay 4 years ago

          I am not sure how awesome it is. It looks like something you could stumble on by accident at a whiteboard in an interview and use successfully even if you don't know exactly why it works.

          Honestly, I always thought it was common knowledge, and just too simple, and for this reason omitted from books.

          People in CS/IT tend not to spend a lot of time on algorithms with bad complexity, so I am used to people discounting algorithms with useful properties just because there is another one that is a bit more efficient as n goes to infinity.

          • Izkata 4 years ago

            Yeah, this sorting algorithm is something you can come up with by accident if you do bubble sort wrong. I wouldn't be surprised if a lot of beginners came up with this one in intro courses. I think I saw it around 2006-2008 when I was a TA and one of my students couldn't figure out why his array got sorted in the wrong order (don't remember for sure though).

            For example, from 2016, here's someone not getting the difference between the two: https://stackoverflow.com/questions/40786409/whats-the-diffe...

          • yannickmoy 4 years ago

            For a related eye-opening result, with both practical application and interesting research backing it, see this talk on "Quicksorts of the 21st century" at Newton Institute on Verified Software last year: https://www.newton.ac.uk/seminar/30504/

            TLDR; practical complexity computation needs to take into account things like memory access latency on different architectures.

      • astrange 4 years ago

        That’s not surprising; OpenSSL has terrible code quality and, besides, was written in the era of thinking you should unroll all your loops so they go faster.

  • delecti 4 years ago

    I feel like anyone who was surprised that algorithmic complexity isn't everything probably didn't totally understand it. The assumptions (like ignoring constants) are straight out of calculus limits. That (+10000) on the end doesn't mean anything if you're sorting an infinite list, but it means a lot if you're sorting 15 (or in your case 500) entries.

    • twawaaay 4 years ago

      Well, it actually is kinda worse (or better, depends how you look at it).

      It is not necessarily +10000, it can also be something like x5000.

      Because CPUs really, really, really like working through short, simple, predictable loops that traverse data in a simple pattern, and hate when that is interrupted by something like dereferencing a pointer or looking up a missing page.

      So your super complex and super intelligent algorithm might actually be good only on paper, while doing more harm to your TLB, prefetcher, branch predictor, instruction pipeline, memory density, etc.

      So there is this fun question:

      "You are generating k random 64 bit integers. Each time you generate the integer, you have to insert it in a sorted collection. You implement two versions of the algorithm, one with a singly-linked list and one with a flat array. In both cases you are not allowed to cheat by using any external storage (indexes, caches, fingers, etc.)

      The question: in your estimation, both algorithms being implemented optimally, what magnitude k needs to be for the linked list to start being more efficient than array list."

      The fun part of this question is that the answer is: NEVER.

      • simiones 4 years ago

        > The fun part of this question is that the answer is: NEVER.

        Are you claiming that it is faster in practice to insert an element at the beginning of an array with 1G items, than it is to insert it at the beginning of a linked list with 1G items?

        Or that, on average, the time spent traversing the linked list (assuming an average number of cache misses) exceeds the time taken to move all the elements in the array one to the right?

        I am certain the first item will not be true, and the second may or may not be true depending on many many other details (e.g., in a language with a compacting GC, the linked list elements will often be stored approximately like in the array case, and you may not pay too high a cost for cache misses).

        • twawaaay 4 years ago

          > Are you claiming that it is faster in practice to insert an element at the beginning of an array with 1G items, than it is to insert it at the beginning of a linked list with 1G items?

          You missed that the insertion point is selected at random and must be found as part of the operation. But in practice, if you know data will be inserted preferentially at the wrong end, you can just reverse the data structure to make insertions prefer the better end.

          > I am certain the first item will not be true

          I am used to it. Everybody gets it wrong until they spend some time thinking about it, get it explained and/or run a real life test.

          Hint: think about the cost of iterating over half of the linked list versus performing binary search over the array AND moving half of it.

          The cost of the binary search becomes negligible pretty fast, and the cost of moving 8 bytes of memory is always going to be lower than the cost of iterating over one entry (16 bytes) of the linked list. And you have a statistically equal number of both, assuming you are selecting values at random with an equal chance for each 64-bit integer to be the next choice.

          The CPU can be moving a lot of bytes at the same time, but it has to dereference a pointer before it can perform the comparison, before it can dereference the next pointer...

          Actual algorithmic complexity of both algorithms is exactly the same. But an array is much more efficient (instructions per item).
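
          A sketch of the two contenders (my code, for concreteness, not from the thread): the array insert pays one binary search plus one memmove, while the list insert pays one pointer dereference per element it walks past.

              #include <stddef.h>
              #include <stdint.h>
              #include <string.h>

              /* Sorted insert into a flat array: binary search + one memmove. */
              static void array_insert(uint64_t *a, size_t *n, uint64_t v) {
                  size_t lo = 0, hi = *n;
                  while (lo < hi) {  /* find the insertion point */
                      size_t mid = lo + (hi - lo) / 2;
                      if (a[mid] < v) lo = mid + 1; else hi = mid;
                  }
                  memmove(a + lo + 1, a + lo, (*n - lo) * sizeof *a);
                  a[lo] = v;
                  (*n)++;
              }

              struct node { uint64_t v; struct node *next; };

              /* Sorted insert into a singly-linked list: walk to the spot,
                 one dereference per element passed. */
              static void list_insert(struct node **head, struct node *fresh) {
                  struct node **p = head;
                  while (*p && (*p)->v < fresh->v)
                      p = &(*p)->next;
                  fresh->next = *p;
                  *p = fresh;
              }

              int main(void) {
                  uint64_t arr[8];
                  size_t n = 0;
                  array_insert(arr, &n, 42);
                  array_insert(arr, &n, 7);  /* arr is now {7, 42} */

                  struct node a = {42, 0}, b = {7, 0};
                  struct node *head = 0;
                  list_insert(&head, &a);
                  list_insert(&head, &b);    /* list is now 7 -> 42 */
                  return 0;
              }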

          • kmonsen 4 years ago

            You are right and this is well known, but not only for the reason you state. Cache locality is so much better on an array/vector compared to a linked list, and that is really important.

            • twawaaay 4 years ago

              Good point. Cache locality is another important reason that I should have mentioned. I have not run tests, but it may possibly be the most important reason of them all.

              In many linked data structures you can mitigate a lot of this problem. For example, in the past I have used contiguous arrays where I would allocate linked elements, then I would try to exploit how the elements are produced to get them placed one after another, if possible.

              When data is produced randomly you can try to create "buckets", where each bucket is backed by an array and attempts to keep consecutive elements relatively close to each other. You could then try to rewrite those buckets in a way that gets amortised. Or you can just straight write it as if it was an array of sorted linked list elements where on each delete or insert you have to move some of the elements, but you only need to reorganise one bucket and not others.

              All a bit cumbersome and at the end of it you are still wasting at least half of the cache on pointers.

              • kmonsen 4 years ago

                Here is a video by Bjarne himself explaining this: https://youtu.be/YQs6IC-vgmo

                Note there is a use case for linked list, inserting/deleting one or few items once in a while. But in general you can just use vector for anything that is not a leetcode problem.

              • jasonwatkinspdx 4 years ago

                A fun related thing I first learned about years ago due to a Dr Dobbs article: some strided access patterns are FAR worse than random access patterns. This is because certain strides hit the worst case behavior of limited associativity caches, dramatically reducing the effective cache size.

          • simiones 4 years ago

            > The cost of binary search goes to negligible pretty fast and cost of moving 8 bytes of memory is always going to be lower than the cost of iterating over one entry (16 bytes) of linked list. And you have statistically equal number of both assuming you are selecting values at random with equal chance for each 64 integer to be next choice.

            That makes a lot of sense, thank you for the more detailed explanation.

          • jwilk 4 years ago

            I think you misunderstood the question.

            If you insert an element at the beginning, there's no list iteration cost.

            • twawaaay 4 years ago

              Please, read the question.

              It stipulates that elements are selected at random.

              You select k integers at random.

              Some of them might be inserted faster into linked list, but when you average it over k integers you will still be slower because inserting into array will be faster, on average.

              Just think what happens if the integer is inserted at the END of the list (same probability...) You need to slowly iterate over entire linked list. While you quickly land at the result with a binary search on an array and then pay almost no cost inserting it there.

              If you think about it, ON AVERAGE, you have to search through half of the list (if you are linked list) or move half of the array (if you are an array).

              • jwilk 4 years ago

                The parts of simiones's message you quoted are not about the average case. They are about the corner case where an item would be added at the beginning, meaning the linked list has the biggest advantage.

                • twawaaay 4 years ago

                  You don't get to change the rules of the game while the game is being played.

                  The original problem, as stated by me, was:

                  "You are generating k random 64 bit integers. Each time you generate the integer, you have to insert it in a sorted collection.

                  (...)

                  The question: in your estimation, both algorithms being implemented optimally, what magnitude k needs to be for the linked list to start being more efficient than array list."

                  The cost is not evaluated for any individual insertion, only for the total cost of inserting k integers.

                  For some of the integers the cost of inserting to linked list will be higher, and for some the cost of inserting to array will be higher. It does not matter. We do not care. We just care about what is the cost of inserting k integers thus randomly generated a) into sorted linked list and b) into sorted array list.

      • layer8 4 years ago

        Well, there is one case where a linked list may be faster: when, due to memory fragmentation, you are unable to allocate k contiguous elements (hence efficiency would drop to zero) but you would still be able to allocate k nodes (or k/n nodes with n elements each). It can therefore make sense to use a linked list of arrays with some sensible limit on the array size (e.g. in the gigabyte range). As long as a smaller number of elements is used, the linked list will only have one node and degenerate to the array solution.

        • twawaaay 4 years ago

          That would only be the case on a very old operating system (think MS-DOS?).

          On new operating systems, process address space is pieced together independently of physical space. So you can have a contiguous, multi-GB array even if you don't have a contiguous, multi-GB physical region.

          As you read/write your virtual address space, the CPU detects that there is no TLB mapping and traps into the operating system, which behind the scenes finds a page, returns the mapping, and continues as if nothing ever happened.

          This is why a process nowadays can allocate any amount of memory -- more than there is physical memory on the machine. It only starts grinding to a halt when you actually try to use it all. (This actually is one of my interview questions -- Can a process allocate more memory than is physically available on the machine? Why does it work?)
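
          A minimal way to see that (my sketch; the outcome assumes a Linux-style overcommitting kernel, so results vary with OS and settings):

              #include <stdio.h>
              #include <stdlib.h>

              int main(void) {
                  /* Ask for 1 TB of address space, far more than most
                     machines have physically. On an overcommitting OS
                     this typically succeeds, because pages are only
                     backed when first touched. */
                  size_t sz = (size_t)1 << 40;
                  char *p = malloc(sz);
                  printf("malloc(1 TB): %s\n", p ? "succeeded" : "failed");
                  /* Touching it all (e.g. memset(p, 0, sz)) is what would
                     force physical pages -- and likely the OOM killer --
                     so we don't. */
                  free(p);
                  return 0;
              }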

  • texaslonghorn5 4 years ago

    If I understand correctly, you would just run the (n+1)th outer loop to insert each new item?
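
    That reading matches the analysis elsewhere in the thread: with a[0..n-1] already sorted, appending the new value and running one pass of the inner loop slots it into place. A sketch (mine; insert_one is a hypothetical helper, not from the original comment):

        /* Given a[0..n-1] already sorted, place v at a[n] and run one
           pass of the inner loop with i = n; afterwards a[0..n] is sorted. */
        static void insert_one(int *a, int n, int v) {
            a[n] = v;
            for (int j = 0; j <= n; j++)
                if (a[n] < a[j]) { int t = a[n]; a[n] = a[j]; a[j] = t; }
        }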

roenxi 4 years ago

If it looks "obviously wrong" the quick proof that it works ... the j loop will move the largest unsorted element in the array into a sorted position. Since the i loop executes the j loop n times, the array must be sorted (since the n largest elements will be in the correct order).

EDIT

^W^W^W

Nope, I'm wrong. I did a worked example on paper. I think the devious thing here is the algorithm looks simple at a lazy glance.

To sort: 4-3-6-9-1, next round pivot for the i loop in [].

    [4]-3-6-9-1
    9-[3]-4-6-1
    3-9-[4]-6-1
    3-4-9-[6]-1
    3-4-6-9-[1]

    1-3-4-6-9 & sorted

I can see that it sorts everything to the left of a pivot, then because it does that n times it arrives at a sorted list. A reasonable proof will be more complicated than I thought.
  • blueflow 4 years ago

    This somewhat looks like bubble-sort minus the early exit.

    • zajio1am 4 years ago

      It is just a quirky insertion sort. In each outer cycle, it inserts the value originally at position A[i] into the already-sorted sequence A[1]..A[i-1], while using A[i] as a temporary variable to shift the higher part of the sorted sequence one position up.

    • MrYellowP 4 years ago

      It is BubbleSort. At least that's how I learned it in the '80s.

  • phkahler 4 years ago

    The list on the left (index less than i) is always sorted. The ith element is inserted by the j loop and the rest of the list is shifted right by one element by repeated swapping with the ith position. Nothing to the right changes because the ith element is the max of the entire list, which seems to be a red herring for analysis.

ufo 4 years ago

The comments so far have tended to focus on the proof itself, but for me the coolest part of the blog post was the formal methods.

Does anyone here also use SPARK for this sort of thing? Are there other formal methods tools you'd use if you had to prove something like this?

  • im3w1l 4 years ago

    Yeah, I see this as a very interesting reply to the TLA+ post. Might spend my evening diving into Ada and Spark.

    Edit: Some things I noticed. The package gnat-12 does not have gnatprove. Ada mode for emacs requires a compilation step that fails with GNAT Community Edition. With Alire there is no system gnat, so it cannot compile it (quite possibly I'm missing something). In the end I gave up on using emacs. GNAT Studio wouldn't run for me until I realized it needed ncurses. It also had some unhandled exceptions for me (raised PROGRAM_ERROR : adjust/finalize raised GNATCOLL.VFS.VFS_INVALID_FILE_ERROR: gnatcoll-vfs.adb:340), but in the end I managed to get it up and running.

    Edit2: After playing around with it, I'm extremely impressed with what SPARK can do. I made a function to add 7 to numbers. Tried putting a postcondition that the return value is bigger than the input. "Nuh uh, it could overflow." OK, so I add a precondition that numbers must be less than 100. "Uhm, buddy, you are passing in 200 right there." This is useful stuff for real, and easy to write too.

    • yannickmoy 4 years ago

      You can ask Alire to install the latest GNAT it provides, or to use another version installed on your machine; see https://alire.ada.dev/transition_from_gnat_community.html

      It also explains how to install GNATprove: ``alr with gnatprove``

      • im3w1l 4 years ago

        Sorry if it was confusing, I kind of jumped between the issues I had with various approaches. I did manage to get gnatprove through Alire with just that command; it was the apt gnat that didn't have gnatprove. What I wasn't sure how to do correctly with the Alire install was

          cd ~/.emacs.d/elpa/ada-mode-i.j.k
          ./build.sh
          ./install.sh
        
        Actually, I didn't get that working with any of the options I tried, I guess.
  • layer8 4 years ago

    I predict that one day it will be common for this to be a builtin language feature.

Stampo00 4 years ago

I've been watching those mesmerizing YouTube videos visualizing sorting algorithms lately. The header of this article uses a screen cap from one of them.

Them: So what shows do you watch?

Me: ... It's complicated.

There are a lot of different sorting algorithms. Like, a lot, a lot.

As I watch them, I try to figure out what they were optimizing for. Some only scan in one direction. Some only use the swap operation. Some seem to do the minimum number of writes. Some are incremental performance improvements over others.

When I see an algorithm like this, I don't assume the person who wrote it was an idiot. I assume they were optimizing for something that's not obvious to me. Its only modifying operation is swap, so maybe that operation is faster than an arbitrary insert for whatever system or data structure they're using. There are no temporary variables besides loop counters, so maybe they're on a memory-constrained environment. There's barely any code here, so maybe this is for a microcontroller with precious little ROM. Or maybe they're applying this as a binary patch and they have a strict ceiling to the number of ops they can fit in.

Or maybe it's just the first sorting algorithm they could think of in an environment that doesn't ship with one and the performance is adequate for their needs. In that case, it's optimized for developer time and productivity. And honestly, it's a far more elegant algorithm than my "naive" version would be.

These are all valid reasons to use a "naive" sorting algorithm.

eatonphil 4 years ago

This is a fantastic post! It's great to have a concrete example of proving a non-trivial algorithm. And somehow this Ada code feels more approachable to me than HOL4 and other functional proof assistants I've looked at.

While it's been on my list for a while, I'm more curious to try out Ada now.

  • Jtsummers 4 years ago

    It'll take you less than a week to get proficient enough in it to make something useful, or at least interesting. It's a very straightforward language, overall.

    https://learn.adacore.com/ - Pretty much the best free source until you're actually ready to commit to the language. The "Introduction to Ada" course took me maybe a week of 1-2 hours a day reading and practicing to go through. There's also a SPARK course that takes a bit longer, but is also interactive.

    The language reference manuals for 2012 and 202x (which should become Ada 2022):

    http://www.ada-auth.org/standards/ada12.html

    http://www.ada-auth.org/standards/ada2x.html

    • touisteur 4 years ago

      Also, don't hesitate to use Rosetta Code to find useful little snippets. Not all perfect but a good jump-start.

      • Jtsummers 4 years ago

        Yep, I wrote a bunch of those RC examples too. It was a useful exercise when I decided to learn Ada (and with other languages, too).

  • touisteur 4 years ago

    Two of the most interesting things that trickled down during the design of SPARK 2014:

    1- Contracts are written in the same language as the code to be checked/proved. This made it important to add expressive features to the language (for all / for some: quantifiers! Expression functions, if- and case-expressions, and more recently the new delta-aggregates notation): these additions make the language far more expressive without too much loss of readability.

    2- Executable contracts: most contracts can be checked at runtime or proved. And the escape hatch (code only present for proof) is 'ghost' code, which is also a nice addition.

    Lots of little nifty additions to the language, from just 'thinking in contracts' or 'designing for provability'.

jwilk 4 years ago

The paper discussed on HN:

https://news.ycombinator.com/item?id=28758106 (318 comments)

OJFord 4 years ago

Why is this algorithm at all surprising?

(I'm really not trying to brag - I assume I must be missing something, that I'm naïve to think it obviously works, and it actually works for a different reason.)

In plain English - 'for every element, compare to every element, and swap position if necessary' - .. of course that works? It's as brute force as you can get?

  • Asooka 4 years ago

    The comparison is not the way you'd expect. Index j goes past index i, but the comparison doesn't change. What you propose is a condition more like

        if ((a[i] < a[j]) != (i < j)) swap(a, i, j);
    
    Which obviously works.

    The surprising algorithm sorts even though it swaps elements that are already ordered.

    • OJFord 4 years ago

      OK, I think I see why it's a bit weird: a '1,2,3: when i=1 and j=3 it swaps them anyway' sort of thing?

      But the i-loop comes through 'afterwards', so when i=3 (value now 1) and j=1 (value 3) it sets them straight.

      It still seems quite intuitive, but I think I cheated by skimming over it initially and thinking it was clear, whereas actually I've thought about it more now.

      (Not to compare myself to anything like him, I'm no mathematician at all, but I'm reminded of an amusing Erdős anecdote in which he apparently paused mid-sentence in some lecture, left the room, and came back some time later continuing 'is clearly [...]' or similar!)

  • YetAnotherNick 4 years ago

    If the second loop ran from j = i to n, it would be easy to see that it sorts in decreasing order. But notice that j goes from 1 to n, and then suddenly it sorts in increasing order.

  • shp0ngle 4 years ago

    Look again at the comparison direction.

    It is opposite!

    It works but not as you think it does.

donatj 4 years ago

Is this not a standard sorting algorithm? If I'm not mistaken, this is my coworker's go-to sort when writing custom sorts.

  • Jtsummers 4 years ago

    It manages to be more inefficient than most, and will even disorder and then reorder an already sorted input. Bubble sort (a very low bar to beat) doesn't even do that. If you've only got a small number of items, it doesn't matter unless you're sorting a small number of items a large number of times.

prionassembly 4 years ago

I taught myself this exact algorithm at 8 or 10 by sorting playing cards (we established an order for the suits: diamond < leaf < spades < heart) on a very, very, very long plane trip where I had neglected to bring comic books or something.

asrp 4 years ago

After each outer loop iteration, A[:i] (the array up to i) is in ascending order, with the max of A at A[i].

This is true after the first iteration, since max(A) is eventually swapped to A[1]. It is true in subsequent iterations since, during the ith iteration, the algorithm inserts the next element, initially at A[i], into A[:i-1], shifting everything above it up via swaps through A[i], so A[:i] is sorted, with the max of A moved to A[i]. After that, no more swaps happen in the ith iteration, since A[i] contains the max of A.

fweimer 4 years ago

Isn't the proof incomplete because it does not ensure that the result is a permutation of the original array contents? Just overwriting the entire array with its first element should still satisfy the post-condition as specified, but is obviously not a valid sorting implementation.

  • touisteur 4 years ago

    It was touched upon in the post: 'Tip: Don’t prove what you don’t need to prove' (and surrounding text). With a link on another proof for another sorting algorithm.

    • yannickmoy 4 years ago

      and there are actually multiple ways to prove that the result is a permutation of the entry: either by computing a model multiset (a.k.a. bag) of the value on entry and exit and showing they are equal, or by exhibiting the permutation to apply to the entry value to get the exit value (typically by building this permutation as the algorithm swaps values, in ghost code), etc.

      but none of this makes a good read for people not already familiar with program proof tools.

nephanth 4 years ago

The article says it is better to actually read the paper instead of trying to figure it out.

I disagree with that. It is a good exercise to try to prove it by yourself, and it was actually quite fun.

The main mistakes the author makes, though:

- Trying to prove it works without first understanding why / how it works. Always simulate test runs on paper / in your head first.

- Trying to prove it on the machine first. Always make sure you can do the proof on paper first. Then and only then should you battle with your verifier / solver to get it to understand your proof.

MattPalmer1086 4 years ago

Interesting. It is a bit counter intuitive at first but not too hard to see how it works.

After the first main loop, the first item will be the biggest item in the list.

The inner loop, as it always starts from 1 again, will gradually replace the bigger items with smaller items, and so on.

  • JadeNB 4 years ago

    > After the first main loop, the first item will be the biggest item in the list.

    > The inner loop, as it always starts from 1 again, will gradually replace the bigger items with smaller items, and so on.

    I think the "and so on" is the point. To put it more precisely, I think that anyone (with a suitable familiarity with the tools) can prove formally that "after the first main loop, the first item will be the biggest item in the list"; but I think that it might take a bit more work to prove the "and so on" in a way that satisfies a formal prover.

    (As to the intuition—I can buy that the "and so on" part is reasonably intuitive, but only at a level of intuition that can also believe wrong things. For example, just always replacing bigger items with smaller items doesn't guarantee you'll get a sorted list!)

    • MattPalmer1086 4 years ago

      Sure, formal proof would be a lot harder. I just didn't find it quite as surprising that it works as the article implied.

fdupress 4 years ago

Every time somebody proves that a sorting algorithm sorts, they forget to prove that the algorithm doesn't just drop values or fill the entire array with 0s (with fixed-length arrays, as here).

On paper: "it's just swaps" Formally: "how do I even specify this?"

(For every value e of the element type, the original array and the final array have the same number of occurrences of e. Show that that's transitive, and show it's preserved by swap.)
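
A runtime analogue of that specification, as a test oracle rather than a proof (my sketch): compare the two arrays as multisets by sorting private copies with a trusted sort.

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static int cmp_int(const void *x, const void *y) {
        int a = *(const int *)x, b = *(const int *)y;
        return (a > b) - (a < b);
    }

    /* True iff b is a permutation of a: sort copies of both and
       compare element-wise. Counting occurrences works too. */
    static bool same_multiset(const int *a, const int *b, size_t n) {
        int *ca = malloc(n * sizeof *ca);
        int *cb = malloc(n * sizeof *cb);
        bool eq = ca && cb;
        if (eq) {
            memcpy(ca, a, n * sizeof *ca);
            memcpy(cb, b, n * sizeof *cb);
            qsort(ca, n, sizeof *ca, cmp_int);
            qsort(cb, n, sizeof *cb, cmp_int);
            eq = memcmp(ca, cb, n * sizeof *ca) == 0;
        }
        free(ca);
        free(cb);
        return eq;
    }

    int main(void) {
        int before[] = {4, 3, 6, 9, 1};
        int after[]  = {1, 3, 4, 6, 9};
        puts(same_multiset(before, after, 5) ? "permutation" : "not a permutation");
        return 0;
    }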

  • touisteur 4 years ago

    Hi, co-author here. We actually touched upon this subject in the article: 'Tip: Don’t prove what you don’t need to prove' and surrounding text. Also previous comment by Yannick: https://news.ycombinator.com/item?id=31980126

    • fdupress 4 years ago

      Looked for it and missed it.

      And thanks for writing this up, by the way. Having people less familiar with formal methods try and write up their experiences is, I think, much more effective at letting people in than having experts try and write tutorials.

jeffhuys 4 years ago

Small sort implementation in ES6:

   const args = process.argv.slice(2).map(Number);

   for (let i = 0; i < args.length; i++)
     for (let j = 0; j < args.length; j++)
       if (args[i] < args[j])
         [args[i], args[j]] = [args[j], args[i]];

   console.log(args.join(' '));
Didn't expect it to actually be that simple, but now that I've written it, it makes complete sense.

To PROVE it is another thing. Good work.

zoomablemind 4 years ago

TFA seems to be a somewhat wasteful version of an algo which in my mind I call 'sinkersort' or 'minsort':

  for (i=0; i<n; ++i)
    for (j=i+1; j<n; ++j)
      if (a[j] < a[i])
        swap(a[j], a[i]);
Here the inner loop basically finds the min value of the remaining subset. This progressively fills the array in sorted order.

The number of comparisons is bounded by n^2/2, vs. n^2 for TFA.

  • palunon 4 years ago

    Yours looks like selection sort but with more swapping.

    The one in TFA isn't, for one crucial reason: the comparison is the other way around... And yet, it sorts.

    • zoomablemind 4 years ago

      By the way, the TFA algo is equivalent to:

        for (i=0; i<n; ++i)
          for (j=0; j<(i ? i : n); ++j)
            if (a[i] < a[j])
              swap(a[i], a[j]);
      
      Once the max has been swapped in, further inner iterations past i (the position of the max) are redundant.

      However, we can go even further:

        for (i=0; i<n; ++i)
          for (j=0; j<i; ++j)
            if (a[i] < a[j])
              swap(a[i], a[j]);
      
      This dispenses with the explicit max swap and simply does progressive insertion.

      Curiously, in this transformed form the algo echoes the 'selection sort' from my previous post, and performs equivalently too (w.r.t. number of comparisons and swaps).

    • zoomablemind 4 years ago

      > Yours looks like selection sort but with more swapping...

      Well, shooting for simplicity (as posited in Fung's arXiv paper) -- this one is on par with TFA and somewhat simpler in expression than selection sort.

      This is also direct, less 'magical', and indeed less wasteful than TFA. So for those roll-your-own-sort moments I'd rather remember this one instead.

ki_ 4 years ago

a "new sorting algorithm"? Pretty sure it existed 50 years ago.

carlsborg 4 years ago

Vaguely remember seeing this somewhere in the first few chapters of Spirit of C by Mullish and Cooper.

  • petercooper 4 years ago

    Can confirm. It's on page 243, in the "Arrays" chapter, and is referred to as the "exchange sort." The actual code is:

        /* Sort the array */
        for (i = 0; i < array_size - 1; i++)
            for (j = i + 1; j < array_size; j++)
                if (array[i] > array[j])
                {
                    int temp = array[i];
                    array[i] = array[j];
                    array[j] = temp;
                }
    
    So that's from the late 80s.
    • Jtsummers 3 years ago

      That's not the same sort as in the article. Two key differences:

      1. j in the article runs from 0 to array_size-1 (if done in C like this, in the article it's a 1-based array so 1 to array_size). This sort has j run from i+1 to array_size-1.

      2. The swap condition is reversed. In the article's sort the swap happens when array[i] < array[j].

hobo_mark 4 years ago

Wow, I wish we had these built-in provers in VHDL (which is basically Ada in sheep's clothing).

MrYellowP 4 years ago

> published at the end of 2021

I'm so confused. That "new" algorithm is just BubbleSort?

  • Jtsummers 4 years ago

    As addressed in other comments in this thread and past threads on this sort (when the paper came out), it is not bubble sort. Bubble sort only compares adjacent items and swaps them if they are in the wrong order. This has the effect of "bubbling" the largest value to the top (the way it's often written, could also reverse it to push the smallest down first).

    This sort is like a worse-performing version of insertion sort. But insertion sort creates a partition (and in doing this performs a sort) of the first i+1 items, with i increasing until the entire sequence is sorted. This one will scan the entire sequence every time in the inner loop instead of just a subset of it, and usually performs more swaps.
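
    For contrast, here is the adjacent-swap inner loop that defines bubble sort (my sketch, not from the comment); since it only ever compares neighbors, a sorted input is never disturbed:

        #include <stdio.h>

        static void bubble_sort(int *a, int n) {
            for (int i = 0; i < n - 1; i++)
                for (int j = 0; j + 1 < n - i; j++)
                    if (a[j] > a[j + 1]) {  /* neighbors only */
                        int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
                    }
        }

        int main(void) {
            int a[] = {4, 3, 6, 9, 1};
            bubble_sort(a, 5);
            for (int k = 0; k < 5; k++) printf("%d ", a[k]);  /* 1 3 4 6 9 */
            printf("\n");
            return 0;
        }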

FrozenVoid 4 years ago

Combsort is a far more elegant and faster algorithm. I wrote a type-generic combsort a while ago here: https://github.com/FrozenVoid/combsort.h

(Combsort, like the algorithm mentioned, also consists of two loops and a swap.)

  • JadeNB 4 years ago

    I don't think that the goal here is to show a fast and elegant sort, but rather to show that a sorting algorithm that seems like it can't possibly work actually does. That is, probably no-one will learn from this article how to sort better, but hopefully people will learn from this article how to formally prove things (e.g., about sorting) better.

    • touisteur 4 years ago

      Yes (co-author here) that was exactly the point. Thanks for putting it clearly.

ermir 4 years ago

Isn't the sorting algorithm in question the famous BubbleSort? I understand the value of formally proving it works, but why is the name mentioned nowhere?

dgan 4 years ago

So, Lionel knew how to implement a 3D renderer, but had no clue about O(n) complexity or other sorting algorithms? I am baffled.

  • Agingcoder 4 years ago

    The exact same thing happened to me, and this is how I discovered algorithmic complexity. Sorting my triangles took forever (I saw it in the profile, taking a whopping 90% of the time), and I eventually figured there might be proper sorting algorithms out there.

    I was at the same time happily churning out assembly code, talking to the vga card, writing a dos extender, etc.

    You can actually do quite a few things without formal education!

  • touisteur 4 years ago

    Well, hi there, it me. Back then I was 17 and wading through software rendering: (light) mesh tessellation, triangle projection (twice, for a stereoscopic anaglyph renderer), triangle filling, texture mapping, with at most 300 objects, all in crappy, crashy C (first with VESA, then SDL eased my life...) with lots of misunderstandings about pointers.

    I was welllll in over my head with complex stuff, and sorting didn't appear in the profiles, so... I guess you can call that profile-guided learning? I had the formal training later on, and even then it didn't stick until I faced the actual complexity problem head-on.

    I'll never forget that whole weekend with my AMD Athlon 900 at 100% CPU sorting a 200,000-word dictionary... It was still not finished on Monday. I implemented (a very primitive) insertion sort and it was done in less than 2 minutes...

    That was my day to be one of the lucky 10,000: https://xkcd.com/1053

  • im3w1l 4 years ago

    That's self-taught for you. Will know some advanced stuff, and miss some basics depending on what they found interesting or useful.
