Settings

Theme

K-way merge - Python way

melitamihaljevic.blogspot.com

25 points by mmihaljevic 13 years ago · 5 comments

Reader

cypherpunks01 13 years ago

Re-sorting with sorted() in #2 defeats the purpose of the k-way merge, you may as well just materialize all the lists and sort them.

I didn't know about heapq.merge before, that seems like an efficient way to go. Usually when I do k-way merges it's over a data stream, so I almost always want an iterator.

  • thezilch 13 years ago

    heapq.merge assumes the passed iterators are already sorted and will sort the iterators as is, so you can expect to get different results from some inputs to the examples in #1 and #2 [0]. Of course, if you simply call sorted on the the heapq results (#1), you will get the same result as #2 [1]. As well, without a final sort of either merge-example, you will get different results [2].

      [0]
      > list(heapq.merge([3, 2, 1], [3], [2, 3], [4, 5], [0, 1, 2]))
      [0, 1, 2, 2, 3, 2, 1, 3, 3, 4, 5]
      > sorted(itertools.chain([3, 2, 1], [3], [2, 3], [4, 5], [0, 1, 2]))
      [0, 1, 1, 2, 2, 2, 3, 3, 3, 4, 5]
    
      [1]
      > sorted(heapq.merge([3, 2, 1], [3], [2, 3], [4, 5], [0, 1, 2]))
      [0, 1, 1, 2, 2, 2, 3, 3, 3, 4, 5]
    
      [2]
      > list(itertools.chain([3, 2, 1], [3], [2, 3], [4, 5], [0, 1, 2]))
      [3, 2, 1, 3, 2, 3, 4, 5, 0, 1, 2]
      > list(heapq.merge([3, 2, 1], [3], [2, 3], [4, 5], [0, 1, 2]))
      [0, 1, 2, 2, 3, 2, 1, 3, 3, 4, 5]
  • mmihaljevicOP 13 years ago

    you are right about sorted() but I was thinking it could be useful if you don't have sorted arrays but need similar functionality and arrays are not too big so you don't need an iterator but I was really amazed with heapq.merge()

    I appreciate your comment

    • tantalor 13 years ago

      > if you don't have sorted arrays

      In this case certainly you should sort the multiple smaller arrays (in parallel), not the merged one.

andrewcooke 13 years ago

    >>> from heapq import merge
    >>> def kmerge(*lists):
    ...   return merge(*map(sorted, lists))
    ... 
    >>> list(kmerge([1,2,3],[6,5,4],[2,2]))
    [1, 2, 2, 2, 3, 4, 5, 6]

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection