Settings

Theme

Pareto Front Sort for Python

github.com

41 points by kummappp 6 years ago · 13 comments

Reader

kummapppOP 6 years ago

I did not find pareto front sorting library so I made one. It took a weekend. Save your weekend and use my module instead.

phijFTW 6 years ago

Generally, I would also recommend using pymoo [1] if you have multi-objective optimization problems. They also have an implementation of Non-dominated Sorting [2], which should serve the same purpose.

[1]: https://pymoo.org/

[2]: https://github.com/msu-coinlab/pymoo/blob/master/pymoo/util/...

  • kummapppOP 6 years ago

    Thanks. The 'Non-dominated sorting' was the magic phrase for google i was missing.

anamexis 6 years ago

Can you explain, for the uninitiated, what Pareto front searching is?

  • hansvm 6 years ago

    The general idea is that if you have multiple things you care about (objectives) it's generally not possible to optimize them all at once (e.g. in computer science you often can't minimize runtime and memory usage in the same algorithm). However, there are some objectively bad answers. If you have an algorithm running in O(n) time and O(n) space (assuming no other objectives of interest), one running in O(n) time and O(n^2) space is worse.

    The Pareto front is a formalization of that idea. It's a collection of all "non-dominated" points. A point is dominated if there is some other point that's at least as good in all dimensions and better in at least one. A point is "non-dominated" if there are no points that dominate it.

    Going back to the [O(n), O(n)] example, this would not be dominated by, e.g., an [O(1), O(n^2)] algorithm since even though the time complexity is better, the space complexity is worse. Out of the three examples I've given, those two would be in the pareto front.

    I haven't read through the whole library yet, but it looks like the way this works is it finds the pareto front (all non-empty, finite collections have a non-empty sub-collection of non-dominated points), finds the pareto front of the remainder, and so on till it's exhausted the input. Your output is a collection of "waves" of the original data.

    Edit: I used algorithmic complexity just because it would be serve as a familiar example to Hacker News. In general you just need a comparison function in each dimension (and often you'll make a choice for each dimension as to whether you care about maximums or minimums -- they're directly interchangeable, so that's just for end user convenience). E.g., you might want to maximize crop yield and minimize input costs, or in a dimensionality reduction algorithm you might want to maximize separation while minimizing how different your reduction is from the original data by some metric. It's an extremely general purpose paradigm that sidesteps some of the issues that arise when you have multiple things you care about that can't be cleanly converted into a single cost function (all the hyperparameters in a machine learning model come to mind as well). You still need a mechanism for choosing things from the Pareto front, and there are whole fields of study in how people interact with a Pareto solver, but being able to restrict your inputs to _just_ the Pareto front is still a big win.

cochleari_major 6 years ago

For low dimensions, it might be useful to look at the convex properties of the Pareto front. If a point P is on Pareto front, it is not in the interior of the convex hull of undominated points. In two dimensions, one can compute the convex hull of N points in O(N log N) time. This typically allows for faster Pareto front computation, but not in the worst case.

  • contravariant 6 years ago

    Yeah I'm not sure this works. As far as I can tell the Pareto front doesn't have to be convex and can therefore contain points that are in the interior of the convex hull.

    Of course if you accept convex combinations of options then the pareto front is part of the convex hull.

  • Jargerhaben 6 years ago

    > If a point P is on Pareto front, it is not in the interior of the convex hull of undominated points.

    Can you elaborate on what you mean by this? Is this under the assumption of a convex Pareto front? I may be misunderstanding.

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection