The Politics of Images in Machine Learning Training Sets

2 points by sakagami0 5 years ago · 2 comments

Recently, I came across some threads asking why data labeling is difficult. (I have my biases as an engineer.) In my opinion, it's because labeling is essentially determining truth, and truth requires context, interpretation, and domain knowledge. Sometimes it's easy (with caveats like dataset bias, labeler bias, and taxonomy bias). But for more complex labels, truth is neither easily abstracted nor tractable.

version_five 5 years ago

To me, it boils down to misunderstanding what the technology can do. If you are trying to build a model that labels people as "unsuccessful" based on a picture (an example from the article), of course you are setting yourself up for failure. If you're looking for a model that tells whether a manufactured part has a defect, you have a good chance of succeeding. The real question I have is why people ever think ML should be used for judgement calls.
