We now have good tooling for handling sequential inputs (e.g. RNNs). But what about variable-sized inputs that have no explicit ordering? We can use a sequential model, but in what order do we feed it the input? The authors show empirical evidence that order matters, and describe a model, which can be viewed as a simplified Neural Turing Machine, that handles sets as inputs naturally.
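To make the "sets as inputs" idea concrete, here is a rough sketch of the kind of content-based attention read that NTM-style models rely on. It is a simplification under my own assumptions, not the authors' implementation: the dot-product scoring, the function name `attention_read`, and the toy vectors are all hypothetical. The point is only that a softmax-weighted sum over memory does not depend on the order in which the elements are stored.

```python
import math

def attention_read(query, memory):
    """Soft, content-based read over an unordered collection of vectors.

    query:  list of floats of length d
    memory: list of vectors, each a list of floats of length d
    Returns the attention-weighted sum of the memory vectors.
    Dot-product scoring is an assumption made for this sketch.
    """
    # Score each memory vector by its dot product with the query.
    scores = [sum(q * m for q, m in zip(query, vec)) for vec in memory]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of memory vectors; addition is commutative, so the
    # result is the same (up to rounding) for any ordering of `memory`.
    d = len(query)
    return [sum(w * vec[i] for w, vec in zip(weights, memory)) for i in range(d)]

# Toy data, made up for illustration.
memory = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 2.0]

read_a = attention_read(query, memory)
read_b = attention_read(query, list(reversed(memory)))

# Compare with a tolerance, since float summation order differs.
same_read = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(read_a, read_b))
print(same_read)
```

Reversing the memory changes nothing about the read vector beyond floating-point rounding, which is the property that makes this kind of read a natural fit for unordered inputs.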
Re-using the NTM tooling seems like a clever, natural way to handle unordered inputs. It would have been great to see actual comparisons against existing models for all of the example problems: sure, a bi-directional LSTM isn’t the right way to handle an unordered input set, but how much am I losing by re-using my out-of-the-box solution? The authors demonstrate how some existing models change behavior if the input order is permuted, but I would have liked to see a side-by-side comparison against their proposal.
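As a toy illustration of why input order affects a sequential model at all, consider a scalar tanh recurrence. This is not one of the models evaluated in the paper; the function name `toy_rnn_final_state` and the weights are arbitrary, hypothetical values picked for the sketch. Feeding the same elements in a different order produces a different final state.

```python
import math

def toy_rnn_final_state(sequence, w_in=0.7, w_rec=1.3):
    """Run a scalar tanh recurrence over `sequence` and return the last state.

    The weights are arbitrary illustrative values, not trained parameters.
    """
    h = 0.0
    for x in sequence:
        # Each step mixes the new input with the previous state, so the
        # final state depends on the order in which inputs arrive.
        h = math.tanh(w_in * x + w_rec * h)
    return h

# The same three values, presented in two different orders.
items = [0.2, -1.0, 1.5]
forward = toy_rnn_final_state(items)
backward = toy_rnn_final_state(list(reversed(items)))

order_sensitive = not math.isclose(forward, backward, abs_tol=1e-6)
print(forward, backward, order_sensitive)
```

A trained LSTM is far more capable than this recurrence, but the same structural dependence on order is there, which is why a side-by-side comparison on permuted inputs would have been informative.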
This paper continues the theme of networks that get to take repeated looks at different parts of their input: i.e. they control what they look at, instead of being fed data and being forced to make do. While in the NTM paper this felt like an interesting hack, it’s starting to seem more and more like a very reasonable way of solving real problems.