As I write, across social media numerous academics and pollsters are criticising certain individuals for selecting sub-samples of commercial polls and citing them as fully representative polls of their respective sub-sections of the polled population. For example, one such individual was claiming that the Scottish sub-sample of respondents from a national poll was reflective of the overall voting intention of the Scottish population.
This is, of course, entirely wrong. Such sub-samples are not representative of their respective populations, simply because they are too small to have captured an accurate (within margins of error) distribution of voting intention. Sampling theory and sampling distribution theory combine to tell us this. It is for the same reason that we set a benchmark of at least 1,000 respondents for commercial polls before they can even begin to be considered a representative sample of an electorate.
Put simply, polling 100 people in Scotland would not in any way, shape or form give you an accurate representation of the voting intention of the population of Scotland. Thus, if a sub-sample within a commercial poll only contains 100 Scottish respondents, we cannot take those Scottish respondents as a standalone sample and suggest that the reported voting intention as recorded by said poll is an accurate reflection of current Scottish voting intention. It simply doesn't work like that.
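To put rough numbers on this, here is a minimal sketch, assuming simple random sampling and the standard normal-approximation margin of error for a proportion, of how uncertainty shrinks with sample size:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Normal-approximation 95% margin of error for a proportion p
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A 1,000-respondent national poll vs a 100-respondent Scottish sub-sample.
print(f"n=1000: +/- {margin_of_error(1000):.1%}")  # roughly +/- 3.1 points
print(f"n=100:  +/- {margin_of_error(100):.1%}")   # roughly +/- 9.8 points
```

A margin of roughly ten points either way on 100 respondents is wide enough to swallow most realistic gaps between parties, which is precisely why such a sub-sample cannot be read as a standalone poll.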
However, what is of great interest is how we are (rightly) so quick to condemn treating sub-samples within commercial polls as representative of their respective populations, yet very much at ease with using equally small (and sometimes smaller) sub-samples in weighting models.
Weighting is applied to representative samples in order to improve their representativeness, and to better reflect things like likelihood to turn out and vote. Every commercial polling company uses weights, and survey databases typically come with weights ready to apply.
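As a concrete illustration of the mechanics (a toy post-stratification example with invented numbers, not any pollster's actual scheme), a weight is simply a group's population share divided by its sample share, and applying it amounts to a weighted average:

```python
# Toy post-stratification: weight each region so the sample matches
# known population shares (illustrative figures, not real data).
population_share = {"Scotland": 0.08, "restUK": 0.92}
sample_counts    = {"Scotland": 60,   "restUK": 940}  # Scotland underrepresented
n = sum(sample_counts.values())

weights = {g: population_share[g] / (sample_counts[g] / n)
           for g in sample_counts}

# Unweighted vs weighted support for some party, per group (toy figures).
support = {"Scotland": 0.45, "restUK": 0.30}
unweighted = sum(support[g] * sample_counts[g] for g in support) / n
weighted   = sum(support[g] * sample_counts[g] * weights[g] for g in support) / n

print(f"Scotland weight: {weights['Scotland']:.2f}")   # above 1: up-weighted
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

Note what the up-weighting actually does: it takes whatever the 60 Scottish respondents happened to say and gives it a louder voice.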
While the benefits of (and indeed the need for) weighting are not in question, what is puzzling is that the very assumptions we unanimously agree are false when reading Scottish voting intention from a sub-sample of a commercial poll are, by contrast, treated as true when we apply weighting models to those same sub-samples.
For instance, in order to confidently apply a weight which up-weights Scottish respondents (because they were underrepresented in the original sample), are we not assuming exactly the same thing as those who read results directly from the sub-samples: that the individuals we do have within those sub-samples accurately reflect the distribution of voting intention within that group?
This same logic would apply to weights on any sub-sample, from age to newspaper readership. Why are we so quick to disregard any inferring of results from sub-samples, but so quick to weight on them?
At the theoretical level, sampling distribution theory would suggest that both reading results from sub-samples and applying weighting models based on sub-samples are in effect assuming the same thing: that such sub-samples have captured a representative distribution of voting intention within the population which that sub-sample represents.
Why are we so confident in one regard (that we can robustly apply weighting models and functions to small sub-samples) yet so decidedly unconfident in the other (that we cannot infer sub-population results from sub-samples)? Why is representativeness a worry for inferring results from sub-samples, but not for applying weights to them? Does the small 'n' problem simply not apply to weighting?