Posted by Tom.Capper
Sampling is a process used in statistics when it’s unfeasible or impractical to analyse all the data that exists. Instead, a small, randomly selected subset is used to keep things manageable. Many analytics platforms use some sort of sampling to keep report loading times in check, and there seem to be three schools of thought when it comes to sampling in analytics. There are those who are terrified of it, insisting in unsampled versions of any report. Then there are those who are relaxed about it, trusting the statistical logic. And then, lastly, there are those who are oblivious.
All three are misguided.
Sampling isn’t something to fear, but, in Google Analytics in particular, it can’t always be trusted. Because of that, it’s definitely worth your time to understand when it occurs, how it affects your work, and how it can be avoided.
You can always tell when sampling is being used, because of this line at the top of every report:
If the percentage is less than 100%, then sampling is in progress. You’ll notice above that I’ve produced a report based on more than half a billion sessions without any sampling — sampling isn’t just about the …read more