First page Back Continue Last page Overview Graphics
Apportioning Data
Can’t just split a data set into three to form training, testing and validation sets
Statistical parameters of each set should be similar
Distribution of target classes / values should be similar
- can’t have all of class A in the training set and all of class B in the testing set