I recently watched a video explaining that in Deep Learning, the more data you add, the less regularization you need, which sort of makes sense.
That said, does this also hold for "normal" Machine Learning algorithms such as Random Forest? If so, then in theory the hyper-parameter search should be run on all the data you have (further split into cross-validation folds, of course), not just a sample of it. That means a much longer training time, since every combination of hyper-parameters has to be trained on each of the X cross-validation folds.
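To put a rough number on that cost, here is a small sketch that counts the model fits an exhaustive grid search would need. The grid values and the fold count are made up for illustration, not taken from any real project.

```python
from itertools import product

# Hypothetical Random Forest grid (values chosen only for illustration).
param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [5, 10, None],
    "min_samples_leaf": [1, 5, 10],
}
n_folds = 5  # the "X" cross-validation folds, assumed to be 5 here

# Every combination of hyper-parameters is trained once per fold.
combinations = list(product(*param_grid.values()))
n_fits = len(combinations) * n_folds

print(f"{len(combinations)} combinations x {n_folds} folds = {n_fits} fits")
# prints "27 combinations x 5 folds = 135 fits"
```

Each of those fits also gets slower as the dataset grows, which is why tuning on a sample is tempting.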
So basically, is it fair to assume that the hyper-parameters found on a decently sized sample of your dataset are also the "best" ones to use for the entire dataset, or not?
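To make the question concrete, this is roughly the comparison I have in mind, sketched with scikit-learn. The synthetic dataset, the 20% sample size, and the tiny grid are all assumptions chosen so that it runs quickly; they are stand-ins for a real setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the real dataset (an assumption for illustration).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=0
)

# Hypothetical, deliberately small grid so the search finishes quickly.
param_grid = {"max_depth": [3, None], "min_samples_leaf": [1, 10]}


def best_params(X_part, y_part):
    """Run a 3-fold grid search and return the winning params and CV score."""
    search = GridSearchCV(
        RandomForestClassifier(n_estimators=30, random_state=0),
        param_grid,
        cv=3,
    )
    search.fit(X_part, y_part)
    return search.best_params_, search.best_score_


# Stratified 20% sample of the data (the sample size is an arbitrary choice).
X_sample, _, y_sample, _ = train_test_split(
    X, y, train_size=0.2, stratify=y, random_state=0
)

sample_params, sample_score = best_params(X_sample, y_sample)
full_params, full_score = best_params(X, y)

print("best on sample:", sample_params, round(sample_score, 3))
print("best on full  :", full_params, round(full_score, 3))
```

If the two searches pick different settings, that would suggest the "best" hyper-parameters depend on the amount of data, at least for this grid and this dataset. A single run like this proves nothing in general.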