Hello
I am just getting started with deep learning, and one of the things that confuses me is how to pick the right batch size for training. I have seen tutorials where people use very small batch sizes like 16 or 32, while others recommend going as large as possible to speed things up on GPUs.
When I tried larger batch sizes, training was faster, but the accuracy on the validation data sometimes got worse.
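To make it concrete, here is a simplified sketch of the kind of setup I have been experimenting with (the data, model, and numbers are just placeholders, my real project is different; this is only to show where the batch size comes in):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data and model, just to show where batch size enters the picture.
X = torch.randn(10_000, 20)
y = torch.randint(0, 2, (10_000,))
dataset = TensorDataset(X, y)

BATCH_SIZE = 256  # the value I have been changing between runs (16, 32, ..., 512)
loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```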
This makes me wonder if there is a general rule of thumb for beginners on how to select a batch size that balances speed and accuracy.
I understand that it depends on the dataset, the model architecture, and the available GPU memory, but it would be helpful to know how experienced practitioners approach this choice.
I checked the CS231n Deep Learning for Computer Vision course notes for reference. As a beginner, I sometimes feel the same confusion with batch size in deep learning as I did when first trying to understand what PL/SQL is in databases; both topics need clear guidance to get started.
If anyone could share practical tips, like whether to start small and scale up, or whether to adjust the learning rate alongside the batch size, that would really help beginners like me avoid common mistakes. Clear examples would be great for understanding how to apply this in real projects.
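For instance, I have seen a "linear scaling" idea mentioned (increase the learning rate in proportion to the batch size), and the rough sketch below is my understanding of it. I am not sure whether this is the right way to apply it, so corrections are very welcome; the base values are just example numbers, not recommendations:

```python
import torch
from torch import nn

# Rough sketch of the "linear scaling" idea as I understand it:
# if the batch size goes up by a factor k, scale the learning rate by k as well.
base_batch_size = 32
base_lr = 0.01

new_batch_size = 256
scaled_lr = base_lr * (new_batch_size / base_batch_size)  # 0.08 in this example

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=scaled_lr)
```

Is this the kind of adjustment people actually make when they increase the batch size, or is there a better rule for beginners?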
Thank you!