In AllenNLP's training loop we have three kinds of losses:
- loss
- batch_loss
- train_loss
As I understand it, loss is a tensor, batch_loss is the scalar value of that tensor, and train_loss is the cumulative sum of the batch_loss values. This part is clear to me.
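To make sure my understanding is right, here is a minimal runnable sketch of how I picture the relationship (toy model and data; this is my assumption, not the actual AllenNLP trainer code):

```python
import torch

# Toy setup, purely for illustration.
model = torch.nn.Linear(4, 1)
batches = [torch.randn(8, 4) for _ in range(3)]

train_loss = 0.0                       # accumulated over the whole epoch
for batch in batches:
    loss = model(batch).pow(2).mean()  # `loss`: a tensor with grad history
    batch_loss = loss.item()           # `batch_loss`: the plain float value of that tensor
    train_loss += batch_loss           # `train_loss`: running sum of batch_loss
    loss.backward()                    # gradients flow through the tensor, not the float
```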
My question is: why does AllenNLP compute batch_loss inside the loop over batch, instead of computing a cumulative loss over the whole batch_group?
I also do not understand the need for a batch_group inside each epoch, and for a batch inside each batch_group.
My understanding of the nesting is: an epoch contains batch_groups, and each batch_group contains batches (see the sketch below). Yet batch_loss is calculated per batch, not per batch_group. Why?
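For reference, this is the loop structure I have in mind (a simplified, runnable sketch with toy data; the grouping sizes and names are my assumptions, not the library's exact code):

```python
import torch

# Toy setup: 3 batch_groups of 2 batches each, purely for illustration.
model = torch.nn.Linear(4, 1)
num_epochs = 2
batch_groups = [[torch.randn(8, 4) for _ in range(2)] for _ in range(3)]

train_loss = 0.0
for epoch in range(num_epochs):                # epoch
    for batch_group in batch_groups:           # batch_group inside the epoch
        for batch in batch_group:              # batch inside the batch_group
            loss = model(batch).pow(2).mean()  # loss tensor, per batch
            batch_loss = loss.item()           # batch_loss is per batch, not per batch_group
            train_loss += batch_loss
```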