The disadvantage of the attention/transformer architecture is that it adds more weight parameters to the model.
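It should be noted that, in this context, architecture means the overall design of a model, that is, how its layers and components are arranged and connected, rather than the design and construction of buildings.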
In this case, the extra weight parameters come from the learned query, key, value, and output projections in each attention layer, so the model needs more memory and more computation to train and run.
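As a rough illustration of where the extra weights come from, the sketch below counts the learned parameters in one standard multi-head self-attention layer. It assumes four d_model x d_model projections (query, key, value, output), each with an optional bias vector. The function name and the example size of 512 are illustrative assumptions, not part of the original question.

```python
def attention_param_count(d_model: int, bias: bool = True) -> int:
    """Count learned parameters in one multi-head self-attention layer.

    Assumes the standard layout: query, key, value, and output
    projections, each a d_model x d_model weight matrix. The number of
    heads does not change the total, because the heads split d_model
    between them.
    """
    weights = 4 * d_model * d_model          # W_q, W_k, W_v, W_o
    biases = 4 * d_model if bias else 0      # one bias vector per projection
    return weights + biases


# Example with an assumed model width of 512
print(attention_param_count(512))              # 1050624
print(attention_param_count(512, bias=False))  # 1048576
```

The count grows with the square of the model width, and it is added again for every attention layer in the model.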
Learn more about architecture on:
https://brainly.com/question/9760486