A toy implementation of BERT I built from scratch.
Caveats:
- It is trained on a very small dataset.
- It does not include NSP (Next Sentence Prediction) in pretraining as described in the paper, though that would be straightforward to add (see the sketch after this list).
- I have not adhered to best practices everywhere (below average in a couple of places), since this is a toy project built under time constraints.
- It has been trained for relatively few iterations; I hope to run it on a larger corpus for more iterations.
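
For reference, NSP amounts to a small binary classification head on the final hidden state of the [CLS] token, trained jointly with the MLM loss. Below is a minimal PyTorch sketch of one way to add it; it assumes the encoder returns hidden states of shape (batch, seq_len, hidden_size), and the names `NSPHead`, `encoder_output`, and `is_next_labels` are hypothetical placeholders, not part of this repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NSPHead(nn.Module):
    """Binary classifier (IsNext vs. NotNext) over the [CLS] hidden state."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, 2)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size); [CLS] is position 0
        cls_output = hidden_states[:, 0]
        return self.classifier(cls_output)  # (batch, 2)


# Sketch of how the loss would combine with MLM during pretraining
# (encoder_output, mlm_loss, and is_next_labels are assumed to come
# from the existing training loop):
#
#   nsp_head = NSPHead(hidden_size=768)
#   nsp_logits = nsp_head(encoder_output)
#   nsp_loss = F.cross_entropy(nsp_logits, is_next_labels)
#   total_loss = mlm_loss + nsp_loss
```

The batching code would also need to pair each sequence with a real next sentence 50% of the time and a random one otherwise, labeling them accordingly, as in the original paper.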
Enjoy!