# toy-BERT

A toy implementation of BERT, built from scratch.

Caveats:

- It is trained on a very small dataset.
- It does not include the Next Sentence Prediction (NSP) objective from the original paper in pretraining, though it would be straightforward to add (see the sketch after this list).
- I have not adhered to best practices everywhere (the code is below average in a couple of places) because this is a toy project built under time constraints.
- It has been trained for relatively few iterations; I hope to eventually train it on a larger corpus for more iterations.
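
For reference, NSP is just a binary classifier on top of the pooled `[CLS]` representation, trained on sentence pairs where the second sentence is either the true next sentence or a randomly sampled one. Here is a minimal PyTorch sketch; the names `hidden_size` and `pooled_cls` are assumptions about the surrounding model, not this repo's actual API:

```python
import torch
import torch.nn as nn

class NSPHead(nn.Module):
    """Binary classifier over the pooled [CLS] representation:
    predicts whether sentence B actually follows sentence A.

    Note: hidden_size and the pooled [CLS] input are assumed to exist
    in the host model; this is an illustrative sketch only.
    """
    def __init__(self, hidden_size: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, 2)  # is-next vs. not-next

    def forward(self, pooled_cls: torch.Tensor) -> torch.Tensor:
        # pooled_cls: (batch, hidden_size), e.g. the [CLS] token's final
        # hidden state passed through a tanh pooler, as in the paper.
        return self.classifier(pooled_cls)

# During pretraining, the NSP loss is simply added to the MLM loss:
#   nsp_logits = nsp_head(pooled_cls)                               # (batch, 2)
#   nsp_loss = nn.functional.cross_entropy(nsp_logits, is_next)    # is_next in {0, 1}
#   loss = mlm_loss + nsp_loss
```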

Enjoy!