early_stop not being triggered? #98

Open
jackievaleri opened this issue Apr 6, 2023 · 0 comments

jackievaleri commented Apr 6, 2023

Hello, I have trained DNABERT models with early_stop = 1 and early_stop = 5, but both times the training log looks like the one below, and the early_stop condition does not seem to be triggered. I am only recording loss here, and I understand that early_stop is triggered by AUC, but I have tested model performance at different checkpoints: as expected, AUC is low when loss is ~0.7 and high when loss is ~0.3. Maybe I am misunderstanding the early stop condition: is early_stop checked after each epoch or after each training step?

Thanks so much for your help!
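To make the question concrete, here is a minimal sketch of a generic patience-based early-stopping check on AUC. It is an assumption about how such a check is commonly written, not DNABERT's actual code: the `EarlyStopper` class and `steps_until_stop` helper are hypothetical names. The point it illustrates is that the counter only advances when an evaluation is run, so whether that happens per epoch or every N steps (or not at all during training) decides when, and if, the stop can trigger.

```python
class EarlyStopper:
    """Patience-based early stopping for a metric where higher is better (e.g. AUC).

    Hypothetical helper for illustration; not taken from DNABERT.
    """

    def __init__(self, patience):
        self.patience = patience      # e.g. the early_stop value
        self.best = float("-inf")     # best metric seen so far
        self.bad_evals = 0            # consecutive evaluations without improvement

    def update(self, metric):
        """Record one evaluation result; return True when training should stop."""
        if metric > self.best:
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


def steps_until_stop(aucs, patience):
    """Return the 1-based index of the evaluation that triggers the stop, or None."""
    stopper = EarlyStopper(patience)
    for i, auc in enumerate(aucs, start=1):
        if stopper.update(auc):
            return i
    return None
```

With a check like this, `update` has to be called from the training loop after each evaluation. If evaluation is not run during training, the patience value has no effect regardless of how it is set.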

Training log

============================================================
{"learning_rate": 4.6403712296983755e-06, "loss": 0.7007974421977997, "step": 100}
{"learning_rate": 9.280742459396751e-06, "loss": 0.6481768530607224, "step": 200}
{"learning_rate": 1.3921113689095128e-05, "loss": 0.5411258974671364, "step": 300}
{"learning_rate": 1.8561484918793502e-05, "loss": 0.4675881567597389, "step": 400}
{"learning_rate": 2.320185614849188e-05, "loss": 0.4492384225130081, "step": 500}
{"learning_rate": 2.7842227378190256e-05, "loss": 0.41344937846064567, "step": 600}
{"learning_rate": 3.248259860788863e-05, "loss": 0.3779252037405968, "step": 700}
{"learning_rate": 3.7122969837587004e-05, "loss": 0.37262490659952163, "step": 800}
{"learning_rate": 4.176334106728538e-05, "loss": 0.3927639803290367, "step": 900}
{"learning_rate": 4.640371229698376e-05, "loss": 0.3633966606855392, "step": 1000}
{"learning_rate": 5.104408352668214e-05, "loss": 0.3557696384191513, "step": 1100}
{"learning_rate": 5.568445475638051e-05, "loss": 0.3788648857176304, "step": 1200}
{"learning_rate": 6.032482598607889e-05, "loss": 0.3504885651916265, "step": 1300}
{"learning_rate": 6.496519721577726e-05, "loss": 0.36347281366586687, "step": 1400}
{"learning_rate": 6.960556844547565e-05, "loss": 0.3671479770541191, "step": 1500}
{"learning_rate": 7.424593967517401e-05, "loss": 0.37253240182995795, "step": 1600}
{"learning_rate": 7.88863109048724e-05, "loss": 0.3568559378385544, "step": 1700}
{"learning_rate": 8.352668213457077e-05, "loss": 0.34786286219954493, "step": 1800}
{"learning_rate": 8.816705336426915e-05, "loss": 0.3583295064419508, "step": 1900}
{"learning_rate": 9.280742459396752e-05, "loss": 0.34529574267566204, "step": 2000}
{"learning_rate": 9.74477958236659e-05, "loss": 0.3723073774576187, "step": 2100}
{"learning_rate": 0.00010208816705336428, "loss": 0.33124514155089857, "step": 2200}
{"learning_rate": 0.00010672853828306264, "loss": 0.35580602899193764, "step": 2300}
{"learning_rate": 0.00011136890951276103, "loss": 0.3535840827226639, "step": 2400}
{"learning_rate": 0.0001160092807424594, "loss": 0.40146430298686026, "step": 2500}
{"learning_rate": 0.00012064965197215778, "loss": 0.35824646830558776, "step": 2600}
{"learning_rate": 0.00012529002320185614, "loss": 0.3476183260977268, "step": 2700}
{"learning_rate": 0.00012993039443155453, "loss": 0.35561001248657703, "step": 2800}
{"learning_rate": 0.0001345707656612529, "loss": 0.3357570244371891, "step": 2900}
{"learning_rate": 0.0001392111368909513, "loss": 0.35210531190037725, "step": 3000}
{"learning_rate": 0.00014385150812064966, "loss": 0.33717056721448896, "step": 3100}
{"learning_rate": 0.00014849187935034802, "loss": 0.35643586337566374, "step": 3200}
{"learning_rate": 0.0001531322505800464, "loss": 0.3409475743770599, "step": 3300}
{"learning_rate": 0.0001577726218097448, "loss": 0.3437032409012318, "step": 3400}
{"learning_rate": 0.00016241299303944317, "loss": 0.35203878819942475, "step": 3500}
{"learning_rate": 0.00016705336426914153, "loss": 0.3511780767887831, "step": 3600}
{"learning_rate": 0.00017169373549883992, "loss": 0.34659315675497054, "step": 3700}
{"learning_rate": 0.0001763341067285383, "loss": 0.4635441067814827, "step": 3800}
{"learning_rate": 0.0001809744779582367, "loss": 0.35839121639728544, "step": 3900}
{"learning_rate": 0.00018561484918793505, "loss": 0.35847420528531077, "step": 4000}
{"learning_rate": 0.0001902552204176334, "loss": 0.3532230830192566, "step": 4100}
{"learning_rate": 0.0001948955916473318, "loss": 0.3591628995537758, "step": 4200}
{"learning_rate": 0.00019953596287703018, "loss": 0.36945720002055166, "step": 4300}
{"learning_rate": 0.00019953602268333548, "loss": 0.3720928683876991, "step": 4400}
{"learning_rate": 0.00019902049233148602, "loss": 0.36925311520695686, "step": 4500}
{"learning_rate": 0.00019850496197963656, "loss": 0.3749953772127628, "step": 4600}
{"learning_rate": 0.0001979894316277871, "loss": 0.37533086746931077, "step": 4700}
{"learning_rate": 0.00019747390127593763, "loss": 0.407482353746891, "step": 4800}
{"learning_rate": 0.00019695837092408816, "loss": 0.36162124142050744, "step": 4900}
{"learning_rate": 0.0001964428405722387, "loss": 0.36303346186876295, "step": 5000}
{"learning_rate": 0.00019592731022038924, "loss": 0.39387683257460593, "step": 5100}
{"learning_rate": 0.00019541177986853977, "loss": 0.376108710616827, "step": 5200}
{"learning_rate": 0.0001948962495166903, "loss": 0.3784422878921032, "step": 5300}
{"learning_rate": 0.00019438071916484084, "loss": 0.36946004882454875, "step": 5400}
{"learning_rate": 0.00019386518881299138, "loss": 0.396262523829937, "step": 5500}
{"learning_rate": 0.00019334965846114192, "loss": 0.37226478368043897, "step": 5600}
{"learning_rate": 0.00019283412810929245, "loss": 0.4128914840519428, "step": 5700}
{"learning_rate": 0.000192318597757443, "loss": 0.40437444478273393, "step": 5800}
{"learning_rate": 0.0001918030674055935, "loss": 0.3735684740543366, "step": 5900}
{"learning_rate": 0.00019128753705374406, "loss": 0.3898909795284271, "step": 6000}
{"learning_rate": 0.0001907720067018946, "loss": 0.38805960774421694, "step": 6100}
{"learning_rate": 0.0001902564763500451, "loss": 0.3711014576256275, "step": 6200}
{"learning_rate": 0.00018974094599819564, "loss": 0.4074031673371792, "step": 6300}
{"learning_rate": 0.0001892254156463462, "loss": 0.37439350083470346, "step": 6400}
{"learning_rate": 0.0001887098852944967, "loss": 0.40759576119482516, "step": 6500}
{"learning_rate": 0.00018819435494264725, "loss": 0.3819374969601631, "step": 6600}
{"learning_rate": 0.00018767882459079778, "loss": 0.38475257441401484, "step": 6700}
{"learning_rate": 0.00018716329423894832, "loss": 0.45396690398454664, "step": 6800}
{"learning_rate": 0.00018664776388709886, "loss": 0.38281256571412087, "step": 6900}
{"learning_rate": 0.0001861322335352494, "loss": 0.43358139574527743, "step": 7000}
{"learning_rate": 0.00018561670318339996, "loss": 0.4598173153400421, "step": 7100}
{"learning_rate": 0.00018510117283155046, "loss": 0.46087321028113365, "step": 7200}
{"learning_rate": 0.000184585642479701, "loss": 0.4918929560482502, "step": 7300}
{"learning_rate": 0.00018407011212785154, "loss": 0.4780182507634163, "step": 7400}
{"learning_rate": 0.00018355458177600207, "loss": 0.4108209757506847, "step": 7500}
{"learning_rate": 0.0001830390514241526, "loss": 0.3931944060325623, "step": 7600}
{"learning_rate": 0.00018252352107230314, "loss": 0.5266498747467995, "step": 7700}
{"learning_rate": 0.00018200799072045368, "loss": 0.5410025058686734, "step": 7800}
{"learning_rate": 0.00018149246036860422, "loss": 0.5135592448711396, "step": 7900}
{"learning_rate": 0.00018097693001675475, "loss": 0.42130292400717734, "step": 8000}
{"learning_rate": 0.00018046139966490526, "loss": 0.4594034646451473, "step": 8100}
{"learning_rate": 0.00017994586931305582, "loss": 0.4378485783934593, "step": 8200}
{"learning_rate": 0.00017943033896120636, "loss": 0.44566343814134596, "step": 8300}
{"learning_rate": 0.00017891480860935687, "loss": 0.4684973441064358, "step": 8400}
{"learning_rate": 0.0001783992782575074, "loss": 0.42857180193066596, "step": 8500}
{"learning_rate": 0.00017788374790565797, "loss": 0.4171652066707611, "step": 8600}
{"learning_rate": 0.00017736821755380848, "loss": 0.45506142362952234, "step": 8700}
{"learning_rate": 0.00017685268720195901, "loss": 0.47059225648641584, "step": 8800}
{"learning_rate": 0.00017633715685010958, "loss": 0.4542036910355091, "step": 8900}
{"learning_rate": 0.00017582162649826009, "loss": 0.4596729950606823, "step": 9000}
{"learning_rate": 0.00017530609614641062, "loss": 0.46318377524614335, "step": 9100}
{"learning_rate": 0.00017479056579456116, "loss": 0.45318587839603425, "step": 9200}
{"learning_rate": 0.0001742750354427117, "loss": 0.4797310657799244, "step": 9300}
{"learning_rate": 0.00017375950509086223, "loss": 0.45472550854086874, "step": 9400}
{"learning_rate": 0.00017324397473901277, "loss": 0.45764863803982736, "step": 9500}
{"learning_rate": 0.0001727284443871633, "loss": 0.6760511407256127, "step": 9600}
{"learning_rate": 0.00017221291403531384, "loss": 0.6996373969316483, "step": 9700}
{"learning_rate": 0.00017169738368346437, "loss": 0.6951607382297516, "step": 9800}
{"learning_rate": 0.0001711818533316149, "loss": 0.7007030230760575, "step": 9900}
{"learning_rate": 0.00017066632297976545, "loss": 0.6961118984222412, "step": 10000}
{"learning_rate": 0.00017015079262791598, "loss": 0.6988678574562073, "step": 10100}
{"learning_rate": 0.00016963526227606652, "loss": 0.6939001500606536, "step": 10200}
{"learning_rate": 0.00016911973192421703, "loss": 0.6991030830144882, "step": 10300}
{"learning_rate": 0.0001686042015723676, "loss": 0.7225308656692505, "step": 10400}
{"learning_rate": 0.00016808867122051813, "loss": 0.6984154450893402, "step": 10500}
{"learning_rate": 0.00016757314086866863, "loss": 0.6992167514562607, "step": 10600}
{"learning_rate": 0.0001670576105168192, "loss": 0.6993762147426605, "step": 10700}
{"learning_rate": 0.00016654208016496973, "loss": 0.6954449665546417, "step": 10800}
{"learning_rate": 0.00016602654981312024, "loss": 0.6956585949659347, "step": 10900}
{"learning_rate": 0.00016551101946127078, "loss": 0.6980380529165268, "step": 11000}
{"learning_rate": 0.00016499548910942134, "loss": 0.6994923764467239, "step": 11100}
{"learning_rate": 0.00016447995875757185, "loss": 0.7009448939561844, "step": 11200}
{"learning_rate": 0.0001639644284057224, "loss": 0.6961477559804916, "step": 11300}
{"learning_rate": 0.00016344889805387292, "loss": 0.6950068110227585, "step": 11400}
{"learning_rate": 0.00016293336770202346, "loss": 0.6984470278024674, "step": 11500}
{"learning_rate": 0.000162417837350174, "loss": 0.6987768566608429, "step": 11600}
{"learning_rate": 0.00016190230699832453, "loss": 0.6955089658498764, "step": 11700}
{"learning_rate": 0.00016138677664647507, "loss": 0.6974623143672943, "step": 11800}
{"learning_rate": 0.0001608712462946256, "loss": 0.6963051301240921, "step": 11900}
{"learning_rate": 0.00016035571594277614, "loss": 0.6966987651586533, "step": 12000}
{"learning_rate": 0.00015984018559092667, "loss": 0.6975135022401809, "step": 12100}
{"learning_rate": 0.0001593246552390772, "loss": 0.698139386177063, "step": 12200}
{"learning_rate": 0.00015880912488722775, "loss": 0.6945223665237427, "step": 12300}
{"learning_rate": 0.00015829359453537828, "loss": 0.697940474152565, "step": 12400}
{"learning_rate": 0.00015777806418352882, "loss": 0.6953733849525452, "step": 12500}
{"learning_rate": 0.00015726253383167935, "loss": 0.698943390250206, "step": 12600}
{"learning_rate": 0.0001567470034798299, "loss": 0.6958224195241928, "step": 12700}
{"learning_rate": 0.0001562314731279804, "loss": 0.6969279026985169, "step": 12800}
{"learning_rate": 0.00015571594277613096, "loss": 0.6943042081594467, "step": 12900}
{"learning_rate": 0.0001552004124242815, "loss": 0.6955613481998444, "step": 13000}
{"learning_rate": 0.000154684882072432, "loss": 0.6964903801679612, "step": 13100}
{"learning_rate": 0.00015416935172058254, "loss": 0.6958559346199036, "step": 13200}
{"learning_rate": 0.0001536538213687331, "loss": 0.6994060349464416, "step": 13300}
{"learning_rate": 0.00015313829101688362, "loss": 0.6962485539913178, "step": 13400}
{"learning_rate": 0.00015262276066503415, "loss": 0.6973061317205429, "step": 13500}
{"learning_rate": 0.0001521072303131847, "loss": 0.6984491866827011, "step": 13600}
{"learning_rate": 0.00015159169996133522, "loss": 0.6958070963621139, "step": 13700}
{"learning_rate": 0.00015107616960948576, "loss": 0.6981200927495956, "step": 13800}
{"learning_rate": 0.0001505606392576363, "loss": 0.6957536000013351, "step": 13900}
{"learning_rate": 0.00015004510890578686, "loss": 0.6943967545032501, "step": 14000}
{"learning_rate": 0.00014952957855393737, "loss": 0.6963396489620208, "step": 14100}
{"learning_rate": 0.0001490140482020879, "loss": 0.696145783662796, "step": 14200}
{"learning_rate": 0.00014849851785023844, "loss": 0.6964495003223419, "step": 14300}
{"learning_rate": 0.00014798298749838898, "loss": 0.6950684404373169, "step": 14400}
{"learning_rate": 0.0001474674571465395, "loss": 0.6964384931325912, "step": 14500}
{"learning_rate": 0.00014695192679469005, "loss": 0.6967101609706878, "step": 14600}
{"learning_rate": 0.00014643639644284058, "loss": 0.695085843205452, "step": 14700}
{"learning_rate": 0.00014592086609099112, "loss": 0.6954471105337143, "step": 14800}
{"learning_rate": 0.00014540533573914166, "loss": 0.6961427891254425, "step": 14900}
{"learning_rate": 0.00014488980538729216, "loss": 0.6960506910085678, "step": 15000}
{"learning_rate": 0.00014437427503544273, "loss": 0.694625655412674, "step": 15100}
{"learning_rate": 0.00014385874468359326, "loss": 0.6954445457458496, "step": 15200}
{"learning_rate": 0.00014334321433174377, "loss": 0.6965684586763382, "step": 15300}
{"learning_rate": 0.00014282768397989434, "loss": 0.695345344543457, "step": 15400}
{"learning_rate": 0.00014231215362804487, "loss": 0.6949293637275695, "step": 15500}
{"learning_rate": 0.00014179662327619538, "loss": 0.6968457072973251, "step": 15600}
{"learning_rate": 0.00014128109292434592, "loss": 0.6959818017482757, "step": 15700}
{"learning_rate": 0.00014076556257249648, "loss": 0.6949035269021988, "step": 15800}
{"learning_rate": 0.000140250032220647, "loss": 0.6961296737194062, "step": 15900}
{"learning_rate": 0.00013973450186879752, "loss": 0.6951547855138779, "step": 16000}
{"learning_rate": 0.00013921897151694806, "loss": 0.6960852247476578, "step": 16100}
{"learning_rate": 0.0001387034411650986, "loss": 0.6949686509370804, "step": 16200}
{"learning_rate": 0.00013818791081324913, "loss": 0.696487774848938, "step": 16300}
{"learning_rate": 0.00013767238046139967, "loss": 0.6956022328138352, "step": 16400}
{"learning_rate": 0.0001371568501095502, "loss": 0.6934760868549347, "step": 16500}
{"learning_rate": 0.00013664131975770074, "loss": 0.6948903107643127, "step": 16600}
{"learning_rate": 0.00013612578940585128, "loss": 0.6949892055988312, "step": 16700}
{"learning_rate": 0.0001356102590540018, "loss": 0.6958494943380356, "step": 16800}
{"learning_rate": 0.00013509472870215235, "loss": 0.6957391756772995, "step": 16900}
{"learning_rate": 0.00013457919835030288, "loss": 0.6953414016962052, "step": 17000}
{"learning_rate": 0.00013406366799845342, "loss": 0.6942579627037049, "step": 17100}
{"learning_rate": 0.00013354813764660396, "loss": 0.6927767395973206, "step": 17200}
{"learning_rate": 0.0001330326072947545, "loss": 0.694207838177681, "step": 17300}
{"learning_rate": 0.00013251707694290503, "loss": 0.6958832108974456, "step": 17400}
{"learning_rate": 0.00013200154659105554, "loss": 0.6944126284122467, "step": 17500}
{"learning_rate": 0.0001314860162392061, "loss": 0.695941561460495, "step": 17600}
{"learning_rate": 0.00013097048588735664, "loss": 0.6938049125671387, "step": 17700}
{"learning_rate": 0.00013045495553550715, "loss": 0.6943626815080642, "step": 17800}
{"learning_rate": 0.00012993942518365768, "loss": 0.6939664018154145, "step": 17900}
{"learning_rate": 0.00012942389483180824, "loss": 0.6929499858617783, "step": 18000}
{"learning_rate": 0.00012890836447995875, "loss": 0.7006488847732544, "step": 18100}
{"learning_rate": 0.0001283928341281093, "loss": 0.6944272422790527, "step": 18200}
{"learning_rate": 0.00012787730377625983, "loss": 0.695071604847908, "step": 18300}
{"learning_rate": 0.00012736177342441036, "loss": 0.6951347279548645, "step": 18400}
{"learning_rate": 0.0001268462430725609, "loss": 0.6943277144432067, "step": 18500}
{"learning_rate": 0.00012633071272071143, "loss": 0.6938108086585999, "step": 18600}
{"learning_rate": 0.000125815182368862, "loss": 0.6953209698200226, "step": 18700}
{"learning_rate": 0.0001252996520170125, "loss": 0.6955448627471924, "step": 18800}
{"learning_rate": 0.00012478412166516304, "loss": 0.6940515089035034, "step": 18900}
{"learning_rate": 0.00012426859131331358, "loss": 0.6950866854190827, "step": 19000}
{"learning_rate": 0.0001237530609614641, "loss": 0.695438135266304, "step": 19100}
{"learning_rate": 0.00012323753060961465, "loss": 0.6947926729917526, "step": 19200}
{"learning_rate": 0.00012272200025776519, "loss": 0.6947897309064865, "step": 19300}
{"learning_rate": 0.00012220646990591572, "loss": 0.6943871974945068, "step": 19400}
{"learning_rate": 0.00012169093955406626, "loss": 0.6953035587072373, "step": 19500}
{"learning_rate": 0.0001211754092022168, "loss": 0.694454043507576, "step": 19600}
{"learning_rate": 0.00012065987885036732, "loss": 0.6951151895523071, "step": 19700}
{"learning_rate": 0.00012014434849851785, "loss": 0.6936695420742035, "step": 19800}
{"learning_rate": 0.0001196288181466684, "loss": 0.6949385398626328, "step": 19900}
{"learning_rate": 0.00011911328779481892, "loss": 0.6942293000221252, "step": 20000}
{"learning_rate": 0.00011859775744296946, "loss": 0.694293304681778, "step": 20100}
{"learning_rate": 0.00011808222709112, "loss": 0.6939092284440994, "step": 20200}
{"learning_rate": 0.00011756669673927052, "loss": 0.6942078387737274, "step": 20300}
{"learning_rate": 0.00011705116638742107, "loss": 0.6950744992494583, "step": 20400}
{"learning_rate": 0.0001165356360355716, "loss": 0.6944970977306366, "step": 20500}
{"learning_rate": 0.00011602010568372213, "loss": 0.6949332165718078, "step": 20600}
{"learning_rate": 0.00011550457533187266, "loss": 0.6947655683755875, "step": 20700}
{"learning_rate": 0.00011498904498002321, "loss": 0.6931582671403885, "step": 20800}
{"learning_rate": 0.00011447351462817375, "loss": 0.6942216354608536, "step": 20900}
{"learning_rate": 0.00011395798427632427, "loss": 0.6936932241916657, "step": 21000}
{"learning_rate": 0.0001134424539244748, "loss": 0.6959827470779419, "step": 21100}
{"learning_rate": 0.00011292692357262536, "loss": 0.6948721438646317, "step": 21200}
{"learning_rate": 0.00011241139322077588, "loss": 0.6947108006477356, "step": 21300}
{"learning_rate": 0.00011189586286892641, "loss": 0.6948240506649017, "step": 21400}
{"learning_rate": 0.00011138033251707696, "loss": 0.693189348578453, "step": 21500}
{"learning_rate": 0.00011086480216522747, "loss": 0.6939473843574524, "step": 21600}
{"learning_rate": 0.00011034927181337802, "loss": 0.6945353192090988, "step": 21700}
{"learning_rate": 0.00010983374146152856, "loss": 0.6948973572254181, "step": 21800}
{"learning_rate": 0.00010931821110967908, "loss": 0.6949133116006851, "step": 21900}
{"learning_rate": 0.00010880268075782962, "loss": 0.6938155382871628, "step": 22000}
{"learning_rate": 0.00010828715040598017, "loss": 0.6946255797147751, "step": 22100}
{"learning_rate": 0.00010777162005413069, "loss": 0.694715086221695, "step": 22200}
{"learning_rate": 0.00010725608970228122, "loss": 0.6943990218639374, "step": 22300}
{"learning_rate": 0.00010674055935043177, "loss": 0.6945424020290375, "step": 22400}
{"learning_rate": 0.00010622502899858228, "loss": 0.6941477972269058, "step": 22500}
{"learning_rate": 0.00010570949864673283, "loss": 0.6941612839698792, "step": 22600}
{"learning_rate": 0.00010519396829488337, "loss": 0.6933134979009629, "step": 22700}
{"learning_rate": 0.00010467843794303389, "loss": 0.6942899340391159, "step": 22800}
{"learning_rate": 0.00010416290759118443, "loss": 0.6938205045461655, "step": 22900}
{"learning_rate": 0.00010364737723933498, "loss": 0.6929915601015091, "step": 23000}
{"learning_rate": 0.0001031318468874855, "loss": 0.6935181415081024, "step": 23100}
{"learning_rate": 0.00010261631653563604, "loss": 0.6933096051216125, "step": 23200}
{"learning_rate": 0.00010210078618378658, "loss": 0.6937584692239761, "step": 23300}
{"learning_rate": 0.00010158525583193712, "loss": 0.6939013832807541, "step": 23400}
{"learning_rate": 0.00010106972548008764, "loss": 0.6933817720413208, "step": 23500}
{"learning_rate": 0.00010055419512823818, "loss": 0.6942682832479476, "step": 23600}
{"learning_rate": 0.00010003866477638873, "loss": 0.6942080450057984, "step": 23700}
{"learning_rate": 9.952313442453925e-05, "loss": 0.69455681681633, "step": 23800}
{"learning_rate": 9.900760407268979e-05, "loss": 0.6944382125139237, "step": 23900}
{"learning_rate": 9.849207372084032e-05, "loss": 0.6950533038377762, "step": 24000}
{"learning_rate": 9.797654336899086e-05, "loss": 0.6938429665565491, "step": 24100}
{"learning_rate": 9.74610130171414e-05, "loss": 0.6940653598308564, "step": 24200}
{"learning_rate": 9.694548266529192e-05, "loss": 0.6934658801555633, "step": 24300}
{"learning_rate": 9.642995231344247e-05, "loss": 0.6938647544384002, "step": 24400}
{"learning_rate": 9.591442196159299e-05, "loss": 0.6943705379962921, "step": 24500}
{"learning_rate": 9.539889160974353e-05, "loss": 0.6939336389303208, "step": 24600}
{"learning_rate": 9.488336125789406e-05, "loss": 0.6934653830528259, "step": 24700}
{"learning_rate": 9.43678309060446e-05, "loss": 0.6944891929626464, "step": 24800}
{"learning_rate": 9.385230055419513e-05, "loss": 0.6937082558870316, "step": 24900}
{"learning_rate": 9.333677020234567e-05, "loss": 0.6944995999336243, "step": 25000}
{"learning_rate": 9.28212398504962e-05, "loss": 0.6935051149129867, "step": 25100}
{"learning_rate": 9.230570949864674e-05, "loss": 0.695109441280365, "step": 25200}
{"learning_rate": 9.179017914679728e-05, "loss": 0.6935811567306519, "step": 25300}
{"learning_rate": 9.12746487949478e-05, "loss": 0.6938967287540436, "step": 25400}
{"learning_rate": 9.075911844309835e-05, "loss": 0.6920149976015091, "step": 25500}
{"learning_rate": 9.024358809124887e-05, "loss": 0.6942734509706497, "step": 25600}
{"learning_rate": 8.972805773939941e-05, "loss": 0.6941861057281494, "step": 25700}
{"learning_rate": 8.921252738754994e-05, "loss": 0.6946189391613007, "step": 25800}
{"learning_rate": 8.869699703570048e-05, "loss": 0.6934875655174255, "step": 25900}
{"learning_rate": 8.818146668385102e-05, "loss": 0.6925832593441009, "step": 26000}
{"learning_rate": 8.766593633200155e-05, "loss": 0.6939669120311737, "step": 26100}
{"learning_rate": 8.715040598015209e-05, "loss": 0.6940711295604706, "step": 26200}
{"learning_rate": 8.663487562830262e-05, "loss": 0.6944798254966735, "step": 26300}
{"learning_rate": 8.611934527645316e-05, "loss": 0.6943936222791671, "step": 26400}
{"learning_rate": 8.560381492460368e-05, "loss": 0.6952890658378601, "step": 26500}
{"learning_rate": 8.508828457275423e-05, "loss": 0.6919172084331513, "step": 26600}
{"learning_rate": 8.457275422090475e-05, "loss": 0.6943334257602691, "step": 26700}
{"learning_rate": 8.405722386905529e-05, "loss": 0.6939889740943909, "step": 26800}
{"learning_rate": 8.354169351720583e-05, "loss": 0.6935403031110764, "step": 26900}
{"learning_rate": 8.302616316535636e-05, "loss": 0.694214860200882, "step": 27000}
{"learning_rate": 8.25106328135069e-05, "loss": 0.6940362697839737, "step": 27100}
{"learning_rate": 8.199510246165743e-05, "loss": 0.6946299654245377, "step": 27200}
{"learning_rate": 8.147957210980797e-05, "loss": 0.6942669183015824, "step": 27300}
{"learning_rate": 8.09640417579585e-05, "loss": 0.6935280251502991, "step": 27400}
{"learning_rate": 8.044851140610904e-05, "loss": 0.6940082788467408, "step": 27500}
{"learning_rate": 7.993298105425957e-05, "loss": 0.6944767034053803, "step": 27600}
{"learning_rate": 7.941745070241011e-05, "loss": 0.6936148363351822, "step": 27700}
{"learning_rate": 7.890192035056064e-05, "loss": 0.6940289860963822, "step": 27800}
{"learning_rate": 7.838638999871117e-05, "loss": 0.693521146774292, "step": 27900}
{"learning_rate": 7.787085964686171e-05, "loss": 0.6940224194526672, "step": 28000}
{"learning_rate": 7.735532929501225e-05, "loss": 0.694213599562645, "step": 28100}
{"learning_rate": 7.683979894316278e-05, "loss": 0.693992674946785, "step": 28200}
{"learning_rate": 7.632426859131332e-05, "loss": 0.6937928247451782, "step": 28300}
{"learning_rate": 7.580873823946385e-05, "loss": 0.6941070693731308, "step": 28400}
{"learning_rate": 7.529320788761439e-05, "loss": 0.6942984628677368, "step": 28500}
{"learning_rate": 7.477767753576493e-05, "loss": 0.6934936100244522, "step": 28600}
{"learning_rate": 7.426214718391545e-05, "loss": 0.6939462494850158, "step": 28700}
{"learning_rate": 7.3746616832066e-05, "loss": 0.6931078684329987, "step": 28800}
{"learning_rate": 7.323108648021653e-05, "loss": 0.693705386519432, "step": 28900}
{"learning_rate": 7.271555612836706e-05, "loss": 0.6933625149726867, "step": 29000}
{"learning_rate": 7.22000257765176e-05, "loss": 0.6938331001996993, "step": 29100}
{"learning_rate": 7.168449542466813e-05, "loss": 0.6939905261993409, "step": 29200}
{"learning_rate": 7.116896507281866e-05, "loss": 0.6928278946876526, "step": 29300}
{"learning_rate": 7.06534347209692e-05, "loss": 0.6933250719308853, "step": 29400}
{"learning_rate": 7.013790436911974e-05, "loss": 0.6942643219232559, "step": 29500}
{"learning_rate": 6.962237401727027e-05, "loss": 0.6942366039752961, "step": 29600}
{"learning_rate": 6.910684366542081e-05, "loss": 0.6942061775922775, "step": 29700}
{"learning_rate": 6.859131331357134e-05, "loss": 0.6938119745254516, "step": 29800}
{"learning_rate": 6.807578296172188e-05, "loss": 0.6939305770397186, "step": 29900}
{"learning_rate": 6.756025260987242e-05, "loss": 0.6941372758150101, "step": 30000}
{"learning_rate": 6.704472225802294e-05, "loss": 0.6937731575965881, "step": 30100}
{"learning_rate": 6.652919190617349e-05, "loss": 0.6935141772031784, "step": 30200}
{"learning_rate": 6.601366155432401e-05, "loss": 0.6935466122627258, "step": 30300}
{"learning_rate": 6.549813120247455e-05, "loss": 0.6940394848585129, "step": 30400}
{"learning_rate": 6.498260085062508e-05, "loss": 0.6936585700511932, "step": 30500}
{"learning_rate": 6.446707049877562e-05, "loss": 0.6938036584854126, "step": 30600}
{"learning_rate": 6.395154014692615e-05, "loss": 0.6929751324653626, "step": 30700}
{"learning_rate": 6.343600979507669e-05, "loss": 0.6937020355463028, "step": 30800}
{"learning_rate": 6.292047944322723e-05, "loss": 0.6939474183320999, "step": 30900}
{"learning_rate": 6.240494909137776e-05, "loss": 0.6937033146619797, "step": 31000}
{"learning_rate": 6.18894187395283e-05, "loss": 0.6927569437026978, "step": 31100}
{"learning_rate": 6.137388838767882e-05, "loss": 0.6935778671503067, "step": 31200}
{"learning_rate": 6.085835803582936e-05, "loss": 0.6924829059839248, "step": 31300}
{"learning_rate": 6.03428276839799e-05, "loss": 0.6932672679424285, "step": 31400}
{"learning_rate": 5.982729733213043e-05, "loss": 0.6932972466945648, "step": 31500}
{"learning_rate": 5.931176698028097e-05, "loss": 0.6940431755781173, "step": 31600}
{"learning_rate": 5.87962366284315e-05, "loss": 0.6937311619520188, "step": 31700}
{"learning_rate": 5.828070627658203e-05, "loss": 0.6931491333246231, "step": 31800}
{"learning_rate": 5.776517592473257e-05, "loss": 0.6931317842006683, "step": 31900}
{"learning_rate": 5.72496455728831e-05, "loss": 0.694061838388443, "step": 32000}
{"learning_rate": 5.6734115221033645e-05, "loss": 0.693862590789795, "step": 32100}
{"learning_rate": 5.6218584869184174e-05, "loss": 0.6939488059282303, "step": 32200}
{"learning_rate": 5.570305451733471e-05, "loss": 0.6930150347948074, "step": 32300}
{"learning_rate": 5.518752416548525e-05, "loss": 0.69360311627388, "step": 32400}
{"learning_rate": 5.467199381363578e-05, "loss": 0.6929512268304825, "step": 32500}
{"learning_rate": 5.415646346178631e-05, "loss": 0.6938000392913818, "step": 32600}
{"learning_rate": 5.3640933109936854e-05, "loss": 0.693729214668274, "step": 32700}
{"learning_rate": 5.312540275808738e-05, "loss": 0.6935598164796829, "step": 32800}
{"learning_rate": 5.260987240623791e-05, "loss": 0.6932873320579529, "step": 32900}
{"learning_rate": 5.2094342054388455e-05, "loss": 0.6936816495656967, "step": 33000}
{"learning_rate": 5.1578811702538984e-05, "loss": 0.6935984808206558, "step": 33100}
{"learning_rate": 5.106328135068953e-05, "loss": 0.6930211931467056, "step": 33200}
{"learning_rate": 5.054775099884006e-05, "loss": 0.6935324704647065, "step": 33300}
{"learning_rate": 5.003222064699059e-05, "loss": 0.6934433990716934, "step": 33400}
{"learning_rate": 4.951669029514113e-05, "loss": 0.6938003820180892, "step": 33500}
{"learning_rate": 4.9001159943291664e-05, "loss": 0.6991971051692962, "step": 33600}
{"learning_rate": 4.84856295914422e-05, "loss": 0.6955929601192474, "step": 33700}
{"learning_rate": 4.797009923959273e-05, "loss": 0.6960605102777481, "step": 33800}
{"learning_rate": 4.7454568887743265e-05, "loss": 0.6946552872657776, "step": 33900}
{"learning_rate": 4.69390385358938e-05, "loss": 0.6951100254058837, "step": 34000}
{"learning_rate": 4.642350818404434e-05, "loss": 0.6946106946468353, "step": 34100}
{"learning_rate": 4.5907977832194873e-05, "loss": 0.6933953738212586, "step": 34200}
{"learning_rate": 4.539244748034541e-05, "loss": 0.6938926738500595, "step": 34300}
{"learning_rate": 4.4876917128495945e-05, "loss": 0.6937662249803543, "step": 34400}
{"learning_rate": 4.436138677664648e-05, "loss": 0.6954003483057022, "step": 34500}
{"learning_rate": 4.384585642479701e-05, "loss": 0.6942340284585953, "step": 34600}
{"learning_rate": 4.333032607294755e-05, "loss": 0.6935677117109299, "step": 34700}
{"learning_rate": 4.281479572109808e-05, "loss": 0.6936280846595764, "step": 34800}
{"learning_rate": 4.229926536924861e-05, "loss": 0.6942207562923431, "step": 34900}
{"learning_rate": 4.178373501739915e-05, "loss": 0.6937257647514343, "step": 35000}
{"learning_rate": 4.1268204665549684e-05, "loss": 0.6941741323471069, "step": 35100}
{"learning_rate": 4.075267431370022e-05, "loss": 0.6938879925012589, "step": 35200}
{"learning_rate": 4.0237143961850756e-05, "loss": 0.6927222561836243, "step": 35300}
{"learning_rate": 3.972161361000129e-05, "loss": 0.6938156253099441, "step": 35400}
{"learning_rate": 3.920608325815183e-05, "loss": 0.6936214190721511, "step": 35500}
{"learning_rate": 3.8690552906302364e-05, "loss": 0.694928520321846, "step": 35600}
{"learning_rate": 3.817502255445289e-05, "loss": 0.6939738488197327, "step": 35700}
{"learning_rate": 3.765949220260343e-05, "loss": 0.6946884405612945, "step": 35800}
{"learning_rate": 3.7143961850753965e-05, "loss": 0.6938414561748505, "step": 35900}
{"learning_rate": 3.6628431498904494e-05, "loss": 0.6945059078931809, "step": 36000}
{"learning_rate": 3.611290114705504e-05, "loss": 0.6935639744997024, "step": 36100}
{"learning_rate": 3.559737079520557e-05, "loss": 0.693930697441101, "step": 36200}
{"learning_rate": 3.508184044335611e-05, "loss": 0.6934611493349075, "step": 36300}
{"learning_rate": 3.456631009150664e-05, "loss": 0.6929791033267975, "step": 36400}
{"learning_rate": 3.4050779739657174e-05, "loss": 0.6938034987449646, "step": 36500}
{"learning_rate": 3.353524938780771e-05, "loss": 0.6936759740114212, "step": 36600}
{"learning_rate": 3.301971903595824e-05, "loss": 0.6940606141090393, "step": 36700}
{"learning_rate": 3.2504188684108776e-05, "loss": 0.6928446072340012, "step": 36800}
{"learning_rate": 3.198865833225931e-05, "loss": 0.6942046236991882, "step": 36900}
{"learning_rate": 3.147312798040985e-05, "loss": 0.6929558277130127, "step": 37000}
{"learning_rate": 3.0957597628560384e-05, "loss": 0.6937725967168809, "step": 37100}
{"learning_rate": 3.0442067276710916e-05, "loss": 0.6934615093469619, "step": 37200}
{"learning_rate": 2.9926536924861452e-05, "loss": 0.6940130072832108, "step": 37300}
{"learning_rate": 2.941100657301199e-05, "loss": 0.6925313198566436, "step": 37400}
{"learning_rate": 2.889547622116252e-05, "loss": 0.6936526983976364, "step": 37500}
{"learning_rate": 2.8379945869313057e-05, "loss": 0.6939217108488083, "step": 37600}
{"learning_rate": 2.7864415517463593e-05, "loss": 0.6938716804981232, "step": 37700}
{"learning_rate": 2.7348885165614125e-05, "loss": 0.69281702876091, "step": 37800}
{"learning_rate": 2.683335481376466e-05, "loss": 0.6941372233629227, "step": 37900}
{"learning_rate": 2.6317824461915197e-05, "loss": 0.6936556702852249, "step": 38000}
{"learning_rate": 2.5802294110065733e-05, "loss": 0.6929066628217697, "step": 38100}
{"learning_rate": 2.5286763758216263e-05, "loss": 0.6944833081960679, "step": 38200}
{"learning_rate": 2.4771233406366802e-05, "loss": 0.6939995032548905, "step": 38300}
{"learning_rate": 2.4255703054517338e-05, "loss": 0.692908741235733, "step": 38400}
{"learning_rate": 2.374017270266787e-05, "loss": 0.6946476805210113, "step": 38500}
{"learning_rate": 2.3224642350818407e-05, "loss": 0.693791915178299, "step": 38600}
{"learning_rate": 2.270911199896894e-05, "loss": 0.6936742180585861, "step": 38700}
{"learning_rate": 2.2193581647119475e-05, "loss": 0.6933111488819123, "step": 38800}
{"learning_rate": 2.167805129527001e-05, "loss": 0.6935337150096893, "step": 38900}
{"learning_rate": 2.1162520943420544e-05, "loss": 0.6939919608831405, "step": 39000}
{"learning_rate": 2.064699059157108e-05, "loss": 0.6937378084659577, "step": 39100}
{"learning_rate": 2.0131460239721613e-05, "loss": 0.6941754966974258, "step": 39200}
{"learning_rate": 1.9615929887872152e-05, "loss": 0.6937722772359848, "step": 39300}
{"learning_rate": 1.9100399536022685e-05, "loss": 0.6935488641262054, "step": 39400}
{"learning_rate": 1.858486918417322e-05, "loss": 0.6938992667198182, "step": 39500}
{"learning_rate": 1.8069338832323753e-05, "loss": 0.693024982213974, "step": 39600}
{"learning_rate": 1.755380848047429e-05, "loss": 0.6933804148435593, "step": 39700}
{"learning_rate": 1.7038278128624825e-05, "loss": 0.6937606805562972, "step": 39800}
{"learning_rate": 1.6522747776775358e-05, "loss": 0.6930754858255387, "step": 39900}
{"learning_rate": 1.6007217424925894e-05, "loss": 0.6933560097217559, "step": 40000}
{"learning_rate": 1.5491687073076426e-05, "loss": 0.6936141043901444, "step": 40100}
{"learning_rate": 1.4976156721226964e-05, "loss": 0.69259372651577, "step": 40200}
{"learning_rate": 1.4460626369377497e-05, "loss": 0.6934698694944381, "step": 40300}
{"learning_rate": 1.3945096017528034e-05, "loss": 0.692716588973999, "step": 40400}
{"learning_rate": 1.3429565665678567e-05, "loss": 0.693301666378975, "step": 40500}
{"learning_rate": 1.2914035313829101e-05, "loss": 0.693270617723465, "step": 40600}
{"learning_rate": 1.2398504961979637e-05, "loss": 0.6940693652629852, "step": 40700}
{"learning_rate": 1.1882974610130172e-05, "loss": 0.6927421498298645, "step": 40800}
{"learning_rate": 1.1367444258280708e-05, "loss": 0.6931837040185929, "step": 40900}
{"learning_rate": 1.0851913906431242e-05, "loss": 0.693362221121788, "step": 41000}
{"learning_rate": 1.0336383554581776e-05, "loss": 0.6932171028852463, "step": 41100}
{"learning_rate": 9.82085320273231e-06, "loss": 0.6935684406757354, "step": 41200}
{"learning_rate": 9.305322850882847e-06, "loss": 0.6938198703527451, "step": 41300}
{"learning_rate": 8.78979249903338e-06, "loss": 0.6934957838058472, "step": 41400}
{"learning_rate": 8.274262147183915e-06, "loss": 0.6932735008001327, "step": 41500}
{"learning_rate": 7.758731795334451e-06, "loss": 0.6932463425397873, "step": 41600}
{"learning_rate": 7.2432014434849854e-06, "loss": 0.6936384433507919, "step": 41700}
{"learning_rate": 6.727671091635521e-06, "loss": 0.6933966648578643, "step": 41800}
{"learning_rate": 6.212140739786055e-06, "loss": 0.693065015077591, "step": 41900}
{"learning_rate": 5.69661038793659e-06, "loss": 0.6930826485157013, "step": 42000}
{"learning_rate": 5.181080036087125e-06, "loss": 0.6935008931159973, "step": 42100}
{"learning_rate": 4.6655496842376595e-06, "loss": 0.6931707191467286, "step": 42200}
{"learning_rate": 4.150019332388195e-06, "loss": 0.6932642602920532, "step": 42300}
{"learning_rate": 3.634488980538729e-06, "loss": 0.6932427972555161, "step": 42400}
{"learning_rate": 3.118958628689264e-06, "loss": 0.6932768583297729, "step": 42500}
{"learning_rate": 2.6034282768397993e-06, "loss": 0.6935637044906616, "step": 42600}
{"learning_rate": 2.0878979249903336e-06, "loss": 0.6936329805850983, "step": 42700}
{"learning_rate": 1.5723675731408688e-06, "loss": 0.6937804961204529, "step": 42800}
{"learning_rate": 1.0568372212914035e-06, "loss": 0.693507427573204, "step": 42900}
{"learning_rate": 5.413068694419384e-07, "loss": 0.6930651545524598, "step": 43000}
{"learning_rate": 2.577651759247326e-08, "loss": 0.6932080692052841, "step": 43100}
============================================================
