# DONE
```
✓ clean up tmp files
✓ fix the build / update_symlink ergonomics
✓ upload to github
✓ support ? for keyboard help
✓ support c to compile
✓ add a license
✓ single keystroke to ingest new code from clipboard "R"
✓ links to discord/twitch/yt in repo
✓ triple-quoted strings
✓ implement "/" for search -- YES! we have search
✓ block ends at the next one
✓ the v thing because things going together
✓ fit the block on the screen
✓ parametrize the project / binary / code names, etc
✓ parse .cmpr/conf
✓ support --conf <alternate_conf_file>
✓ fix build
✓ fix or 'r' get_code
✓ Python or C? later: arbitrary block defn
✓ eventually the config will just be a block (?) but for now we'll use .cmpr in the current directory
✓ finish all the settings
✓ push v2-b1 to github
✓ parametrize block_comment_part same as finding the blocks
✓ clean up the code a bit
✓ use it for chess_bpa
✓ test it with a Python project
✓ push v2 to github
✓ add "--init"
✓ touch .cmpr/conf
✓ create .cmpr/{revs,tmp}
✗ set those conf vars and save
✓ add 'g'
✓ add 'G' (w/o numeric args)
✓ genericize the arena allocation of array types
✓ move this back into library code
✓ Python crashing when only """ in block
✓ "library" support (= more than one file per project)
✓ list of files in the conf, bring in blocks from each one in order
✓ block type is per file, not per project
✓ language build tool will manage dependencies, so we don't have any feature work here
✓ directory tree support: support projects in directories by reading everything into one buffer
✓ don't symlink but just copy the new version (e.g. we shouldn't be committing a symlink for the main project file in git)
✓ support an alternative conf file location
✓ handle syntax highlighting / filetype issues: put an extension (e.g. .c or .py) on the tmp filename
✓ push v3 to github
✓ decide how to handle structs / other things that aren't functions
✓ add "?" (keyboard help)
✓ pagination
✓ implement space/b (down/back by page)
✓ add a build timestamp to the version string
✓ add a Makefile
✓ publish TODO in the repo
✓ push v4
✓ make sure there is at least one file (in check_conf_vars)
✓ when the conf file doesn't exist report the error better
✓ push v4 again
✓ add JS support
✓ test via 2048 demo
✓ add
✓ --print-block N
✓ --count-blocks
✓ --print-code N
✓ --print-comment N
✓ --find-block <literal>
✓ add :bootstrap
✓ try mawk manually
✓ local bootstrap script for cmpr itself
✓ system() -> send to model (currently send_to_clipboard)
✓ handle the multiple search result issue -- just make 'n' and 'N' work
✗ the search should show the current matching line, in the center of the display
✗ n goes to the next, which may or may not be in the same block
✗ enter sets the current index but can also set a pagination mode
✗ e might also do a +N or equivalent
✓ when switching blocks we should reset pagination (per-block(?))
✓ R should ensure that there is a newline at the end
✓ fix ruler position bug
✓ first version of bootstrap can be the first extended (ex commands) :bootstrap
✓ push v5
✓ fix --init
✓ 'v' by accident is annoying
✓ push v6
✓ intro video -- youtube (language switch)
✓ fix json_o_extend interface
✓ support API usage
✓ support LLM rewrite via API
✓ pick a config approach
✓ #send_to_llm
✓ clean up JSON lib
✓ store API req/res pairs
✓ create dirs on startup
✓ test JSON parsing and handle the response
✓ handle errors
✓ strip the markdown
✗ make r not repeat (rate limit)
✓ add GPT4
✓ bootstrap
✓ probably remove the delay after llm
✓ push code
✓ replace libcurl with curl
✓ update README
✓ fix 'h' (it's '?' not 'h'...) (this is now displayed on the ruler)
✓ strip config directories
✓ support arrow keys in :model
✓ the respond with OK thing
✓ remove tmpdir and revdir; use cmprdir
✓ push v7
✓ fix build issues:
✓ clang
✓ asan
✓ dist directory
✓ ? still lists undo, v, S
✓ DATALOSS.md and systemprompt should be in repo
✓ push
✓ unbreak pagination after I edited the code
✓ push
✓ support ollama API
✓ support ollamas conf var
✓ support llama.cpp server
✓ update :model
✓ block references
✓ add :expandrefs to see it
✓ implement expand_refs()
✓ update comment_to_prompt to expand refs
✓ test using langtable
✓ #find_blocks_language
✓ #tmp_filename
✓ #block_comment_part
✓ add depth to expand_refs_rec
✓ make chase_refs support :code and :all
✓ add block_by_id
✓ add block_transforms(b,f)
✓ add :expand, :toprefs, :inrefs
✓ add "#" feature to jump to a block by id (as a menu)
✓ turn off bootstrap on startup
✓ make .n start at 0 (in generic arrays)
✓ checksum for block content in memory and on disk
✓ pick a placeholder or simple approach: SipHash?
✓ get something to compile
✓ checksum blocks
✓ checksum files, blocks, lines in all revs
✓ fix the first-line thing on @id:code
✓ replace spans arena with generic implementation
✓ dataloss pass: think through everything; addresses control issue
✓ visibility into what's happening with process control; be chatty around tmp files
✓ user visibility into what's in the files and when it got there
✓ "diffs and dates" (so far this is the "U" feature, but should be more)
✓ update ? short help
✓ merge 2 PRs
✓ bugfix :inrefs, :toprefs crashes (removed these)
✓ chmod issue on Windows (exec bit)
✓ bump version and push v8
✓ basic diff features
✓ add "U" with a listing, most-recent first
✓ checksum and index blocks
✓ design how it will work
✓ implement checksums for blocks, lines, files (in #checksum_code)
✓ implement checksums for each rev
✓ implement "find previous version of block"
✓ support j/k, q, Enter
✓ render timestamps
✓ deal with non-adjacent duplication
✓ show the progress
✓ numbers shouldn't stop short
✓ optimization (caching?)
✓ put the whole thing behind a flag or disable it
✓ re-enable with performance cost paid only when using the feature
✓ integrate sqlite
✓ remove sqlite
✓ improve revblock caching and re-enable undo feature
✓ look into memory allocation and editor fork issues, build fixes (use vfork)
✓ explore "prompt palette" idea
✓ implement one or two by hand
✓ agreement()
✓ duplicate rewrite_current_block_with_llm, comment_to_prompt, and make call_llm take a callback function
✓ implement minimal expand_prompt_template and parse_prompt_template
✓ add filename_template
✓ add filename_variables and a constant now value per main loop iteration
✓ write the output in outputs/
✓ agreement_to_nl_diff
✓ look up the output of agreement()
✓ #output_template_var
✓ #lookup_output
✓ #current_block_checksum
✓ #read_output_headers
✓ #read_output_body
✓ #proposed_diff_SAV
✓ extend #get_prompt_template
✓ read default templates in from files instead of putting them into the binary; makes it easy for extension authors to add them
✓ agreement_to_pl_diff
✓ bug: seems reproducible by doing an undo on a block, then immediately editing it; Error: Inconsistent state after updating file contents.
✓ add new inp_sanity_checks
✓ nl2algo
✓ get prompt templates from files under prompts/ instead
✓ add dir_listing
✗ give the output an id or a "dash-continuation" or something
✗ actually add the output to the file, replacing any already there
✓ SEGV on example block, U, k, which happens every second time; something to do with revblock ids
✓ implement j/k order (i.e. unified file/block order, empty states for files and projects)
✓ delete existing current_index from state, and then fix everywhere that it was used (~50 places)
✓ make 'r' implementation match the other palette ops
✓ implement nl2pl_rewrite
✓ make 'r' simply call that
✓ implement summarize_block
✓ testing palette features
✓ pl2nl_rewrite
✓ support Anthropic's Claude
✓ dedup transitive block references
```
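The block-reference items above (expand_refs, the depth added to expand_refs_rec, and the dedup of transitive references) can be illustrated with a small sketch. This is not cmpr's implementation, which is written in C. It assumes that blocks are stored in a dict keyed by id and that a reference is written as `@id` inside a block's text; the names `expand_refs`, `REF`, and `max_depth` here are hypothetical and chosen only for illustration.

```python
import re

# Assumption: a reference looks like "@some_id" in the block text.
REF = re.compile(r"@(\w+)")

def expand_refs(block_id, blocks, max_depth=8):
    """Return ids reachable from block_id, each listed once, in first-visit order.

    blocks: dict mapping block id -> block text (hypothetical storage).
    max_depth: how many reference hops to follow from the starting block.
    """
    seen = []

    def visit(bid, depth):
        # Skip duplicates, unknown ids, and anything past the depth limit.
        if bid in seen or bid not in blocks or depth > max_depth:
            return
        seen.append(bid)
        for ref in REF.findall(blocks[bid]):
            visit(ref, depth + 1)

    visit(block_id, 0)
    return seen

blocks = {
    "a": "uses @b and @c",
    "b": "uses @c",
    "c": "leaf, but refers back to @a",
}
print(expand_refs("a", blocks))
```

Tracking visited ids serves two purposes in this sketch: a block referenced along several paths is included only once, and a reference cycle cannot recurse forever.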
# CURRENTLY
→ llm.c (in git) to a set of blocks in the shortest time
# FUTURE
```
add block dedup to relnotes
perhaps: add a marker in block top line, such that any references following it are implementation-only (i.e. not transitive context)
perhaps: the same thing in the body, indicating where the block "ends" for purposes of callers
implement global context pragma
revise README
if bootstrap output is empty and bootstrap command isn't then run the bootstrap before any llm API calls
start to introduce block concept in our prompts
follow block ids back in time (in "U") (actually no, the decision was intentional, but maybe document it?)
add machinery for maintaining block one-line summaries (block_id, checksum_prefix, summary...)
add palette op for suggesting context based on one-line summaries
perhaps add summaries to the block jump list?
bump version and push v9
manually blockized the train_gpt2.c file; this could almost trivially be automated; maybe a script that we ship (as a block(?))
add cmpr --usage to give a nice output breaking down tokens used per model and by day or week
our markdown blockizer is actually broken; e.g. the file "#test\n```c\n#include <stdio.h>\n```\n" will include two blocks but should be just one.
add {bcs} and {bce} block comment delimiters (block comment start, end) per language as prompt template variables
add {NL} and {PL} as template expansion variables
check for C-specific stuff in our palette ops
perhaps we allow "@- " comment syntax to be indented (using trim first on each line)
proposed diff handling
look up proposed diff to current block, if any
implement the patch application and testing for clean application
write the block to a file
extract the patch to a file
call patch with `patch orig_file patch_file` and check the exit status
show some indication in the UI
give some UI affordance to apply the diff
add "diff" palette feature (the first non-LLM palette operation)
basically we want a "@llm_op" context that will create a nice English DSL for writing ops (we can derive this from existing impls)
get rid of .bak when we write a rev (false sense of security anyway, once you save twice...) (let's wait until we detect files changing on disk)
when ' goes to the palette we should select the previous op; when ' is hit again we should run it; this makes '' similar to "@@" or "." in vim
add a "cmpr and git" section to the README: include conf, probably exclude the rest
data loss when two cmprs open (related to rev currency check) (or when cmpr backgrounded and separately vimming, which bit me)
build cmpr.ai using cmpr (and ideally deploy it the same way)
commenting stuff out in a block that's going into a bootstrap doesn't work! (shift over to gcb?)
get rid of all the prompts on initial startup
scope the templating work (first pass: template our own stuff)
change the "----" thing now that chatgpt does markdown everywhere
make opening the file instead of the block an option ('E'?)
Arena overflow for spans
I think flush should probably never apply to cmp, only out
"Error: Block does not belong to any file." happened
context: added a space before a "#" in the last markdown block in the last file, then :wq
Also there was an error when creating a new file in a Python project (because the file was empty, I think)
make "@" be a menu of references from this block
"@@" or sth is menu of blocks that refer to us
automatically suggest regen of stale downstream blocks
prt_exit() is probably actually a good idea
keep ollama model loaded
make LLM calls asynchronous
clipboard model set automatically
add :allfiles to autopopulate conf, and an empty state (?)
~/.cmpr for top-level conf
send everything to LLM and create a bootstrap? ctags?
ship with some kind of bootstrap bootstrap?
jumping between files not just blocks
idea: maybe go from spec to list of callable functions, approve this, then implementation; allows injecting library call documentation
idea: (maybe optionally) put the cmp highwater in the ruler and then start unleaking span allocs
QA on --init and setting language
turn a template into a shell script with cmpr calls embedded
idea: cmpr --expand takes a template on stdin and returns text on stdout (or --eval) (it makes cmpr a database)
test / fix check_conf_vars (e.g. cmprdir missing)
support other AIs via an external script
idea: when a TODO item is done, put the relevant blocks in a comment
IDE mode (what is this?)
think about LLM refactoring
first steps on the road to total rev awareness 4 4 16
basic summarize feature (at least vjjjjj should give you something maximizing cols*5 chars worth)
bring compiler errors into the workflow (line no -> block, etc) 3 3 9
add translations into ~every language 4 2 8
add 'd', 'p', 'P' so that moving around blocks is easy 4 2 8
add o / O for block insertions 3 2 6
support markdown or HTML "views"
handle deleted (or not-yet-created) files sensibly
unified diff stuff
try getting GPT4 to suggest changes to the comment part
bug: (mkdir pytest; cd pytest; cmpr --init; cmpr # use Python language with .cmpr/conf)
librification / cleanup
"total rev awareness"
start with bpe(?)
https://en.wikipedia.org/wiki/Johnson%E2%80%93Lindenstrauss_lemma
start by determining which revs go to which files
is a set of features starting from an index or rolling hash
indexed at newlines(?)
subsumes diff features, data loss prevention on save, block -> rev indexing
conflicts with catalog file idea (?)
not mint a rev if block unchanged / basic diff features "d" for diff (by def. w/ prev rev) of block
add u for "R"
✓ sort the revdir and print everything
add a catalog file in revs/catalog
every rev gets written to it, and we parse it on startup
might as well start with a (tabular-programmed) set of them:
:bootstrap :addfile :addlib :help :config
actually use the arena allocation to e.g. reclaim cmp space (clipboard handling, etc)
"manual fixup" as a feature: put a "you got this wrong" in the prompt but not the block comment
can piggyback on hashtags features
fix bug: an empty file (e.g. index.html) block is reported belonging to prev, non-empty file
add translations into ~every language
publish training data?
as separate repo
settings mode works by opening the conf file in a buffer and syntax checking it after (visudo style)
to "file:" we add "collection:" (library, folder, ...) which is exactly the same but adds a "dir"
(By "dir" we mean a level of hierarchy in the UI that's closed by default.)
(We can use "lib:" so for example "library: spanio.c" would give a neat "spanio> " in the UI.)
add a ":addfile" or something where we repeat the question about language
add "settings mode" but for files
deal with the function declaration / header issue -> general-purpose code hygiene transforms
support having "principles" that are checked/enforced by the LLM (aka "invariants")
only on block changes for efficiency / over all code by special command
think about using the LLM to write the comment and not just the code (i.e. 'r' and 'R')
check if current file == latest rev on load, if not store a new rev
parametrize comment_to_prompt
make the prompts all config -- or all files in .cmpr
add o / O for block insertions
works the same as editing a block except the file starts empty and instead of replacing we insert
once we show the file in the ruler, always adds to the current file
add support for numbers (e.g. for G, d, etc)
add 'd', 'p', 'P' so that moving around blocks is easy
visual selection mode works with "r"
visual selection mode works with "e"
sometimes you want to unconditionally append, not replace the code part (p? v?)
something like "*" in vim, also "gd"
consider metadata on blocks: prompt "wrapper" functions, custom block end, etc
tagging blocks
bring compiler errors into the workflow (line no -> block, etc)
stderr into a file into a "temporary" block
heuristics identify errors and line numbers in a compiler-agnostic way
FIX code formatting issues once and for all
***** have cmpr handle its own TODO items *****
find the right way to handle blocks (language agnostic, ...) (maybe we have?)
TOC presentation of blocks (LLM summarization???) (kind of handled by "#" feature)
intro video -- mp4 or gif in repo
basic stats on blocks e.g. comment-code ratio
experiment with GPT4 finetunes
```
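The markdown blockizer item above gives a file that yields two blocks but should yield one. A likely cause, stated here as an assumption, is that a line starting with "#" is treated as a block start even when it sits inside a fenced code block (the `#include` line in that example). The sketch below shows one possible fix under that assumption: track whether we are inside a fence and ignore "#" lines there. The function names are hypothetical, and cmpr itself is written in C, so this is only an illustration of the idea.

```python
def naive_block_starts(text):
    """Every line beginning with '#' starts a block (reproduces the bug)."""
    return [i for i, line in enumerate(text.splitlines()) if line.startswith("#")]

def fence_aware_block_starts(text):
    """Like the naive version, but ignores '#' lines inside ``` fences."""
    starts = []
    in_fence = False
    for i, line in enumerate(text.splitlines()):
        if line.lstrip().startswith("```"):
            # A fence line toggles the state and is never a block start.
            in_fence = not in_fence
            continue
        if not in_fence and line.startswith("#"):
            starts.append(i)
    return starts

sample = "#test\n```c\n#include <stdio.h>\n```\n"
print(naive_block_starts(sample))
print(fence_aware_block_starts(sample))
```

On the sample from the TODO item, the naive version reports block starts at lines 0 and 2, while the fence-aware version reports only line 0.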