-
This is what I have been looking for for a long time without taking the time to articulate it, so thanks! Signature files address the performance side but not readability: they separate documentation (type annotations and XML docs) from implementations, so you have to read in two places at once. To me, fully annotating objects in code would give the perfect combination of compiler performance and code readability.
-
We've talked about this a bit more inside G-Research, and there does seem to be an interest in exploring this. It would be great if it can be applied incrementally on a file-per-file level. Turning it on for an entire project might be overkill. The same rules more or less apply when considering a signature file, so not absolutely every file would be a good candidate to be fully top-level typed.
In G-Research/fsharp-analyzers#47, I started playing around with finding untyped top-level functions, to get an initial idea of what changes a file would need before a signature file could be extracted from it. Of course, knowing what is still missing doesn't put the file in the desired end state. Some automated process to convert the file would be ideal. As this involves multiple modifications, I don't think it is easy to pull off in a command-line tool. It might be more interesting to have as IDE actions, so I might experiment with something in FSAC. Afterwards, once your files have been processed and are fully typed, you still want some sort of awareness in your IDE. Warnings should be raised when a non-private function is missing a type annotation, and the IDE should make it very visible whether a function is exposed or not. I'll chronicle some findings in this discussion. Feel free to reach out to me if there is any interest in collaborating on this experiment.
-
One edge case I can already see is that constraints would need to be listed explicitly. Take `let areEqual a b = a.Equals(b)`. Adding type annotations, `let areEqual (a:'a) (b:'b):bool = a.Equals(b)`, would lead to the in-memory signature `val areEqual: 'a -> 'b -> bool`. But this doesn't work because
It needs to produce:
so the type annotation needs to be: `let areEqual<'a, 'b when 'a : equality> (a:'a) (b:'b):bool = a.Equals(b)`. I think all the information needed to deduce this is present in the transformation step; however, it is non-trivial to implement.
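To make the constraint issue concrete, here is a minimal sketch. It deliberately swaps in the `=` operator instead of `.Equals`, because `=` is a case where F# reliably infers an equality constraint; the function names are hypothetical and not taken from the experiment.

```fsharp
// Unannotated: the compiler infers the equality constraint on its own.
// Inferred signature (roughly): val areEqualInferred: a: 'a -> b: 'a -> bool when 'a: equality
let areEqualInferred a b = a = b

// Fully annotated: the type parameter is declared explicitly, so the
// constraint is written out in the source and can be read off directly
// when a signature is extracted from the implementation.
let areEqualExplicit<'a when 'a: equality> (a: 'a) (b: 'a) : bool = a = b

printfn "%b" (areEqualExplicit 1 1)
printfn "%b" (areEqualExplicit "a" "b")
```

The second form is the kind of output an automated annotation step would presumably have to produce: it must carry over not just the parameter and return types but also any constraints the compiler inferred.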
-
Alright, I was able to annotate everything in
I first detected all missing type information via an analyzer. The nice thing here is that I was able to reuse the detection algorithm in both the analyzer and the code fix. After a few tweaks that were still missing in the compiler experiment, I was able to compile the project: Before (sequential):
After (extracted signatures for each file + graph based type-checking):
Typecheck went from
-
I'm regularly compiling around 24,000+ lines of F#, across 7 projects, in about 15s (for Grace). It takes that long if I make a change to the "Shared" project the others all rely on; it's much faster if it's one of the "leaf node" projects on the compilation tree. I always love it when things go faster, like compilation in my favorite language. With that said, if going to the trouble of creating external signature files that I have to keep synced up with source code is only going to save me 2s per compilation (or whatever), I'll never do it. It would take an awful lot of 2s savings to add up to the hours I'd spend keeping those files synced, and the hours I'd spend debugging something that comes down to some weird issue with the files being out of sync. I'm thinking of the pain it can be to generate and sync up OpenAPI specifications and imagining that this is analogous. I love that this is an area for exploration, and I'm not trying to discourage you; it might lead somewhere awesome. I imagine a future where we have LLMs involved in our compilation steps, and I wouldn't object to using an LLM to generate the signature files automatically, only when it's detected that regenerating is required, if that then improved compilation time. But I'll never create those files by hand, and if there's a way to infer this data and cache it (as suggested above), yay! If not, meh, I won't miss it.
-
I'm still exploring this experiment and the results are a bit mixed. I've started by changing the compiler in my fork so that only marked files are processed to extract signatures. When selecting which files to start typing, I look at the longest path in the graph. (Detected via this script). I found this to be an efficient way to speed up the graph while keeping things pragmatic. It is possible to win
Addressing type-checking still makes the most sense, as it is still the largest factor timewise.
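As an illustration of the "longest path in the graph" idea, here is a minimal sketch. It is not the script referred to above; the file names and the `dependencies` map are made up, and it assumes the graph is acyclic (which F# file ordering guarantees within a project).

```fsharp
open System.Collections.Generic

// Hypothetical dependency graph: each file maps to the files it depends on.
let dependencies: Map<string, string list> =
    Map.ofList
        [ "A.fs", []
          "B.fs", [ "A.fs" ]
          "C.fs", [ "A.fs" ]
          "D.fs", [ "B.fs"; "C.fs" ]
          "E.fs", [ "D.fs" ] ]

/// For every file, the longest dependency chain ending at that file
/// (dependencies first). Results are memoised so each file is visited once.
let longestPaths (graph: Map<string, string list>) : Map<string, string list> =
    let cache = Dictionary<string, string list>()

    let rec visit (file: string) : string list =
        match cache.TryGetValue file with
        | true, path -> path
        | false, _ ->
            let deps = graph |> Map.tryFind file |> Option.defaultValue []

            // Keep the longest chain among the dependencies (first one wins on ties).
            let longestDep =
                deps
                |> List.map visit
                |> List.fold (fun best p -> if List.length p > List.length best then p else best) []

            let path = longestDep @ [ file ]
            cache.[file] <- path
            path

    graph |> Map.map (fun file _ -> visit file)

/// The longest chain in the whole graph; throws on an empty graph.
let criticalPath (graph: Map<string, string list>) : string list =
    longestPaths graph |> Map.toList |> List.map snd |> List.maxBy List.length

printfn "%A" (criticalPath dependencies)
```

Files on this chain cannot be type-checked in parallel with each other, so they are the natural first candidates for full annotation.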
-
Continuing the conversation of #16436 (comment).
Short recap: in this experiment, I've extracted in-memory signature files from the implementation files.
Having those could potentially be very beneficial for graph-based type-checking.
One small example I did:
sequential:
graph-based (with in-memory signatures):
Treat this with caution, as the data is insufficient for definitive conclusions. However, it hints at the potential benefits when every file has a signature. Interestingly, this is achieved without any actual signature files, thus avoiding redundancy.
</recap>
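For readers unfamiliar with the idea, here is a small sketch of what "fully typed at the top level" could look like, with the signature one could read off from it shown in comments. The module and function names are hypothetical, and the commented signature is only an approximation of what a compiler would derive in memory.

```fsharp
module Pricing =
    // Private helper: it would not appear in the extracted signature.
    let private roundToCents (value: decimal) : decimal = System.Math.Round(value, 2)

    /// Applies a percentage discount to a price.
    let applyDiscount (percent: decimal) (price: decimal) : decimal =
        roundToCents (price - (price * percent / 100m))

// Because every public binding is annotated, the signature can be read off
// without type-checking the function bodies. Roughly:
//
//   module Pricing =
//       val applyDiscount: percent: decimal -> price: decimal -> decimal

printfn "%M" (Pricing.applyDiscount 10m 200m)
```

Documentation and types stay next to the implementation, which is the readability benefit over a separate signature file mentioned earlier in the thread.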
Replying to Vlad's fair critique:
Indeed, there's a mix of emotions regarding F#'s magic. It seems suitable for extensive enterprise-level code. Typing everything at the top level somewhat resembles using signature files, which are employed by some, like in the compiler.
I'm not advocating for an immediate shift towards this approach. Even if it became standard overnight, there would be a significant effort required to update existing codebases with complete typings. Without a dedicated migration tool, I doubt it would gain widespread acceptance.
Conversely, this method would allow for parallel type-checking of every implementation file, subtly addressing the initial challenge of relocating certain checks to a post-inference stage, depending on how you look at it.