Add concurrency to mdformatter #63

Open
wants to merge 1 commit into
base: main
from

Conversation

saswatamcode
Collaborator

@saswatamcode saswatamcode commented Jul 23, 2021

This PR parallelizes the file-processing loop in mdformatter to speed up formatting and link checking: it spawns one goroutine per file and waits for them with a WaitGroup. (This also changes the CLI spinner behavior: filenames can no longer be shown, because multiple files are processed at once.)

Some caveats:

  • Cancellation sometimes panics in v.c.Wait() (flaky, possibly due to colly's internal WaitGroup)
  • We need some way to limit the number of file-processing goroutines
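
One possible way to bound the number of file-processing goroutines, sketched with the standard library only: a buffered channel acts as a counting semaphore next to the WaitGroup. The names processAll, limit, and process are hypothetical and not part of mdformatter; this illustrates the idea under those assumptions and is not the code in this PR.

```go
package main

import "sync"

// processAll runs process once per file with at most limit calls in flight.
// It waits for all goroutines and returns the first error seen (nil if none).
// Hypothetical helper for illustration; not part of mdformatter.
func processAll(files []string, limit int, process func(fn string) error) error {
	if limit < 1 {
		limit = 1
	}
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
		sem      = make(chan struct{}, limit) // counting semaphore
	)
	for _, fn := range files {
		sem <- struct{}{} // blocks while limit goroutines are already running
		wg.Add(1)
		go func(fn string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			if err := process(fn); err != nil {
				once.Do(func() { firstErr = err }) // keep only the first error
			}
		}(fn)
	}
	wg.Wait()
	return firstErr
}
```

A call such as processAll(files, 4, formatOne) would cap concurrency at four files, where formatOne stands in for whatever per-file work the loop does. This sketch does not stop remaining files after the first error; that would need context cancellation on top.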

@saswatamcode saswatamcode requested a review from bwplotka July 23, 2021 13:48
@saswatamcode saswatamcode self-assigned this Jul 23, 2021
Signed-off-by: Saswata Mukherjee <[email protected]>
Owner

@bwplotka bwplotka left a comment

Looks good so far! Thanks, some comments.

@@ -240,57 +241,73 @@ func IsFormatted(ctx context.Context, logger log.Logger, files []string, opts ..

func format(ctx context.Context, logger log.Logger, files []string, diffs *Diffs, spin *yacspin.Spinner, opts ...Option) error {
f := New(ctx, opts...)
b := bytes.Buffer{}
// TODO(bwplotka): Add concurrency (collector will need to redone).
errorChannel := make(chan error)
Owner

Suggested change
- errorChannel := make(chan error)
+ errorCh := make(chan error)

spin.Message(fn + "...")
}
errs.Add(func() error {
go func(fn string) {
Owner

Maybe let's use https://pkg.go.dev/golang.org/x/sync/errgroup, so the semantics are simpler.

return
}
v.destFutures[k] = &futureResult{cases: 1, resultFn: func() error { return nil }}

Owner

I think there is a race during the replacement operation.

@@ -34,7 +34,11 @@ func (v GitHubValidator) IsValid(k futureKey, r *validator) (bool, error) {
// RoundTripValidator.IsValid returns true if url is checked by colly.
func (v RoundTripValidator) IsValid(k futureKey, r *validator) (bool, error) {
// Result will be in future.
r.destFutures[k].resultFn = func() error { return r.remoteLinks[k.dest] }
prevResult, _ := r.destFutures.LoadAndDelete(k)
Owner

Lots of races when we delete, unfortunately. We need to make this operation locked, potentially with a Mutex.

But the full operation has to be locked:

  1. Check if there is an entry in the map
  2. If not, create a new future and schedule colly
  3. If yes, grab a reference to the existing future

We might need to check whether the IsValid interface actually allows us to do this 🤔
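
The three steps above could be sketched roughly as follows, with the whole check-or-create sequence under a single Mutex. The types future and futures and the schedule callback are hypothetical stand-ins, not mdformatter's futureResult or validator; schedule marks the place where the real code would presumably enqueue the colly request.

```go
package main

import "sync"

// future holds the eventual result of checking one link.
type future struct {
	done chan struct{} // closed once err has been set
	err  error
}

// Wait blocks until the future is resolved and returns its result.
func (f *future) Wait() error { <-f.done; return f.err }

// futures is a mutex-guarded map from link key to its future.
type futures struct {
	mu sync.Mutex
	m  map[string]*future
}

func newFutures() *futures { return &futures{m: map[string]*future{}} }

// getOrCreate performs check, create, and schedule as one locked operation.
// schedule runs at most once per key and receives a resolve function to call
// when the check finishes. It is invoked while the lock is held, so it must
// not call back into getOrCreate.
func (fs *futures) getOrCreate(key string, schedule func(resolve func(error))) *future {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	if f, ok := fs.m[key]; ok {
		return f // step 3: reuse the existing future
	}
	f := &future{done: make(chan struct{})} // step 2: create a new future
	fs.m[key] = f
	var once sync.Once // guards against resolve being called twice
	schedule(func(err error) {
		once.Do(func() { f.err = err; close(f.done) })
	})
	return f
}
```

Because nothing is ever deleted or replaced in the map, two goroutines asking for the same key always end up waiting on the same future, which is the property the delete-and-store approach cannot guarantee.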

@saswatamcode saswatamcode requested a review from bwplotka July 30, 2021 15:24
@bwplotka
Owner

The comments are still not addressed, I believe?

@bwplotka
Owner

bwplotka commented Jun 7, 2023

Any progress?
