Doc/processor #141
base: main
docs/processing.md
Outdated
|
||
## Using FileProcessor | ||
|
||
Useful if you want to store the file. |
"the file in the filesystem"
$chunk = yield; | ||
// do something with it | ||
} while (!$chunk->isLast()); | ||
``` |
We should also explain that the value returned by the processor is returned by the `process` method.
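To illustrate the point about return values, here is a minimal sketch using only plain PHP generators. `FakeChunk`, `countingProcessor()` and the free function `process()` are hypothetical stand-ins invented for this example; they are not the bundle's classes, and the real `process` method may be implemented differently.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a response chunk; only the two methods used
// below are modelled.
final class FakeChunk
{
    public function __construct(private string $content, private bool $last) {}
    public function getContent(): string { return $this->content; }
    public function isLast(): bool { return $this->last; }
}

// A processor in the style shown in the docs: chunks arrive through `yield`,
// and a value is returned once the last chunk has been handled.
function countingProcessor(): \Generator
{
    $bytes = 0;
    do {
        $chunk = yield;
        $bytes += strlen($chunk->getContent());
    } while (!$chunk->isLast());

    return $bytes;
}

// Hypothetical driver standing in for the `process` method: it sends every
// chunk into the generator, then hands back the generator's return value.
function process(\Generator $processor, iterable $chunks): mixed
{
    foreach ($chunks as $chunk) {
        $processor->send($chunk);
    }

    return $processor->getReturn();
}

$result = process(countingProcessor(), [
    new FakeChunk('%PDF', false),
    new FakeChunk('-1.7', true),
]);
```

Here `$result` holds whatever the generator returned, which is the behaviour the docs should spell out.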
What would be the solution if I want to use it via the CLI and use, e.g., Flysystem to save the PDF to another destination such as an S3 bucket? If I understood the docs correctly, this is currently not possible unless I implement my own |
A StringProcessor would ultimately cancel out the memory efficiency that this bundle tries to provide. But yes, a custom Processor could leverage https://github.com/async-aws/aws/blob/master/src/Integration/Aws/SimpleS3/src/SimpleS3Client.php#L69, for example, to upload chunk by chunk. We are planning a native Flysystem integration for this bundle. |
You're right, you will have to create your own Processor to stream to your bucket. You can use AsyncAws to make it easier. There is no such class in the GotenbergBundle yet but it's planned as a Bridge. |
Here is a naive approach for AsyncAws: https://gist.github.com/Neirda24/05e6a65a862ab34854be3b5f9e562420. As @Jean-Beru said, we will try to integrate natively with Flysystem and so on. |
And here is a working example for Flysystem: https://gist.github.com/Neirda24/f950f1aaa5343c67bcbaeb12ed0cb887. I'll have to investigate further to see if it's possible without a tmpfile, but I don't see how, given the abstraction. |
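For readers who do not want to open the gists, the tmpfile approach mentioned above can be sketched roughly as follows. This is not the code from the gists: `tmpfileProcessor()` is a hypothetical name, chunks are modelled as plain `[content, isLast]` arrays rather than real chunk objects, and `$sink` is an assumed callback standing in for a call such as Flysystem's `writeStream($fileName, $stream)`.

```php
<?php
declare(strict_types=1);

/**
 * Buffers incoming chunks in a temporary file, then hands the rewound
 * stream to $sink in one go.
 *
 * @param callable(string, resource): void $sink hypothetical storage callback
 */
function tmpfileProcessor(string $fileName, callable $sink): \Generator
{
    $tmp = tmpfile();
    if (false === $tmp) {
        throw new \RuntimeException('Unable to create a temporary file.');
    }

    do {
        // Assumption: each chunk is sent as [string $content, bool $isLast].
        $chunk = yield;
        [$content, $isLast] = $chunk;
        fwrite($tmp, $content);
    } while (!$isLast);

    // The storage layer reads from the start of the stream.
    rewind($tmp);
    $sink($fileName, $tmp);
    fclose($tmp);

    return $fileName;
}

// Usage with an in-memory sink instead of a real filesystem.
$stored = [];
$processor = tmpfileProcessor('invoice.pdf', function (string $name, $stream) use (&$stored): void {
    $stored[$name] = stream_get_contents($stream);
});
$processor->send(['%PDF-', false]);
$processor->send(['1.7', true]);
```

The temporary file keeps memory usage flat while the response streams in, at the cost of writing the content to disk once before it reaches the final storage.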
@@ -33,12 +33,16 @@ | |||
"symfony/mime": "^6.4 || ^7.0" | |||
}, | |||
"require-dev": { | |||
"ext-mbstring": "*", | |||
"async-aws/s3": "^2.6", |
It might be worth lowering the requirement.
"league/flysystem": "^3.29", | ||
"league/flysystem-bundle": "^3.3", |
It might be worth lowering the requirement.
"phpstan/extension-installer": "^1.3", | ||
"phpstan/phpstan": "^1.10", | ||
"phpstan/phpstan-symfony": "^1.3", | ||
"phpunit/phpunit": "^10.4", | ||
"shipmonk/composer-dependency-analyser": "^1.7", | ||
"shipmonk/composer-dependency-analyser": "^1.8", |
Needed to ignore `ext-mbstring` issues.
What about using kbond filesystem that does all the abstractions for you? |
Not sure how that would help in this case. Flysystem does not expose a way to write by chunk. The version from kbond suffers from the same issue. |
public function __invoke(string|null $fileName): \Generator | ||
{ | ||
if (null === $fileName) { | ||
$fileName = uniqid('gotenberg_', true); |
Maybe use Filesystem here?
This will be dropped soon, I think. I don't see a reason why the filename should be null. I went along with it in this PR, but it will change. Otherwise, yes, Filesystem could be a lead.
Yeah, because of `getFileName()`, which returns `$this->response->getFileName()`. But I think we can't receive a null, or it will crash with an exception before this part.
$this->logger?->debug('{processor}: no filename given. Content will be dumped to "{file}".', ['processor' => self::class, 'file' => $fileName]); | ||
} | ||
|
||
$this->logger?->debug('{processor}: starting multi part upload of "{file}".', ['processor' => self::class, 'file' => $fileName]); |
`private readonly LoggerInterface $logger = new NullLogger()`?
Why? This already works well. What would we gain?
$this->filesystemOperator->writeStream($fileName, $tmpfile); | ||
|
||
$this->logger?->debug('{processor}: content dumped to "{file}".', ['processor' => self::class, 'file' => $fileName]); | ||
|
||
return function () use ($fileName) { | ||
return $this->filesystemOperator->read($fileName); // use readStream instead ? | ||
}; |
Does not feel optimised
It does nothing unless you call that callable. I was thinking of returning an anonymous class that implements `Stringable` and returns the result of this call instead, to make it easier to use. What do you suggest?
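The `Stringable` idea floated in this comment could look roughly like the sketch below. `lazyContent()` is a hypothetical helper name, and the closure passed in stands in for the `$this->filesystemOperator->read($fileName)` call; nothing here is taken from the PR itself.

```php
<?php
declare(strict_types=1);

/**
 * Wraps a reader in an object that only performs the read when it is
 * converted to a string.
 *
 * @param callable(): string $reader hypothetical stand-in for the read() call
 */
function lazyContent(callable $reader): \Stringable
{
    return new class(\Closure::fromCallable($reader)) implements \Stringable {
        public function __construct(private readonly \Closure $reader) {}

        public function __toString(): string
        {
            // The read happens here, on every string conversion.
            return ($this->reader)();
        }
    };
}

// Usage: nothing is read until the object is cast to a string.
$reads = 0;
$content = lazyContent(function () use (&$reads): string {
    ++$reads;

    return 'pdf-bytes';
});
$readsBeforeCast = $reads;
$asString = (string) $content;
```

Compared with returning a bare closure, the caller can pass the object anywhere a string is accepted. Note that this sketch re-reads on every conversion; caching the result would be a separate design choice.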
* | ||
* @implements ProcessorInterface<CompleteMultipartUploadOutput> | ||
*/ | ||
final class AsyncAwsProcessor implements ProcessorInterface |
IMHO, specific processors like AWS or Flysystem must live in a separate bridge to avoid adding dev dependencies. They could block the update of GotenbergBundle to Symfony x.y if, for example, FlysystemBundle does not support it yet.
But then users would potentially have to require it just for one class? That seems a bit too much, don't you think?
IMO such classes should be tied to the package version rather than the framework version. So maybe rename it to include the package name and version more clearly?
OK for a `Bridge` folder (like the Notifier, for example) for now.
To avoid requiring developers to install both Flysystem and AsyncAws in their local projects, we can install these dev dependencies only in CI (if tests need them).
They are already dev dependencies, so they are not installed in developers' projects. They could be moved to `suggest` only, if needed, and manually required in the test environment for CI.
Why not. We should also not rely on the bundle, but only on the Flysystem library.
```php
do {
    $chunk = yield;
    // do something with it
} while (!$chunk->isLast());
```
```php
do {
    $chunk = yield;
    // do something with it
} while (!$chunk->isLast());
```
```php
public function __invoke(string|null $fileName): \Generator
{
    do {
        $chunk = yield;
        // do something with it
    } while (!$chunk->isLast());
    // rest of your code
}
```
|
||
## Other processors | ||
|
||
* `Sensiolabs\GotenbergBundle\Processor\AsyncAwsProcessor`: Uploads using the `async-aws/s3` package and the [multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html) feature of S3. Returns an `AsyncAws\S3\Result\CompleteMultipartUploadOutput` object.
Maybe add an example of it
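As a starting point for such an example, here is a library-free sketch of the buffering logic a multipart upload needs, since S3 requires every part except the last to reach a minimum size (5 MiB). It does not show the real `AsyncAwsProcessor`: `partBufferingProcessor()` is a hypothetical name, chunks are modelled as `[content, isLast]` arrays, and `$uploadPart` is an assumed callback standing in for an actual S3 upload-part call.

```php
<?php
declare(strict_types=1);

/**
 * Groups incoming chunks into parts of at least $minPartSize bytes and
 * passes each part to $uploadPart. The final part may be smaller.
 *
 * @param callable(int, string): void $uploadPart receives (partNumber, body)
 */
function partBufferingProcessor(callable $uploadPart, int $minPartSize): \Generator
{
    $buffer = '';
    $partNumber = 0;

    do {
        // Assumption: each chunk is sent as [string $content, bool $isLast].
        $chunk = yield;
        [$content, $isLast] = $chunk;
        $buffer .= $content;

        // Flush when the buffer is large enough, or when the stream ends
        // and there is something left to send.
        if ('' !== $buffer && ($isLast || strlen($buffer) >= $minPartSize)) {
            $uploadPart(++$partNumber, $buffer); // part numbers start at 1
            $buffer = '';
        }
    } while (!$isLast);

    return $partNumber; // number of parts uploaded
}

// Usage with a tiny part size so the behaviour is easy to follow.
$parts = [];
$processor = partBufferingProcessor(function (int $number, string $body) use (&$parts): void {
    $parts[$number] = $body;
}, 4);
$processor->send(['ab', false]);
$processor->send(['cd', false]);
$processor->send(['e', true]);
```

In a real processor the callback would wrap the S3 client and the return value would be the result of completing the upload rather than a part count.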
Closes #140