[WIP] Set content id of attachments #597
David Vilar writes on November 14, 2018 10:45:
The Content-ID is set to the file name.
This way, inline images can be included when composing an HTML message. E.g. when editing markup
```
![This is an inline image](cid:image.jpg)
```
will include `image.jpg` in the text. AFAIK there was no easy way to inline attached images before.
As this happens at send time, the preview will not display the image, though.
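To make the convention concrete, here is a minimal sketch of what the resulting message looks like. It uses Python's standard `email` package rather than GMime (which astroid actually uses), and the helper name `build_inline_message`, the hard-coded `image/jpeg` type, and the placeholder image bytes are assumptions for illustration only.

```python
from email.message import EmailMessage


def build_inline_message(html_body, filename, data):
    """Build an HTML message with one inline image whose Content-ID is the file name."""
    msg = EmailMessage()
    msg.set_content(html_body, subtype="html")
    # add_related() converts the message to multipart/related and attaches the image.
    # The Content-ID header value is wrapped in angle brackets; the cid: URL omits them.
    msg.add_related(
        data,
        maintype="image",
        subtype="jpeg",  # assumption: the attachment is a JPEG
        cid=f"<{filename}>",
        filename=filename,
        disposition="inline",
    )
    return msg


# Placeholder bytes stand in for real image data.
msg = build_inline_message(
    '<img src="cid:image.jpg" alt="This is an inline image"/>',
    "image.jpg",
    b"\xff\xd8\xff",
)
html_part, image_part = msg.iter_parts()
print(image_part["Content-ID"])
```

The HTML part refers to `cid:image.jpg` and the image part carries the matching `Content-ID` header, which is all a mail client needs to resolve the inline image.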
Is there any reason we cannot make this work in the preview? This would
be a great feature, but I am a bit reluctant to make the final message
different from the previewed one.
Also, what is the spec for content_id? The file name might not always be
valid as one. In that case we need some way to communicate the sanitized
file name to the user.
|
I was actually surprised that it should be so easy...
I checked the RFC, and the cid should actually be "globally unique". This could be accomplished e.g. by appending the message id to the file name, with an @-symbol as separator. But then the simple inlining will no longer work without updating the reference to the new cid. That being said, apparently this rule is not strictly followed. I actually took the "cid is filename" convention from some mails that I have received. You can also look at this thread on stackoverflow. But I think we should follow the standard.
One thing that could ease the implementation of the name substitution, without having to parse the full markdown or html, is to define our own convention for indicating an inline image and do a simple substitution each time we encounter the prefix, e.g. specify
As for displaying the image in the preview, I haven't looked at the code yet. I'll try to get it working. |
BTW. What is your preference for PRs? I marked this one as [WIP] as it is clearly not ready for merging. Is this OK or do you prefer that I close it and create a new one when it is more mature? |
On Fri, Nov 16, 2018 at 9:50 AM David Vilar ***@***.***> wrote:
I was actually surprised that it should be so easy...
I checked the rfc <https://tools.ietf.org/html/rfc2392> and actually the
cid should be "globally unique". This could be accomplished e.g. by
appending the message id to the file name, with an @-symbol as separator.
But then the simple inlining will not work anymore without modifying the
reference with the new cid. That being said, apparently this rule is not
strictly followed. I actually took the "cid is filename" convention from
some mails that I have received. You can also look at this thread
<https://stackoverflow.com/questions/39577386/the-precise-format-of-content-id-header>
in stackoverflow. But I think we should follow the standard.
Nice, we probably have to do some escaping though, or perhaps this is done
already both by GMime at set_content_id and at load when converted to HTML.
In which case we might actually be good to go.
One thing that could ease the implementation of the name substitution,
without having to parse the full markdown or html, is to define our own
convention for indicating an inline image and do a simple substitution each
time we encounter the prefix, e.g. specify ***@***.***:image.jpg),
but of course this could produce undesired substitutions in some edge
cases. Is it possible with webkit to get a list of all the references in a
document?
What do you mean? Have a look in tvextension.cc for how I substitute the
img src for cid's at the moment.
|
On Fri, Nov 16, 2018 at 9:51 AM David Vilar ***@***.***> wrote:
BTW. What is your preference for PRs? I marked this one as [WIP] as it is clearly not ready for merging. Is this OK or do you prefer that I close it and create a new one when it is more mature?
That's great, good to post it early so that the direction of the
implementation can be discussed. There's also a work-in-progress
label.
|
On Fri, Nov 16, 2018 at 9:54 AM Gaute Hope ***@***.***> wrote:
> One thing that could ease the implementation of the name substitution,
> without having to parse the full markdown or html, is to define our own
> convention for indicating an inline image and do a simple substitution each
> time we encounter the prefix, e.g. specify ***@***.***:image.jpg),
> but of course this could produce undesired substitutions in some edge
> cases. Is it possible with webkit to get a list of all the references in a
> document?
>
What do you mean? Have a look in tvextension.cc for how I substitute the
img src for cid's at the moment.
That is actually what I was looking for! The idea would then be to go
through the document as you do there, detect all cid: and substitute with
the new names. I'll try to have a go at it.
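As a rough illustration of "go through the document and detect all cid:", here is a sketch that collects the cid references from `<img>` tags. It uses Python's standard `html.parser` instead of the WebKit DOM that tvextension.cc works with, and the names `CidCollector` and `find_cid_references` are made up for this example.

```python
from html.parser import HTMLParser


class CidCollector(HTMLParser):
    """Collect the targets of all <img src="cid:..."> references."""

    def __init__(self):
        super().__init__()
        self.cids = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img .../> also end up here by default.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value and value.startswith("cid:"):
                self.cids.append(value[len("cid:"):])


def find_cid_references(html):
    collector = CidCollector()
    collector.feed(html)
    collector.close()
    return collector.cids


refs = find_cid_references(
    '<p>Hello</p><img src="cid:image.jpg"/><img src="http://example.com/a.png">'
)
print(refs)
```

Only `src` attributes that start with `cid:` are reported, so ordinary remote images are left alone.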
|
On Fri, Nov 16, 2018 at 10:04 AM David Vilar ***@***.***> wrote:
On Fri, Nov 16, 2018 at 9:54 AM Gaute Hope ***@***.***> wrote:
> > One thing that could ease the implementation of the name substitution,
> > without having to parse the full markdown or html, is to define our own
> > convention for indicating an inline image and do a simple substitution each
> > time we encounter the prefix, e.g. specify ***@***.***:image.jpg),
> > but of course this could produce undesired substitutions in some edge
> > cases. Is it possible with webkit to get a list of all the references in a
> > document?
> >
> What do you mean? Have a look in tvextension.cc for how I substitute the
> img src for cid's at the moment.
>
That is actually what I was looking for! The idea would then be to go
through the document as you do there, detect all cid: and substitute with
the new names. I'll try to have a go at it.
Nice! But if GMime handles escaping and unescaping properly, then it might
not be necessary?
|
On Fri, Nov 16, 2018 at 10:21 AM Gaute Hope ***@***.***> wrote:
Nice! But if GMime handles escaping and unescaping properly, then it might
not be necessary?
The problem is not the escaping (which I hope GMime takes care of, but I
will check). The problem is the globally unique cid:
Suppose we want to attach and inline image.jpg. It will get a cid, say
image.jpg@1234567.astroid. Now, the user specified a link to cid:image.jpg
in the markdown, which gets transformed into html as '<img
src="cid:image.jpg"/>'. We will then need to change it to '<img
src="cid:image.jpg@1234567.astroid"/>'. And that's where the "parsing html"
problem comes into play.
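A minimal sketch of that rewrite step, assuming a plain regular expression over `src` attributes is acceptable instead of a full HTML parse. The function name `rewrite_cid_references` and the `cid_map` dictionary (short name to full cid) are hypothetical, and a regex like this can also match `src="cid:..."` appearing in ordinary text, which is the kind of edge case mentioned earlier in the thread.

```python
import re

# Matches src="cid:NAME" or src='cid:NAME' and captures the quotes and the name.
SRC_CID = re.compile(r"""(src\s*=\s*["'])cid:([^"']+)(["'])""")


def rewrite_cid_references(html, cid_map):
    """Replace short cid references with their globally unique counterparts."""

    def substitute(match):
        old = match.group(2)
        new = cid_map.get(old, old)  # unknown references are left untouched
        return f"{match.group(1)}cid:{new}{match.group(3)}"

    return SRC_CID.sub(substitute, html)


rewritten = rewrite_cid_references(
    '<img src="cid:image.jpg"/>',
    {"image.jpg": "image.jpg@1234567.astroid"},
)
print(rewritten)
```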
|
On Fri, Nov 16, 2018 at 11:01 AM David Vilar ***@***.***> wrote:
On Fri, Nov 16, 2018 at 10:21 AM Gaute Hope ***@***.***> wrote:
> Nice! But if GMime handles escaping and unescaping properly, then it might
> not be necessary?
>
The problem is not the escaping (which I hope GMime takes care of it, but I
will check it). The problem is the global cid:
Suppose we want to attach and inline image.jpg. It will get a cid, say
***@***.*** Now, the user specified in the markdown a link
to cid:image.jpg which gets transformed into html as '<img
src="cid:image.jpg"/>'. We will then need to change it to '<img
***@***.***"/>'. And that's where the "parsing html"
problem comes into play.
OK, perhaps you can do it before the markdown processor step in
ComposeMessage. Why do you need to add an `@xxxx.astroid` part?
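For comparison, doing the substitution on the markdown source before it reaches the markdown processor could look roughly like the sketch below. It only handles the `![alt](cid:name)` image syntax shown in the PR description; the names `rewrite_markdown_cids` and `cid_map` are assumptions, and nothing here reflects how ComposeMessage is actually structured.

```python
import re

# Matches the start of a markdown image, ![alt](cid:NAME, up to the closing parenthesis.
MARKDOWN_CID = re.compile(r"(!\[[^\]]*\]\()cid:([^)\s]+)")


def rewrite_markdown_cids(markdown, cid_map):
    """Rewrite cid targets of markdown images before markdown processing."""

    def substitute(match):
        name = match.group(2)
        return f"{match.group(1)}cid:{cid_map.get(name, name)}"

    return MARKDOWN_CID.sub(substitute, markdown)


source = "![This is an inline image](cid:image.jpg)"
converted = rewrite_markdown_cids(source, {"image.jpg": "image.jpg@1234567.astroid"})
print(converted)
```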
|
That's the pain point. The rfc states that "Both message-id and content-id
are required to be globally unique. That is [...] no different body parts
will ever have the same Content-ID addr-spec.". That's why I was thinking
of adding the message-id to make them unique.
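A sketch of how such an id could be derived, following the image.jpg@1234567.astroid example from earlier in the thread. The function name `make_content_id` is hypothetical, and replacing the `@` of the message-id with a dot is my own assumption so that the result contains a single `@`; sanitizing the file name itself is not covered.

```python
def make_content_id(filename, message_id):
    """Combine a file name and a message-id into one Content-ID addr-spec."""
    # Drop surrounding whitespace and angle brackets from the message-id.
    bare = message_id.strip().lstrip("<").rstrip(">")
    # Assumption: fold the message-id into the domain part so only one "@" remains.
    domain = bare.replace("@", ".")
    return f"{filename}@{domain}"


cid = make_content_id("image.jpg", "<1234567@astroid>")
print(cid)
```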
To be honest, I don't really see the point, as I don't think anyone would
reference some content-id independently of the message. But I assume we
should stick to the rfc (although not all email clients do).
On Fri, Nov 16, 2018 at 11:41 AM Gaute Hope ***@***.***> wrote:
On Fri, Nov 16, 2018 at 11:01 AM David Vilar ***@***.***> wrote:
> On Fri, Nov 16, 2018 at 10:21 AM Gaute Hope ***@***.***> wrote:
> > Nice! But if GMime handles escaping and unescaping properly, then it
> > might not be necessary?
>
> The problem is not the escaping (which I hope GMime takes care of it, but I
> will check it). The problem is the global cid:
>
> Suppose we want to attach and inline image.jpg. It will get a cid, say
> ***@***.*** Now, the user specified in the markdown a link
> to cid:image.jpg which gets transformed into html as '<img
> src="cid:image.jpg"/>'. We will then need to change it to '<img
> ***@***.***"/>'. And that's where the "parsing html"
> problem comes into play.
OK, perhaps you can do it before the markdown processor step in
ComposeMessage. Why do you need to add an ***@***.***` part?
|
On Fri, Nov 16, 2018 at 1:14 PM David Vilar ***@***.***> wrote:
That's the pain point. The rfc states that "Both message-id and content-id
are required to be globally unique. That is [...] no different body parts
will ever have the same Content-ID addr-spec.". That's why I was thinking
of adding the message-id to make them unique.
To be honest, I don't really see the point, as I don't think anyone would
reference some content-id independently of the message. But I assume we
should stick to the rfc (although not all email clients do).
Oh, right, I see. Well, we certainly do not rely on it in astroid. Whenever
a message is forwarded or replied to, the CIDs would then have to be
re-generated (not that it matters much for us atm since we do not use the
HTML content then).
|
On Fri, Nov 16, 2018 at 1:48 PM Gaute Hope ***@***.***> wrote:
On Fri, Nov 16, 2018 at 1:14 PM David Vilar ***@***.***>
wrote:
Oh, right, I see. Well, we certainly do not rely on it in astroid. Whenever
a message is forwarded or replied to the CIDs would have to be re-generated
then (not that it matters much for us atm since we do not use the HTML
content then).
True, I didn't think of that. But let's go one step at a time :-)
|
Is this one ready for review? Or are you still working on it? |
No, it's not ready. I wanted to work on it but didn't find the time yet. In
practice "it works", but it does not conform with the specification.
…On Wed, 2 Jan 2019, 15:49 Gaute Hope ***@***.*** wrote:
Is this one ready for review? Or are you still working on it?
|
Just an idea: Why not generate a new message id for the cid every time a file gets attached? The suffix of the cid will differ from the actual message id only in the timestamp, but sending a mail should result in unique message ids at any point in time, so the cid should also be unique with this approach. |
I think that's ok, but it should be easy to refer to those cid's in a markdown email. It might be difficult to guess those when auto-generated? |
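One way to reconcile the two points above is to generate an opaque id per attachment and keep a map from file name to generated cid, so that the user can still write `cid:image.jpg` and the reference is resolved at send time. The sketch below uses Python's `email.utils.make_msgid` as a stand-in for astroid's own message-id generation; the function name `assign_content_ids` and the `astroid.invalid` domain are placeholders.

```python
from email.utils import make_msgid


def assign_content_ids(filenames, domain="astroid.invalid"):
    """Give every attachment a freshly generated, message-id style cid."""
    cid_map = {}
    for name in filenames:
        # make_msgid() returns "<unique@domain>"; strip the angle brackets.
        cid_map[name] = make_msgid(domain=domain)[1:-1]
    return cid_map


cid_map = assign_content_ids(["image.jpg", "plot.png"])
for name, cid in cid_map.items():
    print(f"cid:{name} -> cid:{cid}")
```

The generated ids are hard to guess, but with the map in place the user never has to type them.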