[WIP] Set content id of attachments #597

davvil · 2018-11-14T09:45:38Z

The Content-ID is set to the file name.

This way, when composing an HTML message, inline images can be included. E.g., when editing the markup,

![This is an inline image](cid:image.jpg)

will include image.jpg in the text. AFAIK there was no easy way to inline attached images before.

As this happens at send time, the preview will not display the image, though.

gauteh · 2018-11-15T18:22:04Z

Is there any reason we cannot make this work in the preview? This would be a great feature, but I am a bit reluctant to make the final message differ from the previewed one. Also, what is the spec for content_id? The file name might not always be a valid value, in which case we need some way to communicate the sanitized file name to the user.

davvil · 2018-11-16T08:50:16Z

I was actually surprised that it should be so easy...

I checked the rfc (https://tools.ietf.org/html/rfc2392) and actually the cid should be "globally unique". This could be accomplished e.g. by appending the message id to the file name, with an @-symbol as separator. But then the simple inlining will no longer work unless the reference is rewritten to use the new cid. That being said, apparently this rule is not strictly followed. I actually took the "cid is filename" convention from some mails that I have received. You can also look at this thread on stackoverflow (https://stackoverflow.com/questions/39577386/the-precise-format-of-content-id-header). But I think we should follow the standard.
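A minimal sketch of the scheme described above: keep the file name as the local part and append a per-message token after an `@`. The function names, the conservative character whitelist, and the `1234567.astroid` token are illustrative assumptions, not astroid code. The percent-escaping of the `cid:` URL follows RFC 2392.

```python
import re
from urllib.parse import quote

# Assumption: a conservative whitelist; every other character becomes "_".
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def make_content_id(filename: str, message_token: str) -> str:
    """Return '<sanitized file name>@<message_token>' for a Content-ID header."""
    local = _UNSAFE.sub("_", filename)
    return f"{local}@{message_token}"


def cid_url(content_id: str) -> str:
    """Build the cid: URL that refers to a Content-ID (percent-escaped)."""
    return "cid:" + quote(content_id, safe="@")


print(make_content_id("image.jpg", "1234567.astroid"))
print(cid_url(make_content_id("my photo.jpg", "1234567.astroid")))
```

Note that sanitizing changes the name the user has to reference (`my photo.jpg` becomes `my_photo.jpg`), so the short name and the final cid would have to be mapped somewhere.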

One thing that could ease the implementation of the name substitution, without having to parse the full markdown or html, is to define our own convention for indicating an inline image and do a simple substitution each time we encounter the prefix, e.g. specify ![caption](#@inline:image.jpg), but of course this could produce undesired substitutions in some edge cases. Is it possible with webkit to get a list of all the references in a document?

As for displaying the image in the preview, I haven't looked at the code yet. I'll try to get it working.

davvil · 2018-11-16T08:51:36Z

BTW, what is your preference for PRs? I marked this one as [WIP] as it is clearly not ready for merging. Is this OK, or do you prefer that I close it and create a new one when it is more mature?

gauteh · 2018-11-16T08:54:09Z

On Fri, Nov 16, 2018 at 9:50 AM David Vilar wrote:

> I was actually surprised that it should be so easy...
>
> I checked the rfc <https://tools.ietf.org/html/rfc2392> and actually the cid should be "globally unique". This could be accomplished e.g. by appending the message id to the file name, with an @-symbol as separator. But then the simple inlining will not work anymore without modifying the reference with the new cid. That being said, apparently this rule is not strictly followed. I actually took the "cid is filename" convention from some mails that I have received. You can also look at this thread <https://stackoverflow.com/questions/39577386/the-precise-format-of-content-id-header> in stackoverflow. But I think we should follow the standard.

Nice, we probably have to do some escaping though, or perhaps this is done already both by GMime at set_content_id and at load when converted to HTML. In which case we might actually be good to go.

> One thing that could ease the implementation of the name substitution, without having to parse the full markdown or html, is to define our own convention for indicating an inline image and do a simple substitution each time we encounter the prefix, e.g. specify ![caption](#@inline:image.jpg), but of course this could produce undesired substitutions in some edge cases. Is it possible with webkit to get a list of all the references in a document?

What do you mean? Have a look in tvextension.cc for how I substitute the img src for cid's at the moment.

gauteh · 2018-11-16T08:55:02Z

On Fri, Nov 16, 2018 at 9:51 AM David Vilar wrote:

> BTW. What is your preference for PRs? I marked this one as [WIP] as it is clearly not ready for merging. Is this OK or do you prefer that I close it and create a new one when it is more mature?

That's great, good to post it early so that the direction of the implementation can be discussed. There's also a work-in-progress label.

davvil · 2018-11-16T09:04:38Z

On Fri, Nov 16, 2018 at 9:54 AM Gaute Hope wrote:

> What do you mean? Have a look in tvextension.cc for how I substitute the img src for cid's at the moment.

That is actually what I was looking for! The idea would then be to go through the document as you do there, detect all cid: and substitute with the new names. I'll try to have a go at it.

gauteh · 2018-11-16T09:21:19Z

On Fri, Nov 16, 2018 at 10:04 AM David Vilar wrote:

> That is actually what I was looking for! The idea would then be to go through the document as you do there, detect all cid: and substitute with the new names. I'll try to have a go at it.

Nice! But if GMime handles escaping and unescaping properly then it might not be necessary?

davvil · 2018-11-16T10:01:47Z

On Fri, Nov 16, 2018 at 10:21 AM Gaute Hope wrote:

> Nice! But if GMime handles escaping and unescaping properly then it might not be necessary?

The problem is not the escaping (which I hope GMime takes care of, but I will check). The problem is the global cid: Suppose we want to attach and inline image.jpg. It will get a cid, say image.jpg@1234567.astroid. Now, the user specified in the markdown a link to cid:image.jpg which gets transformed into html as '<img src="cid:image.jpg"/>'. We will then need to change it to '<img src="cid:image.jpg@1234567.astroid"/>'. And that's where the "parsing html" problem comes into play.
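The rewrite described above can be sketched with a regular expression over the generated HTML. This is a hedged illustration only: `rewrite_cids` and `cid_map` are hypothetical names, and a regex is not a real HTML parser, so a DOM walk like the one in tvextension.cc would be the more robust route.

```python
import re

# Matches src="cid:..." (single or double quoted) inside an <img> tag.
_CID_SRC = re.compile(r'(<img\b[^>]*?\bsrc=)(["\'])cid:([^"\']+)\2', re.IGNORECASE)


def rewrite_cids(html: str, cid_map: dict) -> str:
    """Replace short cid references with their globally unique counterparts."""

    def repl(match):
        prefix, quote_char, short = match.group(1), match.group(2), match.group(3)
        full = cid_map.get(short, short)  # unknown references are left untouched
        return f"{prefix}{quote_char}cid:{full}{quote_char}"

    return _CID_SRC.sub(repl, html)


cid_map = {"image.jpg": "image.jpg@1234567.astroid"}
print(rewrite_cids('<img src="cid:image.jpg"/>', cid_map))
```

Text that merely mentions `cid:image.jpg` outside an `<img>` tag is not touched, which avoids some of the accidental substitutions a plain string replace would cause.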

gauteh · 2018-11-16T10:41:31Z

On Fri, Nov 16, 2018 at 11:01 AM David Vilar wrote:

> The problem is not the escaping (which I hope GMime takes care of it, but I will check it). The problem is the global cid: Suppose we want to attach and inline image.jpg. It will get a cid, say image.jpg@1234567.astroid. Now, the user specified in the markdown a link to cid:image.jpg which gets transformed into html as '<img src="cid:image.jpg"/>'. We will then need to change it to '<img src="cid:image.jpg@1234567.astroid"/>'. And that's where the "parsing html" problem comes into play.

OK, perhaps you can do it before the markdown processor step in ComposeMessage. Why do you need to add an `@xxxx.astroid` part?
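Rewriting before the Markdown step could look roughly like the sketch below, which operates on the Markdown source instead of the generated HTML. The function name and the mapping are hypothetical, and the pattern only covers the plain `![caption](cid:name)` form; it would also rewrite such text inside code spans, which is one of the edge cases to watch for.

```python
import re

# Matches the start of a Markdown image whose target is a cid: reference.
_MD_CID = re.compile(r'(!\[[^\]]*\]\()cid:([^)\s]+)')


def rewrite_markdown_cids(markdown: str, cid_map: dict) -> str:
    """Expand short cid references in Markdown image links."""

    def repl(match):
        short = match.group(2)
        return f"{match.group(1)}cid:{cid_map.get(short, short)}"

    return _MD_CID.sub(repl, markdown)


cid_map = {"image.jpg": "image.jpg@1234567.astroid"}
print(rewrite_markdown_cids("![This is an inline image](cid:image.jpg)", cid_map))
```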

davvil · 2018-11-16T12:14:29Z

That's the pain point. The rfc states that "Both message-id and content-id are required to be globally unique. That is [...] no different body parts will ever have the same Content-ID addr-spec.". That's why I was thinking of adding the message-id to make them unique. To be honest, I don't really see the point, as I don't think anyone would reference some content-id independently of the message. But I assume we should stick to the rfc (although not all email clients do).

gauteh · 2018-11-16T12:47:57Z

On Fri, Nov 16, 2018 at 1:14 PM David Vilar wrote:

> That's the pain point. The rfc states that "Both message-id and content-id are required to be globally unique. That is [...] no different body parts will ever have the same Content-ID addr-spec.". That's why I was thinking of adding the message-id to make them unique. To be honest, I don't really see the point, as I don't think anyone would reference some content-id independently of the message. But I assume we should stick to the rfc (although not all email clients do).

Oh, right, I see. Well, we certainly do not rely on it in astroid. Whenever a message is forwarded or replied to, the CIDs would then have to be regenerated (not that it matters much for us atm, since we do not use the HTML content in that case).

davvil · 2018-11-16T15:15:58Z

On Fri, Nov 16, 2018 at 1:48 PM Gaute Hope wrote:

> Oh, right, I see. Well, we certainly do not rely on it in astroid. Whenever a message is forwarded or replied to the CIDs would have to be re-generated then (not that it matters much for us atm since we do not use the HTML content then).

True, I didn't think of that. But let's go one step at a time :-)

gauteh · 2019-01-02T14:49:07Z

Is this one ready for review? Or are you still working on it?

davvil · 2019-01-04T09:24:33Z

No, it's not ready. I wanted to work on it but haven't found the time yet. In practice "it works", but it does not conform to the specification.


ff2000 · 2020-02-20T04:41:46Z

Just an idea: why not generate a new message id for the cid every time a file gets attached? The suffix of the cid would differ from the actual message id only in the timestamp, but since sending a mail should produce a unique message id at any point in time, the cid should also be unique with this approach.
Replying to a message with attachments could work the same way: generate a message id when the user hits "reply" and just use that.
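As a rough illustration of this idea, Python's standard library has `email.utils.make_msgid`, which produces a message-id style string from the current time, the process id, and random bits. The `AttachmentCids` class and the `astroid` domain below are assumptions made for the sketch; astroid itself would use its own message id generator.

```python
from email.utils import make_msgid


class AttachmentCids:
    """Hands out a fresh message-id style cid for every attached file."""

    def __init__(self, domain: str = "astroid"):
        self.domain = domain   # assumed domain part, purely illustrative
        self.by_name = {}      # file name -> generated cid

    def attach(self, filename: str) -> str:
        # make_msgid returns "<...@domain>"; strip the angle brackets.
        cid = make_msgid(domain=self.domain)[1:-1]
        self.by_name[filename] = cid
        return cid


cids = AttachmentCids()
print(cids.attach("image.jpg"))
print(cids.attach("logo.png"))
```

Keeping the `by_name` mapping means the composer could still let the user write the short file name and substitute the generated cid at send time.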

gauteh · 2020-05-11T07:49:51Z

I think that's ok, but it should be easy to refer to those cid's in a markdown email. It might be difficult to guess those when auto-generated?

Commit: Set content id of attachments (c21f3ff)

davvil changed the title from "Set content id of attachments" to "[WIP] Set content id of attachments" on Nov 16, 2018

gauteh mentioned this pull request on Apr 4, 2019: Quote html parts #626 (Merged)

gauteh mentioned this pull request on Mar 27, 2020: logo in email signature #678 (Open)
