Invalid image file when using Active Storage and Action Mailbox Inbound Emails #555

Open
estebanutz opened this issue Aug 16, 2024 · 16 comments · Fixed by #556 or #561

Comments

@estebanutz

Describe the bug in a sentence or two.

In my app I've set up Action Mailbox, which uses Active Storage to store the content of inbound emails. The Cloudinary gem works fine with Active Storage in the rest of the app, but when testing inbound emails I now get "ActiveStorage::IntegrityError in Rails::Conductor::ActionMailbox::InboundEmailsController#create: Invalid image file", with the exception cause "CloudinaryException: Invalid image file".

I can see in the active_storage_blobs table that the emails are stored with content_type "message/rfc822". I read somewhere that Active Storage sends the email to Cloudinary as an image for some reason, but I could be wrong.

What am I missing? Any help would be greatly appreciated! Also, should this issue be posted in the Rails repo? Thanks!

Issue Type (Can be multiple)

Integration with Active Storage and Action Mailbox

Operating System

  • [X] macOS

Environment and Libraries (fill in the version numbers)

  • Cloudinary Gem - 2.1.1
  • Ruby Version - 3.2.0
  • Rails Version - 7.1.3.4
  • Active Storage - 7.1.3.4
  • Action Mailbox - 7.1.3.4
@const-cloudinary
Contributor

@estebanutz, thank you for reporting the issue. It should be fixed in the latest version: 2.1.2

@estebanutz
Author

@const-cloudinary, thanks! The issue has been fixed. I have another quick question, and I don't know whether it is related to this closed issue. When I test Action Mailbox inbound emails with Cloudinary as the storage service and try to process or read the email, for example to get the "from" and "to" info, I get nothing. When I switch the service to local and send the same inbound email, I get everything: from, to, body, etc. Could it be an Active Storage issue? I checked the blob information for both emails and files, and they contain the same information. Any ideas?

@const-cloudinary
Contributor

@estebanutz, it would be great if you could provide some sample code that I would be able to check

@estebanutz
Author

@const-cloudinary sure, there is little code since I'm using the conductor controller to run those tests. Same rails app as above:
Cloudinary Gem - 2.1.1
Ruby Version - 3.2.0
Rails Version - 7.1.3.4
Active Storage - 7.1.3.4
Action Mailbox - 7.1.3.4

config/environments/development.rb
config.active_storage.service = :cloudinary

app/mailboxes/application_mailbox.rb

class ApplicationMailbox < ActionMailbox::Base
  routing all: :articles
end

app/mailboxes/articles_mailbox.rb

class ArticlesMailbox < ApplicationMailbox
  def process
    Rails.logger.info("##################### FROM: #{mail.from}")
    Rails.logger.info("##################### TO: #{mail.to}")
    Rails.logger.info("##################### BODY: #{mail.decoded}")
  end
end

In the logger output, all of those values are empty. Also, if you view the "Full email source" of one of the emails at /rails/conductor/action_mailbox/inbound_emails/ID, it's empty as well.

When you change the active_storage.service to :local you get all the correct values:
config.active_storage.service = :local

I hope that helps, thanks!

@estebanutz
Author

Hi @const-cloudinary any updates on this issue? Any other way I can help? Thanks!

@const-cloudinary
Contributor

Hi @estebanutz,

Thank you for bringing this to our attention.

We have investigated the issue and were able to reproduce it in our development environment, where incomplete class reloading resulted in broken links.

This issue has been addressed in version 2.2.0.

Please try updating to this version and let us know if it resolves your issue.

@estebanutz
Author

estebanutz commented Sep 10, 2024

@const-cloudinary I tried it, and now it sort of works. It's very inconsistent, and I've been troubleshooting it for a while and I can't say for sure what it is, but in my code above:

Rails.logger.info("##################### FROM: #{mail.from}")
Rails.logger.info("##################### TO: #{mail.to}")
Rails.logger.info("##################### BODY: #{mail.decoded}")

If I test multiple inbound emails, say I send 10, the first 4 return the correct information and the rest have nothing: mail.from, mail.to, and mail.decoded all come back empty. I thought it might have something to do with the email service (Postmark), so I tried Mailgun and got the same inconsistent behavior.

Any ideas what could be happening?

@const-cloudinary
Contributor

const-cloudinary commented Sep 11, 2024

@estebanutz
I would start by checking rails logs, specifically:

  Cloudinary Storage (0.1ms) Generated URL for file at key: vbrnv5yt5qrpi94rtvnc5cv8il7y (https://res.cloudinary.com/cloud_name/raw/upload/vbrnv5yt5qrpi94rtvnc5cv8il7y.eml?_a=BACE6GBn)
  Cloudinary Storage (399.2ms) Downloaded file from key: vbrnv5yt5qrpi94rtvnc5cv8il7y

Pay particular attention to the raw/upload part of the URL.

If all files are OK, I would then look at Rails itself. It does a lot of lazy loading, so something may be out of sync, or some initialization may need to be triggered explicitly.

@estebanutz
Author

@cloudinary-bot Ok, let me check on that and report back, thanks!

@estebanutz
Author

@const-cloudinary following your suggestion I checked the rails logs and ran a couple of test emails, and I'm getting the same results you showed above, for example:

Cloudinary Storage (0.3ms) Generated URL for file at key: a9jxkml4m8o8yih5ojymmfdlez6y (https://res.cloudinary.com/djtnxzyud/raw/upload/v1/eeHEALTH/a9jxkml4m8o8yih5ojymmfdlez6y.eml?_a=BACADKDL)
Cloudinary Storage (104.8ms) Downloaded file from key: a9jxkml4m8o8yih5ojymmfdlez6y

One interesting thing I found is that if I look at the files uploaded under the media library in Cloudinary, the files that contain no information when trying to read "mail.from" or "mail.decoded", etc. have an N/A format. The emails that I can actually pull information from show EML as the format. Another way to test this is if you go directly in a browser to the generated URL, the N/A ones will say something like: "No webpage was found for the web address". The ones with EML format will prompt you to download the file.
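The browser check described above could be scripted roughly as follows. This is only a sketch: `classify_status` and `probe` are hypothetical helper names, and it assumes that the "No webpage was found" page corresponds to an HTTP 404 from the delivery URL, which the thread does not confirm.

```ruby
require "net/http"
require "uri"

# Map an HTTP status code to a coarse label.
# Assumption: a missing or extension-less asset answers with 404.
def classify_status(code)
  case code.to_i
  when 200..299 then :available
  when 404 then :missing
  else :unknown
  end
end

# Send a HEAD request to a generated URL and classify the response.
# Nothing is downloaded; only the status code is inspected.
def probe(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.head(uri.request_uri)
  end
  classify_status(response.code)
end
```

Calling `probe` with each "Generated URL" from the Rails log would show which keys are retrievable without opening them one by one in a browser.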

@const-cloudinary
Contributor

@estebanutz, unfortunately I cannot reproduce it on my side.

I suspect the file extension (.eml) is being lost during upload for some reason. To investigate, can you please install the gem from this branch?

Add the following line to your Gemfile:
gem 'cloudinary_gem', git: 'https://github.com/cloudinary/cloudinary_gem.git', branch: 'debug-active-storage-upload'

And rerun it.

In the log you should see something like:

Cloudinary Storage (3712.7ms) Uploaded file to key: 11qrxrbzyfuzcdtx1yky8bnqyase (checksum: tmyStc779M1wSnmcO1BM7g==)
Uploaded file to key: 11qrxrbzyfuzcdtx1yky8bnqyase (filename: message.eml) (secure_url: https://res.cloudinary.com/constantine-sdk/raw/upload/v1726499700/11qrxrbzyfuzcdtx1yky8bnqyase.eml)

Then you can compare it with the url that gets generated:

  Cloudinary Storage (0.6ms) Generated URL for file at key: 11qrxrbzyfuzcdtx1yky8bnqyase (https://res.cloudinary.com/constantine-sdk/raw/upload/11qrxrbzyfuzcdtx1yky8bnqyase.eml?_a=BACADKBn)
  Cloudinary Storage (541.9ms) Downloaded file from key: 11qrxrbzyfuzcdtx1yky8bnqyase

(you can ignore version part /v1726499700/)
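The comparison above can be done mechanically. The following is a minimal sketch, not part of the gem: `comparable_path` and `same_asset?` are hypothetical helpers that drop the query string and any version segment (such as /v1726499700/) before comparing the two URLs taken from the log.

```ruby
require "uri"

# Reduce a Cloudinary URL to the part worth comparing:
# no query string, no empty segments, no version segment like "v1726499700".
def comparable_path(url)
  path = URI.parse(url).path
  path.split("/").reject { |segment| segment.empty? || segment.match?(/\Av\d+\z/) }.join("/")
end

# True when the uploaded secure_url and the generated URL point at the same path.
def same_asset?(uploaded_url, generated_url)
  comparable_path(uploaded_url) == comparable_path(generated_url)
end

uploaded  = "https://res.cloudinary.com/constantine-sdk/raw/upload/v1726499700/11qrxrbzyfuzcdtx1yky8bnqyase.eml"
generated = "https://res.cloudinary.com/constantine-sdk/raw/upload/11qrxrbzyfuzcdtx1yky8bnqyase.eml?_a=BACADKBn"
puts same_asset?(uploaded, generated)
```

If the upload lost its .eml extension, the two paths would differ and `same_asset?` would return false.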

@estebanutz
Author

@const-cloudinary

I get the error message below when trying to update the gem. Should I just use 'cloudinary' only (remove _gem)?

Could not find gem 'cloudinary_gem' in https://github.com/cloudinary/cloudinary_gem.git (at debug-active-storage-upload@3e61439).

I tried that, and I'm not getting the (filename: message.eml) part. Here's what I'm getting:

TRANSACTION (4.2ms)  COMMIT
13:10:30 web.1  |   Cloudinary Storage (404.6ms) Uploaded file to key: a7am7oc61debgh3fy5bre0gw60ax (checksum: Ao5MqXJ9bMOPIaGz9J6t/Q==)
13:10:30 web.1  |   TRANSACTION (0.3ms)  BEGIN
13:10:30 web.1  |   ActionMailbox::InboundEmail Create (4.6ms)  INSERT INTO `action_mailbox_inbound_emails` (`status`, `message_id`, `message_checksum`, `created_at`, `updated_at`) VALUES (0, '[email protected]', '8482f738e12f8af163bf3319892085ebb3d9d85e', '2024-09-17 18:10:30.477074', '2024-09-17 18:10:30.477074')
13:10:30 web.1  |   ActiveStorage::Blob Load (12.2ms)  SELECT `active_storage_blobs`.* FROM `active_storage_blobs` INNER JOIN `active_storage_attachments` ON `active_storage_blobs`.`id` = `active_storage_attachments`.`blob_id` WHERE `active_storage_attachments`.`record_id` = 82 AND `active_storage_attachments`.`record_type` = 'ActionMailbox::InboundEmail' AND `active_storage_attachments`.`name` = 'raw_email' LIMIT 1
13:10:30 web.1  |   ActiveStorage::Attachment Load (0.8ms)  SELECT `active_storage_attachments`.* FROM `active_storage_attachments` WHERE `active_storage_attachments`.`record_id` = 82 AND `active_storage_attachments`.`record_type` = 'ActionMailbox::InboundEmail' AND `active_storage_attachments`.`name` = 'raw_email' LIMIT 1
13:10:30 web.1  |   ActiveStorage::Attachment Create (1.0ms)  INSERT INTO `active_storage_attachments` (`name`, `record_type`, `record_id`, `blob_id`, `created_at`) VALUES ('raw_email', 'ActionMailbox::InboundEmail', 82, 184, '2024-09-17 18:10:30.502472')
13:10:30 web.1  |   ActionMailbox::InboundEmail Update (1.2ms)  UPDATE `action_mailbox_inbound_emails` SET `action_mailbox_inbound_emails`.`updated_at` = '2024-09-17 18:10:30.504506' WHERE `action_mailbox_inbound_emails`.`id` = 82
13:10:30 web.1  |   TRANSACTION (1.2ms)  COMMIT
13:10:30 web.1  | [ActiveJob] Enqueued ActionMailbox::RoutingJob (Job ID: 460fd4c1-30f4-435c-ad53-21775c44fca2) to Sidekiq(default) with arguments: #<GlobalID:0x00000001156c0e38 @uri=#<URI::GID gid://eehealth/ActionMailbox::InboundEmail/82>>
13:10:30 web.1  |   TRANSACTION (0.2ms)  BEGIN
13:10:30 web.1  |   ActiveStorage::Blob Update (0.6ms)  UPDATE `active_storage_blobs` SET `active_storage_blobs`.`metadata` = '{\"identified\":true,\"analyzed\":true}' WHERE `active_storage_blobs`.`id` = 184
13:10:30 web.1  |   ActiveStorage::Attachment Load (0.4ms)  SELECT `active_storage_attachments`.* FROM `active_storage_attachments` WHERE `active_storage_attachments`.`blob_id` = 184
13:10:30 web.1  |   ActionMailbox::InboundEmail Load (0.7ms)  SELECT `action_mailbox_inbound_emails`.* FROM `action_mailbox_inbound_emails` WHERE `action_mailbox_inbound_emails`.`id` = 82
13:10:30 web.1  |   ActionMailbox::InboundEmail Update (9.6ms)  UPDATE `action_mailbox_inbound_emails` SET `action_mailbox_inbound_emails`.`updated_at` = '2024-09-17 18:10:30.521781' WHERE `action_mailbox_inbound_emails`.`id` = 82
13:10:30 web.1  |   TRANSACTION (0.6ms)  COMMIT

@estebanutz
Author

@const-cloudinary another piece of information that may be helpful: when Sidekiq is processing the background jobs, the logs are as follows:

'Headers' is not empty

2024-09-17T18:14:07.953Z pid=82917 tid=1lvl class=ActionMailbox::RoutingJob jid=7d7a40e39047f26655e3b6f5 INFO:   Cloudinary Storage (1086.5ms) Downloaded file from key: rngg3fjxqck53d0aal48qnbd5lw8
2024-09-17T18:14:07.964Z pid=82917 tid=1lvl class=ActionMailbox::RoutingJob jid=7d7a40e39047f26655e3b6f5 INFO: Mail object: #<Mail::Message:27140, Multipart: false, Headers: <Received: by p-pm-inboundg03c-aws-useast1c.inbound.postmarkapp.com (Postfix, from userid 996)..........>

vs empty:

2024-09-17T18:14:25.417Z pid=82917 tid=1lvl class=ActionMailbox::RoutingJob jid=a929d8ca4eada80b52db9066 INFO:   Cloudinary Storage (370.0ms) Downloaded file from key: fwb6weg3p2mxg7rk3boh2x3527za
2024-09-17T18:14:25.419Z pid=82917 tid=1lvl class=ActionMailbox::RoutingJob jid=a929d8ca4eada80b52db9066 INFO: Mail object: #<Mail::Message:27220, Multipart: false, Headers: >

@estebanutz
Author

@const-cloudinary I keep troubleshooting this issue. I pushed the site to a test server for more testing and got the same inconsistent results. Could it be that, in some cases, there is a delay between when ActiveStorage fully downloads the file and when ActionMailbox processes it, resulting in errors?

@const-cloudinary
Contributor

@estebanutz, there should be no significant delays unless you attach really large files to your emails.

Another thing I would try is setting resource_type to raw in your storage.yml, something like:

cloudinary:
  service: Cloudinary
  ...your config goes here
  resource_type: raw

This will force all uploads to be treated as raw files.

@estebanutz
Author

@const-cloudinary I have not done any tests with attachments; all my tests are plain text. I mentioned a delay because it looked to me as if Action Mailbox was running before the email had been downloaded, so it had no information about the email's content. I'll test with resource_type set to raw.

My workaround, which I think is overkill, is to run a background job that waits 5 seconds and is passed the inbound email ID. After that, I can process the inbound email just fine.
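As an alternative to a fixed 5-second wait, the same idea could be sketched as a bounded retry that stops as soon as the download returns content. This assumes the empty result is a timing problem, which the thread has not confirmed. `fetch_nonempty` is a hypothetical helper; in a Rails app the block might wrap something like `inbound_email.raw_email.download`.

```ruby
# Hypothetical helper: call the block until it returns a non-empty String,
# sleeping between attempts. Returns nil if every attempt comes back empty.
# `sleeper` is injectable so the wait can be skipped in tests.
def fetch_nonempty(attempts: 5, delay: 1.0, sleeper: ->(seconds) { sleep(seconds) })
  attempts.times do |attempt|
    source = yield
    return source unless source.nil? || source.empty?

    # Do not sleep after the final attempt.
    sleeper.call(delay) if attempt < attempts - 1
  end
  nil
end

# Usage with a stand-in for the real download: empty twice, then content.
responses = ["", "", "From: sender@example.com\r\n\r\nHello"]
source = fetch_nonempty(attempts: 5, delay: 0) { responses.shift }
puts source.nil? ? "still empty" : "got #{source.bytesize} bytes"
```

Compared with an unconditional wait, this returns immediately for emails that are already readable and gives up after a known number of attempts for those that are not.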
