Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hints about file content validation. #1452

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

righettod
Copy link
Member

Hello,

💬This PR enrich the hints on validation against a file content regarding a malicious file added to the end of the source file. In addition, I removed the section about my repo because it is not relevant anymore.

📖My sources are based on my tests documented here.

✅My contribution respect the rules specified in the PR template:

  • In case of a new Cheat Sheet, you have used the Cheat Sheet template.
  • All the markdown files do not raise any validation policy violation, see the policy.
  • All the markdown files follow these format rules.
  • All your assets are stored in the assets folder.
  • All the images used are in the PNG format.
  • Any references to websites have been formatted as [TEXT](URL)
  • You verified/tested the effectiveness of your contribution (e.g., the defensive code proposed is really an effective remediation? Please verify it works!).
  • The CI build of your PR pass, see the build status here.

😉Thanks in advance.

malicious-document.pdf: PDF document, version 1.4
```

Therefore, it is recommended like mentioned above for images, to apply document rewriting techniques to destroys any kind of malicious content embedded.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the output of 'file' in this case shows what it does makes perfect sense, but this is really only a problem if your OS (or perhaps your PDF viewer) is going to attempt to execute the 'malicious-file.exe' part, which seems a stretch even for Windows OS. I have seen malicious content injected into PDFs via JavaScript links and leveraging bugs in common PDF viewers such as Adobe Acrobat Reader, but it was never anything as simple as this.

So, so you have a documented case where something was exploited using this simple approach? If so, I'd like a reference to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is critical that untrusted uploaded files staged in public directories are assigned file permissions of read only at the OS level. That is my only execution concern overall.

@jmanico
Copy link
Member

jmanico commented Jul 21, 2024

@kwwall
Copy link
Collaborator

kwwall commented Jul 21, 2024

Fair enough, but 1) that's one reason why no one ever allows applets anymore, and 2) based on the description of

The attack is simply a mashup of a GIF picture and a JAR (Java applet). The malicious JAR is compiled and then combined with information from a GIF. The GIF part fools the browser into opening it as a picture and trusting the content.

(Emphasis mine.) There comes a point where we can try to be defensive and to do due diligence to provide a safe harbor (like ESAPI does against XSS attacks because of buggy browsers) or we say "screw it; performance is more important" (e.g., the approach that the OWASP Java Encoder takes). Neither approach is really wrong, but being cautious does have performance hits, In that sense, I would argue that if you haven't done a threat model to justify that this is reasonable for your particular application, you probably shouldn't be trying to defend against it.

Lastly, that said, if we are going to put in verbiage that people need to be cautious and rewrite whatever file type they get, then we definitely ought to add something t address the ever looming danger of Windows NTFS Alternate Data Streams. I remember experimenting with that in the early 2000s where I created a simple .txt file that would open calc.exe as the alternate data stream. That seems to be a broader danger than what is mentioned here because it's by design and AFAIK, still present in NTFS. (I'm not sure; I have used Windows since the Windows 7 days.) But we don't even mention NTFS Alternate Data Steams at all in the current File Upload CS.

@jmanico
Copy link
Member

jmanico commented Jul 23, 2024

I still think this is a very good addition. These types of attacks have high impact when they work.

@kwwall
Copy link
Collaborator

kwwall commented Jul 24, 2024

@jmanico - The general idea is good, but I think the example is unrealistic / too simplistic. If the example was converted to using Windows NTFS alternative data streams (which, IIRC, was almost that simple to exploit), I'd be in favor of keeping it, but I'm just convinced that it will work otherwise, unless there's a bug in the PDF view that's being used.

@jmanico
Copy link
Member

jmanico commented Jul 24, 2024

Can we come up with a better explanation then? This is an important control. Here is some light AI reading on the topic.

Why Validate File Contents?

Malicious File Uploads:
Attackers can upload files with malicious code or payloads. For example, a seemingly harmless image file could contain embedded scripts or malware. If these files are processed or executed on the server, they can lead to server-side vulnerabilities like Remote Code Execution (RCE).

File Type Mismatch:
Attackers may try to upload files with misleading extensions (e.g., a .jpg file that is actually an executable or script). Relying solely on file extensions or MIME types provided by the client can be dangerous, as these can be easily spoofed.

Resource Exhaustion:
Uploading excessively large files can lead to Denial of Service (DoS) attacks by exhausting server resources like memory, disk space, or processing power.

Data Exfiltration and Information Disclosure:
If file uploads are not properly handled, attackers could upload files that, when accessed, disclose sensitive information or facilitate unauthorized access to data.

Cross-Site Scripting (XSS):
Malicious scripts can be embedded in files like images, PDFs, or documents. If these files are later rendered or processed by the application without proper sanitization, they can lead to XSS attacks.

Exploitation Techniques
Remote Code Execution (RCE):
An attacker uploads a file with server-side executable content (e.g., PHP, Python, shell script). If the application processes this file improperly, it could execute the malicious code, leading to full server compromise.

Directory Traversal:
If an application stores uploaded files in a predictable directory structure, attackers might exploit directory traversal vulnerabilities to overwrite or access sensitive files.

Content-Type Mismatch and Bypass:
By tampering with the file's MIME type or extension, attackers can bypass client-side validation checks and upload prohibited file types, leading to potential security breaches.

Phishing:
Attackers can upload files that appear to be legitimate (e.g., HTML files mimicking a login page) and then share the link to these files, tricking users into providing sensitive information.

@kwwall
Copy link
Collaborator

kwwall commented Jul 26, 2024

@jmanico - I agree that we need something. My primary objection was to the specific example(s) that were proposed. It's just not that simple (well, except when it involves NTFS alternate data streams). I just re-checked our File Upload CS and there is zero mentioned about the fact that SVG images really an XML document and JavaScript can be embedded in it, allowing things like avatars as SVG images could be a source of persistent XSS. So, yes, we definitely have a few gaps here.

OTOH, one of the things that makes this so difficult is that you almost have to do it on a per file type basis. The techniques that are recommended for (say) image types like PNG or JPG won't necessarily work for images. And given that and how extensive your AI generated list of concerns is, we'd probably double or triple the length of the current OWASP File Uploads Cheat Sheet (which is at about 6 printed pages as of the time of this comment).

Thus, I think we need to:

  • Strategize how we want to approach it. Perhaps it needs to be re-organized into multiple cheat sheets.
  • Divide and conquer. Addressing all the concerns that you brought up is likely to be too much of an undertaking for a single volunteer, but if we could get a few different people to take a particular section, we might have a chance completing it before the end of the year.
  • We probably also should think about which of the general areas are the most important to the development community and focus on those things first.

@mackowski
Copy link
Collaborator

@jmanico and @righettod comments from @kwwall are valid in my opinion but it would be also a wast if we do not commit anything. Do you want to try to add more details to make it rock solid? We can also add a comment when this code will be fine and when reader need to add more protections.

@jmanico
Copy link
Member

jmanico commented Aug 29, 2024

I agree, and suggest we just come up with a more modern example. Fair enough Kevin and others.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants