Add hints about file content validation. #1452

righettod · 2024-07-21T15:27:25Z

Hello,

💬This PR enrich the hints on validation against a file content regarding a malicious file added to the end of the source file. In addition, I removed the section about my repo because it is not relevant anymore.

📖My sources are based on my tests documented here.

✅My contribution respect the rules specified in the PR template:

In case of a new Cheat Sheet, you have used the Cheat Sheet template.
All the markdown files do not raise any validation policy violation, see the policy.
All the markdown files follow these format rules.
All your assets are stored in the assets folder.
All the images used are in the PNG format.
Any references to websites have been formatted as [TEXT](URL)
You verified/tested the effectiveness of your contribution (e.g., the defensive code proposed is really an effective remediation? Please verify it works!).
The CI build of your PR pass, see the build status here.

😉Thanks in advance.

kwwall · 2024-07-21T19:27:41Z

cheatsheets/File_Upload_Cheat_Sheet.md

+malicious-document.pdf: PDF document, version 1.4
+```
+
+Therefore, it is recommended like mentioned above for images, to apply document rewriting techniques to destroys any kind of malicious content embedded.


Why the output of 'file' in this case shows what it does makes perfect sense, but this is really only a problem if your OS (or perhaps your PDF viewer) is going to attempt to execute the 'malicious-file.exe' part, which seems a stretch even for Windows OS. I have seen malicious content injected into PDFs via JavaScript links and leveraging bugs in common PDF viewers such as Adobe Acrobat Reader, but it was never anything as simple as this.

So, so you have a documented case where something was exploited using this simple approach? If so, I'd like a reference to it.

It is critical that untrusted uploaded files staged in public directories are assigned file permissions of read only at the OS level. That is my only execution concern overall.

jmanico · 2024-07-21T21:05:22Z

Here is an example of a GIF that runs as a jar. Old school: https://hackaday.com/2008/08/04/the-gifar-image-vulnerability/#:~:text=The%20attack%20is%20simply%20a,picture%20and%20trusting%20the%20content.

kwwall · 2024-07-21T22:36:05Z

Fair enough, but 1) that's one reason why no one ever allows applets anymore, and 2) based on the description of

The attack is simply a mashup of a GIF picture and a JAR (Java applet). The malicious JAR is compiled and then combined with information from a GIF. The GIF part fools the browser into opening it as a picture and trusting the content.

(Emphasis mine.) There comes a point where we can try to be defensive and to do due diligence to provide a safe harbor (like ESAPI does against XSS attacks because of buggy browsers) or we say "screw it; performance is more important" (e.g., the approach that the OWASP Java Encoder takes). Neither approach is really wrong, but being cautious does have performance hits, In that sense, I would argue that if you haven't done a threat model to justify that this is reasonable for your particular application, you probably shouldn't be trying to defend against it.

Lastly, that said, if we are going to put in verbiage that people need to be cautious and rewrite whatever file type they get, then we definitely ought to add something t address the ever looming danger of Windows NTFS Alternate Data Streams. I remember experimenting with that in the early 2000s where I created a simple .txt file that would open calc.exe as the alternate data stream. That seems to be a broader danger than what is mentioned here because it's by design and AFAIK, still present in NTFS. (I'm not sure; I have used Windows since the Windows 7 days.) But we don't even mention NTFS Alternate Data Steams at all in the current File Upload CS.

jmanico · 2024-07-23T15:54:54Z

I still think this is a very good addition. These types of attacks have high impact when they work.

kwwall · 2024-07-24T04:54:50Z

@jmanico - The general idea is good, but I think the example is unrealistic / too simplistic. If the example was converted to using Windows NTFS alternative data streams (which, IIRC, was almost that simple to exploit), I'd be in favor of keeping it, but I'm just convinced that it will work otherwise, unless there's a bug in the PDF view that's being used.

jmanico · 2024-07-24T15:14:24Z

Can we come up with a better explanation then? This is an important control. Here is some light AI reading on the topic.

Why Validate File Contents?

Malicious File Uploads:
Attackers can upload files with malicious code or payloads. For example, a seemingly harmless image file could contain embedded scripts or malware. If these files are processed or executed on the server, they can lead to server-side vulnerabilities like Remote Code Execution (RCE).

File Type Mismatch:
Attackers may try to upload files with misleading extensions (e.g., a .jpg file that is actually an executable or script). Relying solely on file extensions or MIME types provided by the client can be dangerous, as these can be easily spoofed.

Resource Exhaustion:
Uploading excessively large files can lead to Denial of Service (DoS) attacks by exhausting server resources like memory, disk space, or processing power.

Data Exfiltration and Information Disclosure:
If file uploads are not properly handled, attackers could upload files that, when accessed, disclose sensitive information or facilitate unauthorized access to data.

Cross-Site Scripting (XSS):
Malicious scripts can be embedded in files like images, PDFs, or documents. If these files are later rendered or processed by the application without proper sanitization, they can lead to XSS attacks.

Exploitation Techniques
Remote Code Execution (RCE):
An attacker uploads a file with server-side executable content (e.g., PHP, Python, shell script). If the application processes this file improperly, it could execute the malicious code, leading to full server compromise.

Directory Traversal:
If an application stores uploaded files in a predictable directory structure, attackers might exploit directory traversal vulnerabilities to overwrite or access sensitive files.

Content-Type Mismatch and Bypass:
By tampering with the file's MIME type or extension, attackers can bypass client-side validation checks and upload prohibited file types, leading to potential security breaches.

Phishing:
Attackers can upload files that appear to be legitimate (e.g., HTML files mimicking a login page) and then share the link to these files, tricking users into providing sensitive information.

kwwall · 2024-07-26T02:13:08Z

@jmanico - I agree that we need something. My primary objection was to the specific example(s) that were proposed. It's just not that simple (well, except when it involves NTFS alternate data streams). I just re-checked our File Upload CS and there is zero mentioned about the fact that SVG images really an XML document and JavaScript can be embedded in it, allowing things like avatars as SVG images could be a source of persistent XSS. So, yes, we definitely have a few gaps here.

OTOH, one of the things that makes this so difficult is that you almost have to do it on a per file type basis. The techniques that are recommended for (say) image types like PNG or JPG won't necessarily work for images. And given that and how extensive your AI generated list of concerns is, we'd probably double or triple the length of the current OWASP File Uploads Cheat Sheet (which is at about 6 printed pages as of the time of this comment).

Thus, I think we need to:

Strategize how we want to approach it. Perhaps it needs to be re-organized into multiple cheat sheets.
Divide and conquer. Addressing all the concerns that you brought up is likely to be too much of an undertaking for a single volunteer, but if we could get a few different people to take a particular section, we might have a chance completing it before the end of the year.
We probably also should think about which of the general areas are the most important to the development community and focus on those things first.

mackowski · 2024-08-29T07:06:50Z

@jmanico and @righettod comments from @kwwall are valid in my opinion but it would be also a wast if we do not commit anything. Do you want to try to add more details to make it rock solid? We can also add a comment when this code will be fine and when reader need to add more protections.

jmanico · 2024-08-29T07:10:15Z

I agree, and suggest we just come up with a more modern example. Fair enough Kevin and others.

Add hints about content validation

ee755ab

righettod requested review from mackowski, jmanico and szh as code owners July 21, 2024 15:27

righettod added 2 commits July 21, 2024 15:29

Fix linter error

1b1c600

Fix linter error

9ec9f79

kwwall reviewed Jul 21, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add hints about file content validation. #1452

Add hints about file content validation. #1452

righettod commented Jul 21, 2024

kwwall Jul 21, 2024

jmanico Sep 9, 2024

jmanico commented Jul 21, 2024

kwwall commented Jul 21, 2024 •

edited

Loading

jmanico commented Jul 23, 2024

kwwall commented Jul 24, 2024

jmanico commented Jul 24, 2024

kwwall commented Jul 26, 2024

mackowski commented Aug 29, 2024

jmanico commented Aug 29, 2024

Add hints about file content validation. #1452

Are you sure you want to change the base?

Add hints about file content validation. #1452

Conversation

righettod commented Jul 21, 2024

kwwall Jul 21, 2024

Choose a reason for hiding this comment

jmanico Sep 9, 2024

Choose a reason for hiding this comment

jmanico commented Jul 21, 2024

kwwall commented Jul 21, 2024 • edited Loading

jmanico commented Jul 23, 2024

kwwall commented Jul 24, 2024

jmanico commented Jul 24, 2024

kwwall commented Jul 26, 2024

mackowski commented Aug 29, 2024

jmanico commented Aug 29, 2024

kwwall commented Jul 21, 2024 •

edited

Loading