A Post-Unicode Normalization Vulnerability

Low
tlipkis published GHSA-mgqj-hphf-9588 Aug 23, 2023

Package

lockss-daemon (Maven)

Affected versions

1.76.x and earlier

Patched versions

1.77.3

Description

Summary

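The code snippet below runs its security checks before Unicode normalization, which exposes it to a post-normalization bypass (CWE-176: Improper Handling of Unicode Encoding).
This class of vulnerability arises when input is sanitized first and normalized afterwards, so normalization can produce characters that the earlier checks never saw.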

    /**
     * Sanitises a string so that it can be used as a div id
     *
     * @param name
     * @return Returns sanitized string
     */
    public static String cleanName(String name) {
        return Normalizer.normalize(HtmlUtil.encode(name.replace(" ", "_").replace("&", "").replace("(", "")
                .replace(")", "").replace(",", "").replace("+", "_"), HtmlUtil.ENCODE_TEXT), Normalizer.Form.NFC);
    }

As can be seen, the function cleanName() sanitizes the name by replacing or removing spaces, ampersands, and the characters ( ) , +.
However, because NFC normalization is applied last, it may reintroduce characters that the sanitization step was meant to remove.

Impact

This is a low-severity vulnerability. A mitigation is to apply Unicode normalization first and only then remove or replace the unwanted characters.
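The reordering described above could look roughly like the following. This is a minimal sketch, not the patch that shipped in 1.77.3: the class name `CleanNameSketch` and the method name `cleanNameNormalizedFirst` are hypothetical, and the HTML-encoding step is left out so the example depends only on the JDK.

```java
import java.text.Normalizer;

public class CleanNameSketch {
    /**
     * Hypothetical reordering of cleanName(): normalize first, filter second.
     * The HTML-encoding step of the original is intentionally omitted here.
     */
    static String cleanNameNormalizedFirst(String name) {
        // Step 1: bring the input to NFC before any security-relevant check.
        String normalized = Normalizer.normalize(name, Normalizer.Form.NFC);
        // Step 2: apply the same replacements as the original snippet,
        // now on text that normalization can no longer change afterwards.
        return normalized.replace(" ", "_").replace("&", "").replace("(", "")
                .replace(")", "").replace(",", "").replace("+", "_");
    }

    public static void main(String[] args) {
        System.out.println(cleanNameNormalizedFirst("a b&(c),d+e")); // prints a_bcd_e
    }
}
```

In a real fix the filtered string would still be passed through the HTML encoder, as in the original; the point of the sketch is only that no normalization runs after the filtering.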

As an example of a reintroduced character, applying NFC normalization to U+1FEF (`, GREEK VARIA) yields U+0060 (`, GRAVE ACCENT). Other code points can behave in the same way.
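The mapping in this example can be checked with the JDK's own `java.text.Normalizer`. The snippet below is a small illustration only; the class name `NfcDemo` and the helper `nfc` are hypothetical.

```java
import java.text.Normalizer;

public class NfcDemo {
    // Returns the NFC form of the given string.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String greekVaria = "\u1FEF"; // GREEK VARIA, not an ASCII character
        String normalized = nfc(greekVaria);

        // A filter that runs before normalization sees U+1FEF and lets it pass;
        // after normalization the string contains the ASCII grave accent U+0060.
        System.out.println(greekVaria.contains("`")); // prints false
        System.out.println(normalized.equals("`"));   // prints true
    }
}
```

Any check for the ASCII character that runs before the call to `Normalizer.normalize` is therefore blind to it.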

Severity

Low

CVE ID

No known CVE
