Fix for Unicode Transformation Issue/Vulnerability

Posted 2019-07-06 20:47

We upgraded our security scanner recently, and it's reporting a new issue.

What's the recommended fix? (We happen to be on ACF9.)

(Also, if you have an example exploit geared to CF, I'd appreciate it.)


Unicode transformation issues

Severity

High

Type

Configuration

Reported by module

Scripting (XSS.script)

Description

This page is vulnerable to various Unicode transformation issues such as best-fit mappings, overlong byte sequences, and ill-formed sequences.

A best-fit mapping occurs when a character X is transformed into an entirely different character Y. In general, best-fit mappings occur when characters are transcoded between Unicode and another encoding.
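To make the ordering problem concrete, here is a minimal Python sketch. The one-entry `BEST_FIT` table and both helper functions are hypothetical stand-ins: only the U+02B9 to U+0027 pair comes from the scan report, and a real transcoder's table would be much larger. It shows a filter approving input that only becomes dangerous after a later lossy conversion.

```python
# Hypothetical stand-in for a lossy transcoder's best-fit table.
# Only the U+02B9 -> U+0027 pair is taken from the scan report.
BEST_FIT = {0x02B9: "'"}


def naive_filter_ok(text):
    """Pretend injection filter: accept input with no ASCII apostrophe."""
    return "'" not in text


def lossy_transcode(text):
    """Simulate a conversion step that applies best-fit mappings."""
    return text.translate(BEST_FIT)


payload = "s3\u02b9 OR 1=1"          # contains MODIFIER LETTER PRIME
passed_filter = naive_filter_ok(payload)
stored = lossy_transcode(payload)    # the apostrophe appears only now

print(passed_filter, repr(stored))
```

The filter runs on the pre-conversion text, so the check has to be repeated (or moved) after every step that can change characters.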

Overlong byte sequences (non-shortest form) - UTF-8 allows for different representations of characters that also have a shorter form. For security reasons, a UTF-8 decoder must not accept UTF-8 sequences that are longer than necessary to encode a character. For example, the character U+000A (line feed) must be accepted from a UTF-8 stream only in the form 0x0A, but not in any of the following five possible overlong forms:

  • 0xC0 0x8A

  • 0xE0 0x80 0x8A

  • 0xF0 0x80 0x80 0x8A

  • 0xF8 0x80 0x80 0x80 0x8A

  • 0xFC 0x80 0x80 0x80 0x80 0x8A
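As a quick check of what a conforming decoder should do with the five forms listed above, this Python sketch feeds them to the standard library's strict UTF-8 decoder. The helper name `is_valid_utf8` is made up for the example; it says nothing about how ColdFusion's own decoder behaves.

```python
# The five overlong encodings of U+000A from the list above.
OVERLONG_LF = [
    bytes([0xC0, 0x8A]),
    bytes([0xE0, 0x80, 0x8A]),
    bytes([0xF0, 0x80, 0x80, 0x8A]),
    bytes([0xF8, 0x80, 0x80, 0x80, 0x8A]),
    bytes([0xFC, 0x80, 0x80, 0x80, 0x80, 0x8A]),
]


def is_valid_utf8(raw):
    """Return True only if raw decodes under strict UTF-8 rules."""
    try:
        raw.decode("utf-8")  # errors="strict" is the default
        return True
    except UnicodeDecodeError:
        return False


for seq in OVERLONG_LF:
    print(seq.hex(), is_valid_utf8(seq))

print("0a", is_valid_utf8(b"\x0a"))
```

Only the single byte 0x0A is accepted; every overlong form is rejected.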

Ill-formed subsequences: as required by Unicode 3.0 and noted in Unicode Technical Report #36, if a leading byte is followed by an invalid successor byte, the decoder should NOT consume that successor byte.
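A small Python illustration of that rule, using the standard library decoder in replacement mode. The input bytes are invented for the example. The lead byte 0xC2 expects a continuation byte but is followed by `<`, so only the lead byte is replaced and `<` survives for later filters to see.

```python
# 0xC2 starts a 2-byte sequence, but '<' (0x3C) is not a continuation byte.
raw = b"\xc2<script>"

# errors="replace" substitutes U+FFFD for the bad lead byte only;
# the following '<' is not swallowed along with it.
text = raw.decode("utf-8", errors="replace")

print(repr(text))
```

A decoder that consumed the successor byte would hide the `<` from any check that runs on the decoded text.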

Impact

Software vulnerabilities arise when best-fit mappings occur. For example, characters can be manipulated to bypass string handling filters, such as cross-site scripting (XSS) or SQL injection filters, WAFs, and IDS devices. Overlong UTF-8 sequences can be abused to bypass UTF-8 substring tests that look only for the shortest possible encoding.
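The overlong case can be sketched in Python as follows. Python's own decoder is strict, so `lax_decode_2byte` is a deliberately broken, hypothetical decoder written only to show the effect, and `byte_filter_ok` is an invented substring test of the kind the report describes.

```python
def lax_decode_2byte(raw):
    """Deliberately broken decoder: accepts 2-byte sequences without
    checking for the shortest form. Handles ASCII and 2-byte input only."""
    out, i = [], 0
    while i < len(raw):
        b0 = raw[i]
        if b0 < 0x80:
            out.append(chr(b0))
            i += 1
        elif 0xC0 <= b0 <= 0xDF and i + 1 < len(raw):
            out.append(chr(((b0 & 0x1F) << 6) | (raw[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("unsupported input for this sketch")
    return "".join(out)


def byte_filter_ok(raw):
    """Substring test that looks only for the shortest encoding of '<'."""
    return b"<" not in raw


payload = b"\xc0\xbcscript\xc0\xbe"   # overlong forms of '<' and '>'
print(byte_filter_ok(payload), lax_decode_2byte(payload))
```

The filter passes the bytes, yet the lax decoder turns them into `<script>`. A strict decoder rejects the same bytes outright.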

Recommendation

Identify the source of these Unicode transformation issues and fix them. Consult the web references below for more information.

References

Unicode Security

UTF-8 and Unicode FAQ for Unix/Linux

A couple of unicode issues on PHP and Firefox

Unicode Security Considerations

Affected items

/mysite-portal/

Details

URL encoded POST input linkServID was set to acu5955%EF%BC%9Cs1%EF%B9%A5s2%CA%BAs3%CA%B9uca5955

List of issues:

  • Unicode character U+02B9 MODIFIER LETTER PRIME (encoded as %CA%B9) was transformed into U+0027 APOSTROPHE (')

  • Unicode character U+02B9 MODIFIER LETTER PRIME (encoded as %CA%B9) was transf ... (line truncated)
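To see what the scanner actually sent, the `linkServID` value from the report can be percent-decoded and inspected. This is a rough Python sketch; the variable names are made up for the example.

```python
import unicodedata
from urllib.parse import unquote

# The test value quoted in the scan report.
payload = "acu5955%EF%BC%9Cs1%EF%B9%A5s2%CA%BAs3%CA%B9uca5955"

# Strict decoding, so ill-formed UTF-8 would raise instead of being hidden.
decoded = unquote(payload, encoding="utf-8", errors="strict")

# List every non-ASCII character with its code point and Unicode name.
non_ascii = [
    (f"U+{ord(ch):04X}", unicodedata.name(ch))
    for ch in decoded
    if ord(ch) > 0x7F
]

for code_point, name in non_ascii:
    print(code_point, name)
```

The value carries four lookalike characters between ASCII markers, so the scanner can tell from the response which ones were turned into their ASCII counterparts.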

Request headers

GET /mysite-portal/?display=login&status=failed&rememberMe=0&contentid=&LinkServID=acu5955%1 Cs1es2%BAs3%B9uca5955&returnURL=https://stage-cms.mysite.com/mysite-portal/ HTTP/1.1

Referer: https://stage-cms.mysite.com:443/

Connection: Keep-alive

Accept-Encoding: gzip,deflate

User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

Accept: */*

Host: stage-cms.mysite.com

3 answers
叛逆
#2 · 2019-07-06 21:16

The answer is canonicalization.

https://www.owasp.org/index.php/Canonicalization,_locale_and_Unicode#How_to_protect_yourself

How to protect yourself

A suitable canonical form should be chosen and all user input canonicalized into that form before any authorization decisions are performed. Security checks should be carried out after UTF-8 decoding is completed. Moreover, it is recommended to check that the UTF-8 encoding is a valid canonical encoding for the symbol it represents.
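The order described above (decode, canonicalize, then check) might look roughly like this in Python. This is a sketch under assumptions: NFKC is picked here as the canonical form, the allowlist pattern is invented, and neither is prescribed by OWASP or by ColdFusion's `canonicalize`.

```python
import re
import unicodedata
from urllib.parse import unquote_to_bytes

# Hypothetical allowlist for an identifier-like parameter.
ALLOWED = re.compile(r"[A-Za-z0-9_-]{1,64}")


def to_canonical_form(raw_param):
    """Percent-decode, strictly decode UTF-8, then normalize to NFKC."""
    data = unquote_to_bytes(raw_param)
    text = data.decode("utf-8")          # raises on overlong/ill-formed input
    return unicodedata.normalize("NFKC", text)


def is_acceptable(raw_param):
    """Run the security check only on the canonical form."""
    try:
        text = to_canonical_form(raw_param)
    except UnicodeDecodeError:
        return False
    return ALLOWED.fullmatch(text) is not None


print(is_acceptable("acu5955"))
print(is_acceptable("acu5955%EF%BC%9Cs1%EF%B9%A5s2%CA%BAs3%CA%B9uca5955"))
print(is_acceptable("%C0%8A"))
```

An allowlist applied after canonicalization rejects the scanner's value whether or not a lookalike character maps to something dangerous later on.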

http://www.mattgifford.co.uk/canonicalize-method-in-coldfusion-8-and-coldfusion-9

冷血范
#3 · 2019-07-06 21:19

There are various solutions for this, depending on your engine.

For CF 8 and 9 users:

A set of functions to work around this can be found at:

https://github.com/coldfumonkeh/cfml-security

For CF 10 users:

canonicalize(inputString, restrictMultiple, restrictMixed)

This built-in function covers this concern. See http://help.adobe.com/en_US/ColdFusion/10.0/CFMLRef/WS932f2e4c7c04df8f-1a0d37871353e31b968-8000.html
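For readers not on ColdFusion, the idea behind restricting multiple encoding can be approximated in Python. This is only a loose analogue covering percent-encoding, not the ColdFusion implementation; `decode_layers` and its `max_rounds` parameter are invented for the example.

```python
from urllib.parse import unquote


def decode_layers(value, max_rounds=5):
    """Percent-decode repeatedly until the text stops changing.

    Returns (final_text, rounds), where rounds counts how many decoding
    passes changed the value. More than one suggests nested encoding.
    """
    rounds = 0
    while rounds < max_rounds:
        decoded = unquote(value, encoding="utf-8", errors="strict")
        if decoded == value:
            break
        value = decoded
        rounds += 1
    return value, rounds


print(decode_layers("plain"))
print(decode_layers("%3Cscript%3E"))
print(decode_layers("%253Cscript%253E"))
```

A caller could treat any result with more than one round as suspicious and reject it rather than use the fully decoded text.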

For Railo users:

This was addressed in 4.0.0.011:

https://issues.jboss.org/browse/RAILO-1873?_sscc=t

狗以群分
#4 · 2019-07-06 21:23

Canonicalization won't help you if your user input contains ill-formed sequences.

For more information on how to handle ill-formed subsequences, see "Constraints on Conversion Processes" in Section 3.9, "Unicode Encoding Forms", of Unicode 5.2.

In those cases, replace the invalid sequences with the replacement character U+FFFD, which was built exactly for this purpose. That's the magic pill that works in 99.9% of cases, but the remaining 0.1% is enough to wipe out your databases.

To be really secure, you need to fully analyze your input parsers to see whether they are vulnerable to U+FFFD replacements.

The best solution, which works every time, is to stop parsing, clean up your junk, and then return an error message.
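The two strategies above can be compared in a short Python sketch. The function names and the sample bytes are made up for illustration. The first helper substitutes U+FFFD, the second stops and reports an error the caller can turn into an error response.

```python
def decode_with_replacement(raw):
    """Lenient strategy: swap each ill-formed sequence for U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def decode_or_reject(raw):
    """Strict strategy: stop at the first ill-formed sequence and fail."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # The caller is expected to discard the request and return an error.
        raise ValueError(
            f"ill-formed UTF-8 at byte offset {exc.start}"
        ) from exc


sample = b"id=\xff1"   # 0xFF can never appear in well-formed UTF-8

print(repr(decode_with_replacement(sample)))
try:
    decode_or_reject(sample)
except ValueError as err:
    print("rejected:", err)
```

With replacement, downstream code still has to cope with U+FFFD inside the value. With rejection, nothing malformed gets that far.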
