CAPEC-80: Using UTF-8 Encoding to Bypass Validation Logic
Description
Extended Description
A URL may contain special characters that need special syntax handling in order to be interpreted. Special characters are represented using a percent character followed by two hexadecimal digits representing the octet code of the original character (%HEX-CODE).
For instance, the US-ASCII space character is represented as %20. This is often referred to as escaped encoding or percent-encoding. Since the server decodes the URL from the requests, it may restrict access to some URL paths by validating and filtering the URL requests it receives. An adversary will try to craft a URL with a sequence of special characters which, once interpreted by the server, is equivalent to a forbidden URL.
It can be difficult to protect against this attack since the URL can contain other encoding formats such as UTF-8 encoding, Unicode encoding, etc. The adversary could also subvert the meaning of the URL string request by encoding the data being sent to the server through a GET request. For instance, an adversary may subvert the meaning of parameters used in a SQL request and sent through the URL string (see Example section).
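The filtering problem described above can be illustrated with a minimal sketch. It assumes a hypothetical filter that rejects the path-traversal sequence `../`; the function names `naive_is_safe` and `canonical_is_safe`, the example paths, and the `max_rounds` limit are all illustrative and do not come from the CAPEC entry. The only library call used is Python's standard `urllib.parse.unquote`.

```python
from urllib.parse import unquote

FORBIDDEN = "../"  # illustrative denylist entry (path traversal)


def naive_is_safe(raw_path: str) -> bool:
    """Flawed check: inspects the still-encoded string, so an encoded
    form of the forbidden sequence passes unnoticed."""
    return FORBIDDEN not in raw_path


def canonical_is_safe(raw_path: str, max_rounds: int = 5) -> bool:
    """Decode first, then validate.

    Percent-decoding is repeated until the string stops changing so that
    double-encoded input (e.g. %252e) is also reduced to its final form.
    Bytes that are not valid UTF-8 cause the request to be rejected.
    """
    decoded = raw_path
    try:
        for _ in range(max_rounds):
            nxt = unquote(decoded, encoding="utf-8", errors="strict")
            if nxt == decoded:
                break
            decoded = nxt
        else:
            # Still changing after max_rounds: treat as suspicious.
            return False
    except UnicodeDecodeError:
        # Includes illegal sequences such as the overlong form %c0%af.
        return False
    return FORBIDDEN not in decoded


if __name__ == "__main__":
    attack = "/files/%2e%2e%2fetc/passwd"
    print(naive_is_safe(attack))      # True: the filter is bypassed
    print(canonical_is_safe(attack))  # False: caught after decoding
```

Decoding until the string is stable is one possible design choice, not the only one; an application might instead decode exactly once and reject any input that still contains a percent sign. In either case, the value that is validated should be the same value that is later used.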
Severity: High
Possibility: High
Type: Detailed
Relationships with other CAPECs
This table shows the other attack patterns and high level categories that are related to this attack pattern.
Prerequisites
- The application's UTF-8 decoder accepts and interprets illegal UTF-8 sequences or non-shortest-form UTF-8 encodings.
- Input filtering and validation are not done properly, leaving the door open to characters that are harmful to the target host.
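The first prerequisite can be made concrete with a short sketch. The `lenient_decode_utf8` function below is a hypothetical, deliberately flawed decoder written only for illustration (it handles one- and two-byte sequences and omits the shortest-form check); it is not taken from any real library. It is contrasted with Python's built-in strict UTF-8 decoder, which rejects the same input. The byte pair `0xC0 0xAF` is the non-shortest two-byte form of `/` (U+002F).

```python
def lenient_decode_utf8(data: bytes) -> str:
    """Hypothetical flawed decoder: supports 1- and 2-byte sequences and
    does NOT reject non-shortest (overlong) forms."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Single-byte (ASCII) character.
            out.append(chr(b))
            i += 1
        elif (0xC0 <= b <= 0xDF and i + 1 < len(data)
              and (data[i + 1] & 0xC0) == 0x80):
            # Two-byte sequence. Intentional flaw: a correct decoder must
            # reject results below U+0080, which need only one byte.
            code_point = ((b & 0x1F) << 6) | (data[i + 1] & 0x3F)
            out.append(chr(code_point))
            i += 2
        else:
            raise ValueError(f"unsupported byte 0x{b:02x} at offset {i}")
    return "".join(out)


def strict_rejects(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder refuses the input."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    payload = b"..\xc0\xaf..\xc0\xafetc"  # contains no literal '/' byte
    print(lenient_decode_utf8(payload))   # ../../etc
    print(strict_rejects(payload))        # True
```

A filter that scans the raw bytes for `/` sees nothing to block in this payload, yet the lenient decoder later produces `../../etc`. Rejecting non-shortest forms at decode time, as the strict decoder does, removes that gap.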
Skills required
- Low: An attacker can inject different representations of a filtered character in UTF-8 format.
- Medium: An attacker may craft a subtle encoding of the input data using knowledge they have gathered about the target host.
Taxonomy mappings
Mappings to ATT&CK, OWASP and other frameworks.
Related CWE
A Related Weakness relationship associates a weakness with this attack pattern. Each association implies a weakness that must exist for a given attack to be successful.
CWE-20: Improper Input Validation
CWE-73: External Control of File Name or Path
CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')
CWE-172: Encoding Error
CWE-173: Improper Handling of Alternate Encoding
CWE-180: Incorrect Behavior Order: Validate Before Canonicalize
CWE-181: Incorrect Behavior Order: Validate Before Filter
CWE-692: Incomplete Denylist to Cross-Site Scripting
CWE-697: Incorrect Comparison
Visit http://capec.mitre.org/ for more details.