Java: processing <p> tags in HTML so that every tag is closed

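Requirement:

Process the <p> tags in a piece of HTML so that they are not nested, every tag is closed, and no content is left outside a tag.

For example: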

11111<p></p>22222<p>33333</p><p>44444</p>555555<p></p>66666
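
To make the requirement concrete, here is a minimal sketch of a checker that tests whether a string already satisfies it. The class name PTagValidator and the method isClean are hypothetical and are not part of the original code. It also assumes that only the literal tags <p> and </p> occur, with no attributes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
class PTagValidator {
    // Assumption: only bare <p> and </p> appear (no attributes, no upper case).
    private static final Pattern TAG = Pattern.compile("</?p>");

    /** Returns true if all content sits inside closed, non-nested <p>...</p> pairs. */
    static boolean isClean(String html) {
        Matcher m = TAG.matcher(html);
        boolean open = false; // are we currently inside a paragraph?
        int cursor = 0;       // end offset of the previous tag
        while (m.find()) {
            boolean isOpenTag = m.group().equals("<p>");
            if (isOpenTag) {
                if (open) return false;                // nested <p>
                if (m.start() != cursor) return false; // text outside any <p>
            } else if (!open) {
                return false;                          // </p> without a matching <p>
            }
            open = isOpenTag;
            cursor = m.end();
        }
        // Nothing may be left open, and no text may trail the last </p>.
        return !open && cursor == html.length();
    }
}
```

The example input above fails this check, because "11111" sits before the first <p>.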


Mind map:

Code:

public static String checkpp(String content){
		String Str;
		int time =0;
		int pos =0;
		int path =0;
		boolean flag =true;
		int index1 =0;
		int index2 =0;
		
		// Make sure the content starts with <p> and ends with </p>
		int firstindex = content.indexOf("<p>");
		if(firstindex !=0){
			content =  "<p>"+content;
		}
		int lastindex = content.lastIndexOf("</p>");
		if(lastindex != content.length()-4 || lastindex == -1){
			content = content +"</p>";
		}
		
		StringBuffer  sb = new StringBuffer(content);
		
		do{
			time++;
		System.out.println("-------"+time+"----- path:"+path);
		switch(path){
			case 0:
				index1 =sb.indexOf("<p>",pos+1);
				index2 =sb.indexOf("</p>",pos+1);
				System.out.println(sb.toString());
				System.out.println("pos:"+pos);
				System.out.println("index1:"+index1);
				System.out.println("index2:"+index2);
				if(index1 != -1){
					if(index2<index1){ // <p></p>
						path = 1;	
						pos = index2;
					}else{     // <p>(...)*<p> → <p>(xxx)*</p><p>
						sb = sb.insert(index1, "</p>");
						path = 0;
						pos = index1+4;
					}
				}else{  // no more <p> but it's not end yet:<p>(...</p>)*</p>
					path = 1;	
					pos = index2;
				}
				break;
			case 1:
				index1 =sb.indexOf("<p>",pos+1);
				index2 =sb.indexOf("</p>",pos+1);
				System.out.println(sb.toString());
				System.out.println("pos:"+pos);
				System.out.println("index1:"+index1);
				System.out.println("index2:"+index2);
				//no more </p> exit
				if(index2 == -1){
					flag = false;
					break;
				}
				//no more <p> but not end
				if(index1 == -1){  
					sb = sb.insert(pos+4, "<p>");
					path =0;
					pos = pos + 7;
					break;
				}else{
					/* </p>(....)*<p>
					 * ↑
					 * pos
					 */ 
					if(index1<index2){
						if(index1 - pos > 4){ // </p>(xxx)<p> → </p><p>(xxx)</p><p>
							sb = sb.insert(pos+4, "<p>");
							sb = sb.insert(index1+3, "</p>");
							path = 0;
							pos = index1 +7;
						}else if(index1 - pos == 4){  // </p><p> 
							path = 0;
							pos = index1;
						}
					}else{ //</p>(....)*</p> → </p><p>(...)*</p>
						sb = sb.insert(pos+4, "<p>");
						path = 1;
						pos = index2 +3;
					}					
					break;
				}
			default:
				break;
		}
		System.out.println("pos:"+pos);
		System.out.println(sb.toString());
		//exit
		if(pos == sb.length()-4)
			break;

		}while(flag);
		

		Str =sb.toString();
		Str= Str.replaceAll("<p>(\\s)*( )*( )*</p>", "");
		return Str;
	}
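
The two consecutive inserts in case 1 (pos+4, then index1+3) are easy to misread. Inserting "<p>" shifts everything after it three characters to the right, so the old <p> that was found at index1 now sits at index1+3. The sketch below isolates just that step. The class and method names are hypothetical, and pos and index1 are passed in by hand rather than searched for.

```java
// Hypothetical demo of the index shift caused by StringBuffer.insert.
class InsertShiftDemo {
    /**
     * pos    = offset of a closing tag </p>
     * index1 = offset of the next <p>, with loose text in between
     */
    static String wrapLooseText(String s, int pos, int index1) {
        StringBuffer sb = new StringBuffer(s);
        sb.insert(pos + 4, "<p>");     // open a paragraph right after the </p> at pos
        sb.insert(index1 + 3, "</p>"); // the old <p> moved 3 chars right; close just before it
        return sb.toString();
    }

    public static void main(String[] args) {
        // </p> at offset 0, next <p> at offset 7
        System.out.println(wrapLooseText("</p>xxx<p>y</p>", 0, 7));
    }
}
```

After both inserts the old <p> ends up at index1+7, which is the value the original code assigns to pos.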

Log from running the example:
11111<p></p>22222<p>33333</p><p>44444</p>555555<p></p>66666
-------1----- path:0
<p>11111<p></p>22222<p>33333</p><p>44444</p>555555<p></p>66666</p>
pos:0
index1:8
index2:11
pos:12
<p>11111</p><p></p>22222<p>33333</p><p>44444</p>555555<p></p>66666</p>
-------2----- path:0
<p>11111</p><p></p>22222<p>33333</p><p>44444</p>555555<p></p>66666</p>
pos:12
index1:24
index2:15
pos:15
<p>11111</p><p></p>22222<p>33333</p><p>44444</p>555555<p></p>66666</p>
-------3----- path:1
<p>11111</p><p></p>22222<p>33333</p><p>44444</p>555555<p></p>66666</p>
pos:15
index1:24
index2:32
pos:31
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p>555555<p></p>66666</p>
-------4----- path:0
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p>555555<p></p>66666</p>
pos:31
index1:43
index2:39
pos:39
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p>555555<p></p>66666</p>
-------5----- path:1
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p>555555<p></p>66666</p>
pos:39
index1:43
index2:51
pos:43
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p>555555<p></p>66666</p>
-------6----- path:0
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p>555555<p></p>66666</p>
pos:43
index1:61
index2:51
pos:51
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p>555555<p></p>66666</p>
-------7----- path:1
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p>555555<p></p>66666</p>
pos:51
index1:61
index2:64
pos:68
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p><p>555555</p><p></p>66666</p>
-------8----- path:0
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p><p>555555</p><p></p>66666</p>
pos:68
index1:-1
index2:71
pos:71
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p><p>555555</p><p></p>66666</p>
-------9----- path:1
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p><p>555555</p><p></p>66666</p>
pos:71
index1:-1
index2:80
pos:78
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p><p>555555</p><p></p><p>66666</p>
-------10----- path:0
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p><p>555555</p><p></p><p>66666</p>
pos:78
index1:-1
index2:83
pos:83
<p>11111</p><p></p><p>22222</p><p>33333</p><p>44444</p><p>555555</p><p></p><p>66666</p>
<p>11111</p><p>22222</p><p>33333</p><p>44444</p><p>555555</p><p>66666</p>
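
The last two log lines differ only in that the empty <p></p> pairs are gone. That is the effect of the final replaceAll call. A minimal sketch of this cleanup step on its own follows. It uses a simplified pattern that matches only \s whitespace between the tags, and the class name is hypothetical.

```java
// Hypothetical helper showing only the empty-paragraph cleanup.
class EmptyParagraphs {
    /** Removes <p>...</p> pairs that contain nothing but \s whitespace. */
    static String strip(String html) {
        // Simplified pattern: \s covers spaces, tabs and line breaks only.
        return html.replaceAll("<p>\\s*</p>", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("<p>11111</p><p></p><p>22222</p><p> </p>"));
    }
}
```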

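Analysis:

1. Because only <p> and </p> are involved, the two tags can be treated as two states (see the mind map above), and the string is processed from left to right.

2. The String is converted to a StringBuffer so that insertions are easy.

3. The traversal uses an outer do-while loop with an inner switch.

4. pos records the current target position.

5. A flag is used: when a termination condition is triggered inside the switch, the flag is set to false and the case breaks, so the while(flag) test ends the loop.

6. Termination signals: a. the current pos is at the last </p>; this condition applies to both states. b. scanning forward from the current pos finds no further </p>, which means the current one is the last </p>, i.e. the end.

Note: because the <p> state is not a terminal state, the loop cannot exit directly from case 0 of the switch; it can only transition to case 1.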
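
For comparison, the same two-state idea can be written as a single left-to-right pass that builds a new string instead of inserting into the existing one. This is an alternative sketch, not the code above. The name PTagNormalizer is hypothetical, only bare <p> and </p> are recognized, a </p> with no open paragraph is simply dropped, and empty paragraphs are removed with a simplified \s-only pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Alternative sketch (hypothetical), not the author's implementation.
class PTagNormalizer {
    private static final Pattern TAG = Pattern.compile("</?p>");

    static String normalize(String html) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(html);
        boolean open = false; // the two states: inside a paragraph or not
        int cursor = 0;       // end offset of the previous tag
        while (m.find()) {
            open = appendText(out, html.substring(cursor, m.start()), open);
            if (m.group().equals("<p>")) {
                if (open) out.append("</p>"); // close instead of nesting
                out.append("<p>");
                open = true;
            } else if (open) {
                out.append("</p>");
                open = false;
            } // a </p> with nothing open is dropped
            cursor = m.end();
        }
        open = appendText(out, html.substring(cursor), open);
        if (open) out.append("</p>"); // close a trailing open paragraph
        return out.toString().replaceAll("<p>\\s*</p>", "");
    }

    // Appends text, opening a paragraph first if the text would be exposed.
    private static boolean appendText(StringBuilder out, String text, boolean open) {
        if (text.isEmpty()) return open;
        if (!open) out.append("<p>");
        out.append(text);
        return true;
    }
}
```

On the example input this yields the same final string as the last line of the log. Because it never inserts into the buffer it is scanning, there is no pos bookkeeping and no separate exit condition to maintain.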
