Java: CSVUtils

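This article presents a hand-written CSV parsing method that copes with fields containing special characters. The Java utility shown here splits a CSV line correctly even when a field contains commas or quotes.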


When parsing a CSV file, the simplest and most direct idea, the one that comes to mind first, is

 

"a,b,c,d".split(",")

 

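But once the data gets a little more complex and fields contain special characters such as the quote (") or the comma (,), you will find that split(",") no longer does the job. So, with some free time on my hands today, I wrote my own splitter. It has not been fully tested, but it should be fine, haha.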
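To make the failure concrete, here is a minimal sketch of what the naive approach does with a quoted field. The class name `NaiveSplitDemo` and the helper `naiveSplit` are made up for this illustration and are not part of the utility below.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of CSVUtils.
final class NaiveSplitDemo {

    // The naive approach from the text: cut at every comma, quoted or not.
    static List<String> naiveSplit(String line) {
        return Arrays.asList(line.split(","));
    }

    public static void main(String[] args) {
        // Three logical fields; the second one carries a comma inside quotes.
        String quoted = "1,\"Smith, John\",42";
        for (String piece : naiveSplit(quoted)) {
            System.out.println("[" + piece + "]");
        }

        // String.split(",") also drops trailing empty strings,
        // so the two empty fields at the end of this line are lost.
        System.out.println(naiveSplit("a,b,,").size());
    }
}
```

The quoted line comes back as four pieces instead of three, with the quote characters still attached, and the second line reports 2 fields instead of 4.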

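package com.javaeye.terrencexu.csv;

import java.util.ArrayList;
import java.util.List;

public final class CSVUtils {

	private static final char DEFAULT_SEPARATOR = ',';

	private CSVUtils() {}

	/**
	 * Splits one CSV line on commas. A field wrapped in double quotes may
	 * contain the separator, and a doubled quote ("") inside such a field
	 * stands for one literal quote.
	 * 
	 * @param line a single CSV record without its line terminator
	 * @return the fields in order; an empty list if the line is null or empty
	 */
	public static List<String> split(String line) {
		return split(line, DEFAULT_SEPARATOR);
	}

	/**
	 * Same as {@link #split(String)}, but with a caller-supplied separator.
	 * 
	 * @param line a single CSV record without its line terminator
	 * @param separator the field separator, e.g. '|'
	 * @return the fields in order; an empty list if the line is null or empty
	 */
	public static List<String> split(String line, char separator) {
		// A fresh list per call: no shared static state, so the separator
		// of one call cannot leak into the next and the class is thread-safe.
		List<String> fields = new ArrayList<String>();

		if (line == null || line.length() == 0) {
			return fields;
		}

		int curPos = 0;

		// "<=" lets a trailing separator produce a final empty field.
		while (curPos <= line.length()) {
			if (curPos < line.length() && line.charAt(curPos) == '"') {
				curPos = parseQuoted(line, curPos + 1, separator, fields);
			} else {
				curPos = parsePlain(line, curPos, separator, fields);
			}

			curPos++; // step over the separator
		}

		return fields;
	}

	/**
	 * Parses a field that is not in quotes.
	 * 
	 * @return the position of the separator that ends the field, or
	 *         line.length() if this was the last field
	 */
	private static int parsePlain(String line, int curPos, char separator, List<String> fields) {
		int nextSepPos = line.indexOf(separator, curPos);

		if (nextSepPos == -1) {
			// Last field: take the rest of the line.
			fields.add(line.substring(curPos));
			return line.length();
		}

		fields.add(line.substring(curPos, nextSepPos));
		return nextSepPos;
	}

	/**
	 * Parses a field that is in quotes. curPos points just past the
	 * opening quote.
	 * 
	 * @return the position of the separator that follows the closing quote,
	 *         or line.length() if this was the last field
	 */
	private static int parseQuoted(String line, int curPos, char separator, List<String> fields) {
		StringBuilder fld = new StringBuilder();
		int tmpPos;

		for (tmpPos = curPos; tmpPos < line.length(); tmpPos++) {
			if (line.charAt(tmpPos) == '"') {
				if (tmpPos + 1 == line.length()) {
					// Closing quote at the end of the line. Report end of
					// input; returning the quote's own position made the
					// caller append a spurious empty field.
					tmpPos = line.length();
					break;
				}

				char next = line.charAt(tmpPos + 1);
				if (next == '"') {
					tmpPos++; // escaped quote: skip one, append the other
				} else if (next == separator) {
					tmpPos++; // closing quote: stop on the separator
					break;
				}
				// A lone quote followed by anything else is kept literally.
			}

			fld.append(line.charAt(tmpPos));
		}

		fields.add(fld.toString());

		return tmpPos;
	}

}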

 

A quick test:

public static void main(String[] args) {
	String line = "col_1,Test,\"{\"\"key\"\":\"\"date\"\",\"\"order\"\":\"\"desc\"\"}\",,,,application/xml";
	String line2 = "a|b|\"|\"|d";
	
	System.out.println(CSVUtils.split(line));
	System.out.println(CSVUtils.split(line2, '|'));
}

Output:

[col_1, Test, {"key":"date","order":"desc"}, , , , application/xml]

[a, b, |, d]
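Both test lines rely on the same quoting convention: a field that contains the separator or a quote is wrapped in quotes, and every quote inside it is doubled. As a rough sketch of the inverse operation, the code below rebuilds the two test lines from the parsed fields. The names `CsvJoinSketch`, `quoteField` and `joinLine` are made up for this illustration, and line breaks inside fields are ignored because the splitter works on single lines.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of CSVUtils.
final class CsvJoinSketch {

    // Wraps a field in quotes when it contains the separator or a quote,
    // doubling every embedded quote; other fields are returned unchanged.
    static String quoteField(String field, char separator) {
        if (field.indexOf(separator) < 0 && field.indexOf('"') < 0) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins the (quoted where necessary) fields with the separator.
    static String joinLine(List<String> fields, char separator) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                line.append(separator);
            }
            line.append(quoteField(fields.get(i), separator));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("col_1", "Test",
                "{\"key\":\"date\",\"order\":\"desc\"}", "", "", "", "application/xml");
        System.out.println(joinLine(row, ','));
        System.out.println(joinLine(Arrays.asList("a", "b", "|", "d"), '|'));
    }
}
```

The two printed lines are the same strings as `line` and `line2` in the test above, which makes this a handy round-trip check when trying other inputs.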

 

 -- Done --

 
