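About the Baidu service: Text Recognition is Baidu's OCR service for natural scenes. Built on what Baidu describes as industry-leading OCR algorithms, it provides whole-image text detection and recognition, whole-image text recognition, whole-image text line localization, and single-character image recognition.
Enough talk, let's go straight to the demo!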
- package com.oa.test;
-
- import java.io.BufferedReader;
- import java.io.File;
- import java.io.InputStreamReader;
- import java.io.OutputStream;
- import java.net.HttpURLConnection;
- import java.net.URL;
-
- import com.oa.commons.util.BASE64;
-
- public class OCRTest {
-
-     /**
-      * Sends httpArg as a form-encoded POST body to httpUrl and returns the
-      * response body, or null if the request fails.
-      */
-     public static String request(String httpUrl, String httpArg) {
-         String result = null;
-         StringBuilder sbf = new StringBuilder();
-
-         try {
-             URL url = new URL(httpUrl);
-             HttpURLConnection connection = (HttpURLConnection) url.openConnection();
-             connection.setRequestMethod("POST");
-             connection.setRequestProperty("Content-Type",
-                     "application/x-www-form-urlencoded");
-             // Put your own API key in the "apikey" request header.
-             connection.setRequestProperty("apikey", "your own apikey");
-             connection.setDoOutput(true);
-             // Writing the body opens the connection; closing the stream flushes it.
-             try (OutputStream out = connection.getOutputStream()) {
-                 out.write(httpArg.getBytes("UTF-8"));
-             }
-             // try-with-resources closes the reader even if reading fails.
-             try (BufferedReader reader = new BufferedReader(
-                     new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
-                 String strRead;
-                 while ((strRead = reader.readLine()) != null) {
-                     sbf.append(strRead);
-                     sbf.append("\r\n");
-                 }
-             }
-             result = sbf.toString();
-         } catch (Exception e) {
-             e.printStackTrace();
-         }
-         return result;
-     }
-
-     public static void main(String[] args) throws Exception {
-         File file = new File("d:\\che4.jpg");
-         String imageBase = BASE64.encodeImgageToBase64(file);
-         // Remove any line breaks the encoder inserted (CRLF on Windows, LF elsewhere).
-         imageBase = imageBase.replaceAll("[\\r\\n]", "");
-         // URL-encode the Base64 text so that '+', '/' and '=' survive form decoding.
-         imageBase = java.net.URLEncoder.encode(imageBase, "UTF-8");
-         String httpUrl = "http://apis.baidu.com/apistore/idlocr/ocr";
-         String httpArg = "fromdevice=pc&clientip=10.10.10.0&detecttype=LocateRecognize&languagetype=CHN_ENG&imagetype=1&image=" + imageBase;
-         String jsonResult = request(httpUrl, httpArg);
-         System.out.println("Result--------->" + jsonResult);
-     }
- }
-
- // Helper method from com.oa.commons.util.BASE64 (shown without its class wrapper):
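-     public static String encodeImgageToBase64(File imageFile) {
-         byte[] data;
-         try {
-             // Read the whole file; in.available() plus a single read() may return only part of it.
-             data = java.nio.file.Files.readAllBytes(imageFile.toPath());
-         } catch (java.io.IOException e) {
-             // Fail loudly instead of passing a null array to the encoder.
-             throw new java.io.UncheckedIOException(e);
-         }
-         // java.util.Base64 (Java 8+) is used here in place of sun.misc.BASE64Encoder,
-         // which newer JDKs no longer ship. It emits no line breaks.
-         return java.util.Base64.getEncoder().encodeToString(data);
-     }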
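The demo escapes the Base64 text before putting it into the request body. The reason is that Base64 output can contain '+', and a form-encoded body treats '+' as a space, so an unescaped image would arrive corrupted. The following is a minimal sketch of that effect using only the JDK; the class and method names are made up for illustration and are not part of the Baidu API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.Base64;

// Hypothetical illustration class; not part of the original demo.
class Base64FormEncodingSketch {

    // Base64-encodes raw bytes on a single line (java.util.Base64, Java 8+).
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Percent-encodes a value for a form-encoded request body.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decodes a value the way a form-decoding server would.
    static String formDecode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Three bytes chosen so that the Base64 text consists of '+' and '/'.
        byte[] sample = {(byte) 0xfb, (byte) 0xef, (byte) 0xff};
        String base64 = toBase64(sample);

        System.out.println("Base64:              " + base64);
        // Sent as-is, every '+' is decoded to a space on the receiving side.
        System.out.println("Decoded unescaped:   [" + formDecode(base64) + "]");
        // Sent percent-encoded, the original text comes back unchanged.
        System.out.println("Decoded after encode: " + formDecode(formEncode(base64)));
    }
}
```

Escaping only '+' is enough for this particular problem; encoding the whole value also covers '/' and '=' without listing them one by one.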
Attachment:

(che4.jpg)
Result after running:
{"errNum":"0","errMsg":"success","querySign":"2289891521,4081625058","retData":[{"rect":{"left":"32","top":"15","width":"418","height":"118"},"word":"\u8c6bC88888"},{"rect":{"left":"45","top":"137","width":"373","height":"18"},"word":"\u4e1c\u98ce\u672c\u7530\u6d1b\u9633\u952e\u901a\u5e97\u7535\u8bdd\uff1a03796358222"}]}
Note: pasting this result into an online JSON validation and formatting tool (http://www.bejson.com/) gives a readable version:
- {
- "errNum": "0",
- "errMsg": "success",
- "querySign": "2289891521,4081625058",
- "retData": [
- {
- "rect": {
- "left": "32",
- "top": "15",
- "width": "418",
- "height": "118"
- },
- "word": "豫C88888"
- },
- {
- "rect": {
- "left": "45",
- "top": "137",
- "width": "373",
- "height": "18"
- },
- "word": "东风本田洛阳键通店电话:03796358222"
- }
- ]
- }
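Instead of pasting the response into an online tool, the recognized text can also be pulled out in code. The sketch below is only an assumption about how one might do it with the JDK alone: it finds each "word" value with a regular expression and decodes the four-hex-digit Unicode escapes seen in the raw response. It does not handle other JSON escapes, and a real project would more likely use a JSON library. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the original demo.
class OcrWordExtractorSketch {

    // Matches "word":"..." and captures the still-escaped string value.
    private static final Pattern WORD =
            Pattern.compile("\"word\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    // Matches a backslash, the letter u, and four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each four-hex-digit Unicode escape with the character it names.
    static String decodeUnicodeEscapes(String s) {
        Matcher m = UNICODE_ESCAPE.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the decoded "word" values in the order they appear in the response.
    static List<String> extractWords(String json) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(json);
        while (m.find()) {
            words.add(decodeUnicodeEscapes(m.group(1)));
        }
        return words;
    }

    public static void main(String[] args) {
        // Shortened copy of the response shown above (first retData entry only).
        String json = "{\"errNum\":\"0\",\"errMsg\":\"success\",\"retData\":["
                + "{\"rect\":{\"left\":\"32\",\"top\":\"15\",\"width\":\"418\",\"height\":\"118\"},"
                + "\"word\":\"\\u8c6bC88888\"}]}";
        for (String word : extractWords(json)) {
            System.out.println(word);
        }
    }
}
```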
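Pretty impressive, right? Give it a try if you're interested!
Finally, here is what each parameter means:
apikey: the API key, i.e. your own apikey
fromdevice: the source device, e.g. Android or iPhone; defaults to PC
clientip: the client's outbound IP address
detecttype: the OCR interface type
languagetype: the type of text to detect
imagetype: the image resource type
image: the image resource; currently only the jpg format is supported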
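The parameters above (apart from apikey, which the demo sends as a request header) end up in one form-encoded string. As a rough sketch, they could be assembled from a map so that every value is escaped consistently. The class name, method names, and the helper that fills in the demo's values are hypothetical; only the parameter names and values come from the demo.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the original demo.
class OcrRequestBodySketch {

    // Joins the parameters as key=value pairs separated by '&', percent-encoding both sides.
    static String buildFormBody(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(encode(entry.getKey())).append('=').append(encode(entry.getValue()));
        }
        return body.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Fills in the same values the demo uses; LinkedHashMap keeps the insertion order.
    static Map<String, String> demoParams(String imageBase64) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("fromdevice", "pc");
        params.put("clientip", "10.10.10.0");
        params.put("detecttype", "LocateRecognize");
        params.put("languagetype", "CHN_ENG");
        params.put("imagetype", "1");
        params.put("image", imageBase64); // raw Base64 text, escaped by buildFormBody
        return params;
    }

    public static void main(String[] args) {
        // "++//" stands in for real Base64 image data.
        System.out.println(buildFormBody(demoParams("++//")));
    }
}
```

The resulting string can be passed as the httpArg argument of the request method in the demo; in that case the image must not be escaped a second time beforehand.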