atoi java

Latest recommended article published on 2021-03-16 03:07:57

cncnlg

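String to integer: given a string that represents an integer, convert it to an integer and output it. For example, the input string "345" yields the integer 345.

In written tests and interviews, atoi, i.e. "convert a string to an integer", is a classic problem. It has little to do with algorithms; what it really tests is coding ability and attention to detail. So I wrote my own version, and afterwards opened the JDK source to see how the experts do it ("standing on the shoulders of giants", as they say). Sure enough, there is a lot of interesting stuff in there.

First, the basic idea: scan the string from left to right, and at each character multiply the result so far by 10 and add the digit that the current character represents.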
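As a minimal sketch of just this accumulation step, assuming the input is non-empty and consists only of the characters '0' to '9' (no sign, no validation, no overflow handling), the loop could look like the following. The class and method names are made up for illustration.

```java
class BasicAtoiSketch {
    // Hypothetical helper: assumes s is non-empty and holds only '0'..'9'.
    static int basicAtoi(String s) {
        int result = 0;
        for (int i = 0; i < s.length(); i++) {
            int digit = s.charAt(i) - '0';   // character -> numeric value
            result = result * 10 + digit;    // shift left one decimal place, then add
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(basicAtoi("345")); // prints 345
    }
}
```

For "345" the intermediate values of result are 3, 34 and 345.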

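The idea is simple, but there are many details to consider, and they are the real point of the problem.

The string may start with '+' or '-', indicating the sign of the integer.
What if the string is null or empty ("")?
What if the string contains non-digit characters?
What if the string is just a single "+" or "-"?
Finally, there is the overflow problem.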
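One way to get a feel for these cases is to probe how the JDK's own Integer.parseInt treats them. The sketch below is only a probe, not part of any solution; the class name and the rejected helper are hypothetical.

```java
class EdgeCaseProbe {
    // Hypothetical helper: true when Integer.parseInt refuses the input.
    static boolean rejected(String s) {
        try {
            Integer.parseInt(s);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {"+345", "-345", null, "", "12a", "+", "-", "2147483648"};
        for (String s : inputs) {
            System.out.println("input " + s + " rejected: " + rejected(s));
        }
    }
}
```

Signed inputs such as "+345" and "-345" are accepted, while null, the empty string, a lone sign, a string with a non-digit character and an out-of-range value are all reported through NumberFormatException.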

Here is my code first!

     
    public static int atoi(String s) throws Exception {
        if (s == null || s.length() == 0) {
            throw new Exception("illegal number input");
        }
        final int MAX_DIV = Integer.MAX_VALUE / 10;
        final int MIN_DIV = -(Integer.MIN_VALUE / 10);
        final int MAX_M = Integer.MAX_VALUE % 10;
        final int MIN_M = -(Integer.MIN_VALUE % 10);
        int result = 0;
        int i = 0, len = s.length();
        int sign = 1;
        int digit = s.charAt(0);            // current character

        if (digit == '-' || digit == '+') {
            if (digit == '-') {
                sign = -1;
            }
            if (len == 1) {                 // a lone "+" or "-" is not a number
                throw new Exception("illegal number input");
            }
            i++;
        }

        while (i < len) {
            digit = s.charAt(i++) - '0';
            if (digit >= 0 && digit <= 9) {
                if (sign > 0 && (result > MAX_DIV || (result == MAX_DIV && digit > MAX_M))) { // positive overflow
                    throw new Exception("number overflow");
                } else if (sign < 0 && (result > MIN_DIV || (result == MIN_DIV && digit > MIN_M))) { // negative overflow
                    throw new Exception("number overflow");
                }
                // For "-2147483648" this wraps to Integer.MIN_VALUE on the last digit;
                // negating it below still yields Integer.MIN_VALUE, so the result is correct.
                result = result * 10 + digit;
            } else {
                throw new Exception("illegal number input");
            }
        }
        return sign > 0 ? result : -result;
    }

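The code above throws an exception for each kind of illegal input; the part worth a closer look is the overflow check.

A common way to check for overflow is digit > INT_MAX - result*10 (ignoring the sign for now), but this check is flawed.

We know INT_MAX = 2147483647. If the input string is 2147483657, then at the last digit we have result*10 = 2147483650 > INT_MAX. The multiplication has already overflowed at that point, so the check is comparing against a wrapped-around value and can no longer be trusted.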
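To make the problem concrete, here is a small sketch of a parser that uses the flawed check, assuming unsigned digit-only input. The class and method names are hypothetical. Java int arithmetic wraps silently, so for the input 4294967300 the product 429496730 * 10 wraps around to 4, the check sees nothing wrong, and the method returns 4.

```java
class NaiveOverflowCheck {
    // Hypothetical helper: digits only, no sign; uses the flawed overflow check.
    static int naiveAtoi(String s) {
        int result = 0;
        for (int i = 0; i < s.length(); i++) {
            int digit = s.charAt(i) - '0';
            // Flawed: result * 10 may already have wrapped around here.
            if (digit > Integer.MAX_VALUE - result * 10) {
                throw new ArithmeticException("overflow");
            }
            result = result * 10 + digit;
        }
        return result;
    }

    public static void main(String[] args) {
        // 4294967300 = 2^32 + 4, so the wrapped product is 4 and the last digit is 0.
        System.out.println(naiveAtoi("4294967300")); // prints 4
    }
}
```

On the input 2147483657 this sketch does throw, but only because the subtraction wraps a second time and happens to produce a negative number; the check is not detecting the overflow by design.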

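July's blog mentions a better solution: use division and modulo.

First compare result with INT_MAX/10. If result > INT_MAX/10, the final integer is certain to overflow, so the overflow can be handled right away and result*10 never needs to be computed. (The case result == INT_MAX/10 needs separate care.)
When result == INT_MAX/10, if digit > INT_MAX%10, then result*10+digit still overflows, so it is treated as overflow as well.

For example, a positive number can overflow in two ways:

One is a number such as 2147483650, where result > MAX/10;
the other is a number such as 2147483649, where result == MAX/10 && digit > MAX%10.

Since INT_MAX = 2147483647 and INT_MIN = -2147483648, their absolute values differ. The overflow conditions for positive and negative numbers are therefore different, and the code uses two separate conditions to detect overflow.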
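The two conditions can be sketched as a single predicate over the four constants. This is only an illustration of the check described above; the class name and the wouldOverflow helper are hypothetical, and result is assumed to hold the non-negative magnitude accumulated so far. Java integer division truncates toward zero and % takes the sign of the dividend, so Integer.MIN_VALUE / 10 is -214748364 and Integer.MIN_VALUE % 10 is -8.

```java
class OverflowBounds {
    static final int MAX_DIV = Integer.MAX_VALUE / 10;     // 214748364
    static final int MAX_M   = Integer.MAX_VALUE % 10;     // 7
    static final int MIN_DIV = -(Integer.MIN_VALUE / 10);  // 214748364
    static final int MIN_M   = -(Integer.MIN_VALUE % 10);  // 8

    // Hypothetical helper: would appending digit to the magnitude result overflow?
    static boolean wouldOverflow(int result, int digit, boolean negative) {
        int div = negative ? MIN_DIV : MAX_DIV;
        int mod = negative ? MIN_M : MAX_M;
        return result > div || (result == div && digit > mod);
    }

    public static void main(String[] args) {
        System.out.println(wouldOverflow(214748364, 8, false)); // true:  2147483648 > INT_MAX
        System.out.println(wouldOverflow(214748364, 8, true));  // false: -2147483648 == INT_MIN
    }
}
```

The only difference between the two signs is the last-digit threshold, 7 versus 8, which reflects the one extra value on the negative side.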

Now let's study how parseInt is implemented in the JDK!

     
    /*
     * The parseInt methods of the Integer class
     * Copyright (c) 1994, 2010, Oracle and/or its affiliates. All rights reserved.
     */
    public static int parseInt(String s) throws NumberFormatException {
        return parseInt(s, 10);
    }

    public static int parseInt(String s, int radix)
                throws NumberFormatException
    {
        /*
         * WARNING: This method may be invoked early during VM initialization
         * before IntegerCache is initialized. Care must be taken to not use
         * the valueOf method.
         */

        if (s == null) {
            throw new NumberFormatException("null");        // handle null
        }

        if (radix < Character.MIN_RADIX) {                  // radix must not be less than 2
            throw new NumberFormatException("radix " + radix +
                                            " less than Character.MIN_RADIX");
        }

        if (radix > Character.MAX_RADIX) {                  // radix must not be greater than 36
            throw new NumberFormatException("radix " + radix +
                                            " greater than Character.MAX_RADIX");
        }

        int result = 0;                      // the integer result (accumulated as a negative value)
        boolean negative = false;            // whether the input is negative
        int i = 0, len = s.length();
        int limit = -Integer.MAX_VALUE;      // bound for a valid value (a lower bound, since we accumulate negatively)
        int multmin;                         // lowest value result may hold before it is multiplied by radix
        int digit;                           // numeric value of the current character

        if (len > 0) {
            char firstChar = s.charAt(0);
            if (firstChar < '0') { // Possible leading "+" or "-"
                if (firstChar == '-') {
                    negative = true;
                    limit = Integer.MIN_VALUE;
                } else if (firstChar != '+')
                    throw NumberFormatException.forInputString(s);

                if (len == 1) // a lone "+" or "-" is not allowed
                    throw NumberFormatException.forInputString(s);
                i++;
            }
            multmin = limit / radix;
            while (i < len) {
                // Accumulating negatively avoids surprises near MAX_VALUE
                digit = Character.digit(s.charAt(i++), radix);     // convert the character in the given radix
                if (digit < 0) {                                   // not a valid digit
                    throw NumberFormatException.forInputString(s);
                }
                if (result < multmin) {                            // overflow check
                    throw NumberFormatException.forInputString(s);
                }
                result *= radix;
                if (result < limit + digit) {
                    throw NumberFormatException.forInputString(s);
                }
                result -= digit;
            }
        } else {     // the string must not be "", i.e. its length must be greater than 0
            throw NumberFormatException.forInputString(s);
        }
        return negative ? result : -result;
    }
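Because the one-argument overload simply delegates with radix 10, the same code path handles every base from Character.MIN_RADIX (2) to Character.MAX_RADIX (36). A short usage sketch follows; the class name and the radixRejected helper are made up, while the Integer and Character calls are the standard JDK ones.

```java
class RadixDemo {
    // Hypothetical helper: true when parseInt refuses the given radix for input "10".
    static boolean radixRejected(int radix) {
        try {
            Integer.parseInt("10", radix);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.parseInt("ff", 16));     // prints 255
        System.out.println(Integer.parseInt("-1010", 2));   // prints -10
        System.out.println(Character.digit('f', 16));       // prints 15
        System.out.println(Character.digit('g', 16));       // prints -1 ('g' is not a hex digit)
        System.out.println(radixRejected(1));               // prints true (below MIN_RADIX)
        System.out.println(radixRejected(37));              // prints true (above MAX_RADIX)
    }
}
```

Character.digit returning -1 for an invalid character is what lets the parsing loop reject non-digits with a single digit < 0 test.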

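As you can see, parseInt(String s) calls parseInt(String s, int radix). The source checks its arguments and handles invalid input very thoroughly; rigor and ingenuity are the defining traits of this code. The part that is hardest to follow is probably the overflow check.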

     
    if (result < multmin) {                         // overflow check
        throw NumberFormatException.forInputString(s);
    }
    result *= radix;
    if (result < limit + digit) {
        throw NumberFormatException.forInputString(s);
    }
    result -= digit;

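The idea is much the same as the overflow check discussed above, except that the JDK treats every integer as a negative number first. What is so clever about that?

We know that a signed int ranges from -2147483648 to 2147483647, so the negative side can represent one more value than the positive side. That makes it easy to see why limit is always expressed as a negative number (a lower bound) at the start. Accumulating in the negative range covers both signs, since the larger range contains the smaller one, so there is no need to check the positive and negative cases separately as my code does. Likewise, multmin is always negative; it depends only on the sign and on the radix. For radix 10, multmin is -INT_MAX/10 for a non-negative input and INT_MIN/10 for a negative one. As in the discussion above, the first condition says that result < multmin means overflow (result holds the negated magnitude, so for a positive input this reads result < -INT_MAX/10). If the first condition does not hold, result*10 cannot overflow. The second condition then says that result*10 < limit + digit means overflow, where limit is -INT_MAX for a positive input.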

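Take the most negative number, -2147483648, as an example. When the scan reaches the last digit, result = -214748364 and multmin = -214748364.
The first condition, result < multmin, does not hold. After result *= radix, result = -2147483640.
The second condition, result < limit + digit, i.e. -2147483640 < -2147483648 + 8, does not hold either.
So the value is returned normally.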
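The trace can be reproduced with a simplified, base-10-only sketch of the same negative-accumulation idea. This is not the JDK source: the class and method names are hypothetical, the radix is fixed at 10, and digits are decoded with a plain subtraction instead of Character.digit.

```java
class NegativeAccumulationSketch {
    // Hypothetical base-10 parser that accumulates the value as a negative number.
    static int parseDecimal(String s) {
        if (s == null || s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        boolean negative = false;
        int i = 0, len = s.length();
        int limit = -Integer.MAX_VALUE;          // lower bound for a non-negative input
        char first = s.charAt(0);
        if (first == '-' || first == '+') {
            if (first == '-') {
                negative = true;
                limit = Integer.MIN_VALUE;       // negative inputs get one extra value
            }
            if (len == 1) {
                throw new NumberFormatException(s);
            }
            i++;
        }
        int multmin = limit / 10;
        int result = 0;
        while (i < len) {
            int digit = s.charAt(i++) - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException(s);
            }
            if (result < multmin) {              // result * 10 would overflow
                throw new NumberFormatException(s);
            }
            result *= 10;
            if (result < limit + digit) {        // result - digit would pass the bound
                throw new NumberFormatException(s);
            }
            result -= digit;
        }
        return negative ? result : -result;
    }

    public static void main(String[] args) {
        System.out.println(parseDecimal("-2147483648")); // prints -2147483648
        System.out.println(parseDecimal("2147483647"));  // prints 2147483647
    }
}
```

With the input 2147483648 the second check fires instead, because limit is -2147483647 and -2147483640 < -2147483647 + 8 holds.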