String-to-number conversion: a performance comparison. For batch conversion, consider using sscanf.
sscanf
sscanf usage demo:
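The original demo did not survive in this copy of the post. As a stand-in, here is a minimal sketch of batch conversion with a single sscanf call. The helper name `parse_triplet` and the comma-separated three-field input format are assumptions made for illustration, not taken from the original.

```cpp
#include <cstdio>

// Hypothetical helper: converts a line of the form "x,y,z" into three doubles
// with one sscanf call instead of splitting the line and converting each field.
// Returns true only if all three fields were converted.
bool parse_triplet(const char* line, double* x, double* y, double* z) {
    // %lf reads a double; sscanf returns the number of fields it assigned.
    return std::sscanf(line, "%lf,%lf,%lf", x, y, z) == 3;
}
```

Calling `parse_triplet("1.5,-2.25,3e2", &x, &y, &z)` fills all three outputs in one pass. Checking the return value matters, because sscanf stops at the first field it cannot convert and leaves the remaining outputs untouched.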

split+atof vs. sscanf performance comparison
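The measurements for this comparison are missing from this copy of the post. The sketch below only shows what the two approaches could look like, plus a small timing helper for reproducing the comparison locally. All function names and the comma-separated input format are assumptions, and no result is implied.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Approach A: split on ',' and call atof on each field.
// Each field costs a substring allocation plus one atof call.
std::vector<double> parse_split_atof(const std::string& line) {
    std::vector<double> out;
    std::size_t start = 0;
    while (start <= line.size()) {
        std::size_t end = line.find(',', start);
        if (end == std::string::npos) end = line.size();
        out.push_back(std::atof(line.substr(start, end - start).c_str()));
        start = end + 1;
    }
    return out;
}

// Approach B: one sscanf call for a line with a known, fixed number of fields.
bool parse_sscanf3(const std::string& line, double out[3]) {
    return std::sscanf(line.c_str(), "%lf,%lf,%lf",
                       &out[0], &out[1], &out[2]) == 3;
}

// Runs fn `iterations` times and returns the average nanoseconds per call.
template <typename Fn>
double time_ns(Fn fn, int iterations) {
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) fn();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count() / iterations;
}
```

To compare the two, pass each parser to `time_ns` wrapped in a lambda and make sure the parsed values are actually used (for example, summed into a variable that is printed afterwards), so the compiler cannot discard the work. The outcome depends on the compiler, the C library, and the input shape.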

Performance comparison of other functions
The following are sequential results measured on a PC (Core i7 920 @ 2.67 GHz), where u32toa() is compiled with Visual C++ 2013 and run on 64-bit Windows. The speedup is relative to sprintf().
| Function | Time (ns) | Speedup |
|---|---|---|
| sprintf | 194.225 | 1.00x |
| vc | 61.522 | 3.16x |
| naive | 26.743 | 7.26x |
| count | 20.552 | 9.45x |
| lut | 17.810 | 10.91x |
| countlut | 9.926 | 19.57x |
| branchlut | 8.430 | 23.04x |
| sse2 | 7.614 | 25.51x |
| null | 2.230 | 87.09x |


Implementations
| Function | Description |
|---|---|
| ostringstream | std::ostringstream in C++ standard library. |
| ostrstream | std::ostrstream in C++ standard library. |
| to_string | std::to_string() in C++11 standard library. |
| sprintf | sprintf() in C standard library |
| vc | Visual C++'s _itoa(), _i64toa(), _ui64toa() |
| naive | Compute division/modulo of 10 for each digit, store digits in temp array and copy to buffer in reverse order. |
| unnamed | Compute division/modulo of 10 for each digit, store directly in buffer |
| count | Count number of decimal digits first, using technique from [1]. |
| lut | Uses lookup table (LUT) of digit pairs for division/modulo of 100. Mentioned in [2] |
| countlut | Combines count and lut. |
| branchlut | Uses branching to divide and conquer the value range, making the computation more parallel. |
| sse2 | Based on the branchlut scheme, uses SSE2 SIMD instructions to convert 8 digits in parallel. The algorithm was designed by Wojciech Muła [3]. (Experiments show it is useful for values with 9 or more digits.) |
| null | Do nothing. |
Results
The following are sequential results measured on a PC (Core i7 920 @ 2.67 GHz), where dtoa() is compiled with Visual C++ 2013 and run on 64-bit Windows. The speedup is relative to sprintf().
| Function | Time (ns) | Speedup |
|---|---|---|
| ostringstream | 2,778.748 | 0.45x |
| ostrstream | 2,628.365 | 0.48x |
| gay | 1,646.310 | 0.76x |
| sprintf | 1,256.376 | 1.00x |
| fpconv | 273.822 | 4.59x |
| grisu2 | 220.251 | 5.70x |
| doubleconv | 201.645 | 6.23x |
| milo | 138.021 | 9.10x |
| null | 2.146 | 585.58x |


Implementations
| Function | Description |
|---|---|
| ostringstream | std::ostringstream in C++ standard library with setprecision(17). |
| ostrstream | std::ostrstream in C++ standard library with setprecision(17). |
| sprintf | sprintf() in C standard library with "%.17g" format. |
| stb_sprintf | fast sprintf replacement with "%.17g" format. |
| gay | David M. Gay's dtoa() C implementation. |
| grisu2 | Florian Loitsch's Grisu2 C implementation [1]. |
| doubleconv | C++ implementation extracted from Google's V8 JavaScript Engine with EcmaScriptConverter().ToShortest() (based on Grisu3, falls back to a slower bignum algorithm when Grisu3 fails to produce the shortest representation). |
| fpconv | night-shift's Grisu2 C implementation. |
| milo | miloyip's Grisu2 C++ header-only implementation. |
| null | Do nothing. |
Notes:
- tostring() is not tested as it does not fulfill the roundtrip requirement.
- Grisu2 is chosen because it generates more human-readable numbers and >99.9% of its results are the shortest representation. Grisu3 needs a second dtoa() implementation as a fallback for the cases where it cannot meet the shortest requirement.
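The roundtrip requirement mentioned in the notes can be checked with a few lines: format a double, parse the text back, and compare. The sketch below does this for the "%.17g" format used in the table and for the fixed "%f" format. The function names are illustrative assumptions, and the link to the first note holds only if that note refers to std::to_string(), which formats a double like "%f" in standards before C++26.

```cpp
#include <cstdio>
#include <cstdlib>

// Formats value with 17 significant digits, parses the text back with strtod,
// and reports whether the parsed value compares equal to the original.
// NaN never compares equal, so it is reported as not roundtripping.
bool roundtrips_g17(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.17g", value);
    return std::strtod(buffer, nullptr) == value;
}

// Same check with "%f", which always prints six digits after the decimal
// point and therefore loses information for most values.
bool roundtrips_f(double value) {
    char buffer[512];  // large enough for the biggest finite double in %f form
    std::snprintf(buffer, sizeof buffer, "%f", value);
    return std::strtod(buffer, nullptr) == value;
}
```

With a correctly rounding C library, `roundtrips_g17` holds for every finite double, while `roundtrips_f(1.0 / 3.0)` fails because the text "0.333333" parses to a different value.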
References:
https://wiki.so.qihoo.net/pages/viewpage.action?pageId=16145758
https://github.com/compmeist/fast-atof
https://github.com/yuanyuanxiang/_atof
https://github.com/j-jorge/atoi-benchmark
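This post examines the performance of different numeric conversion methods, including sscanf, std::ostringstream, and optimized implementations such as Grisu2. In benchmarks on a PC with a Core i7 920 @ 2.67 GHz, optimized functions such as branchlut and sse2 deliver a significant gain, running more than 20 times faster than sprintf. The post also compares the C++ standard library's conversion methods with specific implementations such as doubleconv and fpconv, showing how much targeted optimization matters for conversion efficiency.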