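Large-scale machine learning is an important supporting tool for big data analysis and mining. After some searching, I found a well-organized list of libraries on Quora, which is reproduced here for everyone's use.
Source: https://www.quora.com/What-are-some-software-libraries-for-large-scale-learning
(1) Open-source tools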
- Netlib/Scalapack: http://netlib.org/liblist.html
- ScalaNLP and breeze
- Theano
- dmlc/xgboost
- NAG: http://www.nag.co.uk/numeric/fd/...
- Nvidia Introduces CuDNN, a CUDA-based Library for Deep Neural Networks
- PredictionIO
- Probabilistic Network Library PNL by Intel and PNL | ITLab Lobachevsky State University of Nizhni Novgorod
- refr - Reranker Framework (ReFr) - Google Project Hosting, about: A New Open-Source Framework for Building Reranking Models
- parredHMMlib | Andreas Sand
- SAMOA by Yahoo
- The SMALL project: http://www.small-project.eu/
- Decomposer: http://code.google.com/p/decompo...
- SuiteSparse: http://www.cise.ufl.edu/research...
- SMMP: http://www.mgnet.org/~douglas/Pr...
- SPARSEKIT: http://www-users.cs.umn.edu/~saa... (also http://www-users.cs.umn.edu/~saa... )
- HEIGEN: http://www.cs.cmu.edu/~ukang/HEI...
- MLbase
- 0xdata/h2o
- milakov/nnForge
- http://deeplearning4j.org/
- Leon Bottou's toolkit: http://leon.bottou.org/projects/sgd
- MADlib: http://madlib.net/
- BayesDB
- Infer.NET
- Matrix factorizations: https://sites.google.com/site/ig...
- VW: http://hunch.net/~vw/
- Apache Mahout: http://mahout.apache.org/ (notes: www.cs.stanford.edu/~ang/papers/nips06-mapreducemulticore.pdf)
- Pegasus: http://www.cs.cmu.edu/~pegasus/
- BID Data Project by John Canny
- VFML: http://www.cs.washington.edu/dm/...
- Shogun: http://www.shogun-toolbox.org/
- GraphLab: http://www.graphlab.ml.cmu.edu/p... (notes: http://metaoptimize.com/qa/quest... , via Danny Bickson and Joseph Turian)
- iSAX: http://www.cs.ucr.edu/~eamonn/iS...
- MOA: http://moa.cs.waikato.ac.nz/ (notes: http://sourceforge.net/projects/... )
- S4: http://s4.io/
- LinAlg (Ruby): http://rubyforge.org/projects/li...
- dlib: http://dlib.net/ (notes: http://jmlr.csail.mit.edu/papers... )
- Shark: http://shark-project.sourceforge...
- Borealis: http://www.cs.brown.edu/research...
- Hama: http://incubator.apache.org/hama/
- PSVM: http://code.google.com/p/psvm/ (via Joseph Misiti)
- LibCVM: http://www.cse.ust.hk/~ivor/cvm.... (via Joel Hoff)
- The TLD project: http://info.ee.surrey.ac.uk/Pers...
- LIBLINEAR: http://www.csie.ntu.edu.tw/~cjli...
- SVMTorch: http://bengio.abracadoudou.com/S...
- SVDPACK: http://www.netlib.org/svdpack/
- StreamIt: http://groups.csail.mit.edu/cag/...
- SENNA: http://ml.nec-labs.com/senna/
- Spark Machine Learning Library (MLlib) (via Apache Spark: The Next Big Data Thing?)
- Spark-LIBLINEAR: Libraries for Large-scale Linear Classification on Distributed Environments
- Elephant: http://elefant.developer.nicta.c... (notes: http://elefant.developer.nicta.c... )
- Elephant/Stream: http://users.cecs.anu.edu.au/~jp...
- SparseM: http://cran.r-project.org/web/pa... (via Josh Wills)
- R/HPC: http://cran.r-project.org/web/vi... (via Dan Knoepfle)
- R/ML: http://cran.r-project.org/web/vi...
- SciPy: http://www.scipy.org/
- SciKits: http://scikit-learn.sourceforge....
- Sage and IPython: http://www.sagemath.org/ , http://ipython.scipy.org/moin/ (via Jameson Quinn)
- NumPy, etc: http://wiki.python.org/moin/Nume... , http://pypi.python.org/pypi?%3Aa...
- ARPACK: http://www.caam.rice.edu/softwar...
- ScaLAPACK: http://www.netlib.org/scalapack/
- BLAS implementations: http://en.wikipedia.org/wiki/Bas...
- uBLAS/Boost: http://www.boost.org/doc/libs/1_...
- Eigen: http://eigen.tuxfamily.org/index...
- SPIRIT: http://www.cs.cmu.edu/afs/cs/pro...
- Scilab: http://www.scilab.org/projects/c... (via Aditya Sengupta)
- Java Nonlinear Optimization: ww1.fpl.fs.fed.us/optimization.html
- Java Parallel Optimization: http://www5.informatik.uni-erlan...
- JAMA: http://math.nist.gov/javanumeric...
- Octave Multicore: http://octave.sourceforge.net/mu... (via Jordi Arnabat)
- Tesseract OCR: http://code.google.com/p/tessera...
- IBM Parallel ML toolbox: http://www.alphaworks.ibm.com/te...
- Microsoft Sigma: http://research.microsoft.com/en...
- http://mloss.org/software/
- HPCC by LexisNexis: http://hpccsystems.com/
- Graphical Models toolbox: http://people.rit.edu/jcdicsa/JGMT/
- Lazy Learning toolbox: http://iridia.ulb.ac.be/~lazy/
- Deep Learning toolbox: http://deeplearning.net/software...
- Mortar: http://cseweb.ucsd.edu/~kyocum/s...
- Debellor: http://www.debellor.org/
- STXXL: http://stxxl.sourceforge.net/
- Matlab Toolbox For Dimensionality Reduction: http://homepage.tudelft.nl/19j49... (via Laurens van der Maaten)
- CUDA: http://www.nvidia.com/object/cud... , http://www.nvidia.com/object/tes...
- Perl Data Language: http://pdl.perl.org/
- PGPLOT: http://www.astro.caltech.edu/~tj...
- Processing: http://processing.org/
- C5.0: http://rulequest.com/download.html
- Spark: http://www.spark-project.org/
- Bagel: https://github.com/mesos/spark/p...
- StatStream: http://cs.nyu.edu/shasha/papers/...
- OpenCV: http://opencv.willowgarage.com/w...
- vecLib: http://developer.apple.com/hardw...
- Netlab: http://www1.aston.ac.uk/eas/rese...
- Java Numerics: http://math.nist.gov/javanumeric...
- Matrix Toolkits for Java: http://code.google.com/p/matrix-...
- List of numerical analysis software: http://en.wikipedia.org/wiki/Lis...
- PLASMA: http://icl.cs.utk.edu/plasma/ind...
- Colt: http://acs.lbl.gov/software/colt/
- Parallel Colt: https://sites.google.com/site/pi...
- Incanter: https://github.com/liebke/incanter/
- Lush: http://lush.sourceforge.net/
- Hal: http://www.umiacs.umd.edu/~hal/s...
- Cython for numerical computations: http://conference.scipy.org/proc...
- Distributed Matlab: http://www.mathworks.com/product...
- Microsoft/ISC Star-P: http://www.microsoft.com/pathway... (notes: http://beowulf.csail.mit.edu/18.... )
- IBM InfoSphere: http://www-01.ibm.com/software/d...
- Google Prediction API: http://code.google.com/apis/pred... (notes: http://mark.reid.name/iem/predic... )
- RapidMiner: http://rapid-i.com/content/view/...
- Pentaho: http://www.pentaho.com/
- Tableau: http://www.tableausoftware.com/
- SAS Forecast Server: http://www.sas.com/technologies/...
- Esper: http://www.espertech.com/
- Streambase: http://www.streambase.com/
- Oracle BI: http://www.oracle.com/us/solutio...
- Tibco Spotfire: http://spotfire.tibco.com/
- Oracle Data Mining: http://www.oracle.com/technetwor... , http://en.wikipedia.org/wiki/Ora...
- Intel Math Kernel: http://software.intel.com/en-us/...
- EigenDog: https://www.eigendog.com (thanks to Julien Verlaguet)
- Boyd: http://www.stanford.edu/~boyd/so...
- Franklin: http://www.cs.berkeley.edu/~fran...
- Faloutsos: http://www.cs.cmu.edu/~christos/
- Bontempi: http://mlg.ulb.ac.be/
- Shalev-Shwartz: http://www.cs.huji.ac.il/~shais/...
- NEC: http://www.nec-labs.com/research...
- Select: http://www.select.cs.cmu.edu/peo...
- Shasha: http://cs.nyu.edu/shasha/papers/...
- CBCB: http://www.cbcb.umd.edu/software/
- Wendykier: https://sites.google.com/site/pi...
- http://www.gpucomputing.net
- Pregel: http://googleresearch.blogspot.c...
- Has anyone started an Apache project based on Google's recently published Pregel paper?
- Graph Databases: http://www.graph-database.org/ov...
- NoSQL Databases: http://nosql-database.org/
- Kdb+: http://kx.com/kdb+.php
- GNU Parallel: http://www.gnu.org/software/para...
- Bloom: http://www.bloom-lang.net/
- LinAlg: http://www.linalg.org/
- Lush: http://lush.sourceforge.net/
- Brainlab: http://www.interstice.com/~drewe...
- The SRI Language Modeling Toolkit: http://www.speech.sri.com/projec... (via Jeff Dalton)
- Yahoo LDA: https://github.com/shravanmn/Yah... (via Alex Smola, http://blog.smola.org/post/63597... )
- AppScale: http://code.google.com/p/appscal...
- Scientific libraries for Mac: http://www.atmos.washington.edu/...
- dmoz AI directory: http://www.dmoz.org/Computers/Ar...
- Metaoptimize thread on ML libraries: http://metaoptimize.com/qa/quest...
- Google Code Search: http://www.google.com/codesearch...
- http://www.jstatsoft.org/
- http://hunch.net/?p=230
- http://hpc.sourceforge.net/
- http://stackoverflow.com/questio...
- Java-matrix-benchmark: http://code.google.com/p/java-ma...
- Gorila: Google Reinforcement Learning Architecture