Gnumpy and cudamat can be used together for high-speed matrix computation on the GPU: gnumpy offers a numpy-like array interface and runs on top of cudamat.
A good, if short, tutorial is here:
http://blog.sina.com.cn/s/blog_4c38701d01018tfz.html
Links for gnumpy and cudamat are as follows:
https://pypi.python.org/pypi
http://code.google.com/p/cudamat/
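
As a rough illustration of what using gnumpy on top of cudamat might look like, here is a minimal sketch of a matrix product. The helper name matmul is hypothetical, and the fallback to plain numpy when gnumpy cannot be imported is an assumption added so the snippet also runs on a machine without a GPU setup. The gnumpy calls used (garray, dot, as_numpy_array) are the ones I believe the library provides; check them against the version you install.

```python
import numpy as np

# Assumption: gnumpy may not be installed (or may fail to import on this
# Python version), so fall back to plain numpy in that case.
try:
    import gnumpy as gnp  # numpy-like GPU arrays, built on cudamat
    HAVE_GNUMPY = True
except Exception:
    gnp = None
    HAVE_GNUMPY = False


def matmul(a, b):
    """Hypothetical helper: multiply two 2-D arrays, via gnumpy if available."""
    if HAVE_GNUMPY:
        ga = gnp.garray(a)  # copy the numpy array into a gnumpy array
        gb = gnp.garray(b)
        # Compute the product with gnumpy, then copy the result back to numpy.
        return gnp.dot(ga, gb).as_numpy_array()
    return np.dot(a, b)  # CPU fallback


a = np.arange(6, dtype=np.float32).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
b = np.ones((3, 2), dtype=np.float32)
result = matmul(a, b)
print(result)
```

Copying data to and from the GPU has a cost, so in practice it usually pays to keep intermediate results as gnumpy arrays and convert back to numpy only at the end.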