When choosing between a high-level and a low-level language, you have to make a trade-off between being able to work quickly and having programs that run quickly and efficiently. Luckily, two Python libraries were created to give us the best of both worlds: NumPy and pandas. Together, they provide a powerful toolset for working with data in Python, letting us write code quickly without sacrificing performance. But how do they do this? What makes these libraries faster than raw Python? The answer is vectorization.
1. Why?
Vectorization takes advantage of a processor feature called Single Instruction Multiple Data (SIMD) to process data faster. Most modern computer processors support SIMD, which allows a processor to perform the same operation on multiple data points in a single processor cycle.
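As a rough sketch of the difference, the snippet below adds two columns of numbers first with a Python loop and then with a single NumPy expression. The variable names and values are made up for illustration, and whether SIMD instructions are actually used underneath depends on the NumPy build and the CPU.

```python
import numpy as np

# Two columns of numbers stored as plain Python lists (illustrative values).
first = [1, 2, 3, 4]
second = [10, 20, 30, 40]

# Unvectorized: the interpreter adds one pair of values per loop iteration.
loop_sums = []
for a, b in zip(first, second):
    loop_sums.append(a + b)

# Vectorized: one expression adds every pair; the looping happens
# inside NumPy's compiled code instead of in the Python interpreter.
vector_sums = np.array(first) + np.array(second)

print(loop_sums)     # [11, 22, 33, 44]
print(vector_sums)   # [11 22 33 44]
```

Both versions give the same numbers; the vectorized one simply hands the whole array to NumPy in one step.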
2. NumPy
- To start working with NumPy, we first need to import the library into our Python environment. For this, we use a simple import statement: import numpy as np
The as syntax in the import statement lets us access the NumPy library under another name. When working with NumPy, the convention is to import the library as np for brevity.
- To work with our list of lists in NumPy, we convert it into an n-dimensional array, or ndarray, which you can think of as NumPy’s version of the list of lists format. To convert from the list type to ndarray, we use the numpy.array() constructor. The ndarray.shape attribute tells us the size of the array along each dimension.
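A minimal sketch of the conversion, assuming NumPy is imported under the conventional name np. The list data_list and its values are placeholders, not data from these notes.

```python
import numpy as np

# A small list of lists: two rows, three columns (made-up values).
data_list = [[1, 2, 3],
             [4, 5, 6]]

# numpy.array() builds an ndarray from the nested list.
data_array = np.array(data_list)

print(type(data_array))   # <class 'numpy.ndarray'>
print(data_array.shape)   # (2, 3) -> 2 rows, 3 columns
```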
- With a list of lists, we use two separate pairs of square brackets back-to-back. With a NumPy ndarray, we use a single pair of brackets with comma-separated row and column locations.
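The two indexing styles side by side, as a small sketch with made-up data (data_list and data_array are illustrative names):

```python
import numpy as np

data_list = [[1, 2, 3],
             [4, 5, 6]]
data_array = np.array(data_list)

# List of lists: first pick the row, then pick the column from that row.
from_list = data_list[1][2]

# ndarray: row and column go inside one pair of brackets, separated by a comma.
from_array = data_array[1, 2]

print(from_list, from_array)   # 6 6

# The same syntax accepts slices, e.g. every row of column 1.
second_column = data_array[:, 1]
print(second_column)           # [2 5]
```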
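- Operators: arithmetic operators such as +, -, * and / work element by element on ndarrays.
Alternatively, you can use the equivalent arithmetic functions, such as numpy.divide().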
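A short sketch showing that the / operator and numpy.divide() give the same element-by-element result. The arrays distance and time are invented for the example.

```python
import numpy as np

# Illustrative values only.
distance = np.array([10.0, 20.0, 30.0])
time = np.array([2.0, 4.0, 5.0])

# Operator form: divides each element of distance by the matching element of time.
speed_operator = distance / time

# Function form: numpy.divide() performs the same element-wise division.
speed_function = np.divide(distance, time)

print(speed_operator)   # [5. 5. 6.]
print(speed_function)   # [5. 5. 6.]
```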
- The difference between a function and a method:
Functions act as stand-alone segments of code that usually take an input, perform some processing, and return some output. For example, we can use the len() function to calculate the length of a Python list, and we can also use it on a Python string, where it calculates the number of characters (the length) of the string. In contrast, methods are special functions that belong to a specific type of object. Python lists have a list.append() method that adds an item to the end of a list. If we try to use that method on a string, we get an error.
- Adding rows and columns to an ndarray: the numpy.concatenate() function. This function accepts:
· A list of ndarrays as the first, unnamed parameter.
· An integer for the axis parameter, where 0 will add rows and 1 will add columns.
numpy.expand_dims() function: adds a dimension to an array, so that the arrays passed to numpy.concatenate() have the same number of dimensions.
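A sketch of how the two functions can work together, under the assumption that the array being added starts out one-dimensional and therefore needs an extra dimension first. All array names and values here are illustrative.

```python
import numpy as np

ones = np.array([[1, 1, 1],
                 [1, 1, 1]])          # shape (2, 3)
zeros = np.array([0, 0, 0])           # shape (3,), only one dimension

# Adding a row: turn zeros into a 1 x 3 array, then join along axis 0.
zeros_row = np.expand_dims(zeros, axis=0)                   # shape (1, 3)
with_new_row = np.concatenate([ones, zeros_row], axis=0)    # shape (3, 3)

# Adding a column: turn a length-2 array into a 2 x 1 array, then join along axis 1.
fives_column = np.expand_dims(np.array([5, 5]), axis=1)     # shape (2, 1)
with_new_column = np.concatenate([ones, fives_column], axis=1)  # shape (2, 4)

print(with_new_row.shape)      # (3, 3)
print(with_new_column.shape)   # (2, 4)
```

Without the expand_dims() step, concatenate() would raise an error because the inputs would not have the same number of dimensions.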
7. numpy.argsort() function: returns the indices that would sort an array.
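A small sketch of argsort() on made-up values, including the common follow-up step of using the returned indices to reorder the array:

```python
import numpy as np

values = np.array([30, 10, 20])

# Indices of the elements in ascending order of value:
# 10 is at index 1, 20 is at index 2, 30 is at index 0.
order = np.argsort(values)
print(order)           # [1 2 0]

# Indexing with those indices returns the sorted array.
print(values[order])   # [10 20 30]
```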
8. Boolean Indexing with NumPy: numpy.genfromtxt() reads a text file into a NumPy ndarray. While it has over 20 parameters, for most cases you need only two:
skip_header: the number of lines to skip at the start of the file; here we skip the first line.
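A self-contained sketch of reading comma-separated text while skipping the header line. To avoid depending on a file on disk, the text is wrapped in io.StringIO, which genfromtxt() accepts in place of a filename. The column names and numbers are invented, and the delimiter argument is my assumption for comma-separated data rather than something stated in these notes.

```python
import io
import numpy as np

# Stand-in for a CSV file: one header line followed by two data rows (made-up values).
csv_text = (
    "pickup_year,trip_distance,fare_amount\n"
    "2016,21.0,69.99\n"
    "2016,16.29,54.3\n"
)

# skip_header=1 skips the first line, so the column names are not parsed as data.
# delimiter="," is an assumption: it tells genfromtxt the values are comma-separated.
taxi = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(taxi.shape)   # (2, 3)
print(taxi[0, 1])   # 21.0
```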
9. We can use the ndarray.dtype attribute to see the internal data type that NumPy has used for the array.
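A brief sketch of inspecting dtype on a few made-up arrays. The exact integer type depends on the platform, so no output is shown for that case.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.5, 3.0])
mixed = np.array([1, 2.5])   # an int and a float together

print(floats.dtype)   # float64
print(mixed.dtype)    # float64 (the integer is stored as a float)
print(ints.dtype)     # an integer type; the exact width is platform-dependent
```

An ndarray holds a single data type, which is why the mixed array is stored entirely as floats.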
10.