The fundamental package for scientific computing with Python.
Introduction
- Best tutorial: Quickstart tutorial
- Python users usually write import numpy as np, so in most contexts 'np' means 'numpy'.
- Note: a 1-D array in NumPy acts like a row vector in linear algebra, but most linear algebra textbooks are written in column form!
  In textbooks: matrix x column vector = column vector
  In NumPy: row vector x matrix.T = row vector
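The row/column note can be checked numerically. This is a minimal sketch; the matrix A and vector x are made-up values for illustration, not taken from the notes.

```python
import numpy as np

# Illustrative values only (assumed for this sketch).
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # shape (2, 3)
x = np.array([1.0, 0.0, -1.0])    # 1-D array, shape (3,)

# Textbook form: matrix times an explicit column vector of shape (3, 1).
col_result = A @ x.reshape(-1, 1)   # shape (2, 1)

# Row form: 1-D "row vector" times the transposed matrix.
row_result = x @ A.T                # shape (2,)

print(col_result.shape, row_result.shape)
print(np.allclose(col_result.ravel(), row_result))  # same numbers, different layout
```

Both forms compute the same product; only the shape of the result differs.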
- A NumPy array with a byte-string dtype ('S') automatically encodes its strings to bytes!
  new_a = a.astype('U')  # get str instead of bytes
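A small sketch of the bytes-versus-str behaviour, assuming the array was created with dtype='S' (the sample strings are made up):

```python
import numpy as np

# dtype='S' requests fixed-width byte strings, so the text is stored as bytes.
a = np.array(['spam', 'eggs'], dtype='S')
print(a.dtype.kind)        # 'S' marks a byte-string dtype
print(a[0])                # a bytes value

# astype('U') converts to a unicode (str) dtype.
new_a = a.astype('U')
print(new_a.dtype.kind)    # 'U' marks a unicode string dtype
print(new_a[0])            # a str value
```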
- Help: np.info, for example np.info(np.random)
broadcasting
Broadcasting is one of the most error-prone concepts in NumPy.
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when:
- they are equal, or
- one of them is 1
- reshape can turn off broadcasting: if both operands are reshaped to the same shape first, no dimension gets stretched.
- A telling example
  A      (4d array): 8 x 1 x 6 x 1
  B      (3d array):     7 x 1 x 5
  Result (4d array): 8 x 7 x 6 x 5
  So for a single pair of dimensions, broadcasting means NumPy expands (6 x 1) and (1 x 5) to (6 x 5), and then does the element-wise operation.
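The shape example can be reproduced directly. The sketch below also shows a common pitfall where broadcasting happens unintentionally, and how an explicit reshape avoids it; the arrays col and row are hypothetical names chosen for illustration.

```python
import numpy as np

# The 4-D / 3-D example: shapes are aligned from the trailing dimension.
A = np.ones((8, 1, 6, 1))
B = np.ones((7, 1, 5))
result = A + B
print(result.shape)   # (8, 7, 6, 5)

# Pitfall: a (3, 1) column plus a (3,) row silently broadcasts to (3, 3).
col = np.arange(3).reshape(3, 1)
row = np.arange(3)
outer = col + row
print(outer.shape)    # (3, 3)

# Reshaping both operands to the same shape keeps the operation element-wise.
same = col.reshape(3) + row
print(same.shape)     # (3,)
```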
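Learning ndarray
- ndarray.mean(axis=None, dtype=None, out=None): returns the mean of the array elements along the given axis.
- ndarray.var(axis=None, dtype=None, out=None, ddof=0): returns the variance of the array elements along the given axis.
- ndarray.std(axis=None, dtype=None, out=None, ddof=0): returns the standard deviation of the array elements along the given axis.
  Note that std divides by N by default, not N-1; to divide by N-1, set ddof=1. In contrast, std in pandas divides by N-1 by default.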
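A short sketch of the ddof difference and of the axis argument. The sample data are made up so that the population standard deviation comes out to exactly 2.

```python
import numpy as np

# Illustrative data: mean is 5, sum of squared deviations is 32.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

pop_std = data.std()             # divides by N      -> sqrt(32 / 8) = 2.0
sample_std = data.std(ddof=1)    # divides by N - 1  -> sqrt(32 / 7), about 2.138
print(pop_std, sample_std)

# axis selects the dimension that is collapsed.
m = np.array([[1.0, 2.0],
              [3.0, 4.0]])
col_means = m.mean(axis=0)       # mean of each column
row_means = m.mean(axis=1)       # mean of each row
print(col_means, row_means)
```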
- ndarray.trace(offset=0, axis1=0, axis2=1, dtype=None, out=None): returns the sum of the array elements along the diagonal.
- ndarray.diagonal(offset=0, axis1=0, axis2=1): returns all elements on the diagonal.
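As a quick illustration of trace, diagonal and the offset parameter (the 3 x 3 matrix is an arbitrary example):

```python
import numpy as np

m = np.arange(9).reshape(3, 3)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

main_diag = m.diagonal()            # elements 0, 4, 8
main_trace = m.trace()              # 0 + 4 + 8 = 12

# offset=1 selects the diagonal just above the main one.
upper_diag = m.diagonal(offset=1)   # elements 1, 5
upper_trace = m.trace(offset=1)     # 1 + 5 = 6

print(main_diag, main_trace, upper_diag, upper_trace)
```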
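- Maximum/minimum values
  - ndarray.argmin(axis=None, out=None): returns the indices of the minimum elements along the given axis.
  - ndarray.min(axis=None, out=None): returns the minimum values along the given axis.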
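The difference between min and argmin, with and without an axis, can be sketched as follows. The scores array is a made-up example; np.unravel_index is used here to turn the flat index into a (row, column) pair.

```python
import numpy as np

scores = np.array([[3, 1, 2],
                   [0, 5, 4]])

smallest = scores.min()              # smallest value overall
flat_idx = scores.argmin()           # its index in the flattened array
position = np.unravel_index(flat_idx, scores.shape)   # (row, column)

col_min = scores.min(axis=0)         # minimum of each column
col_argmin = scores.argmin(axis=0)   # row index of each column minimum
row_min = scores.min(axis=1)         # minimum of each row
row_argmin = scores.argmin(axis=1)   # column index of each row minimum

print(smallest, flat_idx, position)
print(col_min, col_argmin, row_min, row_argmin)
```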
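- flat/flatten
  - ndarray.flat, like ndarray.T, is an attribute rather than a function call.
      >>> x = X.flat
      >>> x
      <numpy.flatiter object at 0x9e82278>
      # does not directly return a 1-D array,
      # but can be indexed directly
  - flatten() is a function call, and the flattening order can be specified.
    ndarray.flatten(order='C')
    Optional parameter order:
    (1) 'C': C-style, row-major order
    (2) 'F': Fortran-style, column-major order
    (3) 'A': keep the array's memory layout (column-major if the array is Fortran-contiguous, row-major otherwise)
    (4) the default is 'C'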
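A minimal sketch contrasting flat and flatten; the array X is an arbitrary 2 x 3 example.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

# flat is an iterator attribute; it supports indexing without building a new array.
it = X.flat
fifth = it[4]                      # fifth element in row-major order

# flatten() returns a new 1-D array; order controls how elements are read.
row_major = X.flatten()            # order='C' by default
col_major = X.flatten(order='F')

# flatten() copies, so editing the result leaves X untouched.
row_major_copy = X.flatten()
row_major_copy[0] = 99

print(fifth, row_major, col_major, X[0, 0])
```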
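- ndarray.transpose(*axes): returns the array with its axes permuted (for a 2-D array, the matrix transpose).
- ndarray.take(indices, axis=None, out=None, mode='raise'): takes the elements at the given indices, for example:
    >>> a.take([1,3],axis=1)  # extract columns 1 and 3
    array([[ 1,  3],
           [ 5,  7],
           [ 9, 11]])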
- numpy.argmax(a, axis=None, out=None)
  Very useful.
  Returns the indices of the maximum values along an axis.
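A brief sketch of argmax with and without an axis; the values array is made up for illustration.

```python
import numpy as np

values = np.array([[1, 9, 3],
                   [7, 2, 8]])

flat_best = np.argmax(values)           # index into the flattened array
best_per_row = np.argmax(values, axis=1)
best_per_col = np.argmax(values, axis=0)

# With ties, the index of the first maximum is returned.
first_of_ties = np.argmax(np.array([5, 5, 1]))

print(flat_best, best_per_row, best_per_col, first_of_ties)
```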
Constructing matrices
- arange()/linspace()
- numpy.zeros, numpy.ones, numpy.eye, numpy.empty((2,3)), numpy.full((2,2),7)
    >>> print(np.zeros((3,4)))
    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    >>> print(np.ones((3,4)))
    [[1. 1. 1. 1.]
     [1. 1. 1. 1.]
     [1. 1. 1. 1.]]
    >>> print(np.eye(3))
    [[1. 0. 0.]
     [0. 1. 0.]
     [0. 0. 1.]]
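For the constructors listed without output, here is a small sketch; the arguments are arbitrary illustrative choices.

```python
import numpy as np

# arange: start, stop (exclusive), step.
evens = np.arange(0, 10, 2)

# linspace: start, stop (inclusive by default), number of points.
grid = np.linspace(0.0, 1.0, 5)

# full: fill a given shape with one value.
sevens = np.full((2, 2), 7)

# empty: allocates without initializing, so only the shape is predictable.
scratch = np.empty((2, 3))

print(evens, grid, sevens, scratch.shape)
```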
Matrix indexing
- automatic reshaping
    >>> a = np.arange(30)
    >>> a.shape = 2,-1,3  # -1 means "whatever is needed"
    >>> a.shape
    (2, 5, 3)
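- Indexing with Arrays of Indices
  - Suppose a and idx are np.arrays. Then a[idx].shape == idx.shape + a.shape[1:], which is just idx.shape when a is 1-D.
  - Suppose a, idx_i and idx_j are np.arrays and a is 2-D. If idx_i.shape == idx_j.shape, then a[idx_i, idx_j].shape == idx_i.shape, where idx_i indexes the first axis of a and idx_j indexes the second axis.
      list_ij = [idx_i, idx_j]
      a[tuple(list_ij)] == a[idx_i, idx_j]  # element-wise True; convert the list to a tuple first, since recent NumPy versions treat a list as a single index array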
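The shape rules for index arrays can be checked with a small sketch. The arrays are arbitrary examples, and the list of index arrays is converted to a tuple before indexing, which is what recent NumPy versions require for multi-axis indexing.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# One index array on a 2-D array: each index selects a whole row.
idx = np.array([[0, 2],
                [1, 1]])
rows = a[idx]                       # shape idx.shape + a.shape[1:]

# On a 1-D array the result has exactly idx.shape.
v = np.arange(10) ** 2
picked_1d = v[idx]

# Two index arrays of equal shape: one per axis.
idx_i = np.array([[0, 1],
                  [1, 2]])
idx_j = np.array([[2, 1],
                  [3, 3]])
picked = a[idx_i, idx_j]            # a[0,2], a[1,1], a[1,3], a[2,3]

# A list of index arrays must become a tuple to mean the same thing.
list_ij = [idx_i, idx_j]
same = a[tuple(list_ij)]

print(rows.shape, picked_1d.shape, picked)
```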
  - Indexing with Boolean Arrays
      >>> a = np.arange(12).reshape(3,4)
      >>> b = a > 4
      >>> b  # b is a boolean array with a's shape
      array([[False, False, False, False],
             [False,  True,  True,  True],
             [ True,  True,  True,  True]])
      >>> a[b]  # 1d array with the selected elements
      array([ 5,  6,  7,  8,  9, 10, 11])
      >>> a[b] = 0
      >>> a
      array([[0, 1, 2, 3],
             [4, 0, 0, 0],
             [0, 0, 0, 0]])
    Note that a[b] is a 1-D array, but after a[b] = 0 the array a is still 2-D. Reading a[b] returns only the elements where b is True, so the result cannot keep a's 2-D shape; assigning through a[b] modifies those elements in place and leaves the 'False' elements untouched.
Appending and copying data
- c = a.copy()
  Deep copy: c is a new array with its own data.
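To make the copy behaviour concrete, the sketch below compares plain assignment, a slice (which is a view) and copy(). The names alias and view are hypothetical, chosen for illustration.

```python
import numpy as np

a = np.arange(4)

alias = a          # plain assignment: the same array object
view = a[:2]       # slicing: a new array object that shares a's data
c = a.copy()       # copy(): a new array with its own data

a[0] = 100         # modify the original

print(alias[0], view[0], c[0])
print(np.shares_memory(a, view), np.shares_memory(a, c))
```

Only c keeps the old value, because it no longer shares memory with a.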
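- vstack and hstack functions:
  vstack and hstack return deep copies.
    >>> a = np.ones((2,2))
    >>> b = np.eye(2)
    >>> print(np.vstack((a,b)))
    [[1. 1.]
     [1. 1.]
     [1. 0.]
     [0. 1.]]
- np.row_stack((matrix, a_row))
  Appends a row to the end of a 2-D matrix. row_stack is an alias of vstack, and newer NumPy releases deprecate it in favor of vstack.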
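A minimal sketch of appending a row and a column, using vstack and hstack; matrix, a_row and a_col are made-up example values.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
a_row = np.array([7, 8, 9])
a_col = np.array([[0],
                  [0]])

# vstack stacks along the first axis, so a 1-D row is appended at the bottom.
taller = np.vstack((matrix, a_row))

# hstack stacks along the second axis; the column must have matching row count.
wider = np.hstack((matrix, a_col))

# Both results are new arrays; the inputs are not modified or shared.
print(taller)
print(wider)
print(np.shares_memory(taller, matrix))
```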
numpy.linalg
comparison
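- a == b  # element-wise comparison
- a < 2  # element-wise comparison against a scalar
- np.array_equal(a,b)  # True only if both arrays have the same shape and elements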
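The difference between element-wise comparison and whole-array equality, sketched with two made-up arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 5, 3])

elementwise = (a == b)              # one boolean per element
below_two = (a < 2)                 # the scalar is compared with every element
whole = np.array_equal(a, b)        # a single bool for the whole array

print(elementwise, below_two, whole)
print(np.array_equal(a, a.copy()))
```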
arithmetic operation
- +, -, *, / #element-wise
- np.dot(a,b) # matrix multiply
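A short sketch showing that * is element-wise while np.dot performs matrix multiplication. The matrices are arbitrary, and the @ operator is included as an equivalent spelling for 2-D arrays.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])

elementwise = a * b        # multiplies matching entries
matmul = np.dot(a, b)      # rows of a times columns of b
matmul_op = a @ b          # same result for 2-D arrays
halved = a / 2             # true division returns floats

print(elementwise)
print(matmul)
print(halved)
```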
I/O
- np.save('myarray', a)  # the .npy extension is appended automatically
- np.savez('myarray.npz', a, b)
- np.save('myarray.npy', a)
- np.loadtxt/savetxt/genfromtxt
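A round-trip sketch for the I/O functions listed here. It writes into a temporary directory (an assumption made so the example cleans up after itself) and relies on savez naming positional arrays arr_0, arr_1, and so on.

```python
import os
import tempfile
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.linspace(0.0, 1.0, 3)

with tempfile.TemporaryDirectory() as tmp:
    # Binary .npy: a single array, dtype and shape preserved.
    npy_path = os.path.join(tmp, 'myarray.npy')
    np.save(npy_path, a)
    a_loaded = np.load(npy_path)

    # .npz archive: several arrays; positional ones are named arr_0, arr_1, ...
    npz_path = os.path.join(tmp, 'myarray.npz')
    np.savez(npz_path, a, b)
    with np.load(npz_path) as archive:
        names = sorted(archive.files)
        b_loaded = archive['arr_1']

    # Plain text: human-readable, values come back as floats.
    txt_path = os.path.join(tmp, 'myarray.txt')
    np.savetxt(txt_path, a)
    a_text = np.loadtxt(txt_path)

print(a_loaded, names, b_loaded, a_text.dtype)
```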
Other
- np.newaxis
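np.newaxis inserts a length-1 axis, which is often used to set up broadcasting. A minimal sketch with a made-up vector:

```python
import numpy as np

v = np.arange(3)                     # shape (3,)

column = v[:, np.newaxis]            # shape (3, 1)
row = v[np.newaxis, :]               # shape (1, 3)

# (3, 1) times (1, 3) broadcasts to a (3, 3) multiplication table.
table = column * row

print(column.shape, row.shape)
print(table)
```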