NumPy

Implemented in C, stores data in contiguous memory blocks
- internal storage includes
  - pointer to data
  - data type (dtype), describes fixed-size value cells
    - numpy dtype hierarchy
    - np.issubdtype
  - tuple for array shape
  - tuple for strides: number of bytes to step
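
A minimal sketch of the storage fields listed above, assuming a small int64 array (the variable names are illustrative only):

```python
import numpy as np

# 3x4 array of 8-byte integers stored in one contiguous C-order block
arr = np.arange(12, dtype=np.int64).reshape(3, 4)

shape = arr.shape        # (3, 4)
strides = arr.strides    # (32, 8): 32 bytes to the next row, 8 to the next column

# Transposing swaps shape and strides; the data itself is not moved
t_strides = arr.T.strides    # (8, 32)

# np.issubdtype checks where a dtype sits in the dtype hierarchy
is_integer = np.issubdtype(arr.dtype, np.integer)     # True
is_floating = np.issubdtype(arr.dtype, np.floating)   # False
```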

ndarray (n-dimensional array)

Data Types and Type Codes
- int8/uint8 to int64/uint64 (i1, u1 to i8, u8), float16 to float128 (f2 to f16; f4 = f, f8 = d, f16 = g)
- complex64 to complex256 (c8, c16, c32)
- bool (?)
- object (O) - Python object type
- string_ (S) - fixed-length ASCII string type (e.g. use "S10" for length 10); named bytes_ in NumPy 2.0+
- unicode_ (U) - fixed-length Unicode string type (e.g. "U10"); named str_ in NumPy 2.0+
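
A short sketch of how the type codes map to dtypes, assuming a platform where these codes are all available (float128 is left out because it is platform-dependent):

```python
import numpy as np

# Each type code maps to a full dtype; itemsize is the cell size in bytes
for code in ['i1', 'u1', 'i8', 'f2', 'f8', 'c8', 'c16', '?', 'O', 'S10', 'U10']:
    dt = np.dtype(code)
    print(f"{code:>4} -> {dt.name:<12} {dt.itemsize} bytes")

# Fixed-length strings are truncated to the declared length
short = np.array(['abcdefghijklmnop'], dtype='U10')

# astype converts between dtypes and returns a new array
floats = np.array(['1.5', '2']).astype('f8')
small = np.array([1.9, -2.9]).astype('i4')   # float to int truncates toward zero
```
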
Attributes
- .ndim (rank), .shape (dimensions), .size, .dtype, .itemsize (size of one item in bytes)
Methods
- [] - indexing with m:n or logical indexing
  - ```
  data[name == 'Bob', 2:]
  # name can be any array with one entry per row of data
```
- fancy indexing
  - take(<ind>, axis=) and put(<ind>)
    - put uses C order
- shape, ndim, dtype, size
  - dimension (5,) is different from (5,1), the former is a rank-1 array
- reshape([order='C' or 'F'] )
  - reshape(n, -1): flattens all other dimensions, giving shape (n, n_2*n_3*...*n_k)
  - C order (row major) vs Fortran order (column major)
- .ravel(['C' or 'F']) and flatten()
  - flatten returns a copy; ravel returns a view when possible
- .T - transpose
- copy() - deep copy
- Aggregation
  - sum(),mean()
  - var(),max(),min()
  - argmin(), argmax() - return index
  - cumsum(), cumprod()
  - argument: keepdims=True keeps the reduced axes with length 1, so the result does not drop to a lower rank
- any(), all() - work on boolean ndarrays
- sort([axis])
  - argsort([kind=]) and lexsort()
    - return sort indexers (sort and lexical sort)
    - kind = 'quicksort' (default), 'mergesort', 'heapsort'
  - np.partition(arr, pos), np.argpartition()
  - searchsorted(val)
    - returns index by binary search
- unique()
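
The indexing, reshaping, aggregation and sorting methods above can be sketched as follows. The arrays are made-up sample data, not from any particular dataset:

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob'])
data = np.arange(16).reshape(4, 4)

# Boolean (logical) indexing on rows combined with a slice on columns
bob_rows = data[names == 'Bob', 2:]        # rows 0 and 3, columns 2 onward

# reshape with -1 lets NumPy infer the remaining dimension
cube = np.arange(24).reshape(2, 3, 4)
flat_rows = cube.reshape(2, -1)            # shape (2, 12)

# ravel returns a view when it can; flatten always copies
view = cube.ravel()
copy = cube.flatten()

# keepdims=True keeps the reduced axis with length 1
col_sums = data.sum(axis=0, keepdims=True)   # shape (1, 4)

# argsort gives the indices that would sort the array
values = np.array([30, 10, 20])
order = values.argsort()                   # [1, 2, 0]

# searchsorted does a binary search on an already sorted array
position = np.searchsorted(np.array([10, 20, 30]), 25)   # 2
```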

Numpy (np.)

Creation

array(<sequence>)
asarray() - convert input to array; does not copy if the input is already an ndarray
arange([start,] stop[, step], dtype=None) - like range
linspace(start, stop, num, endpoint...)
logspace()
ones(<shape>), ones_like(<another array of same shape>), zeros, zeros_like
empty, empty_like, full, full_like
eye(), identity()
fromfile(), fromfunction()
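
A quick sketch of the creation functions above, with small illustrative inputs:

```python
import numpy as np

a = np.array([1, 2, 3])
same = np.asarray(a)     # already an ndarray: the same object comes back
copied = np.array(a)     # np.array copies by default

steps = np.arange(0, 10, 2)              # [0 2 4 6 8]
points = np.linspace(0.0, 1.0, num=5)    # 5 evenly spaced values, endpoint included
decades = np.logspace(0, 3, num=4)       # 10**0 through 10**3

template = np.zeros((2, 3))
ones = np.ones_like(template)            # same shape and dtype as template
sevens = np.full((2, 2), 7)
ident = np.eye(3)

# fromfunction calls the function with index grids, one per axis
grid = np.fromfunction(lambda i, j: i + j, (2, 3))
```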

Operations

shape
a[i, j] - indexing per axis; : works as with lists (a range or everything)
*+-/ - element-wise operations
- NumPy "broadcasts" by virtually repeating elements along length-1 axes (e.g. a (5,3) array + a (5,1) array is (5,3))
np.squeeze - remove axes of length 1
concatenate and split
- dstack() - stack depth-wise (along the third axis)
- vstack() and row_stack()
- hstack() and column_stack()
  - column_stack converts 1-D arrays to 2-D columns first
- r_[]
  - stacks along the first axis, like row_stack() for 2-D input
- c_[]
- split(arr, [...]), vsplit
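
The stacking and splitting helpers above behave roughly like this for two 1-D inputs (a sketch with illustrative arrays):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack([a, b])            # shape (2, 3): one row per input
h = np.hstack([a, b])            # shape (6,): joined end to end
cols = np.column_stack([a, b])   # shape (3, 2): 1-D inputs become columns
d = np.dstack([a, b])            # shape (1, 3, 2): stacked along the third axis

# r_ and c_ are index objects, so they take square brackets
r = np.r_[a, b]                  # shape (6,)
c = np.c_[a, b]                  # shape (3, 2)

# split cuts at the given indices: [0:3], [3:7], [7:]
first, second, third = np.split(np.arange(10), [3, 7])

# vsplit cuts a 2-D array into equal groups of rows
top, bottom = np.vsplit(np.arange(12).reshape(4, 3), 2)
```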
isnan()
np.where() - selection

Functions (np.) - (ufunc() - implemented in C usually)

Looping

wrap function using np.vectorize
apply_along_axis(<lambda>)
- all, exp, floor, ceil, clip, conj, corrcoef, cov, bincount
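
A small sketch of the two looping helpers, next to equivalent array operations. `clip_negative` is a hypothetical scalar function written for this example:

```python
import numpy as np

def clip_negative(x):
    """Hypothetical scalar function: replace negative values with 0."""
    return x if x > 0 else 0.0

# np.vectorize wraps the scalar function so it accepts arrays.
# It still loops in Python, so it is a convenience, not a speed-up.
vec_clip = np.vectorize(clip_negative, otypes=[float])
values = np.array([-1.0, 2.0, -3.0])
clipped = vec_clip(values)                 # [0., 2., 0.]

# apply_along_axis calls the function once per 1-D slice (here: per row)
matrix = np.arange(6).reshape(2, 3)
row_ranges = np.apply_along_axis(lambda row: row.max() - row.min(), 1, matrix)

# The same results with plain array operations
clipped_fast = np.maximum(values, 0.0)
row_ranges_fast = matrix.max(axis=1) - matrix.min(axis=1)
```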

Unary

abs, fabs
sqrt, square
exp, log, log10, log2, log1p
sign, ceil, floor
rint
modf - fractional and integral part
isnan
isfinite, isinf
cos, sin, cosh, sinh, tan, tanh, arcsin, arccos, ...
logical_not

Binary

add, subtract, multiply, divide, floor_divide, dot, power
- @ is matrix multiply (matmul; same result as dot for 2-D arrays)
maximum, fmax (ignores NaN), minimum, fmin
copysign
greater, greater_equal, less, less_equal
logical_and, logical_or
outer() - Cartesian product
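
A sketch of a few binary functions from the list above, with made-up inputs chosen to show the NaN handling:

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 2.0, np.nan])

# maximum propagates NaN; fmax ignores it when the other value is a number
with_nan = np.maximum(x, y)     # [2., nan, nan]
without_nan = np.fmax(x, y)     # [2., 2., 3.]

# copysign takes the magnitude from the first input and the sign from the second
signed = np.copysign([1.0, 2.0], [-1.0, 1.0])    # [-1., 2.]

# ufunc.outer applies the operation to every pair of elements
table = np.multiply.outer(np.arange(1, 4), np.arange(1, 3))

# @ and np.dot agree for 2-D arrays
m = np.array([[1, 2], [3, 4]])
n = np.array([[0, 1], [1, 0]])
product = m @ n
```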

Multiple(aggregators)

reduce

operations can be chained

arr = np.arange(10)

np.add.reduce(arr)
# 45

accumulate - cumulative reduce (keeps the intermediate results)
reduceat(arr, [reduce cuts]) - reduce over the slices that start at the given indices
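
The three ufunc methods can be compared side by side on the same `arr`; a minimal sketch:

```python
import numpy as np

arr = np.arange(10)

total = np.add.reduce(arr)           # same as arr.sum()
running = np.add.accumulate(arr)     # same as arr.cumsum()

# reduceat sums the slices arr[0:5], arr[5:8] and arr[8:]
binned = np.add.reduceat(arr, [0, 5, 8])

# Any binary ufunc has these methods, for example a running product
factorials = np.multiply.accumulate(np.arange(1, 6))
```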

Logical Indexing and Set Operations

np.

unique(x)
intersect1d(x,y)
union1d(x,y)
setdiff1d(x,y)
setxor1d(x,y)
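
A short sketch of the set functions above on two small illustrative arrays. Each one returns a sorted array of unique values:

```python
import numpy as np

x = np.array([3, 1, 2, 3, 1])
y = np.array([2, 3, 4])

uniq = np.unique(x)              # sorted unique values of x
common = np.intersect1d(x, y)    # values in both
either = np.union1d(x, y)        # values in x or y
only_x = np.setdiff1d(x, y)      # values in x but not in y
one_side = np.setxor1d(x, y)     # values in exactly one of the two
```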

Others

np.meshgrid() - 1-D coordinate vectors to 2-D grids
np.where(<cond>, <if array>, <else array>)
- the method form <if array>.where(<cond>) exists on pandas objects, not on ndarray
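
A minimal sketch of `np.where` and `np.meshgrid`, using small made-up arrays:

```python
import numpy as np

cond = np.array([True, False, True])
if_values = np.array([1.1, 1.2, 1.3])
else_values = np.array([2.1, 2.2, 2.3])

# Element-wise choice: take from if_values where cond is True
chosen = np.where(cond, if_values, else_values)

# Scalars are allowed on either side
signal = np.array([-2, 5, -1, 3])
non_negative = np.where(signal > 0, signal, 0)

# meshgrid expands two 1-D coordinate vectors into 2-D coordinate grids
xx, yy = np.meshgrid(np.arange(3), np.arange(2))
```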

Linear Algebra

dot - matrix multiply
.outer() - outer product
diag, trace (in np., not np.linalg)
.linalg
- det, inv, solve, eig
- pinv - Moore-Penrose pseudo-inverse of a matrix
- norm( , keepdims = True)
np.asmatrix (Matrix)
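
A small worked sketch of the linear algebra functions, assuming a 2x2 system chosen only for illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)        # solves A @ x = b without forming the inverse
A_inv = np.linalg.inv(A)
det = np.linalg.det(A)
eigenvalues, eigenvectors = np.linalg.eig(A)

trace = np.trace(A)              # sum of the diagonal
diagonal = np.diag(A)            # the diagonal as a 1-D array

# keepdims=True keeps the result 2-D, one norm per row
rows = np.array([[3.0, 4.0], [6.0, 8.0]])
row_norms = np.linalg.norm(rows, axis=1, keepdims=True)

# For an invertible matrix the pseudo-inverse matches the inverse
A_pinv = np.linalg.pinv(A)
```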

Random

np.random

seed - set seed
RandomState() - create one generator independent of others
permutation
shuffle
rand, randint
randn (std normal), binomial, beta, normal, chisquare, gamma, uniform
choice(<list>, size, [p]) - sample from the given items, optionally with probabilities p
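
A sketch of seeding and the sampling functions above. The seed values and inputs are arbitrary choices for illustration:

```python
import numpy as np

# Two independent generators with the same seed produce the same stream
rng_a = np.random.RandomState(42)
rng_b = np.random.RandomState(42)
draws_a = rng_a.randn(3)
draws_b = rng_b.randn(3)

# np.random.seed resets the global generator
np.random.seed(0)
first = np.random.rand(2)
np.random.seed(0)
second = np.random.rand(2)

perm = rng_a.permutation(5)      # returns a shuffled copy of range(5)
deck = np.arange(5)
rng_a.shuffle(deck)              # shuffles in place

ints = rng_a.randint(0, 10, size=4)    # upper bound is exclusive
picks = rng_a.choice(['a', 'b', 'c'], size=10, p=[0.5, 0.3, 0.2])
```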

Advanced

Broadcasting

the smaller array is automatically stretched to match the shape of the larger one in element-wise arithmetic and other array calculations
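
A minimal sketch of the broadcasting rule, reusing the (5,3) and (5,1) shapes as an example. The arrays themselves are illustrative:

```python
import numpy as np

a = np.arange(15).reshape(5, 3)
col = np.arange(5).reshape(5, 1)

# (5, 3) + (5, 1): the length-1 axis is stretched to length 3
total = a + col

# (5, 3) - (3,): the 1-D array is aligned with the last axis and repeated per row
centered = a - a.mean(axis=0)

# broadcast_to shows the stretching explicitly; no data is copied,
# the stretched axis simply gets a stride of 0
stretched = np.broadcast_to(col, (5, 3))

# Shapes that do not line up from the right raise an error
error = None
try:
    a + np.arange(5)             # (5, 3) vs (5,): 3 != 5
except ValueError as exc:
    error = str(exc)
```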

Numpy File I/O

NumPy can read and write either text or binary formats. Arrays are saved by default in an uncompressed raw binary format with the file extension .npy. Several arrays can be stored together in one .npz archive, optionally compressed.

np.

load(<npy>/<npz>)
- npz returns dict-like struct (lazily)
save()
savez(<file>, <var dict>)
- a=..., b=...
savez_compressed(<file>, a = arr, b = arr)
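
A round-trip sketch of the save and load functions. It writes to a temporary directory, and the file names are placeholders:

```python
import os
import tempfile

import numpy as np

a = np.arange(5)
b = np.linspace(0.0, 1.0, 3)

with tempfile.TemporaryDirectory() as tmp:
    # Single array, uncompressed .npy
    npy_path = os.path.join(tmp, 'a.npy')
    np.save(npy_path, a)
    loaded_a = np.load(npy_path)

    # Several arrays in one .npz archive, addressed by keyword name
    npz_path = os.path.join(tmp, 'arrays.npz')
    np.savez(npz_path, a=a, b=b)
    with np.load(npz_path) as archive:     # dict-like, arrays are read lazily
        stored_names = sorted(archive.files)
        loaded_b = archive['b']

    # Same interface, compressed contents
    packed_path = os.path.join(tmp, 'arrays_compressed.npz')
    np.savez_compressed(packed_path, a=a, b=b)
    with np.load(packed_path) as archive:
        loaded_packed_a = archive['a']
```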

User-defined, C-like ufuncs, dtypes and Numba

np.

frompyfunc(func, nin, nout)
- the resulting ufunc returns arrays of Python objects
vectorize
- can specify output type
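
A sketch of the difference in output type between the two wrappers. `add_elements` is a hypothetical function for this example:

```python
import numpy as np

def add_elements(x, y):
    """Hypothetical scalar function to be wrapped."""
    return x + y

# frompyfunc(func, nin, nout): the resulting ufunc returns object arrays
add_obj = np.frompyfunc(add_elements, 2, 1)
obj_result = add_obj(np.arange(4), np.arange(4))

# vectorize lets the output dtype be stated with otypes
add_float = np.vectorize(add_elements, otypes=[np.float64])
float_result = add_float(np.arange(4), np.arange(4))
```

Both wrappers still call the Python function once per element, so they are generally much slower than built-in ufuncs such as `np.add`.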

Structured array

A structured array can pack complex nested records into a single block of memory
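
A minimal sketch of a structured dtype, including one nested field. The field names are invented for illustration:

```python
import numpy as np

# Each element is a fixed-size record: an 8-byte float followed by a 4-byte int
record_type = [('x', np.float64), ('y', np.int32)]
records = np.array([(1.5, 6), (np.pi, -2)], dtype=record_type)

xs = records['x']              # one field across all records
first_y = records[0]['y']      # one field of one record
record_size = records.dtype.itemsize

# Fields can themselves be structured (nested dtypes)
nested_type = [('point', [('x', 'f8'), ('y', 'f8')]), ('label', 'S5')]
nested = np.array([((1.0, 2.0), b'a'), ((3.0, 4.0), b'b')], dtype=nested_type)
point_ys = nested['point']['y']
```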

Numba

jit(nopython=True) or njit
- nopython mode compiles code that makes no Python C API calls
float64

Uses the LLVM project to translate Python code into compiled machine code

Advanced Array Input and Output

memmap object
- ndarray-like object that lets a large file be read and written in segments without loading it all into memory
np.memmap

memmap

flush()
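
A small sketch of creating, flushing and reopening a memory map. It uses a temporary file, and the shape and file name are arbitrary:

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'big.dat')

    # mode='w+' creates (or overwrites) the file, filled with zeros
    mm = np.memmap(path, dtype='float64', mode='w+', shape=(100, 10))
    mm[:5] = 1.0          # assignment is buffered in memory
    mm.flush()            # write pending changes to disk
    del mm                # release the map

    # Reopen read-only; dtype and shape are not stored in the file,
    # so they have to be given again
    ro = np.memmap(path, dtype='float64', mode='r', shape=(100, 10))
    total = float(ro[:5].sum())
    file_size = os.path.getsize(path)
    del ro
```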

Numpy Functions for Deep Learning

np.pad()
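
A sketch of zero padding with `np.pad`. The 4-D case assumes a batch layout of (batch, height, width, channels), which is an assumption made for this example:

```python
import numpy as np

image = np.arange(1, 5).reshape(2, 2)

# Pad one cell of zeros on every side of a 2-D array
padded = np.pad(image, pad_width=1, mode='constant', constant_values=0)

# Pad only height and width of a batch, leaving batch and channel axes alone
batch = np.ones((2, 3, 3, 1))
padded_batch = np.pad(batch, ((0, 0), (1, 1), (1, 1), (0, 0)), mode='constant')
```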

Numpy Performance Tips

vectorize
- convert loops and conditional logic to array operations and boolean operations
- use contiguous memory (C-contiguous)
  - arr.flags
use broadcasting where possible
use views to avoid copies
use ufuncs
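
The tips above can be sketched on a small array. `double_evens_loop` is a hypothetical helper used only for comparison:

```python
import numpy as np

values = np.arange(1000, dtype=np.float64)

def double_evens_loop(data):
    """Hypothetical Python-level loop with a condition, for comparison."""
    out = np.empty_like(data)
    for i, v in enumerate(data):
        out[i] = v * 2 if v % 2 == 0 else v
    return out

looped = double_evens_loop(values)

# Same logic as one array expression: boolean mask plus np.where
vectorized = np.where(values % 2 == 0, values * 2, values)

# arr.flags reports the memory layout
matrix = np.arange(12).reshape(3, 4)
c_contig = matrix.flags['C_CONTIGUOUS']           # True
t_contig = matrix.T.flags['C_CONTIGUOUS']         # False: transposed view
fixed = np.ascontiguousarray(matrix.T)            # copies into C order

# Basic slices are views; fancy indexing makes a copy
window = matrix[:, :2]
picked = matrix[:, [0, 1]]
```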

Examples

Examples List

Numpy