Python For Data Science Cheat Sheet Lists Also see NumPy Arrays Libraries
Python Basics >>> a = 'is' Import libraries
>>> b = 'nice' >>> import numpy Data analysis Machine learning
Learn More Python for Data Science Interactively at www.datacamp.com >>> my_list = ['my', 'list', a, b] >>> import numpy as np
>>> my_list2 = [[4,5,6,7], [3,4,5,6]] Selective import
Index starts at 0 >>> from math import pi Scientific computing 2D plotting
Variables and Data Types Selecting List Elements
Variable Assignment Subset Install Python
>>> x=5 >>> my_list[1] Select item at index 1
>>> x >>> my_list[-3] Select 3rd last item
5 Slice
>>> my_list[1:3] Select items at index 1 and 2
Calculations With Variables >>> my_list[1:] Select items after index 0
>>> x+2 Sum of two variables >>> my_list[:3] Select items before index 3 Leading open data science platform Free IDE that is included Create and share
>>> my_list[:] Copy my_list powered by Python with Anaconda documents with live code,
7 Subset Lists of Lists visualizations, text, ...
>>> x-2 Subtraction of two variables >>> my_list2[1][0] my_list[list][itemOfList]
3 >>> my_list2[1][:2] Numpy Arrays Also see Lists
>>> x*2 Multiplication of two variables >>> my_list = [1, 2, 3, 4]
10 List Operations >>> my_array = np.array(my_list)
>>> x**2 Exponentiation of a variable >>> my_list + my_list >>> my_2darray = np.array([[1,2,3],[4,5,6]])
25
>>> x%2 Remainder of a variable ['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice'] Selecting Numpy Array Elements Index starts at 0
1 >>> my_list * 2
>>> x/float(2) Division of a variable ['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice'] Subset
2.5 >>> my_list2 > 4 Python 2 only; raises TypeError in Python 3 >>> my_array[1] Select item at index 1
Types and Type Conversion True 2
List Methods Slice
str() '5', '3.45', 'True' >>> my_array[0:2] Select items at index 0 and 1
Variables to strings Get the index of an item array([1, 2])
>>> my_list.index(a) Subset 2D Numpy arrays
int() 5, 3, 1 Variables to integers >>> my_list.count(a) Count an item
>>> my_list.append('!') Append an item at a time >>> my_2darray[:,0] my_2darray[rows, columns]
float() 5.0, 1.0 Variables to floats >>> my_list.remove('!') Remove an item array([1, 4])
>>> del(my_list[0:1]) Remove an item Numpy Array Operations
bool() True, True, True Variables to booleans >>> my_list.reverse() Reverse the list >>> my_array > 3
>>> my_list.extend('!') Append an item array([False, False, False, True], dtype=bool)
Asking For Help >>> my_list.pop(-1) Remove an item >>> my_array * 2
>>> my_list.insert(0,'!') Insert an item array([2, 4, 6, 8])
>>> help(str) >>> my_list.sort() Sort the list >>> my_array + np.array([5, 6, 7, 8])
Strings array([6, 8, 10, 12])
>>> my_string = 'thisStringIsAwesome' Numpy Array Functions
>>> my_string String Operations Index starts at 0 >>> my_array.shape Get the dimensions of the array
'thisStringIsAwesome' >>> my_string[3] >>> np.append(my_array, other_array) Append items to an array
String Operations >>> my_string[4:9] >>> np.insert(my_array, 1, 5) Insert items in an array
String Methods >>> np.delete(my_array,[1]) Delete items in an array
>>> my_string * 2 >>> np.mean(my_array) Mean of the array
'thisStringIsAwesomethisStringIsAwesome' >>> my_string.upper() String to uppercase >>> np.median(my_array) Median of the array
>>> my_string + 'Innit' >>> my_string.lower() String to lowercase >>> np.corrcoef(my_array) Correlation coefficient
'thisStringIsAwesomeInnit' >>> my_string.count('w') Count String elements >>> np.std(my_array) Standard deviation
>>> 'm' in my_string >>> my_string.replace('e', 'i') Replace String elements
True >>> my_string.strip() Strip whitespaces DataCamp
Learn Python for Data Science Interactively
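The list and array operations above look alike but behave differently. The following is a minimal sketch of that contrast, reusing the `my_list` and `my_array` names from the sheet. The other variable names are illustrative only.

```python
import numpy as np

my_list = [1, 2, 3, 4]
my_array = np.array(my_list)

# Lists: + concatenates and * repeats the whole list.
list_sum = my_list + my_list        # [1, 2, 3, 4, 1, 2, 3, 4]
list_twice = my_list * 2            # [1, 2, 3, 4, 1, 2, 3, 4]

# Arrays: the same operators are applied element by element.
array_sum = my_array + my_array     # array([2, 4, 6, 8])
array_twice = my_array * 2          # array([2, 4, 6, 8])
mask = my_array > 3                 # array([False, False, False,  True])

# Slicing a list makes a new list; slicing an array gives a view
# that shares memory with the original array.
list_slice = my_list[:2]
list_slice[0] = 99                  # my_list is unchanged
array_slice = my_array[:2]
array_slice[0] = 99                 # my_array[0] is now 99 as well
```

If an independent copy of an array slice is needed, `my_array[:2].copy()` avoids the shared memory.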
Python For Data Science Cheat Sheet Working with Different Programming Languages Widgets
Kernels provide computation and communication with front-end interfaces Notebook widgets provide the ability to visualize and control changes
Jupyter Notebook like the notebooks. There are three main kernels: in your data, often as a control like a slider, textbox, etc.
You can use them to build interactive GUIs for your notebooks or to
IRkernel IJulia synchronize stateful and stateless information between Python and
Installing Jupyter Notebook will automatically install the IPython kernel. JavaScript.
Saving/Loading Notebooks Restart kernel Interrupt kernel
Create new notebook Restart kernel & run Interrupt kernel & Download serialized Save notebook
Open an existing all cells clear all output state of all widget with interactive
Make a copy of the notebook Restart kernel & run Connect back to a models in use widgets
current notebook all cells remote notebook
Rename notebook Run other installed Embed current
kernels widgets
Save current notebook Revert notebook to a
and record checkpoint previous checkpoint Command Mode:
Download notebook as
Preview of the printed - IPython notebook
notebook - Python
- HTML
Close notebook & stop - Markdown
running any scripts - reST
- LaTeX
- PDF
Writing Code And Text
Code and text are encapsulated by 3 basic cell types: markdown cells, code
cells, and raw NBConvert cells.
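A notebook file is stored as JSON, and each cell records its type. The sketch below builds a minimal in-memory notebook with one cell of each of the three types, using only the standard library. The field names follow the version 4 notebook format as commonly documented. The `make_cell` helper and the cell contents are hypothetical and exist only for illustration.

```python
import json

def make_cell(cell_type, source):
    """Build one notebook cell as a plain dict (illustrative helper)."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally carry an execution counter and outputs.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell

notebook = {
    "cells": [
        make_cell("markdown", "# A heading written in Markdown"),
        make_cell("code", "x = 5\nx + 2"),
        make_cell("raw", "Text passed through unchanged by nbconvert"),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Serialize to the JSON text that would be written to an .ipynb file.
text = json.dumps(notebook, indent=1)
```

Writing `text` to a file with an `.ipynb` extension should give a file the Notebook can open, although in practice the `nbformat` package is the safer way to create and validate notebooks.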
Edit Cells Edit Mode: 1. Save and checkpoint 9. Interrupt kernel
2. Insert cell below 10. Restart kernel
Cut currently selected cells Copy cells from 3. Cut cell 11. Display characteristics
to clipboard clipboard to current 4. Copy cell(s) 12. Open command palette
5. Paste cell(s) below 13. Current kernel
Paste cells from cursor position Executing Cells 6. Move cell up 14. Kernel status
clipboard above 7. Move cell down 15. Log out from notebook server
current cell Paste cells from Run selected cell(s) Run current cells down 8. Run current cell
clipboard below and create a new one
Paste cells from current cell below Asking For Help
clipboard on top Delete current cells Run current cells down
of current cell and create a new one Walk through a UI tour
Revert “Delete Cells” Split up a cell from above Run all cells
invocation current cursor Run all cells above the Run all cells below List of built-in keyboard
position current cell the current cell Edit the built-in shortcuts
Merge current cell Merge current cell Change the cell type of toggle, toggle keyboard shortcuts Notebook help topics
with the one above with the one below current cell scrolling and clear Description of
Move current cell up Move current cell toggle, toggle current outputs markdown available Information on
Adjust metadata down scrolling and clear in notebook unofficial Jupyter
underlying the Find and replace all output Python help topics Notebook extensions
current notebook in selected cells View Cells IPython help topics
Remove cell Copy attachments of NumPy help topics
attachments current cell Toggle display of Jupyter Toggle display of toolbar Matplotlib help topics SciPy help topics
Paste attachments of Insert image in logo and filename SymPy help topics
current cell selected cells Toggle display of cell Pandas help topics
action icons: About Jupyter Notebook
Insert Cells - None
- Edit metadata
Add new cell below the Toggle line numbers - Raw cell format
Add new cell above the in cells - Slideshow
current one current one - Attachments
- Tags
Python For Data Science Cheat Sheet Inspecting Your Array Subsetting, Slicing, Indexing Also see Lists
>>> a.shape Array dimensions Subsetting
NumPy Basics >>> len(a) Length of array 1 2 3
>>> b.ndim Number of array dimensions >>> a[2] Select the element at the 2nd index
>>> e.size Number of array elements 3
1.5 2 3
>>> b.dtype Data type of array elements >>> b[1,2] Select the element at row 1 column 2
4 5 6
>>> b.dtype.name Name of data type 6.0 (equivalent to b[1][2])
>>> b.astype(int) Convert an array to a different type Slicing
>>> a[0:2] Select items at index 0 and 1
NumPy array([1, 2]) 1 2 3
2 Asking For Help
1.5 2 3
The NumPy library is the core library for scientific computing in >>> np.info(np.ndarray.dtype) >>> b[0:2,1] Select items at rows 0 and 1 in column 1
Python. It provides a high-performance multidimensional array array([ 2., 5.]) 4 5 6
1.5 2 3
object, and tools for working with these arrays. Array Mathematics >>> b[:1] Select all items at row 0
4 5 6
array([[1.5, 2., 3.]]) (equivalent to b[0:1, :])
Use the following import convention: Arithmetic Operations >>> c[1,...] Same as [1,:,:]
array([[ 3., 2., 1.],
>>> import numpy as np >>> g = a - b Subtraction [ 4., 5., 6.]])
array([[-0.5, 0. , 0. ], >>> a[ : :-1] Reversed array a
NumPy Arrays [-3. , -3. , -3. ]]) array([3, 2, 1])
1D array 2D array 3D array >>> np.subtract(a,b) Subtraction Boolean Indexing
>>> a[a<2] Select elements from a less than 2
>>> b + a Addition array([1]) 1 2 3
1 2 3 axis 1 axis 2 array([[ 2.5, 4. , 6. ],
axis 1 [ 5. , 7. , 9. ]]) Fancy Indexing
axis 0 1.5 2 3 >>> np.add(b,a) Addition >>> b[[1, 0, 1, 0],[0, 1, 2, 0]] Select elements (1,0),(0,1),(1,2) and (0,0)
4 5 6 axis 0 >>> a / b Division array([ 4. , 2. , 6. , 1.5])
array([[ 0.66666667, 1. , 1. ], >>> b[[1, 0, 1, 0]][:,[0,1,2,0]] Select a subset of the matrix’s rows
[ 0.25 , 0.4 , 0.5 ]]) array([[ 4. ,5. , 6. , 4. ], and columns
>>> np.divide(a,b) Division [ 1.5, 2. , 3. , 1.5],
Creating Arrays >>> a * b Multiplication [ 4. , 5. , 6. , 4. ],
array([[ 1.5, 4. , 9. ], [ 1.5, 2. , 3. , 1.5]])
>>> a = np.array([1,2,3]) [ 4. , 10. , 18. ]]) Array Manipulation
>>> b = np.array([(1.5,2,3), (4,5,6)], dtype = float) >>> np.multiply(a,b) Multiplication
>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]], >>> np.exp(b) Exponentiation Transposing Array
dtype = float) >>> np.sqrt(b) Square root >>> i = np.transpose(b) Permute array dimensions
Initial Placeholders >>> np.sin(a) Element-wise sine >>> i.T Permute array dimensions
>>> np.cos(b) Element-wise cosine
>>> np.zeros((3,4)) Create an array of zeros >>> np.log(a) Element-wise natural logarithm Changing Array Shape
>>> e.dot(f) Dot product >>> b.ravel() Flatten the array
>>> np.ones((2,3,4),dtype=np.int16) Create an array of ones array([[ 7., 7.],
>>> d = np.arange(10,25,5) Create an array of evenly [ 7., 7.]]) >>> g.reshape(3,-1) Reshape, but don’t change data
spaced values (step value) Adding/Removing Elements
>>> np.linspace(0,2,9) Create an array of evenly Comparison >>> h.resize((2,6)) Resize the array in place to shape (2,6)
spaced values (number of samples) >>> a == b Element-wise comparison >>> np.append(h,g) Append items to an array
>>> e = np.full((2,2),7) Create a constant array array([[False, True, True], >>> np.insert(a, 1, 5) Insert items in an array
>>> f = np.eye(2) Create a 2X2 identity matrix [False, False, False]], dtype=bool) >>> np.delete(a,[1]) Delete items from an array
>>> np.random.random((2,2)) Create an array with random values >>> a < 2 Element-wise comparison
>>> np.empty((3,2)) Create an empty array array([True, False, False], dtype=bool) Combining Arrays
>>> np.array_equal(a, b) Array-wise comparison >>> np.concatenate((a,d),axis=0) Concatenate arrays
I/O array([ 1, 2, 3, 10, 15, 20])
Aggregate Functions >>> np.vstack((a,b)) Stack arrays vertically (row-wise)
Saving & Loading On Disk array([[ 1. , 2. , 3. ],
>>> a.sum() Array-wise sum [ 1.5, 2. , 3. ],
>>> np.save('my_array', a) >>> a.min() Array-wise minimum value [ 4. , 5. , 6. ]])
>>> np.savez('array.npz', a, b) >>> b.max(axis=0) Maximum value along axis 0 (per column) >>> np.r_[e,f] Stack arrays vertically (row-wise)
>>> np.load('my_array.npy') >>> np.hstack((e,f)) Stack arrays horizontally (column-wise)
>>> b.cumsum(axis=1) Cumulative sum of the elements array([[ 7., 7., 1., 0.],
Saving & Loading Text Files >>> a.mean() Mean [ 7., 7., 0., 1.]])
>>> np.median(b) Median >>> np.column_stack((a,d)) Create stacked column-wise arrays
>>> np.loadtxt("myfile.txt") >>> np.corrcoef(a) Correlation coefficient array([[ 1, 10],
>>> np.genfromtxt("my_file.csv", delimiter=',') >>> np.std(b) Standard deviation [ 2, 15],
[ 3, 20]])
>>> np.savetxt("myarray.txt", a, delimiter=" ") Copying Arrays >>> np.c_[a,d] Create stacked column-wise arrays
Splitting Arrays
Data Types >>> h = a.view() Create a view of the array with the same data >>> np.hsplit(a,3) Split the array horizontally into 3
>>> np.int64 Signed 64-bit integer types >>> np.copy(a) Create a copy of the array [array([1]),array([2]),array([3])] equal parts
>>> np.float32 Standard single-precision floating point >>> h = a.copy() Create a deep copy of the array >>> np.vsplit(c,2) Split the array vertically into 2 equal parts
[array([[[ 1.5, 2. , 3. ],
>>> np.complex128 Complex numbers represented by two 64-bit floats [ 4. , 5. , 6. ]]]),
>>> np.bool_ Boolean type storing TRUE and FALSE values array([[[ 3., 2., 1.],
>>> np.object_ Python object type Sorting Arrays [ 4., 5., 6.]]])]
>>> np.bytes_ Fixed-length byte-string type >>> a.sort() Sort an array
>>> np.str_ Fixed-length unicode type >>> c.sort(axis=0) Sort the elements of an array's axis
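Two points on this sheet are easy to misread: the difference between a view and a copy, and which direction `axis=0` and `axis=1` refer to. The sketch below illustrates both using the sheet's arrays `a` and `b`. The remaining variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

# A view shares memory with the original; a copy does not.
h = a.view()
k = a.copy()
a[0] = 100
view_first = h[0]    # 100, because h looks at the same data as a
copy_first = k[0]    # 1, because k was copied before the change

# axis=0 collapses the rows, giving one result per column.
col_max = b.max(axis=0)          # array([4., 5., 6.])
# axis=1 collapses the columns, giving one result per row.
row_max = b.max(axis=1)          # array([3., 6.])
row_cumsum = b.cumsum(axis=1)    # running total along each row

# Functions such as median and corrcoef live in the np namespace,
# not on the array object itself.
middle = np.median(b)            # 3.5
```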
Python For Data Science Cheat Sheet Linear Algebra Also see NumPy
SciPy - Linear Algebra You’ll use the linalg and sparse modules. Note that scipy.linalg contains and expands on numpy.linalg.
>>> from scipy import linalg, sparse Matrix Functions
Creating Matrices Addition
>>> A = np.matrix(np.random.random((2,2))) >>> np.add(A,D) Addition
SciPy >>> B = np.asmatrix(b) Subtraction
The SciPy library is one of the core packages for >>> C = np.asmatrix(np.random.random((10,5))) >>> np.subtract(A,D) Subtraction
>>> D = np.asmatrix([[3,4], [5,6]]) Division
scientific computing that provides mathematical Basic Matrix Routines >>> np.divide(A,D) Division
algorithms and convenience functions built on the Multiplication
NumPy extension of Python. Inverse >>> np.multiply(D,A) Multiplication
>>> A.I Inverse >>> np.dot(A,D) Dot product
Also see NumPy >>> linalg.inv(A) Inverse >>> np.vdot(A,D) Vector dot product
Interacting With NumPy >>> A.T Transpose matrix >>> np.inner(A,D) Inner product
>>> import numpy as np >>> A.H Conjugate transposition >>> np.outer(A,D) Outer product
>>> a = np.array([1,2,3]) >>> np.trace(A) Trace >>> np.tensordot(A,D) Tensor dot product
>>> b = np.array([(1+5j,2j,3j), (4j,5j,6j)]) Norm >>> np.kron(A,D) Kronecker product
>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]]) Exponential Functions
>>> linalg.norm(A) Frobenius norm >>> linalg.expm(A) Matrix exponential
Index Tricks >>> linalg.norm(A,1) L1 norm (max column sum) >>> linalg.expm2(A) Matrix exponential (eigenvalue decomposition)
>>> linalg.norm(A,np.inf) L inf norm (max row sum) >>> linalg.expm3(D) Matrix exponential (Taylor series); expm2 and expm3 are absent from current SciPy, use expm
>>> np.mgrid[0:5,0:5] Create a dense meshgrid Rank
>>> np.ogrid[0:2,0:2] Create an open meshgrid >>> np.linalg.matrix_rank(C) Matrix rank Logarithm Function
>>> np.r_[3,[0]*5,-1:1:10j] Stack arrays vertically (row-wise) >>> linalg.logm(A) Matrix logarithm
>>> np.c_[b,c] Create stacked column-wise arrays Determinant Trigonometric Functions
Shape Manipulation >>> linalg.det(A) Determinant >>> linalg.sinm(D) Matrix sine
Solving linear problems >>> linalg.cosm(D) Matrix cosine
>>> np.transpose(b) Permute array dimensions >>> linalg.solve(A,b) Solver for dense matrices >>> linalg.tanm(A) Matrix tangent
>>> b.flatten() Flatten the array >>> E = np.asmatrix(a).T Column matrix built from a Hyperbolic Trigonometric Functions
>>> np.hstack((b,c)) Stack arrays horizontally (column-wise) >>> linalg.lstsq(D,E) Least-squares solution to linear matrix >>> linalg.sinhm(D) Hyperbolic matrix sine
>>> np.vstack((a,b)) Stack arrays vertically (row-wise) equation >>> linalg.coshm(D) Hyperbolic matrix cosine
>>> np.hsplit(c,2) Split the array horizontally into 2 equal parts Generalized inverse >>> linalg.tanhm(A) Hyperbolic matrix tangent
>>> np.vsplit(d,2) Split the array vertically into 2 equal parts >>> linalg.pinv(C) Compute the pseudo-inverse of a matrix Matrix Sign Function
Polynomials (least-squares solver) >>> linalg.signm(A) Matrix sign function
>>> linalg.pinv2(C) Compute the pseudo-inverse of a matrix Matrix Square Root
>>> from numpy import poly1d (SVD; pinv2 is absent from recent SciPy, use pinv) >>> linalg.sqrtm(A) Matrix square root
>>> p = poly1d([3,4,5]) Create a polynomial object Creating Sparse Matrices Arbitrary Functions
Vectorizing Functions >>> linalg.funm(A, lambda x: x*x) Evaluate matrix function
>>> def myfunc(a): >>> F = np.eye(3, k=1) Create a 3x3 matrix with ones on the first superdiagonal
 if a < 0: >>> G = np.asmatrix(np.identity(2)) Create a 2x2 identity matrix Decompositions
return a*2 >>> C[C > 0.5] = 0
else: >>> H = sparse.csr_matrix(C) Compressed Sparse Row matrix Eigenvalues and Eigenvectors
return a/2 >>> I = sparse.csc_matrix(D) Compressed Sparse Column matrix >>> la, v = linalg.eig(A) Solve ordinary or generalized
>>> np.vectorize(myfunc) Vectorize functions >>> J = sparse.dok_matrix(A) Dictionary Of Keys matrix eigenvalue problem for square matrix
>>> H.todense() Sparse matrix to full matrix >>> l1, l2 = la Unpack eigenvalues
Type Handling >>> sparse.isspmatrix_csc(A) Identify sparse matrix >>> v[:,0] First eigenvector
>>> np.real(c) Return the real part of the array elements >>> v[:,1] Second eigenvector
>>> np.imag(c) Return the imaginary part of the array elements Sparse Matrix Routines >>> linalg.eigvals(A) Compute eigenvalues only
>>> np.real_if_close(c,tol=1000) Return a real array if complex parts close to 0 Inverse Singular Value Decomposition
>>> np.cast['f'](np.pi) Cast object to a data type Inverse >>> U,s,Vh = linalg.svd(B) Singular Value Decomposition (SVD)
>>> sparse.linalg.inv(I) >>> M,N = B.shape
Other Useful Functions Norm >>> Sig = linalg.diagsvd(s,M,N) Construct sigma matrix in SVD
>>> np.angle(b,deg=True) Return the angle of the complex argument >>> sparse.linalg.norm(I) Norm LU Decomposition
>>> g = np.linspace(0,np.pi,num=5) Create an array of evenly spaced values Solving linear problems >>> P,L,U = linalg.lu(C) LU Decomposition
(number of samples) >>> sparse.linalg.spsolve(H,I) Solver for sparse matrices
>>> g [3:] += np.pi Sparse Matrix Decompositions
>>> np.unwrap(g) Unwrap Sparse Matrix Functions
>>> np.logspace(0,10,3) Create an array of evenly spaced values (log scale) >>> la, v = sparse.linalg.eigs(F,1) Eigenvalues and eigenvectors
>>> np.select([c<4],[c*2]) Return values from a list of arrays depending on >>> sparse.linalg.expm(I) Sparse matrix exponential >>> sparse.linalg.svds(H, 2) SVD
conditions
>>> from scipy import special
>>> special.factorial(a) Factorial
>>> special.comb(10,3,exact=True) Number of combinations of N things taken k at a time Asking For Help
>>> misc.central_diff_weights(3) Weights for Np-point central derivative (scipy.misc; absent from recent SciPy) >>> help(scipy.linalg.diagsvd)
>>> misc.derivative(myfunc,1.0) Find the n-th derivative of a function at a point (scipy.misc; absent from recent SciPy) >>> np.info(np.matrix)
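The solver and decomposition routines above can be checked by multiplying their factors back together. The sketch below does this for a small matrix with the same values as the sheet's `D`, stored as a plain `ndarray` rather than `np.matrix`. The right-hand side `rhs` and the other variable names are assumptions made for illustration.

```python
import numpy as np
from scipy import linalg

D = np.array([[3.0, 4.0], [5.0, 6.0]])
rhs = np.array([1.0, 2.0])          # illustrative right-hand side

# Solve D @ x = rhs for x.
x = linalg.solve(D, rhs)

# Singular value decomposition: D = U @ Sig @ Vh.
U, s, Vh = linalg.svd(D)
M, N = D.shape
Sig = linalg.diagsvd(s, M, N)       # place the singular values in an M x N matrix
svd_rebuilt = U @ Sig @ Vh

# LU decomposition with pivoting: D = P @ L @ Uf.
P, L, Uf = linalg.lu(D)
lu_rebuilt = P @ L @ Uf

# Eigen-decomposition: D @ v = lambda * v for each eigenpair.
eigenvalues, eigenvectors = linalg.eig(D)
first_value = eigenvalues[0]
first_vector = eigenvectors[:, 0]
```

Comparing the rebuilt matrices with `np.allclose(svd_rebuilt, D)` and `np.allclose(lu_rebuilt, D)` is a quick sanity check that a decomposition was unpacked in the right order.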