Introduction to NumPy
These notes are based on the video ML Zoomcamp 1.7 - Introduction to NumPy
NumPy is a fundamental library for numerical computing in Python. This guide covers the essential NumPy operations you’ll need throughout the machine learning course.
Importing NumPy
The standard convention is to import NumPy with an alias:
import numpy as np
This makes code more concise by allowing you to write np instead of numpy.
Creating Arrays
Basic Array Creation
NumPy provides several functions to create arrays:
np.zeros(10): Creates an array of 10 zeros
np.ones(10): Creates an array of 10 ones
np.full(10, 2.5): Creates an array of 10 elements, all with value 2.5
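The three creation functions above can be tried side by side. This is a minimal sketch; the variable names (zeros, ones, filled) are only for illustration and do not come from the video.

```python
import numpy as np

zeros = np.zeros(10)       # ten elements, all 0.0
ones = np.ones(10)         # ten elements, all 1.0
filled = np.full(10, 2.5)  # ten elements, all 2.5

# np.zeros and np.ones produce floating-point arrays by default
print(zeros.dtype, ones.dtype)
print(filled)
```

Each call returns a one-dimensional array whose length is the first argument.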
From Python Lists
Convert a Python list to a NumPy array:
a = np.array([1, 2, 3, 5, 7, 12])
Accessing and Modifying Elements
Access elements using zero-based indexing:
a[2] # Returns the third element (3)
a[2] = 10 # Changes the third element to 10
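Put together, reading and then overwriting an element might look like the sketch below. The names third and last are illustrative, and the negative index is standard Python-style indexing rather than something shown in the notes.

```python
import numpy as np

a = np.array([1, 2, 3, 5, 7, 12])

third = a[2]   # read the element at index 2
last = a[-1]   # negative indices count from the end
a[2] = 10      # overwrite the element at index 2 in place

print(third, last)
print(a)
```

The assignment changes the array itself; no new array is created.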
Range-Based Arrays
Create arrays with sequential values:
np.arange(10): Creates an array with values 0 through 9
np.arange(3, 10): Creates an array with values 3 through 9
np.linspace(0, 1, 11): Creates 11 evenly spaced values from 0 to 1 inclusive
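The difference between the two functions is easiest to see by checking the endpoints: np.arange excludes its stop value, while np.linspace includes it. A small sketch, with illustrative variable names:

```python
import numpy as np

r1 = np.arange(10)            # 0, 1, ..., 9 (stop value 10 is excluded)
r2 = np.arange(3, 10)         # 3, 4, ..., 9
grid = np.linspace(0, 1, 11)  # 11 points from 0.0 to 1.0, both ends included

print(r1)
print(r2)
print(grid)
```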
Multi-Dimensional Arrays
Creating 2D Arrays
# Create a 5×2 array of zeros
np.zeros((5, 2))
# Create from nested lists
n = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Accessing Elements in 2D Arrays
n[0, 1] # Access element at row 0, column 1 (value: 2)
n[0, 1] = 20 # Change this element to 20
Accessing Rows and Columns
n[0] # Access the first row
n[2] = [1, 1, 1] # Replace the last row with ones
n[:, 1] # Access the second column (all rows, column 1)
n[:, 2] = [0, 1, 2] # Replace the last column
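Running the assignments from this section in order on the same array shows how each one changes it. This sketch simply replays the statements above; second_column is an illustrative name.

```python
import numpy as np

n = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

n[0, 1] = 20         # single element: row 0, column 1
n[2] = [1, 1, 1]     # whole row: the last row
n[:, 2] = [0, 1, 2]  # whole column: the last column

second_column = n[:, 1]  # all rows, column 1

print(n)
print(second_column)
```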
Randomly Generated Arrays
Uniform Distribution
# 5×2 array of random values between 0 and 1
np.random.random((5, 2))
Setting Random Seed
Make random generation reproducible:
np.random.seed(2)
np.random.random((5, 2)) # Produces the same "random" values each time the seed is set to 2 first
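One way to check reproducibility is to set the same seed twice and compare the results. A minimal sketch; first and second are illustrative names.

```python
import numpy as np

np.random.seed(2)
first = np.random.random((5, 2))

np.random.seed(2)  # resetting the seed restarts the same sequence
second = np.random.random((5, 2))

print(np.array_equal(first, second))
```

Without the second call to np.random.seed, the second array would continue the sequence and contain different values.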
Normal Distribution
# Random values from standard normal distribution
np.random.randn(5, 2)
Random Integers
# Random integers from 0 to 99 (the upper bound 100 is exclusive)
np.random.randint(0, 100, (5, 2))
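The shapes and value ranges of these generators can be verified directly. This is a sketch with an arbitrary seed and illustrative variable names; note that np.random.randn takes the dimensions as separate arguments, while np.random.randint takes the shape as a tuple.

```python
import numpy as np

np.random.seed(2)  # arbitrary seed, only so the run is repeatable

normal = np.random.randn(5, 2)            # dimensions passed as separate arguments
ints = np.random.randint(0, 100, (5, 2))  # shape passed as a tuple

print(normal.shape, ints.shape)
print(ints.min(), ints.max())  # always within 0..99
```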
Element-Wise Operations
NumPy arrays support arithmetic operations that apply to each element:
a = np.arange(5) # [0, 1, 2, 3, 4]
a + 1 # [1, 2, 3, 4, 5]
a * 2 # [0, 2, 4, 6, 8]
a / 2 # [0, 0.5, 1, 1.5, 2]
Operations can be chained:
a * 2 + 10 # [10, 12, 14, 16, 18]
(a * 2 + 10) ** 2 # [100, 144, 196, 256, 324]
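To make "applies to each element" concrete, the chained expression can be compared with an explicit Python loop over the same values. A sketch; chained and looped are illustrative names.

```python
import numpy as np

a = np.arange(5)  # [0, 1, 2, 3, 4]

# One vectorized expression over the whole array
chained = (a * 2 + 10) ** 2

# The same computation written as a plain Python loop
looped = np.array([(x * 2 + 10) ** 2 for x in a.tolist()])

print(chained)
print(np.array_equal(chained, looped))
```

Both give the same values; the NumPy version avoids the explicit loop.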
Operations Between Arrays
Element-wise operations work between arrays too:
a = np.arange(5) # [0, 1, 2, 3, 4]
b = np.array([10, 10, 10, 10, 10])
a + b # [10, 11, 12, 13, 14]
a * b # [0, 10, 20, 30, 40]
Comparison Operations
Compare arrays element-wise:
a >= 2 # [False, False, True, True, True]
a > b # Element-wise comparison between arrays
Filter arrays using boolean masks:
a[a > b] # Returns elements of a where a > b is True (empty here, since every element of b is 10)
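With b filled with tens, no element of a passes the comparison, so the filter returns an empty array. The sketch below adds a second, hypothetical array (b_mixed, not from the notes) so the mask keeps some elements.

```python
import numpy as np

a = np.arange(5)                     # [0, 1, 2, 3, 4]
b = np.array([10, 10, 10, 10, 10])
b_mixed = np.array([3, 0, 5, 1, 2])  # hypothetical values for illustration

empty = a[a > b]      # no element of a exceeds 10
mask = a > b_mixed    # boolean array, one entry per element
kept = a[mask]        # only the elements where the mask is True

print(empty)
print(mask)
print(kept)
```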
Summarizing Operations
Reduce arrays to single values:
np.min(a) # Minimum value (0)
np.max(a) # Maximum value (4)
np.sum(a) # Sum of all elements (10)
np.mean(a) # Mean value (2.0)
np.std(a) # Standard deviation (population form by default, about 1.41)
These operations work on both 1D and 2D arrays.
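A short sketch of the same functions applied to a 2D array. Without extra arguments they reduce over every element; the axis argument used at the end is part of NumPy's API but is not covered in these notes, so treat that line as an aside.

```python
import numpy as np

a = np.arange(5)
n = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

spread = np.std(a)   # population standard deviation of [0, 1, 2, 3, 4]
total = np.sum(n)    # sum over all nine elements
smallest = np.min(n)
average = np.mean(n)

row_mins = np.min(n, axis=1)  # aside: one minimum per row

print(spread, total, smallest, average)
print(row_mins)
```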
NumPy offers many more functions, such as finding the minimum of each row and sorting arrays, as well as matrix operations, which will be covered in the linear algebra section.