Machine Learning Handbook with Python: Chapter 1: Working with Vectors, Matrices, and Arrays in NumPy

NumPy (Numerical Python) is a Python library well suited to numerical and scientific data processing. It provides N-dimensional array objects and many methods for working with them.

Creating a vector with NumPy

# Load library
import numpy as np

# Create a vector as a row
vector_row = np.array([1, 2, 3])

# Create a vector as a column
vector_column = np.array([[1],
                          [2],
                          [3]])


We can create a matrix by creating a two-dimensional NumPy array. However, NumPy's dedicated matrix data structure is rarely used, because the standard NumPy data structure is the array and most NumPy operations return arrays rather than matrix objects.
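As a minimal sketch of the two-dimensional approach described above (the values are illustrative assumptions, not taken from the original listing):

```python
import numpy as np

# Create a matrix as a two-dimensional array (three rows, two columns)
matrix = np.array([[1, 2],
                   [1, 2],
                   [1, 2]])

# The result is an ordinary ndarray, not a special matrix object
print(type(matrix))
print(matrix.shape)
```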

Creating a sparse matrix

A sparse matrix is a matrix in which most elements are 0. A sparse matrix stores only the nonzero values and assumes all remaining values are 0, which saves a significant amount of computation. In the example above, we can see that the sparse matrix of

and column 1. A sparse matrix of this kind is called a compressed sparse row (CSR) matrix.
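The following is a small sketch of building a CSR matrix. It assumes SciPy is installed and uses scipy.sparse.csr_matrix; the dense matrix values are an illustrative assumption.

```python
import numpy as np
from scipy import sparse

# Dense matrix in which most elements are zero
matrix = np.array([[0, 0],
                   [0, 1],
                   [3, 0]])

# Convert to compressed sparse row (CSR) format
matrix_sparse = sparse.csr_matrix(matrix)

# Only the nonzero values are stored; nnz reports how many there are
print(matrix_sparse.nnz)
print(matrix_sparse)
```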

# Create larger matrix


There are many types of sparse matrices, such as compressed sparse column, list of lists, and dictionary of keys. Each type has its own uses, so consider the options carefully before choosing one.

The NumPy array object

In NumPy, the multidimensional array is the most fundamental object. Unlike a Python list, a NumPy array holds values of a single type. In NumPy, the dimensions of an array are referred to as axes. For example, the one-dimensional array [1, 2, 3] has one axis with 3 elements; in other words, its axis has length 3. A two-dimensional array has two axes: here the first axis (axis=0) has length 2 and the second axis (axis=1) has length 3.
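The axis terminology can be checked directly with the ndim and shape attributes. This is a minimal sketch; the 2x3 array values are an assumption chosen to match the axis lengths mentioned above.

```python
import numpy as np

# One axis of length 3
vector = np.array([1, 2, 3])

# Two axes: axis 0 has length 2, axis 1 has length 3
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(vector.ndim, vector.shape)
print(matrix.ndim, matrix.shape)
```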

In NumPy, we can create the N-D array using the array()

create a NumPy array by passing any regular Python list or

function. The following is an example of the same:
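A minimal sketch of creating 1-D arrays with array(), once from a list and once from a tuple. The names array1_1D and array2_1D follow the surrounding text; the values are assumptions.

```python
import numpy as np

# 1-D array created from a regular Python list
array1_1D = np.array([1, 2, 3, 4])

# 1-D array created from a tuple
array2_1D = np.array((5, 6, 7, 8))

print(array1_1D)
print(array2_1D)
```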


In this example, we have used a regular Python list to create

dimensional array (array1_1D), and second, we have used a

tuple and made a 1-D NumPy array (array2_1D).

We can also explicitly pass the datatypes of the array using the

the array() function. Let’s see the following example:
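As an illustration of passing the datatype explicitly, here is a short sketch using the dtype argument of array(); the variable names and values are assumptions.

```python
import numpy as np

# Explicitly request a floating-point array
array_float = np.array([1, 2, 3], dtype=float)

# Explicitly request a complex array
array_complex = np.array([1, 2, 3], dtype=complex)

print(array_float, array_float.dtype)
print(array_complex, array_complex.dtype)
```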

So, if we want to datatype int or float, and so on, that can be

of complex datatype. Later in the chapter, we will see the

if we need a 3-D Array, then sequences of sequences, and so

following example of 2-D array creation using the array()

function:


We passed the two tuples list in this coding snippet, and the

transformed it into a 2-D array. Here, the length of the first axis

In NumPy, we can create special arrays, such as an array of

The following are the examples for some special arrays:

Creating NumPy arrays with given values

NumPy has several functions that create arrays filled with given values.


With the zeros() function we can create a NumPy array filled entirely with 0. Here the shape argument sets the number of elements in the array to 5, so the result is an array of five 0 values.

The zeros() function can also create a matrix of 0 values. The argument shape=(2, 3) means that axis=0 has 2 rows and axis=1 has 3 columns.
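Both uses of zeros() described above can be sketched as follows (variable names are assumptions):

```python
import numpy as np

# One-dimensional array of five zeros
zeros_1d = np.zeros(shape=5)

# Matrix of zeros: 2 rows (axis 0) and 3 columns (axis 1)
zeros_2d = np.zeros(shape=(2, 3))

print(zeros_1d)
print(zeros_2d)
```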


# View the vector

Creating NumPy arrays from lists and tuples

We can create a NumPy array from a Python list with the asarray() function.


In the example above we passed the list t1 to asarray() with dtype='int16', and it returned the equivalent one-dimensional array and two-dimensional matrix.

We can also create a NumPy array from a Python tuple, again with the asarray() function.
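A minimal sketch of asarray() with a list and with a tuple. The name t1 follows the text; t2 and all values are hypothetical.

```python
import numpy as np

t1 = [1, 2, 3]                   # list (values are illustrative)
t2 = ((1, 2, 3), (4, 5, 6))      # hypothetical tuple of tuples

# Convert the list to a 1-D array of 16-bit integers
array_from_list = np.asarray(t1, dtype='int16')

# Convert the nested tuple to a 2-D array of 16-bit integers
array_from_tuple = np.asarray(t2, dtype='int16')

print(array_from_list, array_from_list.dtype)
print(array_from_tuple.shape)
```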

Creating NumPy arrays from a range of numbers

We can create a NumPy array from a range of numbers with the arange(start, stop, step) function.
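For instance, a short sketch of arange() with illustrative arguments; note that the stop value itself is excluded:

```python
import numpy as np

# Start at 0, stop before 10, step by 2
evens = np.arange(0, 10, 2)

print(evens)
```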


Indexing and slicing in NumPy arrays

As with Python lists, NumPy arrays support indexing and slicing. NumPy arrays are zero-indexed, meaning the first element of the array is at index 0. Negative indices are also supported: -1 refers to the last element.

Indexing and slicing a one-dimensional array


# Reverse the vector

vector[::-1]
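The indexing rules for one-dimensional arrays can be sketched as below; the vector values are an assumption made for illustration.

```python
import numpy as np

vector = np.array([1, 2, 3, 4, 5, 6])

first = vector[0]               # first element (index 0)
last = vector[-1]               # last element (negative index)
head = vector[:3]               # elements at indices 0, 1, 2
tail = vector[3:]               # everything from index 3 onward
reversed_vector = vector[::-1]  # step of -1 reverses the vector

print(first, last)
print(head, tail, reversed_vector)
```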

Indexing and slicing a matrix

To slice a matrix we need to specify slices for both rows and columns. The syntax is:

[dim1_slice, dim2_slice]

or

[row_slice, column_slice]

or, for an N-dimensional array,

[dim1_slice, dim2_slice, ..., dimN_slice]

where each slice has the form start:stop:step; the start index is included and the stop index is excluded.
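A brief sketch of the [row_slice, column_slice] syntax on a 3x3 matrix (the values are illustrative assumptions):

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

element = matrix[1, 1]            # second row, second column
first_two_rows = matrix[:2, :]    # rows 0 and 1, all columns
second_column = matrix[:, 1:2]    # all rows, column 1 (kept two-dimensional)

print(element)
print(first_two_rows)
print(second_column)
```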


Data types in NumPy

NumPy supports a bigger number of numeric types than Python

10.1 is the list of basic data types in NumPy

In NumPy array, all the elements have the same data type,

there is any n-D array, ‘a’ is a type of int8 (8-bit integer), then

this array will be able to store an 8-bit integer value

Getting the datatype and memory storage information of NumPy


itemsize attribute of this class, we can get the info about one

The following is the coding snippet demonstrating the same:
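A hedged sketch of inspecting the datatype and memory use of an array through the dtype, itemsize, size, and nbytes attributes. The array contents and the explicit int32 type are assumptions, not the original listing.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.int32)

print(a.dtype)               # datatype of the elements
print(a.itemsize)            # bytes used by one element
print(a.size * a.itemsize)   # total bytes, computed by hand
print(a.nbytes)              # total bytes, read directly
```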

In this example, first, we have created an array a, and then

consumed by this array a Last, we used the direct option

ndarray.nbytes to get the total memory size consumed by the

Creating the NumPy array with defined datatype

If we want to create an array with a defined datatype, we must

datatype argument with a valid NumPy type value in the np.


function while creating the NumPy array. The following coding snippets are examples of creating a NumPy array with int type:

Let’s understand the examples mentioned in this coding snippet.

Example#1: We created an array with an integer type by

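dtype=int; this will create the array with an integer type with default storage; in this case, it is a 32-bit integer (the default integer size depends on the platform and NumPy version, and is 64-bit on most current systems).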

Example#2: If we want some defined storage size of array


value (bytes) means dtype=i2, which is the same as int16, so

In a similar way, we can create arrays as needed. The

examples to create an array with float datatype
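As a rough sketch of creating arrays with a defined datatype, the snippet below uses the default integer, the 'i2' type code (2-byte integer, the same as int16), and two float types. Variable names and values are assumptions.

```python
import numpy as np

# Integer array with the platform's default integer size
array_default_int = np.array([1, 2, 3], dtype=int)

# 'i2' means a 2-byte integer, which is the same as int16
array_int16 = np.array([1, 2, 3], dtype='i2')

# Float arrays: default float (64-bit) and 'f4' (4-byte float, float32)
array_float64 = np.array([1, 2, 3], dtype=float)
array_float32 = np.array([1, 2, 3], dtype='f4')

print(array_default_int.dtype, array_int16.dtype)
print(array_float64.dtype, array_float32.dtype)
```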

The following is the coding snippet to demonstrate the array creation with boolean type:
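A minimal sketch with illustrative values: with dtype=bool, zero becomes False and any nonzero value becomes True.

```python
import numpy as np

array_bool = np.array([1, 0, 5, 0], dtype=bool)

print(array_bool)
```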


The following is the coding snippet to demonstrate the array creation with String and Unicode string type:
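A small sketch of fixed-width string types, assuming the type codes 'S3' (byte string of length 3) and 'U3' (Unicode string of length 3); the input values are made up. Values longer than the declared width are truncated.

```python
import numpy as np

# Byte strings limited to 3 characters
names_bytes = np.array([b'NumPy', b'array'], dtype='S3')

# Unicode strings limited to 3 characters
names_unicode = np.array(['NumPy', 'array'], dtype='U3')

print(names_bytes)
print(names_unicode)
```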

In this coding snippet, we can observe that we have just passed

data which have more characters than defined

The structure data type is like Struct in C, or we can think of it


several ways to define the structured data type. One of them is passing the list of tuples with (field_name, data_type).

Syntax: struct_type = numpy.dtype([(field1, field1_type), (field2, field2_type), …, (fieldn, fieldn_type)])

Here, the field type will be the valid NumPy data types like int8, int16,

The following is an example where we have created a


name and type.
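A sketch of a structured data type built from a list of (field_name, data_type) tuples. The field names 'name' and 'age' and the records are hypothetical.

```python
import numpy as np

# Hypothetical record layout: a 10-character name and a 4-byte integer age
person_type = np.dtype([('name', 'U10'), ('age', 'i4')])

people = np.array([('Ann', 30), ('Bob', 25)], dtype=person_type)

# Fields are accessed by name, like members of a C struct
print(people['name'])
print(people['age'])
```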

We often have to deal with resizing or reshaping the shape of

The following is a list of essential functions we need for daily

work:

Function/method	Description
reshape()	Returns a new array with a specified shape without modifying the data

flat() flattens the array then returns the element

same as reshape(), but resize modifies the

has been applied, or modifies the referredarray

Table 10.2: Essential functions for data analysis

See the following coding examples to understand these functions better:
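A hedged sketch of three of the tools named in the table: reshape(), the flat attribute (used here through indexing), and the in-place resize() method. The arrays are illustrative assumptions.

```python
import numpy as np

a = np.arange(6)          # [0 1 2 3 4 5]

# reshape returns an array with the new shape; a itself keeps its shape
b = a.reshape(2, 3)

# flat indexes the array as if it were one-dimensional
third = b.flat[2]

# resize changes the shape of the array it is called on
c = np.arange(6)
c.resize((3, 2))

print(a.shape, b.shape, third, c.shape)
```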


We have shown all the examples for the previously discussed

following is the output after executing these examples


This snippet has the output for all the examples which we have executed.

Inserting and deleting array element(s)

This is a common need when adding an element(s) into the


In this example, we have an existing array, a 2-D array. Now, at line#5,

we have created an array a1 by appending the values [7,8,9],

axis=1 with the append function respectively. For axis=0, we

array ‘a’ has append values at row level or axis =0 (we can see
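A minimal sketch of appending, inserting, and deleting along an axis with np.append, np.insert, and np.delete. The array a and the appended values [7, 8, 9] follow the text; the remaining names and values are assumptions and do not reproduce the original listing.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# Append a row (axis=0): the new values must have matching columns
a1 = np.append(a, [[7, 8, 9]], axis=0)

# Append a column (axis=1): one new value per existing row
a2 = np.append(a, [[7], [8]], axis=1)

# Insert a row of zeros before row index 1
a3 = np.insert(a, 1, [0, 0, 0], axis=0)

# Delete the first column
a4 = np.delete(a, 0, axis=1)

print(a1.shape, a2.shape, a3.shape, a4.shape)
```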


Joining and splitting NumPy arrays

To join two arrays or split the array, we have various functions

hstack(), vstack(), vsplit(), and many more; we will


case of a 2-D array. But we can see that we haven’t passed any axis argument in example#1; in that case, it concatenated the input arrays array_1 and array_2 along axis=0, which is the default value for the parameter axis. Please note that the input arrays must have the same shape except along the concatenation axis; otherwise, an error is raised.

Syntax: numpy.concatenate((array1, array2, …, arrayn), axis=0)
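The default and explicit axis behaviours can be sketched as follows. The names array_1 and array_2 follow the text; the values are assumptions.

```python
import numpy as np

array_1 = np.array([[1, 2],
                    [3, 4]])
array_2 = np.array([[5, 6],
                    [7, 8]])

# No axis argument: concatenation happens along axis=0 (rows)
joined_rows = np.concatenate((array_1, array_2))

# axis=1: concatenation happens along columns
joined_cols = np.concatenate((array_1, array_2), axis=1)

print(joined_rows)
print(joined_cols)
```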


In this example, we used the function hsplit(): it splits the


Here in this example, the vsplit() function splits array_input into two arrays (array_1 and array_2).
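A short sketch of both splitting functions. The names array_input, array_1, and array_2 follow the text; the 4x4 contents and the names left and right are assumptions.

```python
import numpy as np

array_input = np.arange(16).reshape(4, 4)

# vsplit cuts along rows: two arrays of shape (2, 4)
array_1, array_2 = np.vsplit(array_input, 2)

# hsplit cuts along columns: two arrays of shape (4, 2)
left, right = np.hsplit(array_input, 2)

print(array_1.shape, array_2.shape)
print(left.shape, right.shape)
```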

To understand the data and its nature, we often take the help of

information methods like mean, median, and so on. So, here in

have such functions to get the statistical information of the

are some important functions and their uses

These functions numpy.amin() and numpy.amax() give the

The following are the examples of amin() and amax()

functions:
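A minimal sketch of amin() and amax(), over the whole array and along each axis; the input values are assumptions.

```python
import numpy as np

input_arr = np.array([[3, 7, 5],
                      [8, 4, 3],
                      [2, 4, 9]])

overall_min = np.amin(input_arr)          # smallest value in the whole array
overall_max = np.amax(input_arr)          # largest value in the whole array
col_max = np.amax(input_arr, axis=0)      # maximum of each column
row_min = np.amin(input_arr, axis=1)      # minimum of each row

print(overall_min, overall_max)
print(col_max, row_min)
```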


along with the input array. But if we do not give the weights as arguments, this will be like a mean() function.
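The difference between a weighted and an unweighted average can be sketched with numpy.average(). The name input_arr follows the text; the values and weights are assumptions.

```python
import numpy as np

input_arr = np.array([1, 2, 3, 4])
weights = np.array([4, 3, 2, 1])

# Weighted average: sum(value * weight) / sum(weights)
weighted = np.average(input_arr, weights=weights)

# Without weights, average behaves like mean
unweighted = np.average(input_arr)

print(weighted, unweighted)
```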

In this example, we have an array input_arr as input array and


The following coding snippet demonstrates the function

numpy.percentile():
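A brief sketch, using illustrative values and NumPy's default linear interpolation:

```python
import numpy as np

input_arr = np.array([10, 20, 30, 40, 50])

p25 = np.percentile(input_arr, 25)   # 25th percentile
p50 = np.percentile(input_arr, 50)   # 50th percentile, the median

print(p25, p50)
```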


Numeric operations in NumPy

NumPy also has various numeric functions to process the mentioned operations like addition, subtraction, division, and so on. The

some essential numeric options in NumPy:

We have functions in NumPy to add, subtract, multiply, and


For these coding examples, the following is the output snippet:


In this example, we can see each element of the input array rise


array. In the case of an integer type array, it will return 0 if the array

greater than 1; the following coding snippet demonstrates

understand this function:

In Example#2, we can see the type of input array is int, so it

for the array elements greater than 1
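The numeric operations discussed in this section can be sketched as below, assuming the functions in question are np.add, np.subtract, np.multiply, np.divide, np.power, and np.reciprocal; the input values are made up. Note how reciprocal() on an integer array yields 0 for elements greater than 1.

```python
import numpy as np

x = np.array([1, 2, 4])
y = np.array([2, 2, 2])

total = np.add(x, y)             # element-wise addition
difference = np.subtract(x, y)   # element-wise subtraction
product = np.multiply(x, y)      # element-wise multiplication
quotient = np.divide(x, y)       # element-wise true division (float result)
squares = np.power(x, 2)         # each element raised to the power 2

# reciprocal on integers truncates, so elements greater than 1 give 0
recip_int = np.reciprocal(x)
recip_float = np.reciprocal(x.astype(float))

print(total, difference, product, quotient, squares)
print(recip_int, recip_float)
```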

Describing a matrix in NumPy

We can describe the shape, size, and number of dimensions of a matrix with the shape, size, and ndim attributes.


for loop over the elements and does not improve performance. Furthermore, NumPy arrays allow us to perform operations between arrays even when their dimensions differ (a process called broadcasting). For example, we can create a much simpler version of the solution above as follows:

# Add 100 to all elements

matrix + 100
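A self-contained sketch of broadcasting, first with a scalar and then with a row vector; the matrix and row values are assumptions.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# The scalar 100 is broadcast to every element
plus_hundred = matrix + 100

# A row of length 3 is broadcast across all three rows
row = np.array([10, 20, 30])
plus_row = matrix + row

print(plus_hundred)
print(plus_row)
```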

Finding the maximum and minimum values in an array

We can also find the maximum and minimum values of an array with the max and min functions.


# Find maximum element in each row

np.max(matrix, axis=1)

array([3, 6, 9])

Calculating the mean, variance, and standard deviation of an array

With the average(), var(), and std() functions we can calculate the mean, variance, and standard deviation of a NumPy array.
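A minimal sketch on a 3x3 matrix holding the values 1 to 9 (an assumption consistent with the other examples in this chapter):

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

mean = np.average(matrix)    # arithmetic mean of all elements
variance = np.var(matrix)    # population variance
std_dev = np.std(matrix)     # population standard deviation

print(mean, variance, std_dev)
```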


Reshaping an array

We can change the shape of an array with the reshape function.

With reshape(1, -1) we create a matrix with

matrix.reshape(12)

array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])


Transposing a vector or matrix

We can use the T attribute to transpose a NumPy vector or matrix.

Therefore, a vector is commonly transposed in order to convert a row vector into a column vector (note the second pair of brackets), or vice versa:
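The effect of the second pair of brackets can be seen in this small sketch (values are illustrative): a one-dimensional vector is unchanged by T, while a two-dimensional row vector becomes a column vector.

```python
import numpy as np

# One-dimensional vector: transposing has no effect on its shape
flat_vector = np.array([1, 2, 3])
flat_transposed = flat_vector.T

# Row vector with a second pair of brackets: shape (1, 3)
row_vector = np.array([[1, 2, 3]])

# Transposing gives a column vector of shape (3, 1)
column_vector = row_vector.T

print(flat_transposed.shape, row_vector.shape, column_vector.shape)
```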

# Transpose row vector


Flattening is a simple way to convert a matrix into a one-dimensional array. Alternatively, we can use reshape to create a row vector:

matrix.reshape(1, -1)

array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])

Another common way to flatten arrays is the ravel method. Unlike flatten, which returns a copy of the original array, ravel operates on the original object itself (returning a view where possible) and is therefore slightly faster. It also lets us flatten a list of arrays, which we cannot do with the flatten method. This operation is useful for flattening very large arrays and speeding up code:

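# Create two matrices (values inferred from the output below)
matrix_a = np.array([[1, 2],
                     [3, 4]])
matrix_b = np.array([[5, 6],
                     [7, 8]])

# Create a list of matrices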

matrix_list = [matrix_a, matrix_b]

# Flatten the entire list of matrices

np.ravel(matrix_list)

array([1, 2, 3, 4, 5, 6, 7, 8])
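The copy-versus-view distinction between flatten and ravel can be checked with np.shares_memory. This is a sketch with assumed values; ravel returns a view here because the array is contiguous.

```python
import numpy as np

matrix = np.array([[1, 2],
                   [3, 4]])

flat_copy = matrix.flatten()   # always a new copy
flat_view = matrix.ravel()     # a view of the same data when possible

print(np.shares_memory(matrix, flat_copy))
print(np.shares_memory(matrix, flat_view))
```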

Finding the rank of a matrix

You need to know the rank of a matrix. You can use NumPy's linear algebra method matrix_rank:


of a matrix is easy in NumPy thanks to the matrix_rank function.
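A minimal sketch using np.linalg.matrix_rank. The matrix is an assumption; its first two columns are identical, so only two columns are linearly independent.

```python
import numpy as np

matrix = np.array([[1, 1, 1],
                   [1, 1, 10],
                   [1, 1, 15]])

# Number of linearly independent rows (or columns)
rank = np.linalg.matrix_rank(matrix)

print(rank)
```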

Getting the diagonal of a matrix

You need to get the diagonal elements of a matrix. We use NumPy's diagonal method:

NumPy makes getting the diagonal elements of a matrix easy with the diagonal method. It is also possible to get a diagonal off the main diagonal by using the offset parameter:

# Return diagonal one above the main diagonal


# Return diagonal and sum elements
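A sketch of diagonal(), its offset parameter, and trace() for summing the diagonal elements; the matrix values are assumptions.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [2, 4, 6],
                   [3, 8, 9]])

main_diagonal = matrix.diagonal()             # main diagonal
above_diagonal = matrix.diagonal(offset=1)    # one above the main diagonal
below_diagonal = matrix.diagonal(offset=-1)   # one below the main diagonal
diagonal_sum = matrix.trace()                 # sum of the main diagonal

print(main_diagonal, above_diagonal, below_diagonal, diagonal_sum)
```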


where a_i is the i-th element of vector a and b_i is the i-th element of vector b. We can use the dot function or, in Python 3.5 and later, the @ operator.
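As a small sketch, the dot product (the sum of the products a_i * b_i) can be computed either way; the vectors are illustrative assumptions.

```python
import numpy as np

vector_a = np.array([1, 2, 3])
vector_b = np.array([4, 5, 6])

# 1*4 + 2*5 + 3*6
dot_with_function = np.dot(vector_a, vector_b)
dot_with_operator = vector_a @ vector_b

print(dot_with_function, dot_with_operator)
```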

# Calculate dot product


array([[2, 5],

[3, 7]])

Alternatively, we can use the @ operator in Python 3.5 and later.
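A sketch of matrix multiplication with both np.dot and @. The input matrices are assumptions chosen so that the product matches the output shown above.

```python
import numpy as np

matrix_a = np.array([[1, 1],
                     [1, 2]])
matrix_b = np.array([[1, 3],
                     [1, 2]])

product_dot = np.dot(matrix_a, matrix_b)   # function form
product_at = matrix_a @ matrix_b           # operator form

print(product_dot)
print(product_at)
```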

# Multiply two matrices

You want to calculate the inverse of a square matrix. We use NumPy's linear algebra inv method:


The inverse of a square matrix $A$ is a second matrix $A^{-1}$ such that:

$A A^{-1} = I$

where $I$ is the identity matrix. In NumPy we can use linalg.inv to calculate $A^{-1}$ if it exists. To see this in action, we multiply a matrix by its inverse, and the result is the identity matrix:

# Multiply matrix and its inverse

matrix @ np.linalg.inv(matrix)

array([[ 1., 0.],

[ 0., 1.]])
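A self-contained sketch of the same check. The matrix is an assumption (any invertible square matrix would do), and np.allclose is used because floating-point arithmetic rarely gives an exact identity matrix.

```python
import numpy as np

matrix = np.array([[1, 4],
                   [2, 5]])

# Calculate the inverse (raises LinAlgError if the matrix is singular)
inverse = np.linalg.inv(matrix)

# Multiplying a matrix by its inverse gives the identity matrix
product = matrix @ inverse

print(inverse)
print(np.allclose(product, np.eye(2)))
```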

Generating random values

You want to generate pseudorandom values. The solution is to use NumPy's random module, seeded with random.seed:

# Generate three random integers between 0 and 10

np.random.randint(0, 11, 3)

array([3, 7, 9])


Alternatively, we can generate numbers by drawing them from a distribution (note that, technically, these values are pseudorandom rather than truly random):

# Draw three numbers from a normal distribution with mean 0.0

# and standard deviation of 1.0

so that the code you see in the book and the code you run on your computer produce the same results.
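A brief sketch of why seeding matters: resetting the same seed reproduces the same draws. The seed value 0 is an assumption.

```python
import numpy as np

# Set the seed, then draw three integers between 0 and 10
np.random.seed(0)
first_draw = np.random.randint(0, 11, 3)

# Reset the same seed: the same three integers come out again
np.random.seed(0)
second_draw = np.random.randint(0, 11, 3)

# Draw three numbers from a normal distribution (mean 0.0, std 1.0)
normal_draw = np.random.normal(0.0, 1.0, 3)

print(first_draw, second_draw)
print(normal_draw)
```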

Title	Working with Vectors, Matrices, and Arrays in NumPy
School	University of Science
Major	Machine Learning
Category	Document
City	Ho Chi Minh
