I have read a number of articles that explain why the rank (number of dimensions) of a NumPy array is reduced by 1 after slicing (subsetting). I didn't quite understand (or accept) the explanations.
After some exploration, I found the answer. To explain it, we need to look at how NumPy is designed, specifically how indexing and slicing work.
The theory, as stated in the NumPy documentation on indexing:
An integer, i, returns the same values as i:i+1 except the dimensionality of the returned object is reduced by 1. In particular, a selection tuple with the p-th element an integer (and all other entries : ) returns the corresponding sub-array with dimension N - 1.
The theory above is quite technical 😅 Let me explain it with an example.
import numpy as np
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
Output of print(a):
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
Example: Python NumPy array. Why does its rank reduce by 1 after slicing / subsetting?
The Python program above creates a 2D array (i.e. its rank, or number of dimensions, is 2).
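To confirm the rank before slicing anything, you can ask the array itself. This is a small sketch that rebuilds the same array `a` and reads its `ndim` and `shape` attributes.

```python
import numpy as np

# Same 3 x 4 array as in the example above
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

print(a.ndim)   # 2  (the rank, i.e. number of dimensions)
print(a.shape)  # (3, 4)  (3 rows, 4 columns)
```

I will use `ndim` in the rest of the examples to check the rank of each result.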
How to slice a NumPy array?
- Syntax: a[ row range , column range ]
- Example: a[ 2:3, 0:1 ]
In NumPy, each entry inside the square brackets, such as 2:3 or 0:1, is an index (these two are slices, which are one kind of index). I will use the term index throughout this explanation.
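As a quick illustration of the syntax, the sketch below runs the example `a[2:3, 0:1]` and also writes the same selection with Python's built-in `slice` objects, which is what the colon notation stands for. The variable names are mine, chosen only for this illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Row range 2:3 (only row 2) and column range 0:1 (only column 0)
with_colons = a[2:3, 0:1]

# The same selection spelled out with explicit slice objects
with_slice_objects = a[slice(2, 3), slice(0, 1)]

print(with_colons)        # [[9]]
print(with_colons.shape)  # (1, 1)
print(np.array_equal(with_colons, with_slice_objects))  # True
```

Both indexes are ranges here, so the result is still a 2D array, even though it holds a single value.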
Example 1
If you do slicing as follows:
a[2, :]
Output is: [ 9 10 11 12]
The rank (dimension) of the array is reduced by 1 (i.e. the 2D array is reduced to a 1D array). Why? Because the theory above states that if an index is an integer (here, 2), the rank is reduced by 1.
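The theory says the integer can sit in any position of the index tuple (the "p-th element"). The sketch below checks this for both positions of our 2D array; `row` and `col` are names I made up for the illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row = a[2, :]  # integer in the first position: picks row 2
col = a[:, 1]  # integer in the second position: picks column 1

print(row, row.ndim, row.shape)  # [ 9 10 11 12] 1 (4,)
print(col, col.ndim, col.shape)  # [ 2  6 10] 1 (3,)
```

In both cases one integer index removes exactly one dimension.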
Example 2
If you do slicing as follows:
a[ 2:3 , : ]
Output is: [[ 9 10 11 12]]
The indexes in the code above are 2:3 and : . Neither of them is an integer. Hence, the rank of the array stays at 2 (i.e. it is still a 2D array).
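The first sentence of the theory says that an integer i returns the same values as i:i+1, only with one dimension fewer. Here is a small sketch comparing the two forms side by side; the variable names are only for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

sliced = a[2:3, :]  # range index: rank stays at 2
indexed = a[2, :]   # integer index: rank drops to 1

print(sliced.shape)   # (1, 4)
print(indexed.shape)  # (4,)

# Same values, just wrapped in one extra dimension
print(np.array_equal(sliced[0], indexed))  # True
```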
Example 3
If you do slicing as follows:
a[ [2] , : ]
Output is: [[ 9 10 11 12]]
The indexes in the code above are [2] and : . Neither of them is an integer. Hence, the rank of the array stays at 2 (i.e. it is still a 2D array).
Note that [2] is not an integer. It is a list containing one integer, and using it as an index is a method known as "integer array indexing".
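To make the difference between 2 and [2] concrete, the sketch below indexes with a one-element list and then with a two-element list. With a list index, the length of the list decides how many rows come back, and the dimension is kept. The names `one_row` and `two_rows` are mine.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

one_row = a[[2], :]      # list with one row number
two_rows = a[[0, 2], :]  # list with two row numbers

print(one_row.shape)   # (1, 4)
print(two_rows.shape)  # (2, 4)
print(two_rows)
# [[ 1  2  3  4]
#  [ 9 10 11 12]]
```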
Example 4
If you do indexing as follows:
a[ 0 , 0 ]
Output is: 1 (It is a scalar value, neither a 1D nor a 2D array.)
Based on the theory, since both indexes (0 and 0) are integers, the rank of the array is reduced by 2. In other words, the 2D array is reduced to a scalar value, as illustrated below.
2D array → 1D array → scalar
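The chain above can be traced one integer at a time. In this sketch I use `np.ndim`, which also accepts scalars, to read the rank at each step; `step1` and `step2` are illustrative names.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

step1 = a[0]      # first integer: 2D -> 1D (same as a[0, :])
step2 = step1[0]  # second integer: 1D -> scalar
both = a[0, 0]    # both integers at once

print(np.ndim(a), np.ndim(step1), np.ndim(step2))  # 2 1 0
print(step2 == both)  # True
```

A rank of 0 is how NumPy reports a scalar.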
Supplementary note:
Ah ha! 😁 a[0, 0] is something we are very familiar with from conventional programming! With two integers as indexes, it accesses a single element (a scalar) in the NumPy array. Now it makes sense why an integer reduces the rank (or dimension) of an array. If you go in the reverse direction, upgrading each integer to a range, the rank increases, as follows:
scalar → 1D array → 2D array. The following is the detailed explanation.
- Two integer indexes: a[ 0 , 0 ] → output: 1 (scalar)
- Upgrade an integer to a range: a[ 2 , : ] → output: [ 9 10 11 12] (1D array)
- Upgrade another integer to a range: a[ 2:3 , : ] → output: [[ 9 10 11 12]] (2D array)
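The three steps above follow one simple rule: resulting rank = original rank minus the number of integer indexes. Below is a rough sketch that checks this rule for the three cases. The helper `count_integer_indexes` is hypothetical (it is not part of NumPy), and the rule as coded here assumes the indexes are only integers and ranges.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

def count_integer_indexes(index):
    """Hypothetical helper: count how many entries of the index tuple are plain integers."""
    return sum(isinstance(i, (int, np.integer)) for i in index)

# The same three indexes as in the list above, written as tuples
cases = [
    (0, 0),                        # a[0, 0]
    (2, slice(None)),              # a[2, :]
    (slice(2, 3), slice(None)),    # a[2:3, :]
]

ranks = []
for index in cases:
    actual = np.ndim(a[index])
    expected = a.ndim - count_integer_indexes(index)
    ranks.append((actual, expected))
    print(actual, expected)
# 0 0
# 1 1
# 2 2
```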