How to Normalize a Numpy Array to Within a Certain Range

Scale Numpy array to certain range

After asking on CodeReview, I was informed there is a built-in np.interp that accomplishes this:

np.interp(a, (a.min(), a.max()), (-1, +1))
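
As a quick check, here is a minimal sketch of that call on a small made-up array. The sample values and the name `scaled` are illustrative only, and it assumes the array holds at least two distinct values so that the input range is not empty.

```python
import numpy as np

# Illustrative input; any numeric array with at least two distinct values works.
a = np.array([-2.0, 0.0, 2.0, 6.0])

# Map the interval [a.min(), a.max()] linearly onto [-1, +1].
scaled = np.interp(a, (a.min(), a.max()), (-1, +1))
print(scaled)
```

The smallest element lands on -1 and the largest on +1, with everything else placed proportionally in between.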

I've left my old answer below for the sake of posterity.

I made my own function based off of the D3.js code in this answer:

import numpy as np

def d3_scale(dat, out_range=(-1, 1)):
    domain = [np.min(dat, axis=0), np.max(dat, axis=0)]

    def interp(x):
        return out_range[0] * (1.0 - x) + out_range[1] * x

    def uninterp(x):
        # Width of the input domain; fall back to 1.0 for constant data
        # so the division below never divides by zero.
        b = domain[1] - domain[0]
        if b == 0:
            b = 1.0
        return (x - domain[0]) / b

    return interp(uninterp(dat))

# np.float was removed in NumPy 1.24; the built-in float is equivalent here
print(d3_scale(np.array([-2, 0, 2], dtype=float)))
print(d3_scale(np.array([-3, -2, -1], dtype=float)))

Min-max normalisation of a NumPy array

Following the Cross Validated question "How to normalize data to 0-1 range?", you can apply min-max normalisation to the last column of foo:

v = foo[:, 1]   # foo[:, -1] for the last column
foo[:, 1] = (v - v.min()) / (v.max() - v.min())

foo

array([[ 0.        ,  0.        ],
       [ 0.13216   ,  0.06609523],
       [ 0.25379   ,  1.        ],
       [ 0.30874   ,  0.09727968]])

Another option (as suggested by the OP) is sklearn.preprocessing.normalize. With norm='max' it divides each column by its maximum absolute value without subtracting the minimum first, so the results differ slightly:

from sklearn.preprocessing import normalize
foo[:, [-1]] = normalize(foo[:, -1, None], norm='max', axis=0)

foo

array([[ 0.        ,  0.2378106 ],
       [ 0.13216   ,  0.28818769],
       [ 0.25379   ,  1.        ],
       [ 0.30874   ,  0.31195614]])
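
To see where the difference between the two approaches comes from, here is a small NumPy-only sketch. The column values are made up, and the max-norm line is a hand-written equivalent of what norm='max' does for a single column, not a call into scikit-learn.

```python
import numpy as np

# Hypothetical column whose values do not start at zero.
v = np.array([2.0, 4.0, 10.0, 6.0])

# Min-max normalisation: the smallest value maps to 0, the largest to 1.
min_max = (v - v.min()) / (v.max() - v.min())

# Max-norm scaling: divide by the largest absolute value,
# without subtracting the minimum first.
max_norm = v / np.abs(v).max()

print(min_max)
print(max_norm)
```

Both map the largest value to 1, but only min-max normalisation pins the smallest value to 0.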

How can a numpy.ndarray be normalized?

Vectorized is much faster than iterative

If you want to scale the pixel values of all your images using numpy arrays only, you may want to keep the vectorized nature of the operation (by avoiding loops).

Here is a way to scale your images:

# Getting min and max per image
maxis = images.max(axis=(1,2,3))
minis = images.min(axis=(1,2,3))
# Scaling without any loop
scaled_images = ((images.T - minis) / (maxis - minis) * 255).T
# timeit > 178 µs ± 1.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

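The .T transposes are needed so that the subtraction and division broadcast correctly: transposing moves the image axis last, where it lines up with the one-dimensional minis and maxis arrays.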

We can check if this is correct:

print((scaled_images.min(axis=(1,2,3)) == 0).all())
# > True
print((scaled_images.max(axis=(1,2,3)) == 255).all())
# > True
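
If the double transpose feels awkward, one alternative is to let NumPy keep the reduced axes with keepdims=True. The sketch below assumes a batch shaped (n_images, height, width, channels); the random batch is a stand-in for real data.

```python
import numpy as np

# Hypothetical batch: 4 images, 8x8 pixels, 3 channels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 8, 8, 3)).astype(np.float64)

# keepdims=True leaves the reduced axes with length 1, giving shape (4, 1, 1, 1),
# which broadcasts against (4, 8, 8, 3) without any transposes.
minis = images.min(axis=(1, 2, 3), keepdims=True)
maxis = images.max(axis=(1, 2, 3), keepdims=True)

scaled_images = (images - minis) / (maxis - minis) * 255
```

This should give the same values as the transposed version; which one reads better is largely a matter of taste.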

Scaling into the [0, 1] range

If you want pixel values between 0 and 1, simply drop the multiplication by 255:

scaled_images = ((images.T - minis) / (maxis - minis)).T

Only with numpy arrays and such

You must also make sure you are working with a numpy array in the first place, not a list:

import numpy as np
images = np.array(images)

OpenCV
On-the-go scaling

Since you are using opencv to read your images one by one, you can normalize your images on the go with it:

import os
import cv2

inputPath='E:/Notebooks/data'

max_scale = 1   # or 255 if needed
# Load each image and min-max normalize it in the same pass
images = [cv2.normalize(
    cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)),
    None, 0, max_scale, cv2.NORM_MINMAX)
    for filepath in os.listdir(inputPath)]

Make sure the folder actually contains images: cv2.imread returns None for any file it cannot read, so skip those entries:

inputPath='E:/Notebooks/data'
images = []

max_scale = 1   # or 255 if needed

# Load in the images 
for filepath in os.listdir(inputPath):
    image = cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH))
    # Scale and append the list if it is an image
    if image is not None:
        images.append(cv2.normalize(image, None, 0, max_scale, cv2.NORM_MINMAX))

Bug in OpenCV versions prior to 3.4

As reported here, OpenCV's normalize method has a bug that can produce values below the alpha parameter. It was fixed in version 3.4.

Here is a way to scale images on the go with older versions of OpenCV:

def custom_scale(img, max_scale=1):
    mini = img.min()
    return (img - mini) / (img.max() - mini) * max_scale

max_scale = 1   # or 255 if needed

images = [custom_scale(
    cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)), max_scale)
    for filepath in os.listdir(inputPath)]
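
One edge case worth guarding against: a completely flat image makes the denominator zero. The sketch below is a variant of custom_scale that handles it; the name safe_scale and the choice to return all zeros for a flat image are assumptions for illustration, not part of the original answer.

```python
import numpy as np

def safe_scale(img, max_scale=1):
    """Min-max scale img to [0, max_scale]; a flat image maps to all zeros."""
    img = np.asarray(img, dtype=np.float64)
    mini = img.min()
    span = img.max() - mini
    if span == 0:
        # Every pixel is identical, so there is no range to stretch.
        return np.zeros_like(img)
    return (img - mini) / span * max_scale

flat = safe_scale([[5, 5], [5, 5]])
stretched = safe_scale([[0, 50], [100, 200]], max_scale=255)
```

It can be dropped into the list comprehension above in place of custom_scale.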

How to scale a numpy array from 0 to 1 with overshoot?

First, convert the DataFrame column to a numpy array:

import numpy as np
T = np.array(df['Temp'])

Then scale it to a [0, 1] interval:

def scale(A):
    return (A-np.min(A))/(np.max(A) - np.min(A))

T_scaled = scale(T)

Then map it onto whatever interval you want, e.g. [55, 100] (offset 55, width 100 - 55 = 45):

T2 = 55 + 45*T_scaled
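
The two steps can also be folded into one helper. This is a minimal sketch: the name rescale and its lo/hi parameters are hypothetical, the temperatures are made up, and it assumes the input is not constant.

```python
import numpy as np

def rescale(A, lo, hi):
    """Linearly map A so that min(A) -> lo and max(A) -> hi."""
    A = np.asarray(A, dtype=float)
    unit = (A - A.min()) / (A.max() - A.min())   # first scale to [0, 1]
    return lo + (hi - lo) * unit                 # then stretch and shift

# Made-up temperatures standing in for df['Temp'].
T = np.array([10.0, 20.0, 30.0])
T2 = rescale(T, 55, 100)
print(T2)
```

With lo=55 and hi=100 this reproduces the 55 + 45*T_scaled expression above.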

I'm sure this can be done within Pandas too (but I'm not familiar with it); df.apply() might be a place to start.

How To Normalize Array Between 1 and 10?

Your target range is actually 9 wide: from 1 to 10. If you multiply the normalized array by 9 you get values from 0 to 9, which you then need to shift up by 1:

start = 1
end = 10
width = end - start
res = (arr - arr.min())/(arr.max() - arr.min()) * width + start

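Note that the denominator is available as a numpy built-in, np.ptp() ("peak to peak"). The method form arr.ptp() was removed in NumPy 2.0, so the function form is the safer choice:

res = (arr - arr.min())/np.ptp(arr) * width + start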
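
For a concrete check, here is a short sketch with a made-up array. It uses the function form np.ptp(arr), which works on both older and newer NumPy releases.

```python
import numpy as np

# Illustrative data; the peak-to-peak range is 19 - 3 = 16.
arr = np.array([3.0, 7.0, 11.0, 19.0])

start = 1
end = 10
width = end - start

# Normalize to [0, 1], stretch to the target width, then shift to the start.
res = (arr - arr.min()) / np.ptp(arr) * width + start
print(res)
```

The minimum of arr maps to start and the maximum to end, as intended.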