Scale a NumPy array to a certain range
After asking on CodeReview, I was informed there is a built-in np.interp
that accomplishes this:
np.interp(a, (a.min(), a.max()), (-1, +1))
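As a minimal sketch of the call above, the following maps a small sample array (made up for illustration) onto [-1, +1]. The second argument gives the input interval and the third the output interval; values in between are interpolated linearly.

```python
import numpy as np

# Hypothetical sample data, chosen so the result is easy to check by hand
a = np.array([-2.0, 0.0, 2.0, 6.0])

# Map [a.min(), a.max()] = [-2, 6] linearly onto [-1, +1]
scaled = np.interp(a, (a.min(), a.max()), (-1, +1))
print(scaled)  # values: -1, -0.5, 0, 1
```

Any other target interval works the same way, e.g. `(0, 255)` instead of `(-1, +1)`.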
I've left my old answer below for the sake of posterity.
I made my own function based on the D3.js code in this answer:
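import numpy as np

def d3_scale(dat, out_range=(-1, 1)):
    # Input domain: smallest and largest value of the data
    domain = [np.min(dat, axis=0), np.max(dat, axis=0)]

    def interp(x):
        # Map x from [0, 1] onto out_range
        return out_range[0] * (1.0 - x) + out_range[1] * x

    def uninterp(x):
        # Map x from the input domain onto [0, 1]
        if (domain[1] - domain[0]) != 0:
            b = domain[1] - domain[0]
        else:
            # Degenerate domain (all values equal)
            b = 1.0 / domain[1]
        return (x - domain[0]) / b

    return interp(uninterp(dat))

# np.float was removed in NumPy 1.24; the built-in float is equivalent
print(d3_scale(np.array([-2, 0, 2], dtype=float)))
print(d3_scale(np.array([-3, -2, -1], dtype=float)))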
Min-max normalisation of a NumPy array
Referring to the Cross Validated question "How to normalize data to 0-1 range?", it looks like you can perform min-max normalisation on the last column of foo:
v = foo[:, 1] # foo[:, -1] for the last column
foo[:, 1] = (v - v.min()) / (v.max() - v.min())
foo
array([[ 0. , 0. ],
[ 0.13216 , 0.06609523],
[ 0.25379 , 1. ],
[ 0.30874 , 0.09727968]])
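The same formula extends to every column at once. The sketch below is an illustration rather than part of the original answer: `minmax_columns` and the sample `data` are hypothetical names, and the guard for constant columns (where max equals min and the formula would divide by zero) is an added assumption about how such columns should be handled.

```python
import numpy as np

def minmax_columns(arr):
    """Min-max scale each column of a 2-D array to [0, 1]."""
    arr = np.asarray(arr, dtype=float)
    col_min = arr.min(axis=0)
    col_range = arr.max(axis=0) - col_min
    # Constant columns have a range of 0; divide by 1 instead so they map to 0
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (arr - col_min) / safe_range

# Hypothetical data: the third column is constant
data = np.array([[1.0, 10.0, 5.0],
                 [2.0, 20.0, 5.0],
                 [3.0, 40.0, 5.0]])
result = minmax_columns(data)
print(result)
```

Each non-constant column of `result` runs from 0 to 1, and the constant column becomes all zeros instead of NaN.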
Another option for performing normalisation (as suggested by OP) is sklearn.preprocessing.normalize. With norm='max' it divides the column by its largest absolute value without first subtracting the minimum, so it yields slightly different results:
from sklearn.preprocessing import normalize
foo[:, [-1]] = normalize(foo[:, -1, None], norm='max', axis=0)
foo
array([[ 0. , 0.2378106 ],
[ 0.13216 , 0.28818769],
[ 0.25379 , 1. ],
[ 0.30874 , 0.31195614]])
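To see why the two results differ, the following NumPy-only sketch compares both formulas on a made-up vector. It assumes the usual definition of max normalisation (division by the largest absolute value) and does not call scikit-learn itself.

```python
import numpy as np

# Hypothetical column of values
v = np.array([2.0, 4.0, 10.0])

# Min-max normalisation: the smallest value maps to 0, the largest to 1
minmax = (v - v.min()) / (v.max() - v.min())

# Max normalisation: divide by the largest absolute value only,
# so the smallest value does not map to 0 unless it is 0 already
maxabs = v / np.abs(v).max()

print(minmax)  # values: 0, 0.25, 1
print(maxabs)  # values: 0.2, 0.4, 1
```

Both map the largest value to 1, but only min-max normalisation pins the smallest value to 0.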
How can a numpy.ndarray be normalized?
Vectorized is much faster than iterative
If you want to scale the pixel values of all your images using numpy arrays only, you may want to keep the operation vectorized (by avoiding loops).
Here is a way to scale your images:
# Getting min and max per image
maxis = images.max(axis=(1,2,3))
minis = images.min(axis=(1,2,3))
# Scaling without any loop
scaled_images = ((images.T - minis) / (maxis - minis) * 255).T
# timeit > 178 µs ± 1.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
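The transposes (.T) are needed for the subtraction to broadcast correctly: transposing moves the image axis to the last position, where it lines up with the one-value-per-image minis and maxis arrays.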
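An alternative that avoids the transposes is to pass `keepdims=True` to `min` and `max`, so the reduced axes are kept with length 1 and broadcasting works directly. This is a sketch on random data; the array shape `(4, 8, 8, 3)` (images, height, width, channels) is an assumption made for illustration.

```python
import numpy as np

# Hypothetical batch: 4 images of 8x8 pixels with 3 channels
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 8, 8, 3)).astype(float)

# keepdims=True gives shape (4, 1, 1, 1), which broadcasts against images
minis = images.min(axis=(1, 2, 3), keepdims=True)
maxis = images.max(axis=(1, 2, 3), keepdims=True)

# Scale each image to [0, 255] without transposing
scaled_images = (images - minis) / (maxis - minis) * 255
```

The result should match the transpose-based version; which form to use is mostly a matter of readability.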
We can check if this is correct:
print((scaled_images.min(axis=(1,2,3)) == 0).all())
# > True
print((scaled_images.max(axis=(1,2,3)) == 255).all())
# > True
Scaling into the [0, 1] range
If you want pixel values between 0 and 1, simply remove the multiplication by 255:
scaled_images = ((images.T - minis) / (maxis - minis)).T
Only with numpy arrays
You must also make sure you are handling a numpy array in the first place, not a list:
import numpy as np
images = np.array(images)
OpenCV
On-the-go scaling
Since you are using OpenCV to read your images one by one, you can normalize them on the go with it:
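import os
import cv2

inputPath = 'E:/Notebooks/data'
max_scale = 1 # or 255 if needed
# Note: the output keeps the input dtype, so for 8-bit images with
# max_scale = 1 you may want to pass dtype=cv2.CV_32F to cv2.normalize
# Load in the images
images = [cv2.normalize(
              cv2.imread(inputPath + '/{0}'.format(filepath), flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)),
              None, 0, max_scale, cv2.NORM_MINMAX)
          for filepath in os.listdir(inputPath)]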
Make sure you have images in the folder
inputPath = 'E:/Notebooks/data'
images = []
max_scale = 1 # or 255 if needed
# Load in the images
for filepath in os.listdir(inputPath):
    image = cv2.imread(inputPath + '/{0}'.format(filepath), flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH))
    # Scale and append to the list only if the file was read as an image
    if image is not None:
        images.append(cv2.normalize(image, None, 0, max_scale, cv2.NORM_MINMAX))
Bug in OpenCV versions prior to 3.4
As reported here, there is a bug in OpenCV's normalize method that produces values below the alpha parameter. It was fixed in version 3.4.
Here is a way to scale images on the go with older versions of OpenCV:
def custom_scale(img, max_scale=1):
    mini = img.min()
    return (img - mini) / (img.max() - mini) * max_scale

max_scale = 1 # or 255 if needed
images = [custom_scale(
              cv2.imread(inputPath + '/{0}'.format(filepath), flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)), max_scale)
          for filepath in os.listdir(inputPath)]
How to scale a numpy array from 0 to 1 with overshoot?
First, convert the DataFrame column to a NumPy array:
import numpy as np
T = np.array(df['Temp'])
Then scale it to a [0, 1] interval:
def scale(A):
    return (A - np.min(A)) / (np.max(A) - np.min(A))
T_scaled = scale(T)
Then map it onto any interval you want, e.g. [55, 100]. The offset is the lower bound (55) and the factor is the width of the interval (100 - 55 = 45):
T2 = 55 + 45*T_scaled
I'm sure that this can be done within Pandas too (but I'm not familiar with it). Perhaps you might study Pandas df.apply()
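The two steps above (scale to [0, 1], then stretch and shift to the target interval) can be combined into one helper. This is a sketch only: `rescale` is a hypothetical name, and the temperature values are made up so the result is easy to verify by hand.

```python
import numpy as np

def rescale(A, lo, hi):
    """Linearly map the values of A onto the interval [lo, hi]."""
    A = np.asarray(A, dtype=float)
    # Step 1: scale to [0, 1]
    unit = (A - A.min()) / (A.max() - A.min())
    # Step 2: stretch by the interval width and shift to the lower bound
    return lo + (hi - lo) * unit

# Hypothetical temperatures standing in for df['Temp']
T = np.array([10.0, 15.0, 20.0, 30.0])
T2 = rescale(T, 55, 100)
print(T2)  # values: 55, 66.25, 77.5, 100
```

With `lo=55` and `hi=100` this reproduces `55 + 45*T_scaled` from the answer.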
How To Normalize Array Between 1 and 10?
Your range is actually 9 long: from 1 to 10. If you multiply the normalized array by 9 you get values from 0 to 9, which you then need to shift up by 1:
start = 1
end = 10
width = end - start
res = (arr - arr.min())/(arr.max() - arr.min()) * width + start
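Note that the denominator here has a numpy built-in, np.ptp ("peak to peak"). The method form arr.ptp() was removed in NumPy 2.0, so the function form is the safer choice:
res = (arr - arr.min())/np.ptp(arr) * width + start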
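As a quick check of the formula, here is a small worked example using the function form `np.ptp(arr)`, which is available in both older and newer NumPy releases. The input values are made up for illustration.

```python
import numpy as np

# Hypothetical input values
arr = np.array([3.0, 7.0, 11.0])

start = 1
end = 10
width = end - start  # 9

# np.ptp(arr) is arr.max() - arr.min(), here 8
res = (arr - arr.min()) / np.ptp(arr) * width + start
print(res)  # values: 1, 5.5, 10
```

The smallest input lands on `start` and the largest on `end`, with everything else spaced proportionally in between.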