Divide all columns by the value from the 2nd column - apply for all rows
We can make the lengths equal by replicating the second column, then dividing the subset of the dataset that excludes the first and second columns by it
df1[-(1:2)] <- df1[-(1:2)]/df1[,2][row(df1[-(1:2)])]
df1
# Name Col dKO1 dKO2 sdi1
#29 Mark 1769380098 0.8674967 0.9201740 0.8735108
#30 Anders 1444462500 1.2425947 1.2336649 1.2105541
#1278 Tom 1499146688 1.5293111 1.1068905 1.1640133
#1295 Vin 1276309375 0.6705163 0.5807531 1.2195172
#1296 Marcel 22279500 0.9836621 1.8511187 NaN
#1297 Tyta 3114023471 0.9813868 0.9098608 1.1405553
#1298 Gerta 2961012500 1.2097011 1.2412815 1.0496874
#1307 Moses 3978937424 0.9467125 0.9029171 0.9344295
#1642 Hank 1703925000 1.1991725 0.9310929 0.8100584
#1674 Rita 1838885550 1.1614969 1.0520367 1.2059076
#1754 Margary 1483386250 0.9990865 0.9891363 0.6922093
Divide multiple columns by another column in pandas
I believe both of these work:
df[['B','C']].div(df.A, axis=0)
df.iloc[:,1:].div(df.A, axis=0)
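A minimal sketch of the two calls above on a small made-up frame (the values are assumptions for illustration; only the column names A, B and C come from the answer). With axis=0 the divisor Series is matched to the rows, so each row is divided by its own value of A.

```python
import pandas as pd

# Hypothetical data; only the column names A, B, C come from the answer above.
df = pd.DataFrame({"A": [2, 4, 5], "B": [4, 8, 10], "C": [6, 12, 20]})

# Divide the named columns by column A, row by row.
by_name = df[["B", "C"]].div(df["A"], axis=0)

# Same operation, selecting every column after the first by position.
by_position = df.iloc[:, 1:].div(df["A"], axis=0)

print(by_name)
```

Without axis=0, pandas would try to match the index of df.A against the column labels B and C, which is not what is wanted here.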
How to divide single column by multiple columns in pandas?
Setup
import pandas as pd
import numpy as np
np.random.seed(0)
cols = ['A' + str(n) for n in range(1,11)]
df = pd.DataFrame(
np.random.randint(0,10, (8,10)),
columns=cols
)
df['Z'] = np.random.randint(0,70, 8)
Compute
Leverage the fact that pandas data structures are built on top of numpy arrays. Select only the columns you want to apply a function to (that's what df.loc[:, 'A1':'A10'] is for; see the pandas docs for help on slicing).
The .apply method needs a function and by default calls it once per column (i.e. a column is the input argument). This means that you're dividing the column Z (df['Z']) by the values of each selected column in turn (df['A1'], df['A2'], etc.).
Because the columns are numpy-backed, dividing an 8 x 1 shaped array (the df['Z'] column) by another 8 x 1 array (every other column) divides each element by the equivalent element (first in Z divided by first in the other column, 2nd by 2nd, 3rd by 3rd, etc.).
df.loc[:, 'A1':'A10'].apply(lambda col: df['Z'] / col)
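As a possible alternative to .apply, the same Z-divided-by-each-column result can be sketched with DataFrame.rdiv ("reverse division"), which handles all columns in one call. This sketch assumes the A1 to A10 slice and, unlike the setup above, draws the A values from 1 to 9 so that no division by zero occurs (zeros would give inf, or NaN for 0/0, rather than an error).

```python
import numpy as np
import pandas as pd

np.random.seed(0)
cols = ["A" + str(n) for n in range(1, 11)]
# Assumption: values start at 1 (not 0 as in the setup) to avoid dividing by zero.
df = pd.DataFrame(np.random.randint(1, 10, (8, 10)), columns=cols)
df["Z"] = np.random.randint(0, 70, 8)

# apply: the lambda receives each selected column as a Series.
via_apply = df.loc[:, "A1":"A10"].apply(lambda col: df["Z"] / col)

# rdiv: computes Z / column for every selected column at once.
via_rdiv = df.loc[:, "A1":"A10"].rdiv(df["Z"], axis=0)
```

Both frames have the same shape and values; rdiv simply avoids the Python-level loop over columns.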
Divide all columns of a dataframe by a smaller one by key without merging
You can use set_index:
df2.set_index('ID.').div(df1.set_index('ID.')['denominator'], axis=0)
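To illustrate how the index does the matching, here is a small sketch with made-up frames (the data and the columns x and y are assumptions; only 'ID.' and 'denominator' come from the answer). The rows of df2 are deliberately in a different order from df1 to show that alignment is by key, not by position.

```python
import pandas as pd

# Hypothetical frames; only the names 'ID.' and 'denominator' come from the answer.
df1 = pd.DataFrame({"ID.": [1, 2, 3], "denominator": [2, 5, 10]})
df2 = pd.DataFrame({"ID.": [3, 1, 2], "x": [30, 4, 10], "y": [50, 8, 20]})

# Both sides are indexed by ID., so each row of df2 is divided by the
# denominator that shares its ID., regardless of row order.
result = df2.set_index("ID.").div(df1.set_index("ID.")["denominator"], axis=0)

print(result.loc[3])  # row for ID. 3: x = 30 / 10, y = 50 / 10
```

An ID. that appears in only one of the two frames would come out as NaN in the result.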
Divide a value in a Dataframe column with the previous value in the same column
Use Series.shift instead of apply:
df['D'] = df['C'] / df['C'].shift()
# index A B C D
# 0 3 2 5 NaN
# 1 4 7 6 1.200000
# 2 2 4 8 1.333333
Optionally chain Series.fillna if you want 0 instead of NaN:
df['D'] = df['C'].div(df['C'].shift()).fillna(0)
# index A B C D
# 0 3 2 5 0.000000
# 1 4 7 6 1.200000
# 2 2 4 8 1.333333
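For reference, a self-contained sketch that rebuilds the three-row frame shown in the output above and computes both variants side by side. The column name D_filled is a hypothetical name introduced here so the two results can be compared.

```python
import pandas as pd

# The three rows shown in the output above.
df = pd.DataFrame({"A": [3, 4, 2], "B": [2, 7, 4], "C": [5, 6, 8]})

# shift() moves every value down one row, so each row sees its predecessor.
ratio = df["C"] / df["C"].shift()

df["D"] = ratio                   # first row is NaN: it has no predecessor
df["D_filled"] = ratio.fillna(0)  # hypothetical column: same ratio, NaN replaced by 0

print(df)
```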
[pandas]Dividing all elements of columns in df with elements in another column (Same df)
Apologies in advance for the somewhat protracted answer, but the question is somewhat unclear with regards to what exactly you're attempting to accomplish.
If you simply want price[0]/share[0], price[1]/share[1], etc. you can just do:
dftest['price_div_share'] = dftest['price'] / dftest['share']
The issue with the operand types can be solved by:
dftest['price_div_share'] = dftest['price'].astype(float) / dftest['share'].astype(float)
You're getting the "can't convert from str to float" error because you're calling astype(float) on the ENTIRE dataframe, which contains string columns.
If you want to divide each item by each item, i.e. price[0] / share[0], price[0] / share[1], price[0] / share[2], price[1] / share[0], etc., you need to iterate over every pair and append each result to a new list. You can do that pretty easily with a nested for loop, although it may take some time if you're working with a large dataset. It would look something like this if you simply want the result:
new_list = []
# every price is divided by every share
for p in dftest['price'].astype(float):
    for s in dftest['share'].astype(float):
        new_list.append(p/s)
If you want this in a new dataframe, pass the list to the pd.DataFrame() constructor:
new_df = pd.DataFrame(new_list, columns=['price_divided_by_share'])
This new dataframe would only have one column (the result, as mentioned above). If you want the information from the original dataframe as well, then you would do something like the following:
new_list = []
for n, a, p in zip(dftest['name'], dftest['age'], dftest['price'].astype(float)):
    for s in dftest['share'].astype(float):
        new_list.append([n, a, p, s, p/s])
new_df = pd.DataFrame(new_list, columns=['name', 'age', 'price', 'share', 'price_div_by_share'])
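If the nested loops turn out to be slow, one possible alternative (not part of the original answer) is numpy broadcasting, which computes every price/share pair in a single operation. The sketch below uses made-up data; only the column names follow the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names follow the answer above.
dftest = pd.DataFrame({
    "name": ["x", "y"],
    "age": [30, 40],
    "price": ["10", "20"],  # stored as strings, as in the question
    "share": ["2", "5"],
})

price = dftest["price"].astype(float).to_numpy()
share = dftest["share"].astype(float).to_numpy()

# price[:, None] has shape (n, 1) and share[None, :] has shape (1, n);
# broadcasting gives an (n, n) grid with grid[i, j] = price[i] / share[j].
grid = price[:, None] / share[None, :]

# Flatten row by row, which matches the order the nested loops produce.
new_df = pd.DataFrame({"price_div_by_share": grid.ravel()})
```

The grid holds n squared values, so memory use grows quickly with the number of rows.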
Using column-wise operations in dplyr to divide multiple column values by a specified row
The across call should be closed after the .fns argument:
library(dplyr)
df_pre %>%
mutate(across(starts_with("q_"), ~ .x / .x[name == "max"]))
-output
# A tibble: 3 x 3
name q_1 q_2
<chr> <dbl> <dbl>
1 max 1 1
2 a 0.8 0.5
3 b 0.6 0.25
The usage given in ?across is:
across(.cols = everything(), .fns = NULL, ..., .names = NULL)