I'm trying to multiply two existing columns in a pandas Dataframe (orders_df) - Prices (stock close price) and Amount (stock quantities) and add the calculation to a new column called 'Value'. For some reason when I run this code, all the rows under the 'Value' column are positive numbers, while some of the rows should be negative. Under the Action column in the DataFrame there are seven rows with the 'Sell' string and seven with the 'Buy' string.
for i in orders_df.Action:
if i == 'Sell':
orders_df['Value'] = orders_df.Prices*orders_df.Amount
elif i == 'Buy':
orders_df['Value'] = -orders_df.Prices*orders_df.Amount)
Please let me know what i'm doing wrong !
解决方案
If we're willing to sacrifice the succinctness of Hayden's solution, one could also do something like this:
In [22]: orders_df['C'] = orders_df.Action.apply(
lambda x: (1 if x == 'Sell' else -1))
In [23]: orders_df # New column C represents the sign of the transaction
Out[23]:
Prices Amount Action C
0 3 57 Sell 1
1 89 42 Sell 1
2 45 70 Buy -1
3 6 43 Sell 1
4 60 47 Sell 1
5 19 16 Buy -1
6 56 89 Sell 1
7 3 28 Buy -1
8 56 69 Sell 1
9 90 49 Buy -1
Now we have eliminated the need for the if statement. Using DataFrame.apply(), we also do away with the for loop. As Hayden noted, vectorized operations are always faster.
In [24]: orders_df['Value'] = orders_df.Prices * orders_df.Amount * orders_df.C
In [25]: orders_df # The resulting dataframe
Out[25]:
Prices Amount Action C Value
0 3 57 Sell 1 171
1 89 42 Sell 1 3738
2 45 70 Buy -1 -3150
3 6 43 Sell 1 258
4 60 47 Sell 1 2820
5 19 16 Buy -1 -304
6 56 89 Sell 1 4984
7 3 28 Buy -1 -84
8 56 69 Sell 1 3864
9 90 49 Buy -1 -4410
This solution takes two lines of code instead of one, but is a bit easier to read. I suspect that the computational costs are similar as well.