For reasons, I'm currently running a for loop in which A x N arrays (actually xarray data arrays, but the problem persists with basic numpy arrays, as in the MRE below) are multiplied by N x 1 pandas series (to form A x 1 arrays), for various values of N.
This works fine as long as N>1. But as soon as N=1 (i.e. the entire array is just multiplied by the same value), an odd KeyError appears (see below).
import pandas as pd
import numpy as np
# Size 2x1 series, multiplication works fine
a = np.reshape(np.array(np.random.rand(10,2)),(10,-1))
b = [pd.Series(data=[1.0,1.0],index=[1,2])]
a*b
# Size 1x1 series, multiplication fails
a = np.reshape(np.array(np.random.rand(10,1)),(10,-1))
b = [pd.Series(data=[1.0],index=[1])]
a*b
# With following error message
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-112-d29b22165258> in <module>
1 a = np.reshape(np.array(np.random.rand(10,1)),(10,-1))
2 b = [pd.Series(data=[1.0],index=[1])]
----> 3 a*b
~/opt/anaconda3/envs/climate1/lib/python3.7/site-packages/pandas/core/series.py in __getitem__(self, key)
869 key = com.apply_if_callable(key, self)
870 try:
--> 871 result = self.index.get_value(self, key)
872
873 if not is_scalar(result):
~/opt/anaconda3/envs/climate1/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
4402 k = self._convert_scalar_indexer(k, kind="getitem")
4403 try:
-> 4404 return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
4405 except KeyError as e1:
4406 if len(self) > 0 and (self.holds_integer() or self.is_boolean()):
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0
I've found a workaround that explicitly replaces the series with floats:
# This works fine
a = np.reshape(np.array(np.random.rand(10,1)),(10,-1))
b = [pd.Series(data=[1.0],index=[1])]
a*[float(k) for k in b]
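A variant of the same workaround, as a minimal sketch: instead of casting each series to a float, multiply by the series' underlying values via `Series.to_numpy()` (assuming pandas 0.24 or later, where that method exists), so that NumPy never has to interpret a list wrapping a Series. The names `a_n1`, `s_n1`, `a_n2` and `s_n2` are illustrative only and not part of the MRE above.

```python
import numpy as np
import pandas as pd

# N = 1 case: a (10, 1) array and a length-1 series whose index does not contain 0
a_n1 = np.random.rand(10, 1)
s_n1 = pd.Series(data=[1.0], index=[1])

# Multiply by the plain NumPy values instead of a list containing the Series;
# shapes (10, 1) and (1,) broadcast to (10, 1)
out_n1 = a_n1 * s_n1.to_numpy()

# N = 2 case, handled by the same expression;
# shapes (10, 2) and (2,) broadcast to (10, 2)
a_n2 = np.random.rand(10, 2)
s_n2 = pd.Series(data=[1.0, 1.0], index=[1, 2])
out_n2 = a_n2 * s_n2.to_numpy()

print(out_n1.shape, out_n2.shape)
```

Unlike the `float(k)` version, this should also work for N>1, so the loop would not need a special case. It only sidesteps the error, though; it does not explain it.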
But I would love any help in understanding where this error message comes from, since I suspect that I'm misunderstanding something more fundamental about pandas.
Thank you in advance!