python - Removing empty rows before aggregation

Question

asked Jan 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a list of dataframes, each with a DatetimeIndex; the minimum time between two rows in each dataframe is 15 minutes. I would like to combine all the dataframes into one, grouped by day, using mean, median, geometric mean and other methods. The problem is that some days contain no data in any of the dataframes. Some methods, like mean, ignore such days, but others raise an error. My question is: how can I remove such days before applying the method?

Data:

[                                 col1     col2      col3    col4  
 date                                                                   
 2020-02-03 08:00:00+00:00    3.616141   3.362717  1.627347    2.242732   
 2020-02-03 08:15:00+00:00    4.043727   3.749407  1.790467    2.272293   
 2020-02-03 08:30:00+00:00    3.872196   3.595969  1.729359    2.221447  
 ...                               ...        ...       ...         ...  
 2020-12-25 08:45:00+00:00    6.645853   1.352785  0.081961    4.112518   
 2020-12-25 09:30:00+00:00    6.066697   1.068805  0.058980    3.991505   
 
 [2204 rows x 6 columns],
...]

Data after aggregation with mean:

                                col1      col2        col3     col4
date                        
2020-02-02 00:00:00+00:00   4.636509    0.842644    0.069093    1.393849    
2020-02-03 00:00:00+00:00   6.649390    1.077993    0.081713    1.798794    
2020-02-04 00:00:00+00:00   5.765083    1.113354    0.097113    1.668112    
2020-02-05 00:00:00+00:00      NaN        NaN          NaN       NaN    
2020-02-06 00:00:00+00:00      NaN        NaN          NaN       NaN    
...                           ...         ...          ...       ...

As you can see, both days 02/05 and 02/06 have no data.
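To make the behaviour above reproducible, here is a minimal sketch on made-up data (the frame `toy` and its values are hypothetical, not the asker's data). It illustrates that grouping with `pd.Grouper(freq='D')` produces a group for every calendar day between the first and last timestamp, including days that have no rows, and that `mean` reports those days as NaN.

```python
import pandas as pd

# Hypothetical toy data: rows on 2020-02-04 and 2020-02-07 only.
idx = pd.to_datetime(
    [
        "2020-02-04 08:00",
        "2020-02-04 08:15",
        "2020-02-07 08:00",
        "2020-02-07 08:15",
    ],
    utc=True,
)
toy = pd.DataFrame({"col1": [1.0, 2.0, 3.0, 4.0]}, index=idx)
toy.index.name = "date"

grouped = toy.groupby(pd.Grouper(freq="D"))

# Number of rows that fall into each daily bin (0 for days without data).
sizes = grouped.size()

# mean() does not fail on the empty bins; it returns NaN for them.
daily_mean = grouped.mean()

# The days that the grouper created without any underlying rows.
empty_days = sizes[sizes == 0].index
```

Inspecting `sizes` is a quick way to see which days would hand an empty group to a stricter function.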

My code to aggregate with gstd, which raises an error:

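import numpy as np
import pandas as pd
from scipy.stats import gstd

cols = ["col1", "col2", "col3", "col4"]
# datalist is the list of dataframes shown above
joined = pd.concat(df.reset_index() for df in datalist)
# replace NaN, 0 and negative values with 1 (gstd takes logs, so inputs must be positive)
joined = joined.replace({np.nan: 1, 0: 1})
joined[cols] = joined[cols].mask(joined[cols] < 0, 1)

# note: df is now a groupby object with one group per calendar day
df = joined.set_index('date').groupby(pd.Grouper(freq='D'))

std = df.apply(gstd)
#std = df.agg(gstd)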

The error:

ValueError: Degrees of freedom <= 0 for slice

1 Answer

深蓝 · Answer 1 · 2021-01-24T00:16:02+0000

answered Jan 24, 2021 by 深蓝 (71.8m points)

Have you tried dropna?

df.dropna()

This drops every row that contains at least one null value; use df.dropna(how='all') to drop only rows in which every value is null.
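One caveat is that dropping rows before grouping does not stop `pd.Grouper(freq='D')` from recreating the missing days as empty groups. A possible alternative, sketched below on hypothetical data, is to group by the normalized timestamp so that only days that actually occur form groups, and to filter out days with too few rows first. The helper `geo_std`, the frame `toy`, and the threshold `min_rows` are illustrative names, not part of the question; `geo_std` is a NumPy stand-in that follows the default definition used by `scipy.stats.gstd` (exponential of the sample standard deviation of the logs, `ddof=1`).

```python
import numpy as np
import pandas as pd


def geo_std(values):
    """Geometric standard deviation: exp of the ddof=1 std of the logs."""
    logs = np.log(np.asarray(values, dtype=float))
    return float(np.exp(np.std(logs, ddof=1)))


# Hypothetical toy data: two populated days, a gap, and a single-row day.
idx = pd.to_datetime(
    [
        "2020-02-04 08:00",
        "2020-02-04 08:15",
        "2020-02-04 08:30",
        "2020-02-07 08:00",
        "2020-02-07 08:15",
        "2020-02-09 08:00",
    ],
    utc=True,
)
toy = pd.DataFrame({"col1": [1.0, 2.0, 4.0, 5.0, 5.0, 3.0]}, index=idx)
toy.index.name = "date"

# A sample standard deviation needs at least two observations per day.
min_rows = 2

# Keep only rows that belong to days with enough observations.
kept = toy.groupby(toy.index.normalize()).filter(lambda g: len(g) >= min_rows)

# Grouping by the normalized index yields groups only for days present in
# the data, so the gap days (02-05, 02-06, 02-08) never appear.
daily_gstd = kept.groupby(kept.index.normalize()).agg(geo_std)
```

If the real `gstd` from SciPy is available, it can be passed to `agg` in place of `geo_std`; the filtering step is what keeps empty and single-row days away from it.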
