My input spark dataframe is;
Date Client Current
2020-10-26 1 NULL
2020-10-27 1 NULL
2020-10-28 1 NULL
2020-10-29 1 NULL
2020-10-30 1 NULL
2020-10-31 1 NULL
2020-11-01 1 NULL
2020-11-02 1 NULL
2020-11-03 1 NULL
2020-11-04 1 NULL
2020-11-05 1 NULL
2020-11-06 1 NULL
2020-11-07 1 NULL
2020-11-08 1 NULL
2020-11-09 1 NULL
2020-10-26 2 NULL
2020-10-27 2 NULL
2020-10-28 2 NULL
2020-10-29 2 10
2020-10-30 2 23
2020-10-31 2 NULL
2020-11-01 2 NULL
2020-11-02 2 1
2020-11-03 2 NULL
2020-11-04 2 NULL
2020-11-05 2 3
2020-10-27 3 NULL
2020-10-28 3 NULL
2020-10-29 3 10
2020-10-30 3 NULL
2020-10-31 3 NULL
2020-11-01 3 NULL
2020-11-02 3 NULL
2020-11-03 3 32
2020-11-04 3 NULL
2020-11-05 3 3
2020-11-03 4 NULL
2020-11-04 4 NULL
2020-11-05 4 NULL
Dataframe is ordered by client_no and date. If the "Current column" is completely null for a client, the Full_NULL_Count column should write the null number in the first line of the client. I shared the desired output according to the data above;
Date Client Current Full_NULL_Count
2020-10-26 1 NULL 15 -> All "Current" values are null for client 1. So first row
value is equal to total null count for Client 1 .
2020-10-27 1 NULL NULL
2020-10-28 1 NULL NULL
2020-10-29 1 NULL NULL
2020-10-30 1 NULL NULL
2020-10-31 1 NULL NULL
2020-11-01 1 NULL NULL
2020-11-02 1 NULL NULL
2020-11-03 1 NULL NULL
2020-11-04 1 NULL NULL
2020-11-05 1 NULL NULL
2020-11-06 1 NULL NULL
2020-11-07 1 NULL NULL
2020-11-08 1 NULL NULL
2020-11-09 1 NULL NULL
2020-10-26 2 NULL NULL ->There are non null current values for Client 2. So it' s null.
2020-10-27 2 NULL NULL
2020-10-28 2 NULL NULL
2020-10-29 2 10 NULL
2020-10-30 2 23 NULL
2020-10-31 2 NULL NULL
2020-11-01 2 NULL NULL
2020-11-02 2 1 NULL
2020-11-03 2 NULL NULL
2020-11-04 2 NULL NULL
2020-11-05 2 3 NULL
2020-10-27 3 NULL NULL ->There are non null current values for Client 3. So it' s null.
2020-10-28 3 NULL NULL
2020-10-29 3 10 NULL
2020-10-30 3 NULL NULL
2020-10-31 3 NULL NULL
2020-11-01 3 NULL NULL
2020-11-02 3 NULL NULL
2020-11-03 3 32 NULL
2020-11-04 3 NULL NULL
2020-11-05 3 3 NULL
2020-11-03 4 NULL 3 -> All "Current" values are null for client 4. So first row
value is equal to total null count for Client 4.
2020-11-04 4 NULL NULL
2020-11-05 4 NULL NULL
Could you please help me about this?
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…