Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

asynchronous - speed up Python code by separating IO from CPU calculation

I'm using Python to process large time-series datasets. The data are processed frame by frame, and consecutive frames partially overlap. Currently I process them like this: read -> process -> write -> read -> process ....

read_data_into_datastore(...)  # read data into an ndarray, limited by RAM
while datastore_is_not_empty:
    if few_data_in_store:  # refill if the store is running low
        read_successive_data_into_datastore(...)
    frames = pop_from_datastore(...)  # fetch frames from the datastore
    results = process(frames)
    write_results_to_disk(results)

Writing is not a big problem as long as I don't force a flush, but the time-consuming reads block the loop frequently. I want to speed this up with two threads or two processes: one, called cargo, in charge of monitoring and refilling the datastore; the other, called processor, in charge of the processing.
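The cargo/processor split described above is a classic producer/consumer setup, and the standard library's `queue.Queue` already does the locking internally. A minimal sketch (the names `cargo`, `processor`, and the toy chunks are placeholders for the real IO and `process()` routines):

```python
import queue
import threading

def cargo(q, chunks):
    """Producer: reads successive chunks and feeds the queue."""
    for chunk in chunks:
        q.put(chunk)          # blocks while the queue is full (bounds RAM use)
    q.put(None)               # sentinel: no more data

def processor(q, results):
    """Consumer: pops frames from the queue and processes them."""
    while True:
        frames = q.get()
        if frames is None:
            break
        results.append(sum(frames))  # stands in for the real process()

q = queue.Queue(maxsize=4)    # bounded queue plays the role of the datastore
results = []
chunks = [[1, 2], [3, 4], [5, 6]]
t = threading.Thread(target=cargo, args=(q, chunks))
p = threading.Thread(target=processor, args=(q, results))
t.start(); p.start(); t.join(); p.join()
# results == [3, 7, 11]
```

Note that if `process()` is CPU-bound, two Python threads will not compute in parallel because of the GIL; the reader thread still helps, since file IO releases the GIL, so reading overlaps with computation. For genuinely parallel CPU work, `multiprocessing.Queue` offers the same put/get interface across processes.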

My difficulty is that both cargo and processor mutate the data ndarray (cargo by filling it, processor by cutting frames out of it). It would be a mess if processor cut data from the datastore while cargo was filling it at the same time. How can I "lock" the datastore while one operation is ongoing?
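One common way to serialize access is a `threading.Condition` wrapping the datastore: every fill and every cut happens while holding its lock, and the processor sleeps (instead of busy-checking `few_data_in_store`) until enough frames are available. A minimal sketch, with invented class and method names and a deque standing in for the ndarray:

```python
import threading
from collections import deque

class DataStore:
    """A shared frame buffer; all mutation happens under one lock."""
    def __init__(self):
        self._frames = deque()
        self._cond = threading.Condition()  # a lock plus wait/notify
        self._closed = False

    def fill(self, frames):
        """Called by cargo: append frames while holding the lock."""
        with self._cond:
            self._frames.extend(frames)
            self._cond.notify_all()         # wake a waiting processor

    def close(self):
        """Called by cargo when the input is exhausted."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def pop(self, n):
        """Called by processor: wait until n frames exist or input ends."""
        with self._cond:
            while len(self._frames) < n and not self._closed:
                self._cond.wait()           # releases the lock while sleeping
            k = min(n, len(self._frames))
            return [self._frames.popleft() for _ in range(k)]
```

The same pattern applies to an actual ndarray: acquire `self._cond` before reading, slicing, or resizing the array, so cargo and processor can never touch it simultaneously.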

question from:https://stackoverflow.com/questions/65922280/speed-up-python-codes-by-separating-io-from-cpu-calculation

He who fights with dragons for too long becomes a dragon himself; gaze too long into the abyss, and the abyss gazes back into you…

1 Answer

Waiting for answers
