Benchmarking
This notebook explores a key constructor parameter that affects FileSenderClient performance: concurrent_chunks. This parameter sets a limit on the number of chunks of file data that can be loaded into memory at the same time, so a lower value is expected to reduce memory usage but also increase runtime.
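For context, the parameter is passed when constructing the client. The following is a minimal sketch; the import path, the base_url argument, and the server URL are assumptions for illustration, not taken from this notebook:

```python
from filesender import FileSenderClient  # import path assumed

# A lower concurrent_chunks caps how many chunks are held in memory at once,
# trading upload throughput for a smaller memory footprint
client = FileSenderClient(
    base_url="https://filesender.example.com/rest.php",  # hypothetical server
    concurrent_chunks=2,
)
```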
The built-in benchmark function can be used to run the comparison. The code below benchmarks an upload of 3 files, each of which is 100 MB, testing concurrent_chunks values from 1 to 5 (the limit argument is exclusive).
```python
from filesender.benchmark import benchmark, make_tempfiles
from os import environ

# Create 3 temporary 100 MB files, then benchmark an upload at each
# concurrent_chunks value from 1 to 5 (limit=6 is exclusive)
with make_tempfiles(size=100_000_000, n=3) as paths:
    results = benchmark(
        paths,
        limit=6,
        apikey=environ["API_KEY"],
        base_url=environ["BASE_URL"],
        recipient=environ["RECIPIENT"],
        username=environ["USERNAME"],
    )
```
```
tmpom0d0m4w: 100%|██████████| 8/8 [00:02<00:00, 3.28it/s]
tmpm9hndr4l: 100%|██████████| 8/8 [00:02<00:00, 3.50it/s]
tmpe10e9bu9: 100%|██████████| 8/8 [00:02<00:00, 3.43it/s]
tmpom0d0m4w: 100%|██████████| 8/8 [00:05<00:00, 1.46it/s]
tmpm9hndr4l: 100%|██████████| 8/8 [00:04<00:00, 1.84it/s]
tmpe10e9bu9: 100%|██████████| 8/8 [00:03<00:00, 2.10it/s]
tmpom0d0m4w: 100%|██████████| 8/8 [00:03<00:00, 2.11it/s]
tmpm9hndr4l: 100%|██████████| 8/8 [00:03<00:00, 2.65it/s]
tmpe10e9bu9: 100%|██████████| 8/8 [00:02<00:00, 3.62it/s]
tmpom0d0m4w: 100%|██████████| 8/8 [00:02<00:00, 2.70it/s]
tmpm9hndr4l: 100%|██████████| 8/8 [00:03<00:00, 2.44it/s]
tmpe10e9bu9: 100%|██████████| 8/8 [00:02<00:00, 2.80it/s]
tmpom0d0m4w: 100%|██████████| 8/8 [00:02<00:00, 2.71it/s]
tmpm9hndr4l: 100%|██████████| 8/8 [00:01<00:00, 4.18it/s]
tmpe10e9bu9: 100%|██████████| 8/8 [00:01<00:00, 4.42it/s]
```
Transform the results into a data frame for analysis:
```python
import pandas as pd

result_df = pd.DataFrame.from_records(vars(result) for result in results)
# Convert memory from bytes to MiB
result_df["memory"] = result_df["memory"] / 1024 ** 2
result_df
```
|   | time (s) | memory (MiB) | concurrent_chunks |
|---|----------|--------------|-------------------|
| 0 | 7.708177 | 0.195992 | 1 |
| 1 | 14.282335 | 0.350613 | 2 |
| 2 | 9.678596 | 0.401077 | 3 |
| 3 | 9.738843 | 0.406414 | 4 |
| 4 | 7.315477 | 0.418480 | 5 |
The memory usage consistently increases as we increase the number of concurrent chunks, as expected. More surprisingly, the runtime doesn't follow a consistent pattern in relation to the number of concurrent chunks.
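One way to inspect this further is to plot both metrics against the concurrency level. This is a sketch assuming matplotlib is installed (it is not otherwise used in this notebook):

```python
import matplotlib.pyplot as plt

# Plot runtime and memory usage on twin y-axes against the concurrency level
fig, ax_time = plt.subplots()
ax_mem = ax_time.twinx()

ax_time.plot(result_df["concurrent_chunks"], result_df["time"], marker="o", color="tab:blue")
ax_mem.plot(result_df["concurrent_chunks"], result_df["memory"], marker="s", color="tab:orange")

ax_time.set_xlabel("concurrent_chunks")
ax_time.set_ylabel("time (s)", color="tab:blue")
ax_mem.set_ylabel("memory (MiB)", color="tab:orange")
plt.show()
```

Given how short these runtimes are, repeating each configuration several times and averaging would help separate any real trend from run-to-run noise.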