SftpClient Enumerates Rather Than Accumulates Directory Items #395
Comments
znamenap added a commit to znamenap/SSH.NET that referenced this issue on Mar 4, 2018: … Than Accumulates Directory Items
Hi Gert, can you review and approve this issue and pull request #396? Thank you.
znamenap added a commit to znamenap/SSH.NET that referenced this issue on Sep 1, 2020: … Than Accumulates Directory Items
znamenap added a commit to znamenap/SSH.NET that referenced this issue on Sep 20, 2023
znamenap added a commit to znamenap/SSH.NET that referenced this issue on Oct 2, 2023
znamenap added a commit to znamenap/SSH.NET that referenced this issue on Oct 2, 2023
Hi,
Thank you for this really great library!
I have a proposal to enhance (fix) behavior that leads to delayed processing and a large memory footprint. These are current production issues for us, caused by 65 thousand directory items in a single parent directory. Our consumer service has to list all of them and then process them one by one or in batches. The current SftpClient code reads all the items and creates an SftpFile object for each of them. That produces a peak in memory usage, which is still acceptable. What is not acceptable is the time required to build the list of directory items: in our scenario it takes 1 minute and 45 seconds, with the production connection running between London and New York. The service we maintain, which uses SSH.NET, has to check the remote directory as frequently as possible, at intervals of at most 1 minute. In my test environment between West Europe and West US, listing 75 thousand directory items takes 2 minutes and 6 seconds.
The problem is not in how the directory is listed but in how the library handles the result data. Currently it accumulates all items and then passes them to the consumer as one big list. The enhancement would introduce a new method that performs real enumeration. Listing and consuming would then overlap, with each item pushed to the consumer as soon as it has been read. The consumer's logic can decide that it has enough items to process and stop enumerating (for example halfway through, or after the first 150 items). This shortens the processing time to an acceptable minimum.
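The difference between the two approaches can be sketched as follows. This is an illustration in Python rather than a patch against SSH.NET: `fetch_batches`, `list_accumulated`, `list_enumerated` and the batch size of 100 are hypothetical stand-ins for batched directory reads and are not part of the library's API. The sketch only counts simulated round trips to show how early termination avoids reading the rest of the directory.

```python
from itertools import islice
from typing import Dict, Iterator, List

BATCH_SIZE = 100  # hypothetical number of entries returned per round trip


def fetch_batches(total_items: int, counter: Dict[str, int]) -> Iterator[List[str]]:
    """Simulate a remote directory read that returns entries batch by batch.

    counter["round_trips"] records how many batches were actually requested.
    """
    for start in range(0, total_items, BATCH_SIZE):
        counter["round_trips"] += 1
        end = min(start + BATCH_SIZE, total_items)
        yield [f"file_{i:05d}" for i in range(start, end)]


def list_accumulated(total_items: int, counter: Dict[str, int]) -> List[str]:
    """Accumulate: read every batch before returning anything to the caller."""
    items: List[str] = []
    for batch in fetch_batches(total_items, counter):
        items.extend(batch)
    return items


def list_enumerated(total_items: int, counter: Dict[str, int]) -> Iterator[str]:
    """Enumerate: hand each entry to the caller as soon as its batch arrives."""
    for batch in fetch_batches(total_items, counter):
        yield from batch


# The consumer only needs the first 150 of 65,000 entries.
acc = {"round_trips": 0}
first_acc = list_accumulated(65_000, acc)[:150]

enum = {"round_trips": 0}
first_enum = list(islice(list_enumerated(65_000, enum), 150))

print(acc["round_trips"], enum["round_trips"])  # prints "650 2"
```

Both variants return the same 150 entries, but the accumulating one has to finish all 650 simulated reads first, while the enumerating one stops after the second batch. In the real library the cost per read depends on the server and network, so the numbers here show only the shape of the saving.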
Naturally, I have looked into the source code, made the change, tested it, and thought about multithreading scenarios. I would like it to be merged if you agree.
Many thanks,
Pavel
My proposal is to add dedicated new methods to SftpClient that perform enumeration.