On Sunday 24 April 2022 at 04:15:35, Olaf Meeuwissen via Dng wrote:
> Hi,
>
> Antony Stone writes:
> >
> > I just tried several successive searches for a few unique filenames in a
> > directory tree (all files in the same directory, just in case the
> > position made a difference).
> >
> > The first search took 6 minutes and clearly set up some cache of results,
> > because subsequent searches were consistently:
> >
> > find . | grep filename : 20 seconds
> >
> > find . -name filename : 25 seconds
> >
> > That was consistent no matter whether the two filenames were the same, or
> > different but still in the same directory, and no matter which command
> > was run first.
> >
> > Nice observation.
>
> Indeed but you must have an awful lot of files, slow disks and/or a slow
> network connection. After setting up the cache on my machine, I get 0.7
> seconds for the -name approach and 0.5 seconds for grep.
> That's with close to half a million filesystem entries and about 7000 of
> those on an NFS backed filesystem on the NAS downstairs. The rest is on
> an SSD (NVMe).

I deliberately chose a large file system because I prefer comparing bigger
numbers when I can.

In my case there are 11,174,006 files occupying 8.8 Tbytes, all on spinning
metal disks, and housed in not-especially-fast HP N54L microservers.
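
For anyone wanting to repeat the comparison on their own tree, here is a
minimal sketch; it is not the exact procedure behind the numbers above. It
builds a small throwaway tree (the directory layout and the name
unique-name.dat are invented for illustration), runs both forms of the
search, and checks that they find the same file. The function names are
hypothetical helpers, not existing tools.

```shell
#!/bin/sh
# Throwaway tree for illustration only; point the functions at a real
# directory to get meaningful timings.
tree=$(mktemp -d)
mkdir -p "$tree/a/b" "$tree/c"
touch "$tree/a/b/unique-name.dat" "$tree/a/other.dat" "$tree/c/another.dat"

# Form 1: find compares each basename against the pattern itself.
search_by_name() { find "$1" -name "$2"; }

# Form 2: find prints every path and grep filters the stream.
# Note that grep treats its argument as a regex matched against the whole path.
search_by_grep() { find "$1" | grep "$2"; }

by_name=$(search_by_name "$tree" 'unique-name.dat')
by_grep=$(search_by_grep "$tree" 'unique-name.dat')

if [ "$by_name" = "$by_grep" ]; then
    echo "both forms found: ${by_name#"$tree"/}"
else
    echo "results differ"
fi

rm -rf "$tree"
```

To time each form in bash, prefix it with the `time` keyword, for example
`time find . -name filename` and `time find . | grep filename` (in bash,
`time` covers the whole pipeline). Run each one more than once, since the
first pass mostly measures how long it takes to fill the cache.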

Antony.

--
"It would appear we have reached the limits of what it is possible to achieve
with computer technology, although one should be careful with such statements;
they tend to sound pretty silly in five years."
- John von Neumann (1949)

Please reply to the list;
please *don't* CC me.