Author: Antony Stone
Date:  
To: dng
Old-Topics: Re: [DNG] mouse driver question
Subject: [DNG] Find efficiency [Was: Re: mouse driver question]
On Sunday 24 April 2022 at 04:15:35, Olaf Meeuwissen via Dng wrote:

> Hi,
>
> Antony Stone writes:
> >
> > I just tried several successive searches for a few unique filenames in a
> > directory tree (all files in the same directory, just in case the
> > position made a difference).
> >
> > The first search took 6 minutes and clearly set up some cache of results,
> > because subsequent searches were consistently:
> >
> >     find . | grep filename : 20 seconds
> >
> >     find . -name filename : 25 seconds
> >
> > That was consistent no matter whether the two filenames were the same, or
> > different but still in the same directory, and no matter which command
> > was run first.
> >
> > Nice observation.
>
> Indeed but you must have an awful lot of files, slow disks and/or a slow
> network connection. After setting up the cache on my machine, I get 0.7
> seconds for the -name approach and 0.5 seconds for grep.
> That's with close to half a million filesystem entries and about 7000 of
> those on an NFS backed filesystem on the NAS downstairs. The rest is on
> an SSD (NVMe).


I deliberately chose a large file system because I prefer comparing bigger
numbers when I can.

In my case there are 11,174,006 files occupying 8.8 Tbytes, all on spinning
metal disks, and housed in not-especially-fast HP N54L microservers.
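
For anyone wanting to repeat the comparison on their own tree, here is a minimal sketch of the two approaches wrapped as shell functions. The function names by_name and by_grep are made up for illustration, and the -F option to grep is an addition of this sketch (the commands quoted above used a plain grep); it makes grep treat the name as a fixed string rather than a regular expression. Note that the two are not strictly equivalent: -name matches the whole basename, while grep matches the string anywhere in the path.

```shell
# Hypothetical helpers for illustration only; not from the thread.

# by_name TREE NAME: let find do the matching itself.
# -name compares NAME against the basename of each entry.
by_name() {
    find "$1" -name "$2"
}

# by_grep TREE NAME: have find list every path, then filter with grep.
# -F (an assumption of this sketch) matches NAME as a fixed string,
# and it matches anywhere in the path, not just the basename.
by_grep() {
    find "$1" | grep -F -- "$2"
}
```

In a shell with a `time` keyword, such as bash, something like `time by_name . filename > /dev/null` followed by `time by_grep . filename > /dev/null` should give figures comparable to the ones above, provided an earlier run has already warmed the cache.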

Antony.

--
"It would appear we have reached the limits of what it is possible to achieve
with computer technology, although one should be careful with such statements;
they tend to sound pretty silly in five years."

- John von Neumann (1949)

                                                   Please reply to the list;
                                                         please *don't* CC me.