Scan for new items takes ages after syncing collection btw. two computers


news@tcrass.de
Hi there,

there's one thing I've been wondering about for quite a while:

I use unison (https://www.cis.upenn.edu/~bcpierce/unison/ -- great
tool!) for syncing my photo collection (including digikam4.db) between
desktop and laptop computer. I take great care not to add or edit photos
on both computers simultaneously, so every syncing process is actually a
clean copy operation from one machine to the other. Yet when I launch
digikam on the target machine after syncing, it apparently does a
full scan of all items in the collection, which takes ages, even if only
a few photos have actually been added or changed. However, when
re-launching digikam on the same machine without an immediately preceding
sync, the new items scan takes only a few seconds.

So how does digikam decide which folders and images are to be
re-scanned? How can digikam possibly 'know' that something other than
just local file changes has been going on?

The only idea I came up with is that digikam might somehow detect
changes in its database file's metadata (like file size, access time,
md5 hash value...) relative to its last run on the same machine,
changes which were introduced during syncing.
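
One way to test this hypothesis is to record those properties of digikam4.db before and after a sync and compare them. The following is a minimal Python sketch; the helper names snapshot and changed_fields are made up for illustration, and nothing here reflects how digikam itself decides what to rescan.

```python
import hashlib
import os


def snapshot(path):
    """Return size, mtime, atime and MD5 digest of one file."""
    # Take the stat first: reading the file afterwards may update its atime.
    st = os.stat(path)
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so a large database is not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            md5.update(chunk)
    return {
        "size": st.st_size,
        "mtime": st.st_mtime,
        "atime": st.st_atime,
        "md5": md5.hexdigest(),
    }


def changed_fields(before, after):
    """List the keys whose values differ between two snapshots."""
    return sorted(key for key in before if before[key] != after[key])
```

Calling snapshot on the database file before syncing, again after syncing, and passing both results to changed_fields would show which of these properties the sync actually touched.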

Any comments appreciated!

Cheers --

-- Torsten


Re: Scan for new items takes ages after syncing collection btw. two computers

Chris Green
On Sat, Jan 28, 2017 at 10:48:37PM +0100, [hidden email] wrote:
>
> So how does digikam decide which folders and images are to be re-scanned?
> How can digikam possibly 'know' that something other than just local file
> changes has been going on?
>
It's probably because Unison updates the 'last access' date field on
the files and Digikam uses this to decide whether to rescan.

In Unix/Linux a file has three times associated with it:-

    Modified time - last time the file contents were changed (write or append)
    Accessed time - last time the file was accessed (read or write)
    Status changed - last time the attributes (owner, permissions, etc.) were changed
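
These three timestamps can be read for any file with a few lines of Python. This is only an illustrative sketch for Unix-like systems; the function name file_times is hypothetical, and it says nothing about which of the values (if any) Digikam consults.

```python
import os


def file_times(path):
    """Return the three POSIX timestamps of a file as a dict of floats."""
    st = os.stat(path)
    return {
        "modified": st.st_mtime,  # contents last written
        "accessed": st.st_atime,  # file last accessed
        "changed": st.st_ctime,   # inode status (owner, permissions, ...) last changed on Unix
    }
```

Printing file_times for the same photo on both machines after a sync is a quick way to see which of the three values differ.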


> The only idea I came up with is that digikam might somehow detect
> changes in its database file's metadata (like file size, access time, md5
> hash value...) relative to its last run on the same machine, changes which
> were introduced during syncing.
>
Digikam will compare its database with one (or more) of the above file
times I expect.  A Digikam expert will no doubt tell us.

--
Chris Green

Re: Scan for new items takes ages after syncing collection btw. two computers

news@tcrass.de
Chris,

> It's probably because Unison updates the 'last access' date field on
> the files and Digikam uses this to decide whether to rescan.

> In Unix/Linux a file has three times associated with it:-
>
>     Modified time - last time the file contents were changed (write or append)
>     Accessed time - last time the file was accessed (read or write)
>     Status changed - last time the attributes (owner, permissions, etc.) were changed

yeah, that's what I came up with, too -- the digikam4.db file will
probably have its access time changed during syncing.

> Digikam will compare its database with one (or more) of the above file
> times I expect.  A Digikam expert will no doubt tell us.

But if so, I still wonder where digikam keeps record of digikam4.db's
access time?

Cheers --

        Torsten



Re: Scan for new items takes ages after syncing collection btw. two computers

Philip Tuckey-2
In reply to this post by news@tcrass.de
Me too. I see the same behaviour when switching between digiKam running natively on OS X and running under Linux in a VM on the same machine. Database and image files are shared with the VM. Only the configuration files are specific to each OS.
Philip


Re: Scan for new items takes ages after syncing collection btw. two computers

Eduard Zalar

Why do you try to sync the DB file?

I thought about syncing 2 or more PCs also, but I never had the idea to sync the DB file.

To be honest, I have not yet implemented my idea for syncing, but I thought that it is sufficient to sync the pictures only.

With every sync, digiKam would detect some new or changed pics, but that should not take so long... Every unchanged/unsynced file is detected as unchanged.

I would try to leave the DB file under digiKam's control and never try to copy it...
At least as long as you use the SQLite engine.



Re: Scan for new items takes ages after syncing collection btw. two computers

Andrew Goodbody
In reply to this post by news@tcrass.de


On 30/01/17 16:18, [hidden email] wrote:

> Chris,
>
>> It's probably because Unison updates the 'last access' date field on
>> the files and Digikam uses this to decide whether to rescan.
>
>> In Unix/Linux a file has three times associated with it:-
>>
>>     Modified time - last time the file contents were changed (write or
>> append)
>>     Accessed time - last time the file was accessed (read or write)
>>     Status changed - last time the attributes (owner, permissions,
>> etc.) were changed
>
> yeah, that's what I came up with, too -- the digikam4.db file will
> probably have its access time changed during syncing.
>
>> Digikam will compare its database with one (or more) of the above file
>> times I expect.  A Digikam expert will no doubt tell us.
>
> But if so, I still wonder where digikam keeps record of digikam4.db's
> access time?

It doesn't, that's not how it works.

digiKam stores the path to the root of the collection in the database
and this can look different on different systems, hence it will do a
full rescan.

Andrew

Re: Scan for new items takes ages after syncing collection btw. two computers

Jim Gomi
In reply to this post by news@tcrass.de
I would strongly advise against including the digikam4.db file in the
synchronization when syncing between two different machines.
I ended up with a corrupted database file that way, and it was a lot of
effort to un-corrupt it:
https://mail.kde.org/pipermail/digikam-users/2017-January/023348.html

It's much safer to configure digikam to write all tags etc to the image
files, and then just sync the image files. 
It's true that after syncing you have to wait while the database is
rebuilt, but at least you know that you'll end up with a clean correct
database.
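
The "sync only the image files" approach can be sketched as a one-way copy that skips the database files. This is a minimal Python sketch, not a replacement for a real sync tool such as Unison: the function name copy_images_only is hypothetical, and the file patterns are an assumed exclusion list that should be adjusted to the database files actually present in the collection.

```python
import shutil


def copy_images_only(src, dst):
    """Copy a collection tree from src to dst, skipping database files."""
    # Assumed exclusion list: the main database, the thumbnail database,
    # and SQLite journal files. Adjust to match the actual collection.
    skip = shutil.ignore_patterns(
        "digikam4.db", "thumbnails-digikam.db", "*.db-journal"
    )
    # copytree uses shutil.copy2 by default, which tries to preserve
    # modification times. dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(src, dst, ignore=skip, dirs_exist_ok=True)
```

Because this only ever copies in one direction, it matches the "clean copy from one machine to the other" case and does nothing about deletions or conflicting edits.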


Re: Scan for new items takes ages after syncing collection btw. two computers

stancs3
Would we expect similar results if the database was in a mysql server?
The server would in theory remain a constant, and become attached to
the new machine. I am actually getting ready to do just such a thing. I
have a current VM with digikam and am preparing a new VM. If interested
I will post my progress, but it won't be until I clean up my backup
image data storage, which will take me a few more weeks at the rate I
am going.

In a different scenario, I moved a mysql db from one server VM to a
different server VM, and pointed a non-changed digikam to the new
server. It was no problem. It did notice the db was on a different
server, but just proceeded to work.

In essence I always opt for a real db when possible. It takes some
learning offline, but it is worth it. Postgresql is even better than
mysql, so someday ..... :)


On Mon, 2017-01-30 at 22:43 -0600, Jim Gomi wrote:

> I would strongly advise against including the digikam4.db file in the
> synchronization when syncing between two different machines.
> I ended up with a corrupted database file that way, and it was a lot
> of
> effort to un-corrupt it
> https://mail.kde.org/pipermail/digikam-users/2017-January/023348.html
>
> It's much safer to configure digikam to write all tags etc to the
> image
> files, and then just sync the image files. 
> It's true that after syncing you have to wait while the database is
> rebuilt, but at least you know that you'll end up with a clean
> correct
> database.
>

Re: Scan for new items takes ages after syncing collection btw. two computers

Chris Green
In reply to this post by Jim Gomi
On Mon, Jan 30, 2017 at 10:43:49PM -0600, Jim Gomi wrote:
> It's much safer to configure digikam to write all tags etc to the image
> files, and then just sync the image files. 
>
Which is what I have been saying should be the default, for years,
for this reason among others.

--
Chris Green

Re: Scan for new items takes ages after syncing collection btw. two computers

Chris Green
In reply to this post by stancs3
On Mon, Jan 30, 2017 at 11:06:52PM -0700, stancs3 wrote:
> Would we expect similar results if the database was in a mysql server?
> The server would in theory remain a constant, and become attached to
> the new machine.

This would assume some sort of network access; I want digikam to work
wholly standalone.

There is also quite a difficult (for a home user anyway) issue of
backing up a mysql database.

--
Chris Green

Re: Scan for new items takes ages after syncing collection btw. two computers

Remco Viëtor
In reply to this post by Chris Green
On mardi 31 janvier 2017 08:41:10 CET Chris Green wrote:
> On Mon, Jan 30, 2017 at 10:43:49PM -0600, Jim Gomi wrote:
> > It's much safer to configure digikam to write all tags etc to the image
> > files, and then just sync the image files.
>
> Which is what I have been saying should be the default, for years,
> for this reason among others.

Any default setting should in principle limit possible harm. Writing just
changed metadata to image files _as_a_default_ would go against that. Having
it as an option is fine, then it's the user who decides he can accept the
associated risks.

If you want a default going in that direction, write all metadata to XMP
files. Digikam knows to look for them, so it can synchronise from them
just as well as from changes written to the image files.

But don't set writing changed metadata to image files as a default, please.

Remco


Re: Scan for new items takes ages after syncing collection btw. two computers

Chris Green
On Tue, Jan 31, 2017 at 11:10:35AM +0100, Remco Viëtor wrote:

> On mardi 31 janvier 2017 08:41:10 CET Chris Green wrote:
> > On Mon, Jan 30, 2017 at 10:43:49PM -0600, Jim Gomi wrote:
> > > It's much safer to configure digikam to write all tags etc to the image
> > > files, and then just sync the image files.
> >
> > Which is what I have been saying should be the default, for years,
> > for this reason among others.
>
> Any default setting should in principle limit possible harm. Writing just
> changed metadata to image files _as_a_default_ would go against that. Having
> it as an option is fine, then it's the user who decides he can accept the
> associated risks.
>
I didn't mean just changed metadata, I want *all* metadata *always* in
the files.  It's the only sensible option (for me anyway).

--
Chris Green

Re: Scan for new items takes ages after syncing collection btw. two computers

Philip Tuckey-2
In reply to this post by Andrew Goodbody


On 30/01/17 20:03, Andrew Goodbody wrote:

>
>
> On 30/01/17 16:18, [hidden email] wrote:
>> Chris,
>>
>>> It's probably because Unison updates the 'last access' date field on
>>> the files and Digikam uses this to decide whether to rescan.
>>
>>> In Unix/Linux a file has three times associated with it:-
>>>
>>>     Modified time - last time the file contents were changed (write or
>>> append)
>>>     Accessed time - last time the file was accessed (read or write)
>>>     Status changed - last time the attributes (owner, permissions,
>>> etc.) were changed
>>
>> yeah, that's what I came up with, too -- the digikam4.db file will
>> probably have its access time changed during syncing.
>>
>>> Digikam will compare its database with one (or more) of the above file
>>> times I expect.  A Digikam expert will no doubt tell us.
>>
>> But if so, I still wonder where digikam keeps record of digikam4.db's
>> access time?
>
> It doesn't, that's not how it works.
>
> digiKam stores the path to the root of the collection in the database
> and this can look different on different systems, hence it will do a
> full rescan.

Here the db is synced in one case, shared in the other case, so the
collection root paths are identical on the two machines.
Philip

Re: Scan for new items takes ages after syncing collection btw. two computers

Remco Viëtor
In reply to this post by Chris Green
On mardi 31 janvier 2017 11:35:18 CET Chris Green wrote:

> On Tue, Jan 31, 2017 at 11:10:35AM +0100, Remco Viëtor wrote:
> > On mardi 31 janvier 2017 08:41:10 CET Chris Green wrote:
> > > On Mon, Jan 30, 2017 at 10:43:49PM -0600, Jim Gomi wrote:
> > > > It's much safer to configure digikam to write all tags etc to the
> > > > image
> > > > files, and then just sync the image files.
> > >
> > > Which is what I have been saying should be the default, for years,
> > > for this reason among others.
> >
> > Any default setting should in principle limit possible harm. Writing just
> > changed metadata to image files _as_a_default_ would go against that.
> > Having it as an option is fine, then it's the user who decides he can
> > accept the associated risks.
>
> I didn't mean just changed metadata, I want *all* metadata *always* in
> the files.  It's the only sensible option (for me anyway).

Apparently, always writing all metadata is *NOT* the only sensible option for
everyone (or we wouldn't be having this exchange).
- I'm *not* saying it shouldn't be possible when the user wants it.

And the part about XMP files you elided is important here, as it gives an
alternative that still provides redundancy in cases of database corruption.

Also, there are arguments *against* storing all metadata in the image files,
like privacy considerations (your privacy, and that of anyone figuring in your
images, think geotagging, face recognition, ...) which for some can be as
important as your arguments to make always writing metadata the default
(and again, the discussion for me is about it being the installation
*default*)

Remco



Re: Scan for new items takes ages after syncing collection btw. two computers

jdd@dodin.org
Le 31/01/2017 à 13:32, Remco Viëtor a écrit :

> And the part about XMP files you elided is important here, as it gives an
> alternative that still provides redundancy in cases of database corruption.

yes. xmp files are also easy to copy, but also easy to lose

>
> Also, there are arguments *against* storing all metadata in the image files,
> like privacy considerations (your privacy, and that of anyone figuring in your
> images, think geotagging, face recognition, ...) which for some can be as
> important as your arguments to make always writing metadata the default
> (and again, the discussion for me is about it being the installation
> *default*)
>

we are speaking of your own collection, here, not export...

jdd


Re: Scan for new items takes ages after syncing collection btw. two computers

Remco Viëtor
On mardi 31 janvier 2017 13:35:54 CET jdd wrote:
> Le 31/01/2017 à 13:32, Remco Viëtor a écrit :
> > And the part about XMP files you elided is important here, as it gives an
> > alternative that still provides redundancy in cases of database
> > corruption.
>
> yes. xmp files are also easy to copy, but also easy to lose

Of course. But do keep in mind that it is as an alternative to "always write
all metadata to image files" *as* *a* *default* *choice*.

Remco



Re: Scan for new items takes ages after syncing collection btw. two computers

Chris Green
In reply to this post by jdd@dodin.org
jdd <[hidden email]> wrote:
> Le 31/01/2017 à 13:32, Remco Viëtor a écrit :
>
> > And the part about XMP files you elided is important here, as it gives an
> > alternative that still provides redundancy in cases of database corruption.
>
> yes. xmp files are also easy to copy, but also easy to lose
>
Exactly, if I copy an image (or move it) I want all relevant
information to go with it and not have to remember other files that
must be copied too.
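
For anyone who does use XMP files, the "remember the other file" step can be scripted. The sketch below is in Python and rests on an assumption: that the sidecar sits next to the image and is named by appending ".xmp" to the full file name (photo.jpg -> photo.jpg.xmp). If the sidecars in a given collection are named differently, the sidecar line needs changing. The function name copy_with_sidecar is hypothetical.

```python
import os
import shutil


def copy_with_sidecar(image_path, dest_dir):
    """Copy an image and, if present, its XMP sidecar into dest_dir.

    dest_dir must already exist. Returns the list of destination paths.
    """
    copied = [shutil.copy2(image_path, dest_dir)]
    # Assumed naming convention: photo.jpg -> photo.jpg.xmp
    sidecar = image_path + ".xmp"
    if os.path.exists(sidecar):
        copied.append(shutil.copy2(sidecar, dest_dir))
    return copied
```

The returned list makes it easy to check whether a sidecar was found and copied along with the image.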


> >
> > Also, there are arguments *against* storing all metadata in the image files,
> > like privacy considerations (your privacy, and that of anyone figuring in your
> > images, think geotagging, face recognition, ...) which for some can be as
> > important as your arguments to make always writing metadata the default
> > (and again, the discussion for me is about it being the installation
> > *default*)
> >
>
> we are speaking of your own collection, here, not export...
>
Yes



Even if "save all metadata with the image" isn't the default there
should be a simple *single* box to tick somewhere to request this.  My
memory of trying to do this a while ago was that it isn't at all
obvious how to do it.

--
Chris Green

Re: Scan for new items takes ages after syncing collection btw. two computers

Remco Viëtor
On mardi 31 janvier 2017 13:12:55 CET Chris Green wrote:
>
> Even if "save all metadata with the image" isn't the default there
> should be a simple *single* box to tick somewhere to request this.  My
> memory of trying to do this a while ago was that it isn't at all
> obvious how to do it.

That's again your opinion, i.o.w. "I want it to be that way", not "good
practice prescribes that way". As there are several elements involved, I'm not
sure it's easy to get a good and safe default *and* allow what you want in one
checkbox.

It all depends on what the developers consider the most important. I didn't
pay for the program, and as far as I know, neither of us is directly involved
with Digikam development. For me that means I can request changes, but I have
no right whatsoever to prescribe to the *unpaid* volunteers of the development
team what they should do or not do (let alone prescribe how to do it).

Remco

Re: Scan for new items takes ages after syncing collection btw. two computers

news@tcrass.de
In reply to this post by Andrew Goodbody
Hi there,

sorry for the delay, been kinda off-line for a while...

> digiKam stores the path to the root of the collection in the database
> and this can look different on different systems, hence it will do a
> full rescan.

I made sure the path is the same on both machines and that digikam4.db
does actually use the path rather than the disk's UUID to identify the
album root. So with respect to the collection root, digikam should not
sense any difference during syncing.
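
For anyone wanting to repeat this check, the album root entries can be read straight from the SQLite file. The Python sketch below assumes the database has a table named AlbumRoots with columns id, label, identifier and specificPath; those names are an assumption about the schema and may differ between digikam versions. The function name album_roots is hypothetical. It is safest to run this on a copy of digikam4.db while digikam is closed.

```python
import sqlite3
from pathlib import Path


def album_roots(db_path):
    """Return (id, label, identifier, specificPath) rows from the database.

    The table and column names are assumed, not taken from documentation.
    """
    # Open read-only so the check cannot modify the database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        return con.execute(
            "SELECT id, label, identifier, specificPath FROM AlbumRoots"
        ).fetchall()
    finally:
        con.close()
```

Comparing the identifier values returned on each machine would show whether the root is recorded by path or by disk UUID.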

Cheers --

        Torsten



Re: Scan for new items takes ages after syncing collection btw. two computers

news@tcrass.de
In reply to this post by Jim Gomi
Jim,

> It's much safer to configure digikam to write all tags etc to the image
> files, and then just sync the image files.

do you know if it's possible to teach digikam to store comments/remarks
as metadata within the image file? 'Cause image comments are the reason
why I'm -- so far -- syncing the database along with the images.

Cheers --

        Torsten

