Difference between collection types


Difference between collection types

Paul Waldo
Hi all,

I have a somewhat complicated Album Collection setup that I am trying to figure out.  I'm wondering what are the differences between Local, Removable Media and Network Share collections.  It is my understanding that any collection not residing on the local drive must be mounted before using it, so I don't see what the distinction is.  Thanks for any advice!

Paul

_______________________________________________
Digikam-users mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-users

Re: Difference between collection types

Marcel Wiesweg
> Hi all,
>
> I have a somewhat complicated Album Collection setup that I am trying to
> figure out. I'm wondering what are the differences between Local, Removable
> Media and Network Share collections. It is my understanding that any
> collection not residing on the local drive must be mounted before using it,
> so I don't see what the distinction is. Thanks for any advice!

For removable devices we have a good API (Solid) which works under Linux (it
is not, or only partially, implemented under Windows) and tells us whether a
given device is connected or not.
For network collections, be it Samba, NFS or whatever, we have no such API at
all, so we don't know if a given mount path is empty because it's empty or
because the collection is not mounted. Therefore, for network collections, the
algorithm is: if the directory is empty, it is treated as not mounted.
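
[Editor's sketch] The empty-directory heuristic described above can be illustrated roughly as follows. This is not digiKam's code; the function name and the decision to treat a missing path as "not mounted" are assumptions made for illustration.

```python
import os


def network_collection_available(mount_path):
    """Heuristic from the post: an empty mount point counts as 'not mounted'.

    Hypothetical helper, not part of digiKam. A missing or unreadable
    path is also reported as unavailable (an assumption of this sketch).
    """
    try:
        with os.scandir(mount_path) as entries:
            # Available as soon as at least one directory entry exists.
            return any(True for _ in entries)
    except OSError:
        return False
```

Note the limitation Marcel points out: a share that is mounted but genuinely empty cannot be told apart from an unmounted one with this check.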

Marcel


>
> Paul



Re: Difference between collection types

Paul Waldo
In reply to this post by Paul Waldo
Hi Marcel, thanks for the reply.  

So, as I understand it, Digikam  
 * can say with certainty that a Removable Media is attached
 * uses some guesswork to try and determine whether a Network Share is attached
 * always assumes that a Local Collection is available

So what happens if Digikam determines that a non-local resource is *not* available?  There must be different behavior based on the answer.  In previous versions of Digikam, the images would be removed from the database if they were not present.  Does Digikam keep them in the database then just work with thumbnails?

The reason I ask is that I have a complex setup and want to try to make it as transparent as possible.  Here is what I have:
 * images straight out of the camera stored on a NAS samba share.  The digikam database is stored here.
 * post-processed images stored on another share of the same NAS.
 * images being worked on stored on a local drive

I do most of my work on a laptop, which connects and disconnects from my network frequently.  What I would like to do is have three albums:
 * "Camera" for the digital negatives (raw files)
 * "Developed" for the images that have been post-processed
 * "Working" for the files local to my laptop that are in active development.

I envision the workflow looking something like this:
 1. Plug laptop into network
 2. KDE knows I'm connected to my network and makes available the two NAS samba shares
 3. Digikam is able to access all three albums
 4. Use Digikam to copy an image from either "Camera" or "Developed" to "Working".
 5. Disconnect from my network
 6. KDE sees the disconnect and removes the NAS shares from what it knows about
 7. Take the laptop to a nice quiet place to do some image processing.
 8. Re-attach to my network and copy the image from "Working" album to "Developed" album


Do you have any tips for accomplishing this kind of setup?  Can KDE mount and umount shares based on the network connection?  How would I handle the digikam database?  This would be a dream setup if it can be accomplished!  Thanks!

Paul

Re: Difference between collection types

Marcel Wiesweg
> So, as I understand it, Digikam
>  * can say with certainty that a Removable Media is attached
>  * uses some guesswork to try and determine whether a Network Share is
> attached
>  * always assumes that a Local Collection is available

Yes that's right

>
> So what happens if Digikam determines that a non-local resource is *not*
> available?  There must be different behavior based on the answer.  In
> previous versions of Digikam, the images would be removed from the database
> if they were not present.  Does Digikam keep them in the database then just
> work with thumbnails?

Yes. It just keeps them in the database. In 0.10 it hides them. For 1.0 we
plan to keep thumbnails accessible so that you can work with absent photos,
without image preview or editing of course.

>
> The reason I ask is that I have a complex setup and want to try to make it
> as transparent as possible.  Here is what I have:
>  * images straight out of the camera stored on a NAS samba share.  The
> digikam database is stored here.
>  * post-processed images stored on another share of the same NAS.
>  * images being worked on stored on a local drive
>
> I do most of my work on a laptop, which connects and disconnects from my
> network frequently.  What I would like to do is have three albums:
>  * "Camera" for the digital negatives (raw files)
>  * "Developed" for the images that have been post-processed
>  * "Working" for the files local to my laptop that are in active
> development.
>
> I envision the workflow looking something like this:
>  1. Plug laptop into network

>  2. KDE knows I'm connected to my network and makes available the two NAS
> samba shares
>  6. KDE sees the disconnect and removes the NAS shares from what it knows
> about

This is problematic. There is technology like Avahi and UPnP to make this
possible. I am sure it is possible to detect whether a somehow identified Samba
share is mounted. But there is no common API like the one Solid provides for
hardware. I really miss that. I have heard about some work on network
awareness using Avahi but I don't think there has been substantial progress.

>  3. Digikam is able to access all three albums
>  4. Use Digikam to copy an image from either "Camera" or "Developed" to
> "Working".
>  5. Disconnect from my network
>  7. Take the laptop to a nice quiet place to do some image processing.
>  8. Re-attach to my network and copy the image from "Working" album to
> "Developed" album
>
>
> Do you have any tips for accomplishing this kind of setup?  Can KDE mount
> and umount shares based on the network connection?  How would I handle the
> digikam database?  

The database is a different issue. If it is stored on the NAS you cannot access
it from your laptop while disconnected. So store it on the laptop if you want
to access it offline. If other people also work on the pictures, things get
complicated when working offline. For now, there is no solution to this but the
social one (talking to each other).



Re: Difference between collection types

Paul Waldo
In reply to this post by Paul Waldo
OK, so I have a plan :-)  Here is what I have done:

 * Mount NAS camera share at /mnt/camera *and* /mnt/camera1
 * Mount NAS developed share at /mnt/developed
 * Local pictures are stored in ~/Pictures

I told Digikam that I have a Local collection at ~/Pictures, a network share at /mnt/camera and a network share at /mnt/developed.  

I told digikam that the database is stored at /mnt/camera1.  Note that this is where I have always stored the DB, on the NAS.  So the database knows about the raw files, the post-processed files, and the local files.  So you are either asking why I have a /mnt/camera and a /mnt/camera1, or you are asking how I handle the disconnected-from-the-network scenario.  The key is that when I am disconnected from the network (no mount on camera1) and only working on the local image store, Digikam will find a database at that directory that will only contain info about the local files.  When camera1 is mounted, that file is hidden by the mount and replaced by the NAS share.

Pretty darn complicated, but it seems to work.  The only problem now is converting my tags.  I still have the same albums, but they have moved around on the filesystem.  In other words, an album was at /a/b/c, and is now at /d/e/f.  Is there any way to get Digikam to move the tags to the new location?

Paul



Re: Difference between collection types

gerlos
On Sunday 07 June 2009 21:04:39, Paul Waldo wrote:
> Pretty darn complicated, but it seems to work.  The only problem now is converting my tags.  I still have the same albums, but they have moved around on the filesystem.  In other words, an album was at /a/b/c, and is now at /d/e/f.  Is there any way to get Digikam to move the tags to the new location?

If /a/b/c and /d/e/f were seen in digikam as two albums, when you are inside digikam and move images between the two albums (directories), the tags will be preserved.

Another way is to tell digikam to store metadata inside the images. This way you can move them around without losing anything, but while it works OK for PNG and JPG images, I think it won't work for RAW files.

regards
gerlos


--
"Life is pretty simple: You do some stuff. Most fails. Some works. You do more
of what works. If it works big, others quickly copy it. Then you do something
else. The trick is the doing something else."
           < http://gerlos.altervista.org >
 gerlos  +- - - >  gnu/linux registred user #311588

Re: Difference between collection types

Paul Waldo

----- "gerlos" <[hidden email]> wrote:

> On Sunday 07 June 2009 21:04:39, Paul Waldo wrote:
> > Pretty darn complicated, but it seems to work.  The only problem now
> is converting my tags.  I still have the same albums, but they have
> moved around on the filesystem.  In other words, an album was at
> /a/b/c, and is now at /d/e/f.  Is there any way to get Digikam to move
> the tags to the new location?
>
> If /a/b/c and /d/e/f were seen in digikam as two albums, when you are
> inside digikam and move images between the two albums (directories),
> the tags will be preserved.

Unfortunately, both paths point to the same network share :-(
>
> Another way is to tell digikam to store metadata inside the images.
> This way you can move them around without losing anything, but while
> it works OK for PNG and JPG images, I think it won't work for RAW
> files.

And they are mostly raw files, too :-(

Re: Difference between collection types

gerlos
On Monday 08 June 2009 14:59:30, Paul Waldo wrote:

>
> ----- "gerlos" <[hidden email]> wrote:
>
> > On Sunday 07 June 2009 21:04:39, Paul Waldo wrote:
> > > Pretty darn complicated, but it seems to work.  The only problem now
> > is converting my tags.  I still have the same albums, but they have
> > moved around on the filesystem.  In other words, an album was at
> > /a/b/c, and is now at /d/e/f.  Is there any way to get Digikam to move
> > the tags to the new location?
> >
> > If /a/b/c and /d/e/f were seen in digikam as two albums, when you are
> > inside digikam and move images between the two albums (directories),
> > the tags will be preserved.
>
> Unfortunately, both paths point to the same network share :-(

But I'm sure you can find some smart solution... Apart from symlinks and "mount -o bind /d/e/f /a/b/c", I'm thinking something like:

1. create the new album /g/h/i
2. move things from /a/b/c to /g/h/i
3. move things back from /g/h/i to /d/e/f, that before was known as /a/b/c

Maybe you have already done this, but in case you haven't: I'd suggest not putting your albums in the root of the share but in a subdirectory, which could make some things easier.

good luck
gerlos


Re: Difference between collection types

Marcel Wiesweg
In reply to this post by Paul Waldo

> Pretty darn complicated, but it seems to work.  The only problem now is
> converting my tags.  I still have the same albums, but they have moved
> around on the filesystem.  In other words, an album was at /a/b/c, and is
> now at /d/e/f.  Is there any way to get Digikam to move the tags to the new
> location?

There is a "uniqueHash" field in the database (it is not a 100% strong hash,
but it is identical for identical files) which digikam uses to see if it
already knows a file. Now if you add new albums and these contain only files
already in the database (by content), it will just copy the attributes,
including tags. That should work. If not, file a bug. There is usually output
on the console along the lines of "Recognized xy as identical to yz".
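
[Editor's sketch] To make the "not 100% strong but identical for identical files" property concrete, here is a minimal partial-file hash. The thread only says the hash is an md5 over certain file regions; which regions and how large they are is not stated, so hashing the first and last `chunk` bytes is purely an assumption of this example, as are the names `partial_hash` and `CHUNK`.

```python
import hashlib
import os

CHUNK = 100 * 1024  # hypothetical region size, not digiKam's actual value


def partial_hash(path, chunk=CHUNK):
    """MD5 over the first and last `chunk` bytes of a file (illustrative)."""
    size = os.path.getsize(path)
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        md5.update(f.read(chunk))
        if size > chunk:
            # Start the tail region after the head so no bytes are hashed twice.
            f.seek(max(chunk, size - chunk))
            md5.update(f.read(chunk))
    return md5.hexdigest()
```

Identical files always yield the same value, but two files that differ only outside the hashed regions would collide, which is why such a hash is fast yet weaker than a checksum of the whole file.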

Marcel

Re: Difference between collection types

Paul Waldo
Hi Marcel,

I am hopeful this will work.  I have added a new album that points to the same location, but no tags were transferred to the new images :-(  Hopefully rebuilding fingerprints will allow the tags to be transferred over.  Unfortunately, it is taking literally all day for 15k images hosted on a Samba share.  Ugh!

If this doesn't work, I may have to try direct database chicanery.  Not looking forward to that!

Paul

Re: Difference between collection types

Stephen-125
In reply to this post by Marcel Wiesweg
Hi All,

The only problem with Solid is that it does not seem to detect that a device
is removable if it is encrypted with either LUKS or TrueCrypt.

-- Stephen.



Re: Difference between collection types

Marcel Wiesweg
In reply to this post by Paul Waldo
> Hi Marcel,
>
> I am hopeful this will work.  I have added a new album that points to the
> same location, but no tags were transferred to the new images :-(
> Hopefully rebuilding fingerprints will allow the tags to be transferred
> over.  Unfortunately, it is taking literally all day for 15k images hosted
> on a Samba share.  Ugh!

No, this is not the fingerprints; it's an md5 hash over certain file regions
(it's fast, and you already have it).
I am surprised because that always works for me here. What is the output on
the console when an already known album is added and scanned initially? (Enable
50003 with kdebugdialog.)

digikam(18829)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/media/fotos/Digikam Sample/PPM/comment-256-0.jpg"
digikam(18829)/digikam (core) Digikam::ImageScanner::scanFromIdenticalFile: Recognized "/media/fotos/Digikam Sample/PPM/comment-256-0.jpg" as identical to item 10519

>
> If this doesn't work, I may have to try direct database chicanery.  Not
> looking forward to that!

Re: Difference between collection types

Paul Waldo
Hi Marcel,

I've imported the new album and indeed I see the messages that you speak of.  I imported the album and then let digikam sit "idle" and it looked through the whole collection and found many identical images.  The new album shows no tags though!

I'm not sure what is happening with digikam though.  It finally finished finding identical images (after 15 hours: ugh!), but I still see major network (disk) activity and digikam is still chugging along doing something.  I see lots of messages that say
"Digikam::AlbumManager::slotDirWatchDirty: KDirWatch detected change at /mnt/camera".  
Should I allow this activity to finish before expecting tags to be present in the newly imported album?  Thanks!


Re: Difference between collection types

Marcel Wiesweg
> Hi Marcel,
>
> I've imported the new album and indeed I see the messages that you speak
> of.  I imported the album and then let digikam sit "idle" and it looked
> through the whole collection and found many identical images.  The new
> album shows no tags though!

When this message appears, AlbumDB::copyImageAttributes is called immediately
afterwards (there is no way to avoid that), and it copies the tags (among all
other info). Very strange that this does not happen.
When I add a symlink to a subfolder of my collection here as a second
collection, all tags are available.
Is it possible that the database already contains a third set of entries for
identical files, and that these are not tagged?


>
> I'm not sure what is happening with digikam though.  It finally finished
> finding identical images (after 15 hours: ugh!),

It needed 15 hours to scan a collection??


> but I still see major
> network (disk) activity and digikam is still chugging along doing
> something.  I see lots of messages that say
> "Digikam::AlbumManager::slotDirWatchDirty: KDirWatch detected change at
> /mnt/camera". Should I allow this activity to finish before expecting tags
> to be present in the newly imported album?  Thanks!

KDirWatch behavior was quite fundamentally changed between KDE 4.2.2 and
4.2.3. Since then it reports single files instead of directories. (I don't want
to comment further on such changes to undocumented behavior between minor
revisions that break applications.)
This leads to an endless loop of KDirWatch triggering a collection scan, which
then again accesses the db file and triggers KDirWatch. Ignore it.

Marcel

Re: Difference between collection types

Paul Waldo
In reply to this post by Paul Waldo
Hi Marcel, comments and analysis below.

Paul
----- "Marcel Wiesweg" <[hidden email]> wrote:

> > Hi Marcel,
> >
> > I've imported the new album and indeed I see the messages that you
> speak
> > of.  I imported the album and then let digikam sit "idle" and it
> looked
> > through the whole collection and found many identical images.  The
> new
> > album shows no tags though!
>
> When this message appears AlbumDB::copyImageAttributes is called
> immediately
> afterwards, no way to avoid that, and it copies the tags (among all
> other
> info). Very strange that this does not happen.
> When I add a symlink to a subfolder of my collection here as a second
>
> collection, all tags are available.
> Is it possible that the database contains third entries to identical
> files
> already that are not tagged?

It looks like there is simply no additional tagging going on.  Let's look at one image that has tags in my original local collection (which is actually pointing to a mounted samba share).  Its name is CRW_1507.CRW.

sqlite> select * from Images where name='CRW_1507.CRW';
31341||CRW_1507.CRW|3|1|2006-06-24T17:39:28|2557936|8e3bed88d7cf91c811991e86fcf9394c
33963|3|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|8e3bed88d7cf91c811991e86fcf9394c
51007||CRW_1507.CRW|3|1|2006-06-24T17:39:28|2557936|f8e04060cbcaa34b5f7dd6618259ada4
68188|1682|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|f8e04060cbcaa34b5f7dd6618259ada4
83523|2417|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|f8e04060cbcaa34b5f7dd6618259ada4

sqlite> .schema Images
CREATE TABLE Images
 (id INTEGER PRIMARY KEY,
  album INTEGER,
  name TEXT NOT NULL,
  status INTEGER NOT NULL,
  category INTEGER NOT NULL,
[...]

Now let's see if we can look at the tags:

sqlite> .schema ImageTags
CREATE TABLE ImageTags
 (imageid INTEGER NOT NULL,
  tagid INTEGER NOT NULL,
  UNIQUE (imageid, tagid));
CREATE INDEX tag_index  ON ImageTags (tagid);
sqlite> select * from ImageTags where imageid in (31341, 33963, 51007, 68188, 83523);
31341|41
31341|54
31341|55
31341|56
33963|41
33963|54
33963|55
33963|56

So, as we can see, we have two images that have the same tags.  This is good!


Let's go back to the albums.  As you can see, this image is in the DB 5 times, but only belongs to three albums :-O  Which albums are these?
sqlite> select * from Albums where id in (3, 1682, 2417);
3|1|/2003/2003-06-29|2008-01-19|||
1682|3|/2003/2003-06-29|2008-01-19|||
2417|5|/2003/2003-06-29|2008-01-19|||
sqlite> .schema Albums
CREATE TABLE Albums
 (id INTEGER PRIMARY KEY,
  albumRoot INTEGER NOT NULL,
  relativePath TEXT NOT NULL,
  date DATE,
  caption TEXT,
  collection TEXT,
  icon INTEGER,
  UNIQUE(albumRoot, relativePath));
CREATE TRIGGER delete_album DELETE ON Albums
BEGIN
 DELETE FROM Images
   WHERE Images.album = OLD.id;
END;
sqlite> select * from AlbumRoots where id in (1, 3, 5);
1|camera|0|1|volumeid:?path=%2Fhome%2Fpaul%2FPictures%2Fcamera|/
5|Camera|0|3|networkshareid:?mountpath=%2Fmnt%2Fcamera|/
sqlite> .schema AlbumRoots
CREATE TABLE AlbumRoots
 (id INTEGER PRIMARY KEY,
  label TEXT,
  status INTEGER NOT NULL,
  type INTEGER NOT NULL,
  identifier TEXT,
  specificPath TEXT,
  UNIQUE(identifier, specificPath));
CREATE TRIGGER delete_albumroot DELETE ON AlbumRoots
BEGIN
 DELETE FROM Albums
   WHERE Albums.albumRoot = OLD.id;
END;
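
[Editor's sketch] The identifier column in the AlbumRoots rows above is percent-encoded. A small helper like the following makes the paths readable; `decode_identifier` is a hypothetical name, and the assumption that an identifier always has the form `scheme:?key=value` is taken only from the two rows shown here.

```python
from urllib.parse import parse_qs


def decode_identifier(identifier):
    """Split an AlbumRoots identifier into its scheme and decoded parameters."""
    scheme, _, query = identifier.partition(":?")
    # parse_qs percent-decodes values and returns a list per key.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return scheme, params


print(decode_identifier("volumeid:?path=%2Fhome%2Fpaul%2FPictures%2Fcamera"))
print(decode_identifier("networkshareid:?mountpath=%2Fmnt%2Fcamera"))
```

Decoded, root 1 is a volume at /home/paul/Pictures/camera and root 5 is a network share mounted at /mnt/camera.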

So, if we trace all of this back to the image, we can see that, from a tags perspective, images 33963 and 83523 should have the same tags.  However, the mapping between images and tags shows that the tagged images are 31341 and 33963.  A mismatch!  This is also the reason the tags don't show up in the newly added album.  Any ideas on where to proceed from here?
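
[Editor's sketch] The manual trace above can be collapsed into one join. The example below loads the rows quoted in this message into an in-memory SQLite database and counts tags per image. The tables are reduced stand-ins: the Images schema is truncated in the post, so only `id`, `album` and `name` are taken from it, and `uniqueHash` is the field name Marcel mentioned earlier. The helper names are hypothetical.

```python
import sqlite3

QUERY = """
SELECT i.id, r.label, a.relativePath, i.uniqueHash, COUNT(t.tagid)
FROM Images i
LEFT JOIN Albums a     ON a.id = i.album
LEFT JOIN AlbumRoots r ON r.id = a.albumRoot
LEFT JOIN ImageTags t  ON t.imageid = i.id
WHERE i.name = ?
GROUP BY i.id
ORDER BY i.id
"""


def build_sample_db():
    """In-memory copy of the rows shown in the thread (reduced columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Images (id INTEGER PRIMARY KEY, album INTEGER,
                             name TEXT NOT NULL, uniqueHash TEXT);
        CREATE TABLE Albums (id INTEGER PRIMARY KEY,
                             albumRoot INTEGER NOT NULL,
                             relativePath TEXT NOT NULL);
        CREATE TABLE AlbumRoots (id INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE ImageTags (imageid INTEGER NOT NULL,
                                tagid INTEGER NOT NULL,
                                UNIQUE (imageid, tagid));
    """)
    old = "8e3bed88d7cf91c811991e86fcf9394c"
    new = "f8e04060cbcaa34b5f7dd6618259ada4"
    conn.executemany(
        "INSERT INTO Images VALUES (?, ?, 'CRW_1507.CRW', ?)",
        [(31341, None, old), (33963, 3, old), (51007, None, new),
         (68188, 1682, new), (83523, 2417, new)])
    conn.executemany(
        "INSERT INTO Albums VALUES (?, ?, '/2003/2003-06-29')",
        [(3, 1), (1682, 3), (2417, 5)])
    conn.executemany("INSERT INTO AlbumRoots VALUES (?, ?)",
                     [(1, "camera"), (5, "Camera")])
    conn.executemany(
        "INSERT INTO ImageTags VALUES (?, ?)",
        [(img, tag) for img in (31341, 33963) for tag in (41, 54, 55, 56)])
    return conn


def tag_counts(conn, name):
    """One row per Images entry: id, root label, path, hash, number of tags."""
    return conn.execute(QUERY, (name,)).fetchall()


for row in tag_counts(build_sample_db(), "CRW_1507.CRW"):
    print(row)
```

Printing the hash next to the tag count shows at a glance that the tagged entries and the entry in the new "Camera" root carry different hash values.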

>
>
> >
> > I'm not sure what is happening with digikam though.  It finally
> finished
> > finding identical images (after 15 hours: ugh!),
>
> It needed 15 hours to scan a collection??


Yup.  15511 images stored on a NAS samba share.  Digikam running at 28% CPU on a Quad Xeon and approx 256 MB/sec constant network throughput.  Think something might be wrong?  As you can imagine, Digikam startup with DB scan is quite painful...

>
>
> > but I still see major
> > network (disk) activity and digikam is still chugging along doing
> > something.  I see lots of messages that say
> > "Digikam::AlbumManager::slotDirWatchDirty: KDirWatch detected change
> at
> > /mnt/camera". Should I allow this activity to finish before
> expecting tags
> > to be present in the newly imported album?  Thanks!
>
> KDirWatch behavior was quite fundamentally changed between KDE 4.2.2 and
> 4.2.3. Since then it reports single files instead of directories. (I don't
> want to comment further on such changes on undocumented behavior between
> minor revisions and breaking applications)
> This leads to an endless loop of KDirWatch triggering a collection scan,
> which then again accesses the db file and triggers KDirWatch. Ignore it.

Part of the 15 hours above was probably this behavior.  Based on the DB analysis above, I'm sure I'll be trying the album import again, so I'll update the numbers ;-)

>
> Marcel

Re: Difference between collection types

Marcel Wiesweg

> It looks like there is simply no additional tagging going on.  Let's look
> at one image that has tags in my original local collection (which is
> actually pointing to a mounted samba share).  Its name is CRW_1507.CRW.
>
> sqlite> select * from Images where name='CRW_1507.CRW';
> 33963|3|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|8e3bed88d7cf91c811991e86fcf9394c


> 68188|1682|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|f8e04060cbcaa34b5f7dd6618259ada4
> 83523|2417|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|f8e04060cbcaa34b5f7dd6618259ada4

Ignoring removed images, you see: the hash is different. 8e3... vs. f8e...
Is the file in /home and the file on the network storage bit by bit identical?
Please verify with the md5 or shasum utility (digikam's hash is md5 only over
parts of the file)
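
If the md5 and shasum command-line tools are not at hand, the same check can be sketched in a few lines of Python. This hashes the whole file, which is deliberately different from digikam's partial-file hash, so the digests will not match the values in the Images table; the point is only to compare the two copies with each other. The function names are illustrative.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the whole file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have identical full-file MD5 digests."""
    return file_md5(path_a) == file_md5(path_b)
```

Call same_content() with the path of the file under /home and the path of the file on the network storage.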


> > It needed 15 hours to scan a collection??
>
> Yup.  15511 images stored on a NAS samba share.  Digikam running at 28% CPU
> on a Quad Xeon and approx 256 MB/sec constant network throughput.  Think
> something might be wrong?  As you can imagine, Digikam startup with DB scan
> is quite painful...

A complete scan of 26994 pictures on 39GB, 99% JPEGs, took 12 minutes and 20
seconds in a short test while writing this mail. A normal application start
uses <5s for the scan if no files are new. That's local harddisk.
I don't know what is causing the huge performance drop over network storage.
In 15h, 3.5s/picture, you could transfer 900MB of data for every picture over
the network.
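
The arithmetic behind those figures, as a small Python sketch. All inputs are the numbers quoted in this thread (15 hours for 15511 images, 12 minutes 20 seconds for 26994 pictures, about 256 MB/sec reported throughput); the function name is only for illustration.

```python
def seconds_per_picture(total_seconds, pictures):
    """Average scan time per picture."""
    return total_seconds / pictures


# Network scan: 15 hours for 15511 images.
network = seconds_per_picture(15 * 3600, 15511)

# Local scan: 12 minutes 20 seconds for 26994 pictures.
local = seconds_per_picture(12 * 60 + 20, 26994)

# Data that could move per picture at the reported 256 MB/sec.
mb_per_picture = network * 256

if __name__ == "__main__":
    print(f"network: {network:.2f} s/picture")
    print(f"local:   {local * 1000:.1f} ms/picture")
    print(f"possible transfer: {mb_per_picture:.0f} MB/picture")
```

This gives roughly 3.5 seconds per picture over the network, against under 30 milliseconds per picture on the local disk.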



Re: Difference between collection types

Paul Waldo

----- "Marcel Wiesweg" <[hidden email]> wrote:

> Ignoring removed images, you see: the hash is different. 8e3... vs. f8e...
> Is the file in /home and the file on the network storage bit by bit
> identical?
> Please verify with the md5 or shasum utility (digikam's hash is md5 only
> over parts of the file)

Hmm, dunno how they could be different.  The two collections point to the same place, the only difference is that one is a symlink:

ls -ld /home/paul/Pictures/camera
lrwxrwxrwx 1 paul paul 12 2009-06-08 11:25 /home/paul/Pictures/camera -> /mnt/camera/
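
The same thing can be confirmed from a script. This is a minimal Python sketch with illustrative function names: it resolves symlinks and checks whether two paths end up at the same file or directory.

```python
import os


def resolved(path):
    """Return the path with all symlinks resolved."""
    return os.path.realpath(path)


def same_location(path_a, path_b):
    """True if both paths refer to the same file or directory on disk."""
    return os.path.samefile(path_a, path_b)
```

For the setup above, same_location("/home/paul/Pictures/camera", "/mnt/camera") should be true as long as the share is mounted.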

>
>
> > > It needed 15 hours to scan a collection??
> >
> > Yup.  15511 images stored on a NAS samba share.  Digikam running at 28% CPU
> > on a Quad Xeon and approx 256 MB/sec constant network throughput.  Think
> > something might be wrong?  As you can imagine, Digikam startup with DB scan
> > is quite painful...
>
> A complete scan of 26994 pictures on 39GB, 99% JPEGs, took 12 minutes and 20
> seconds in a short test while writing this mail. A normal application start
> uses <5s for the scan if no files are new. That's local harddisk.
> I don't know what is causing the huge performance drop over network storage.
> In 15h, 3.5s/picture, you could transfer 900MB of data for every picture over
> the network.

Keep in mind that most of these images are 3 or 6 MB raw files.  Also, for at least half of the 15 hours I was waiting for KDirWatch to (never) finish.  I'm going to go back to my DB backups and reimport again.


Re: Difference between collection types

Marcel Wiesweg

>
> Hmm, dunno how they could be different.  The two collections point to the
> same place, the only difference is that one is a symlink:
>
> ls -ld /home/paul/Pictures/camera
> lrwxrwxrwx 1 paul paul 12 2009-06-08 11:25 /home/paul/Pictures/camera -> /mnt/camera/

That means something would have changed in the way the hash is generated. I
don't like that. I can only think of a different exiv2 version providing
different binary metadata. I did not come across this so far.

If you set modificationDate (in the Images table) of 33963 to NULL and start
digikam - which triggers a full rescan of an image - is the hash then 8e3 or
f8e?
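
A minimal sketch of that step in Python, assuming digikam is closed and you work on a database you have backed up first. The table name Images and the column modificationDate are as named above; the function name is illustrative.

```python
import sqlite3


def force_rescan(conn, image_id):
    """Set modificationDate to NULL for one image; return rows changed."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE Images SET modificationDate = NULL WHERE id = ?",
            (image_id,),
        )
    return cursor.rowcount
```

Open the connection with sqlite3.connect() on your database file, call force_rescan(conn, 33963), close the connection, then start digikam and look at the hash of that row again.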

Marcel

Re: Difference between collection types

Paul Waldo
Hi Marcel,

The slow startup time seems to be DB related.  Just for fun, I moved the DB to a local drive.  The startup time was half or a quarter of the time with the DB on the NAS.  Also, I saw the CPU get pegged for a good bit of the time (yay!).

I have no idea how sqlite accesses a DB on what it thinks is a local file, but this seems like quite a hint to me that I need to make the DB local.  Maybe Digikam could have a setting to make DB backups in the background...?
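
Until something like that exists, a backup can be scripted outside Digikam. The sketch below is only my assumption of how it could look, not a Digikam feature: it uses the backup() method of Python's sqlite3 module to take a consistent snapshot into a local temporary file, then copies that file to the destination (for example the NAS) as a plain file copy, so SQLite itself never opens a file on the share. The function name and both paths are placeholders.

```python
import os
import shutil
import sqlite3
import tempfile


def backup_database(source_path, target_path):
    """Snapshot a local SQLite DB and copy the snapshot to target_path."""
    fd, tmp_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        src = sqlite3.connect(source_path)
        dst = sqlite3.connect(tmp_path)
        try:
            # Online backup: copies all pages into the local snapshot.
            src.backup(dst)
        finally:
            dst.close()
            src.close()
        # Plain file copy; the destination may be a network share.
        shutil.copy2(tmp_path, target_path)
    finally:
        os.remove(tmp_path)
```

Run from cron or a similar scheduler, this would keep the working DB on the local drive while a copy lives next to the images.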

Paul
----- "Marcel Wiesweg" <[hidden email]> wrote:

> A complete scan of 26994 pictures on 39GB, 99% JPEGs, took 12 minutes and 20
> seconds in a short test while writing this mail. A normal application start
> uses <5s for the scan if no files are new. That's local harddisk.
> I don't know what is causing the huge performance drop over network storage.
> In 15h, 3.5s/picture, you could transfer 900MB of data for every picture over
> the network.

Re: Difference between collection types

Gilles Caulier-4
2009/6/12 Paul Waldo <[hidden email]>:
> Hi Marcel,
>
> The slow startup time seems to be DB related.  Just for fun, I moved the DB to a local drive.  The startup time was half or a quarter of the time with the DB on the NAS.  Also, I saw the CPU get pegged for a good bit of the time (yay!).


SQLite does not support a remote DB file hosted on NFS or Samba. It's an
SQLite limitation. The digiKam setup dialog is clear about this. Read all
the tips there...

Gilles Caulier

>
> I have no idea how sqlite accesses a DB on what it thinks is a local file, but this seems like quite a hint to me that I need to make the DB local.  Maybe Digikam could have a setting to make DB backups in the background...?
>
> Paul
> ----- "Marcel Wiesweg" <[hidden email]> wrote:
>
>> A complete scan of 26994 pictures on 39GB, 99% JPEGs, took 12 minutes and 20
>> seconds in a short test while writing this mail. A normal application start
>> uses <5s for the scan if no files are new. That's local harddisk.
>> I don't know what is causing the huge performance drop over network storage.
>> In 15h, 3.5s/picture, you could transfer 900MB of data for every picture over
>> the network.