[Digikam-devel] [Bug 125736] New: Uniquely identifying each image in a collection of images

Duncan Hill-7
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
         
http://bugs.kde.org/show_bug.cgi?id=125736         
           Summary: Uniquely identifying each image in a collection of
                    images
           Product: digikam
           Version: unspecified
          Platform: unspecified
        OS/Version: Linux
            Status: UNCONFIRMED
          Severity: wishlist
          Priority: NOR
         Component: general
        AssignedTo: digikam-devel kde org
        ReportedBy: kdebugs nacnud force9 co uk


Version:           0.9 SVN (using KDE KDE 3.5.2)
Compiler:          gcc 4.0.2 prerelease
OS:                Linux

With all versions of digiKam up to 0.9 SVN, there is no way to determine whether a photo is unique in a collection of albums, other than by name (and perhaps by size in combination with name).

I propose that digiKam store a checksum of each image in the DB table for images.  This checksum can be generated in a 'lazy' manner (background), or 'non-lazy' manner (foreground, holding focus).
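
As a rough illustration of the 'lazy' generation described above, here is a minimal Python sketch. It assumes MD5 as the digest (the proposal does not fix an algorithm at this point), and the function names `file_md5` and `lazy_checksums` are hypothetical, not part of digiKam.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read() until it returns b"" (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lazy_checksums(paths):
    """Yield (path, digest) pairs one at a time.

    Because this is a generator, a background job can pull one result,
    store it, and stop or resume at any point (the 'lazy' mode).
    """
    for path in paths:
        yield path, file_md5(path)
```

A foreground ('non-lazy') run would simply exhaust the generator in one go, for example with `dict(lazy_checksums(paths))`.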

Once the checksum has been generated, several things become possible.

1) Parent-child relationships (sort of like what #103350 discusses), and lineage of a photo.
2) Trivial duplicate finding.  Whether the user wants to -remove- duplicates is another issue.
3) Externally moved images don't lose meta-data in the DB.
4) Album/collection split and merge.

To clarify those points a bit:
1) Once each file is identified with a unique digest, it should be possible to link (automatically and manually) the derived (edited) images to the original image.  There is an edge case of an edited photo being a collage of multiple parents, and I'm sure this can be handled fairly easily with the right table design.

To store the parent-child relationships, a simple two-column table is needed.  Parent on the left, child on the right.
[  P   |  C   ]
---------------
[ 1234 | 2345 ]
[ 1234 | 9876 ]
[ 9876 | 1010 ]
[ 1010 | 1011 ]

1234 is the parent of 2345 and 9876.  9876 is the parent of 1010, and 1010 is the parent of 1011.  This makes 1234 the great-grandparent of 1011, and this can be displayed in any number of ways, including a radial display or a tree display.
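
The two-column table above could be queried for lineage along these lines. This is only a sketch: the table name `lineage`, its column names and the helper functions are assumptions made for illustration, and the short numbers stand in for whatever digest would really be stored.

```python
import sqlite3


def make_lineage_db():
    """Create an in-memory table holding the example rows from the table above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lineage ("
        " parent INTEGER NOT NULL,"
        " child INTEGER NOT NULL,"
        " PRIMARY KEY (parent, child))"
    )
    conn.executemany(
        "INSERT INTO lineage VALUES (?, ?)",
        [(1234, 2345), (1234, 9876), (9876, 1010), (1010, 1011)],
    )
    return conn


def ancestors(conn, image_id):
    """Return {ancestor: generations above image_id}.

    All parents of a node are followed, so a collage with several
    parents is handled by the same loop.
    """
    found = {}
    frontier = [image_id]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for node in frontier:
            rows = conn.execute(
                "SELECT parent FROM lineage WHERE child = ?", (node,)
            )
            for (parent,) in rows:
                if parent not in found:  # also guards against accidental cycles
                    found[parent] = depth
                    next_frontier.append(parent)
        frontier = next_frontier
    return found
```

With the example rows, `ancestors(make_lineage_db(), 1011)` gives 1010 at depth 1, 9876 at depth 2 and 1234 at depth 3, which matches the great-grandparent relationship described above.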

2) Duplicate finding, as it stands, appears to be based on name.  This doesn't work well when you reset the internal numbering system of a digital camera.  With digest checksums in place, the search is a simple select that groups the images by hash and keeps every hash that occurs more than once.  The user can then see each duplicate photo, and hopefully the album that the photo is in (essentially, make it part of the search interface as a pre-built search).
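
A sketch of that select, using SQLite from Python. The table `images` and its columns `path` and `checksum` are hypothetical and are not meant to reflect digiKam's real schema.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical images table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (path TEXT PRIMARY KEY, checksum TEXT NOT NULL)"
    )
    return conn


def find_duplicates(conn):
    """Return {checksum: [paths]} for every checksum stored more than once."""
    query = """
        SELECT checksum, path FROM images
        WHERE checksum IN (
            SELECT checksum FROM images
            GROUP BY checksum
            HAVING COUNT(*) > 1
        )
        ORDER BY checksum, path
    """
    groups = {}
    for checksum, path in conn.execute(query):
        groups.setdefault(checksum, []).append(path)
    return groups
```

Returning the paths alongside each checksum is what would let a search view show the album each duplicate lives in.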

3) Right now, if an image is moved externally to digiKam (but within the digiKam albums tree), all meta-data is lost, and the user is probably a tad frustrated.  With checksums for every image, it becomes a case of 'Does this checksum already exist?'  If it does, present the options of:
* Move image back to original location
* Keep image in new/current location
Meta-data is intact, and the user is happy.
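
The 'Does this checksum already exist?' check could look roughly like the sketch below. It assumes the database can be read as a mapping from checksum to stored path and that a rescan yields a mapping from path to checksum; the function name `reconcile` and both data shapes are illustrative assumptions.

```python
def reconcile(db_records, scanned_files):
    """Compare stored checksums with a fresh scan of the album tree.

    db_records:    {checksum: path as stored in the database}
    scanned_files: {path found on disk: checksum}

    Returns (moved, new):
      moved is a list of (checksum, old_path, new_path) for files whose
            checksum is known but whose stored path no longer exists;
      new   is a list of paths whose checksum is not in the database.
    A known checksum at a second path while the stored path still exists
    is a duplicate copy rather than a move, so it is reported in neither list.
    """
    on_disk = set(scanned_files)
    moved, new = [], []
    for path, checksum in sorted(scanned_files.items()):
        old_path = db_records.get(checksum)
        if old_path is None:
            new.append(path)
        elif old_path != path and old_path not in on_disk:
            moved.append((checksum, old_path, path))
    return moved, new
```

Each entry in `moved` is a point where the user could be offered the two options above, with the meta-data kept attached to the checksum in either case.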

4) Recently, on the users list, the issue of backing up and porting collections between two computers was discussed.  I think that the unique checksum concept can help here, but I'm not quite sure how yet.  It should at least help with finding duplicate imported items, as in point 2.
_______________________________________________
Digikam-devel mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-devel

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Gilles Caulier
caulier.gilles free fr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|general                     |Searches

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Bugzilla from gilles@vonet.lu
In reply to this post by Duncan Hill-7
------- Additional Comments From gilles vonet lu  2006-08-12 17:02 -------
ad 3: see also wish #110066

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Bugzilla from owner@bugs.kde.org
In reply to this post by Duncan Hill-7
caulier.gilles kdemail net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kde elfstone de



------- Additional Comments From caulier.gilles kdemail net  2006-12-12 12:51 -------
*** Bug 110066 has been marked as a duplicate of this bug. ***

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Fabien-5
In reply to this post by Duncan Hill-7
------- Additional Comments From fabien.ubuntu gmail com  2006-12-19 12:34 -------
I think an md5sum won't be very easy to manage: if metadata is saved inside the picture (IPTC/EXIF), the file's md5sum will change each time you modify a comment, tag or rating.
The best would be to hash only the pixel data, but I don't know whether that is easy, or even possible.

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Gilles Caulier-2
In reply to this post by Duncan Hill-7
------- Additional Comments From caulier.gilles kdemail net  2006-12-19 12:38 -------
Fabien,

Well, if metadata is changed in the image file, we just need to update the MD5 stored in the database at the same time.

Gilles

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Duncan Hill-7
In reply to this post by Duncan Hill-7
------- Additional Comments From kdebugs nacnud force9 co uk  2006-12-19 13:45 -------
I had to de-dupe my 70 GB archive of photos the other day, so I've got the technique sorted for finding duplicates.  Now to find the time to code it!  Started looking at using the KDE MD5 framework to execute MD5 in the io-slave that loads the files into the database, but found that was the wrong spot.  Doubt I'll get to it over Christmas, but who knows what can happen while I'm at work between Christmas and New Years :)

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Fabien-5
In reply to this post by Duncan Hill-7
------- Additional Comments From fabien.ubuntu gmail com  2006-12-19 14:05 -------
Yes, but I was thinking about "Trivial duplicate finding" and "Parent-child relationships"...

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Gilles Caulier-2
In reply to this post by Duncan Hill-7
------- Additional Comments From caulier.gilles kdemail net  2006-12-19 14:45 -------
Yes, finding duplicate pictures is not really integrated in digiKam with the FindDuplicate kipi-plugin.

One possibility for finding duplicate images is to use the "Haar" algorithm, which is based on wavelet theory. It is more powerful than MD5 and is used in the ImgSeek program:

http://imgseek.cvs.sourceforge.net/imgseek/imgseek/imgSeekLib/haar.cpp?revision=1.10&view=markup

You can see a paper about Haar stuff here :

http://scien.stanford.edu/class/ee368/projects2001/dropbox/project01/method_retrieval.html

http://en.wikipedia.org/wiki/Haar_wavelet
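
To give a feel for what a Haar-based result stored per image might contain, here is a small, self-contained Python sketch of the averaging-and-differencing Haar decomposition. It is not the ImgSeek code linked above; the function names, the tiny grayscale matrices and the "keep the largest coefficients" signature are assumptions chosen for illustration.

```python
def haar_1d(values):
    """Full 1-D Haar decomposition; the length must be a power of two.

    Each pass replaces neighbouring pairs by their average and by half
    their difference, then repeats on the averages only.
    """
    out = list(values)
    length = len(out)
    while length > 1:
        half = length // 2
        averages = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        details = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:length] = averages + details
        length = half
    return out


def haar_2d(matrix):
    """Standard 2-D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in matrix]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def signature(matrix, keep=3):
    """Return (overall average, set of (row, col, is_positive)).

    The set holds the positions and signs of the `keep` largest
    non-zero detail coefficients, a compact description of the image.
    """
    coeffs = haar_2d(matrix)
    cells = [
        (abs(value), r, c, value)
        for r, row in enumerate(coeffs)
        for c, value in enumerate(row)
        if (r, c) != (0, 0) and value != 0
    ]
    cells.sort(key=lambda cell: (-cell[0], cell[1], cell[2]))
    return coeffs[0][0], {(r, c, value > 0) for _, r, c, value in cells[:keep]}
```

In this sketch, adding the same amount to every pixel changes only the overall average and leaves the set of detail coefficients untouched, so two such images would still match on the set, whereas their MD5 sums would differ completely.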

In the past, before creating the FindDuplicate plugin (which is based on another algorithm, used in the ShowImg core program), I proposed integrating the "Haar" algorithm into the digiKam core and storing the result matrix for each image in the database, but Renchi was opposed because he considered it outside digiKam's scope.

I think it is different now. What do you think?

Gilles

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Bugzilla from mikmach@wp.pl
In reply to this post by Duncan Hill-7
------- Additional Comments From mikmach wp pl  2006-12-19 18:18 -------
> I think it is different now. What do you think?

Haar - I am for it (as a user ;) . I am just not sure whether digiKam is the
best place for this; maybe extragear/libs? You planned to export the exiv2
support there, didn't you?

m.

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Marcel Wiesweg
In reply to this post by Duncan Hill-7
------- Additional Comments From marcel.wiesweg gmx de  2006-12-19 18:51 -------
Amarok has collected a lot of experience with uniquely identifying files:

http://amarok.kde.org/wiki/Advanced_Tag_Features_(ATF)

is the basic technology,

http://amarok.kde.org/wiki/Dynamic_Collection

adds support for removable media (CD, USB stick, NFS)

[Digikam-devel] [Bug 125736] Uniquely identifying each image in a collection of images

Fabien-5
In reply to this post by Duncan Hill-7
------- Additional Comments From fabien.ubuntu gmail com  2006-12-19 18:54 -------
I didn't really get what type of result you would get and store in the database, but it could be a base for nice features I guess.
So, I'm up for it too :)

[Bug 125736] Uniquely identifying each image in a collection of images

Bugzilla from owner@bugs.kde.org
In reply to this post by Duncan Hill-7
hhielscher gmail com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
      everconfirmed|0                           |1



------- Additional Comments From hhielscher gmail com  2007-04-13 23:49 -------
*** This bug has been confirmed by popular vote. ***

[Bug 125736] Uniquely identifying each image in a collection of images

Bugzilla from andi.clemens@gmx.net
In reply to this post by Duncan Hill-7
------- Additional Comments From andi.clemens gmx net  2007-06-05 12:55 -------
I'm new to digikam so I didn't know it saves only the filenames in the db, but today I had to find it out by accident :(
I renamed some folders and files with krename because the rename dialog from the kipi plugins isn't that powerful. After renaming the files I started digikam and all my tags were wrong!
For example:
A = "2007_jan_birthday_001.jpg"
B = "2007_jan_birthday_101.jpg"
A was renamed to B. After that, B had all the tags of the file previously named B, and A had no tags at all...
It would be very helpful to save md5sums in the database so this would not happen. I used kphotoalbum before and was used to moving my files around or renaming them with external programs. But now that isn't possible anymore.

Re: [Bug 125736] Uniquely identifying each image in a collection of images

Gerhard Kulzer
On Tuesday 05 June 2007, Andi Clemens wrote:

> I'm new to digikam so I didn't know it saves only the filenames in
> the db, but today I had to find it out by accident :( I renamed some
> folders and files with krename because the rename dialog from the kipi
> plugins isn't that powerful. After renaming the files I started digikam and
> all my tags were wrong! For example:
> A = "2007_jan_birthday_001.jpg"
> B = "2007_jan_birthday_101.jpg"
> A was renamed to B. After that, B had all the tags of the file previously
> named B, and A had no tags at all... It would be very helpful to save
> md5sums in the database so this would not happen. I used kphotoalbum before
> and was used to moving my files around or renaming them with external
> programs. But now that isn't possible anymore.

Notwithstanding your wish for a unique ID for images: if you enable 'save tags
to IPTC' in the digiKam setup, your tags will be saved in the files and
re-imported when the files are moved outside digiKam.

Gerhard

--
Hakuna matata
http://www.gerhard.fr


[Bug 125736] Uniquely identifying each image in a collection of images

Gerhard Kulzer
In reply to this post by Duncan Hill-7
------- Additional Comments From gerhard kulzer net  2007-06-05 16:12 -------
On Tuesday 05 June 2007, Andi Clemens wrote:
[bugs.kde.org quoted mail]

Notwithstanding your wish for a unique ID for images: if you enable 'save tags
to IPTC' in the digiKam setup, your tags will be saved in the files and
re-imported when the files are moved outside digiKam.

Gerhard

[Bug 125736] Uniquely identifying each image in a collection of images

mark cox
In reply to this post by Duncan Hill-7
------- Additional Comments From markcox email com  2007-08-27 09:08 -------
I would love to have this feature, based not on Haar but on the pixel data. Also, my dream would be for this unique ID to be stored as meta-data and converted to a machine tag when I upload to Flickr; this would solve my problem of having many duplicate images on Flickr. Also, the pixel-data tagging could be optimized for JPEG by using the compressed image data!

Regards,
mark
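
The idea of hashing the compressed JPEG data rather than the whole file could be sketched as follows. This is a simplification under stated assumptions: it works on the raw bytes of a baseline-style JPEG, treats every APPn and COM segment as metadata (some APPn segments, such as Adobe APP14, also carry decoding information), and the function name `jpeg_image_data_md5` is hypothetical.

```python
import hashlib
import struct


def jpeg_image_data_md5(data):
    """MD5 of a JPEG byte string with APPn and COM segments left out.

    Segments before the first scan are walked one by one; metadata
    segments (EXIF, IPTC, comments) are skipped, while tables and frame
    headers are hashed. From the start-of-scan marker onwards, the rest
    of the stream is hashed as it is.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    digest = hashlib.md5()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xDA:  # SOS: scan header plus compressed image data
            digest.update(data[pos:])
            return digest.hexdigest()
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            raise ValueError("invalid segment length at offset %d" % pos)
        end = pos + 2 + length
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            digest.update(data[pos:end])
        pos = end
    raise ValueError("no SOS marker found")
```

A digest computed this way would stay the same when only comments, tags or ratings embedded in the file change, provided the tool writing the metadata leaves the compressed data untouched.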