https://bugs.kde.org/show_bug.cgi?id=319001
Bug ID: 319001
Summary: Smart detection whether file was been downloaded
Classification: Unclassified
Product: digikam
Version: unspecified
Platform: Ubuntu Packages
OS: Linux
Status: UNCONFIRMED
Severity: wishlist
Priority: NOR
Component: Import
Assignee: [hidden email]
Reporter: [hidden email]

From using digiKam 2.5.0 on Ubuntu 12.10 and reading the latest Git source code, I deduce that, when importing photos from a USB mass storage (UMS) device, digiKam uses the download history to display whether a file has already been downloaded.

This is inconvenient in several cases:

(1) A user has just started using digiKam: all photos are shown as "new".
(2) A user has changed the way she downloads pictures, e.g., she switched from PTP to a card reader.
(3) A user imports photos in several ways, e.g., sometimes directly from the camera, sometimes from a "holiday" laptop.

Would it be possible for digiKam to detect more intelligently whether photos have already been downloaded?

I was thinking of the following solution. First, for each photo (local or to-be-imported), compute a unique ID by reading the EXIF data. Newer cameras already add a "unique photo ID" EXIF tag. For older cameras, one may compute a unique picture ID from a combination of camera make, camera model and file name. If I understood correctly, this would also conform to digiKam's philosophy of using its database only to accelerate operations, without storing any data that could not be found in the files themselves.

Any thoughts on this? If this makes sense, I could try dedicating some time to developing the feature myself.

Reproducible: Always

Steps to Reproduce: (For example)
1. Start digiKam
2. Import some photos from a USB mass storage device
3. Exit digiKam
4. Delete its database (but not the photos it has just imported).
5. Start digiKam
6. Open the import window for the same USB mass storage device

Actual Results: All photos are marked as new

Expected Results: The photos that were downloaded in step 2 should be detected as already downloaded.

-- 
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
Digikam-devel mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-devel
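The ID scheme proposed in the report can be sketched as follows. This is a minimal illustration in standard C++, not digiKam code: `PhotoMetadata` and `photoKey` are hypothetical names, and the sketch assumes that the "unique photo ID" tag is something like the Exif ImageUniqueID tag, passed in as an empty string when the camera does not write it.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified view of the metadata mentioned in the report.
// None of these names come from digiKam.
struct PhotoMetadata {
    std::string imageUniqueId; // assumed: per-photo unique ID tag, empty if absent
    std::string make;          // camera make
    std::string model;         // camera model
    std::string fileName;      // file name as stored on the medium
};

// Builds a lookup key for "already downloaded" detection.
// Prefers the per-photo unique ID; otherwise falls back to
// make + model + file name, as suggested for older cameras.
std::string photoKey(const PhotoMetadata& m) {
    // The fallback needs at least a file name to be meaningful.
    assert(!m.fileName.empty());

    if (!m.imageUniqueId.empty()) {
        return "uid:" + m.imageUniqueId;
    }
    // '|' separates the fields so that adjacent values cannot run together.
    return "legacy:" + m.make + "|" + m.model + "|" + m.fileName;
}
```

Because the fallback key includes the file name, it stops matching once a file is renamed, so it is only a best-effort guess for cameras that do not write a unique ID.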
https://bugs.kde.org/show_bug.cgi?id=319001
--- Comment #1 from Marcel Wiesweg <[hidden email]> ---
The problem you encounter sooner or later (with gphoto cameras sooner than with UMS cameras) is that the time you need to compute the hash, by accessing the Exif data, will be disproportional to the gained functionality.

Regarding the use of make, model and name, let's have a look at the DownloadHistory database header file:

    /**
     * Queries the status of a download item that is uniquely described by the four parameters.
     * The identifier is recommended to be an MD5 hash of properties describing the camera,
     * if available, and the directory path (though you are free to use all four parameters as you want)
     */
    static Status status(const QString& identifier, const QString& name, qlonglong fileSize, const QDateTime& date);

For me all points are very minor problems, yes we could make wild guesses that pictures on the camera were already downloaded based on some parameters, yet file name is not useful as there can be renames, file size is not useful as metadata can have been edited, date alone is by far too weak.
https://bugs.kde.org/show_bug.cgi?id=319001
--- Comment #2 from Cristian Klein <[hidden email]> ---
Hi Marcel,

Let me address your comments inline.

On 2013-04-28 18:46, Marcel Wiesweg wrote:
> The problem you encounter sooner or later (with gphoto cameras sooner than with
> UMS cameras) is that the time you need to compute the hash, by accessing the
> Exif data, will be disproportional to the gained functionality.

I'm not sure I agree with this. When importing photos through UMS, the user is presented with a preview of each photo, so very likely digiKam has already read the EXIF tag. Even if digiKam does not read the EXIF tag for some reason (e.g., it uses seek), the kernel caches whole disk blocks (usually 4 KB in size), so reading the EXIF tag would have a minimal performance impact. I have already presented several use cases in which smart "already-downloaded" detection would help, so I don't find the cost disproportional.

I'm not sure what the performance impact would be for gphoto cameras. Isn't the EXIF metadata read anyway as part of the image preview?

> Regarding the
> use of make, model and name, let's have a look at the DownloadHistory database
> header file:
> /**
>  * Queries the status of a download item that is uniquely described by the
> four parameters.
>  * The identifier is recommended to be an MD5 hash of properties describing
> the camera,
>  * if available, and the directory path (though you are free to use all
> four parameters as you want)
>  */
> static Status status(const QString& identifier, const QString& name,
> qlonglong fileSize, const QDateTime& date);

For UMS, "identifier" depends on the media ID and not on the photo metadata. Therefore, if I receive the same photo through two sources, DownloadHistory will incorrectly mark the photo as not previously downloaded. For me, this is cumbersome.
> For me all points are very minor problems, yes we could make wild guesses that
> pictures on the camera were already downloaded based on some parameters, yet
> file name is not useful as there can be renames, file size is not useful as
> metadata can have been edited, date alone is by far too weak.

I agree that for legacy cameras, this might be difficult. However, as I wrote, newer cameras include a "unique photo ID" (something like a UUID) in the EXIF tags of each photo. Users might already have access to such cameras (I do), so why not take advantage of it?
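The two-source case raised in this comment can be illustrated with a small sketch. It assumes a simplified history keyed on the same four fields as `status()`; `HistoryEntry`, `seenBySource`, `seenByPhotoId` and `demonstrateTwoSources` are hypothetical names in standard C++ (no Qt), and the media identifiers and photo ID are made-up sample values.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>

// Hypothetical stand-in for one download-history row; not digiKam's real schema.
struct HistoryEntry {
    std::string identifier; // derived from the medium or camera, not from the photo
    std::string name;       // file name
    long long   fileSize;   // size in bytes
    std::string date;       // timestamp, kept as text for brevity

    // Orders entries by all four fields so they can be stored in a std::set.
    bool operator<(const HistoryEntry& o) const {
        return std::tie(identifier, name, fileSize, date)
             < std::tie(o.identifier, o.name, o.fileSize, o.date);
    }
};

// Source-keyed lookup: all four fields, including the medium identifier, must match.
bool seenBySource(const std::set<HistoryEntry>& history, const HistoryEntry& e) {
    return history.count(e) > 0;
}

// Photo-keyed lookup: only the per-photo ID has to match.
bool seenByPhotoId(const std::set<std::string>& ids, const std::string& id) {
    return ids.count(id) > 0;
}

// Same photo, first imported from a card reader, then offered from a laptop copy.
void demonstrateTwoSources() {
    std::set<HistoryEntry> bySource;
    std::set<std::string>  byPhoto;

    HistoryEntry fromCard{"media-card", "IMG_0001.JPG", 2048, "2013-04-28T10:00:00"};
    bySource.insert(fromCard);
    byPhoto.insert("uid:ABC123");

    HistoryEntry fromLaptop = fromCard;
    fromLaptop.identifier = "media-laptop"; // only the medium differs

    assert(!seenBySource(bySource, fromLaptop));  // source key: reported as new
    assert(seenByPhotoId(byPhoto, "uid:ABC123")); // photo key: recognised
}
```

The photo-keyed lookup only helps when the camera actually writes such an ID, which is why a fallback to the existing scheme would still be needed.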
https://bugs.kde.org/show_bug.cgi?id=319001
Teemu Rytilahti <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #3 from Teemu Rytilahti <[hidden email]> ---
I think EXIF is already read for all the photos, at least partially at some point, so this could be possible. If there's wide support for this, we could use that as a hash and fall back to our current calculation. Nevertheless, I wasn't able to find any photos in my collection with anything this unique. Do you have some samples?
https://bugs.kde.org/show_bug.cgi?id=319001
Gilles Caulier <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
            Summary|Smart detection whether     |Smart detection whether
                   |file was been downloaded    |file was been already
                   |                            |downloaded
https://bugs.kde.org/show_bug.cgi?id=319001
[hidden email] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unspecified                 |4.14.0