[Bug 283013] New: Accelerating writing metadata back to image files

Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013

           Summary: Accelerating writing metadata back to image files
           Product: digikam
           Version: unspecified
          Platform: Compiled Sources
        OS/Version: Linux
            Status: UNCONFIRMED
          Severity: wishlist
          Priority: NOR
         Component: Metadata
        AssignedTo: [hidden email]
        ReportedBy: [hidden email]


Version:           unspecified (using KDE 4.7.0)
OS:                Linux

I'm not sure how the files are updated when writing metadata back to images
(probably a kioslave task). My suspicion is that the whole file is written back
and not just the metadata part.
When I back up my images with rsync (using the Luckybackup GUI) after I've modified
metadata with digikam, the update seems much quicker than in digikam. And with
rsync it is clear that only a small part of every file is being rewritten
(example log:    32.77K   1%   71.75kB/s    0:01:31)

If my understanding is correct, it should be relatively easy to do the same
with digikam, just using rsync?




Reproducible: Always

Steps to Reproduce:
Update some metadata in digikam, then back up the same data using rsync.

Actual Results:  
rsync update is much faster.

Expected Results:  
accelerate writing metadata back to image files

--
Configure bugmail: https://bugs.kde.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
_______________________________________________
Digikam-devel mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-devel

[Bug 283013] Accelerating writing metadata back to image files

Marcel Wiesweg
https://bugs.kde.org/show_bug.cgi?id=283013





--- Comment #1 from Marcel Wiesweg <marcel wiesweg gmx de>  2011-09-29 18:20:10 ---
This is entirely a matter for Exiv2. I don't know any of the details of
metadata writing.


[Bug 283013] Accelerating writing metadata back to image files

Gilles Caulier-4
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013


Gilles Caulier <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
            Version|unspecified                 |2.1.1




--- Comment #2 from Gilles Caulier <caulier gilles gmail com>  2011-09-29 20:48:27 ---
Andreas Huggel from Exiv2 is in CC and can give more details.

Gilles Caulier


[Bug 283013] Accelerating writing metadata back to image files

Bugzilla from ahuggel@gmx.net
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013





--- Comment #3 from Andreas Huggel <ahuggel gmx net>  2011-09-30 06:13:52 ---
The Exiv2 write logic is optimized based on the image format and the kind of
changes to the metadata. There are two classes of image formats: TIFF-like
images, where the metadata is not in a specific portion of the image but
potentially spread over the entire image (image == metadata), and images which
keep the metadata in a specific portion of the file (e.g., JPEG, PNG). Changes
are classified as either "intrusive" or "non-intrusive". If any metadata tag
is added or deleted, or an existing metadata field is extended, the change
is intrusive and requires Exiv2 to re-serialize the entire metadata structure.
If an existing field is changed and its size is not extended (it can shrink),
then Exiv2 makes the change in-place, without rewriting the entire metadata.
This has the considerable advantage that the TIFF structure stays intact, even
if Exiv2 can't parse it. A typical example of a non-intrusive change is
changing the Exif date/time of an image.
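
The classification rule above can be sketched roughly as follows. This is
only an illustration in Python, not Exiv2's actual code: the function name
is_intrusive and the representation of metadata as a dict of tag name to raw
bytes are assumptions made for the example.

```python
def is_intrusive(old: dict, new: dict) -> bool:
    """Return True if going from `old` to `new` metadata is an intrusive change.

    Both arguments map a tag name to its raw value as bytes (an assumed,
    simplified model of a metadata structure).
    """
    # Any added or deleted tag makes the change intrusive.
    if old.keys() != new.keys():
        return True
    # An existing field may change or shrink, but must not grow.
    return any(len(new[tag]) > len(old[tag]) for tag in old)


# Changing the date/time keeps the field size, so it is non-intrusive.
before = {"Exif.Image.DateTime": b"2011:09:29 18:20:10"}
after = {"Exif.Image.DateTime": b"2011:09:30 06:13:52"}
print(is_intrusive(before, after))
```

Adding a new tag, or making an existing value longer, would make the same
function return True.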

Writing works as follows:

                 intrusive    non-intrusive
                 ------------ -------------
TIFF-like      : copy         mmap
Metadata block : copy         copy*

"copy" means the file is re-written and re-named (its size changes)
"mmap" means the file is changed in-place (the file size remains the same)

* In this case, the metadata structure is changed in-place but the file is
copied and in the process, the new metadata block is inserted.
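
To make the difference between "copy" and "mmap" concrete, here is a minimal
Python sketch of the two strategies from the table. It is not how Exiv2
implements them; the function names, the byte offsets passed in, and the
temporary-file handling are all assumptions for illustration.

```python
import mmap
import os
import shutil
import tempfile


def choose_strategy(tiff_like: bool, intrusive: bool) -> str:
    """Mirror the table above: only TIFF-like + non-intrusive uses mmap."""
    return "mmap" if tiff_like and not intrusive else "copy"


def write_in_place(path: str, offset: int, payload: bytes) -> None:
    """'mmap': overwrite bytes inside the existing file; its size stays the same."""
    with open(path, "r+b") as handle, mmap.mmap(handle.fileno(), 0) as mapped:
        mapped[offset:offset + len(payload)] = payload
        mapped.flush()


def write_by_copy(path: str, block_start: int, block_end: int, new_block: bytes) -> None:
    """'copy': write a new file with the metadata block replaced, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as target, open(path, "rb") as source:
        target.write(source.read(block_start))   # bytes before the metadata block
        target.write(new_block)                  # new block, possibly a different size
        source.seek(block_end)
        shutil.copyfileobj(source, target)       # the rest of the image, copied in full
    os.replace(temp_path, path)                  # rename over the original file
```

The copy path reads and writes every byte of the image even when only a few
bytes of metadata differ, which is consistent with the slowdown described in
the original report.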

The only further optimization I can see is that in the case of images with a
metadata block and non-intrusive changes, it would be possible to change the
entire file in-place rather than only the metadata block.

For additional considerations (memory related), see
http://dev.exiv2.org/issues/617

How does rsync work? Does it really operate on portions of a file (not only
modified files + compression)?

-ahu.


[Bug 283013] Accelerating writing metadata back to image files

Bugzilla from gerhardk@gmx.ch
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013





--- Comment #4 from Gerhard Kulzer <gerhardk gmx ch>  2011-09-30 08:05:03 ---
First, thank you very much Andreas for this detailed explanation; it's worth
remembering.

Concerning the rsync mechanisms, I found this description on the Wikipedia page
for rsync:


"The rsync utility uses an algorithm invented by the Australian computer
programmer Andrew Tridgell for efficiently transmitting a structure (such as a
file) across a communications link when the receiving computer already has a
similar, but not identical, version of the same structure.

The recipient splits its copy of the file into fixed-size non-overlapping
chunks and computes two checksums for each chunk: the MD4 hash, and a weaker
'rolling checksum'. (Version 30 of the protocol, released with rsync version
3.0.0, now uses MD5 hashes rather than MD4.[14]) It sends these checksums to
the sender.

The sender computes the rolling checksum for every chunk of size S in its own
version of the file, even overlapping chunks. This can be calculated
efficiently because of a special property of the rolling checksum: if the
rolling checksum of bytes n through n + S − 1 is R, the rolling checksum of
bytes n + 1 through n + S can be computed from R, byte n, and byte n + S
without having to examine the intervening bytes. Thus, if one had already
calculated the rolling checksum of bytes 1–25, one could calculate the rolling
checksum of bytes 2–26 solely from the previous checksum, and from bytes 1 and
26.

The rolling checksum used in rsync is based on Mark Adler's adler-32 checksum,
which is used in zlib, and is itself based on Fletcher's checksum.

The sender then compares its rolling checksums with the set sent by the
recipient to determine if any matches exist. If they do, it verifies the match
by computing the hash for the matching block and by comparing it with the hash
for that block sent by the recipient.

The sender then sends the recipient those parts of its file that did not match
the recipient's blocks, along with information on where to merge these blocks
into the recipient's version. This makes the copies identical."
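
The rolling property described in the quote can be sketched in Python as
follows. This is an Adler-32-style weak checksum written for illustration
only; the modulus and the exact formula are assumptions and are not claimed
to match rsync's implementation.

```python
M = 1 << 16  # modulus for both sums; an assumption for this sketch


def weak_checksum(window: bytes) -> tuple:
    """Compute the two running sums (a, b) over a whole window from scratch."""
    size = len(window)
    a = sum(window) % M
    b = sum((size - i) * byte for i, byte in enumerate(window)) % M
    return a, b


def roll(a: int, b: int, size: int, outgoing: int, incoming: int) -> tuple:
    """Slide the window by one byte using only the old sums and two bytes."""
    a = (a - outgoing + incoming) % M
    b = (b - size * outgoing + a) % M
    return a, b


data = bytes(range(256)) * 4
S = 25
a, b = weak_checksum(data[0:S])
# Checksum of bytes 1..25 derived from the checksum of bytes 0..24.
a, b = roll(a, b, S, data[0], data[S])
print((a, b) == weak_checksum(data[1:S + 1]))
```

Each call to roll does a constant amount of work regardless of the window
size, which is what makes checking every overlapping chunk affordable.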

There is a longish but nice interview with Andrew Tridgell, the creator of
rsync here: http://oceanpark.com/webmuseum/rsync.html

So it works on blocks, which seem to be chunks of 500-1000 bytes (as I read in
various sources). Anyway, judging from the logs I get from rsyncing, the
changed portion is usually less than 1% of an image, and that may of course
span several blocks.
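
To illustrate why so little data needs to be transferred, here is a
simplified Python sketch of block matching. It is not rsync itself: the block
size of 700 bytes, the function names, and the synthetic "image" are
assumptions, and the weak checksum is recomputed at each offset with
zlib.adler32 instead of being rolled.

```python
import hashlib
import random
import zlib

BLOCK = 700  # assumed block size, inside the 500-1000 byte range mentioned above


def signatures(old, block=BLOCK):
    """Receiver side: map weak checksum -> [(strong hash, block index), ...]."""
    table = {}
    for index in range(len(old) // block):
        chunk = old[index * block:(index + 1) * block]
        entry = (hashlib.md5(chunk).digest(), index)
        table.setdefault(zlib.adler32(chunk), []).append(entry)
    return table


def delta(new, table, block=BLOCK):
    """Sender side: list of ("copy", block index) and ("literal", bytes) steps."""
    ops, literal, pos = [], bytearray(), 0
    while pos + block <= len(new):
        window = new[pos:pos + block]
        match = None
        for strong, index in table.get(zlib.adler32(window), []):
            if hashlib.md5(window).digest() == strong:  # confirm with the strong hash
                match = index
                break
        if match is None:
            literal.append(new[pos])  # no known block starts here; send the byte
            pos += 1
        else:
            if literal:
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            pos += block
    literal.extend(new[pos:])  # trailing bytes shorter than one block
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops


def apply_delta(old, ops, block=BLOCK):
    """Receiver side: rebuild the new file from the old copy plus the delta."""
    out = bytearray()
    for kind, value in ops:
        out += old[value * block:(value + 1) * block] if kind == "copy" else value
    return bytes(out)


rng = random.Random(0)
old_file = bytes(rng.getrandbits(8) for _ in range(62000))  # stand-in for an image
new_file = old_file[:100] + b"x" * 40 + old_file[100:]       # 40 metadata bytes added
steps = delta(new_file, signatures(old_file))
sent = sum(len(value) for kind, value in steps if kind == "literal")
print(f"literal bytes sent: {sent} of {len(new_file)}")
```

Even though the inserted bytes shift everything after them, the sliding
window finds the unchanged blocks again, so only the data around the change
and the short tail are sent as literals.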


[Bug 283013] Accelerating writing metadata back to image files

Gilles Caulier-4
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013


Gilles Caulier <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
         Depends on|                            |188925




--- Comment #5 from Gilles Caulier <caulier gilles gmail com>  2011-12-17 10:34:00 ---
Note that this bug depends on #188925 for a few points...

Gilles Caulier


[digikam] [Bug 283013] Accelerating writing metadata back to image files

Gilles Caulier-4
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013

--- Comment #6 from Gilles Caulier <[hidden email]> ---
Note: writing metadata from the Maintenance tool now uses parallelized threads
if you have a multi-core CPU. This will slightly speed up the process of
writing metadata to files.

But the main problem here, if I'm not mistaken, is still in the Exiv2 shared
library...

Gilles Caulier


[digikam] [Bug 283013] Accelerating writing metadata back to image files

Gilles Caulier-4
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013

Gilles Caulier <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #7 from Gilles Caulier <[hidden email]> ---
*** Bug 252494 has been marked as a duplicate of this bug. ***


[digikam] [Bug 283013] Accelerating writing metadata back to image files

Gilles Caulier-4
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013

Gilles Caulier <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |UPSTREAM

--- Comment #8 from Gilles Caulier <[hidden email]> ---
This report is definitely an UPSTREAM entry which must be reported to the Exiv2
bugzilla, as the low-level writing of metadata to files is handled in the
background by Exiv2.

Gilles Caulier


[digikam] [Bug 283013] Accelerating writing metadata back to image files

Gilles Caulier-4
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013
Bug 283013 depends on bug 188925, which changed state.

Bug 188925 Summary: Write image metadata in background with user feedback
https://bugs.kde.org/show_bug.cgi?id=188925

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED


[digikam] [Bug 283013] Accelerating writing metadata back to image files

Gilles Caulier-4
In reply to this post by Bugzilla from gerhardk@gmx.ch
https://bugs.kde.org/show_bug.cgi?id=283013

Gilles Caulier <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #9 from Gilles Caulier <[hidden email]> ---
*** Bug 269467 has been marked as a duplicate of this bug. ***
