[digikam] [Bug 337688] New: Reading/writing of keyword-tags to jpg and xmp corrupts tag hierarchy, duplicate root tag


Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

            Bug ID: 337688
           Summary: Reading/writing of keyword-tags  to jpg and xmp
                    corrupts tag hierarchy, duplicate root tag
           Product: digikam
           Version: 4.1.0
          Platform: openSUSE RPMs
                OS: Linux
            Status: UNCONFIRMED
          Severity: major
          Priority: NOR
         Component: Tags
          Assignee: [hidden email]
          Reporter: [hidden email]

OS / Release Details:
---------------------
Digikam/Kipi Plugins Release 4.1.0-11.1 (libkexiv2-11 4.11.5-298.11)
with and without MySQL DB (mariadb 5.5.33-2.2, libmysqlclient18 5.5.33-2.2)
(OS: OpenSuse 13.2 from BuildService KDE:Extra, Windows 8.1 from Installer)

Symptoms:
------------
Writing and reading tags destroys your tag-hierarchy metadata any time you use
digikam to write tags, if you use a tag tree with subnodes (tag hierarchy)!
Digikam CRUD operations for tagging will duplicate tags and mess up the
hierarchy if the "writing metadata to image" option is selected.

You will end up with multiple nested levels of _Digikam_root_tag_ hierarchies
with duplicate keywords (up to 8 times) in each level. A nightmare if you have
millions of tagged images.
If you try to consolidate the duplicates with digikam you will cause much more
damage.

Analysis:
------------
Five different behaviours (and bugs?) when writing tags:

- Create a new tag and write it embedded into a jpg image (iptc, xmp) -> worked
well before 4.1 if only a few tags were selected and the tag was created on the
same PC/DB ... but after rereading all metadata, this operation creates a
duplicate tag hierarchy with a new root tag on the same level.

- Set and write an embedded tag that was imported from jpg metadata before ->
messed up the hierarchy position or duplicated the entry in pre-4.1 versions;
now seems to work in 4.1 under a single root tag, or on the top level, but
after rereading metadata the hierarchy might be corrupted with duplicates as
well.

- Set and write embedded tag, that was imported from xmp sidecar before -> will
have two or more nested root tags until you remove all duplicate root tags from
the xml file before the import (photoshop section).
To Reproduce: write tags from a hierarchy to image, remove tags from db, import
the tags again and write tags - every time an additional nested root node is
added. For me this is at least a major bug.

- Set and write a new tag with the same name as a tag somewhere else in the
hierarchy (you are very likely to run into this because of the bug that causes
recursively nested root tags) -> duplicates tags because another way of writing
tags is used (full path) - but reading these tags will not consolidate them
with the same tags from other images written without the full path

- Set and write many tags to a jpg file (cannot be stored in IPTC section) ->
Digikam will write them to xmp xml-file with two or more nested root tags, next
import duplicates the hierarchy .. see above.

These situations do NOT produce consistent keyword-tags in the images (after
rereading metadata !). Each situation produces duplicate tags that are shown on
different levels of the tag tree, some in the same, some on top.
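
The symptoms listed above can be checked mechanically once the keyword paths
of an image are available as plain strings. The following is only a minimal
sketch in Python, assuming "/" as the path separator and made-up sample paths;
the function name diagnose is hypothetical and this is not digikam code.

```python
from collections import defaultdict

ROOT = "_Digikam_root_tag_"  # root tag name as quoted in this report


def diagnose(paths, sep="/"):
    """Return (paths with more than one root segment,
    leaf names that occur under more than one parent)."""
    nested_roots = []
    parents_by_leaf = defaultdict(set)
    for path in paths:
        parts = [p for p in path.split(sep) if p]
        if parts.count(ROOT) > 1:
            nested_roots.append(path)
        if parts:
            parents_by_leaf[parts[-1]].add(sep.join(parts[:-1]))
    duplicates = {
        leaf: sorted(parents)
        for leaf, parents in parents_by_leaf.items()
        if len(parents) > 1
    }
    return nested_roots, duplicates


# Made-up sample: one clean path, one with a nested root, one bare keyword.
sample = [
    "_Digikam_root_tag_/Orte/Oesterreich/Burgenland/StadtSchleining",
    "_Digikam_root_tag_/_Digikam_root_tag_/Orte/Oesterreich/Burgenland/StadtSchleining",
    "StadtSchleining",
    "_Digikam_root_tag_/Orte/Oesterreich/Burgenland/Heiligenbrunn",
]
nested, dupes = diagnose(sample)
for path in nested:
    print("nested root:", path)
for leaf, parents in dupes.items():
    print("duplicate leaf:", leaf, "under", parents)
```

A leaf name that shows up under several parents is not necessarily wrong (two
different branches may legitimately share a name), so the output is a list of
candidates to inspect, not a verdict.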

Background:
-------------
 I have been tagging a very large image collection with digikam for a long
time. After a few tests with the tag manager I decided to clean up tags ...
because until 4.0 there was no way to remove a tag from an image. In 4.1 there
are three ways to remove tags. Two of them do not act as CRUD operations - even
if you remove the tag and write metadata to the file, the tags will not be
removed. The new "remove tag from all images" function causes massive damage to
the tag hierarchy in the images - so there is no option left to manage tags
with digikam 4.1.

Tagging wish list:
--------------------
 a quick-link to the digiKam bug-list from the digikam site (and devel-blog)

 no digikam root tag whatsoever ... the user can create one if needed

 a single tagging mode with CRUD operations for image-metadata compatible with
photoshop/lightroom. For other tag sources just provide import operations, but
no write operations. These apps are toys.

 a specific tagging mode: "database only" that will not touch any image (no
additional settings, just all or nothing)

 a single, consistent way to remove tags - from images/database only, see above

 import/export of tag hierarchies to xml files in tag manager
 all CRUD operations for image-metadata have to be consistent,
   or they will have to be removed from the code -> there is already an entry
(wish) in the bug-list. PS/Lightroom format and xml would be great.

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
Digikam-devel mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-devel

Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #1 from Christian <[hidden email]> ---
Created attachment 87875
  --> https://bugs.kde.org/attachment.cgi?id=87875&action=edit
Screenshot of corrupted tag hierarchy

Before the last writing and re-reading of metadata, the tag tree contained
neither duplicates nor two levels of digikam root tags. After some
corrections (removing outdated tags) the tag tree looks like this. (disaster)


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #2 from Christian <[hidden email]> ---
More details on the way digikam destroys tags:

It looks like three independent bugs are involved:
- duplication of root tag (for sure when writing/reading from xmp sidecar)
- duplication of parent-keywords when editing child keywords
   (e.g.  deletion, adding new children ...)
- duplication of added keywords in digikam 4.1.0
    Up to 8 times, in most of the cases 5 times
    In most of the cases, the duplicate keywords refer to a single image
    of the album, where image-tags have been changed before.
    Only the last duplicate entry refers to most of the images
    it was assigned to before ... see screenshot.
    In rare cases the one above the last refers to most of the images.

Why are these tags (with exactly the same name) shown many times? The
duplicates are assigned to just a few images, but there is no relation to the
changes that have been made before, except that these images are always in the
same album.

Note: I have a lot of duplicate images in different albums e.g. with different
sizes and color profiles. But duplication of tags takes place inside a single
album. Usually some images among the first 8 ones in the album view get
assigned to inconsistent key word tags with exactly the same name, but with
multiple duplicate entries in the tag hierarchy - see screen shot.

Hope this helps to track down the root cause of this severe bug (that will cost
me at least 7 working days to restore the old image databases on all my devices
- including the work of clearing the outdated tags that is lost)

More about my configuration:
- image rotation is not viewed
- all tags are written to files except "rating" and two similar items I am not
using at all
- writing to xmp sidecars is selected if "image is not writeable", and sidecars
are read
- on OpenSuse I migrated my digikam settings from 12.1 to 13.1, but I also
tried a clean setup with same results.
- the official version of digikam in OpenSuse is still 3.5. I installed 4.1
from the KDE:Extra Repository that uses release 4.1.0-11.1 for the build. Is
this a stable one?
- due to the lack of a hierarchy import/export feature, I select all
keyword-tags in a specific jpg image, and I import metadata from this image
first, e.g. when I set up a new linux work station for digikam to transfer the
hierarchy. Before import I remove the duplicate root tags from the xml file,
but I leave one root tag, otherwise the imported keywords will not match with
the imported keywords from the other jpgs.

The bugs described before are not related to this workflow ... I also tried to
import from jpgs without the special file with all the tags, and the result is
similar (bad).


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #3 from Christian <[hidden email]> ---
Errata:

I am using the stable OpenSuse Release 13.1 (not 13.2).

Remark: Today I replaced digiKam 4.1 with 3.5 (as recommended by Suse). This is a
stable version with regards to tagging. Unfortunately there is no way to ever
remove a tag from an image, but this release is not destroying your keyword
tags in the images.


Gilles Caulier-4
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

Gilles Caulier <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Tags                        |Metadata
                 CC|                            |[hidden email],
                   |                            |veaceslav.munteanu90@gmail.
                   |                            |com

--- Comment #4 from Gilles Caulier <[hidden email]> ---
Veaceslav,

Wasn't this kind of problem already fixed in the 4.0.0 release?

Gilles Caulier


Veaceslav Munteanu-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #5 from Veaceslav Munteanu <[hidden email]> ---
I tried to reproduce this problem: I added a tag with subtags to an image,
wrote the tags to metadata, wiped all tags from the database, and triggered a
re-read. Everything works as expected...

I did a quick search through all the digikam sources and could not find any
_Digikam_root_tag_ string. Some legacy from an old database?

I will try to test and fix everything related to tags tomorrow.


Gilles Caulier-4
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #6 from Gilles Caulier <[hidden email]> ---
Veaceslav,

yes, _Digikam_root_tag_ is a very old internal tag. I don't remember why it
was implemented. It was dropped a very long time ago.

Perhaps Marcel remembers the story...

Christian,

Can you reproduce this problem using a fresh database ?

Gilles Caulier


Veaceslav Munteanu-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #7 from Veaceslav Munteanu <[hidden email]> ---
Hmm.. I remember a user asked me on the IRC channel, and I suggested he run
exiv2 -pa on a tagged image to see the content.

It indeed had a _Digikam_root_tag_ in the metadata, but I don't see any reason
why it would create duplicates...

Probably I should add a line or two to strip info from old _Digikam_root_tag_ ?
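
What "stripping" the old root tag could look like, as a rough sketch only:
the Python below works on plain "/"-separated keyword paths and is not taken
from the digikam sources. The function names strip_legacy_root and normalize
are hypothetical, and dropping the root at every position (not only at the
front) is an assumption made to also cover the nested case from this report.

```python
ROOT = "_Digikam_root_tag_"  # legacy root tag name as quoted in this report


def strip_legacy_root(path, root=ROOT, sep="/"):
    """Drop every occurrence of the legacy root segment, plus empty segments
    left behind by leading, trailing or doubled separators."""
    parts = [p for p in path.split(sep) if p and p != root]
    return sep.join(parts)


def normalize(paths):
    """Strip the legacy root from each path and drop duplicates,
    keeping first-seen order."""
    cleaned = (strip_legacy_root(p) for p in paths)
    return list(dict.fromkeys(c for c in cleaned if c))


print(normalize([
    "_Digikam_root_tag_/Orte/Oesterreich",
    "_Digikam_root_tag_/_Digikam_root_tag_/Orte/Oesterreich",
    "Orte/Oesterreich",
]))
```

Normalizing on read means that paths with the root, with several nested roots
and without any root all collapse to the same tag, which is the "joining" of
tags from several workstations asked for in this thread.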

Christian, can you provide me with a few tagged images, so I can test?


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #8 from Christian <[hidden email]> ---
(In reply to Gilles Caulier from comment #4)
> veaceslav...

Another note on the topic:
--------------------------

Many simple cases have been fixed so far. Complex tag hierarchies never worked
for me on any digikam version so far. I have not tested 4.0, only 3.5 and 4.1.

Some symptoms smell like outdated cached models or GUI states at the time
when metadata is written back to the images.

An adequate testcase would be:
..................................

Create tag tree with 8 branches, depth of 8 levels and 8 elements. 8 albums
with at least 20 images nested in subfolders on 3 levels (e.g. 20 Mpx jpgs).
Use tags below and above the digikam root tag. Create an artificial digikam
root tag as well in your hierarchy.

1. Test CRUD operations - Precondition:

 - assign 2 random elements of first 3 branches to all pics, each from
different levels (usecase for "location", "person", "time" category)
Assumption for these tags: they will not be changed any more, and therefore
should stay unchanged during the whole test. Some images, and some complete
albums should only use unchanged tags to check if reading/writing is stable
without any changes applied.

 - assign 2 random elements of branch nr 4 from all levels and assume these are
outdated and have to be removed from all images later on
 - assign 2 random elements of branch nr 5 from all levels and assume these
keywords have to be renamed, and some are moved to other positions in the
hierarchy
 - assign 2 random elements of branch nr 6 from all levels and assume these are
special keywords that have to be split into three different keywords later on
(rename old keyword, filter on new keyword, add another keyword to some images,
and remove the other one)
 - assign 2 random elements of branch nr 7 from all levels and assume these are
special keywords that have to be joined with two or three different keywords
later on: filter on the keywords and add the one to be joined, then remove the
others from all images
 - assign 2 random elements of branch nr 8 from all levels and assume these are
keywords for events that will be filled up with additional keywords on all
levels which will be assigned to images in several steps later on. Also remove
some of the added sub-keywords from single images (will this delete them from
these images?) and some from all images.

2. Test CRUD operations:

Write and re-read metadata after creation of the initial tags before the
changes are applied: check before/after

Then perform all the changes in the keyword-tags as described in step 1, e.g.
remove the outdated tags from branch number 4 ... .

3. Test CRUD operations - Postcondition:

Write and re-read metadata after creation of the initial tags: check
before/after

A possible method:

Make a copy of the latest tag tree - e.g. store all tags in one image with all
tags available (xmp xml) to document a snapshot of the tag tree in the db.
Make a screenshot of the Tag Manager View as well, because these bugs are likely
related to gui/cache status (except duplicate root tag).

Then write metadata from db to all files. Remove tags in database and reload
metadata from all files.

Store all tags in another image with all tags available (xmp xml) and compare
the resulting tag tree, and also compare Tag Manager View with the screen shot
taken before.

Write and re-read metadata a second time without any changes: check
before/after

4. Try to delete a nested digikam root tag on the second level (will be created
because of the mentioned bug)

Write and reread metadata again: check before/after

5. Try to delete the digikam root tag on the topmost level

Write and reread metadata again: check before/after

6. Try to delete the "My Tags" icon (I was able to do so in earlier versions)

Write and reread metadata again: check before/after

I know this is a lot of work (I lost 6 days to find out) just for jpegs with
embedded keywords. This test should be automated to keep the tag feature stable
in future releases.
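
The write / wipe / re-read / compare core of such an automated test could be
shaped roughly as follows. This is a sketch under heavy assumptions: files and
database are replaced by in-memory Python dictionaries, the tag tree is
simplified to 8 branches that are each a single chain 8 levels deep, and every
function name is hypothetical. It shows the shape of the check only, not how
digikam reads or writes metadata.

```python
import random


def build_tag_tree(branches=8, depth=8):
    """Generate tag paths: `branches` top-level tags, each a chain `depth` deep."""
    paths = []
    for b in range(branches):
        parts = []
        for level in range(depth):
            parts.append(f"b{b}_l{level}")
            paths.append("/".join(parts))
    return paths


def build_db(paths, albums=8, images_per_album=20, seed=0):
    """Assign 2 random tag paths to every image of every album."""
    rng = random.Random(seed)
    return {
        f"album{a}/img{i:02d}.jpg": set(rng.sample(paths, 2))
        for a in range(albums)
        for i in range(images_per_album)
    }


def write_metadata(db):
    """Stand-in for 'write metadata to files': one keyword list per image."""
    return {image: sorted(tags) for image, tags in db.items()}


def read_metadata(files):
    """Stand-in for 'read metadata from files' into an empty database."""
    return {image: set(tags) for image, tags in files.items()}


def buggy_read_metadata(files, root="_Digikam_root_tag_"):
    """Hypothetical faulty reader that prepends a root tag on every import,
    mimicking the symptom described in this report."""
    return {image: {f"{root}/{t}" for t in tags} for image, tags in files.items()}


def round_trip_is_stable(db, reader=read_metadata, cycles=2):
    """Write, wipe, re-read `cycles` times; the database must not change."""
    current = db
    for _ in range(cycles):
        files = write_metadata(current)
        current = reader(files)  # database rebuilt from the files only
        if current != db:
            return False
    return True


paths = build_tag_tree()
db = build_db(paths)
print("clean reader stable:", round_trip_is_stable(db))
print("buggy reader stable:", round_trip_is_stable(db, reader=buggy_read_metadata))
```

The rename, split, join and remove steps from the precondition list would be
applied to the database between round trips; the comparison itself stays the
same.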

Unfortunately I cannot invest more time at the moment, but I will if this is
not working out. It is a very important feature for me. I migrated all assets
from Lightroom some years ago and I don't want to go back to Adobe.


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #9 from Christian <[hidden email]> ---
(In reply to Veaceslav Munteanu from comment #7)

> Hmm.. I remember a user asked me on the IRC channel, and I suggested he run
> exiv2 -pa on a tagged image to see the content.
>
> It indeed had a _Digikam_root_tag_ in the metadata, but I don't see any reason
> why it would create duplicates...
>
> Probably I should add a line or two to strip info from old
> _Digikam_root_tag_ ?
>
> Christian, can you provide me with a few tagged images, so I can test?

Hi, I deleted all the images from 4.1 because my database has to be online till
tomorrow. I installed 3.5 again.

You will not see _Digikam_root_tag_ until you import all metadata from the
images to a new database or until you clear all tags in your database.

Maybe it is an issue with the German language version or with tags from several
releases in my images.  I could import all the tags without the
_Digikam_root_tag_ if I want, but then the tags imported from the images will
not be joined.

I will provide older images from digikam 3.5. Sorry they are quite big, up to
10 MB.

thank you for taking a look : )


Veaceslav Munteanu-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #10 from Veaceslav Munteanu <[hidden email]> ---
Well, this is a migration problem (3.5 -> 4.1), I guess.

I really need at least 2-3 images with metadata written in the old format, so I
can trigger the behavior of your digiKam (with my images nothing happens,
everything works).


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #11 from Christian <[hidden email]> ---
Created attachment 87882
  --> https://bugs.kde.org/attachment.cgi?id=87882&action=edit
Digikam 4.1  automatically created xmp with explicit root tag.

I found an older example in the backups - it was created with digikam
4.1.0-11.1
Unfortunately it is too big - I have no images below 4 MB. I added the xmp only
and will prepare an ftp download with more examples.

I used this jpg to store and import all tag keywords on other workstations. The
xmp file was created without asking - I guess because there are too many tags
selected to be embedded.

You can see the _Digikam_root_tag_ that caused me a lot of pain in the last
weeks.

As I mentioned I could create my hierarchy without the root tag as well (I did
before), but then these tags are not joined with the tags digikam reads from
new images that have been tagged on another workstation.

In short:
I moved my tags below the root tag - this was a lot of work. There was no such
tag before. But when I imported metadata (in 3.5) this tag was inserted by
digikam - same behaviour in 4.1. For me the only way to join tagged images from
several workstations was to rearrange my tags this way. Is 4.1 going nuts
because of this tag?

By the way, why does dk introduce this unwanted parent-tag when reading from an
image that was not tagged with this root-tag visible in the GUI (same in 3.5
and 4.1)? There was no such tag in the xmp metadata - I checked that - but
after importing metadata this root tag is shown in the GUI.


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #12 from Christian <[hidden email]> ---
(In reply to Veaceslav Munteanu from comment #10)
> Well, this is a migration problem (3.5 -> 4.1), I guess.
>
> I really need at least 2-3 images with metadata written in the old format, so
> I can trigger the behavior of your digiKam (with my images nothing happens,
> everything works).

Ok, I will create an archive for ftp download - my pics are bigger than 4MB.
Did you check a more complex hierarchy with several writing and reading
cycles?

I do not expect that you will find anything special in the files. Some are
tagged with a list of unstructured keywords, some with the full path (if I
tried to correct the position in the tree), and there was no root tag until I
moved all tags below this root tag on my own - see the comment on the second
attachment.

The database and the digikam settings were removed and recreated from scratch
before I started to work with 4.1. All metadata was written to files and reread
several times since the migration (6 hours on an 8-core machine).


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #13 from Christian <[hidden email]> ---
Created attachment 87885
  --> https://bugs.kde.org/attachment.cgi?id=87885&action=edit
http://buitk.at/download/digikam_35_41_samples.zip

Please download the sample images from:
http://buitk.at/download/digikam_35_41_samples.zip

These images contained duplicate tags after I tried to add "StadtSchleining"
and "Heiligenbrunn" tags below:
_Digikam_root_tag_/Orte/Oesterreich/Burgenland/  to mark these images with a
new location.

"StadtSchleining" was duplicated 2 times below "Burgenland" and 2 times on the
top level. A single one of the included images was assigned to the duplicate
tags, while all other pics of the album remained tagged with the right keyword
on the right position. Why always just a few images? Maybe they were selected
in the GUI while I worked in the tag manager? I can't tell.
Tag "Heiligenbrunn" was duplicated 3 times under "Burgenland" - all tags
assigned to image DSC0448 in the second album, but no duplicate tags in the
first album.

I tried to remove the duplicated tags with the new tag manager in digikam -
after rereading the metadata from all images StadtSchleining was duplicated 5
times below "Burgenland", two times in a nested root tag below "Burgenland",
and no more on the toplevel (where other tags from other branches showed up) -
see first screenshot. "Heiligenbrunn" was 4 times below "Burgenland" and also
several times below a second nested root tag.

You can import the old tag hierarchy from the image in album
"all_tag_categories_old_with_xmp". I lost the latest state of the hierarchy. It
was redesigned with hundreds of changes till a crash forced me to reread
metadata with the documented results.

Note: you have to delete all tags from database and import all metadata again
to see these effects. Everything looks fine as long as you do not import
metadata.


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #14 from Christian <[hidden email]> ---
(In reply to Gilles Caulier from comment #6)

> Veaceslav,
>
> yes, _Digikam_root_tag_ is a very old internal tag. I don't remember why it
> was implemented. It was dropped a very long time ago.
>
> Perhaps Marcel remembers the story...
>
> Christian,
>
> Can you reproduce this problem using a fresh database ?
>
> Gilles Caulier

...

> Can you reproduce this problem using a fresh database ?

Yes, the db crashed and was rebuilt three times (after all tags disappeared).
I also removed all tags with the new tag manager and read all metadata two
times, but the duplication happened again. Is this "fresh" enough?

I have tagged faces as well, so cleaning the db in production is a drawback. I
therefore used the "write face tags to image" option before I deleted and
rebuilt all tags. But I never tried with a completely new db. This will take 4
hours to set up, but if you think it might help I will try.

Will check this if there is time to upgrade an old laptop to OpenSuse 13.1 - a
prerequisite to install digikam 4.1 again. I will be busy with the downgrade to
digikam 3.5 in production environment till next week.


Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

Christian <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|major                       |grave

--- Comment #15 from Christian <[hidden email]> ---
-----------------------------------------------------------------
Testcase explains why tag hierarchy is getting corrupt quickly:
     http://buitk.at/download/digikam_4.1_tag_testcase.zip
-----------------------------------------------------------------

I suggest setting the bug to "grave" after building a test case to reproduce the
corruption of a small tag hierarchy - see below. Three sources of corruption
eat up your tags and limit the usage of digikam 4.1 to a single device with
single database that should never break. Do not leave this path until this
family of bugs is fixed.

On top of my wishlist:
-------------------------
Please help me with my inconsistent tags in thousands of images tagged with
different releases of digikam caused by these bugs and older ones. A simple
tool could help me and many others out of this nightmare.

Requirements: This tool should read all tags that make sense from each file and
copy them to all sections (XMP/IPTC ...) in a consistent way without any root
tags and without any duplications or imports to the database. Remove all
unreadable stuff. The database should be rebuilt from scratch after the
consolidation is complete.
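
The core of such a tool could be as small as the following sketch. It assumes
the keywords of each section have already been read into plain Python lists
of "/"-separated strings (reading and writing the files is left out), and the
names clean and consolidate as well as the sample keywords are made up for
illustration. The folding rule is one possible design choice, not a statement
about how digikam, Photoshop or Lightroom treat keywords.

```python
ROOT = "_Digikam_root_tag_"  # root tag name as quoted in this report


def clean(path, sep="/"):
    """Remove root-tag segments and empty segments from one keyword path."""
    return sep.join(p for p in path.split(sep) if p and p != ROOT)


def consolidate(sections, sep="/"):
    """sections: mapping of section name -> list of keywords read from one file.
    Returns the same mapping with one identical, de-duplicated list per section."""
    cleaned = {clean(k, sep) for keywords in sections.values() for k in keywords}
    cleaned.discard("")
    full_paths = {p for p in cleaned if sep in p}

    # Index full paths by their last segment so bare keywords can be matched.
    by_leaf = {}
    for path in full_paths:
        by_leaf.setdefault(path.rsplit(sep, 1)[1], set()).add(path)

    result = set(full_paths)
    for keyword in cleaned - full_paths:
        # Fold a bare keyword into a full path only if the match is unambiguous;
        # otherwise keep it as a top-level tag.
        if len(by_leaf.get(keyword, ())) != 1:
            result.add(keyword)

    merged = sorted(result)
    return {name: list(merged) for name in sections}


# Made-up example of one file with inconsistent sections.
example = {
    "iptc": ["StadtSchleining", "Urlaub"],
    "xmp": [
        "_Digikam_root_tag_/Orte/Oesterreich/Burgenland/StadtSchleining",
        "_Digikam_root_tag_/_Digikam_root_tag_/Orte/Oesterreich/Burgenland/StadtSchleining",
    ],
}
fixed = consolidate(example)
print(fixed["iptc"])
print(fixed["xmp"])
```

Keeping ambiguous bare keywords instead of guessing a parent loses no
information; they can still be moved by hand after the database has been
rebuilt.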

----------------------
Testcase Explanation:
----------------------

I used a "clean" install, a new empty mysql database and OpenSuse 13.1 with
digikam 4.1.0-11.8 and KDE 4.13.3 to test keyword tagging. I am convinced that
SQLite shows similar results, but I had no time to check this.

1. Copy test images to the folder with your collection

Download the zip file and unpack it to a local folder:
  http://buitk.at/download/digikam_4.1_tag_testcase.zip
Copy the sample files into your image folder, but do not copy the screenshots.

2. Subfolder: 0_inconsistent_writing_reading/album1_with_single_root_tags

bug i:
There is a bug in "Read metadata" that duplicates hierarchies when IPTC and XMP
Section contain different keywords, or when "full path keywords" are mixed with
single keywords. Select all images in album1 and "Read metadata from images"

bug ii:
There is another bug in "Read metadata" - whenever tags are found that are not
in the hierarchy, a "_Digikam_root_tag_" is created in the GUI that was not in
the image's tag list. This is also done when there is already such a tag - in
some cases they are shown side by side on the same level, most of the time
nested under each other.

To check this, remove all tags from the database using "Tag Manager" and "Read
metadata from image" of any tagged image. If you try to delete this tag you
lose all others too. The same happens if you add a tagged image from another
PC with keywords that are not already in your tag tree.

Note: The automatic update might behave in another way - but the "Read
metadata" function will always create such unwanted tags.

bug iii:
Beware - duplicated tag-branches in the digikam 4.1 GUI are not always different
tags until you use "write metadata". If you close digikam and open it again,
some of the duplicate tags will disappear, others remain. This is another
annoying bug: GUI view and internal model are out of sync. Many times this
causes a loss of all tags - e.g. if you delete a nested duplicate tag that is
internally identical with the root tag above - so you delete the topmost root
tag and with it all your tags are gone!

Even worse - you will not see the loss until you close and open digikam again -
so in the meantime any tagging operation that writes metadata writes chaotic
tags or nothing to your files.

See "digikam_4.1_remove_one_duplicate_tag_branch_before_close.jpg"
    "digikam_4.3_remove_one_duplicate_tag_branch_after_opening.jpg"
    for demonstration


3. Subfolder: 0_inconsistent_writing_reading/album2_with_duplicate_root_tags

DSC0448_with_all_duplicate_root_tags_after_writing.JPG

... demonstrates the mentioned bugs - I found this one in my collection, which
was edited on two different PCs. Some of the tags have duplicate root tags on
top because the tag tree was imported from the image on another PC and more
tags were added later ... this caused the corrupt tags in the GUI to be written
to the file.

DSC0448_with_few_duplicate_root_tags_before_writing.JPG
... demonstrates what happens when this metadata is read and written again -
the root tag was not duplicated this time, but the tags without a root tag have
one now. Please note that this is undesired behaviour - why are top-level tags
added to some tags and not to others? Unpredictable behaviour ...

4. Subfolder: 0_inconsistent_.../album3_with_no_dk_root_tag_and_duplication

album3 demonstrates the bugs described before on a single file, starting with
an empty tag tree. First, two nested tags are added to the file. Then two more
tags are added on the third level. Finally, two of the topmost tags are removed
again. The removal of these tags now works without any explicit writing of
metadata ... this is an advantage compared to older versions : )

Everything worked well - writing metadata and reading it again does not cause
corruption either. The bugs i-iii described above do not apply, because the
hierarchy was created on this PC and is still present!

Corruption starts once you remove some of these tags before reading metadata
again, or if you copy the tagged files to another PC and "read metadata".
This explains why these severe bugs have not been fixed for such a long time.

See: "_DSC2638_5_unwanted_digikam_root_tag_written_to_file_from_gui.JPG"

See also this example to understand creation of nested duplicate root tags:
"_DSC2638_7_root_tag_is_duplicated_after_2nd_read_when_no_tag_present_and_write"


4. Subfolder: 1a_move_tag_to_new_position_in_tree_by_moving

This example demonstrates inconsistent IPTC and XMP tags that cause a bad mess
when reading metadata from such a file: no tag tree will match these cases, so
many duplicate branches are created (I hope this is not expected?)

It shows what happens if one tries to move a tag from the top level down to a
sub-branch and writes metadata to all related files again (not needed if tags
are stored as single keywords - but who knows how tags are stored in a
particular image?). In this case writing metadata works well.

But rereading metadata from this file really surprised me - a Person tag shows
up, why I can't tell - and even worse: the "Zeit" tag was moved to the top
(why does reading cause a tag to move?) - and some hidden, old "Zeit" versions
appeared that were not visible before when we read metadata.

bug iv: In case of inconsistent IPTC keywords that do not match the XMP
keywords, reading metadata will not show all keywords. After some other
operations, reading again will bring up new keywords (hidden in the file). In
this case the position of existing keywords is changed as well while reading -
this is unwanted.
This might be an issue of "full path keywords" mixed with "single keywords".
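A rough sketch of a check that would flag such files before they are read,
assuming the IPTC keywords and the XMP tag paths have already been extracted
into two lists of strings. The function name, the "/" separator and the sample
values are hypothetical, and only the last path level (the leaf) is compared,
so that "full path keywords" and "single keywords" can be mixed.

```python
def keyword_mismatch(iptc_keywords, xmp_tag_paths, sep="/"):
    """Compare leaf keyword names found in IPTC and in XMP.

    Returns a pair of sorted lists:
    (leaf names only in IPTC, leaf names only in XMP).
    """
    def leaves(entries):
        # Keep the last non-empty path segment of every entry.
        result = set()
        for entry in entries:
            segments = [s for s in entry.split(sep) if s]
            if segments:
                result.add(segments[-1])
        return result

    iptc = leaves(iptc_keywords)
    xmp = leaves(xmp_tag_paths)
    return sorted(iptc - xmp), sorted(xmp - iptc)


if __name__ == "__main__":
    only_iptc, only_xmp = keyword_mismatch(
        ["Zeit", "Person/Anna"],       # hypothetical IPTC keywords
        ["Zeit/2014", "Person/Anna"],  # hypothetical XMP tag paths
    )
    print(only_iptc, only_xmp)
```

If both lists come back empty, the two sections agree at least on leaf names.
Anything else points to a file like the ones in this subfolder.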

4. Subfolder: 1b_reread_with_corrected_hierarchy_duplicates

digikam_4.1_tag_tree_with_missing_br_duplicates_root_when_reading.jpg and
digikam_4.1_before_tag_tree_with_missing_branches.jpg

demonstrate the mentioned bug: the GUI is sometimes not in sync with the
internal model - branches that look like duplicates are not duplicated
internally - so deleting a nested tag leads to the loss of the whole branch.
This can be avoided if you close and reopen digikam each time a duplication
occurs, to see whether it is real or just a display artifact.

5. Subfolder:
    2a_inconsistent_readingwriting_of_metadata_duplicates_tags
    2b_inconsistent_reading_with_missing_tags

Several methods of writing tags have been applied to these images, and the
"Read metadata from images" function causes a big mess of duplicate tags.
There is also an example of a tagged file that contains no root tag - but if
it resides in an album with an image that has a root tag, it will get one the
next time the metadata of this album is written to all files.

bug v: unwanted root tags infect other images with root tags if they are
changed together.

6. Subfolder: 3_write_from_duplicated_hierarchy_to_file

Digikam used (hopefully this does not continue) a mixed strategy to write tags
- sometimes with full path, sometimes not. This causes strange semantics for
duplicate keywords.

Wish / bug vi:
To avoid chaos: always write keywords with full path, and NEVER write identical
keywords or the same path more than once.
I am not sure if digikam 4.1 meets this requirement.
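The wish above could be sketched roughly like this. It assumes "/"-separated
full tag paths, and the function name and sample values are made up for
illustration. This is not how digikam writes keywords, only what I would
expect the written list to look like.

```python
def unique_tag_paths(tag_paths, sep="/"):
    """Normalize full tag paths and drop repeats, keeping the first occurrence.

    Normalizing means trimming whitespace around each level and ignoring
    empty levels (e.g. a trailing separator).
    """
    seen = set()
    result = []
    for path in tag_paths:
        levels = [level.strip() for level in path.split(sep)]
        normalized = sep.join(level for level in levels if level)
        if normalized and normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


if __name__ == "__main__":
    # Hypothetical selection containing the same path three times.
    print(unique_tag_paths(["Zeit/2014", "Zeit/2014/", " Zeit/2014", "Person/Anna"]))
```

The order of the first occurrences is kept, so the written keyword list stays
stable between two write operations.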

In this example the same keyword is used at four different positions in the tag
tree (because of other bugs some branches have been duplicated), and all four
have been selected and written to the file. If the GUI view was out of sync,
some of them stand for the same position - but they have been written four
times. I can't tell if the resulting metadata is as expected.
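To see which keywords are affected, one could list every keyword name that
occurs at more than one position. This is a small sketch under the same
assumptions as before ("/"-separated paths, hypothetical function name and
sample data), not something taken from digikam.

```python
from collections import defaultdict


def ambiguous_keywords(tag_paths, sep="/"):
    """Map each leaf keyword to its distinct full paths, if it has several."""
    positions = defaultdict(set)
    for path in tag_paths:
        segments = [s for s in path.split(sep) if s]
        if segments:
            positions[segments[-1]].add(sep.join(segments))
    # Keep only keywords that appear under more than one distinct path.
    return {name: sorted(paths) for name, paths in positions.items() if len(paths) > 1}


if __name__ == "__main__":
    # Hypothetical tag list: "Zeit" exists in two branches, one written twice.
    print(ambiguous_keywords(["A/Zeit", "B/Zeit", "A/Zeit", "C"]))
```

Identical paths collapse into one entry, so a keyword written several times for
the same position is not reported, only one that really sits in different
branches.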

7. Subfolder: 4_remove_inconsistent_tag_close_open

Another example showing that GUI view and internal model are sometimes out of
sync. After accidentally deleting the topmost tag (because it was shown twice)
the tree looks fine - after closing and reopening, the whole branch is gone.

[digikam] [Bug 337688] Reading/writing of keyword-tags to jpg and xmp corrupts tag hierarchy, duplicate root tag

Christian-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #16 from Christian <[hidden email]> ---
Created attachment 87925
  --> https://bugs.kde.org/attachment.cgi?id=87925&action=edit
http://buitk.at/download/digikam_4.1_tag_testcase.zip

Download and unzip testcase to reproduce six kinds of bugs around tagging

[digikam] [Bug 337688] Reading/writing of keyword-tags to jpg and xmp corrupts tag hierarchy, duplicate root tag

Veaceslav Munteanu-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #17 from Veaceslav Munteanu <[hidden email]> ---
That host is so slow that it takes me 8 hours to download the file. Please use
Google Drive or Dropbox for faster speeds.

[digikam] [Bug 337688] Reading/writing of keyword-tags to jpg and xmp corrupts tag hierarchy, duplicate root tag

Veaceslav Munteanu-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

--- Comment #18 from Veaceslav Munteanu <[hidden email]> ---
Somehow, I was able to download the attachment, and I'm able to reproduce what
you said. Fixing now, please wait :)

[digikam] [Bug 337688] Reading/writing of keyword-tags to jpg and xmp corrupts tag hierarchy, duplicate root tag

Veaceslav Munteanu-2
In reply to this post by Christian-2
https://bugs.kde.org/show_bug.cgi?id=337688

Veaceslav Munteanu <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED
      Latest Commit|                            |http://commits.kde.org/digi
                   |                            |kam/86d06f51a3d391fd243ad82
                   |                            |983e532e12171b6b5

--- Comment #19 from Veaceslav Munteanu <[hidden email]> ---
Git commit 86d06f51a3d391fd243ad82983e532e12171b6b5 by Veaceslav Munteanu.
Committed on 24/07/2014 at 08:37.
Pushed by munteanu into branch 'master'.

M  +10   -2    libs/database/imagescanner.cpp

http://commits.kde.org/digikam/86d06f51a3d391fd243ad82983e532e12171b6b5
