[digikam] [Bug 374591] New: Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] New: Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

            Bug ID: 374591
           Summary: Deleting image only removes the file and sets the
                    status to hidden but does not delete the image from DB
           Product: digikam
           Version: 5.4.0
          Platform: Compiled Sources
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: NOR
         Component: Database
          Assignee: [hidden email]
          Reporter: [hidden email]
  Target Milestone: ---

While doing some tests for bug ids https://bugs.kde.org/show_bug.cgi?id=320666
and https://bugs.kde.org/show_bug.cgi?id=374191 , I moved some image to trash
to make sure that the similarities to this image are deleted.
While the similarities to this image were deleted, the similarities of the
image to other images were not deleted.

Moving to trash thus obviously does not delete the image properties for the
image moved to trash.

Then, I deleted the image from trash but the image properties were still there.
After inspecting the SQLite DB (shoud be the same for MySQL), I learned that
the file was deleted from the hard disk but the image info in the database was
not. The image is still in database with Album set to NULL and status set to
hidden.

Thus, the trigger which removes the image properties was not activated.

This does not make any sense. Deleting an image from trash or directly should
also remove the entry from the images table.

Step to reproduce:
1) get the image id from some arbitrary image from the DB
2) add some image property for this image in the ImageProperties table
3) move the image to trash and delete it
The image id should still be in the DB

I inspected the source code. The imageviewutilities trigger the deletion of the
file via DIO but the deleteItem function of the core db class is not triggered.

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #1 from Mario Frank <[hidden email]> ---
There is another thing:
When I move the image to trash and use the recover button, a new entry is
created in the images table. But the deleted image is still there.

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #2 from Mario Frank <[hidden email]> ---
(In reply to Mario Frank from comment #1)
> There is another thing:
> When I move the image to trash and use the recover button, a new entry is
> created in the images table. But the deleted image is still there.

I assume this is done since the former album id is not stored. Shouldn't it be
possible to get the former album by fetching the path from the dtrash info?

Also, moving pictures from one album to another leads to new Images table
entries. This wastes many image ids. Couldn't this be solved by just setting
the new album id?

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

[hidden email] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #3 from [hidden email] ---
Mario, you is right.

In fact, we need a garbage collector, as it's explained in this file :

https://bugs.kde.org/show_bug.cgi?id=317210

...even if the DB transactions must be improved when action are processed.

Gilles

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #4 from Mario Frank <[hidden email]> ---
(In reply to caulier.gilles from comment #3)
> Mario, you is right.
>
> In fact, we need a garbage collector, as it's explained in this file :
>
> https://bugs.kde.org/show_bug.cgi?id=317210
>
> ...even if the DB transactions must be improved when action are processed.
>
> Gilles

Yes, I see. A garbage collection would be helpful. But some cleanup could be
already done without garbage collection.

There are some spots I wanted to improve anyway. But I will leave this for
after the release.

Did you see the patch (https://bugs.kde.org/attachment.cgi?id=103203) for
https://bugs.kde.org/show_bug.cgi?id=374191 (context menu in tags and people
sidebar?

It also improves the way similarity information is handled in table view, image
view and for sorting. The new approach is much more stable and the
functionality is now tested quite excessively. Also, the DB handling seems to
work as expected apart from the problems described in this bug here.
I would like to commit the patch.

Cheers,
Mario

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #5 from Mario Frank <[hidden email]> ---
Oh, yes. The complete patch description:
https://bugs.kde.org/show_bug.cgi?id=320666#c17

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

Mario Frank <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|5.4.0                       |5.5.0

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #6 from Mario Frank <[hidden email]> ---
Created attachment 103766
  --> https://bugs.kde.org/attachment.cgi?id=103766&action=edit
This patch introduces garbage collection as maintenance stage and reduces the
amount of generated garbage.

This patch introduces garbage collection as maintenance stage and reduces the
amount of generated garbage.
The stage in which the garbage collector is run is before the rebuilding of
thumbnails.

Description of problems and approach.

The garbage that bloated my databases was quite annoying. I want to draw
a sketch about where the garbage comes from:

1) Move image to trash. Everytime I delete some image, the Images table
entry is set to status Removed (3) and
the original album id is removed from the entry.
If I restore the image from trash, a new Images entry is generated.

2) Deleting images/albums directly or from trash. The image files are
deleted from hard drive but the Images entries
are not removed (and also ImageTagProperties and so on.

3) Moving/renaming images creates a new Images entry. The old one is set
to status Removed.

4) Deleting images does not remove the thumbnails. (by path/ uniqueHash)

5) Deleting face regions does not remove the region thumbnails (custom
identifier)

6) Removing tag does not remove identities from recognition db (every
identity should have the same faceEngineUUID as a tag)

Here is a description about what the patch does:
1) Creating less junk:
I introduced a new item status "Obsolete" and renamed the status
"Removed" to "Trashed".
Items are set to status Trashed if they are moved to trash. If items are
deleted directly/permanently,
they get the status "Obsolete".

If an image is restored from trash, i search for an item entry that has
status Trashed and has the same properties
as the new one. If i find such one, I use this entry and set the new/old
album and the status to visible.

If an image is renamed/moved, I use the moveItem method of the core DB
to set the new album/name of the image.
This way, the ImageScanner does not think that this is a new image. The
old entry is reused.
This could solve the grouping problem.

I cannot solve points 4 to 6 in the same easy way, explicitely not the
thumbnails problem since thumbnails can
be referenced by image path, image uniqueHash/file size and image
path/face region (custom identifier)

Thus, I made some clean-up routine for our databases.

2) Collecting junk:
I implemented the DbCleaner Maintenance module. It runs at every start
of digikam (if configured so in setup->misc) and removes
all stale Images entries (detectable by status Obsolete). This does not
take much time. But the DbCleaner can do more.
In Maintenance dialog, I added a stage Database Cleanup, which can be
triggered. The stage can also clean the thumbs db
and recognition db. But this must be explicitely selected in the menu as
this can take more time. Also, the thumbs and
recognition db are never cleaned at the start of digikam. I do not want
our users to wait minutes until they can work.

Now to what the DbCleaner does. As already said, it removes the stale
images. But let's take a step back. Getting the
stale images is just one call to core db. Getting the stale thumbnails
is more complicated. Getting the stale face identities
is less complicated. In first phase of the DbCleaner I analyse the
databases (thumbs and recognition only if enabled).
Identities are stale if there is no tag in core db that has the same
faceEngineUUID as the identity.
Thumbnails are stale if the following holds:
    1) There is no image in core DB whose file path leads to that
thumbnail (FilePaths table)
    2) There is no image in core DB whose uniqueHash and file size leads
to that thumbnail (UniqueHashes table)
    3) There is no face region of an image whose custom identifier
(image file path + region) leads to that thumbnail (CustomIdentifiers table)

So I first get all thumbnail ids from thumbs db into a list A and all
image ids into another list B.
Then I get the thumbnail ids for every image by their file path,
uniqueHash/file size and also
the thumbnails for the face regions and remove those thumbnail ids from
my list A.
The remainder in the list is thus neither connected to an image/video
nor to face regions.
I know that this is no really efficient way. But if, let's say a face
region is deleted, I cannot delete the thumbnail since it could
still be used from some other image by file path for example.

When I am done with that, I first clean the core db (stale images) After
that, I clean the thumbs db and after that the
recognition db. So far for the main process. The progress is shown to
the user and I show, what currently is done (analyse,
clean core DB, clean thumbs DB, clean recognition DB).

Then I tested my implementation with my database. I have got about 40000
images and my thumbnail db contains
96000 FilePath entries, 205000 UniqueHashes entries and 180000
CustomIdentifier entries. The Thumbnails entries are
about 255000. File size of the database (SQLite) file is 2.9 GB

About 200000 of the thumbnails are recognised as stale. Removing the
thumbnails one by one per thread as it is done for thumbnail
generation for example took exorbitant amount of time. To be frank (pun
intended), I gave up after 2 hours. The context swithing while
multi-threaded access to the thumbs db does not seem to work (well) is a
pain. Then I adopted my cleaner threads to work in chunks,
with chunk size modifiable (currently hard-coded). This worked better
but still too much threads that wait for IO. Against all expectations,
I found out that completely sequential cleaning is the fastest. If i let
one worker thread remove all stale thumbnails, the process of cleaning
my complete databases takes only about 8 mins on my 3 years old core i5
(with 16 GB RAM and no SSD). After vacuuming, my thumbs db has only size
of 650 MB.

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #7 from Mario Frank <[hidden email]> ---
To all who test this patch: Please backup your databases before you test.
Though I made tests without any problems, your databases differ from mine. And
I do not want you to lose data. The core database is only slightly touched,
i.e. really removed image entries are deleted. The most work is done in
thumbnails db.

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

[hidden email] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Deleting image only removes |Deleting image only removes
                   |the file and sets the       |the file and sets the
                   |status to hidden but does   |status to hidden but does
                   |not delete the image from   |not delete the image from
                   |DB                          |DB [patch]

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #8 from Mario Frank <[hidden email]> ---
Created attachment 103773
  --> https://bugs.kde.org/attachment.cgi?id=103773&action=edit
Patch including new files.

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

Kusi <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #9 from Kusi <[hidden email]> ---
do you have by any chance a branch on a git repo to check out from? I use an
external mysql server and could test it with that configuration

When you delete a tagged photo and restart digikam, the tags are gone forever,
right? It wouldn't be possible to restore the photo including tags from the
trash. What about a retention period (e.g 30 days), or if the deletion date is
not saved, just clean up once every month instead of every launch of
digikam(just as an idea)

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #10 from [hidden email] ---
Kusi,

The branch is already created :

https://cgit.kde.org/digikam.git/log/?h=development/garbagecollection

Anyway, if you wait a little bit, i currently re-build the AppImage bundle with
this branch.

Gilles Caulier

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #11 from [hidden email] ---
AppImage is done.
Upploading to GDrive under progress. It will be online in few minutes at usual
place :

https://drive.google.com/drive/folders/0BzeiVr-byqt5Y0tIRWVWelRJenM

New database Garbage Collector options are there :

https://www.flickr.com/photos/digikam/32549923912/in/dateposted-public/
https://www.flickr.com/photos/digikam/32549923632/in/dateposted-public/

Gilles Caulier

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #12 from Mario Frank <[hidden email]> ---
(In reply to Kusi from comment #9)
> do you have by any chance a branch on a git repo to check out from? I use an
> external mysql server and could test it with that configuration
>
> When you delete a tagged photo and restart digikam, the tags are gone
> forever, right? It wouldn't be possible to restore the photo including tags
> from the trash. What about a retention period (e.g 30 days), or if the
> deletion date is not saved, just clean up once every month instead of every
> launch of digikam(just as an idea)

Kusi,
I never would automatically delete items from trash. I think this is bad
usability. I only delete the items that were either removed directly (not moved
to trash) or removed from trash. So, as long as you do not delete the items
from trash, I will not do this either.

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #13 from Mario Frank <[hidden email]> ---
What about the garbage collection branch. When will we merge it to master?
I identified some potential stale data, i.e. table entries that reference a
non-existent image id, in core DB that can also be collected.
I will implement that when I have the time.

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #14 from [hidden email] ---
Hi Mario,

Well, merge back will be done when you decide. If more time is necessary to
hack, we can postbone release date for 5.5.0.

In other word, we don't have any restriction here (:=)))...

Gilles

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #15 from Mario Frank <[hidden email]> ---
(In reply to caulier.gilles from comment #14)
> Hi Mario,
>
> Well, merge back will be done when you decide. If more time is necessary to
> hack, we can postbone release date for 5.5.0.
>
> In other word, we don't have any restriction here (:=)))...
>
> Gilles

Hey Gilles,
the stale data (in ImageTags) seems to be the result of some queries I made in
the DB with my DB browser.
I did not find more stale data in core DB that can be cleaned.
I just resolved the last thing I disliked. Before 5.5, items were marked as
Removed (now Trashed) and deleted after a retention period. So users may have
items with that status. But the garbage collection only removes Obsolete items.
I adopted the CollectionsScanner to marke trashed items as obsolete after a
retention period. This way, users will also get rid of old trashed items.

Also, there are no more tables in thumbs and recognition db that are not
handled by the garbage collection.

To conclude: in functionality, we are finished for now. I am in fact working
with the garbage collection since we have the branch and did not encounter
problems. I would merge the branch into the master today. Then, more people can
test.

No need to postpone 5.5 - except users raid us with bug requests.

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

--- Comment #16 from Mario Frank <[hidden email]> ---
Merge is done. I will close this file now

--
You are receiving this mail because:
You are the assignee for the bug.
Reply | Threaded
Open this post in threaded view
|

[digikam] [Bug 374591] Deleting image only removes the file and sets the status to hidden but does not delete the image from DB [patch]

bugzilla_noreply
In reply to this post by bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=374591

Mario Frank <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Latest Commit|                            |https://commits.kde.org/dig
                   |                            |ikam/a1f67531b3269941df5ff5
                   |                            |31baa3e487cf58f1fc
         Resolution|---                         |FIXED
   Version Fixed In|                            |5.5.0
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #17 from Mario Frank <[hidden email]> ---
Git commit a1f67531b3269941df5ff531baa3e487cf58f1fc by Mario Frank.
Committed on 08/02/2017 at 14:09.
Pushed by mfrank into branch 'master'.

Merged the garbage collection into master. The garbage collector is a
maintenance stage that runs before the rebuild of thumbnails and must be
triggered explicitely.
It removes stale image entries in core db and if enabled also stale thumbnails
and face identities from thumbnails and recognition DB.
If configured so, the core DB part of the garbage collector removes stale image
entries in core db during the start of digiKam.
Note that cleaning the databases does not necessarily make them smaller as no
auto-vacuum is done on the databases. The vacuuming proces differs highly
between the three
supported database variants (SQLite, internal MySQL and external MySQL). Thus,
currently there is no automatism.
Related: bug 323718
FIXED-IN: 5.5.0

M  +3    -1    NEWS

https://commits.kde.org/digikam/a1f67531b3269941df5ff531baa3e487cf58f1fc

--
You are receiving this mail because:
You are the assignee for the bug.
12