Implement transparent database shadowing in Database::Open#4397
Implement transparent database shadowing in Database::Open#4397chuong-data61 wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a database shadowing mechanism to improve performance on high-latency network drives. When the COLMAP_USE_LOCAL_DATABASE environment variable is set, the database is transparently copied to a local temporary directory and synchronized back to the original path upon destruction. The reviewer provided several actionable suggestions to improve the implementation, including ensuring the synchronization process is atomic to prevent file corruption, using a more robust path comparison method to detect existing temporary files, and optimizing the random number generation for shadow file naming.
| std::filesystem::copy_file( | ||
| local_path, path, std::filesystem::copy_options::overwrite_existing); | ||
| std::filesystem::remove(local_path); |
There was a problem hiding this comment.
The synchronization back to the original path is not atomic. If the process is interrupted or a network failure occurs during copy_file, the original database file may be left in a corrupted or truncated state. To ensure atomicity, copy the shadow database to a temporary file in the same directory as the target path and then use std::filesystem::rename to replace the original file.
const std::filesystem::path temp_sync_path =
path.parent_path() / (path.filename().string() + ".sync.tmp");
std::filesystem::copy_file(
local_path,
temp_sync_path,
std::filesystem::copy_options::overwrite_existing);
std::filesystem::rename(temp_sync_path, path);
std::filesystem::remove(local_path);There was a problem hiding this comment.
This is a great suggestion.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
Thanks for your contribution. I understand the motivation. My immediate gut reaction is that this is something that is better handled by the business logic wrapping colmap, as it is highly dependent on the specific setup of each consumer? |
Thanks for the comment @ahojnnes. It was surprising to me that COLMAP stopped working when it was simply deployed to process data on a network drive. Such a use case can be quite common in reality. With the current version of COLMAP, inefficient data manipulation is required to get around the limitations of SQLITE. An alternative solution to this PR is to switch from SQLite to PostgreSQL, which requires more code changes. |
|
I see. The problem is that the proposed fix here still requires an opt-in env variable to be set, so the behavior will still be surprising to anybody not setting the env variable. Setting the env variable is not really that much more convenient as compared to just copying the database in a wrapper shell script? |
Not surprised if adding a suggestion to set the env variable when a database error occurs, and/or including in COLMAP tutorial/help document. Once set up, the database update is handled automatically and does not interfere with the overall workflow. Furthermore, many users don't really know when the database is updated to copy across. |
Summary:
This PR introduces a transparent database shadowing mechanism in
colmap::Database. When enabled via theCOLMAP_USE_LOCAL_DATABASE=1environment variable, COLMAP will automatically clone the database to local temporary storage (/tmp) upon opening and synchronize all changes back to the original (potentially remote) path upon destruction.Problem:
Working directly on SQLite databases stored on network-attached storage (NAS) or high-latency filesystems often leads to:
Solution:
By intercepting
Database::Open, we can redirect the database interaction to a fast local disk. The implementation uses a custom deleter pattern on thestd::shared_ptr<Database>returned by the factory to ensure that:Technical Details:
COLMAP_USE_LOCAL_DATABASE=1.std::filesystemfor platform-independent path manipulation.try-catchblocks to prevent data loss if the remote drive becomes unavailable during sync-back.Databaseclass.Verification:
Verified on a 120-image reconstruction pipeline. Feature extraction, matching, mapping, and dense reconstruction all successfully utilized the shadow database and correctly synchronized results back to the source NAS.