torebasics.blogg.se

Syncthing compression

Syncthing keeps an index database describing each file as a list of blocks. The blocks are all of a specific size, historically always 128 KiB each. Nowadays we sometimes use larger blocks, and of course the last block is usually smaller, unless the file size happens to be an even multiple of the block size. (Directories and symlinks don't have any blocks; apart from that the handling is the same.)

Keeping this database up to date is called scanning. The index entries are stored in the same format that is used to exchange index information between devices. It is described in our protocol buffer schema.

Scanning is a three step process:

1. Walk the folder on disk, comparing each found item with the corresponding item in the index database. Queue any differing items for further inspection.
2. Hash the queued files, updating the index database with the new information.
3. Walk the index database for the folder, checking if each item in the database still exists on disk.

In steps one and two we find and hash new or updated files, in step three we find files that have been deleted. These steps can all take a long time or be really quick, depending. Walking the folder on disk and comparing to the database is quick if both the file metadata and the database are mostly cached in RAM. If not, and the folder is large, it can take a while and cause a lot of I/O. We can't predict how long this step will take because we don't know what the contents on disk are before we look - hence this step is shown simply as "Scanning" in the GUI, without any progress indication.

Once we've built a list of files to hash we know how much work is left to do in step two. The hashing process reads each changed file, computes the cryptographic hashes for each block, and periodically updates the index entries in the database. I say "periodically" because it's done in batches instead of immediately for each file, for improved efficiency. The new index information is also sent to other devices when it is committed to the database. This has effects on rename detection. During this step the GUI shows progress information - "Scanning (52%)" and similar. We also calculate the current hash rate and estimate how long the scan will take to complete.

Once the hashing is complete the third step kicks in to look for deleted files. This is yet another folder walk, and the performance considerations are the same as in step one. The GUI shows "Scanning (100%)" while this is ongoing, which might be less than totally intuitive. Usually, however, this step is quick enough for it not to matter.

The process above describes what Syncthing has done pretty much since its inception. Nowadays Syncthing also supports listening for filesystem notifications, which gives us faster response to changes and less need to scan. Internally this works much the same as the three step process described above - it's just limited to the files or the subtree that has changed. That is, instead of scanning the whole folder on a set schedule, we scan individual files and directories when notified about them having changed. Events are aggregated and processed in batches. Aggregation means that in some cases, instead of scanning many changed files individually, we will do a full scan of their parent folder.










