• tal@lemmy.today
    3 days ago

    here is a checksum of my /usr/lib folder

    That’s actually not as trivial as it seems, because you need a canonical representation of that directory to generate such a thing in the same way on each side.

    You need to encode the metadata in a standard way, and also have a standard way to encode new kinds of metadata as they show up, because various parties can attach more metadata to files: think POSIX ACLs or the immutable flag or whatever.

    Then there is some metadata that you probably want to exclude, like atime (though not if you’re something like rsync -U, which preserves access times!), and some metadata that you almost certainly want to exclude, like inode number.
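
A minimal Python sketch of that include/exclude decision, under stated assumptions: the function name `canonical_metadata`, the choice of fields, and the `name=value` text encoding are all hypothetical, picked for illustration rather than taken from any standard. Extended metadata such as POSIX ACLs or the immutable flag is not covered here.

```python
import os
import stat


def canonical_metadata(path):
    """Return a byte string describing only the metadata we chose to include.

    The field list and the encoding are illustrative assumptions.
    """
    st = os.lstat(path)  # lstat describes a symlink itself, not its target
    fields = (
        ("type", stat.S_IFMT(st.st_mode)),   # regular file, directory, symlink, ...
        ("mode", stat.S_IMODE(st.st_mode)),  # permission bits
        ("uid", st.st_uid),
        ("gid", st.st_gid),
        ("mtime_ns", st.st_mtime_ns),
    )
    # Deliberately left out: st_atime (changes when a file is read) and
    # st_ino / st_dev (differ between machines even for identical trees).
    return b"".join(f"{name}={value}\n".encode("ascii") for name, value in fields)
```

Two sides only get matching records if they agree on this exact field list, its order, and its encoding, which is the "canonical representation" problem in miniature.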

    The OS’s file APIs don’t define the order in which they return the entries in a directory. They’ll normally just return them in whatever order they come back from the filesystem, which is probably whatever is most efficient for that filesystem, given how things are encoded on disk. If you sort the directory entries, the sort can’t be locale-dependent, which is otherwise the default for most things on the system. Utilities like tar don’t impose a canonical ordering, so you can’t just dump the problem on tar by checksumming a tarball of the directory.
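
One way to get a locale-independent order is to sort the raw byte names. The following is a rough Python sketch, not a definitive implementation: `tree_digest`, the one-letter type tags, and the length-prefixed framing are assumptions made for illustration, and it covers only names, entry types, symlink targets, and file contents (no ownership, permissions, or timestamps).

```python
import hashlib
import os


def tree_digest(root):
    """Hash relative paths and contents in a fixed, locale-independent order."""
    h = hashlib.sha256()
    root = os.fsencode(root)  # work in bytes so sorting compares raw byte values

    def visit(directory, rel):
        # os.listdir returns entries in arbitrary order; sorted() on bytes
        # compares byte values and never consults the locale.
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            rel_path = rel + b"/" + name if rel else name
            if os.path.islink(path):
                kind, payload = b"L", os.fsencode(os.readlink(path))
            elif os.path.isdir(path):
                kind, payload = b"D", b""
            elif os.path.isfile(path):
                with open(path, "rb") as f:
                    kind, payload = b"F", f.read()  # whole file in memory: sketch only
            else:
                kind, payload = b"O", b""  # devices, sockets, FIFOs: name only
            # Length-prefix every field so that field boundaries are unambiguous.
            for field in (kind, rel_path, payload):
                h.update(len(field).to_bytes(8, "big"))
                h.update(field)
            if kind == b"D":
                visit(path, rel_path)

    visit(root, b"")
    return h.hexdigest()
```

Calling `tree_digest("/usr/lib")` on two machines should give the same value only if names, entry types, symlink targets, and file contents all match; the order in which the filesystem happens to return entries no longer matters.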

    EDIT: tar does appear to have a canonical ordering option today (GNU tar’s --sort=name), though it probably isn’t under any constraint to keep the way it encodes the included metadata backwards-compatible, which is another thing one would need for such a checksum if one were to leverage tar.

    • FishFace@lemmy.world
      1 day ago

      You need to

      Given it was a joke, I don’t think you need to do anything…

      The OS’s file APIs won’t have a defined order in which they return entries in a directory

      Sorting is a thing :)