While implementing validation and decoding of ZFS receive resume tokens for a personal project, I noticed that the checksum of the tokens produced by ZFS does not match the one I computed myself using the Fletcher4 checksum algorithm. I therefore took a closer look at the ZFS code and noticed something unexpected:
The get_receive_resume_token_impl function uses fletcher_4_native_varsize to compute a checksum over the token's payload. As documented in the commit that introduced this function, fletcher_4_native_varsize ignores the last (size % 4) bytes of the input data if its size is not a multiple of 4:
- Added fletcher_4_native_varsize() special purpose method for use when buffer size is not known in advance. The method does not enforce 4B alignment on buffer size, and will ignore last (size % 4) bytes of the data buffer.
This can be seen here:
    const uint32_t *ipend = ip + (size / sizeof (uint32_t));
Because the integer division rounds down, the loop stops at the last complete 32-bit word. Receive resume token payloads often have a size that is not a multiple of 4, so up to 3 trailing bytes are not covered by the token's checksum, and corruption there may go unnoticed.
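To make the effect concrete, here is a minimal sketch of a Fletcher4 loop that truncates the same way. It is a simplified re-implementation written for illustration, not the ZFS source: the names fletcher4_sums, fletcher4_varsize_sketch and fletcher4_sums_equal are hypothetical, and the only things taken from the code above are the size / sizeof (uint32_t) word count and the native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The four 64-bit running sums that make up a Fletcher4 checksum. */
typedef struct {
	uint64_t a, b, c, d;
} fletcher4_sums;

/*
 * Simplified stand-in for the varsize variant (NOT the ZFS source).
 * It consumes whole 32-bit words in native byte order and stops after
 * size / 4 words, so the last (size % 4) bytes are never read.
 */
static fletcher4_sums
fletcher4_varsize_sketch(const void *buf, size_t size)
{
	const unsigned char *p = buf;
	size_t nwords = size / sizeof (uint32_t);	/* rounds down */
	fletcher4_sums s = { 0, 0, 0, 0 };

	assert(buf != NULL || size == 0);
	for (size_t i = 0; i < nwords; i++) {
		uint32_t w;

		/* memcpy avoids unaligned reads on arbitrary buffers. */
		memcpy(&w, p + i * sizeof (w), sizeof (w));
		s.a += w;
		s.b += s.a;
		s.c += s.b;
		s.d += s.c;
	}
	return (s);
}

/* Returns 1 if both sets of sums are identical, 0 otherwise. */
static int
fletcher4_sums_equal(fletcher4_sums x, fletcher4_sums y)
{
	return (x.a == y.a && x.b == y.b && x.c == y.c && x.d == y.d);
}
```

With this sketch, two 7-byte buffers that differ only in bytes 4 to 6 produce identical sums, because only the first word is ever read; changing any of the first four bytes does change the result.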
While this is a rather theoretical issue, I still wanted to raise awareness of it. Is this behavior intended, or was it an accident during one of the recent refactorings of the fletcher_4 code?