大象传媒

A simple tool and library for finding the offset of an audio file within another file.

The algorithm cross-correlates standardised Mel-Frequency Cepstral Coefficients (MFCCs), so it should be relatively robust to noise introduced by encoding, compression and similar processing. The calculated offset is typically accurate to within about 0.01 s.
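As a rough illustration of the idea, the sketch below standardises two sequences of feature frames and slides the shorter one along the longer one, picking the lag with the highest correlation. It is not the tool's implementation: it assumes the MFCC frames have already been computed elsewhere, and the names `standardise`, `cross_correlate`, `find_offset` and the `hop_seconds` parameter are hypothetical.

```python
import math


def standardise(frames):
    """Scale each coefficient (column) to zero mean and unit variance."""
    count = len(frames)
    dims = len(frames[0])
    means = [sum(frame[d] for frame in frames) / count for d in range(dims)]
    stds = []
    for d in range(dims):
        variance = sum((frame[d] - means[d]) ** 2 for frame in frames) / count
        # Fall back to 1.0 for a constant column to avoid dividing by zero.
        stds.append(math.sqrt(variance) or 1.0)
    return [[(frame[d] - means[d]) / stds[d] for d in range(dims)] for frame in frames]


def cross_correlate(long_frames, short_frames):
    """Correlation at every lag where the short clip fits inside the long one."""
    long_std = standardise(long_frames)
    short_std = standardise(short_frames)
    correlations = []
    for lag in range(len(long_std) - len(short_std) + 1):
        total = 0.0
        for i, frame in enumerate(short_std):
            total += sum(x * y for x, y in zip(long_std[lag + i], frame))
        correlations.append(total / len(short_std))
    return correlations


def find_offset(long_frames, short_frames, hop_seconds):
    """Offset in seconds of the short clip within the long one.

    hop_seconds is the (assumed) time between consecutive feature frames.
    """
    correlations = cross_correlate(long_frames, short_frames)
    best_lag = max(range(len(correlations)), key=correlations.__getitem__)
    return best_lag * hop_seconds
```

This direct loop is quadratic in the number of frames; a practical implementation would more likely use an FFT-based correlation, but the result it searches for is the same peak.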

The tool outputs the calculated offset in seconds, and a score indicating the prominence of the chosen correlation peak. This score can be used as a very rough estimate of the accuracy of the calculated offset. An offset with a score greater than ten is likely to be correct, within the accuracy of the tool, at least for audio without similar repeated sections. An offset with a score less than five is unlikely to be correct, and a manual check should be carried out. Note that the value of the score depends on the length of the audio analysed.
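The thresholds above can be applied mechanically. The sketch below is an illustration only: it assumes prominence is measured as a standard score (z-score) of the peak against all correlation values, which may not be exactly what the tool computes. The function names and the "uncertain" label for scores between five and ten are hypothetical.

```python
import statistics


def peak_score(correlations):
    """Standard score of the highest correlation relative to all values.

    Assumption: prominence is expressed as (peak - mean) / standard deviation.
    """
    peak = max(correlations)
    mean = statistics.fmean(correlations)
    spread = statistics.pstdev(correlations)
    if spread == 0:
        return 0.0  # a flat correlation curve has no prominent peak
    return (peak - mean) / spread


def interpret(score):
    """Map a score to the rough guidance given in the text above."""
    if score > 10:
        return "likely correct"
    if score < 5:
        return "unlikely to be correct; check manually"
    return "uncertain"  # hypothetical label for the range the text leaves open
```

Because the score depends on the length of the audio analysed, these thresholds are only a rough guide when comparing runs over clips of different lengths.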

The tool uses FFmpeg for transcoding, so it should work on all file formats supported by FFmpeg. It is tested for compatibility with Python 3.8-3.12 on Linux, Windows and macOS. Other Python versions and platforms may or may not work.
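For readers unfamiliar with this kind of transcoding step, the following sketch shows one way to convert an arbitrary input file to mono WAV with FFmpeg from Python. It is an assumption about the general approach, not the tool's actual invocation: the helper names and the 8000 Hz default sample rate are hypothetical, and `ffmpeg` must be available on the `PATH` for `transcode` to run.

```python
import subprocess


def build_ffmpeg_command(source_path, wav_path, sample_rate=8000):
    """Build an FFmpeg command that writes a mono WAV file.

    The sample rate default is an illustrative assumption.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", source_path,        # input file in any format FFmpeg can read
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the requested rate
        "-f", "wav",              # force WAV output
        wav_path,
    ]


def transcode(source_path, wav_path, sample_rate=8000):
    """Run FFmpeg and raise CalledProcessError if it fails."""
    subprocess.run(
        build_ffmpeg_command(source_path, wav_path, sample_rate),
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Passing the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.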