Information Retrieval: Implementing and Evaluating Search Engines


Chapter 6: Index Compression

  • Page 213, Table 6.9
    The program used for measuring compression effectiveness and decoding performance erroneously ignored some of the terms in the TREC Terabyte query set. These are the correct numbers:

    Compression      Compression         Decoding          Cumulative Overhead
    Method           (bits per docid)    (ns per docid)    (decoding + disk I/O)
    Gamma            4.56                 7.73             13.98 ns
    Golomb           3.78                10.85             16.03 ns
    Rice             3.81                 6.46             11.68 ns
    LLRUN            3.89                 7.04             12.37 ns
    Interpolative    3.87                26.50             31.80 ns
    vByte            8.11                 1.39             12.50 ns
    Simple-9         4.91                 2.76              9.49 ns

    The ordering of the respective methods, in terms of compression effectiveness as well as decoding performance, is unaffected.
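
    Because the last column is defined as decoding plus disk I/O, the disk I/O share of each entry can be recovered by subtraction. The following is a minimal Python sketch of that arithmetic using only the corrected values above. The names TABLE_6_9, disk_io_ns, and ranked_by are hypothetical helpers introduced for illustration, and the derived disk I/O figures are not stated in the errata itself.

```python
# Corrected Table 6.9 values, copied from the errata:
# method -> (bits per docid, decoding ns per docid, cumulative ns per docid)
TABLE_6_9 = {
    "Gamma":         (4.56, 7.73, 13.98),
    "Golomb":        (3.78, 10.85, 16.03),
    "Rice":          (3.81, 6.46, 11.68),
    "LLRUN":         (3.89, 7.04, 12.37),
    "Interpolative": (3.87, 26.50, 31.80),
    "vByte":         (8.11, 1.39, 12.50),
    "Simple-9":      (4.91, 2.76, 9.49),
}


def disk_io_ns(method):
    """Disk I/O time per docid, derived as cumulative overhead minus decoding time."""
    _bits, decode_ns, cumulative_ns = TABLE_6_9[method]
    return round(cumulative_ns - decode_ns, 2)


def ranked_by(column):
    """Method names sorted ascending by column (0 = bits, 1 = decoding, 2 = cumulative)."""
    return sorted(TABLE_6_9, key=lambda method: TABLE_6_9[method][column])


if __name__ == "__main__":
    for method in TABLE_6_9:
        print(f"{method:<14} derived disk I/O: {disk_io_ns(method):.2f} ns per docid")
    print("smallest postings first:", ranked_by(0))
    print("fastest decoding first: ", ranked_by(1))
    print("lowest cumulative first:", ranked_by(2))
```

    Calling ranked_by with a different column index shows that the three orderings differ: the method with the fastest decoder is not the one with the lowest cumulative overhead once disk I/O is included.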