template <typename CountType>
class CountMin
Defined at line 35 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
Implements Count-Min Sketch.
The count-min sketch represents a distribution of hashable observations as |num_hashes| arrays
of size |num_cells_per_hash|. It relies on a fixed choice of |num_hashes| hash functions mapping
the observation space into the range [0, ..., |num_cells_per_hash| - 1].
This implementation flattens the representation into a single vector of integer-valued cells.
The cell index range corresponding to the kth hash function begins at (k * |num_cells_per_hash|)
and ends (inclusive) at ((k + 1) * |num_cells_per_hash| - 1).
Incrementing the count for an observation |data| has the effect of incrementing the |num_hashes|
cells in the set
cells(|data|) = { (k * |num_cells_per_hash| + h_k(|data|)) for k = 0, ..., |num_hashes| - 1 }.
To estimate the recorded count for |data|, CountMin computes the minimum value of the cells in
cells(|data|).
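The flattened layout and the min-based estimate described above can be sketched as follows. This is a minimal illustration, not the Cobalt implementation: the class name ToyCountMin, the fixed uint64_t cell type, and the salted std::hash family are all assumptions made for the example, and the hash functions used by CountMin itself are not specified in this reference.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for the documented class; names and hashing are assumptions.
class ToyCountMin {
 public:
  ToyCountMin(size_t num_cells_per_hash, size_t num_hashes)
      : num_cells_per_hash_(num_cells_per_hash),
        num_hashes_(num_hashes),
        cells_(num_cells_per_hash * num_hashes, 0) {
    assert(num_cells_per_hash > 0 && num_hashes > 0);
  }

  // One flattened index per hash function: k * num_cells_per_hash + h_k(data).
  std::vector<size_t> GetCellIndices(const std::string& data) const {
    std::vector<size_t> indices;
    indices.reserve(num_hashes_);
    for (size_t k = 0; k < num_hashes_; ++k) {
      indices.push_back(k * num_cells_per_hash_ + Hash(k, data));
    }
    return indices;
  }

  // Adds |count| to every cell in cells(data).
  void Increment(const std::string& data, uint64_t count) {
    for (size_t index : GetCellIndices(data)) {
      cells_[index] += count;
    }
  }

  // The estimate is the minimum over cells(data); collisions can only inflate it.
  uint64_t GetCount(const std::string& data) const {
    uint64_t estimate = std::numeric_limits<uint64_t>::max();
    for (size_t index : GetCellIndices(data)) {
      estimate = std::min(estimate, cells_[index]);
    }
    return estimate;
  }

  size_t size() const { return cells_.size(); }

 private:
  // Assumed hash family: std::hash salted with k, reduced to [0, num_cells_per_hash - 1].
  size_t Hash(size_t k, const std::string& data) const {
    return std::hash<std::string>{}(std::to_string(k) + ":" + data) % num_cells_per_hash_;
  }

  size_t num_cells_per_hash_;
  size_t num_hashes_;
  std::vector<uint64_t> cells_;
};
```

Because every increment only adds to cells, the estimate for an observation is never lower than its true count when counts are non-negative; hash collisions with other observations can only push it higher.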
Public Methods
CountMin<CountType> MakeSketch (size_t num_cells_per_hash, size_t num_hashes)
Returns a CountMin sketch with dimensions |num_cells_per_hash| and |num_hashes| and with
|num_cells_per_hash| * |num_hashes| zero-valued cells.
Defined at line 41 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
lib::statusor::StatusOr<CountMin<CountType>> MakeSketchFromCells (size_t num_cells_per_hash, size_t num_hashes, std::vector<CountType> cells)
Returns a CountMin sketch with dimensions |num_cells_per_hash| and |num_hashes| containing
|cells| if the size of |cells| is equal to |num_cells_per_hash| * |num_hashes|, or an error
status if not.
Defined at line 50 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
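The size check performed by MakeSketchFromCells might look roughly like the sketch below. The names ToySketch and MakeToySketchFromCells are hypothetical, and std::optional is used here only as a stand-in for lib::statusor::StatusOr, with std::nullopt playing the role of the error status.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical plain-data stand-in for a sketch.
struct ToySketch {
  size_t num_cells_per_hash;
  size_t num_hashes;
  std::vector<uint64_t> cells;
};

// Accepts |cells| only if its size matches the requested dimensions.
// std::nullopt stands in for the error status returned by the documented method.
std::optional<ToySketch> MakeToySketchFromCells(size_t num_cells_per_hash,
                                                size_t num_hashes,
                                                std::vector<uint64_t> cells) {
  if (cells.size() != num_cells_per_hash * num_hashes) {
    return std::nullopt;
  }
  return ToySketch{num_cells_per_hash, num_hashes, std::move(cells)};
}
```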
size_t size ()
Returns the number of cells in the sketch.
Defined at line 61 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
void Increment (const std::string & data, CountType count)
Increments the number of observations of |data| by |count|.
Defined at line 64 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
std::vector<size_t> GetCellIndices (const std::string & data)
Returns a vector of the |num_hashes| indices corresponding to |data|, without updating the
sketch.
Defined at line 70 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
CountType GetCount (const std::string & data)
Gets the estimated count for the specified |data|.
Defined at line 79 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
Status IncrementCell (size_t cell_index, CountType count)
Increments the value at |cell_index| by |count|. Returns an error status if |cell_index| is not
a valid cell index.
Defined at line 85 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
lib::statusor::StatusOr<CountType> GetCellValue (size_t cell_index)
Gets the value of the sketch cell with index |cell_index|. Returns an error status if
|cell_index| is not a valid cell index.
Defined at line 95 of file ../../third_party/cobalt/src/algorithms/privacy/count_min.h
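The index validation described for IncrementCell and GetCellValue can be illustrated with a small bounds-checked wrapper. This is a sketch under assumptions: the class name ToyCells is hypothetical, a bool return stands in for Status, and std::optional stands in for lib::statusor::StatusOr.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in exposing direct, bounds-checked access to flattened cells.
class ToyCells {
 public:
  explicit ToyCells(size_t num_cells) : cells_(num_cells, 0) {}

  // Returns false (standing in for an error status) if |cell_index| is out of range.
  bool IncrementCell(size_t cell_index, uint64_t count) {
    if (cell_index >= cells_.size()) {
      return false;
    }
    cells_[cell_index] += count;
    return true;
  }

  // Returns std::nullopt (standing in for an error status) if |cell_index| is out of range.
  std::optional<uint64_t> GetCellValue(size_t cell_index) const {
    if (cell_index >= cells_.size()) {
      return std::nullopt;
    }
    return cells_[cell_index];
  }

 private:
  std::vector<uint64_t> cells_;
};
```

Valid indices run from 0 to size() - 1, so an index equal to the number of cells is rejected.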