Histograms module

Histograms and related functionality.

Contents

Classes

class dip::Histogram
Computes and holds histograms.

Functions

auto dip::CumulativeHistogram(dip::Histogram const& in) -> dip::Histogram
Computes a cumulative histogram from in. See dip::Histogram::Cumulative.
auto dip::Smooth(dip::Histogram const& in, dip::FloatArray const& sigma) -> dip::Histogram
Returns a smoothed version of the histogram in. See dip::Histogram::Smooth.
auto dip::Smooth(dip::Histogram const& in, dip::dfloat sigma = 1) -> dip::Histogram
Returns a smoothed version of the histogram in. See dip::Histogram::Smooth.
auto dip::Mean(dip::Histogram const& in) -> dip::FloatArray
Computes the mean value of the data represented by the histogram.
auto dip::Covariance(dip::Histogram const& in) -> dip::FloatArray
Computes the covariance matrix of the data represented by the histogram.
auto dip::MarginalPercentile(dip::Histogram const& in, dip::dfloat percentile = 50) -> dip::FloatArray
Computes the marginal percentile value of the data represented by the histogram. The marginal percentile is a percentile computed independently on each dimension, and thus is not necessarily one of the input values.
auto dip::MarginalMedian(dip::Histogram const& in) -> dip::FloatArray
Computes the marginal median value of the data represented by the histogram. The median is the 50th percentile, see dip::MarginalPercentile for details.
auto dip::Mode(dip::Histogram const& in) -> dip::FloatArray
Returns the mode, the bin with the largest count.
auto dip::PearsonCorrelation(dip::Histogram const& in) -> dip::dfloat
Computes the Pearson correlation coefficient between two images from their joint histogram in.
auto dip::Regression(dip::Histogram const& in) -> dip::RegressionParameters
Fits a line through the histogram. Returns the slope and intercept of the regression line.
auto dip::MutualInformation(dip::Histogram const& in) -> dip::dfloat
Calculates the mutual information, in bits, between two images from their joint histogram in.
auto dip::Entropy(dip::Histogram const& in) -> dip::dfloat
Calculates the entropy, in bits, of an image from its histogram in.
auto dip::IsodataThreshold(dip::Histogram const& in, dip::uint nThresholds = 1) -> dip::FloatArray
Determines a set of nThresholds thresholds using the Isodata algorithm (k-means clustering), and the image’s histogram in.
auto dip::OtsuThreshold(dip::Histogram const& in) -> dip::dfloat
Determines a threshold using the maximal inter-class variance method by Otsu, and the image’s histogram in.
auto dip::MinimumErrorThreshold(dip::Histogram const& in) -> dip::dfloat
Determines a threshold using the minimal error method, and the image’s histogram in.
auto dip::GaussianMixtureModelThreshold(dip::Histogram const& in, dip::uint nThresholds = 1) -> dip::FloatArray
Determines a set of nThresholds thresholds by modeling the histogram with a Gaussian Mixture Model, fitting the model using the Expectation Maximization procedure, and choosing the optimal Bayes thresholds.
auto dip::TriangleThreshold(dip::Histogram const& in, dip::dfloat sigma = 4.0) -> dip::dfloat
Determines a threshold using the chord method (a.k.a. skewed bi-modality, maximum distance to triangle), and the image’s histogram in.
auto dip::BackgroundThreshold(dip::Histogram const& in, dip::dfloat distance = 2.0, dip::dfloat sigma = 4.0) -> dip::dfloat
Determines a threshold using the unimodal background-symmetry method, and the image’s histogram in.
auto dip::KMeansClustering(dip::Histogram const& in, dip::uint nClusters = 2) -> dip::Histogram
Partitions a (multi-dimensional) histogram into nClusters partitions using k-means clustering.
auto dip::MinimumVariancePartitioning(dip::Histogram const& in, dip::uint nClusters = 2) -> dip::Histogram
Partitions a (multi-dimensional) histogram into nClusters partitions iteratively using Otsu thresholding along individual dimensions.
auto dip::EqualizationLookupTable(dip::Histogram const& in) -> dip::LookupTable
Computes a lookup table that, when applied to an image with the histogram in, yields an image with a flat histogram (or rather a histogram that is as flat as possible).
auto dip::MatchingLookupTable(dip::Histogram const& in, dip::Histogram const& example) -> dip::LookupTable
Computes a lookup table that, when applied to an image with the histogram in, yields an image with a histogram as similar as possible to example.
auto dip::PerObjectHistogram(dip::Image const& grey, dip::Image const& label, dip::Image const& mask = {}, dip::Histogram::Configuration configuration = {}, dip::String const& mode = S::FRACTION, dip::String const& background = S::EXCLUDE) -> dip::Distribution
Computes a histogram of grey values in grey for each object in label.

Operators

auto dip::operator+(dip::Histogram const& lhs, dip::Histogram const& rhs) -> dip::Histogram
Adds two histograms.
auto dip::operator-(dip::Histogram const& lhs, dip::Histogram const& rhs) -> dip::Histogram
Subtracts two histograms.
auto dip::operator<<(std::ostream& os, dip::Histogram const& histogram) -> std::ostream&
You can output a dip::Histogram to std::cout or any other stream. Some information about the histogram is printed.

Function documentation

dip::FloatArray dip::Mean(dip::Histogram const& in)

Computes the mean value of the data represented by the histogram.

Computing statistics through the histogram is efficient, but yields an approximation equivalent to computing the statistic on data rounded to the bin centers.
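As an illustration of the approximation described above, the following standalone sketch computes the mean of a 1D histogram as the count-weighted average of the bin centers. It is not DIPlib code: the function name HistogramMean and the parameters lowerBound and binSize are hypothetical, and equally wide bins are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count-weighted mean of the bin centers of a 1D histogram.
// Hypothetical helper for illustration; assumes equally wide bins starting at `lowerBound`.
double HistogramMean( std::vector< std::size_t > const& counts, double lowerBound, double binSize ) {
   assert( !counts.empty() );
   double sum = 0.0;
   double total = 0.0;
   for( std::size_t i = 0; i < counts.size(); ++i ) {
      // Every sample in bin i is treated as if it had the value of the bin center.
      double center = lowerBound + ( static_cast< double >( i ) + 0.5 ) * binSize;
      sum += center * static_cast< double >( counts[ i ] );
      total += static_cast< double >( counts[ i ] );
   }
   return total > 0.0 ? sum / total : 0.0;
}
```

The result equals the mean of the original data only if every sample sits exactly on a bin center; narrower bins reduce the rounding error.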

dip::FloatArray dip::Covariance(dip::Histogram const& in)

Computes the covariance matrix of the data represented by the histogram.

Computing statistics through the histogram is efficient, but yields an approximation equivalent to computing the statistic on data rounded to the bin centers.

The returned array contains the elements of the symmetric covariance matrix in the same order as tensor elements are stored in a symmetric tensor image (see dip::Tensor::Shape). That is, there are \(\frac{1}{2}n(n+1)\) elements (with \(n\) the histogram dimensionality), with the diagonal matrix elements stored first, and the off-diagonal elements after. For a 2D histogram, the three elements are xx, yy, and xy.

dip::FloatArray dip::MarginalPercentile(dip::Histogram const& in, dip::dfloat percentile = 50)

Computes the marginal percentile value of the data represented by the histogram. The marginal percentile is a percentile computed independently on each dimension, and thus is not necessarily one of the input values.

In the 1D histogram case (for scalar images) this function computes the approximate percentile (i.e. the bin containing the percentile value). The distinction between marginal percentile and percentile is only relevant for multivariate data (histograms from tensor images). In short, this function computes a 1D percentile on each of the 1D projections of the histogram.

The percentile must be a value between 0 (minimum) and 100 (maximum).

Computing statistics through the histogram is efficient, but yields an approximation equivalent to computing the statistic on data rounded to the bin centers.
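The 1D case can be sketched as a walk over the cumulative counts. This is a standalone illustration, not DIPlib's implementation: the name PercentileBin is hypothetical, and the rule used here (the first bin at which the cumulative count reaches the requested fraction) is one possible convention. For a multi-dimensional histogram, the same function would be applied to each 1D projection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the first non-empty bin at which the cumulative count reaches
// `percentile` percent of all samples. Hypothetical helper for illustration.
std::size_t PercentileBin( std::vector< std::size_t > const& counts, double percentile ) {
   assert( !counts.empty() );
   assert( percentile >= 0.0 && percentile <= 100.0 );
   std::size_t total = 0;
   for( std::size_t c : counts ) {
      total += c;
   }
   double target = percentile / 100.0 * static_cast< double >( total );
   std::size_t cumulative = 0;
   for( std::size_t i = 0; i < counts.size(); ++i ) {
      cumulative += counts[ i ];
      // Skip leading empty bins so that percentile 0 yields the minimum occupied bin.
      if(( cumulative > 0 ) && ( static_cast< double >( cumulative ) >= target )) {
         return i;
      }
   }
   return counts.size() - 1;
}
```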

dip::FloatArray dip::Mode(dip::Histogram const& in)

Returns the mode, the bin with the largest count.

When multiple bins have the same, largest count, the first bin encountered is returned. This is the bin with the lowest linear index.
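The tie-breaking rule can be illustrated on a plain array of counts. In this standalone sketch (the name ModeBin is hypothetical), std::max_element returns the first of several equal maxima, which matches the "lowest linear index" behavior described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Linear index of the bin with the largest count; on ties, the lowest index wins.
// Hypothetical helper for illustration, not part of DIPlib.
std::size_t ModeBin( std::vector< std::size_t > const& counts ) {
   assert( !counts.empty() );
   auto it = std::max_element( counts.begin(), counts.end() );
   return static_cast< std::size_t >( std::distance( counts.begin(), it ));
}
```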

dip::dfloat dip::PearsonCorrelation(dip::Histogram const& in)

Computes the Pearson correlation coefficient between two images from their joint histogram in.

in must be a 2D histogram. The number of bins along each axis determines the precision for the result.
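To show how a correlation coefficient follows from a joint histogram, the sketch below accumulates weighted moments over all bins. It is a standalone illustration, not DIPlib's implementation: the name PearsonFromJointHistogram is hypothetical, joint[i][j] is assumed to hold the count for bin i of the first image and bin j of the second, and bin indices stand in for bin centers (which leaves the coefficient unchanged when all bins along an axis are equally wide).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation coefficient from a 2D joint histogram.
// Hypothetical helper for illustration; bin indices are used as sample values.
double PearsonFromJointHistogram( std::vector< std::vector< double >> const& joint ) {
   double n = 0.0, sx = 0.0, sy = 0.0, sxx = 0.0, syy = 0.0, sxy = 0.0;
   for( std::size_t i = 0; i < joint.size(); ++i ) {
      for( std::size_t j = 0; j < joint[ i ].size(); ++j ) {
         double w = joint[ i ][ j ];
         double x = static_cast< double >( i );
         double y = static_cast< double >( j );
         n += w;
         sx += w * x;
         sy += w * y;
         sxx += w * x * x;
         syy += w * y * y;
         sxy += w * x * y;
      }
   }
   assert( n > 0.0 );
   double covXY = sxy - sx * sy / n;
   double varX = sxx - sx * sx / n;
   double varY = syy - sy * sy / n;
   assert( varX > 0.0 && varY > 0.0 ); // undefined if either image is constant
   return covXY / std::sqrt( varX * varY );
}
```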

dip::RegressionParameters dip::Regression(dip::Histogram const& in)

Fits a line through the histogram. Returns the slope and intercept of the regression line.

in must be a 2D histogram. The number of bins along each axis determines the precision for the result.

dip::dfloat dip::MutualInformation(dip::Histogram const& in)

Calculates the mutual information, in bits, between two images from their joint histogram in.

in must be a 2D histogram. The number of bins along each axis determines the precision for the result.
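The quantity computed is the sum over all bins of p(x,y) log2( p(x,y) / ( p(x) p(y) )), with the probabilities estimated from the joint histogram and its two marginals. The following standalone sketch illustrates this under the assumption of a rectangular array of counts; the name MutualInformationBits is hypothetical and this is not DIPlib's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mutual information, in bits, estimated from a 2D joint histogram.
// Hypothetical helper for illustration; all rows must have the same length.
double MutualInformationBits( std::vector< std::vector< double >> const& joint ) {
   assert( !joint.empty() && !joint[ 0 ].empty() );
   std::size_t nx = joint.size();
   std::size_t ny = joint[ 0 ].size();
   std::vector< double > px( nx, 0.0 ); // marginal counts of the first image
   std::vector< double > py( ny, 0.0 ); // marginal counts of the second image
   double total = 0.0;
   for( std::size_t i = 0; i < nx; ++i ) {
      assert( joint[ i ].size() == ny );
      for( std::size_t j = 0; j < ny; ++j ) {
         px[ i ] += joint[ i ][ j ];
         py[ j ] += joint[ i ][ j ];
         total += joint[ i ][ j ];
      }
   }
   assert( total > 0.0 );
   double mi = 0.0;
   for( std::size_t i = 0; i < nx; ++i ) {
      for( std::size_t j = 0; j < ny; ++j ) {
         if( joint[ i ][ j ] > 0.0 ) { // empty bins contribute nothing
            double pxy = joint[ i ][ j ] / total;
            mi += pxy * std::log2( pxy / (( px[ i ] / total ) * ( py[ j ] / total )));
         }
      }
   }
   return mi;
}
```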

dip::dfloat dip::Entropy(dip::Histogram const& in)

Calculates the entropy, in bits, of an image from its histogram in.

in must be a 1D histogram. The number of bins determines the precision for the result.
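Entropy in bits is the negated sum of p log2 p over all non-empty bins, with p the fraction of samples in each bin. A minimal standalone sketch, with the hypothetical name EntropyBits and not taken from DIPlib:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Shannon entropy, in bits, of the distribution described by a 1D histogram.
// Hypothetical helper for illustration.
double EntropyBits( std::vector< double > const& counts ) {
   double total = 0.0;
   for( double c : counts ) {
      total += c;
   }
   assert( total > 0.0 );
   double entropy = 0.0;
   for( double c : counts ) {
      if( c > 0.0 ) { // empty bins contribute nothing
         double p = c / total;
         entropy -= p * std::log2( p );
      }
   }
   return entropy;
}
```

Four equally filled bins give 2 bits; a single occupied bin gives 0 bits. Splitting the same data over more bins generally raises the value, which is why the number of bins affects the result.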

dip::FloatArray dip::IsodataThreshold(dip::Histogram const& in, dip::uint nThresholds = 1)

Determines a set of nThresholds thresholds using the Isodata algorithm (k-means clustering), and the image’s histogram in.

The algorithm uses k-means clustering (with k set to nThresholds + 1) to separate the histogram into compact, similarly-weighted segments. This means that, for each class, the histogram should have a clearly visible mode, and all modes should be similarly sized. A class that has many fewer pixels than another class will likely not be segmented correctly.

The implementation here uses initial seeds distributed evenly over the histogram range, rather than the more common random seeds. This fixed initialization makes this a deterministic algorithm.

Note that the original Isodata algorithm (referenced below) does not use the image histogram, but instead works directly on the image. 2-means clustering on the histogram yields an identical result to the original Isodata algorithm, but is much more efficient. The implementation here generalizes to multiple thresholds because k-means clustering allows any number of thresholds.
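The procedure described above can be sketched as 1D k-means on the bin counts with evenly spaced initial seeds. This is a standalone illustration under stated assumptions, not DIPlib's implementation: the name IsodataSketch, the exact seed placement, the iteration cap of 100, and the use of bin-index units (bin i has center i + 0.5) are all choices made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// k-means on a 1D histogram with k = nThresholds + 1 and evenly spaced seeds.
// Returns thresholds in bin-index units. Hypothetical helper for illustration.
std::vector< double > IsodataSketch( std::vector< double > const& counts, std::size_t nThresholds ) {
   assert( !counts.empty() );
   std::size_t k = nThresholds + 1;
   std::vector< double > centers( k );
   for( std::size_t c = 0; c < k; ++c ) {
      // Deterministic initialization: seeds spread evenly over the histogram range.
      centers[ c ] = ( static_cast< double >( c ) + 0.5 ) * static_cast< double >( counts.size() ) / static_cast< double >( k );
   }
   std::vector< double > thresholds( nThresholds );
   for( int iteration = 0; iteration < 100; ++iteration ) {
      // Each threshold lies halfway between two adjacent cluster centers.
      for( std::size_t c = 0; c < nThresholds; ++c ) {
         thresholds[ c ] = 0.5 * ( centers[ c ] + centers[ c + 1 ] );
      }
      std::vector< double > sum( k, 0.0 );
      std::vector< double > weight( k, 0.0 );
      for( std::size_t i = 0; i < counts.size(); ++i ) {
         double x = static_cast< double >( i ) + 0.5;
         std::size_t c = 0;
         while(( c < nThresholds ) && ( x > thresholds[ c ] )) {
            ++c;
         }
         sum[ c ] += x * counts[ i ];
         weight[ c ] += counts[ i ];
      }
      // Move each center to the weighted mean of its segment; stop when nothing moves.
      bool changed = false;
      for( std::size_t c = 0; c < k; ++c ) {
         if( weight[ c ] > 0.0 ) {
            double updated = sum[ c ] / weight[ c ];
            if( updated != centers[ c ] ) {
               centers[ c ] = updated;
               changed = true;
            }
         }
      }
      if( !changed ) {
         break;
      }
   }
   for( std::size_t c = 0; c < nThresholds; ++c ) {
      thresholds[ c ] = 0.5 * ( centers[ c ] + centers[ c + 1 ] );
   }
   return thresholds;
}
```

Because the seeds are fixed, repeated calls on the same histogram return the same thresholds.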

dip::dfloat dip::OtsuThreshold(dip::Histogram const& in)

Determines a threshold using the maximal inter-class variance method by Otsu, and the image’s histogram in.

This method assumes a bimodal distribution. It finds the threshold that maximizes the inter-class variance, which is equivalent to minimizing the intra-class variances. That is, the two parts of the histogram generated when splitting at the threshold value are as compact as possible.
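A minimal sketch of the search for the maximal inter-class variance follows. It is a standalone illustration, not DIPlib's implementation: the name OtsuSplit is hypothetical, and it works in bin-index units, returning the last bin of the lower class rather than a grey value. The inter-class variance for a split is proportional to w0 w1 (m0 - m1)^2, with w0, w1 the class weights and m0, m1 the class means.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the bin index t that maximizes the inter-class variance when the
// histogram is split into bins [0, t] and [t+1, end]. Hypothetical helper.
std::size_t OtsuSplit( std::vector< double > const& counts ) {
   assert( counts.size() >= 2 );
   double total = 0.0;
   double totalSum = 0.0;
   for( std::size_t i = 0; i < counts.size(); ++i ) {
      total += counts[ i ];
      totalSum += static_cast< double >( i ) * counts[ i ];
   }
   double w0 = 0.0;   // weight of the lower class
   double sum0 = 0.0; // first-order moment of the lower class
   double best = -1.0;
   std::size_t bestIndex = 0;
   for( std::size_t i = 0; i + 1 < counts.size(); ++i ) {
      w0 += counts[ i ];
      sum0 += static_cast< double >( i ) * counts[ i ];
      double w1 = total - w0;
      if(( w0 <= 0.0 ) || ( w1 <= 0.0 )) {
         continue; // one class is empty, the variance is undefined
      }
      double difference = sum0 / w0 - ( totalSum - sum0 ) / w1;
      double between = w0 * w1 * difference * difference;
      if( between > best ) { // on ties, the first (lowest) split is kept
         best = between;
         bestIndex = i;
      }
   }
   return bestIndex;
}
```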

dip::dfloat dip::MinimumErrorThreshold(dip::Histogram const& in)

Determines a threshold using the minimal error method, and the image’s histogram in.

This method assumes a bimodal distribution, composed of two Gaussian distributions with (potentially) different variances, and finds the threshold that minimizes the classification error. The algorithm, however, doesn’t try to fit two Gaussians to the data. Instead, it uses an error measure that depends on the second order central moment for the two regions of the histogram obtained by dividing it at a given threshold value. The threshold with the lowest error measure is returned.

dip::FloatArray dip::GaussianMixtureModelThreshold(dip::Histogram const& in, dip::uint nThresholds = 1)

Determines a set of nThresholds thresholds by modeling the histogram with a Gaussian Mixture Model, fitting the model using the Expectation Maximization procedure, and choosing the optimal Bayes thresholds.

The algorithm fits a mixture of nThresholds + 1 Gaussians to the 1D histogram, and returns the thresholds in between the fitted Gaussians that minimize the Bayes error (if possible). Note that the sum of a narrow Gaussian and an overlapping broad Gaussian would typically yield two thresholds (dividing space into three regions, the middle one belonging to the narrow Gaussian and the other two to the broad Gaussian). This routine instead always returns a single threshold in between each of the Gaussian means.

dip::dfloat dip::TriangleThreshold(dip::Histogram const& in, dip::dfloat sigma = 4.0)

Determines a threshold using the chord method (a.k.a. skewed bi-modality, maximum distance to triangle), and the image’s histogram in.

This method finds the point along the intensity distribution that is furthest from the line between the peak and either of the histogram ends. This point typically coincides with, or is close to, the inflection point of a unimodal distribution where the background forms the large peak, and the foreground contributes a small amount to the histogram and is spread out. For example, small fluorescent dots typically yield such a distribution, as does any thin line drawing.

To robustly detect and characterize the background peak, smoothing is necessary. This function applies a Gaussian filter with sigma, in samples (i.e. this value is independent of the bin width). See dip::Histogram::Smooth. Do note that smoothing also broadens the distribution.
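The geometric idea can be sketched as follows. This is a standalone illustration, not DIPlib's implementation: the name TriangleBin is hypothetical, the Gaussian smoothing step is omitted, and the chord is assumed to run from the peak to whichever histogram end is farther away. The sketch maximizes the vertical gap between chord and histogram; since that gap is proportional to the perpendicular distance to the chord, the same bin maximizes both.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Returns the bin furthest below the chord from the histogram peak to the
// more distant histogram end. Hypothetical helper; no smoothing is applied.
std::size_t TriangleBin( std::vector< double > const& counts ) {
   assert( counts.size() >= 3 );
   std::size_t last = counts.size() - 1;
   std::size_t peak = static_cast< std::size_t >(
         std::distance( counts.begin(), std::max_element( counts.begin(), counts.end() )));
   // Assumption of this sketch: use the end of the histogram farther from the peak.
   std::size_t end = ( last - peak >= peak ) ? last : 0;
   std::size_t lo = std::min( peak, end );
   std::size_t hi = std::max( peak, end );
   double slope = ( counts[ hi ] - counts[ lo ] ) / static_cast< double >( hi - lo );
   double bestGap = 0.0;
   std::size_t best = peak;
   for( std::size_t i = lo; i <= hi; ++i ) {
      double chord = counts[ lo ] + slope * static_cast< double >( i - lo );
      double gap = chord - counts[ i ]; // positive where the histogram lies below the chord
      if( gap > bestGap ) {
         bestGap = gap;
         best = i;
      }
   }
   return best;
}
```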

dip::dfloat dip::BackgroundThreshold(dip::Histogram const& in, dip::dfloat distance = 2.0, dip::dfloat sigma = 4.0)

Determines a threshold using the unimodal background-symmetry method, and the image’s histogram in.

The method finds the peak in the intensity distribution, characterizes its half width at half maximum, then sets the threshold at distance times the half width.

This method assumes a unimodal distribution, where the background forms the large peak, and the foreground contributes a small amount to the histogram and is spread out. For example, small fluorescent dots typically yield such a distribution, as does any thin line drawing. The background peak can be at either end of the histogram. However, it is important that the peak is not clipped too much, for example when too many background pixels in a fluorescence image are underexposed.

To robustly detect and characterize the background peak, smoothing is necessary. This function applies a Gaussian filter with sigma, in samples (i.e. this value is independent of the bin width). See dip::Histogram::Smooth. Do note that smoothing also broadens the distribution; even though this broadening is taken into account when computing the peak width, too much smoothing will be detrimental.
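The peak-and-half-width computation can be sketched as below. This is a standalone illustration under explicit assumptions, not DIPlib's implementation: the name BackgroundThresholdSketch is hypothetical, smoothing (and the correction for its broadening) is omitted, the background peak is assumed to sit at the low-intensity side, the half width is measured on the low flank of the peak with linear interpolation, and the result is expressed in bin-index units.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Threshold at `distance` half widths (at half maximum) above the histogram peak.
// Hypothetical helper; assumes the foreground lies above the background peak.
double BackgroundThresholdSketch( std::vector< double > const& counts, double distance ) {
   assert( !counts.empty() );
   std::size_t peak = static_cast< std::size_t >(
         std::distance( counts.begin(), std::max_element( counts.begin(), counts.end() )));
   double half = counts[ peak ] / 2.0;
   // Walk down the low flank until the count no longer exceeds half the maximum.
   std::size_t i = peak;
   while(( i > 0 ) && ( counts[ i ] > half )) {
      --i;
   }
   double hwhm = static_cast< double >( peak - i );
   if(( i < peak ) && ( counts[ i ] < half )) {
      // The half-maximum crossing lies between bins i and i+1; interpolate linearly.
      hwhm -= ( half - counts[ i ] ) / ( counts[ i + 1 ] - counts[ i ] );
   }
   return static_cast< double >( peak ) + distance * hwhm;
}
```

If the low flank never drops to half the maximum (a clipped peak), the sketch falls back to the distance between the peak and the first bin, which underestimates the width. This mirrors the warning above about clipped background peaks.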

dip::Histogram dip::KMeansClustering(dip::Histogram const& in, dip::uint nClusters = 2)

Partitions a (multi-dimensional) histogram into nClusters partitions using k-means clustering.

K-means clustering partitions the histogram into compact, similarly-weighted segments. The algorithm uses a random initialization, so multiple runs might yield different results.

For 1D histograms, dip::IsodataThreshold is more efficient, and deterministic.

dip::Histogram dip::MinimumVariancePartitioning(dip::Histogram const& in, dip::uint nClusters = 2)

Partitions a (multi-dimensional) histogram into nClusters partitions iteratively using Otsu thresholding along individual dimensions.

Minimum variance partitioning builds a k-d tree of the histogram, where, for each node, the marginal histogram with the largest variance is split using Otsu thresholding.

For two clusters in a 1D histogram, use dip::OtsuThreshold.

dip::LookupTable dip::EqualizationLookupTable(dip::Histogram const& in)

Computes a lookup table that, when applied to an image with the histogram in, yields an image with a flat histogram (or rather a histogram that is as flat as possible).

The lookup table will be of type dip::DT_DFLOAT, meaning that applying it to an image will yield an image of that type. Convert the lookup table to a different type using dip::LookupTable::Convert.

The lookup table will produce an output in the range [0,255].

in must be a 1D histogram.
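A common way to build such a table is to scale the normalized cumulative histogram to the output range. The sketch below illustrates this for the range [0,255]; it is a standalone example with the hypothetical name EqualizationTable, and the exact scaling DIPlib uses may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One output value per input bin: the cumulative fraction of samples up to and
// including that bin, scaled to [0,255]. Hypothetical helper for illustration.
std::vector< double > EqualizationTable( std::vector< double > const& counts ) {
   assert( !counts.empty() );
   double total = 0.0;
   for( double c : counts ) {
      total += c;
   }
   assert( total > 0.0 );
   std::vector< double > table( counts.size() );
   double cumulative = 0.0;
   for( std::size_t i = 0; i < counts.size(); ++i ) {
      cumulative += counts[ i ];
      table[ i ] = 255.0 * cumulative / total;
   }
   return table;
}
```

Densely populated input bins end up far apart in the output and sparse bins end up close together, which is what flattens the histogram.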

dip::LookupTable dip::MatchingLookupTable(dip::Histogram const& in, dip::Histogram const& example)

Computes a lookup table that, when applied to an image with the histogram in, yields an image with a histogram as similar as possible to example.

The lookup table will be of type dip::DT_DFLOAT, meaning that applying it to an image will yield an image of that type. Convert the lookup table to a different type using dip::LookupTable::Convert.

The lookup table will produce an output in the range [example.LowerBound(),example.UpperBound()].

in and example must be 1D histograms.
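Histogram matching is commonly done by pairing bins with equal cumulative fractions. The following standalone sketch maps each bin of in to the first bin of example whose cumulative fraction is at least as large. The names NormalizedCumulative and MatchingTable are hypothetical, the table holds bin indices of example rather than grey values, and DIPlib's actual procedure (including any interpolation) may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Cumulative histogram divided by the total count, so the last value is 1.
std::vector< double > NormalizedCumulative( std::vector< double > const& counts ) {
   double total = 0.0;
   for( double c : counts ) {
      total += c;
   }
   assert( total > 0.0 );
   std::vector< double > cdf( counts.size() );
   double cumulative = 0.0;
   for( std::size_t i = 0; i < counts.size(); ++i ) {
      cumulative += counts[ i ];
      cdf[ i ] = cumulative / total;
   }
   return cdf;
}

// For each bin of `in`, the index of the matching bin of `example`.
// Hypothetical helper for illustration.
std::vector< std::size_t > MatchingTable( std::vector< double > const& in, std::vector< double > const& example ) {
   assert( !in.empty() && !example.empty() );
   std::vector< double > cdfIn = NormalizedCumulative( in );
   std::vector< double > cdfExample = NormalizedCumulative( example );
   std::vector< std::size_t > table( in.size() );
   std::size_t j = 0;
   for( std::size_t i = 0; i < in.size(); ++i ) {
      // Both cumulative curves are non-decreasing, so j never needs to move back.
      while(( j + 1 < example.size() ) && ( cdfExample[ j ] < cdfIn[ i ] )) {
         ++j;
      }
      table[ i ] = j;
   }
   return table;
}
```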

dip::Distribution dip::PerObjectHistogram(dip::Image const& grey, dip::Image const& label, dip::Image const& mask = {}, dip::Histogram::Configuration configuration = {}, dip::String const& mode = S::FRACTION, dip::String const& background = S::EXCLUDE)

Computes a histogram of grey values in grey for each object in label.

label is a labelled image. For each object, the corresponding pixels in grey are used to build a histogram. mask can optionally be used to further constrain which pixels are used. grey must be a real-valued image, but does not need to be scalar.

configuration describes the histogram. The same configuration is applied to each of the histograms, so they all use the same bins. Percentiles are computed over all tensor components of grey (and masked by mask).

mode can be "fraction" (the default) or "count". The former yields normalized distributions, whereas the latter yields an integer pixel count per bin.

If background is "include", the label ID 0 will be included in the result if present in the image. Otherwise, background is "exclude", and the label ID 0 will be ignored.

The output dip::Distribution has bin centers as the x values, and one y value per object and per tensor component. These are accessed as distribution[bin].Y(objectID, tensor), where objectID is the pixel value in label, minus one if background is "exclude".

Note that you will need to also include <diplib/distribution.h> to use this function.