Histograms module #include "diplib/histogram.h"
Histograms and related functionality.
Classes
-
class dip::
Histogram - Computes and holds histograms.
Functions
-
auto dip::
CumulativeHistogram(dip::Histogram const& in) -> dip::Histogram - Computes a cumulative histogram from in. See dip::Histogram::Cumulative.
-
auto dip::
Smooth(dip::Histogram const& in, dip::FloatArray const& sigma) -> dip::Histogram - Returns a smoothed version of the histogram in. See dip::Histogram::Smooth.
-
auto dip::
Smooth(dip::Histogram const& in, dip::dfloat sigma = 1) -> dip::Histogram - Returns a smoothed version of the histogram in. See dip::Histogram::Smooth.
-
auto dip::
Mean(dip::Histogram const& in) -> dip::FloatArray - Computes the mean value of the data represented by the histogram.
-
auto dip::
Covariance(dip::Histogram const& in) -> dip::FloatArray - Computes the covariance matrix of the data represented by the histogram.
-
auto dip::
MarginalPercentile(dip::Histogram const& in, dip::dfloat percentile = 50) -> dip::FloatArray - Computes the marginal percentile value of the data represented by the histogram. The marginal percentile is a percentile computed independently on each dimension, and thus is not necessarily one of the input values.
-
auto dip::
MarginalMedian(dip::Histogram const& in) -> dip::FloatArray - Computes the marginal median value of the data represented by the histogram. The median is the 50th percentile, see dip::MarginalPercentile for details.
-
auto dip::
Mode(dip::Histogram const& in) -> dip::FloatArray - Returns the mode, the bin with the largest count.
-
auto dip::
PearsonCorrelation(dip::Histogram const& in) -> dip::dfloat - Computes the Pearson correlation coefficient between two images from their joint histogram in.
-
auto dip::
Regression(dip::Histogram const& in) -> dip::RegressionParameters - Fits a line through the histogram. Returns the slope and intercept of the regression line.
-
auto dip::
MutualInformation(dip::Histogram const& in) -> dip::dfloat - Calculates the mutual information, in bits, between two images from their joint histogram in.
-
auto dip::
Entropy(dip::Histogram const& in) -> dip::dfloat - Calculates the entropy, in bits, of an image from its histogram in.
-
auto dip::
GaussianMixtureModel(dip::Histogram const& in, dip::uint numberOfGaussians, dip::uint maxIter = 20) -> std::vector<GaussianParameters> - Determines the parameters for a Gaussian Mixture Model fitted to the histogram in.
-
auto dip::
IsodataThreshold(dip::Histogram const& in, dip::uint nThresholds = 1) -> dip::FloatArray - Determines a set of nThresholds thresholds using the Isodata algorithm (k-means clustering), and the image’s histogram in.
-
auto dip::
OtsuThreshold(dip::Histogram const& in) -> dip::dfloat - Determines a threshold using the maximal inter-class variance method by Otsu, and the image’s histogram in.
-
auto dip::
MinimumErrorThreshold(dip::Histogram const& in) -> dip::dfloat - Determines a threshold using the minimal error method, and the image’s histogram in.
-
auto dip::
GaussianMixtureModelThreshold(dip::Histogram const& in, dip::uint nThresholds = 1) -> dip::FloatArray - Determines a set of nThresholds thresholds by modeling the histogram with a Gaussian Mixture Model, and choosing the optimal Bayes thresholds.
-
auto dip::
TriangleThreshold(dip::Histogram const& in, dip::dfloat sigma = 4.0) -> dip::dfloat - Determines a threshold using the chord method (a.k.a. skewed bi-modality, maximum distance to triangle), and the image’s histogram in.
-
auto dip::
BackgroundThreshold(dip::Histogram const& in, dip::dfloat distance = 2.0, dip::dfloat sigma = 4.0) -> dip::dfloat - Determines a threshold using the unimodal background-symmetry method, and the image’s histogram in.
-
auto dip::
KMeansClustering(dip::Histogram const& in, dip::uint nClusters = 2) -> dip::Histogram - Partitions a (multi-dimensional) histogram into nClusters partitions using k-means clustering.
-
auto dip::
MinimumVariancePartitioning(dip::Histogram const& in, dip::uint nClusters = 2) -> dip::Histogram - Partitions a (multi-dimensional) histogram into nClusters partitions iteratively using Otsu thresholding along individual dimensions.
-
auto dip::
EqualizationLookupTable(dip::Histogram const& in) -> dip::LookupTable - Computes a lookup table that, when applied to an image with the histogram in, yields an image with a flat histogram (or rather a histogram that is as flat as possible).
-
auto dip::
MatchingLookupTable(dip::Histogram const& in, dip::Histogram const& example) -> dip::LookupTable - Computes a lookup table that, when applied to an image with the histogram in, yields an image with a histogram as similar as possible to example.
-
auto dip::
PerObjectHistogram(dip::Image const& grey, dip::Image const& label, dip::Image const& mask = {}, dip::Histogram::Configuration configuration = {}, dip::String const& mode = S::FRACTION, dip::String const& background = S::EXCLUDE) -> dip::Distribution - Computes a histogram of grey values in grey for each object in label.
Operators
-
auto dip::
operator+(dip::Histogram const& lhs, dip::Histogram const& rhs) -> dip::Histogram - Adds two histograms.
-
auto dip::
operator-(dip::Histogram const& lhs, dip::Histogram const& rhs) -> dip::Histogram - Subtracts two histograms.
-
auto dip::
operator<<(std::ostream& os, dip::Histogram const& histogram) -> std::ostream& - You can output a dip::Histogram to std::cout or any other stream. Some information about the histogram is printed.
Function documentation
dip::FloatArray
dip:: Mean(dip::Histogram const& in)
Computes the mean value of the data represented by the histogram.
Computing statistics through the histogram is efficient, but yields an approximation equivalent to computing the statistic on data rounded to the bin centers.
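The approximation mentioned above can be illustrated with a small, self-contained sketch that does not use DIPlib. It computes the mean of a 1D histogram by weighting each bin center with its count. The function name HistogramMean and the parameters lowerBound and binSize are hypothetical and chosen for illustration only; they are not part of the library's API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mean of the data represented by a 1D histogram with equally sized bins.
// `lowerBound` and `binSize` are hypothetical parameters describing the bins.
double HistogramMean(std::vector<std::size_t> const& counts, double lowerBound, double binSize) {
    assert(!counts.empty());
    double sum = 0.0;
    double total = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        // Every sample in bin i is treated as if it sat at the bin center.
        double center = lowerBound + (static_cast<double>(i) + 0.5) * binSize;
        sum += center * static_cast<double>(counts[i]);
        total += static_cast<double>(counts[i]);
    }
    return total > 0.0 ? sum / total : 0.0;
}
```

For example, the samples 0.2, 2.1, 2.6 and 2.9 fall into unit-sized bins starting at 0 with counts {1, 0, 3}. The sketch returns 2.0 for these counts, whereas the exact mean of the samples is 1.95; the difference is the rounding to bin centers.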
dip::FloatArray
dip:: Covariance(dip::Histogram const& in)
Computes the covariance matrix of the data represented by the histogram.
Computing statistics through the histogram is efficient, but yields an approximation equivalent to computing the statistic on data rounded to the bin centers.
The returned array contains the elements of the symmetric covariance matrix in the same order as tensor elements are stored in a symmetric tensor image (see dip::Tensor::Shape). That is, there are n(n+1)/2 elements (with n the histogram dimensionality), with the diagonal matrix elements stored first, and the off-diagonal elements after. For a 2D histogram, the three elements are xx, yy, and xy.
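As an illustration of the element order for the 2D case, the following sketch computes the three covariance elements from a joint histogram without using DIPlib. The function name HistogramCovariance2D is hypothetical. The sketch assumes unit-sized bins starting at 0 and normalizes by the total count; the library may normalize differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// counts[i][j] holds the number of samples in bin i along x and bin j along y.
// Assumption: unit-sized bins starting at 0, so bin centers are i + 0.5 and j + 0.5.
// Returns {xx, yy, xy}: the diagonal elements first, then the off-diagonal element.
std::array<double, 3> HistogramCovariance2D(std::vector<std::vector<std::size_t>> const& counts) {
    assert(!counts.empty() && !counts[0].empty());
    double n = 0.0, sx = 0.0, sy = 0.0, sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        for (std::size_t j = 0; j < counts[i].size(); ++j) {
            double w = static_cast<double>(counts[i][j]);
            double x = static_cast<double>(i) + 0.5;
            double y = static_cast<double>(j) + 0.5;
            n += w;
            sx += w * x;
            sy += w * y;
            sxx += w * x * x;
            syy += w * y * y;
            sxy += w * x * y;
        }
    }
    assert(n > 0.0);
    double mx = sx / n;
    double my = sy / n;
    // Assumption: normalization by the total count (population covariance).
    return {{ sxx / n - mx * mx, syy / n - my * my, sxy / n - mx * my }};
}
```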
dip::FloatArray
dip:: MarginalPercentile(dip::Histogram const& in,
dip::dfloat percentile = 50)
Computes the marginal percentile value of the data represented by the histogram. The marginal percentile is a percentile computed independently on each dimension, and thus is not necessarily one of the input values.
In the 1D histogram case (for scalar images) this function computes the approximate percentile (i.e. the bin containing the percentile value). The distinction between marginal percentile and percentile is only relevant for multivariate data (histograms from tensor images). In short, here we compute the 1D percentile on each of the 1D projections of the histogram.
The percentile must be a value between 0 (minimum) and 100 (maximum).
Computing statistics through the histogram is efficient, but yields an approximation equivalent to computing the statistic on data rounded to the bin centers.
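The 1D case can be sketched as a walk over the cumulative counts until the requested fraction of samples is reached. The code below is a stand-alone illustration, not DIPlib code. The function name HistogramPercentile, the parameters lowerBound and binSize, and the exact rule for picking the bin are assumptions; the library may resolve boundary cases differently. For a multi-dimensional histogram, the same procedure would be applied to each 1D projection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the center of the bin that contains the requested percentile.
// `lowerBound` and `binSize` are hypothetical parameters describing the bins.
double HistogramPercentile(std::vector<std::size_t> const& counts,
                           double lowerBound, double binSize, double percentile) {
    assert(!counts.empty());
    assert(percentile >= 0.0 && percentile <= 100.0);
    double total = 0.0;
    for (std::size_t c : counts) {
        total += static_cast<double>(c);
    }
    double target = percentile / 100.0 * total;
    double cumulative = 0.0;
    std::size_t bin = counts.size() - 1;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        cumulative += static_cast<double>(counts[i]);
        // Skip leading empty bins, then stop at the first bin reaching the target.
        if (cumulative > 0.0 && cumulative >= target) {
            bin = i;
            break;
        }
    }
    return lowerBound + (static_cast<double>(bin) + 0.5) * binSize;
}
```

With this rule, a percentile of 0 yields the first non-empty bin and a percentile of 100 yields the last non-empty bin.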
dip::FloatArray
dip:: Mode(dip::Histogram const& in)
Returns the mode, the bin with the largest count.
When multiple bins have the same, largest count, the first bin encountered is returned. This is the bin with the lowest linear index.
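The tie-breaking rule can be mimicked for a 1D array of counts with the standard library, since std::max_element returns the first of several equal maxima. This is only a sketch on a plain std::vector; the function name ModeBin is hypothetical and the code does not use DIPlib.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Index of the bin with the largest count; on ties, the lowest index wins.
std::size_t ModeBin(std::vector<std::size_t> const& counts) {
    assert(!counts.empty());
    auto it = std::max_element(counts.begin(), counts.end());
    return static_cast<std::size_t>(std::distance(counts.begin(), it));
}
```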
dip::dfloat
dip:: PearsonCorrelation(dip::Histogram const& in)
Computes the Pearson correlation coefficient between two images from their joint histogram in.
in must be a 2D histogram. The number of bins along each axis determines the precision for the result.
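A minimal sketch of how a correlation coefficient can be obtained from a joint histogram is given below. It is a stand-alone illustration that does not use DIPlib, and the function name HistogramPearson is hypothetical. Because the coefficient is invariant to offset and scale, the sketch uses i + 0.5 and j + 0.5 as bin centers, which assumes equally sized bins along each axis.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// counts[i][j] holds the number of pixel pairs in bin i (first image) and bin j (second image).
double HistogramPearson(std::vector<std::vector<std::size_t>> const& counts) {
    assert(!counts.empty() && !counts[0].empty());
    double n = 0.0, sx = 0.0, sy = 0.0, sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        for (std::size_t j = 0; j < counts[i].size(); ++j) {
            double w = static_cast<double>(counts[i][j]);
            double x = static_cast<double>(i) + 0.5;
            double y = static_cast<double>(j) + 0.5;
            n += w;
            sx += w * x;
            sy += w * y;
            sxx += w * x * x;
            syy += w * y * y;
            sxy += w * x * y;
        }
    }
    assert(n > 0.0);
    double mx = sx / n;
    double my = sy / n;
    double vx = sxx / n - mx * mx;   // variance along x
    double vy = syy / n - my * my;   // variance along y
    double cxy = sxy / n - mx * my;  // covariance
    assert(vx > 0.0 && vy > 0.0);    // the coefficient is undefined for constant data
    return cxy / std::sqrt(vx * vy);
}
```

A histogram with all counts on the main diagonal gives a coefficient of 1, and one with all counts on the anti-diagonal gives -1.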
dip::RegressionParameters
dip:: Regression(dip::Histogram const& in)
Fits a line through the histogram. Returns the slope and intercept of the regression line.
in must be a 2D histogram. The number of bins along each axis determines the precision for the result.
dip::dfloat
dip:: MutualInformation(dip::Histogram const& in)
Calculates the mutual information, in bits, between two images from their joint histogram in.
in must be a 2D histogram. The number of bins along each axis determines the precision for the result.
dip::dfloat
dip:: Entropy(dip::Histogram const& in)
Calculates the entropy, in bits, of an image from its histogram in.
in must be a 1D histogram. The number of bins determines the precision for the result.
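Both the entropy and the mutual information described above follow from normalized bin counts. The sketch below is a stand-alone illustration that does not use DIPlib; the function names EntropyBits and MutualInformationBits are hypothetical. It uses the Shannon entropy H = -sum(p log2 p) and the identity I(X;Y) = H(X) + H(Y) - H(X,Y), which may or may not be how the library computes these values internally.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Shannon entropy, in bits, of a set of bin counts. Empty bins contribute nothing.
double EntropyBits(std::vector<double> const& counts) {
    double total = 0.0;
    for (double c : counts) {
        total += c;
    }
    assert(total > 0.0);
    double h = 0.0;
    for (double c : counts) {
        if (c > 0.0) {
            double p = c / total;
            h -= p * std::log2(p);
        }
    }
    return h;
}

// Mutual information, in bits, from a joint histogram: I = H(X) + H(Y) - H(X,Y).
double MutualInformationBits(std::vector<std::vector<double>> const& joint) {
    assert(!joint.empty() && !joint[0].empty());
    std::vector<double> px(joint.size(), 0.0);     // marginal counts along x
    std::vector<double> py(joint[0].size(), 0.0);  // marginal counts along y
    std::vector<double> pxy;                       // flattened joint counts
    for (std::size_t i = 0; i < joint.size(); ++i) {
        assert(joint[i].size() == py.size());
        for (std::size_t j = 0; j < joint[i].size(); ++j) {
            px[i] += joint[i][j];
            py[j] += joint[i][j];
            pxy.push_back(joint[i][j]);
        }
    }
    return EntropyBits(px) + EntropyBits(py) - EntropyBits(pxy);
}
```

Four equally filled bins give an entropy of 2 bits. A 2x2 joint histogram with all counts on the diagonal gives 1 bit of mutual information, and a uniformly filled one gives 0 bits.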
std::vector<GaussianParameters>
dip:: GaussianMixtureModel(dip::Histogram const& in,
dip::uint numberOfGaussians,
dip::uint maxIter = 20)
Determines the parameters for a Gaussian Mixture Model fitted to the histogram in.
numberOfGaussians Gaussians will be fitted to the histogram using the Expectation Maximization (EM) procedure.
The parameters are initialized deterministically: the means are distributed equally over the domain, the sigmas are all set to the distance between means, and the amplitudes are all set to 1.
maxIter sets how many iterations are run. There is currently no other stopping criterion.
The output is sorted by amplitude, most important component first.
dip::FloatArray
dip:: IsodataThreshold(dip::Histogram const& in,
dip::uint nThresholds = 1)
Determines a set of nThresholds thresholds using the Isodata algorithm (k-means clustering), and the image’s histogram in.
The algorithm uses k-means clustering (with k set to nThresholds + 1) to separate the histogram into compact, similarly-weighted segments. This means that, for each class, the histogram should have a clearly visible mode, and all modes should be similarly sized. A class that has many fewer pixels than another class will likely not be segmented correctly.
The implementation here uses initial seeds distributed evenly over the histogram range, rather than the more common random seeds. This fixed initialization makes this a deterministic algorithm.
Note that the original Isodata algorithm (referenced below) does not use the image histogram, but instead works directly on the image. 2-means clustering on the histogram yields an identical result to the original Isodata algorithm, but is much more efficient. The implementation here generalizes to multiple thresholds because k-means clustering allows any number of thresholds.
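The single-threshold case (2-means on the histogram) can be sketched as follows. This is a stand-alone illustration, not the library's implementation. The function name IsodataSingleThreshold is hypothetical, bins are assumed to be unit-sized and to start at 0, and the starting threshold in the middle of the range is an assumption standing in for the evenly distributed seeds mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-threshold Isodata (2-means) on a 1D histogram with unit bins starting at 0.
// Returns the threshold in the same units as the bin centers (i + 0.5).
double IsodataSingleThreshold(std::vector<std::size_t> const& counts, int maxIter = 100) {
    assert(counts.size() >= 2);
    // Assumption: start in the middle of the range (a fixed, deterministic start).
    double t = static_cast<double>(counts.size()) / 2.0;
    for (int iter = 0; iter < maxIter; ++iter) {
        double w0 = 0.0, s0 = 0.0, w1 = 0.0, s1 = 0.0;
        for (std::size_t i = 0; i < counts.size(); ++i) {
            double center = static_cast<double>(i) + 0.5;
            double w = static_cast<double>(counts[i]);
            if (center < t) {
                w0 += w;
                s0 += w * center;
            } else {
                w1 += w;
                s1 += w * center;
            }
        }
        if (w0 == 0.0 || w1 == 0.0) {
            break;  // one class is empty, the means cannot be updated
        }
        double next = 0.5 * (s0 / w0 + s1 / w1);  // midpoint between the two class means
        if (next == t) {
            break;  // class assignment no longer changes
        }
        t = next;
    }
    return t;
}
```

For the counts {10, 0, 0, 0, 0, 0, 1, 1} the class means settle at 0.5 and 7.0, so the sketch returns 3.75. The example also shows the sensitivity described above: the threshold sits halfway between the means regardless of how unequal the class sizes are.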
dip::dfloat
dip:: OtsuThreshold(dip::Histogram const& in)
Determines a threshold using the maximal inter-class variance method by Otsu, and the image’s histogram in.
This method assumes a bimodal distribution. It finds the threshold that maximizes the inter-class variance, which is equivalent to minimizing the intra-class variances. That is, the two parts of the histogram generated when splitting at the threshold value are as compact as possible.
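The search for the best split can be sketched with a single pass over the candidate thresholds. The code below is a stand-alone illustration that does not use DIPlib. The function name OtsuSplit is hypothetical, and returning a bin index rather than a grey value is a simplification. Since the total variance is the sum of the within-class and between-class variances, the split that maximizes the between-class term is also the one that minimizes the within-class term.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index k of the last bin of the lower class:
// bins 0..k form one class, bins k+1..end form the other.
// Assumption: unit-sized bins with centers at i + 0.5.
std::size_t OtsuSplit(std::vector<std::size_t> const& counts) {
    assert(counts.size() >= 2);
    double totalW = 0.0;
    double totalS = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        double w = static_cast<double>(counts[i]);
        totalW += w;
        totalS += w * (static_cast<double>(i) + 0.5);
    }
    double w0 = 0.0;
    double s0 = 0.0;
    double best = -1.0;
    std::size_t bestK = 0;
    for (std::size_t k = 0; k + 1 < counts.size(); ++k) {
        double w = static_cast<double>(counts[k]);
        w0 += w;
        s0 += w * (static_cast<double>(k) + 0.5);
        double w1 = totalW - w0;
        if (w0 == 0.0 || w1 == 0.0) {
            continue;  // one class is empty, no valid split here
        }
        double diff = s0 / w0 - (totalS - s0) / w1;  // difference of class means
        // Proportional to the between-class variance (constant factor 1/N^2 omitted).
        double between = w0 * w1 * diff * diff;
        if (between > best) {
            best = between;
            bestK = k;
        }
    }
    return bestK;
}
```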
dip::dfloat
dip:: MinimumErrorThreshold(dip::Histogram const& in)
Determines a threshold using the minimal error method, and the image’s histogram in.
This method assumes a bimodal distribution, composed of two Gaussian distributions with (potentially) different variances, and finds the threshold that minimizes the classification error. The algorithm, however, doesn’t try to fit two Gaussians to the data; it instead uses an error measure that depends on the second order central moment for the two regions of the histogram obtained by dividing it at a given threshold value. The threshold with the lowest error measure is returned.
dip::FloatArray
dip:: GaussianMixtureModelThreshold(dip::Histogram const& in,
dip::uint nThresholds = 1)
Determines a set of nThresholds thresholds by modeling the histogram with a Gaussian Mixture Model, and choosing the optimal Bayes thresholds.
The algorithm fits a mixture of nThresholds + 1 Gaussians to the 1D histogram, and returns the thresholds in between the fitted Gaussians that minimize the Bayes error (if possible). Note that the sum of a narrow Gaussian and an overlapping broad Gaussian would typically yield two thresholds (dividing space into three regions, the middle one belonging to the narrow Gaussian and the other two to the broad Gaussian). This routine instead always returns a single threshold in between each of the Gaussian means.
dip::dfloat
dip:: TriangleThreshold(dip::Histogram const& in,
dip::dfloat sigma = 4.0)
Determines a threshold using the chord method (a.k.a. skewed bi-modality, maximum distance to triangle), and the image’s histogram in.
This method finds the point along the intensity distribution that is furthest from the line between the peak and either of the histogram ends. This typically coincides with or is close to the inflection point of a unimodal distribution where the background forms the large peak, and the foreground contributes a small amount to the histogram and is spread out. For example, small fluorescent dots typically yield such a distribution, as does any thin line drawing.
To robustly detect and characterize the background peak, smoothing is necessary. This function applies a Gaussian filter with sigma, in samples (i.e. this value is independent of the bin width). See dip::Histogram::Smooth. Do note that smoothing also broadens the distribution.
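The geometric idea behind the chord method can be sketched as follows. This is a stand-alone illustration, not the library's implementation. The function name TriangleBin is hypothetical. The sketch assumes that the histogram has already been smoothed, and that the chord is drawn from the peak to the histogram end that is further away from it; both are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Returns the index of the bin that is furthest from the chord between
// the histogram peak and the far end of the histogram.
// Assumptions: `h` is already smoothed; the far end is the one further from the peak.
std::size_t TriangleBin(std::vector<double> const& h) {
    assert(h.size() >= 3);
    auto peakIt = std::max_element(h.begin(), h.end());
    std::size_t peak = static_cast<std::size_t>(std::distance(h.begin(), peakIt));
    std::size_t last = h.size() - 1;
    std::size_t end = (peak < last - peak) ? last : 0;
    double x1 = static_cast<double>(peak);
    double y1 = h[peak];
    double x2 = static_cast<double>(end);
    double y2 = h[end];
    std::size_t lo = std::min(peak, end);
    std::size_t hi = std::max(peak, end);
    std::size_t best = peak;
    double bestDist = 0.0;
    for (std::size_t i = lo; i <= hi; ++i) {
        double x = static_cast<double>(i);
        // Proportional to the perpendicular distance from (x, h[i]) to the chord;
        // the constant chord length is left out because only the maximum matters.
        double d = std::abs((y2 - y1) * x - (x2 - x1) * h[i] + x2 * y1 - y2 * x1);
        if (d > bestDist) {
            bestDist = d;
            best = i;
        }
    }
    return best;
}
```

For the values {0, 10, 4, 2, 1, 1, 0, 0} the peak is at bin 1, the chord runs to bin 7, and bin 3 is the one furthest from the chord.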
dip::dfloat
dip:: BackgroundThreshold(dip::Histogram const& in,
dip::dfloat distance = 2.0,
dip::dfloat sigma = 4.0)
Determines a threshold using the unimodal background-symmetry method, and the image’s histogram in.
The method finds the peak in the intensity distribution, characterizes its half width at half maximum, then sets the threshold at distance times the half width.
This method assumes a unimodal distribution, where the background forms the large peak, and the foreground contributes a small amount to the histogram and is spread out. For example, small fluorescent dots typically yield such a distribution, as does any thin line drawing. The background peak can be at either end of the histogram. However, it is important that the peak is not clipped too much, for example when too many background pixels in a fluorescence image are underexposed.
To robustly detect and characterize the background peak, smoothing is necessary. This function applies a Gaussian filter with sigma, in samples (i.e. this value is independent of the bin width). See dip::Histogram::Smooth. Do note that smoothing also broadens the distribution; even though this broadening is taken into account when computing the peak width, too much smoothing will be detrimental.
dip::Histogram
dip:: KMeansClustering(dip::Histogram const& in,
dip::uint nClusters = 2)
Partitions a (multi-dimensional) histogram into nClusters partitions using k-means clustering.
K-means clustering partitions the histogram into compact, similarly-weighted segments. The algorithm uses a random initialization, so multiple runs might yield different results.
For 1D histograms, dip::IsodataThreshold is more efficient, and deterministic.
dip::Histogram
dip:: MinimumVariancePartitioning(dip::Histogram const& in,
dip::uint nClusters = 2)
Partitions a (multi-dimensional) histogram into nClusters partitions iteratively using Otsu thresholding along individual dimensions.
Minimum variance partitioning builds a k-d tree of the histogram, where, for each node, the marginal histogram with the largest variance is split using Otsu thresholding.
For two clusters in a 1D histogram, use dip::OtsuThreshold.
dip::LookupTable
dip:: EqualizationLookupTable(dip::Histogram const& in)
Computes a lookup table that, when applied to an image with the histogram in, yields an image with a flat histogram (or rather a histogram that is as flat as possible).
The lookup table will be of type dip::DT_DFLOAT, meaning that applying it to an image will yield an image of that type. Convert the lookup table to a different type using dip::LookupTable::Convert.
The lookup table will produce an output in the range [0,255].
in must be a 1D histogram.
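Histogram equalization is commonly built from the cumulative distribution, scaled to the output range. The sketch below illustrates that idea on a plain array of counts. It does not use DIPlib, the function name EqualizationTable is hypothetical, and the exact mapping (cumulative count including the current bin, divided by the total) is an assumption that may differ in detail from what the library computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns one output value per input bin, in the range [0, 255].
// Assumption: the output is the scaled cumulative count up to and including the bin.
std::vector<double> EqualizationTable(std::vector<std::size_t> const& counts) {
    assert(!counts.empty());
    double total = 0.0;
    for (std::size_t c : counts) {
        total += static_cast<double>(c);
    }
    assert(total > 0.0);
    std::vector<double> table(counts.size());
    double cumulative = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        cumulative += static_cast<double>(counts[i]);
        table[i] = 255.0 * cumulative / total;
    }
    return table;
}
```

The resulting table is non-decreasing, so the ordering of grey values is preserved, and densely populated bins are spread over a larger part of the output range than sparsely populated ones.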
dip::LookupTable
dip:: MatchingLookupTable(dip::Histogram const& in,
dip::Histogram const& example)
Computes a lookup table that, when applied to an image with the histogram in, yields an image with a histogram as similar as possible to example.
The lookup table will be of type dip::DT_DFLOAT, meaning that applying it to an image will yield an image of that type. Convert the lookup table to a different type using dip::LookupTable::Convert.
The lookup table will produce an output in the range [example.LowerBound(), example.UpperBound()].
in and example must be 1D histograms.
dip::Distribution
dip:: PerObjectHistogram(dip::Image const& grey,
dip::Image const& label,
dip::Image const& mask = {},
dip::Histogram::Configuration configuration = {},
dip::String const& mode = S::FRACTION,
dip::String const& background = S::EXCLUDE)
Computes a histogram of grey values in grey for each object in label.
label is a labelled image. For each object, the corresponding pixels in grey are used to build a histogram. mask can optionally be used to further constrain which pixels are used. grey must be a real-valued image, but does not need to be scalar.
configuration describes the histogram. The same configuration is applied to each of the histograms, so they all use the same bins. Percentiles are computed over all tensor components of grey (and masked by mask).
mode can be "fraction" (the default) or "count". The former yields normalized distributions, whereas the latter yields an integer pixel count per bin.
If background is "include", the label ID 0 will be included in the result if present in the image. Otherwise, background is "exclude", and the label ID 0 will be ignored.
The output dip::Distribution has bin centers as the x values, and one y value per object and per tensor component. These are accessed as distribution[bin].Y(objectID, tensor), where objectID is the pixel value in label, minus one if background is "exclude".
Note that you will need to also include <diplib/distribution.h> to use this function.