(chap:08)=
# Quantum estimation, the Rayleigh limit, and sub-resolution information
:::{admonition} Chapter opening
:class: chapter-opening
The Rayleigh criterion is an imaging criterion: it asks whether the main peaks of two Airy patterns can be visually separated in the image plane. Astronomical observations more often pose an estimation problem. Given a set of photons, a point-spread function, and a source model, how small can the error be on a binary separation, a stellar angular diameter, or a brightness moment? Quantum estimation theory writes this as a matching problem between the optical field state and the measurement. For weak thermal sources, direct imaging loses part of the separation information below the Rayleigh scale, while spatial-mode measurements can retain Fisher information about separations and source size {cite:p}`1925PCPS...22..700F,1969JSP.....1..231H,2000PhRvL..85.3789K,2016PhRvX...6c1033T,2016OExpr..24.3684N,2017NJPh...19b3054T`.
:::

## The Rayleigh criterion and the estimation problem

For a telescope aperture $D$ and wavelength $\lambda$, the first zero of the circular-aperture diffraction pattern lies at 

```{math}
:label: eq:ch08-rayleigh
\theta_{\rm R}=1.22\,\frac{\lambda}{D}
```

 radians. Multiplication by $206265\times10^3$ converts this angle to milliarcseconds. An $8~\mathrm{m}$ telescope has $\theta_{\rm R}\simeq15.7~\mathrm{mas}$ at $\lambda=500~\mathrm{nm}$, and about $50~\mathrm{mas}$ at $1.6~\mu\mathrm{m}$. A $30~\mathrm{m}$ telescope reaches about $4.2~\mathrm{mas}$ at $500~\mathrm{nm}$. A real point-spread function is set jointly by aperture obstruction, segmented mirrors, aberrations, adaptive-optics correction, and atmospheric residuals. To see the estimation structure cleanly, start with a one-dimensional Gaussian PSF, 

```{math}
:label: eq:ch08-gaussian-psf
h(x)=\frac{1}{\sqrt{2\pi}\sigma}
       \exp\!\left(-\frac{x^2}{2\sigma^2}\right),
```

 where $x$ is a focal-plane coordinate, or an equivalent angular coordinate, and $\sigma$ is the PSF width in the same coordinate. In angular coordinates, a typical $\sigma_\theta$ is of order $0.4\lambda/D$ to $0.5\lambda/D$, with the actual value obtained from measured PSFs or injected source calibration.
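The Rayleigh angles quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch of Eq. {eq}`eq:ch08-rayleigh`; the function name `rayleigh_mas` and the constant name `MAS_PER_RAD` are illustrative choices, not part of any library.

```python
# Rayleigh angle theta_R = 1.22 * lambda / D, converted to milliarcseconds.
MAS_PER_RAD = 206265.0e3  # approximate milliarcseconds per radian, as in the text


def rayleigh_mas(wavelength_m: float, aperture_m: float) -> float:
    """Return the Rayleigh angle in milliarcseconds for a circular aperture."""
    return 1.22 * wavelength_m / aperture_m * MAS_PER_RAD


# The three telescope/wavelength combinations quoted in the text.
for aperture_m, wavelength_m in [(8.0, 500e-9), (8.0, 1.6e-6), (30.0, 500e-9)]:
    theta = rayleigh_mas(wavelength_m, aperture_m)
    print(f"D = {aperture_m:4.0f} m, lambda = {wavelength_m * 1e9:6.0f} nm: "
          f"theta_R = {theta:5.1f} mas")
```

The same helper gives a rough PSF width if one multiplies $\lambda/D$ by a measured factor in the $0.4$ to $0.5$ range instead of $1.22$; that factor should come from calibration, not from this sketch.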

Consider two equally bright, mutually incoherent point sources separated by angle $s$, with their centroid at zero. Direct imaging records photon positions $x_i$. The single-photon probability density is 

```{math}
:label: eq:ch08-direct-probability
p(x|s)=\frac{1}{2}h\!\left(x-\frac{s}{2}\right)
        +\frac{1}{2}h\!\left(x+\frac{s}{2}\right).
```

 The left-hand side is estimated from a normalized image, an event-position histogram, or a PSF fit, and has units of inverse angle. On the right-hand side, $s$ and $\sigma$ must be expressed in the same units. Equation {eq}`eq:ch08-direct-probability` assumes equal brightness, identical spectra, the same PSF for the two sources, and a background that has either been subtracted or modeled separately. If the flux ratio is unknown, it belongs in the parameter vector.

When $s\ll\sigma$, the direct image changes very little. Expanding Eq. {eq}`eq:ch08-direct-probability`, the first-order terms from the two shifted PSFs cancel, and the image-plane shape first changes through an $s^2$-level broadening. Not seeing two peaks is not the same as having no information, but the information delivered by direct pixel measurement does drop rapidly. For a Gaussian PSF, the single-photon Fisher information for the separation parameter satisfies 

```{math}
:label: eq:ch08-direct-small-s
F_{ss}^{\rm direct}\simeq \frac{s^2}{8\sigma^4},
  \qquad s\ll\sigma .
```

 $F_{ss}^{\rm direct}$ has units of angle$^{-2}$. If the parameter is written as $q=s/\sigma$, then $F_{qq}^{\rm direct}\simeq q^2/8$. This quadratic loss is the Rayleigh curse. More photons still reduce the error, but for a fixed number of photons, closer sources become much harder to separate with direct imaging {cite:p}`2016PhRvX...6c1033T,2018Optic...5.1177P`.
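The small-separation formula can be checked numerically. The sketch below integrates the single-photon Fisher information of Eq. {eq}`eq:ch08-direct-probability` on a grid and compares it with $s^2/(8\sigma^4)$, which follows from the expansion $p(x|s)\simeq h(x)+(s^2/8)\,h''(x)$. The grid size, integration range, and finite-difference step are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def gaussian_psf(x, sigma):
    """Normalized Gaussian intensity PSF of width sigma."""
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)


def p_direct(x, s, sigma):
    """Single-photon density for two equal incoherent sources at +/- s/2."""
    return 0.5 * gaussian_psf(x - s / 2.0, sigma) + 0.5 * gaussian_psf(x + s / 2.0, sigma)


def fisher_direct(s, sigma=1.0, half_width=12.0, n_grid=24001):
    """Per-photon Fisher information for s from direct imaging (numerical)."""
    x = np.linspace(-half_width * sigma, half_width * sigma, n_grid)
    dx = x[1] - x[0]
    ds = 1e-4 * sigma  # central finite-difference step (assumed adequate)
    p = p_direct(x, s, sigma)
    dp_ds = (p_direct(x, s + ds, sigma) - p_direct(x, s - ds, sigma)) / (2.0 * ds)
    return np.sum(dp_ds**2 / p) * dx


sigma = 1.0
for s in [0.05, 0.1, 0.25, 0.5, 1.0]:
    numeric = fisher_direct(s, sigma)
    small_s = s**2 / (8.0 * sigma**4)      # small-separation approximation
    quantum = 1.0 / (4.0 * sigma**2)       # per-photon quantum limit, for scale
    print(f"s/sigma = {s:4.2f}: numeric = {numeric:.3e}, "
          f"s^2/8 = {small_s:.3e}, fraction of quantum limit = {numeric / quantum:.3e}")
```

The approximation and the numerical value agree closely for $s\lesssim0.1\sigma$ and drift apart as $s$ approaches $\sigma$, where higher-order terms in the expansion matter.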


```{figure} ../_static/figures/generated/chapter_08/ch08_rayleigh_information.png
:name: fig:chapter-08-rayleigh-information
:width: 82.0%

Separation information for two equally bright incoherent Gaussian point sources. The blue curve is the direct-imaging Fisher information obtained by numerical integration of the image-plane probability density $p(x|s)$, normalized by the quantum Fisher information. The orange curve indicates the quantum limit reached by an ideal spatial-mode measurement. At small separation, the image-plane shape changes only at second order, so $F_{ss}^{\rm direct}$ falls toward zero, while the mode measurement retains finite information. The dashed line marks the commonly used Rayleigh angular scale.
```


Estimation and detection are different problems. If the question is "one source or two sources," the model dimension changes and $s=0$ lies on a boundary of parameter space, so ordinary likelihood-ratio approximations may fail. Once the binary-source model has been accepted, the task becomes estimating $s$, centroid, flux ratio, and color. Sub-resolution measurement relies on a specified parametric model and on a measurement that reads out information already present in the optical field. Real astronomical models usually also include finite angular diameter, companion flux ratio, bandpass differences, polarization, time variability, and background-star contamination. These turn a single-parameter lower bound into a multiparameter covariance problem.

## Fisher information, the Cramér--Rao bound, and quantum limits

Suppose an observation yields $N_\gamma$ source photons that are useful for estimation. If background, bad pixels, and selection functions have been built into the probability density, the direct-imaging likelihood is 

```{math}
:label: eq:ch08-direct-likelihood
\ln \mathcal{L}(s)=\sum_{i=1}^{N_\gamma}\ln p(x_i|s),
```

 where $x_i$ is the angular or pixel position of the $i$th photon. The Fisher information is 

```{math}
:label: eq:ch08-fisher-direct
F_{ss}=N_\gamma \int
  \frac{1}{p(x|s)}
  \left[\frac{\partial p(x|s)}{\partial s}\right]^2{\rm d}x .
```

 The integration variable $x$ has the same units as $s$. $N_\gamma$ is the source-photon count after aperture, exposure time, bandpass, total efficiency, and data selection. For a bright star, millisecond-to-hour exposures can span $10^6$ to $10^{12}$ photons, depending on telescope area, filter width, detector efficiency, and whether the light is divided among modes or baselines. If each pixel also has background $b(x)$, then $p$ must be replaced by the normalized source-plus-background mixture. The ratio $N_{\rm bg}/N_\gamma$ can range from $\ll0.01$ in dark fields to $>1$ in crowded fields or strong sky background.
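One way to write the source-plus-background mixture is sketched below. It assumes a normalized background position density $\tilde b(x)$ that does not depend on $s$, and known expected counts $N_\gamma$ and $N_{\rm bg}$; the symbols $\tilde b$, $f_{\rm bg}$, and $p_{\rm tot}$ are introduced here for illustration only.

```{math}
p_{\rm tot}(x|s)=(1-f_{\rm bg})\,p(x|s)+f_{\rm bg}\,\tilde b(x),\qquad
f_{\rm bg}=\frac{N_{\rm bg}}{N_\gamma+N_{\rm bg}},\qquad
F_{ss}=(N_\gamma+N_{\rm bg})\int
  \frac{(1-f_{\rm bg})^2}{p_{\rm tot}(x|s)}
  \left[\frac{\partial p(x|s)}{\partial s}\right]^2{\rm d}x .
```

Because $p_{\rm tot}\ge(1-f_{\rm bg})\,p$, this expression never exceeds the background-free value in Eq. {eq}`eq:ch08-fisher-direct`: under these assumptions, background photons add counts but dilute the information per count.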

For any unbiased estimator $\hat{s}$, the variance obeys 

```{math}
:label: eq:ch08-crb
{\rm Var}(\hat{s})\ge \frac{1}{F_{ss}} .
```

 If several parameters are estimated together, $\boldsymbol{\theta}=(s,x_0,r,\sigma,b,\ldots)$, the scalar bound becomes the matrix inequality 

```{math}
:label: eq:ch08-fisher-matrix
{\rm Cov}(\hat{\boldsymbol{\theta}})\succeq F^{-1},\qquad
  F_{\alpha\beta}=N_\gamma\int
  \frac{1}{p(x|\boldsymbol{\theta})}
  \frac{\partial p}{\partial\theta_\alpha}
  \frac{\partial p}{\partial\theta_\beta}\,{\rm d}x .
```

 The off-diagonal terms of $F^{-1}$ describe degeneracies. An unknown centroid mixes with separation. An unknown flux ratio redistributes information between even and odd modes. An unknown PSF width can make binary separation look like ordinary broadening. A useful error report therefore includes statistical covariance, PSF calibration error, background-estimation error, and model error, not only a single $1/\sqrt{F_{ss}}$.
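The degeneracy between separation and PSF width can be made concrete with a small numerical experiment. The sketch below builds the per-photon $2\times2$ Fisher matrix for $\boldsymbol{\theta}=(s,\sigma)$ from Eq. {eq}`eq:ch08-fisher-matrix` with $N_\gamma=1$, using finite differences on a grid. The grid, step size, and the example value $s=0.2\sigma$ are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def p_direct(x, s, sigma):
    """Two equal incoherent Gaussian sources at +/- s/2, PSF width sigma."""
    norm = np.sqrt(2.0 * np.pi) * sigma
    left = np.exp(-(x + s / 2.0) ** 2 / (2.0 * sigma**2))
    right = np.exp(-(x - s / 2.0) ** 2 / (2.0 * sigma**2))
    return 0.5 * (left + right) / norm


def fisher_matrix(s, sigma, step=1e-4, half_width=12.0, n_grid=24001):
    """Per-photon Fisher matrix for (s, sigma) by central differences."""
    x = np.linspace(-half_width * sigma, half_width * sigma, n_grid)
    dx = x[1] - x[0]
    p = p_direct(x, s, sigma)
    grads = [
        (p_direct(x, s + step, sigma) - p_direct(x, s - step, sigma)) / (2.0 * step),
        (p_direct(x, s, sigma + step) - p_direct(x, s, sigma - step)) / (2.0 * step),
    ]
    return np.array([[np.sum(ga * gb / p) * dx for gb in grads] for ga in grads])


F = fisher_matrix(s=0.2, sigma=1.0)
corr = F[0, 1] / np.sqrt(F[0, 0] * F[1, 1])   # Fisher-matrix correlation of s and sigma
cov = np.linalg.inv(F)                        # per-photon Cramer-Rao covariance
degradation = cov[0, 0] * F[0, 0]             # variance inflation from unknown sigma

print("Fisher matrix:\n", F)
print(f"correlation between s and sigma: {corr:.6f}")
print(f"variance inflation when sigma is also fitted: {degradation:.1f}")
```

For small $s$ the two derivatives are almost proportional, since both are dominated by $h''(x)$, so the correlation is very close to one and the bound on $s$ with $\sigma$ fitted is far weaker than $1/F_{ss}$ alone. In practice this is why the PSF width is calibrated independently and not fitted from the same image.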

Quantum Fisher information replaces "pixel measurement" by "any measurement allowed by quantum mechanics." Equation {eq}`eq:ch03-weak-thermal` in Chapter {ref}`chap:03` wrote weak thermal light as a vacuum term plus a small one-photon term. The mean photon number in each coherence-time mode is $\epsilon\ll1$, so most time windows have no click. Empty windows carry no separation information. The detected one-photon spatial mode is the relevant part of the state. For two equally bright incoherent point sources, the one-photon component can be written as 

```{math}
:label: eq:ch08-one-photon-state
\rho_1(s)=\frac{1}{2}
  \left|\psi_{+s/2}\right\rangle\left\langle\psi_{+s/2}\right|
  +\frac{1}{2}
  \left|\psi_{-s/2}\right\rangle\left\langle\psi_{-s/2}\right| ,
```

 where $\psi_{\pm s/2}$ are the field-amplitude modes produced by the two sources after the telescope and imaging system. For a Gaussian amplitude PSF, and in the ideal case where centroid, brightness, and PSF are known, the quantum Fisher information for separation is 

```{math}
:label: eq:ch08-qfi-gaussian
F_{ss}^{Q}=N_\gamma\,\frac{1}{4\sigma^2}.
```

 It does not vanish as $s\to0$. In Eq. {eq}`eq:ch08-qfi-gaussian`, $\sigma$ is the same width as in Eq. {eq}`eq:ch08-gaussian-psf`, that is, the standard deviation of the single-source image-plane intensity profile $h=|\psi|^2$; in angular coordinates it is measured in radians. This result assumes a linear optical system, loss that does not depend on separation, two mutually incoherent and equally bright sources, known centroid, and a measurement that can approach the optimal spatial-mode projection {cite:p}`1969JSP.....1..231H,2016PhRvX...6c1033T,2016OExpr..24.3684N,2017PhRvA..95f3847A`.


```{figure} ../_static/figures/generated/chapter_08/ch08_cramer_rao_bound.png
:name: fig:chapter-08-crb
:width: 82.0%

Cramér–Rao separation error for a sub-resolution double source. The horizontal axis is the detected source-photon count $N_\gamma$, and the vertical axis is $\Delta s/\sigma$. For the example $s=0.25\sigma$, direct imaging requires more photons to reach the same error because its Fisher information is already small. The ideal mode measurement falls as $N_\gamma^{-1/2}$. The green curve includes a $0.004\sigma$ mode-matching systematic and reaches a floor, showing that calibration error cannot be beaten by collecting more photons.
```


A quantum lower bound is not automatically an observing precision. If the centroid error is $\delta x$, the first-order mode receives about $\delta x^2/(4\sigma^2)$ leakage. The double-source separation signal itself is about $s^2/(16\sigma^2)$ at small separation. When $\delta x\sim s/2$, pointing and centroid calibration are already comparable to the signal being measured. If the Strehl ratio is low, aberrations vary in time, the mode-sorter crosstalk matrix is unstable, or the background light has a different spatial-mode distribution, the bound in Eq. {eq}`eq:ch08-qfi-gaussian` is replaced by a systematic floor. Experimental phase plates, PSF shaping, and SPLICE-type measurements show that simple phase-sensitive measurements can also reduce the quadratic information loss of direct imaging, but the actual gain is still set by visibility, crosstalk, and alignment errors {cite:p}`2017PhRvL.118g0801T,2019NJPh...21i3010B,2018Optic...5.1177P`.
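The photon budget implied by these bounds, and the effect of a systematic floor, can be sketched numerically. The example uses the values from the caption of {numref}`fig:chapter-08-crb` ($s=0.25\sigma$ and a $0.004\sigma$ systematic). Two assumptions are made here that the text does not state: the direct-imaging bound uses the small-separation form of Eq. {eq}`eq:ch08-direct-small-s`, which is only accurate to a few percent at this separation, and the systematic term is added to the statistical term in quadrature.

```python
import numpy as np

SIGMA = 1.0             # work in units of the PSF width
S = 0.25 * SIGMA        # example separation from the figure caption
FLOOR = 0.004 * SIGMA   # mode-matching systematic from the figure caption


def crb_direct(n_photons, s=S, sigma=SIGMA):
    """Direct-imaging bound, small-s approximation F ~ N s^2 / (8 sigma^4)."""
    return np.sqrt(8.0 * sigma**4 / (n_photons * s**2))


def crb_quantum(n_photons, sigma=SIGMA):
    """Quantum bound from F^Q = N / (4 sigma^2)."""
    return 2.0 * sigma / np.sqrt(n_photons)


def with_floor(stat_error, floor=FLOOR):
    """ASSUMPTION: statistical and systematic errors combine in quadrature."""
    return np.sqrt(stat_error**2 + floor**2)


def photons_needed(target_error, s=S, sigma=SIGMA):
    """Photon counts at which each statistical bound reaches target_error."""
    n_direct = 8.0 * sigma**4 / (s**2 * target_error**2)
    n_quantum = 4.0 * sigma**2 / target_error**2
    return n_direct, n_quantum


for n in [1e4, 1e6, 1e8, 1e10]:
    print(f"N = {n:.0e}: direct = {crb_direct(n):.2e}, "
          f"quantum = {crb_quantum(n):.2e}, "
          f"quantum + floor = {with_floor(crb_quantum(n)):.2e}")

n_direct, n_quantum = photons_needed(0.01 * SIGMA)
print(f"photon ratio direct/quantum at s = 0.25 sigma: {n_direct / n_quantum:.1f}")
```

In this approximation the photon ratio is $2\sigma^2/s^2$, so it grows rapidly as the separation shrinks, while the floor caps the mode-measurement curve no matter how large $N_\gamma$ becomes.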

## How spatial-mode measurements read sub-resolution information

Direct imaging projects each photon onto a position eigenstate. Spatial-mode demultiplexing (SPADE) first decomposes the field into a set of spatial modes and then counts photons in each mode. For a Gaussian PSF, the natural basis is the Hermite--Gaussian mode set. If the centroid is known and the two point sources are equally bright, an ideal mode sorter gives 

```{math}
:label: eq:ch08-spade-probability
p_n=\exp(-Q)\frac{Q^n}{n!},\qquad
  Q=\frac{s^2}{16\sigma^2},
```

 where $p_n$ is the single-photon probability of Hermite--Gaussian mode $n$. For $s\ll\sigma$, $p_0$ remains close to 1, so most light stays in the fundamental mode, while $p_1\simeq s^2/(16\sigma^2)$, so a small but countable amount of light appears in the first odd mode. That leakage is produced by the geometry of two overlapping source modes in the spatial-mode basis; it is part of the separation signal. For $N_\gamma$ photons, the expected first-order mode count is 

```{math}
:label: eq:ch08-first-mode-count
\langle n_1\rangle\simeq N_\gamma\,\frac{s^2}{16\sigma^2}.
```

 For $s=0.1\sigma$, $p_1\simeq6.25\times10^{-4}$. With $10^8$ effective source photons, the ideal first-order mode contains about $6.3\times10^4$ photons. With only $10^5$ photons, the expectation is about 63, and dark counts, crosstalk, and background become leading limitations.
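The mode probabilities and expected counts above follow directly from Eq. {eq}`eq:ch08-spade-probability`. The sketch below evaluates them for an ideal sorter with no crosstalk, dark counts, or background; the function names are illustrative, and the shot-noise estimate assumes Poisson counting statistics.

```python
import math


def spade_probabilities(s_over_sigma, n_max=4):
    """Ideal Hermite-Gaussian mode probabilities p_0..p_n_max (Poisson in Q)."""
    q = s_over_sigma**2 / 16.0
    return [math.exp(-q) * q**n / math.factorial(n) for n in range(n_max + 1)]


def expected_first_mode_count(n_photons, s_over_sigma):
    """Expected photon count in mode n = 1 for n_photons detected photons."""
    return n_photons * spade_probabilities(s_over_sigma, n_max=1)[1]


s_over_sigma = 0.1
probs = spade_probabilities(s_over_sigma)
print("p_n for s = 0.1 sigma:", [f"{p:.3e}" for p in probs])

for n_photons in [1e8, 1e5]:
    mean = expected_first_mode_count(n_photons, s_over_sigma)
    # Poisson shot noise on the first-mode count (assumed statistics).
    print(f"N = {n_photons:.0e}: <n_1> = {mean:.1f}, "
          f"shot-noise SNR ~ {math.sqrt(mean):.1f}")
```

With only a few tens of expected counts in the first-order mode, even a small dark-count or crosstalk rate is comparable to the signal, which is the regime the text warns about.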


```{figure} ../_static/figures/generated/chapter_08/ch08_spade_probabilities.png
:name: fig:chapter-08-spade-probabilities
:width: 82.0%

Mode probabilities for ideal Hermite–Gaussian mode sorting. The horizontal axis is the source separation $s/\sigma$, and the vertical axis is the single-photon probability in each mode. At small separation $p_0$ remains near 1, while $p_1$ and higher modes grow as powers of $Q=s^2/(16\sigma^2)$. The separation information is carried by counts in these low-probability modes.
```


Neither the image plane nor the mode basis looks visually striking at small separation: the image profiles are almost identical, and the higher-order mode probabilities are small. The difference is that those low-probability modes are separate count channels whose rates depend on $s$ and can be measured. If the mode counts follow Poisson or multinomial statistics, their Fisher information is 

```{math}
:label: eq:ch08-mode-fisher
F_{ss}^{\rm mode}
  =N_\gamma\sum_n \frac{1}{p_n}
  \left(\frac{\partial p_n}{\partial s}\right)^2 .
```

 In the small-separation limit, the $n=1$ mode alone gives nearly $N_\gamma/(4\sigma^2)$ information. Although $p_1$ falls as $s^2$, the squared derivative $(\partial p_1/\partial s)^2$ also falls as $s^2$, so the ratio $(\partial p_1/\partial s)^2/p_1$ that enters the Fisher information has a finite limit. This is how mode measurement avoids the Rayleigh curse {cite:p}`2016PhRvX...6c1033T,2017NJPh...19b3054T`.
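The finite limit can be written out explicitly. The short calculation below uses only Eq. {eq}`eq:ch08-spade-probability` with $p_1=Q\,e^{-Q}$ and the single-photon ($N_\gamma=1$) term of Eq. {eq}`eq:ch08-mode-fisher`:

```{math}
\frac{{\rm d}Q}{{\rm d}s}=\frac{s}{8\sigma^2},\qquad
\frac{1}{p_1}\left(\frac{\partial p_1}{\partial s}\right)^2
  =\frac{(1-Q)^2e^{-Q}}{Q}\left(\frac{{\rm d}Q}{{\rm d}s}\right)^2
  =\frac{(1-Q)^2e^{-Q}}{4\sigma^2}
  \;\longrightarrow\;\frac{1}{4\sigma^2}
  \quad (s\to0).
```

For the ideal Poisson form, the sum over all modes is $\sum_n p_n^{-1}(\partial p_n/\partial s)^2=Q^{-1}({\rm d}Q/{\rm d}s)^2=1/(4\sigma^2)$ at any separation, which matches the per-photon value in Eq. {eq}`eq:ch08-qfi-gaussian`. This holds only under the idealizations already listed: known centroid, equal brightness, and a crosstalk-free sorter.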


```{figure} ../_static/figures/generated/chapter_08/ch08_direct_image_vs_modes.png
:name: fig:chapter-08-direct-vs-modes
:width: 92.0%

The same double-source model in the direct image plane and in mode outputs. The left panel shows normalized image-plane intensity for $s=0$, $0.35\sigma$, and $1.2\sigma$; at sub-resolution separation the curve mainly shows a weak broadening. The right panel shows the first four Hermite–Gaussian mode probabilities for $s=0.35\sigma$ and $1.2\sigma$. The probabilities in the nonfundamental modes are small, but they turn separation information into mode counts.
```


Astronomical implementations also have to handle two-dimensional separations, polarization, bandpass, and optical coupling. In two dimensions, $s$ becomes $(s_x,s_y)$, and the modes become two-dimensional Hermite--Gaussian modes or orthogonal modes matched to the real aperture. With a circularly symmetric PSF, the two separation directions have QFI of the same order, but noncircular PSFs, segmented mirrors, and aberrations introduce directional dependence {cite:p}`2017PhRvA..95f3847A`. Generalized SPADE can estimate second and higher spatial moments of subdiffraction objects, but higher moments require higher modes, photon demands rise quickly, and a finite set of moments does not uniquely reconstruct an arbitrary brightness distribution {cite:p}`2017NJPh...19b3054T`. Close stellar pairs, nearby companions, stellar shapes, and low-dimensional brightness moments are therefore natural targets. Complex disks, jets, and nebulae will need mode measurements combined with conventional imaging, amplitude interferometry, or intensity interferometry.

Partial coherence changes the mode probabilities. Larson and Saleh described conditions under which the Rayleigh curse reappears for partially coherent sources. Tsang and Nair later pointed out that, for weak thermal light and the correct SPADE Fisher calculation, mode measurement still avoids complete loss of small-separation information except in the degenerate case of full positive correlation {cite:p}`2018Optic...5.1382L,2019Optic...6..400T`. In astronomy it is important to separate two notions: whether two emitting regions on the source surface radiate independently, and the spatial coherence produced at the telescope by propagation and the van Cittert--Zernike theorem. The latter is what first- and second-order interferometry measure; it should not be conflated with intrinsically coherent emission from the source.

## Fisher information in intensity interferometry

The previous section discussed single-aperture measurements, or spatial-mode measurements after coherent beam combination. A long-baseline array puts the same estimation problem in the Fourier plane. Different telescope pairs sample different spatial frequencies, and the constraints come from the way visibility changes with baseline. Intensity interferometry does not measure the phase of the first-order field. It measures correlations between intensity fluctuations at different telescopes. For ideal thermal light, the zero-delay second-order correlation is 

```{math}
:label: eq:ch08-hbt-g2
g_{ij}^{(2)}(0)-1 = |\gamma_{ij}|^2,
```

 where $\gamma_{ij}$ is the complex degree of coherence between telescopes $i$ and $j$, equal to the normalized first-order visibility $V(u,v)$. With finite time bins, finite optical bandwidth, and detector response, the measured contrast is diluted by the ratio of coherence time to electronic time resolution. In practice, the analysis usually fits calibrated $|V|^2$ and its covariance. The Hanbury Brown--Twiss (HBT) experiments and the Narrabri Stellar Intensity Interferometer used this logic to measure stellar angular diameters; modern Cherenkov telescope arrays bring large collecting area and digital correlators to the same problem {cite:p}`1956Natur.177...27B,1956Natur.178.1046H,1957RSPSA.242..300B,1974iiia.book.....H,1974MNRAS.167..121H,1974MNRAS.167..475H,2008AIPC..984..205L,2013APh....43..331D,2015NatCo...6.6852D,2020NatAs...4.1164A,2024MNRAS.529.4387A`.

If the observables across baselines and spectral channels are 

```{math}
:label: eq:ch08-sii-data
d_k=|V(u_k,v_k;\boldsymbol{\theta})|^2+\eta_k ,
```

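 where $k$ labels baseline, time, spectral channel, or polarization channel, and $\eta_k$ is a noise term, taken here to be zero-mean and approximately Gaussian with a parameter-independent covariance $C_{kl}$, then the Fisher-matrix element for parameters $\theta_\alpha$ and $\theta_\beta$ is 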

```{math}
:label: eq:ch08-sii-fisher
F_{\alpha\beta}^{\rm SII}
  =\sum_{k,l}
  \frac{\partial |V_k|^2}{\partial\theta_\alpha}
  C^{-1}_{kl}
  \frac{\partial |V_l|^2}{\partial\theta_\beta}.
```

 $C_{kl}$ is the covariance of the $|V|^2$ measurements, estimated from time-shift backgrounds, null baselines, repeat observations, or injection simulations. If the errors are nearly independent, Eq. {eq}`eq:ch08-sii-fisher` reduces to a sum of squared model slopes divided by variances. For a uniform-disk star, 

```{math}
:label: eq:ch08-uniform-disk-visibility
V(x)=\frac{2J_1(x)}{x},\qquad
  x=\pi B\theta_\star/\lambda ,
```

 where $B$ is the projected baseline and $\theta_\star$ is the angular diameter. The most useful baselines for $\theta_\star$ usually fall where $|V|^2$ has a large slope. On short-baseline plateaus the model changes too little; on very long baselines the signal is easily limited by systematics.
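The statement about useful baselines can be illustrated with the uniform-disk model. The sketch below evaluates $|V|^2$ from Eq. {eq}`eq:ch08-uniform-disk-visibility` and a per-baseline Fisher weight $(\partial|V|^2/\partial\ln\theta_\star)^2$, which is the diagonal form of Eq. {eq}`eq:ch08-sii-fisher` under the assumption of equal, independent errors on every baseline. The example values $\lambda=450~\mathrm{nm}$ and $\theta_\star=0.55~\mathrm{mas}$ are taken from the caption of {numref}`fig:chapter-08-sii-fisher`. To stay NumPy-only, $J_1$ is computed from Bessel's integral; the function names and grid choices are illustrative.

```python
import numpy as np

MAS = np.pi / (180.0 * 3600.0 * 1000.0)  # radians per milliarcsecond


def bessel_j1(x, n_quad=2000):
    """J_1(x) from Bessel's integral (1/pi) int_0^pi cos(t - x sin t) dt."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    t = (np.arange(n_quad) + 0.5) * np.pi / n_quad  # midpoint-rule nodes
    return np.cos(t[None, :] - x[:, None] * np.sin(t)[None, :]).mean(axis=1)


def v2_uniform_disk(baseline_m, theta_mas, wavelength_m):
    """Squared visibility |2 J1(x) / x|^2 with x = pi B theta / lambda."""
    b = np.atleast_1d(np.asarray(baseline_m, dtype=float))
    x = np.pi * b * theta_mas * MAS / wavelength_m
    safe = np.where(x == 0.0, 1.0, x)               # avoid 0/0 at zero baseline
    v = np.where(x == 0.0, 1.0, 2.0 * bessel_j1(safe) / safe)
    return v**2


def fisher_weight(baseline_m, theta_mas, wavelength_m, rel_step=1e-4):
    """(d|V|^2 / d ln theta)^2 per baseline, by central difference."""
    up = v2_uniform_disk(baseline_m, theta_mas * (1.0 + rel_step), wavelength_m)
    down = v2_uniform_disk(baseline_m, theta_mas * (1.0 - rel_step), wavelength_m)
    return ((up - down) / (2.0 * rel_step)) ** 2


baselines = np.linspace(1.0, 400.0, 400)            # projected baselines in metres
weights = fisher_weight(baselines, 0.55, 450e-9)
best_baseline = baselines[np.argmax(weights)]
print(f"baseline with the largest Fisher weight: {best_baseline:.0f} m")
```

For these example values the weight peaks on the steep part of the $|V|^2$ curve, well inside the first visibility null, and is small both on the short-baseline plateau and near the null itself. A real design would replace the equal-error assumption with the measured covariance $C_{kl}$.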


```{figure} ../_static/figures/generated/chapter_08/ch08_visibility_fisher_design.png
:name: fig:chapter-08-sii-fisher
:width: 82.0%

Relation between baseline choice and parameter information in intensity interferometry. The blue curve is the squared visibility of a uniform-disk model with $\lambda=450~\mathrm{nm}$ and $\theta_\star=0.55~\mathrm{mas}$. The orange curve is a normalized Fisher weight computed from $\partial|V|^2/\partial\ln\theta_\star$. The most useful baselines lie where the visibility is changing rapidly. Simply making baselines longer or shorter does not automatically add angular-diameter information.
```


The cost of intensity interferometry is missing phase. $|V|^2$ is insensitive to a global translation of the source, and it is unchanged when the brightness distribution is reflected through its center (a $180^\circ$ rotation), so an image and its point-reflected copy are degenerate. The orientation of a binary, the location of a bright spot, or the structure of an asymmetric disk often requires Earth-rotation coverage, multiband information, prior models, or third-order correlations. The advantage is insensitivity to atmospheric phase noise: the telescopes need only electronic connections, not optical phase coherence. Modern time taggers and high-speed correlators can handle many channels, but the usable Fisher information is still limited by photon rate, electronic bandwidth, background, timing synchronization, baseline coverage, and systematic covariance {cite:p}`2017MNRAS.472.4126G,2020RScI...91a3108W,2020NatAs...4.1164A,2024MNRAS.529.4387A`.

## Where this applies in astronomical instruments

Spatial-mode measurement and intensity interferometry face different engineering bottlenecks. SPADE, superlocalization by image inversion interferometry (SLIVER), and other phase-sensitive measurements usually require a stable wavefront from a single aperture or a coherent beam combiner. They are natural when the target light can be coupled into fiber modes, integrated photonic chips, or free-space mode sorters. Adaptive optics tries to compress the incoming light into a small number of controlled spatial modes. Strehl ratio, non-common-path aberrations, dispersion, and polarization crosstalk all enter the mode-response matrix. Intensity interferometry accepts random atmospheric phase, and uses large collecting area plus long baselines to measure $|V|^2$. It is especially natural for bright hot stars, rapid rotators, binaries, and angular-diameter measurements.

A scientifically usable sub-resolution estimate should report the photon budget, angular scale, measurement matrix, model boundaries, and error decomposition together. The photon budget includes target magnitude, bandpass, total efficiency, exposure time, and effective $N_\gamma$. The angular scale includes $\lambda/D$, PSF width $\sigma_\theta$, candidate separation $s$, or stellar angular diameter $\theta_\star$. The measurement matrix can be a pixel response, mode-crosstalk matrix, $|V|^2$ covariance, or time-correlation background. Model boundaries state how equal-brightness binaries, unknown flux ratios, finite disks, partial coherence, background stars, and time variability change the Fisher matrix. The error decomposition checks whether the statistical term really falls as $N_\gamma^{-1/2}$. When the error stops falling, the floor usually comes from alignment, PSF calibration, background, correlators, or model error.

Bayesian sampling and evidence comparison are often the cleanest way to carry these boundaries into the final inference. Direct images, mode counts, and intensity-interferometry data can share one parameter vector and enter a joint likelihood. To compare one-source and two-source models, simulations, posterior predictive checks, or evidence calculations are needed; a simple Gaussian approximation is not enough. In sub-resolution problems, priors directly change the identifiable parameter space. Known orbits, colors, spectral types, distances, companion flux ratios, or stellar-evolution models can all change the final constraints {cite:p}`2009MNRAS.398.1601F,2013PASP..125..306F`.

$\lambda/D$ still sets the spatial bandwidth of the field. Photon number still sets the statistical error. System calibration still sets the error floor. Quantum estimation changes the measurement. Pixels are not the only measurement basis. For some low-dimensional astronomical problems, spatial modes, phase plates, intensity correlations, and multibaseline design can turn information that is buried in high-order changes of the diffraction pattern into more direct counts or correlation signals. The next chapter returns to the radiation mechanisms of astrophysical sources: the source statistics decide which correlation functions have a signal, and the instrument's measurement basis decides whether those signals can be read out with sufficient signal-to-noise ratio.