(chap:07)= # Data analysis for event tables :::{admonition} Chapter opening :class: chapter-opening Chapter {ref}`chap:06` wrote each photon event as a table row carrying time, channel, wavelength, polarization, and quality information. From that table one can estimate light curves, delay histograms, second-order correlation functions, squared visibilities, and model parameters. The analysis starts by deciding which events are comparable. That choice determines whether the data should be treated with an unbinned point-process likelihood, binned delay counts, continuous waveform correlations, or a combined data vector with a covariance matrix. Iqueye/AquEYE photon time tags, VERITAS-SII offline FPGA correlations, MAGIC-SII real-time GPU Pearson correlations, Bayesian Blocks, and Poisson likelihood methods in astronomical statistics all address the same practical question: how to turn the timing and channel information in an event table into testable scientific quantities {cite:p}`2009A&A...508..531N,2009JMOp...56..261B,2020NatAs...4.1164A,2024MNRAS.529.4387A,1979ApJ...228..939C,2013ApJ...764..167S`. ::: ## From an event table to a likelihood Using the raw and calibrated event tables of Chapter {ref}`chap:06`, the analysis layer can write a calibrated event list as ```{math} :label: eq:ch07-event-data \mathcal{D} = \left\{ t_q,\ a_q,\ d_q,\ c_q,\ p_q,\ w_q,\ f_q \right\}_{q=1}^{N_{\rm evt}}, ``` where $t_q$ is the arrival time on a common time scale, measured in seconds; $a_q$ is the telescope or station index; $d_q$ is the detector channel; $c_q$ is the spectral channel; $p_q$ is the polarization channel; $w_q$ is a weight; and $f_q$ is a quality flag. The instrumental origin, clock correction, and format requirements for these fields were discussed around Eqs. {eq}`eq:ch06-raw-event` and {eq}`eq:ch06-calibrated-time`, and in the data-format discussion at the end of Chapter {ref}`chap:06`. The first analysis object is the selection function ```{math} :label: eq:ch07-selection S_q = S(t_q,a_q,d_q,c_q,p_q,f_q), \qquad S_q \in \{0,1\}\ {\rm or}\ [0,1]. ``` A binary $S_q$ keeps or rejects an event. A continuous $S_q$ encodes a weight from exposure, dead time, cloud transparency, background, or channel efficiency. At an event rate of $10^6\,\mathrm{s^{-1}}$, a one-hour observation contains $3.6\times10^9$ events. At that scale, any selection made by visually inspecting plots is not reproducible. The selection function has to be regenerated from metadata and thresholds. For unbinned events, the most direct model is an inhomogeneous Poisson point process. If the event-rate model is $\lambda(t,\mathbf{x}\mid\boldsymbol{\theta})$, where $\mathbf{x}$ represents channel, wavelength, polarization, or sky position, the likelihood is ```{math} :label: eq:ch07-point-process-likelihood \ln L(\boldsymbol{\theta}) = \sum_{q:S_q>0} \ln \lambda(t_q,\mathbf{x}_q\mid\boldsymbol{\theta}) - \int_{\mathcal{T}} \lambda(t,\mathbf{x}\mid\boldsymbol{\theta})\,E(t,\mathbf{x})\,dt\,d\mathbf{x}. ``` $\lambda$ is measured in s$^{-1}$ or s$^{-1}$ channel$^{-1}$. The first term evaluates the model rate at each observed event. The second term is the number of events predicted by the model over the effective exposure $E(t,\mathbf{x})$. The exposure may be dimensionless or may carry area and efficiency factors; it includes bad weather, dead time, channel switching, filter throughput, and the observing window. The Cash statistic is the classic form of this Poisson likelihood in X-ray and low-count astronomy, where small- count confidence intervals cannot be replaced by symmetric $\sqrt{N}$ errors {cite:p}`1979ApJ...228..939C,1986ApJ...303..336G`. ```{figure} ../_static/figures/generated/chapter_07/ch07_poisson_likelihood.png :name: fig:ch07-poisson-likelihood :width: 72.0% Observed picture of an inhomogeneous Poisson process. The upper panel shows the instantaneous event rate λ(t), and the lower panel shows one set of event times drawn from it. The likelihood uses both the fact that events fall preferentially in high-rate intervals and the integrated rate across the whole observation. Rebinning the events into a light curve discards part of the timing information, especially for rapidly varying sources or nonuniform exposure. ``` If the events are binned in time, Eq. {eq}`eq:ch07-point-process-likelihood` becomes ```{math} :label: eq:ch07-binned-poisson \ln L = \sum_i \left[ d_i\ln m_i(\boldsymbol{\theta}) -m_i(\boldsymbol{\theta}) -\ln(d_i!) \right], ``` where $d_i$ is the observed count in bin $i$, and $m_i$ is the predicted count. Both are dimensionless. This likelihood is useful for light curves, delay histograms, and spectral-channel counts. When $d_i\gtrsim20$ and the model is far from a boundary, a Gaussian likelihood is often adequate. With backgrounds, response matrices, absorption, clock drift, or several model components, however, likelihood-ratio tests do not always follow the nominal $\chi^2$ distribution. If the strength of a new component is constrained to be nonnegative, zero strength lies on a parameter boundary, so the significance must be calibrated with simulations, posterior predictive checks, or explicit model evidence {cite:p}`2002ApJ...571..545P`. Event tables can also represent slowly varying or flaring backgrounds through adaptive segmentation. Bayesian Blocks partitions the time axis into intervals with constant rates and provides fitness functions for event data, binned counts, and measurements with errors. Gaps enter through the live time, rather than being mistaken for intervals with no events {cite:p}`2013ApJ...764..167S`. In a working pipeline, this type of method is most useful for quality control, background segmentation, and transient-candidate screening. Physical parameters should still come from the model likelihood used later. ## Delay histograms and correlation normalization For two event tables $a$ and $b$, the basic observable is the corrected time difference ```{math} :label: eq:ch07-event-delay \tau_{ij}^{ab} = t_{j,b}^{\rm cal} - t_{i,a}^{\rm cal} - \Delta t_{\rm geo}^{ab}(t_i,\hat{\mathbf{s}}) - \Delta t_{\rm inst}^{ab}(t_i). ``` $t_{i,a}^{\rm cal}$ and $t_{j,b}^{\rm cal}$ are measured in seconds. $\Delta t_{\rm geo}^{ab}$ is the geometric delay, and $\Delta t_{\rm inst}^{ab}$ is the residual instrumental delay between the two channels. The maximum geometric delay is 333 ns for a 100 m baseline and $3.3\,\mu\mathrm{s}$ for a 1 km baseline. With a 4 ns time bin, an error of one bin is enough to move the correlation peak away from zero delay. With tens-of-picoseconds instruments, the changing projected baseline from Earth rotation also has to be updated within short time blocks {cite:p}`2006MNRAS.369..655H,2006MNRAS.372.1549E,2018RNAAS...2....4K,2020NatAs...4.1164A`. The number of event pairs in delay bin $k$ is ```{math} :label: eq:ch07-pair-count C_{ab,k} = \sum_{i\in a}\sum_{j\in b} S_iS_j\, \mathbf{1} \left[ \tau_k-\frac{\Delta\tau}{2} \le \tau_{ij}^{ab} < \tau_k+\frac{\Delta\tau}{2} \right], ``` where $\Delta\tau$ is the delay-bin width in seconds, and $C_{ab,k}$ is a dimensionless count. If the two channels are independent and their event rates are approximately stable over the live time, the expected number of accidental coincidences is ```{math} :label: eq:ch07-accidental B_{ab,k} \simeq R_a R_b\,\Delta\tau\,T_{\rm live}. ``` $R_a$ and $R_b$ are measured in s$^{-1}$, and $T_{\rm live}$ is the integration time after bad intervals and dead time have been removed. In a typical intensity-interferometry experiment, $R_a$ and $R_b$ may lie between $10^6$ and $10^8\,\mathrm{s^{-1}}$, $\Delta\tau$ may be 0.1--10 ns, and $T_{\rm live}$ may run from minutes to an entire night. The accidental background $B_{ab,k}$ is therefore enormous, while the astrophysical signal is only a $10^{-6}$--$10^{-4}$ excess on top of it {cite:p}`1974iiia.book.....H,2020NatAs...4.1164A,2024MNRAS.529.4387A`. The second-order correlation estimator can be written as ```{math} :label: eq:ch07-g2-estimator \widehat{g}^{(2)}_{ab}(\tau_k) = \frac{C_{ab,k}}{B_{ab,k}}, \qquad \widehat{C}_{ab}(\tau_k) = \widehat{g}^{(2)}_{ab}(\tau_k)-1. ``` $\widehat{C}_{ab}$ is dimensionless. If the data are continuous waveforms rather than discrete events, MAGIC-SII uses the Pearson correlation ```{math} :label: eq:ch07-pearson \rho_{ab}(\tau_k) = \frac{\sum_n \left[x_a(t_n)-\bar{x}_a\right] \left[x_b(t_n+\tau_k)-\bar{x}_b\right]} {\sqrt{\sum_n\left[x_a(t_n)-\bar{x}_a\right]^2 \sum_n\left[x_b(t_n)-\bar{x}_b\right]^2}}, ``` where $x_a$ and $x_b$ are digitized voltage or current samples. Pearson correlation is simple to compute in hardware, but it mixes starlight, night-sky light, and electronic noise. DC currents, background pixels, and zero-baseline calibration are then needed to turn it into $|V|^2$ {cite:p}`2024MNRAS.529.4387A`. ```{figure} ../_static/figures/generated/chapter_07/ch07_time_shift_background.png :name: fig:ch07-time-shift-background :width: 72.0% Structure of a time-shift background estimate. The blue curve is the event-pair histogram near zero delay. The orange curve is the accidental background obtained after shifting one channel to an unphysical delay. The green region is the excess of the central correlation peak above the shifted background. This method preserves slow rate changes, bad-time intervals, and background structure, but the shift must be much larger than both the instrumental response width and the astrophysical correlation width. ``` Working pipelines often do not normalize directly with Eq. {eq}`eq:ch07-accidental`. Instead they estimate $B_{ab,k}$ using time-shifted data, off-source data, off-band data, or wrong-polarization products. A time shift moves channel $b$ by $\Delta T_{\rm shift}$. The shift must be much larger than the correlation-peak width but shorter than the time scale over which backgrounds and gains drift. Several positive and negative shifts give both a background mean and a background uncertainty. Off-source data are useful for night-sky light and dark current. Off-band data are useful for line science. Wrong-polarization products test polarization leakage. VERITAS-SII rejects high-frequency electronic noise and low-PMT-current frames in 1 s correlation frames, then forms a weighted average after geometric-delay correction. MAGIC-SII corrects night-sky light with DC readings from both signal and background pixels {cite:p}`2020NatAs...4.1164A,2024MNRAS.529.4387A`. The bin width $\Delta\tau$ should be chosen together with the correlation-peak width, time-stamp quantization, and final fitting method. If $\Delta\tau$ is much wider than the peak, the peak height is averaged down. If it is much narrower than the time-stamp quantization and response width, neighboring bins become strongly correlated and the computation grows. If only the peak area will be fitted, oversampling does not add much information. VERITAS-SII used 4 ns correlation lags and a peak width fixed near 4 ns. With an 80 ps TDC and tens-of-picoseconds detectors, finer bins are possible, but the geometric delay, instrument response, and clock residuals must then be controlled to comparable precision {cite:p}`2020NatAs...4.1164A,2020RScI...91a3108W,2009A&A...508..531N`. ## Correlation algorithms and computational cost A direct double loop over Eq. {eq}`eq:ch07-pair-count` has complexity $O(N_aN_b)$. If the two event streams each contain $10^9$ events, exhaustive comparison requires $10^{18}$ time differences, which is not a practical routine analysis. A sorted-window algorithm first sorts by time, then compares events only inside $|t_{j,b}-t_{i,a}|\le W$: ```{math} :label: eq:ch07-window-pairs N_{\rm pair}^{\rm win} \simeq 2W\,R_b\,N_a . ``` $W$ is the half-width of the maximum delay window, measured in seconds. For $W=100\,\mathrm{ns}$, $R_b=10^6\,\mathrm{s^{-1}}$, and $N_a=10^9$, the number of candidate pairs is about $2\times10^8$, ten orders of magnitude below the full double loop. This algorithm is well suited to sparse event tables and preserves wavelength and polarization labels for every event. If the events or voltage streams have already been put into uniform time bins, the cross-correlation can be computed with FFTs: ```{math} :label: eq:ch07-fft-correlation c_{ab}[k] = \mathcal{F}^{-1} \left[ \mathcal{F}\{x_a[n]\}^{*} \mathcal{F}\{x_b[n]\} \right]_k . ``` $x_a[n]$ and $x_b[n]$ are time series, and $\mathcal{F}$ is the discrete Fourier transform. FFT correlation scales approximately as $O(M\log M)$, where $M$ is the number of time bins. It is well suited to continuous sampling and fixed-bin high-speed correlation. Once the data have been binned, however, timing information below the bin width cannot be recovered. VERITAS-SII streams continuous PMT signals to disk and computes the correlations offline with an FPGA. MAGIC-SII uses GPUs to compute all pairwise correlations among four channels in real time during the observation. Multichannel TCSPC systems instead write time-ordered event streams, which are convenient for offline construction of arbitrary delay and channel selections {cite:p}`2020NatAs...4.1164A,2024MNRAS.529.4387A,2020RScI...91a3108W`. ```{figure} ../_static/figures/generated/chapter_07/ch07_correlator_scaling.png :name: fig:ch07-correlator-scaling :width: 72.0% Operation-count scaling for three correlation strategies. The full event-pair double loop grows as N², so it is useful only for small samples or didactic checks. The sorted-window algorithm compares events only within a finite delay window and is nearly linear. FFT correlation after uniform binning scales as Mlog M, making it suitable for continuous sampling or real-time quality monitoring. Actual speed also depends on memory bandwidth, I/O, GPU/FPGA data layout, and quality-flag selection. ``` The algorithm also shapes the scientific product. A sorted-window correlator preserves all labels for every event and is useful for posterior reselection. FFT correlation is fast, but assumes fixed channels and fixed time bins. Real-time FPGA/GPU correlation can reveal clock jumps, electronic crosstalk, and cloud-driven flux changes during the observation, but often stores only compressed correlation products. For long-term archiving, it is best to keep raw events or raw waveforms, calibrated events, and correlation products together. FITS BINTABLEs are suitable for event archives, HDF5/Parquet for column-oriented large-data analysis, and Astropy for FITS, time, and coordinate metadata {cite:p}`2010A&A...524A..42P,2013A&A...558A..33A,2018AJ....156..123A`. ## Statistical errors, covariance, and null tests If $C_{ab,k}$ and $B_{ab,k}$ are independent Poisson counts, and if the background is estimated from $M$ equally weighted time shifts, then ```{math} :label: eq:ch07-excess-variance \operatorname{Var} \left[ C_{ab,k}-\widehat{B}_{ab,k} \right] \simeq C_{ab,k} + \frac{1}{M}B_{ab,k}. ``` This approximation requires independent events, non-overlapping shifted backgrounds, and stable rates. Real data rarely give diagonal errors. Adjacent delay bins share events; several baselines share the same telescope; wavelength channels share weather and gain fluctuations; model points can share a common zero-baseline normalization. A complete data vector may be written as ```{math} :label: eq:ch07-covariance \mathbf{d} = \left( \widehat{|V|^2}_{1}, \widehat{|V|^2}_{2}, \ldots, \widehat{|V|^2}_{N_d} \right), \qquad \Sigma_{ij} = \left\langle (d_i-\mu_i)(d_j-\mu_j) \right\rangle . ``` $\Sigma_{ij}$ has the product of the units of the two data entries. If $d_i$ is dimensionless $|V|^2$, then $\Sigma$ is dimensionless as well. The covariance can be estimated by analytic error propagation, time-block bootstrap, jackknife, the ensemble of time-shift backgrounds, or end-to-end injection simulations. A time block should be longer than the correlation-peak width, electronic-noise correlation time, and short pointing or weather fluctuations. For night-long data, blocks from 10 s to several minutes are often used to test whether the error decreases as $T^{-1/2}$. MAGIC-SII includes electronic bandwidth, optical bandwidth, DC-readout gain, background subtraction, and residual electronic noise as systematic terms, and estimates the angular-diameter systematic error by pushing the $|V|^2$ points in extreme directions {cite:p}`2024MNRAS.529.4387A`. ```{figure} ../_static/figures/generated/chapter_07/ch07_covariance_matrix.png :name: fig:ch07-covariance-matrix :width: 66.0% Schematic covariance matrix for multibaseline or multichannel measurements. Diagonal entries are individual data-point variances. Nearby blocks can come from noise shared by the same time interval or telescope. Far off-diagonal structure can come from common background estimates, zero-baseline normalization, or electronic crosstalk. A fit that uses only diagonal errors overestimates the number of independent measurements. ``` Null tests must use the same selection function and weights as the main analysis. Common tests include shifting one channel to an unphysical delay, reversing the sign of the geometric delay, using polarization or wavelength channels that should not be correlated, using off-target sky as background, splitting telescope pairs into time blocks, removing one telescope or one night at a time, and injecting a known correlation peak into the real event table. If a signal appears only for one selection threshold and disappears for nearby thresholds, wrong delays, or jackknife subsets, it is more likely a pipeline or instrument-state effect than an astrophysical correlation. ## Model fitting, Fisher information, and posterior distributions After the correlation function has been estimated, a common model maps angular diameter, binary separation, flux ratio, ring radius, jet parameters, or other source parameters into squared visibility: ```{math} :label: eq:ch07-model-vector \mathbf{m}(\boldsymbol{\theta}) = \left[ |V(\mathbf{u}_1,\lambda_1\mid\boldsymbol{\theta})|^2, \ldots, |V(\mathbf{u}_{N_d},\lambda_{N_d}\mid\boldsymbol{\theta})|^2 \right]. ``` $\mathbf{u}=\mathbf{B}_\perp/\lambda$ is the spatial frequency, expressed in rad$^{-1}$ or as a projected baseline in wavelengths. $\boldsymbol{\theta}$ contains the physical parameters. If $\mathbf{d}$ is already an approximately Gaussian correlation product, the likelihood can be written as ```{math} :label: eq:ch07-gaussian-likelihood -2\ln L = \left[\mathbf{d}-\mathbf{m}(\boldsymbol{\theta})\right]^T \Sigma^{-1} \left[\mathbf{d}-\mathbf{m}(\boldsymbol{\theta})\right] +\ln |\Sigma| + {\rm const}. ``` If the fit is performed directly on raw counts or low-count delay bins, the binned Poisson likelihood in Eq. {eq}`eq:ch07-binned-poisson` should be used instead. VERITAS-SII fits uniform-disk and limb-darkened models to the variation of $|g^{(1)}|^2$ with projected baseline. MAGIC-SII calibrates the Pearson correlations into $|V|^2$ using flux and background measurements, then fits stellar angular diameters and systematic errors {cite:p}`2020NatAs...4.1164A,2024MNRAS.529.4387A,1967MNRAS.137..375H`. During observing design, the Fisher matrix is often used to estimate parameter errors: ```{math} :label: eq:ch07-fisher F_{\alpha\beta} = \frac{\partial \mathbf{m}^T}{\partial \theta_\alpha} \Sigma^{-1} \frac{\partial \mathbf{m}}{\partial \theta_\beta}, \qquad \operatorname{Cov}(\theta_\alpha,\theta_\beta) \gtrsim (F^{-1})_{\alpha\beta}. ``` $F_{\alpha\beta}$ has units $\theta_\alpha^{-1}\theta_\beta^{-1}$. If independent photon statistics dominate the error, the information usually grows approximately linearly with effective photon number, and parameter errors fall as $N_{\rm eff}^{-1/2}$. Once electronic bandwidth, background subtraction, zero-baseline normalization, or channel gain set a systematic floor, longer integration only approaches that floor. The Fisher matrix is a local quadratic approximation, useful for observing plans and high-signal-to-noise estimates. Multimodal posteriors, nonlinear degeneracies, and parameter boundaries require MCMC, nested sampling, or end-to-end simulations {cite:p}`1925PCPS...22..700F,2013PASP..125..306F,2009MNRAS.398.1601F`. ```{figure} ../_static/figures/generated/chapter_07/ch07_fisher_scaling.png :name: fig:ch07-fisher-scaling :width: 70.0% Scaling of parameter error with effective photon number. The blue curve shows the pure statistical $N_{\rm eff}^{-1/2}$ decline. The orange and green curves add different systematic floors; once those floors dominate, longer integration no longer improves the final error. Observing plans should put integration time, target brightness, baseline coverage, and calibration systematics into one error budget. ``` The Bayesian posterior is ```{math} :label: eq:ch07-posterior p(\boldsymbol{\theta}\mid\mathcal{D},M) = \frac{ L(\mathcal{D}\mid\boldsymbol{\theta},M)\, \pi(\boldsymbol{\theta}\mid M)} {Z_M}, \qquad Z_M= \int L\,\pi\,d\boldsymbol{\theta}. ``` $\pi$ is the prior, and $Z_M$ is the model evidence. Priors can come from distance, spectral type, previous angular-diameter measurements, binary orbits, eruption times, or physical boundaries. They must be stated clearly, because intensity interferometry often measures only $|V|^2$, and the missing phase creates degeneracies in mirror images, flux ratios, and position angles. emcee is useful for moderate-dimensional posterior sampling. MultiNest is useful for multimodal posteriors and evidence estimates. Neither replaces physical checks: autocorrelation time, acceptance rate, prior volume, and recovery of injected signals should all be reported {cite:p}`2013PASP..125..306F,2009MNRAS.398.1601F,1992ApJ...398..146G`. A reproducible analysis starts with raw events and metadata, applies a selection function, then performs clock and geometric-delay corrections. Delay histograms or continuous correlations provide candidate signals. Time shifts, off-source data, off-band data, and wrong-polarization products provide backgrounds. Time blocks or injection simulations provide covariance. After fitting a physical model, the result still has to pass null tests, injection recovery, and posterior predictive checks. The next chapter places these likelihoods and Fisher information in the problem of imaging resolution, where Rayleigh's criterion, sub-resolution estimation, and quantum measurements ask which information channels can actually be changed.