.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_temporal_accuracy.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_temporal_accuracy.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_temporal_accuracy.py:


Temporal accuracy metrics
=========================

This example demonstrates how to evaluate the **timeliness** of a monitoring system.

Unlike spatial accuracy (which asks "is the map correct?"), temporal accuracy asks:
**"How quickly does the system detect changes?"**

This tutorial will guide you through:
1.  Generating synthetic disturbance data with known dates.
2.  Computing the **True** accuracy curves using the exhaustive dataset (as a benchmark).
3.  Sampling points using a stratified design to capture enough disturbance events.
4.  Using :class:`~nrt.validate.metrics.TemporalEvaluator` to estimate accuracy curves from the sample.
5.  Calculating key metrics like **Initial Delay** and **Level-off Point**.

.. GENERATED FROM PYTHON SOURCE LINES 17-28

.. code-block:: Python


    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from nrt.data import simulate
    from nrt.validate.metrics import TemporalEvaluator
    from nrt.validate.estimators import StratifiedEstimator, SimpleRandomEstimator

    # Set random seed for reproducibility
    np.random.seed(42)


.. GENERATED FROM PYTHON SOURCE LINES 29-36

1. Generate Synthetic Temporal Data
-----------------------------------
We simulate a landscape where disturbances happen at specific dates.
The monitoring system detects these with some delay and occasional errors.

* **Reference Dates:** When the disturbance actually occurred.
* **Prediction Dates:** When the system flagged the pixel.

.. GENERATED FROM PYTHON SOURCE LINES 36-81

.. code-block:: Python


    # 1. Landscape & Reference Dates
    # 0=Non-Forest, 1=Forest, 2=Loss
    land_cover, loss_dates = simulate.make_landscape(
        shape=(1000, 1000),
        year=2021,
        forest_pct=0.8,
        loss_pct=0.05,
        seed=42
    )

    # 2. Prediction & Detection Dates
    # Simulating a system with:
    # - Good detection rate (low omission)
    # - Variable delays (some fast, some slow)
    pred_lc, pred_dates = simulate.make_prediction(
        land_cover,
        loss_dates,
        omission_rate=0.1,
        commission_rate=0.01,
        seed=42
    )

    # Mask valid forest universe
    forest_mask = land_cover > 0
    valid_idx = np.where(forest_mask.ravel())[0]

    # Flatten arrays for sampling
    y_true_dates = loss_dates.ravel()
    y_pred_dates = pred_dates.ravel()
    y_pred_class = pred_lc.ravel()  # 0=Stable, 1=Loss

    # Determine temporal bounds dynamically from the data.
    # This ensures robust handling whether the simulation returns Day-of-Year (1-365)
    # or Days-Since-Epoch.
    valid_dates = y_true_dates[y_true_dates > 0]
    if len(valid_dates) > 0:
        exp_start = int(valid_dates.min())
        exp_end = int(valid_dates.max())
    else:
        # Fallback if no loss generated
        exp_start, exp_end = 1, 365

    print(f"Temporal Experiment Window: {exp_start} to {exp_end}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Temporal Experiment Window: 18628 to 18992


.. GENERATED FROM PYTHON SOURCE LINES 82-91

2. Compute True Accuracy Curves
-------------------------------
In a simulation, we have access to the full "Truth". We can calculate the
exact temporal accuracy curves by evaluating every single pixel. This serves
as a ground truth to verify our sample-based estimates later.

We use the :class:`~nrt.validate.metrics.TemporalEvaluator` with a
:class:`~nrt.validate.estimators.SimpleRandomEstimator`. When applied to the
entire population without sampling, this calculates the true population parameters.

.. GENERATED FROM PYTHON SOURCE LINES 91-109

.. code-block:: Python


    # We apply the evaluator to the full flattened arrays (valid forest pixels only)
    y_true_full = y_true_dates[valid_idx]
    y_pred_full = y_pred_dates[valid_idx]

    true_evaluator = TemporalEvaluator(
        y_true=y_true_full,
        y_pred=y_pred_full,
        estimator=SimpleRandomEstimator(),
        experiment_start=exp_start,
        experiment_end=exp_end
    )

    lags = list(range(0, 101, 5))
    df_true = true_evaluator.compute_curve(lags=lags, metrics=['ua', 'pa', 'f1'])

    print(f"True F1 at lag 30: {df_true.loc[df_true['lag'] == 30, 'f1'].values[0]:.3f}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    True F1 at lag 30: 0.733


.. GENERATED FROM PYTHON SOURCE LINES 110-120

3. Stratified Sampling
----------------------
In real-world scenarios, validating every pixel is impossible. We must sample.
Temporal analysis requires a sufficient number of **detected changes** to build
reliable curves. A simple random sample might not pick enough disturbances.

We use **Stratified Random Sampling** based on the *prediction map*:

* **Stratum 1 (Detected Loss):** 800 samples. (Heavily oversampled to capture timing dynamics).
* **Stratum 0 (Stable):** 200 samples. (To monitor false alarms).

.. GENERATED FROM PYTHON SOURCE LINES 120-153

.. code-block:: Python


    # Define Strata based on Prediction Map
    strata_map = y_pred_class
    idx_loss = valid_idx[strata_map[valid_idx] == 1]
    idx_stable = valid_idx[strata_map[valid_idx] == 0]

    # Allocate Samples
    n_loss = 800
    n_stable = 200

    samp_loss = np.random.choice(idx_loss, size=n_loss, replace=False)
    samp_stable = np.random.choice(idx_stable, size=n_stable, replace=False)
    sample_indices = np.concatenate([samp_loss, samp_stable])

    # Extract Data for Samples
    sample_true_dates = y_true_dates[sample_indices]
    sample_pred_dates = y_pred_dates[sample_indices]
    sample_strata = strata_map[sample_indices]

    # Calculate Population Weights
    # Essential for correcting the bias introduced by oversampling the Loss stratum.
    pop_counts = {
        0: np.sum((strata_map == 0) & forest_mask.ravel()),
        1: np.sum((strata_map == 1) & forest_mask.ravel())
    }

    # Initialize Estimator
    # This handles the weighting logic for us
    estimator = StratifiedEstimator(
        strata_labels=sample_strata,
        stratum_pop_sizes=pop_counts
    )


.. GENERATED FROM PYTHON SOURCE LINES 154-160

4. Estimated Temporal Evaluation
--------------------------------
We initialize the :class:`~nrt.validate.metrics.TemporalEvaluator` with our sample
and the Stratified Estimator.

It allows us to ask: *"What is the accuracy if we allow a delay of X days?"*

.. GENERATED FROM PYTHON SOURCE LINES 160-174

.. code-block:: Python


    evaluator = TemporalEvaluator(
        y_true=sample_true_dates,
        y_pred=sample_pred_dates,
        estimator=estimator,
        experiment_start=exp_start,
        experiment_end=exp_end
    )

    # Compute Accuracy Curves
    # We evaluate User's Accuracy (UA), Producer's Accuracy (PA), and F1 Score
    df_curve = evaluator.compute_curve(lags=lags, metrics=['ua', 'pa', 'f1'])
    print(df_curve.head())


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

       lag        ua        pa        f1
    0    0  0.000000  0.000000  0.000000
    1    5  0.161388  0.143085  0.151686
    2   10  0.398190  0.353033  0.374254
    3   15  0.579186  0.513502  0.544370
    4   20  0.704374  0.624493  0.662033


.. GENERATED FROM PYTHON SOURCE LINES 175-183

5. Key Metrics: Initial Delay & Level-off
-----------------------------------------
Curves are informative, but single numbers are often needed to benchmark systems.

* **Initial Delay:** How long does it take to start detecting *anything* meaningful?
    (Defined as the lag where Producer's Accuracy > 2%).
* **Level-off Point:** At what lag does accuracy stop improving significantly?
    (Found using the "Kneedle" algorithm on the F1 curve).

.. GENERATED FROM PYTHON SOURCE LINES 183-199

.. code-block:: Python


    # Calculate Initial Delay (based on PA)
    init_delay_lag, init_delay_val = evaluator.find_initial_delay(df_curve, metric='pa')

    # Handle case where threshold is never reached (returns NaN)
    start_lag_for_knee = 0
    if not np.isnan(init_delay_lag):
        start_lag_for_knee = int(init_delay_lag)

    # Calculate Level-off Point (based on F1)
    # We start searching after the initial delay to avoid early noise
    level_off_lag, level_off_val = evaluator.find_level_off(df_curve, metric='f1', start_lag=start_lag_for_knee)

    print(f"Initial Delay: {init_delay_lag} days (PA={init_delay_val:.2f})")
    print(f"Level-off:     {level_off_lag} days (F1={level_off_val:.2f})")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Initial Delay: 5 days (PA=0.14)
    Level-off:     25 days (F1=0.72)


.. GENERATED FROM PYTHON SOURCE LINES 200-210

6. Visualization and Discussion
-------------------------------
We plot the estimated metrics against the True population curves calculated in Step 2.

* **Solid Lines:** Estimated values from our stratified sample.
* **Dotted Lines:** True values from the exhaustive population.

Notice how the stratified estimator successfully recovers the true population dynamics
despite the heavy oversampling of the disturbed class. Without the weights provided by
``StratifiedEstimator``, the results would be heavily biased.

.. GENERATED FROM PYTHON SOURCE LINES 210-243

.. code-block:: Python


    plt.figure(figsize=(10, 6))

    # Plot Estimated Curves
    plt.plot(df_curve['lag'], df_curve['f1'], label='Estimated F1', color='green', linewidth=2)
    plt.plot(df_curve['lag'], df_curve['pa'], label="Estimated PA", color='blue', linestyle='--')
    plt.plot(df_curve['lag'], df_curve['ua'], label="Estimated UA", color='orange', linestyle='--')

    # Plot True Curves (Ground Truth)
    plt.plot(df_true['lag'], df_true['f1'], label='True F1', color='green', linestyle=':', alpha=0.6)
    plt.plot(df_true['lag'], df_true['pa'], label="True PA", color='blue', linestyle=':', alpha=0.6)
    plt.plot(df_true['lag'], df_true['ua'], label="True UA", color='orange', linestyle=':', alpha=0.6)

    # Mark Metrics
    if not np.isnan(level_off_lag):
        plt.axvline(level_off_lag, color='grey', linestyle=':', alpha=0.7)
        plt.scatter([level_off_lag], [level_off_val], color='green', zorder=5)
        plt.text(level_off_lag + 2, level_off_val - 0.05,
                 f'Level-off\n({int(level_off_lag)} days)', color='green')

    if not np.isnan(init_delay_lag):
        plt.scatter([init_delay_lag], [init_delay_val], color='blue', zorder=5)
        plt.text(init_delay_lag + 2, init_delay_val + 0.02,
                 f'Initial Delay\n({int(init_delay_lag)} days)', color='blue')

    plt.xlabel('Temporal Lag Tolerance (days)')
    plt.ylabel('Accuracy Metric')
    plt.title('Time-to-Detection: Estimated vs True')
    plt.legend(loc='lower right')
    plt.grid(True, alpha=0.3)
    plt.ylim(0, 1.0)
    plt.show()


.. image-sg:: /auto_examples/images/sphx_glr_plot_temporal_accuracy_001.png
   :alt: Time-to-Detection: Estimated vs True
   :srcset: /auto_examples/images/sphx_glr_plot_temporal_accuracy_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 244-253

Interpretation
--------------
* **Convergence:** The estimated curves (solid) closely track the true curves (dotted),
    validating our sampling and estimation strategy.
* **Rapid Rise:** The steep slope at the beginning indicates that most disturbances
    are detected relatively quickly.
* **Plateau:** The curve flattens out (Level-off) when adding more "patience"
    (lag tolerance) no longer yields better accuracy. This suggests that remaining errors
    are likely permanent (e.g., missed events) rather than just late detections.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 2.243 seconds)


.. _sphx_glr_download_auto_examples_plot_temporal_accuracy.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_temporal_accuracy.ipynb <plot_temporal_accuracy.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_temporal_accuracy.py <plot_temporal_accuracy.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_temporal_accuracy.zip <plot_temporal_accuracy.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_