Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the functionality to select single repeat cycles/timesteps for LI L2 accumulated archive (ARC) data #2878

Open
ameraner opened this issue Aug 8, 2024 · 0 comments
Labels
component:readers enhancement code enhancements, features, improvements

Comments

@ameraner
Copy link
Member

ameraner commented Aug 8, 2024

Feature Request

Is your feature request related to a problem? Please describe.
Satpy and the li_l2_nc reader currently does not offer a simple way to extract single repeat cycles/timesteps from LI L2 accumulated products (AF/AFA/AFR). These products contain 20 timesteps covering 10 minutes of sensing time.

When loading the main variables in the accumulated_flash_area, flash_accumulation, flash_radiance with the default reader kwarg with_area_definition=True, the reader will currently return a 2-d array with the summation of the entire sensing period of the file (particularly after #2867). This way, the archived products behave the same as the normal dissemination products.

However, some users need to be able to access each timestep individually, or might want to extract a specific range of timesteps.

Describe the solution you'd like
Ideally, an easy interface to select single timesteps returned as a 2-d array, and ranges of timesteps/all timesteps organised in a 3-d xarray with a time coordinate. It could look like

scn.load(["accumulated_flash_area"])  # returns summed-up array as already currently
scn.load(["accumulated_flash_area"], start_time=[datetime(2024, 8, 8, 12, 0, 0)]) # returns 2d array for given timestamp
scn.load(["accumulated_flash_area"], start_time='all') # returns 3d array with time coordinate and all timesteps
scn.load(["accumulated_flash_area"], start_time=datetime(2024, 8, 8, 12, 0, 0), end_time=datetime(2024, 8, 8, 12, 1, 0)) # returns 3d array with time coordinate with all timesteps within the given range

Or similar. Whether the above is implemented as a load kwarg to be handled by the FileYamlReader, or via the DataID/DataQuery mechanism, it's to be defined.

This has been discussed a bit on the slack on 08/08/2024: https://pytroll.slack.com/archives/C0LNH7LMB/p1723118806005139

Describe any changes to existing user workflow
No, this is a new feature.

Additional context
There is currently a workaround/hack to access single timesteps and get them into the gridded format.

First, this is the nominal, recommended code to read the accumulated data as 2-d arrays. It will give you the summed-up 2-d array for the entire timespan of the file.

from satpy import Scene
from glob import glob
import os
import numpy as np

path_to_testdata = '/tcenas/fbf/mtg/RollingBuffer/OPE/l2pf/LI-L2/'

# initialise LI L2 acc product Scene to produce 2-d array output (recommended, does not require resampling)
scn_2d_adef = Scene(filenames=glob(os.path.join(path_to_testdata, '*AFA*--ARC-*T_0030_0001.nc')), reader='li_l2_nc')
scn_2d_adef.load(['accumulated_flash_area'])

afa_directfrom2d = scn_2d_adef['accumulated_flash_area']

Now the workaround if you need the single datasets, while we wait for a better way to be used with the mode above:

# in alternative, you can initialise it the Scene produce 1-d outputs by setting with_area_definition False
scn_1d = Scene(filenames=glob(os.path.join(path_to_testdata, '*AFA*--ARC-*T_0030_0001.nc')), reader='li_l2_nc',
               reader_kwargs={'with_area_definition': False})
scn_1d.load(['accumulated_flash_area'])
# The array is now 1-d, and we need to resample it to the correct grid (!) and using the bucket_sum resampler (!) as below
scn_1d_r = scn_1d.resample('mtg_fci_fdss_2km', resampler='bucket_sum', empty_bucket_value=np.nan)

# now, this
afa_resampledfrom1d = scn_1d_r['accumulated_flash_area']
# is a 2-d array (practically) the same as afa_directfrom2d  as computed in the previous example
# note: possible remaining spurious differences are related to resampling close to the disk edge


# WORKAROUND/HACK FOR ISSUE
# Now, we can use the fact that we can reconstruct the 2-d array from the scn_1d Scene to access single time steps
# to do that, we need to load the accumulation_offsets variable, that contains the indices for each time step in the 1-d array
scn_1d.load(['accumulation_offsets'])
# let's extract the third time step as an example
timestep = 3
start_index = int(scn_1d['accumulation_offsets'][timestep-1])
end_index = int(scn_1d['accumulation_offsets'][timestep])

# let's create a new dataset, but in principle we could also overwrite accumulated_flash_area directly if we wanted to, and composites would then work
ds_name_new = 'accumulated_flash_area_' + str(timestep)
# and here it comes: we slice the 1-d dataset using the extracted start and end indices
scn_1d[ds_name_new] = scn_1d['accumulated_flash_area'][start_index:end_index]
# we need to do the same for the coordinates
scn_1d[ds_name_new].attrs['area'] = scn_1d['accumulated_flash_area'].attrs['area'].copy()
scn_1d[ds_name_new].attrs['area'].lats = scn_1d[ds_name_new].attrs['area'].lats[start_index:end_index]
scn_1d[ds_name_new].attrs['area'].lons = scn_1d[ds_name_new].attrs['area'].lons[start_index:end_index]
#  then let's resample this to the correct grid (!) and using the bucket_sum resampler (!) as below
scn_1d_r = scn_1d.resample('mtg_fci_fdss_2km',  resampler='bucket_sum', empty_bucket_value=np.nan)

# and here's our extracted 2-d array for one timestep
afa_resampled1d_fortimestep = scn_1d_r[ds_name_new]

note that you need an updated pyresample (>=1.29.0) for the bucket_sum to work as intended here (due to pytroll/pyresample#602)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:readers enhancement code enhancements, features, improvements
Projects
None yet
Development

No branches or pull requests

1 participant