# DataType Testing

The data type information is stored in two places in the MS domain objects.
Originally, there was only a single mapping between data types and columns
(data or corrected) that applied to all sources and spws. This is the
`MeasurementSet::data_column` dictionary, whose keys are data types from the
`DataType` enum and whose values are the column names.

Later we saw the need for per source/spw information. This is stored in
the `MeasurementSet::data_types_per_source_and_spw` dictionary. Its keys are
tuples of source name and real spw ID. The values are lists of the data types
that exist for the given selection in this MS (at most two, because there are
just the data and corrected columns).
This dictionary is not intended to be manipulated directly, but only via calls
to the `MeasurementSet::set_data_column` method. The `importdata` tasks use it
to set a data type for the column. This also sets *all* source/spw entries in
`data_types_per_source_and_spw` if no special source/spw selection is given
(which is the case for the `importdata` tasks).

The self-calibration task always calls `set_data_column` with a source/spw
selection. In that case `MeasurementSet::data_column` is set, but
`MeasurementSet::data_types_per_source_and_spw` only gets an entry for the
particular selection.
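The difference between the two call patterns can be illustrated with a toy model. This is only a sketch of the behaviour described above: `ToyMS` and its constructor are hypothetical, the `DataType` enum is a two-member stand-in, and the real `MeasurementSet` class may handle the spw argument differently (here a single spw string is simply converted to an integer).

```python
from enum import Enum, auto


class DataType(Enum):
    """Stand-in for pipeline.domain.DataType; only two members are modelled."""
    REGCAL_CONTLINE_SCIENCE = auto()
    SELFCAL_CONTLINE_SCIENCE = auto()


class ToyMS:
    """Hypothetical toy model of the MS bookkeeping, not the pipeline class."""

    def __init__(self, sources, spws):
        self.sources = list(sources)
        self.spws = list(spws)
        self.data_column = {}                    # DataType -> column name
        self.data_types_per_source_and_spw = {}  # (source, real spw ID) -> [DataType]

    def set_data_column(self, dtype, column, source=None, spw=None):
        # The global mapping is always updated.
        self.data_column[dtype] = column
        # Without a selection, every source/spw combination gets the entry.
        sources = self.sources if source is None else [source]
        spws = self.spws if spw is None else [int(spw)]
        for key in [(s, w) for s in sources for w in spws]:
            dtypes = self.data_types_per_source_and_spw.setdefault(key, [])
            if dtype not in dtypes:
                dtypes.append(dtype)


ms = ToyMS(['HL_Tau'], [13, 15, 17, 19])
# Import-like call: no selection, so all four source/spw entries are filled.
ms.set_data_column(DataType.REGCAL_CONTLINE_SCIENCE, 'DATA')
# Selfcal-like call: only the ('HL_Tau', 17) entry gains the new data type.
ms.set_data_column(DataType.SELFCAL_CONTLINE_SCIENCE, 'CORRECTED_DATA', 'HL_Tau', '17')
```

After these two calls the toy `data_column` holds both data types, while only the `('HL_Tau', 17)` key lists the selfcal type.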

The two methods `MeasurementSet::get_data_column` and `MeasurementSet::get_data_type`
retrieve the column for a given data type and the data type for a given column,
respectively, optionally restricting the query to a subset of sources and/or spws.

To arrive at a setup like one that `hif_selfcal` might generate, one needs to
mimic some `set_data_column` calls.

For PIPE-1474, for example, two MSes mentioned in PIPE-1209 were used. They are
not too large and also exercise the virtual spws. From a normal PL run one can
take
`uid___A002_Xed4607_Xfbf7_targets.ms`
and
`uid___A002_Xed4607_Xfd64_targets.ms`
and manually copy the DATA column to CORRECTED_DATA just to have some data in
both columns. The first MS has science spw IDs 13, 15, 17, 19; the second MS
has 5, 7, 9, 11.

Now one can import these MSes in a new PL session and modify the lookup tables
manually:

```python
from pipeline.domain import DataType

ctx = h_init()
# Import both MSes and register the DATA column as regular calibration data.
hifa_importdata(['uid___A002_Xed4607_Xfbf7_targets.ms',
                 'uid___A002_Xed4607_Xfd64_targets.ms'],
                datacolumns={'data': 'regcal_contline_science'})
# First MS
ctx.observing_run.measurement_sets[0].set_data_column(DataType.SELFCAL_CONTLINE_SCIENCE, 'CORRECTED_DATA', 'HL_Tau', '17')
ctx.observing_run.measurement_sets[0].set_data_column(DataType.SELFCAL_CONTLINE_SCIENCE, 'CORRECTED_DATA', 'HL_Tau', '19')
# Second MS
ctx.observing_run.measurement_sets[1].set_data_column(DataType.SELFCAL_CONTLINE_SCIENCE, 'CORRECTED_DATA', 'HL_Tau', '7')
ctx.observing_run.measurement_sets[1].set_data_column(DataType.SELFCAL_CONTLINE_SCIENCE, 'CORRECTED_DATA', 'HL_Tau', '9')
ctx.observing_run.measurement_sets[1].set_data_column(DataType.SELFCAL_CONTLINE_SCIENCE, 'CORRECTED_DATA', 'HL_Tau', '11')
h_save('c1')
```

This sets up only regular calibration data for virtual spw 13, selfcal data in
just one MS for virtual spw 15 (real spw ID 7 in the second MS), and selfcal
data in both MSes for virtual spw IDs 17 and 19.
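The resulting layout can be double-checked with a few lines of plain Python. This is a minimal sketch that does not use the pipeline: the labels `MS1` and `MS2` are made up, and it assumes that the virtual spw IDs equal the real spw IDs of the first MS and that spws are matched between MSes by position, which is consistent with "virtual spw 15 (real spw ID 7 in MS2)".

```python
# Real science spw IDs per MS, as listed in the text.
REAL_SPWS = {'MS1': [13, 15, 17, 19], 'MS2': [5, 7, 9, 11]}
# Assumption: virtual spw IDs are the real IDs of the first MS, matched by position.
VIRTUAL_SPWS = REAL_SPWS['MS1']
# Real spw IDs that received a SELFCAL_CONTLINE_SCIENCE entry in the session above.
SELFCAL_REAL_SPWS = {'MS1': {17, 19}, 'MS2': {7, 9, 11}}


def selfcal_mses_per_virtual_spw():
    """Map each virtual spw ID to the MSes that hold selfcal data for it."""
    result = {}
    for idx, vspw in enumerate(VIRTUAL_SPWS):
        result[vspw] = [ms for ms, spws in REAL_SPWS.items()
                        if spws[idx] in SELFCAL_REAL_SPWS[ms]]
    return result


layout = selfcal_mses_per_virtual_spw()
for vspw, mses in layout.items():
    print(vspw, mses or 'regcal only')
```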

Now one has a setup to test various combinations of data types per
source/spw, including virtual spw lookups. One would keep resuming the saved
context, run some commands and save to a new file (`h_save('c2')`, etc.).

For `hif_makeimlist` and `hif_editimlist` the query involves a list of MSes,
so a separate method, `ObservingRun::get_measurement_sets_of_type`, delivers
that information. It takes a list of data types in the order of possible
fallbacks. This list is provided by the function `get_specmode_datatypes` from
`makeimlist.py`.
For example, for specmode `mfs` the list is

`[DataType.SELFCAL_CONTLINE_SCIENCE, DataType.REGCAL_CONTLINE_SCIENCE, DataType.REGCAL_CONTLINE_ALL, DataType.RAW]`

for science targets and

`[DataType.REGCAL_CONTLINE_ALL, DataType.RAW]`

for calibrators. For science targets this means that selfcal is preferred over
regcal; if the split into the `_targets` MSes has not yet been done, the regcal
data from the original MSes is used, and so on. The method delivers a list of
MSes in which the data can be found. Optionally, one can again specify a source
and spw.
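The fallback idea can be sketched independently of the pipeline. The function `find_mses` below is hypothetical and is not the implementation of `get_measurement_sets_of_type`. It assumes that the first data type in the preference list found in any MS wins, and that the spw keys have already been translated to a common (virtual) numbering. It also returns the matched data type alongside the MS names purely for illustration.

```python
from enum import Enum, auto


class DataType(Enum):
    """Stand-in for pipeline.domain.DataType with the four members used here."""
    RAW = auto()
    REGCAL_CONTLINE_ALL = auto()
    REGCAL_CONTLINE_SCIENCE = auto()
    SELFCAL_CONTLINE_SCIENCE = auto()


# Preference order for specmode mfs and science targets, as quoted above.
MFS_SCIENCE_ORDER = [DataType.SELFCAL_CONTLINE_SCIENCE,
                     DataType.REGCAL_CONTLINE_SCIENCE,
                     DataType.REGCAL_CONTLINE_ALL,
                     DataType.RAW]


def find_mses(available, preferred, source=None, spw=None):
    """Return (data type, MS names) for the first preferred type that is present.

    `available` maps an MS name to a {(source, spw): [DataType, ...]} dictionary.
    """
    for dtype in preferred:
        hits = [name for name, per_selection in available.items()
                if any(dtype in dtypes
                       for (src, s), dtypes in per_selection.items()
                       if (source is None or src == source)
                       and (spw is None or s == spw))]
        if hits:
            return dtype, hits
    return None, []


REG = DataType.REGCAL_CONTLINE_SCIENCE
SELF = DataType.SELFCAL_CONTLINE_SCIENCE
# Toy data mirroring the two-MS test setup, keyed by virtual spw ID.
available = {
    'MS1': {('HL_Tau', 13): [REG], ('HL_Tau', 15): [REG], ('HL_Tau', 17): [REG, SELF]},
    'MS2': {('HL_Tau', 13): [REG], ('HL_Tau', 15): [REG, SELF], ('HL_Tau', 17): [REG, SELF]},
}

spw13 = find_mses(available, MFS_SCIENCE_ORDER, 'HL_Tau', 13)
spw15 = find_mses(available, MFS_SCIENCE_ORDER, 'HL_Tau', 15)
```

In this toy setup spw 13 falls back to regcal in both MSes, whereas spw 15 resolves to selfcal in the second MS only.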
