Stacking of low‐SNR calibrators ‐ facetselfcal

Group members: Frits Sweijen, Etienne Bonnassieux, Reinout van Weeren, Marco Bondi, Roland Timmerman

It is occasionally difficult to obtain a sufficiently high signal-to-noise ratio (SNR) to perform delay calibration. This can occur due to a lack of known bright compact sources, challenging ionospheric conditions, or reduced sensitivity due to the beam. During this busy week, we explore a technique whereby we stack multiple sources in the uv domain, as described in Lindroos et al. 2014, to increase our SNR.

Brief description of method

To gain signal-to-noise, it is key that the calibration solution being derived is constrained by all sources at the same time. While facetselfcal.py/DP3 can already accept multiple Measurement Sets, this is intended for different time intervals on a single source, and the self-cal therefore derives separate solutions per MS. To circumvent this, it is necessary to stack the different Measurement Sets in the uv domain and feed the result into the self-cal as if it were a single MS. However, because each MS contains a different source structure, it is not possible to simply average them. Instead, we first divide the DATA column by the MODEL column to normalize every MS to a point source. As the normalized data sets then agree on the source structure, they can be averaged and fed into the self-cal. This also circumvents the issue that the different sources would otherwise have a different uv-coverage. Finally, the derived calibration solutions (h5) need to be applied to the original data sets in order to update the MODEL column and proceed to the next self-cal cycle. In summary, starting with a number of separate Measurement Sets, the following steps need to be performed:

1. Divide DATA by MODEL and write the result to DATA_NORM. If no MODEL is available, assume a point-source model (MODEL=1), i.e. copy DATA to DATA_NORM.
2. Update the visibility weights as WEIGHT_SPECTRUM_PM = WEIGHT_SPECTRUM * (abs(MODEL))^2 (PM for point source).
3. Average the DATA_NORM columns of all Measurement Sets, weighted by WEIGHT_SPECTRUM_PM, and write the result into a stacked MS.
4. Sum WEIGHT_SPECTRUM_PM over all Measurement Sets and write the result into WEIGHT_SPECTRUM of the stacked MS.
5. Derive calibration solutions as you wish against a point-source model.
6. Apply these solutions to the original data sets to create a CORRECTED_DATA column.
7. Image CORRECTED_DATA for each MS to update MODEL.
8. Go back to step 1 until converged.
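
The normalization and stacking steps (the first four steps above) can be sketched in plain Python as follows. This is only a minimal illustration and not the facetselfcal.py implementation: the function names normalize_by_model() and stack_normalized() are hypothetical, visibilities are represented as flat lists of complex numbers, and we assume that all data sets are already aligned so that element i of every list refers to the same time, baseline and channel.

```python
import cmath


def normalize_by_model(data, model, weights):
    """Return (DATA / MODEL, WEIGHT * |MODEL|^2) for each visibility."""
    if model is None:
        # No MODEL available: assume a unit point source (MODEL = 1).
        model = [1 + 0j] * len(data)
    data_norm, weights_pm = [], []
    for vis, mod, w in zip(data, model, weights):
        amp2 = abs(mod) ** 2
        if amp2 > 0.0:
            data_norm.append(vis / mod)
            weights_pm.append(w * amp2)
        else:
            # Assumption of this sketch: a zero model gets zero weight.
            data_norm.append(0j)
            weights_pm.append(0.0)
    return data_norm, weights_pm


def stack_normalized(data_norm_sets, weights_pm_sets):
    """Weighted mean of the normalized data and summed weights across data sets."""
    stacked, weight_sum = [], []
    for vis_row, w_row in zip(zip(*data_norm_sets), zip(*weights_pm_sets)):
        total = sum(w_row)
        weight_sum.append(total)
        if total > 0.0:
            stacked.append(sum(w * v for w, v in zip(w_row, vis_row)) / total)
        else:
            stacked.append(0j)
    return stacked, weight_sum


# Toy example: two sources with different models, corrupted by the same gain.
gain = 0.5 * cmath.exp(0.3j)
models = [[2 + 0j, 1 + 1j], [0.5 + 0j, 3 + 0j]]
datasets = [[gain * m for m in ms_model] for ms_model in models]
weights = [[1.0, 1.0], [1.0, 1.0]]

normalized = [normalize_by_model(d, m, w)
              for d, m, w in zip(datasets, models, weights)]
stacked, weight_sum = stack_normalized([n[0] for n in normalized],
                                       [n[1] for n in normalized])
```

In this toy example every normalized visibility equals the common gain, so the stacked data also equal the gain, while the stacked weights are the sums of abs(MODEL)^2 over the two sources. The brighter source therefore dominates the stacked weight.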

Results

Normalization

For the purpose of this busy week, a separate wrapper script has been written to manage the steps described above until these are integrated within facetselfcal.py. This script calls functions which have already been included in facetselfcal.py, such as stackwrapper(), create_weight_spectrum(), create_weight_spectrum_taqL(), normalize_data_bymodel(), stackMS(), and stackMS_taql().

To verify the normalization function, we started testing on 3C295, for which the model is well-known. From a data set for which only the Dutch stations had received direction-independent (DI) solutions, we divided by the model and imaged the result. As shown below, we indeed obtain a point source.

[IMAGE SHOWING 3C295 BOTH NORMAL AND AS A POINT SOURCE]

After deriving the calibration solutions, a problem was revealed where a significant amount of data on the long baselines was flagged. This problem was narrowed down to dysco compression being unable to accommodate the dynamic range in the weights created by scaling the existing weights by MODEL squared. This can be resolved by giving dysco more bits to store each weight (e.g., 16 instead of the standard 12).
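
As a hedged sketch of how the weight bit rate could be raised, the snippet below builds a DP3 command line that rewrites an MS with dysco compression and 16 bits per weight. The helper dysco_rewrite_command() and the MS names are hypothetical. The parset keys msout.storagemanager, msout.storagemanager.databitrate and msout.storagemanager.weightbitrate are, to the best of our knowledge, the DP3 options that control this, and the data bit rate of 10 used here is an assumption; please check both against the DP3 documentation for your version.

```python
def dysco_rewrite_command(msin, msout, weight_bits=16, data_bits=10):
    """Build a DP3 argument list that rewrites an MS with dysco compression.

    Nothing is executed here; the returned list can be passed on to
    subprocess.run() once the keys have been checked against the DP3 docs.
    """
    return [
        "DP3",
        f"msin={msin}",
        f"msout={msout}",
        "steps=[]",  # no processing steps, only rewrite the data
        "msout.storagemanager=dysco",
        f"msout.storagemanager.databitrate={data_bits}",
        f"msout.storagemanager.weightbitrate={weight_bits}",
    ]


# Hypothetical file names, for illustration only.
cmd = dysco_rewrite_command("stacked.ms", "stacked_w16.ms")
print(" ".join(cmd))
```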

[IMAGE: noisyweights]

With the calibration solutions applied to the original data, we see the improvement we expect from self-calibration.

[3C295 BEFORE AND AFTER ONE CALIBRATION CYCLE]

Stacking

Next, we attempted to stack three different calibrator sources from within the Lockman Hole (area D). These relatively faint sources, made more difficult by the ionospheric conditions, had proven to be challenging delay calibrators prior to the busy week. We performed multiple experiments to thoroughly evaluate the performance of this stacked approach.

First, we performed the self-cal using the standard approach on each source individually. Next, we performed the self-cal using the normalized visibilities on each source individually to confirm that this strategy performs similarly. Finally, we performed the self-cal on the stacked normalized sources to check whether the strategy as a whole works. The images of the individual sources using standard self-cal are shown below:

[SET OF IMAGES OF REGULAR SELF-CAL ON LBCS1]

[SET OF IMAGES OF REGULAR SELF-CAL ON LBCS2]

[SET OF IMAGES OF REGULAR SELF-CAL ON LBCS3]

[FINAL CALIBRATION SOLUTIONS OF REGULAR SELF-CAL ON LBCS1]

[FINAL CALIBRATION SOLUTIONS OF REGULAR SELF-CAL ON LBCS2]

[FINAL CALIBRATION SOLUTIONS OF REGULAR SELF-CAL ON LBCS3]

For comparison, the images of the normalized self-cal runs are shown below:

[SET OF IMAGES OF NORMALIZED SELF-CAL ON LBCS1]

[SET OF IMAGES OF NORMALIZED SELF-CAL ON LBCS2]

[SET OF IMAGES OF NORMALIZED SELF-CAL ON LBCS3]

[FINAL CALIBRATION SOLUTIONS OF NORMALIZED SELF-CAL ON LBCS1]

[FINAL CALIBRATION SOLUTIONS OF NORMALIZED SELF-CAL ON LBCS2]

[FINAL CALIBRATION SOLUTIONS OF NORMALIZED SELF-CAL ON LBCS3]

Finally, below are the images of the stacked normalized self-cal run:

[SET OF IMAGES OF REGULAR SELF-CAL ON STACKED MS]

[FINAL CALIBRATION SOLUTIONS OF NORMALIZED SELF-CAL ON STACKED MS]

From the images above, we conclude that at the moment, the normalized self-cal is consistent with, though not exactly equivalent to, the regular self-cal, with small variations visible in the images. This is likely because in the normalized approach, even though the data visibilities are equivalent, we adjust the weights according to the model of our source. This has a slightly negative effect on the calibration quality.

Next, when we stack the different sources, we find that the peak flux in each of the images decreases as the self-cal proceeds. This is problematic, and may be caused by the ionospheric solutions being too unrelated between the different sources. The calibration solutions would then be sub-optimal for each individual source, resulting in the leakage of flux away from the central compact component, and this effect would be reinforced with every cycle. It is likely that the same would not be the case for calibrator sources which are located closer together, as the ionospheric solutions would then be more similar. In our current test field, the different sources are located relatively far apart.

Conclusions and future outlook

We have written and tested a different approach to self-calibration whereby we normalize the data visibilities to a point source and use that normalization to stack different calibrator sources and boost the SNR available for self-calibration. Based on our results this week, we conclude the following:

The normalized approach is valid and works within our implementation. However, the adjusting of the weights can result in small variations in the final image.
The stacking of Measurement Sets after normalization works, resulting in a valid MS that is dominated by the higher-SNR sources of the constituent Measurement Sets. However, the self-calibration on this stacked MS has not yet been proven to be able to improve the calibration quality in low-SNR scenarios. This is in part due to the fact that this time we have only managed to test on a single field.

For the near future, we suggest the following:

It should be investigated how the stacking performs on a set of targets located relatively closely together. It is likely that this will result in better performance, as then the ionospheric conditions are more similar. However, we also note that if this is done on sources which are too bright/close together, this may result in self-interference, making this not a guaranteed improvement. In the meantime, we will also be on the lookout for any potential bugs remaining in the code.
The stacking approach should be tested on simulated data, as this will allow us to narrow down the cause of issues more precisely.
If proven functional, the stacking approach should be implemented within facetselfcal.py to make this accessible to the community.

Acknowledgements

The LOFAR VLBI pipeline and its scripts were developed by:
