-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathmemo14.tar
144 lines (108 loc) · 10 KB
/
memo14.tar
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
memo14.tex 0000644 0001750 0001750 00000017521 13706543764 011053 0 ustar njj njj \documentclass[12pt]{article}
\textwidth 160mm
\textheight 245mm
\oddsidemargin -3mm
\evensidemargin -3mm
\parskip 3mm
\parindent 0mm
\topmargin -20mm
\pagestyle{empty}
\usepackage{graphicx}
\graphicspath{{.}{./figures/}}
\usepackage{color}
\usepackage[utf8]{inputenc}
\usepackage{hyperref}
\usepackage{subcaption}
\usepackage{amssymb}
\usepackage{gensymb}
\newcommand{\memohead}[3]{\clearpage\begin{centering}\mbox{}\hrulefill \vskip 50mm \mbox{}{\Huge \bf LBWG memo #1}\\ \mbox{}\vskip 40mm {\Huge \bf #2}\mbox{}\vskip 40mm{\Large #3}\\\end{centering}\vfill\hrulefill\clearpage}
\usepackage{listings}
\usepackage{rotating}
\begin{document}
\memohead{14}{Description of loop 2 HDF5 routines}{Sean Mooney, 2018.10.08}
\section{Introduction} % ==========================================================
\subsection{Overview of the routines} % ------------------------------------------
HDF5 (Hierarchical Data Format 5) is a file format designed to store and organize large amounts of data. There is no limit on the number or size of data objects that can be stored in HDF5 files, and manipulating HDF5 files is fast. For these reasons, the HDF5 file format was chosen to handle the phase solutions in the long-baseline pipeline.
The long-baseline pipeline is structured into loops, as outlined in memo $11$. Much of the book-keeping of the pipeline is done in loop $2$, as it organizes data between the higher-level loop $1$, which does things like calculate the delays, and the lower-level loop $3$, which performs the self-calibration imaging. Most of the book-keeping involves handling HDF5 files in some respect. As such, there are a few routines which have been written in Python that perform different operations on HDF5 files. There are four in total and they are described below in turn. These functions are at \url{https://github.com/mooneyse/lb-loop-2/blob/master/hdf5_functions.py}. They will be implemented into the pipeline shortly.
\section{Details of the routines} % ===============================================
\subsection{\texttt{evaluate\_solutions}} % ---------------------------------------
\subsubsection*{Description}
\begin{enumerate}
\item Get the direction from the HDF5 file.
\item Evaluate the phase solutions in the HDF5 file for each station using the XX-YY statistic. This coherence metric is \\
\texttt{np.nanmean(np.gradient(abs(np.unwrap(phasesol\_xx - phasesol\_yy))) ** 2)}
\item Determine the validity of each XX-YY statistic that was calculated by checking it is above or below some given threshold.
\item Append the right ascension, declination, and validity to the master text file.
\end{enumerate}
\subsubsection*{Inputs}
Usage: \texttt{evaluate\_solutions(h5parm, mtf, threshold = 0.25)}. The arguments are described below:
\begin{itemize}
\item \texttt{hdf5} \textit{(string)} -- LOFAR HDF5 parameter file.
\item \texttt{mtf} \textit{(string)} -- The path and file name of the master text file. The master text file contains a list of the HDF5 files, the direction of that HDF5 file, and whether its solutions were deemed good for each station. For example, see \url{https://github.com/mooneyse/lb-loop-2/blob/master/mtf.txt}.
\item \texttt{threshold} \textit{(float, optional)} -- Threshold to determine the goodness of the XX-YY statistic, which is on a scale of $0$ to $1$ where $0$ is the best situation.
\end{itemize}
\subsubsection*{Outputs}
\begin{itemize}
\item \texttt{None} -- This function returns nothing but it will have written results to the master text file.
\end{itemize}
\subsection{\texttt{make\_h5parm\_multiprocessing}} % -----------------------------
\subsubsection*{Description}
\begin{enumerate}
\item This is a wrapper for the \texttt{make\_h5parm} function, where it runs that function in parallel.
\item This wrapper takes as input a list of arguments to be passed to the \texttt{make\_h5parm} function, where each list entry is a tuple of parameters for one thread.
\item The \texttt{make\_h5parm} function gets the direction from the measurement set or takes it from the list provided.
\item It then reads the directions of the h5parms from the master text file.
\item It calculates the separation between the measurement set direction and the HDF5 files directions.
\item For each station, it finds the HDF5 file of smallest separation which has valid phase solutions.
\item It creates a new HDF5 file.
\item It writes these best phase solutions to this new HDF5 file.
\end{enumerate}
\subsubsection*{Inputs}
Usage: \texttt{make\_h5parm\_multiprocessing(args)}. The arguments are a list of tuples for multiprocessing, and the tuple arguments are described below:
\begin{itemize}
\item \texttt{mtf} \textit{(string)} -- The path and file name of the master text file. The master text file contains a list of the HDF5 files, the direction of that HDF5 file, and whether its solutions were deemed good for each station. For example, see \url{https://github.com/mooneyse/lb-loop-2/blob/master/mtf.txt}.
\item \texttt{ms} \textit{(string)} -- This is the measurement set which is going to be put through the self-calibration imaging loop (loop $3$).
\item \texttt{clobber} \textit{(bool, default = false)} -- If true, this will overwrite the new HDF5 file that is being created, should it exist already.
\item \texttt{directions} \textit{(list, default = [])} -- Right ascension and declination of the source, in radians.
\end{itemize}
\subsubsection*{Outputs}
\begin{itemize}
\item The \texttt{make\_h5parm} function returns the name newly created HDF5 file (i.e. a string).
\end{itemize}
\subsection{\texttt{applyh5parm}} % -----------------------------------------------
\subsubsection*{Description}
\begin{enumerate}
\item It creates an NDPPP parset which will apply the HDF5 file that was outputted by \texttt{make\_h5parm} to the measurement set.
\item It executes the NDPPP process.
\end{enumerate}
\subsubsection*{Inputs}
Usage: \texttt{}. The arguments are described below:
\begin{itemize}
\item \texttt{new\_h5parm} \textit{(str)} -- This is the output of the \texttt{make\_h5parm} function.
\item \texttt{ms} \textit{(string)} -- This is the measurement set which is going to be put through the self-calibration imaging loop (loop $3$). The HDF5 solutions will be applied to it.
\item \texttt{clobber} \textit{(bool, default = False)} -- Whether or not to overwrite the parset, if it already exists.
\item \texttt{column\_out} \textit{(str, default = `DATA')} -- Which column in the measurement set will NDPPP write to.
\end{itemize}
\subsubsection*{Outputs}
\begin{itemize}
\item This function returns the name of the measurement set as a string which has had the solutions applied to it.
\end{itemize}
\subsection{\texttt{updatelist}} % ------------------------------------------------
\subsubsection*{Description}
\begin{enumerate}
\item Combine the phase solutions from the initial HDF5 file and the final HDF5 file.
\item It calls \texttt{evaluate\_solutions} to update the master text file with a new line appended.
\end{enumerate}
\subsubsection*{Inputs}
Usage: \texttt{updatelist(new\_h5parm, loop3\_h5parm, mtf, clobber = False, threshold = 0.25)}. The arguments are described below:
\begin{itemize}
\item \texttt{new\_h5parm} \textit{(str)} -- The initial HDF5 file (i.e. the output from \texttt{make\_h5parm}).
\item \texttt{loop3\_h5parm} \textit{(str)} -- This is the final HDF5 file from loop 3.
\item \texttt{mtf} \textit{(string)} -- The path and file name of the master text file. The master text file contains a list of the HDF5 files, the direction of that HDF5 file, and whether its solutions were deemed good for each station. For example, see \url{https://github.com/mooneyse/lb-loop-2/blob/master/mtf.txt}.
\end{itemize}
\subsubsection*{Outputs}
\begin{itemize}
\item \texttt{combined\_h5parm} -- A new HDF5 file that is a combination of \texttt{new\_h5parm} and \texttt{loop3\_h5parm}.
\end{itemize}
\end{document}