-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fourier design matrix definition consistency #349
Comments
I'm not sure this is a good idea. We're doing something pretty strange already, because we treat the signal as periodic when it isn't. (Yongqi Zhang and I are trying to understand how important the effect of this is and what can be done about it, following your ideas from a while back, Rutger.) Furthermore, typically |
For a PTA, we can define deltaf in a way that makes sense to us. Giving it deltaf=1/Tmax with Tmax the same way we do now is totally reasonable. Also, there is no 'right' way to choose those frequencies, and I am not arguing to modify those. However, for a single pulsar I think it is a little inconsistent what we are doing now. Just try to do Fourier analysis of regularly sampled data with an FFT. See if you can understand why the frequencies are chosen by the FFT package. Discrete Fourier Transforms are very well-understood and standardized. All I am arguing is to use the same definitions. I have done the cross-checks, and I can reproduce the results of an FFT to machine precision with the modifications I posted above. |
I agree that there is something wrong about analyzing a single pulsar that was observed starting with an observation at On the other hand, I'm not sure this is very usefully fixed by using At the moment my thought is to require the person who calls the function to set up the Fourier matrix to supply |
I agree on several points: the user should be specifying deltaf rather than T, not in the least because T is also ambiguous when there are several pulsars, but also because of the other issue we mentioned above. Also agree regarding counting epochs rather than observations, although this is something than one can argue about. Most of all, I think we would like "sane defaults", and options. The different normalizations should be options with the same names as in scipy ('forward'/'backward'/'ortho' -- we only use 'backward' now). The defaults IMO ideally would be equal to the FFT frequency bins for regularly sampled data (in epochs). |
I guess one proposal is to remove |
We should probably keep Tspan in there for compatibility. Since it is None by default it is easy to add a deltaf keyword. I agree that we should not pass anything like Nepochs or whatever. If they need that kind of processing they can pass deltaf. And 't' is passed, so we have N already, so the normalization can be calculated. My suggestion would be to default to the scipy/FFT deltaf going forward so we do not confuse people who would want to use such tools in conjunction with Enterprise, but that can also just be done conditional on a keyword that is checked if both Tspan and deltaf are set to None. So:
|
OK. So by default you will use the current normalization, for which you don't need to know N. If the user specifies 'forward' or 'ortho', you'll use a different normalization with My preference would be to raise an exception if the caller specifies neither Tspan nor deltaf. The present default is to use the max-min toa for the specific pulsar under analysis, and this sometimes causes trouble when the person was expecting the max-min toa for the whole PTA. It also makes you unhappy because it's missing the I would strongly oppose silently changing to a different convention. It's one thing to break people's code and require them to make a small change to their calls, quite another to make their code give different answers, even if they are only slightly different. I would also favor raising an exception if the user specifies both deltaf and Tspan. |
This is perhaps controversial, but I consider the way we define our Fourier design matrices in Enterprise not clear or clean enough.
The Fourier design matrix is closely related to the DFT and least-squares spectral analysis. Ideally, we would use definitions that generalize the definition of the DFT used elsewhere in the literature. In scipy/numpy, the DFT/FFT is defined in a way that makes the DFT a unitary transformation with normalizations for various purposes:
norm='ortho'
: the DFT is a unitary transformation, meaning that np.dot(F.T, F) = np.identity(nobs)norm='forward'
: the standard definition of a DFTnorm='backward'
: going back from the Fourier coefficients to the original data (inverse of 'forward')In Enterprise (or anywhere in the PTA literature), we use a different definition introduced by the Lentati et al. paper. That definition is closely related to the 'backward' normalization. The only difference is that, for regularly sampled data, our frequency spacing 'Delta f' is off from the standard definition. This is because for a dataset with N observations, you actually only have (N-1) spacings between observations. So the total observation length T is different from the definition you would use for regularly sampled data. For regularly sampled data, one would use the definition T = N * Delta_t. This results in a different multiplicative factor for the frequency spacing of N/(N-1).
I am of the opinion that our definition of the Fourier design matrix should generalize to the usual definition of the DFT for regularly sampled data. The advantage of doing that is also that our Fourier modes would become truly orthogonal for regularly sampled data. Currently, they do not have that property. In practice, this is not a big difference, but it is not optimal.
We can fix this consistency in the
gp_bases.py
file, by adding anorm
keyword. Basically, I would suggest modifying with codeblock that is something similar to:If we need to match different datasets, instead of passing the total observation time
Tspan
, we should be passingdeltaf
, because we actually intend on matching the frequency spacing and not the observation time.The text was updated successfully, but these errors were encountered: