Why mul(missval, 0) = 0 and ifthen(0, _) = missval ?

Added by Karl-Hermann Wieners about 14 years ago

I have difficulties in understanding the rationale behind the handling of zero (0) values in mul and ifthen operators. First of all, at least for me it is not obvious why in the mul/mulc case, missing values multiplied with zero return zero. So after having multiplied a data file containing 'legal' zeroes, there's no way to distinguish the original missing value (like land points in the ocean model) and the original zero, so information is lost.
Same holds true for ifthen/ifthenc: not only missing values but also zeroes are mapped to missval destroying the original distinction between legal zeroes and genuine missing values.
Seems odd. Can anyone enlighten me as to why this convention was chosen?


Replies (4)

RE: Why mul(missval, 0) = 0 and ifthen(0, _) = missval ? - Added by Ralf Mueller about 14 years ago

Karl-Hermann Wieners wrote:

I have difficulties in understanding the rationale behind the handling of zero (0) values in mul and ifthen operators. First of all, at least for me it is not obvious why in the mul/mulc case, missing values multiplied with zero return zero. So after having multiplied a data file containing 'legal' zeroes, there's no way to distinguish the original missing value (like land points in the ocean model) and the original zero, so information is lost.

Multiplication with 0 is always destructive. No matter if there is a regular field value or not, after multiplication with 0, it would be 0, too. That's the difference between 0 and other numbers: With 0 you cannot go back after multiplication. Maybe using div instead of mul with 0 would be an improvement.

Same holds true for ifthen/ifthenc: not only missing values but also zeroes are mapped to missval destroying the original distinction between legal zeroes and genuine missing values.

If I got it right, missval is considered false, just like 0. I think a decision has to be made on how to handle this, and in this case (ifthen) it looks quite useful to me.

Seems odd. Can anyone enlighten me as to why this convention was chosen?

Could you give examples when these conventions cause you trouble?

RE: Why mul(missval, 0) = 0 and ifthen(0, _) = missval ? - Added by Karl-Hermann Wieners about 14 years ago

The application I had in mind was the comparison of two (ocean data) files, where the first file contains missing values (land points), and the second file should contain the same data without the missing values but with unknown data instead (land points are somehow initialized).

So first thing I tried was getting a 'missing value' mask from the first file, like 1 if file1(t,x) is not missing, or missval otherwise.

  • First thought would be cdo div file1 file1 outfile. But if 0 is a legal value in the non-missing parts, as it was for me, this will give you additional missing values where file1(t,x) = 0, because 0/0 = NaN is treated like a missing value. So my mask had holes where there shouldn't be any. Well, fine by me.
  • So I tried the second approach, which was cdo ifthenc,1 file1 outfile, but ifthenc explicitly treats 0 and missval the same way. So I was back to square one.
  • Third approach was cdo addc,1 -mulc,0 file1 outfile. This fails because missval * 0 = 0, thus resulting in a constant 1 field. I thought this was odd because in effect you try to get answers from points where you definitely know there won't be any data.
  • The final solution was cdo add -eqc,0 file1 -nec,0 file1 outfile. eq and ne treat missing values separately, so I get two distinct 0/1/missval masks which add up to the required mask.
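
The four attempts above can be replayed on a toy field. This is only a minimal sketch in plain Python that models the conventions as they are described in this thread; it is not CDO's actual code. MISS and the helper functions are hypothetical stand-ins named after the operators, and the three-point file1 is made up for illustration.

```python
MISS = None  # hypothetical stand-in for missval (CDO itself uses a numeric fill value)

def div(a, b):
    # a missing operand or a division by zero (including 0/0) gives missval
    return MISS if a is MISS or b is MISS or b == 0 else a / b

def ifthenc(c, x):
    # 0 and missval are both treated as "false"
    return c if (x is not MISS and x != 0) else MISS

def mulc(c, x):
    # convention discussed in this thread: a zero factor wins over missval
    if c == 0 or x == 0:
        return 0.0
    return MISS if x is MISS else x * c

def addc(c, x):
    return MISS if x is MISS else x + c

def eqc(c, x):
    return MISS if x is MISS else float(x == c)

def nec(c, x):
    return MISS if x is MISS else float(x != c)

def add(a, b):
    return MISS if a is MISS or b is MISS else a + b

# toy field: land point (missing), legal zero, ordinary value
file1 = [MISS, 0.0, 2.5]

mask_div     = [div(x, x) for x in file1]                  # hole at the legal zero
mask_ifthenc = [ifthenc(1.0, x) for x in file1]            # hole at the legal zero
mask_mulc    = [addc(1.0, mulc(0.0, x)) for x in file1]    # land point is lost
mask_eq_ne   = [add(eqc(0.0, x), nec(0.0, x)) for x in file1]  # required mask

for name, mask in [("div", mask_div), ("ifthenc", mask_ifthenc),
                   ("addc/mulc", mask_mulc), ("eqc+nec", mask_eq_ne)]:
    print(name, mask)
```

Only the last variant keeps the land point missing while still marking the legal zero as valid data.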

So frankly, I felt there should be an easier way instead of running three operations and duplicating the input streams. Hence my questions.

RE: Why mul(missval, 0) = 0 and ifthen(0, _) = missval ? - Added by Ralf Mueller about 14 years ago

I admit that this seems complex for a rather simple problem. I added your solution to an appropriate issue (#34).
Missing values are handled by nearly every operator, so the above conventions are built into many different parts of the code. I suggest adding:

  • an alias mechanism for operators (see #34), or
  • a special multiplication operator for that specific purpose

PS: Why mul(missval, 0) = 0 and ifthen(0, _) = missval ? - Added by Karl-Hermann Wieners about 14 years ago

Actually, there was another hitch: when I finally wanted to apply the 1/missval mask to file2, I could not use mul but had to use ifthen, because in some cases, file2 contained 0 where file1 contained missval, resulting in 0 for the outfile... Maybe you want to change #34 accordingly, Ralf :-)
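
The difference between the two ways of applying the mask can be sketched as follows. As before, this is a toy model in plain Python of the conventions described in this thread, not CDO's implementation; MISS, the helper functions, and the sample values are hypothetical.

```python
MISS = None  # hypothetical stand-in for missval

def mul(a, b):
    # convention discussed in this thread: a zero operand wins over missval
    if a == 0 or b == 0:
        return 0.0
    return MISS if a is MISS or b is MISS else a * b

def ifthen(cond, x):
    # missval and 0 in the condition both give missval
    return MISS if cond is MISS or cond == 0 else x

mask  = [MISS, 1.0, 1.0]   # 1/missval mask derived from file1
file2 = [0.0, 0.0, 2.5]    # land point happens to be initialized with 0

masked_mul    = [mul(m, v) for m, v in zip(mask, file2)]     # land point becomes 0
masked_ifthen = [ifthen(m, v) for m, v in zip(mask, file2)]  # land point stays missing

print(masked_mul)
print(masked_ifthen)
```

With mul, the masked land point comes out as 0 and cannot be told apart from the legal zero next to it; with ifthen it stays missing.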