
Matrix/

cmset_op.pro

Performs an AND, OR, or XOR operation between two sets.

Description: CMSET_OP performs three common operations between two sets. The three supported functions of OP are:

   OP      Meaning
   'AND'   find the intersection of A and B;
   'OR'    find the union of A and B;
   'XOR'   find those elements that are members of A or B but not both.

A set as defined here is a one-dimensional array of numeric or string type. Comparisons of equality between elements are done using the IDL EQ operator.

The complement of either set can be taken as well, by using the NOT1 and NOT2 keywords. For example, it may be desirable to find the elements in A but not B, or in B but not A (they are different!). The following IDL expressions achieve each of those effects:

   SET = CMSET_OP(A, 'AND', /NOT2, B)   ; A but not B
   SET = CMSET_OP(/NOT1, A, 'AND', B)   ; B but not A

Note the distinction between NOT1 and NOT2. NOT1 refers to the first set (A) and NOT2 refers to the second (B). Their ordered placement in the calling sequence is entirely optional, but the above ordering makes the logical meaning explicit.

NOT1 and NOT2 can only be set for the 'AND' operator, and never simultaneously. This is because an operation with 'OR' or 'XOR' and any combination of NOTs (or with 'AND' and both NOTs) formally cannot produce a defined result.

The implementation depends on the type of the operands. For integer types, a fast technique using HISTOGRAM is used. However, this algorithm becomes inefficient when the dynamic range of the data is large. For those cases, and for other data types, a technique based on SORT() is used. Thus the compute time should scale roughly as (A+B)*ALOG(A+B) or better, rather than (A*B) for the brute force approach. For large arrays this is a significant benefit.
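The SORT()-based idea for the 'AND' case can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in cmset_op.pro: the function name INTERSECT_SKETCH is hypothetical, both inputs are assumed to be non-empty arrays of the same type, and the NOT, EMPTY, and INDEX keywords are not handled.

```idl
; Hypothetical helper, for illustration only (not part of cmset_op.pro).
function intersect_sketch, a, b, count=ct
  ; Reduce each input to its unique values.  UNIQ expects sorted order,
  ; which its second argument, SORT(...), supplies.
  ua = a[uniq(a, sort(a))]
  ub = b[uniq(b, sort(b))]

  ; Sort the combined array.  Each value now appears at most twice,
  ; and it appears twice only if it is present in both A and B.
  c = [ua, ub]
  c = c[sort(c)]
  n = n_elements(c)            ; n >= 2 because both inputs are non-empty

  ; Compare each element with its successor, without wrap-around.
  wh = where(c[0:n-2] EQ c[1:n-1], ct)
  if ct EQ 0 then return, -1L  ; empty set: COUNT is zero
  return, c[wh]
end
```

The two sorts dominate the cost, which is where the roughly N*ALOG(N) scaling comes from. Comparing neighbours without wrap-around keeps the case of two single-element, equal inputs from being reported twice.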

cmset_op Array

result = cmset_op(a, op0, b, NOT1=NOT1, NOT2=NOT2, COUNT=COUNT, EMPTY1=EMPTY1, EMPTY2=EMPTY2, MAXARRAY=MAXARRAY, INDEX=INDEX)
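A short usage sketch of the calling sequence, assuming cmset_op.pro is on the IDL path. The variable names and input values are illustrative; the comments describe the contents of each result as a set, since element order may not be preserved.

```idl
a = [1, 2, 3, 4, 5]
b = [4, 5, 6, 7]

both   = cmset_op(a, 'AND', b, count=nboth)   ; intersection: the values 4 and 5
either = cmset_op(a, 'OR',  b)                ; union: the values 1 through 7
oneof  = cmset_op(a, 'XOR', b)                ; in exactly one set: 1, 2, 3, 6, 7

a_only = cmset_op(a, 'AND', /NOT2, b)         ; in A but not B: 1, 2, 3
b_only = cmset_op(/NOT1, a, 'AND', b)         ; in B but not A: 6, 7
```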

Return value

The resulting set as a one-dimensional array. The set may be represented by either an array of data values (default), or an array of indexes (if INDEX is set). Duplicate elements, if any, are removed, and element order may not be preserved.

The empty set is represented as a return value of -1L, and COUNT is set to zero. Note that the only way to recognize the empty set is to examine COUNT, since -1 can also be a legitimate element of a non-empty result.

SEE ALSO: SET_UTILS.PRO by RSI
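The following sketch shows why COUNT, rather than the return value, should be tested. The inputs are illustrative, and the values in the comments follow from the description above.

```idl
; A non-empty result whose only element happens to be -1.
r1 = cmset_op([-1, 0, 1], 'AND', [-1, 5], count=n1)   ; n1 is 1

; A genuinely empty intersection.
r2 = cmset_op([0, 1], 'AND', [5, 6], count=n2)        ; r2 is -1L, n2 is 0

; Both results hold the value -1, so test COUNT instead.
if n2 EQ 0 then print, 'The intersection is empty'
```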

Parameters

a        in required

The two sets to be operated on. Each is a one-dimensional array of either numeric or string type. A and B must be of the same type (the version history notes that mixed but compatible types have been supported since 05 Feb 2005). Empty sets are permitted, and are represented either by an undefined variable or by setting EMPTY1 or EMPTY2.

op0        in required type: string

A string giving the operation to be performed. Must be one of 'AND', 'OR' or 'XOR' (lower or mixed case is permitted). Any other operation produces an error message.

b        in required

See A

Keywords

NOT1       

If set and OP is 'AND', then the complement of A (for NOT1) or B (for NOT2) will be used in the operation. NOT1 and NOT2 cannot be set simultaneously.

NOT2       

See NOT1

COUNT       

Upon return, the number of elements in the result set. This is only important when the result set is the empty set, in which case COUNT is set to zero.

EMPTY1       

If set, then A (for EMPTY1) or B (for EMPTY2) is assumed to be the empty set. The actual value passed as A or B is then ignored.

EMPTY2       

See EMPTY1
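As an illustration of the EMPTY keywords, the sketch below passes a placeholder value for B and marks it as empty. The placeholder 0 and the variable names are arbitrary choices; the results in the comments follow from ordinary set algebra with an empty second set.

```idl
a = [3, 1, 3, 2]

; Union with the empty set: the unique elements of A (1, 2, 3), so nu is 3.
u = cmset_op(a, 'OR', 0, /EMPTY2, count=nu)

; Intersection with the empty set: always empty, so nx is 0.
x = cmset_op(a, 'AND', 0, /EMPTY2, count=nx)
```

This can be convenient when accumulating a union in a loop, where the running result starts out empty.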

MAXARRAY       

INDEX       

If set, then return a list of indexes instead of the array values themselves. The "slower" set operations are always performed in this case. The indexes refer to the *combined* array [A,B]. To clarify, after the call I = CMSET_OP(..., /INDEX), returned values from 0 to NA-1 refer to A[I], and values from NA to NA+NB-1 refer to B[I-NA], where NA and NB are the numbers of elements in A and B.
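A sketch of how the returned indexes might be mapped back to data values, written as a main-level program. The variable names are illustrative, and which of two equal elements an index points to is not specified here, so the code makes no assumption about it.

```idl
a  = [10, 20, 30]
b  = [30, 40]
na = n_elements(a)

i = cmset_op(a, 'OR', b, /INDEX, count=n)

if n GT 0 then begin
  ; Simplest route: index the combined array directly.
  ab     = [a, b]
  values = ab[i]

  ; Alternatively, split the indexes by their array of origin.
  from_a = where(i LT na, n_from_a)
  from_b = where(i GE na, n_from_b)
  if n_from_a GT 0 then print, 'From A: ', a[i[from_a]]
  if n_from_b GT 0 then print, 'From B: ', b[i[from_b] - na]
endif
end
```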

Examples

 Utility function, similar to UNIQ, but allowing a choice of taking the
 first or last unique element, or the non-unique elements.
 Unfortunately this does not work reliably, because the behavior of the
 SORT() function is implementation dependent.

 function cmset_op_uniq, a, first=first, non=non, count=ct, sort=sortit
   ;; Zero or one element: nothing to compare
   if n_elements(a) LE 1 then return, 0L
   ;; Shift direction: +1 compares each element with its predecessor,
   ;; -1 with its successor; setting NON reverses the direction
   sh = (2L*keyword_set(first)-1L)*(-2L*keyword_set(non)+1L)

   if keyword_set(sortit) then begin
       ;; Sort it manually
       ii = sort(a) & b = a[ii]
       if keyword_set(non) then wh = where(b EQ shift(b, sh), ct) $
       else                     wh = where(b NE shift(b, sh), ct)
       ;; Map positions in the sorted array back to indexes into A
       if ct GT 0 then return, ii[wh]
   endif else begin
       ;; Use the user's values directly
       if keyword_set(non) then wh = where(a EQ shift(a, sh), ct) $
       else                     wh = where(a NE shift(a, sh), ct)
       if ct GT 0 then return, wh
   endelse

   ;; No matches found: fall back to the first or last index
   if keyword_set(first) then return, 0L else return, n_elements(a)-1
 end

 Simplified version of CMSET_OP_UNIQ which sorts, and takes the
 "first" value, whatever that may mean.
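The code for the simplified version is not reproduced on this page. The sketch below shows what such a helper could look like under stated assumptions; the function name UNIQ_FIRST_SKETCH and its COUNT keyword are hypothetical, and this is not the routine shipped in cmset_op.pro.

```idl
; Hypothetical helper, for illustration only.
; Returns indexes into A of one representative of each distinct value.
function uniq_first_sketch, a, count=ct
  ct = n_elements(a)
  if ct LE 1 then return, 0L

  ; Sort, then mark every element that differs from its predecessor.
  ii = sort(a)
  b  = a[ii]
  wh = where(b NE shift(b, 1), ct)
  if ct GT 0 then return, ii[wh]

  ; All elements are equal: a single representative remains.
  ct = 1L
  return, ii[0]
end
```

SHIFT wraps around, so the first sorted element is compared with the last; the two differ whenever the array holds more than one distinct value, and the remaining case is handled explicitly. Which of several equal elements counts as "first" depends on how SORT() orders ties.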

    

Version history

Version

$Id: cmset_op.pro 325 2007-12-06 10:04:53Z pinsard $

History

Written, CM, 23 Feb 2000
Added empty set capability, CM, 25 Feb 2000
Documentation clarification, CM 02 Mar 2000
Incompatible but more consistent reworking of EMPTY keywords, CM, 04 Mar 2000
Minor documentation clarifications, CM, 26 Mar 2000
Corrected bug in empty_arg special case, CM 06 Apr 2000
Add INDEX keyword, CM 31 Jul 2000
Clarify INDEX keyword documentation, CM 06 Sep 2000
Made INDEX keyword always force SLOW_SET_OP, CM 06 Sep 2000
Added CMSET_OP_UNIQ, and ability to select FIRST_UNIQUE or LAST_UNIQUE values, CM, 18 Sep 2000
Removed FIRST_UNIQUE and LAST_UNIQUE, and streamlined CMSET_OP_UNIQ until problems with SORT can be understood, CM, 20 Sep 2000 (thanks to Ben Tupper)
Still trying to get documentation of INDEX and NOT right, CM, 28 Sep 2000 (no code changes)
Correct bug for AND case, when input sets A and B each only have one unique value, and the values are equal. CM, 04 Mar 2004 (thanks to James B. jbattat at cfa dot harvard dot edu)
Add support for the cases where the input data types are mixed, but still compatible; also, attempt to return the same data type that was passed in; CM, 05 Feb 2005
Fix bug in type checking (thanks to "marit"), CM, 10 Dec 2005
Work around a stupidity in the built-in IDL HISTOGRAM routine, which tries to "help" you by restricting the MIN/MAX to the range of the input variable (thanks to Will Maddox), CM, 16 Jan 2006

Author: Craig B. Markwardt, NASA/GSFC Code 662, Greenbelt, MD 20770
craigm@lheamail.gsfc.nasa.gov

 


  Produced by IDLdoc 2.0.