2 Replies Latest reply: May 31, 2012 6:25 PM by Rob Wunderlich RSS

    problem with using previous and order by resident table

      I just discovered something interesting.  Please see attached file for reference.


      When I tried to do a order by in the same table as I am doing the previous, it breaks (new_table1).  I had to create a new sort table, sort the data first, and then de-dupe the data (new_table4).


      new_table1 shows that it doesn't work while new_table4 does work after I do a sort_table first.


      Why is that?


      Also, since I have to do 2 table loads using the previous function, which will be faster?  Using distinct or use the 2 tables and the previous function?

        • problem with using previous and order by resident table
          Oleg Troyansky

          Looks like a lot of trouble just to ensure that the rows are DISTINCT.


          1. You can add the keyword DISTINCT after the LOAD to get to the same result. Keep in mind, however, that the DISTINCT applies to all the fields in the row - i.e. if some of the non-key fields are different, the rows will be technically considered DISTINCT even though they key is the same.


          2. To ensure that the keys are unique, I'd rather load the same info using GROUP BY and min() or firstsortedvalue() for all non-key fields:



          Key1, Key2, Key3,

          min(Attr1) as Attr1,

          MinString(Attr2) as Attr2,




          group by

          Key1, Key2, Key3