tyrantdeath Posted August 9, 2011 Share Posted August 9, 2011 i would like to ask how to remove duplicate data on ireport SAMPLE DATADriver Destination Transportcarl new york car carl new jersey motorcyclecarl new york motorcycle SAMPLE OUTPUTDriver Destination Transportcarl new york car Link to comment Share on other sites More sharing options...
lukmanars Posted August 9, 2011 Share Posted August 9, 2011 i think your data isn't duplicate,it look like duplicate, but in column transportthere's data that make your data different from beforeDriver Destination Transportcarl new york motorcyclecarl new york car i think the problem is in your queries,regardsLukman Link to comment Share on other sites More sharing options...
tyrantdeath Posted August 9, 2011 Author Share Posted August 9, 2011 lukmanarsWrote: i think your data isn't duplicate, it look like duplicate, but in column transport there's data that make your data different from before Driver Destination Transport carl new york motorcycle carl new york car i think the problem is in your queries, regards Lukman but is there a way to have the output that way? Link to comment Share on other sites More sharing options...
cbarlow3 Posted August 11, 2011 Share Posted August 11, 2011 If you're going to define "duplicate" data as meaning that you want to suppress any rows that duplicate a value for a single column, then yes there's a way to do it...at least there is if your version of SQL supports "window functions" (DB2 and SQL Server do). Using your example, the goal is to only return one row maximum per driver...a strange sort of request, since you'll obviously be missing information about the other rows where that driver appears. Let me give you two solutions, actually. Both will assume that the field "driver" really is a unique key so two drivers named "carl" wont' be a problem (I happen to be sensitive to this, since MY name is Carl, and I happen to be the "third" (the son of Carl Jr....no, not THAT Carl Jr.!)).So my first solution supposes that there is another column called "destination_date" and that the reason you only want to return one row per driver is because you are looking for each driver's NEXT (earliest) destination. We'll also assume that the table only has destinations in the future so I don't have to introduce logic for skipping destinations that are in the past. This solution is simple:SELECT driver, destination, transport, destination_date FROM (SELECT driver, destination, transport, destination_date, min(destination_date) over (partition by driver) as min_date FROM MyTable) where destination_date=min_dateMy second solution supposes that you don't conveniently have a logical method of choosing which "duplicate" record to display and that you arbitrarily going to pick one based on whatever order the rows are returned. In this case, you make up TWO fields rather than one: one is a simple row_number() function and the other is again the MIN() of that row_function over the partition of driver:SELECT driver, destination, transport FROM (SELECT driver, destination, transport, row_number() over (order by driver) as rownum, min(row_number() over (order by driver)) over (partition by driver) as minrow FROM MyTable) where rownum=minrowAs you see, either solution depends on the ability to use the "window" function OVER. Reworking the first example to use a subquery instead of windowing is a lot messier:SELECT driver, destination, transport, destination_dateFROM MyTable INNER JOIN(SELECT driver, MIN(destination_date) AS min_dateFROM MyTableGROUP BY driver) x ONMyTable.dirver=x.driverWHERE destination_date=x.min_date I hope that helps! Carl (III) Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now