Monday 14 October 2013

Teradata DISTINCT VS GROUP BY

Which is better DISTINCT or GROUP BY in Teradata?

     Till Teradata 12, we all knew that DISTINCT uses more spool since it picks the each row from ever amp and redistributes them to appropriate AMP then SORT the data to find the duplicates.(So the output has sorted output)

      Whereas GROUP BY uses "Aggregate" concept (AMP local grouping) to remove duplicates so consumes less spool(Output won’t be sorted, you need to add ORDER BY incase of sorted output)

      From Teradata 13/14, All DISTINCT will automatically rewritten to GROUP BY and apply some new algorithms developed by Teradata to get faster response… 

You can run a simple query for distinct and group by. You can observe they have same explain plan from TD 13

Explain SEL Distinct COL1 FROM DB.TABLE1; -- No Sorted output now
Explain SEL COL1 FROM DB.TABLE1 GROUP BY 1; -- Same explain as above

So use DISTINCT and GROUP BY for its created purpose, i.e. Distinct to find unique records and group by for analytic functions to give an aggregate result. don’t worry about performance or spool space issues any more. 


Happy Reading and blog is open for any queries/Clarification!!

8 comments:

  1. Thanks for Information Teradata is a Relational Database Management System (RDBMS) for the world’s largest commercial databases. Teradata can store data upto Teradata bytes in size. This makes the Teradata as a market leader in data warehousing applications. Teradata Online Training

    ReplyDelete
  2. Thanks for sharing the very informative content on Teradata

    ReplyDelete
  3. Thank you for sharing nice information.
    It is very useful to me and who are searching for Teradata online training

    ReplyDelete
  4. nice article thanks for sharing the post..!
    http://www.kitsonlinetrainings.com/vmware-online-training.html
    http://www.kitsonlinetrainings.com/azure-training.html
    http://www.kitsonlinetrainings.com/scom-training.html

    ReplyDelete
  5. nice article thanks for sharing the post..!
    http://www.kitsonlinetrainings.com/scom-2012-online-training.html
    http://www.kitsonlinetrainings.com/spark-online-training.html
    http://www.kitsonlinetrainings.com/teradata-online-training.html
    http://www.kitsonlinetrainings.com/testing-tools-online-training.html

    ReplyDelete
  6. nice article thanks for sharing the post..!
    http://www.kitsonlinetrainings.com/microsoft-azure-online-training.html
    http://www.kitsonlinetrainings.com/oracle-dba-online-training.html
    http://www.kitsonlinetrainings.com/oracle-soa-online-training.html
    http://www.kitsonlinetrainings.com/r-programming-online-course.html

    ReplyDelete

Monday 14 October 2013

Teradata DISTINCT VS GROUP BY

Which is better DISTINCT or GROUP BY in Teradata?

     Till Teradata 12, we all knew that DISTINCT uses more spool since it picks the each row from ever amp and redistributes them to appropriate AMP then SORT the data to find the duplicates.(So the output has sorted output)

      Whereas GROUP BY uses "Aggregate" concept (AMP local grouping) to remove duplicates so consumes less spool(Output won’t be sorted, you need to add ORDER BY incase of sorted output)

      From Teradata 13/14, All DISTINCT will automatically rewritten to GROUP BY and apply some new algorithms developed by Teradata to get faster response… 

You can run a simple query for distinct and group by. You can observe they have same explain plan from TD 13

Explain SEL Distinct COL1 FROM DB.TABLE1; -- No Sorted output now
Explain SEL COL1 FROM DB.TABLE1 GROUP BY 1; -- Same explain as above

So use DISTINCT and GROUP BY for its created purpose, i.e. Distinct to find unique records and group by for analytic functions to give an aggregate result. don’t worry about performance or spool space issues any more. 


Happy Reading and blog is open for any queries/Clarification!!

8 comments:

  1. Thanks for Information Teradata is a Relational Database Management System (RDBMS) for the world’s largest commercial databases. Teradata can store data upto Teradata bytes in size. This makes the Teradata as a market leader in data warehousing applications. Teradata Online Training

    ReplyDelete
  2. Thanks for sharing the very informative content on Teradata

    ReplyDelete
  3. Thank you for sharing nice information.
    It is very useful to me and who are searching for Teradata online training

    ReplyDelete
  4. nice article thanks for sharing the post..!
    http://www.kitsonlinetrainings.com/vmware-online-training.html
    http://www.kitsonlinetrainings.com/azure-training.html
    http://www.kitsonlinetrainings.com/scom-training.html

    ReplyDelete
  5. nice article thanks for sharing the post..!
    http://www.kitsonlinetrainings.com/scom-2012-online-training.html
    http://www.kitsonlinetrainings.com/spark-online-training.html
    http://www.kitsonlinetrainings.com/teradata-online-training.html
    http://www.kitsonlinetrainings.com/testing-tools-online-training.html

    ReplyDelete
  6. nice article thanks for sharing the post..!
    http://www.kitsonlinetrainings.com/microsoft-azure-online-training.html
    http://www.kitsonlinetrainings.com/oracle-dba-online-training.html
    http://www.kitsonlinetrainings.com/oracle-soa-online-training.html
    http://www.kitsonlinetrainings.com/r-programming-online-course.html

    ReplyDelete