Dependencies have played a significant role in database design for many years. They have also been shown to be useful in query optimization. In this paper, we discuss a new type of dependency for polarized lexicographically ordered sets of tuples. We formally introduce the concept of polarized order dependencies (PODs). We discuss their potential significance for database systems, and present a chase procedure for testing logical implication for them.

1 Introduction

Consider the SQL query in Example 1. In the schema, Dates is a dimension table with a row per day, and Sales is a large fact table recording all individual sales. The column date_id is the primary key for Dates. Each row describes a given day, with explicit columns such as year, quarter, month, and day that describe the natural date values.
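To make the schema described above concrete, the following is a minimal sketch in Python using an in-memory SQLite database. Only the table names Dates and Sales and the columns date_id, year, quarter, month, and day come from the text. The column types, the Sales columns sale_id and amount, the helper names build_schema and load_dates, and the use of consecutive integers for date_id are assumptions made for illustration, as is the choice of calendar quarters.

```python
import sqlite3
from datetime import date, timedelta


def build_schema(conn):
    """Create the two tables. Types and the Sales columns are assumed."""
    conn.executescript(
        """
        CREATE TABLE Dates (
            date_id INTEGER PRIMARY KEY,  -- one row per day
            year    INTEGER NOT NULL,
            quarter INTEGER NOT NULL,
            month   INTEGER NOT NULL,
            day     INTEGER NOT NULL
        );
        CREATE TABLE Sales (
            sale_id INTEGER PRIMARY KEY,  -- hypothetical column
            date_id INTEGER NOT NULL REFERENCES Dates(date_id),
            amount  REAL NOT NULL         -- hypothetical column
        );
        """
    )


def load_dates(conn, start, days):
    """Insert one Dates row per day, starting at `start`."""
    rows = []
    for i in range(days):
        d = start + timedelta(days=i)
        # Assumption: calendar quarters, so months 1-3 map to quarter 1,
        # months 4-6 to quarter 2, and so on.
        quarter = (d.month - 1) // 3 + 1
        rows.append((i + 1, d.year, quarter, d.month, d.day))
    conn.executemany("INSERT INTO Dates VALUES (?, ?, ?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
build_schema(conn)
load_dates(conn, date(2023, 1, 1), 365)
```

In this sketch quarter is stored as an integer, so its natural order agrees with the calendar order of the months it covers.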

Of course, quarter is logically redundant in the group by, as month (which follows it in the group by) functionally determines quarter. (The first quarter encompasses the months of January, February, and March; the second quarter, the months of April, May, and June; and so forth.) The query's author could not leave quarter out of the group by, because it appears in the select. The query optimizer could, however, remove quarter to accomplish the group by on year, quarter, month, sales if it recognizes that year, month and year, quarter, month offer the same partition. Query optimizers do this today by rewrite [16], provided that the functional dependency (FD) month → quarter is available to the optimizer. For the query above, however, the rewrite might still not be applied, since the query also specifies that the answers be ordered by year asc, quarter asc, month asc, sales desc. The FD month → quarter is not logically sufficient to eliminate quarter from the order by, as it was to eliminate it from the group by. To see why, imagine that the values for quarter were the strings first, second, third, and fourth. The data would then be lexicographically ordered as first, fourth, second, then third! Of course, we intend that the values of quarter are, say, 1, 2, 3, and 4, so that the data would order naturally by date. It is unfortunate, then, that quarter is, in fact, redundant (in this query) in the order by also, but that the optimizer does …