Show simple item record

dc.contributor.authorBruhathi, H S
dc.date.accessioned2025-10-30T10:39:55Z
dc.date.available2025-10-30T10:39:55Z
dc.date.submitted2012
dc.identifier.urihttps://etd.iisc.ac.in/handle/2005/7259
dc.description.abstractDatabase Management Systems (DBMS) are popular software packages for storing and managing enterprise data. Modern DBMS typically store data in the form of tables, called relations, and query the information using the Structured Query Language (SQL), which is declarative in nature. The number of candidate strategies, called “plans”, by which the query can be processed, is exponential in the number of relations. Database systems therefore incorporate a “query optimizer” module, which explores this exponential space using dynamic programming techniques, to find the best query execution strategy. Query optimizers are especially critical for efficiently processing the complex queries that characterize current decision-support applications. In recent times, a fresh perspective on the behavior of query optimizers has been introduced through the concept of “plan diagrams”. A plan diagram is a color-coded pictorial enumeration of the optimizer’s plan choices over a query parameter space. In this thesis, we investigate a variety of issues related to the semantics of plan diagrams, covering the structural characteristics of parametric-optimal plans, the diagram coloring paradigm and the underlying database model. Our specific contributions are the following: The join order of an execution plan tree of a query is the order in which its relations are joined and finding the optimal join order is one of the most crucial tasks in query optimization, with any savings in this front being extremely beneficial. In this thesis, as our first contribution, we develop and discuss a structure-based reduction scheme as a potent alternative to the popular cost-based reduction schemes that have been explored in the past. In this new reduction scheme, we merge all plans that have identical join orders into a single entity, leaving us with a diagram that has as many colors as there are unique join orders. We discuss join order caching as an immediate application of structure-based reduction that provides the “best” plan at every point along with savings in query optimization time, when the join order cardinality is low or moderate. While experimental results indicate that the resulting join order cardinalities are low or moderate in many cases, we also encounter situations wherein the cardinality is high. For the latter case, we present the SRE (Small Relation Elimination) heuristic, that removes small-sized relations in the join orders to bring down the join order cardinality. Even though the plan optimality is no longer guaranteed now, experimental evaluation of the heuristic shows that the cost-increase of any query point is within acceptable limits. Finally, we study the occurrence of common join orders in the plan diagrams produced in two different engines. Experimental results show that this intersection is surprisingly very sparse. Structure-based reduction provides an overview of the differences between plans in a plan diagram. As our second contribution, we investigate a deeper comparison between plans that is based on their complete plan tree structures. Specifically, we semantically color the plan diagrams such that the differences in colors between any pair of plans reflect the differences in their structures. With this approach, the plan diagram intrinsically reflects the diversity in the parametric-optimal space. The challenges here include assessing plan differences, and developing transformation techniques for accurately representing these differences in the three-dimensional color model. In particular, we adapt Kruskal’s Iterative Steepest Descent multidimensional scaling technique, and test its representational quality through coloring a rich diversity of plan diagrams on benchmark database environments over a suite of commercial database engines. Our experimental results indicate that the vast majority of plan distances can be represented with satisfactory visual accuracy. Given this, we found that, in most of the plan diagrams, more than half the space is colored with shades of the same color, implying that large areas of plan diagrams are occupied by plan trees that are structurally similar. As our last contribution, we reengineer the plan diagram notion to the flexibly structured world of Extensible Markup Language (XML), which is the de facto standard for information transfer on the Internet. Since XML queries are founded on monadic second-order logic, as opposed to the first-order logic of relational DBMS, they possess significantly more expressive power than their relational counterparts, in the process exacerbating the query optimization challenge. We analyze, through the production of XML plan diagrams, the behavior of a hybrid commercial optimizer that optimizes both SQL and XML queries in a unified framework. Our experiments cover a variety of XML database benchmarks, including TPoX, XBench and TPCH_X (XML version of the TPC-H benchmark), using a set of complex XQuery templates. The results indicate that XML plan diagrams often feature complicated plan geometries that remain in the plan diagram even after reduction. In fact, for both TPoX and TPCH_X, an extremely high cost-increase threshold is required in order to remove the complex plan spatial layouts from the plan diagram. Also, we found that more than three-quarters of the plans were structurally very similar, implying that the optimizer makes extremely fine-grained choices with small changes in input parameters. All the three semantic features discussed above have been incorporated in the publicly available Picasso optimizer visualization tool.
dc.language.isoen_US
dc.relation.ispartofseriesT07564
dc.rightsI grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation
dc.subjectStructure-Based Reduction
dc.subjectSmall Relation Elimination
dc.subjectXML Query Processing
dc.titleExploring the Semantics of Plan Diagrams
dc.degree.nameMSc Engg
dc.degree.levelMasters
dc.degree.grantorIndian Institute of Science
dc.degree.disciplineEngineering


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record