dc.contributor.advisor: Haritsa, Jayant
dc.contributor.author: Ramanath, Maya
dc.date.accessioned: 2025-10-07T10:51:53Z
dc.date.available: 2025-10-07T10:51:53Z
dc.date.submitted: 2000
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/7146
dc.description.abstract: The Web, with its vast, heterogeneous, and dynamic content, poses significant challenges for applying classical database technologies. The lack of structure, the presence of hyperlinks, and the absence of centralized control make traditional data modeling, querying, and processing approaches inadequate. This thesis presents DIASPORA, a web database system designed to address these challenges through an integrated solution encompassing data modeling, query language design, and distributed query processing. DIASPORA introduces a graph-based data model that captures both the content and hyperlink structure of web documents. It supports traditional formats like HTML and emerging semantic formats like XML. The model automatically infers semantic relationships using markup tags and element values, enabling fully automatic graph construction. A declarative query language allows users to specify keyword-based hints and hyperlink predicates, facilitating both content-based and structure-aware querying. DIASPORA’s most novel feature is its fully distributed query processing mechanism, which contrasts with conventional centralized approaches. Queries are shipped across web nodes, processed locally, and results are returned without requiring coordination from a master site. The system addresses key challenges in distributed query processing, including query completion, rewriting, termination, and result transmission. A Java-based prototype has been implemented and tested on IISc campus websites, demonstrating significant improvements in query quality and resource efficiency. DIASPORA is positioned to support a wide range of web applications, including search engine indexing, site mapping, and fine-grained querying of XML documents. Its distributed architecture also opens avenues for mining user queries to enhance public and commercial web services.
dc.language.iso: en_US
dc.relation.ispartofseries: T04749
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
dc.subject: Distributed Query Processing
dc.subject: Hyperlink Structure
dc.subject: Declarative Query Language
dc.title: DIASPORA : A fully distributed web-query processing system
dc.type: Thesis
dc.degree.level: MSc Engg
dc.degree.level: Masters
dc.degree.grantor: Indian Institute of Science
dc.degree.discipline: Engineering

