• Login
    View Item 
    •   etd@IISc
    • Division of Electrical, Electronics, and Computer Science (EECS)
    • Computer Science and Automation (CSA)
    • View Item
    •   etd@IISc
    • Division of Electrical, Electronics, and Computer Science (EECS)
    • Computer Science and Automation (CSA)
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Knowledge-based preanalysis for multilevel clustering

    View/Open
    T02949.pdf (24.26Mb)
    Author
    Suresh Babu, V S S
    Metadata
    Show full item record
    Abstract
    Multilevel clustering offers the cluster analyst flexibility of choosing different algorithms at different levels with the possibility of reducing overall computational effort in comparison with the single-level application of a clustering algorithm. Even if the same algorithm is used at different levels, computational savings may occur because of the division of data at any level into blocks for applying the algorithm on individual blocks separately and collecting the samples from the clusters to form data for the next level. If the computer's main memory is a limitation for the use of a versatile algorithm like hierarchical agglomerative clustering, multilevel approach becomes inescapable. In the first part of this thesis, the computational properties of two-level clustering algorithms are analysed. In particular, it is shown that in the case of two-level hierarchical algorithm, solution of a cubic equation together with a simple rounding procedure gives optimal number of blocks at the first level to minimize the number of similarity computations. A more general formulation of multilevel clustering which allows for unequal size blocks is introduced. And a constructive proof for the existence of an optimal scheme of division is given. In addition to the Prolog program for solving the cubic equation, a Prolog program which calculates the order of time-complexity for hybrid algorithm is given. It is also proved that two-level hierarchical clustering is optimal if and only if the number of data items is a product of two prime numbers. In the second part, the problem of optimal number of levels for a multilevel clustering algorithm is solved by using horn clause specification of the optimization logic. The advantage of horn clause specification is its closeness to a working Prolog program. From the results of the Prolog program, a regularity is observed and a theorem is proved that the observed regularity is true in all cases. Similar theorem is proved for the multilevel hierarchical algorithm with unequal size blocks restricted the following way: the number of blocks at any level is less than or equal to half of the number of blocks at the previous level. Pre-analysis for multilevel clustering involves not only the choices of the number of blocks at each level but also qualitative reasoning about the appropriateness of algorithms. Qualitative reasoning must be modeled in computer in order to make a knowledge-based approach to preprocessing for multilevel clustering possible. A rule interpreter is built in Prolog using forward chaining. The rule interpreter coupled with the horn clause specification of general framework for algorithm selection forms a prototype knowledge system. Applicability to clustering algorithm selection is illustrated by means of an example. Possibilities of extending this work and suggestions for improvement to overcome the limitations of this study are also given.
    URI
    https://etd.iisc.ac.in/handle/2005/7323
    Collections
    • Computer Science and Automation (CSA) [516]

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace
    Contact Us | Send Feedback | Thesis Templates
    Theme by 
    Atmire NV
     

     

    Browse

    All of etd@IIScCommunities & CollectionsTitlesAuthorsAdvisorsSubjectsBy Thesis Submission DateThis CollectionTitlesAuthorsAdvisorsSubjectsBy Thesis Submission Date

    My Account

    LoginRegister

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace
    Contact Us | Send Feedback | Thesis Templates
    Theme by 
    Atmire NV