Welcome to my PhD topics web page! If you are interested in working on your PhD with me, please contact me by email (mschneid@cise.ufl.edu) to make an appointment. You will find information on the following subjects:
Required knowledge: databases in general; programming skills in C++, Oracle 10g
Desired (but not required) knowledge: spatial databases, computational geometry
Introductory Literature:
Ralf Hartmut Güting. An Introduction to Spatial Database Systems. VLDB Journal (Special Issue on Spatial Database Systems), 7(3), 231-246, 1994. [pdf]
Markus Schneider. Spatial Data Types for Database Systems - Finite Resolution Geometry for Geographic Information Systems. LNCS 1288, Springer-Verlag, 1997.
Markus Schneider & Brian Weinrich. An Abstract Model of Three-Dimensional Spatial Data Types. 12th ACM Int. Symp. on Advances in Geographic Information Systems (ACM GIS), 67-72, 2004. [pdf]
Abstract. Spatial database systems and geographical information systems (GIS) are currently only able to represent and manage crisp, determinate spatial objects, that is, spatial objects which have sharply defined boundaries and whose extent is precisely known. Examples are mainly man-made objects like land parcels, states, school districts, and canals. But geographical reality reveals that the boundaries and extent of most spatial objects cannot be precisely determined. Examples are land features such as population density, soil quality, vegetation, oceans, biotopes, deserts, an English-speaking area, clouds, and sandbanks. A possible approach to modeling this kind of indeterminate spatial object is to apply fuzzy set theory. We obtain fuzzy spatial data types like fuzzy points, fuzzy lines, and fuzzy regions. The topic of this PhD project is to design such data types together with a comprehensive collection of operations and predicates defined on them, to develop implementation concepts for them, and to integrate them into database management systems and their query languages.
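To make the idea of a fuzzy region concrete, the following is a minimal sketch, not the design this project asks for. It assumes a region is stored as a mapping from raster cells to membership degrees in [0, 1], and it uses the standard max/min operators of fuzzy set theory for union and intersection. The class name `FuzzyRegion`, the raster representation, and the sample data are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class FuzzyRegion:
    """Hypothetical fuzzy region: raster cell (x, y) -> membership degree in [0, 1]."""
    membership: dict

    def degree(self, cell):
        # Cells that are not stored do not belong to the region at all.
        return self.membership.get(cell, 0.0)

    def union(self, other):
        # Standard fuzzy union: pointwise maximum of the membership degrees.
        cells = set(self.membership) | set(other.membership)
        return FuzzyRegion({c: max(self.degree(c), other.degree(c)) for c in cells})

    def intersection(self, other):
        # Standard fuzzy intersection: pointwise minimum of the membership degrees.
        cells = set(self.membership) & set(other.membership)
        return FuzzyRegion({c: min(self.degree(c), other.degree(c)) for c in cells})

    def alpha_cut(self, alpha):
        # Crisp set of cells whose membership degree is at least alpha.
        return {c for c, d in self.membership.items() if d >= alpha}


# Made-up sample data: a desert that gradually turns into a steppe.
desert = FuzzyRegion({(0, 0): 1.0, (1, 0): 0.6, (2, 0): 0.2})
steppe = FuzzyRegion({(1, 0): 0.3, (2, 0): 0.9, (3, 0): 1.0})

either = desert.union(steppe)
transition = desert.intersection(steppe)
core_desert = desert.alpha_cut(0.5)
```

The alpha-cut shows one way a fuzzy region can be related back to a crisp one: every threshold yields an ordinary set of cells.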
Required knowledge: databases in general; programming skills in C++, Oracle 10g
Desired (but not required) knowledge: spatial databases, fuzzy set theory
Introductory Literature:
Markus Schneider. Uncertainty Management for Spatial Data in Databases: Fuzzy Spatial Data Types. 6th Int. Symp. on Advances in Spatial Databases (SSD), LNCS 1651, Springer-Verlag, 330-351, 1999. [pdf]
Markus Schneider. Finite Resolution Crisp and Fuzzy Spatial Objects. 9th Int. Symp. on Spatial Data Handling (SDH), 5a.3-17, 2000. [pdf]
Markus Schneider. Metric Operations on Fuzzy Spatial Objects in Databases. 8th ACM Symp. on Geographic Information Systems (ACM GIS), 21-26, 2000. [pdf]
Markus Schneider. Fuzzy Topological Predicates, Their Properties, and Their Integration into Query Languages. 9th ACM Symp. on Geographic Information Systems (ACM GIS), 9-14, 2001. [pdf]
Markus Schneider. A Design of Topological Predicates for Complex Crisp and Fuzzy Regions. 20th Int. Conf. on Conceptual Modeling (ER), 103-116, 2001. [pdf]
Markus Schneider. Fuzzy Spatial Data Types and Predicates: Their Definition and Integration into Query Languages. Spatio-Temporal Databases: Flexible Querying and Reasoning. Springer-Verlag, 265-293, 2004. [pdf] [Springer] [Amazon]
Abstract. Spatial graphs are an important spatial concept in maps; they represent, for example, road networks, railway networks, and power networks. This PhD project has three goals. First, the design of an abstract model should give a definition of spatial graphs and their properties and further identify the most important operations and predicates on them. An example of an important operation on spatial graphs is finding the shortest path from a source to a destination. Such a model will be based on mathematical concepts like point set theory, point set topology, graph theory, and functions. Second, since the abstract model is based on infinite point sets and functions and cannot be directly implemented, a discrete model is needed that yields finite representations for the infinite concepts of the abstract model and algorithms for the operations and predicates of the abstract model. Third, the ultimate goal is to incorporate spatial graphs into database systems and their query languages.
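The shortest-path operation mentioned above can be illustrated with a small sketch at the level of a discrete model. It assumes a spatial graph is given as nodes with planar coordinates and undirected straight-line edges whose weight is their Euclidean length, and it uses Dijkstra's algorithm as one well-known choice. The function name `shortest_path` and the sample network are illustrative assumptions, not part of the project description.

```python
import heapq
import math


def shortest_path(coords, edges, source, target):
    """Dijkstra's algorithm on an undirected graph with Euclidean edge lengths."""
    adjacency = {node: [] for node in coords}
    for a, b in edges:
        length = math.dist(coords[a], coords[b])
        adjacency[a].append((b, length))
        adjacency[b].append((a, length))

    best = {source: 0.0}       # shortest known distance per node
    previous = {}              # predecessor on the shortest known path
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was found already
        for neighbor, length in adjacency[node]:
            candidate = dist + length
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))

    if target not in best:
        return None, math.inf  # target is not reachable from source
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[target]


# Made-up road network: four junctions with planar coordinates.
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 3)}
edges = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]
path, length = shortest_path(coords, edges, "A", "C")
```

A real spatial graph model would also have to cover curved edge geometry and edges that are only partially traversed, which this sketch ignores.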
Required knowledge: databases in general; programming skills in C++, Oracle 10g
Desired (but not required) knowledge: spatial databases, graph theory
Introductory Literature:
(none; little is available, so it has to be searched for)
Abstract. Research in data warehousing and on-line analytical processing (OLAP) has produced important technologies for the design, management, and use of information systems for decision support. However, despite the continued success and maturing of the field, much work remains to be done. Given the wealth of models, terminology, and definitions, the first task is to review the most important models and their treatment of the basic concepts including the notions “dimension”, “fact”, “hierarchy”, “data cube”, and many more. The intent should be to evaluate existing models based on their expressiveness, flexibility, separation of modeling aspects from implementation aspects, etc. With the knowledge gained, the second task is to develop a comprehensive conceptual model adapted to the users’ needs and abstracting from implementation aspects. The third task is to identify existing OLAP operators, to get an overview of their capabilities, and to learn how they can be used to manipulate multi-dimensional data (e.g., cube, roll-up, drill-across). The fourth task is to define these OLAP operators, and perhaps new ones that have not been considered so far, on the basis of the model designed in task 2. The fifth task is to identify new applications for data warehousing and OLAP in the spatial and spatio-temporal domain and to extend the model of task 2 correspondingly. The impact of new, advanced, and non-standard data types on the data warehousing concepts has to be explored. Also, new OLAP operations have to be identified and formally defined. The sixth task is to implement the complete model as a data warehouse extension package that can be integrated as a cartridge, datablade, or extender into Oracle, DB2, and Informix.
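As a small illustration of one of the OLAP operators named above, the following sketch shows a roll-up along a dimension hierarchy. It assumes a fact table held as a list of rows and a hierarchy given as a simple mapping from a finer level (city) to a coarser level (country). All names and numbers are made up, and the sketch aggregates away every other dimension, which a full operator definition would not necessarily do.

```python
from collections import defaultdict

# Hypothetical fact table: one row per (city, month) with a sales measure.
facts = [
    {"city": "Gainesville", "month": "2004-01", "sales": 120},
    {"city": "Gainesville", "month": "2004-02", "sales": 80},
    {"city": "Miami", "month": "2004-01", "sales": 200},
    {"city": "Hagen", "month": "2004-01", "sales": 50},
]

# Hypothetical dimension hierarchy: city -> country.
city_to_country = {"Gainesville": "USA", "Miami": "USA", "Hagen": "Germany"}


def roll_up(rows, dimension, hierarchy, measure):
    """Sum a measure while moving one dimension to the next coarser level."""
    totals = defaultdict(int)
    for row in rows:
        coarser_value = hierarchy[row[dimension]]
        totals[coarser_value] += row[measure]
    return dict(totals)


sales_by_country = roll_up(facts, "city", city_to_country, "sales")
```

In the spatial setting of task 5, the hierarchy mapping could be replaced by a spatial containment test between geometries, which is where new data types begin to affect the operator definitions.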
Required knowledge: databases in general; programming skills in C++, Oracle 10g
Desired (but not required) knowledge: data warehouses in general, spatial databases, spatio-temporal databases, computational geometry
Introductory Literature:
Torben B. Pedersen, Christian S. Jensen & Curtis E. Dyreson. A Foundation for Capturing and Querying Complex Multidimensional Data. Information Systems, 26(5), 383-423, 2001. [pdf]
Maurizio Rafanelli (editor). Multidimensional Databases: Problems and Solutions. Idea Group Publishing, 2003.
Abstract. Multimedia database systems are of interest in many application areas which deal with video, image, audio, text, or graphic data, or any kind of mixture of them. The goal of this topic is to focus exclusively on the image part. We then call these systems image database systems. Images are of particular interest in many applications since they allow the visual transport of large volumes of information in a compact manner. Although a large body of knowledge about images exists from a processing standpoint in disciplines like computer graphics, computer vision, and image processing, a study of the literature reveals that not much is known about the conceptual view the user has or should have on image database systems. Simply collecting images in a database and enabling users to browse them does not justify the use of a database system. Some central questions are: What kind of interface should these systems provide to the user? What kind of query languages should be made available? What are the central operations on images? How can images be represented so that they can support the identified operations in an efficient way? Are formats like JPEG, TIFF, and many others appropriate for this purpose? Our idea is to incorporate the answers to all these questions into a type system (we call it an algebra) for images. That is, the first task is to design and implement AN IMage ALgebra called ANIMAL, which provides data types and operations for images. The next step is then to integrate these types and operations into an extensible database system (like Oracle, DB2, or others) and its query language, and thus to create an image database system. Later, extensions are conceivable with respect to image indexing and information retrieval.
Required knowledge: databases in general, computer vision; programming skills in C++, Oracle 10g
Desired (but not required) knowledge: image processing
Introductory Literature:
(none, has to be searched for)
Abstract. The dramatic increase of mostly semi-structured genomic data, their heterogeneity and high variety, and the increasing complexity of biological applications and methods mean that many very important challenges in biology are now challenges in computing, and especially in databases. In contrast to the many query-driven approaches advocated in the literature, we propose a new integrating approach that is based on two fundamental pillars. The Genomics Algebra provides an extensible set of high-level genomic data types (GDT) (e.g., genome, gene, chromosome, protein, nucleotide) together with a comprehensive collection of appropriate genomic functions (e.g., translate, transcribe, decode). The Unifying Database allows us to manage the semi-structured contents of publicly available genomic repositories and to transfer these data into GDT values. These values then serve as arguments of Genomics Algebra operations which are supposed to be embedded into a DBMS query language.
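To give a feel for what genomic functions such as transcribe and translate do, here is a minimal sketch. It is not the Genomics Algebra itself: plain strings stand in for GDT values, `transcribe` assumes its input is the coding strand of the DNA, and `translate` assumes the standard genetic code and reads from the first start codon to the first stop codon. The function signatures and the sample sequence are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Map a DNA coding strand to its mRNA by replacing thymine with uracil."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA into one-letter amino acids, from the first AUG to a stop codon."""
    start = mrna.find("AUG")
    if start < 0:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up sample sequence, chosen only to exercise both functions.
gene = "ATGGCCTGGTAA"
mrna = transcribe(gene)
protein = translate(mrna)
```

In the setting described above, such functions would take and return typed GDT values and be callable from within a DBMS query language rather than from a standalone script.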
Required knowledge: databases in general, bioinformatics; programming skills in C++, Oracle 10g
Desired (but not required) knowledge: biological databases
Introductory Literature:
Joachim Hammer & Markus Schneider. Genomics Algebra: A New, Integrating Data Model, Language, and Tool for Processing and Querying Genomic Information. 1st Biennial Conf. on Innovative Data Systems Research (CIDR), 176-187, 2003. [pdf]