Current Work
When users search the deep web, the essence of their search is often found in a previously answered query. The Morpheus question answering system reuses prior searches to answer similar user queries. Queries are represented in a semistructured format that contains query terms and referenced classes within a specific ontology. Morpheus answers questions by using methods from prior successful searches. The system ranks stored methods based on a similarity quasimetric defined on assigned classes of queries. Similarity depends on the class heterarchy in an ontology and its associated text corpora. Morpheus revisits the prior search pathways of the stored searches to construct possible answers. The realm-based ontologies are created using Wikipedia pages, associated categories, and the synset heterarchy of WordNet.
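The ranking step can be illustrated with a minimal sketch. It is not the Morpheus implementation: the toy heterarchy, the edge costs, and every function name below are assumptions made for illustration, and the text-corpus component of similarity is left out. The sketch treats the class heterarchy as a directed graph in which generalizing (child to parent) is cheaper than specializing (parent to child). Shortest-path cost on such a graph is zero from a class to itself and obeys the triangle inequality, but is not symmetric, which is what makes it a quasimetric rather than a metric.

```python
import heapq
from math import inf

# Hypothetical toy heterarchy: each class maps to its parent classes.
# A class may have several parents, so this is a DAG rather than a tree.
PARENTS = {
    "sedan": ["car"],
    "car": ["vehicle"],
    "truck": ["vehicle"],
    "tire": ["car_part"],
    "vehicle": [],
    "car_part": [],
}

UP_COST = 1.0    # assumed cost of generalizing (child -> parent)
DOWN_COST = 2.0  # assumed cost of specializing (parent -> child)


def build_edges(parents):
    """Turn the parent map into a weighted directed graph."""
    edges = {cls: [] for cls in parents}
    for child, parent_list in parents.items():
        for parent in parent_list:
            edges[child].append((parent, UP_COST))
            edges.setdefault(parent, []).append((child, DOWN_COST))
    return edges


def class_distance(edges, src, dst):
    """Shortest-path cost from src to dst (Dijkstra); inf if unreachable."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            return dist
        if dist > best.get(node, inf):
            continue  # stale heap entry
        for nxt, weight in edges.get(node, []):
            candidate = dist + weight
            if candidate < best.get(nxt, inf):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return inf


def query_distance(edges, new_classes, stored_classes):
    """Assumed aggregate: each new class is matched to its nearest stored class."""
    return sum(
        min(class_distance(edges, c, s) for s in stored_classes)
        for c in new_classes
    )


def rank_stored(edges, new_classes, stored):
    """Return (distance, name) pairs for stored queries, closest first."""
    return sorted(
        (query_distance(edges, new_classes, classes), name)
        for name, classes in stored.items()
    )


EDGES = build_edges(PARENTS)
STORED = {"q_car": ["car"], "q_truck": ["truck"], "q_tire": ["tire"]}
RANKING = rank_stored(EDGES, ["sedan"], STORED)
```

In this sketch a new query assigned to the class "sedan" ranks the stored "car" query first, the "truck" query second, and the unrelated "tire" query last, because no path connects the two parts of the toy heterarchy.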
There are two distinct Morpheus user roles. A path finder enters a query in the Morpheus web interface and searches for the answer using an instrumented web browser. This web tracking tool stores the query together with the information needed to revisit the pathway to the page where the path finder found the answer. A path follower uses the Morpheus system much like a regular search engine with a natural language interface: the path follower enters a question in a text box and receives a guided path to the answer, which the system builds from previously found paths.
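The two roles can be pictured as a recorder and a replayer that share one stored structure. The sketch below is only an illustration under stated assumptions: the class names, the fields, the action vocabulary, and the example URLs are all hypothetical and are not taken from the Morpheus tracking tool.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One recorded browser action (hypothetical fields)."""
    url: str
    action: str = "open"   # assumed vocabulary: "open", "fill_form", "click"
    value: str = ""        # e.g. text typed into a form


@dataclass
class RecordedPath:
    """What the path finder leaves behind: the question, the steps, the answer."""
    question: str
    steps: list = field(default_factory=list)
    answer: str = ""


class PathRecorder:
    """Stands in for the instrumented browser used by a path finder."""

    def __init__(self, question):
        self.path = RecordedPath(question)

    def visit(self, url, action="open", value=""):
        self.path.steps.append(Step(url, action, value))

    def mark_answer(self, answer):
        self.path.answer = answer
        return self.path


def guided_path(path):
    """Render a stored path as numbered instructions for a path follower."""
    lines = []
    for number, step in enumerate(path.steps, start=1):
        suffix = f" [{step.value}]" if step.value else ""
        lines.append(f"{number}. {step.action} {step.url}{suffix}")
    return lines


# Path finder session (placeholder URLs).
recorder = PathRecorder("What tire size fits a 1997 Toyota Camry?")
recorder.visit("https://example.com/tires")
recorder.visit("https://example.com/tires/search", "fill_form", "1997 Toyota Camry")
stored = recorder.mark_answer("P195/70R14")

# Path follower view of the same stored path.
instructions = guided_path(stored)
```

Keeping the steps as data rather than as a saved page is what lets a later, similar question reuse the route with different form values.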
History of Project
The need to integrate collections of independently written database schemas has seriously challenged enterprises and decision-makers across many domains. More precisely, information integration comprises the extraction, transformation, and loading (ETL) of data from disparate systems into a single repository to support activities such as data sharing, collaboration, and decision-making (reporting). The industry thrust toward web services and the internet will widen the scope of the information integration problem from within a single enterprise (intra-enterprise) to among enterprises (inter-enterprise), making it considerably more daunting.
The Morpheus project aims to simplify the transformation component of ETL by making it easy to build, find, and reuse transformations between disparate data types. Our data transformation tool, Morpheus TCT (Transformation Construction Toolkit), provides the following components and capabilities:
- A graphical scripting facility that allows composition of transform building blocks from simple primitives (e.g., computation, control, table lookup, byte rearranger). It also facilitates the composition of more complex transforms out of existing (simple) ones.
- A repository in which to store transformations and associated data types.
- A sophisticated browsing facility that allows a user to discover transforms similar or identical to the one needed, and then modify them as required.
- A scalable execution environment for performing the actual data transformations.
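The components listed above can be sketched in a few lines. Morpheus TCT itself is a graphical tool, so the code below is only an analogy under stated assumptions: the primitive functions, the Repository class, and the type names are hypothetical and do not describe the toolkit's actual interfaces.

```python
from functools import reduce


def compose(*steps):
    """Build a larger transform by running simpler ones left to right."""
    def run(value):
        return reduce(lambda acc, step: step(acc), steps, value)
    return run


def table_lookup(table, default=None):
    """Primitive: map a key through a lookup table."""
    return lambda key: table.get(key, default)


def reverse_bytes(data: bytes) -> bytes:
    """Primitive: a simple byte rearranger (reverses byte order)."""
    return data[::-1]


class Repository:
    """Hypothetical store of transforms indexed by source and target type."""

    def __init__(self):
        self._items = []

    def store(self, name, src_type, dst_type, transform):
        self._items.append((name, src_type, dst_type, transform))

    def find(self, src_type=None, dst_type=None):
        """Return names of transforms matching whichever types are given."""
        return [
            name
            for name, src, dst, _ in self._items
            if (src_type is None or src == src_type)
            and (dst_type is None or dst == dst_type)
        ]


# Compose a transform from primitives: trim, upper-case, then table lookup.
COUNTRY_NAMES = {"US": "United States", "DE": "Germany"}
code_to_name = compose(str.strip, str.upper, table_lookup(COUNTRY_NAMES, "unknown"))

repo = Repository()
repo.store("code_to_name", "country_code", "country_name", code_to_name)
repo.store("reverse_bytes", "bytes_be", "bytes_le", reverse_bytes)
```

A user looking for a transform that produces a country name can call repo.find(dst_type="country_name") and then adapt the result, which mirrors the browse-then-modify workflow described in the list.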
Morpheus TCT is based on the Postgres DBMS for storage and execution of transforms and leverages the Postgres abstract data type (ADT) system. Unlike many existing ETL tools, which require transformations to be performed outside the repository where the data is stored, Morpheus TCT executes the transformations inside the DBMS, thereby taking advantage of the amenities a modern DBMS provides, including efficient data storage and support for transactions and recovery.
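The idea of running a transform next to the data can be illustrated with a small sketch. SQLite, from the Python standard library, stands in for Postgres here only so that the example runs without a database server; the table, the column names, and the f_to_c function are hypothetical and are not part of Morpheus TCT.

```python
import sqlite3


def fahrenheit_to_celsius(temp_f):
    """The transform itself: a plain function over one value."""
    return round((temp_f - 32) * 5 / 9, 2)


conn = sqlite3.connect(":memory:")

# Register the transform so SQL statements can call it by name.
conn.create_function("f_to_c", 1, fahrenheit_to_celsius)

conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, temp_f REAL, temp_c REAL)"
)
conn.executemany(
    "INSERT INTO readings (temp_f) VALUES (?)",
    [(32.0,), (212.0,), (98.6,)],
)

# The transform is applied by the database engine, inside one transaction:
# either every row is converted or, on error, none is.
with conn:
    conn.execute("UPDATE readings SET temp_c = f_to_c(temp_f)")

rows = conn.execute(
    "SELECT temp_f, temp_c FROM readings ORDER BY id"
).fetchall()
```

Because the conversion is expressed as an UPDATE, no rows are exported to an external tool and re-imported, and the transaction wrapper provides the all-or-nothing behavior the paragraph above attributes to running inside a DBMS.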
