Many business, scientific, engineering, and defense organizations generate and store gigabytes to terabytes of data per day. There is increasing interest in exploring and mining this data to improve business processes, make new scientific discoveries, detect security intrusions, and so on. In this project, we are developing middleware that will enable distributed data mining and exploration and thus will significantly impact users, data owners, and system administrators who face issues involving distributed data and multiple ownership. Users will be able to explore and mine information in this environment using state-of-the-art data exploration and data mining primitives, or develop their own application-specific primitives, without having to worry about the nature of the underlying resources. Data owners will be able to ensure that all users are given access to data according to their privileges, and to guarantee that a user cannot compromise privacy constraints by issuing multiple related queries. System administrators will be able to manage the available resources effectively and provide users constrained access to them based on the organization's policy.
Faculty:
Sponsors:
This research is supported in part by NSF under a medium ITR grant.