Most of the important biological applications require comparison of large
genome strings.Current techniques suffer from both disk I/O and computational
cost because of extensive memory requirements and large candidate sets. We
propose an efficient technique for alignment of large genome strings. Our
technique precomputes the associations between the database strings and the
query string. These associations are used to prune the database-query substring
pairs that do not contain similar regions. We use a hash table to compare
the unpruned regions of the query and database strings. The cost of the ensuing
search is determined by how the hash table is constructed. We present a dynamic
strategy that optimizes the random disk I/O needed for accessing the hash
table. Our technique gives the user the freedom to select specific regions
within the search space and assign different accuracy levels to these regions
prior to search. It also provides the user a coarse grain visualization of
the similarity pattern quickly before the actual search. The experimental
results show that our technique reduces the cost of genome alignment significantly.