In this assignment, your job is to implement a B+-Tree. Essentially, you need to expand your main program from the last assignment to include an option that allows you to create a file, but then also put a B+-Tree index on top of the file. This assignment will be built using the code base that you developed for the previous assignment.
From a user's point of view, the way that your Assignment 2 main program will work is not much different from the Assignment 1 version, except that the user will have a few more options. When someone chooses to load all of the data into a file, they have the option of also indexing the file at the same time with a B+-Tree. That is, at the top level, if someone chooses the option "1. Load a text file into a binary database file", then he or she will next see the two options:
What do you want to do?
1. Create a binary file without an index.
2. Create a binary file with a B+-Tree index.
The first option is exactly the same as the database file creation from the first assignment, and the code will be exactly the same. However, if the user chooses to create an index, he or she is first asked the same four questions from the last assignment:
What is the name of the text file to load?
What is the name of the file where the schema is located?
What is the name of the relation schema?
Where should the binary file be stored?
These have exactly the same meaning as last time. Then, the user is asked two new questions:
What attribute would you like to index?
Where should the B+-Tree index file be stored?
Using the answer to these questions, you will use the new BPlusTree class (see bellow) to constuct a B+-Tree that will index the new database file that is being created. The way this will work is that when you fill up your new binary database file with data using repeated calls to SuckNextRecord and Append (like you did in the last assignment) just after you add a new record to the binary database file, you insert a (key, ptr) pair into the B+-Tree index where key is the indexed attribute value of the given record, and ptr is the page number where the record is to be found. Once you have filled up your database file, you then close your B+-Tree. One thing to keep in mind is that this requires that you know the page number for each new record that was added into your binary database file. You might want to modify File.Append slightly so that it returns the page number that the new record was added to; you can then take this returned value and use it as the value of ptr when you add a new record into the B+-Tree.
The other thing that you need to do in your main is to give the user one additional option: a user can choose to answer a range query over a file using a B+-Tree. So at the top level of your program, the user should be given the following three options:
What do you want to do?
1. Load a text file into a binary database file.
2. Apply a selection predicate to a binary database file.
3. Find all records falling in a range using a B+-Tree.
If the user chooses the third option, the user is asked the following questions in sequence:
What is the name of the file where the schema is located?
What is the name of the relation schema?
What is the name of the binary data file?
These questions have exactly the same meaning that they do in the previous assignment. Then, the following questions are asked:
What is the name of the attribute to be queried over?
What is the lower bound of the query range?
What is the upper bound of the query range?
What is the name of the B+-Tree file that will be used to answer the query?
Your program should then do the following. First, it looks up the name of the attribute given to find its type in the schema that was specified. Then it reads two values of that type from the user input: a lower value, and an upper value. Then, your program should use BPlusTree.Open to open up the B+-Tree stored at the location indicated by the user. Your program then uses the B+-Tree to find the page address of each and every page in the underlying binary database file that has a record falling in the specified range. Once all of those page addresses have been retrieved from the B+-Tree, you then go off to the binary database file, look up those pages, and print to the screen all of the database records in those pages that fall in the specified range.
In this assignment, there are only two classes to implement in addition to the changes to main. The header file for these classes is here. First, you will implement the IndexRecord class, which is a subclass of the Record class from the last assignment. Actually, it has no additional data compared to the Record class, but it has a couple of additional methods. Conceptually, this is a special type of record that we assume only stores (key, pointer) pairs; a B+-Tree file stores records of this form.
Next, you will implement the BPlusTree class.
Note that while I have given you the signatures of five methods as well as defined some of the attributes that your BPlusTree class will have, you will probably need to add a bunch of additional helper methods and perhaps even define some helper classes in order to get all of this to work.