CSCI 1321 (Principles of Algorithm Design II), Spring 2001:
Homework 5

Assigned:
March 9, 2001.

Due:
March 23, 2001, at 5pm.

Credit:
40 points, plus up to 20 points extra credit.


Problem statement

You are to implement a dynamic array class using arrays and dynamic memory. Two approaches to implementing this class are presented here; you can implement either one you choose, or for extra credit you may implement both. The approaches differ in their strategies for when to resize arrays. You are to determine how these strategies affect program efficiency; if you implement the class both ways, you will be able to compare the two.

As always, please read through the entire assignment before beginning to code; timing your code might be easier if you read through that section first.

Introduction

A dynamic array is an array that grows and shrinks as more or less storage is needed. For example, the STL vector class supports dynamic array operations. As more elements are added to a vector, e.g., using push_back(), it increases in size. If elements are removed, e.g., using pop_back(), it decreases in size. Thus, users can avoid the requirement of specifying an ordinary array's maximum size at creation time.

In this homework, we will implement a dynamic-array class, i.e., a primitive substitute for the STL vector class.

What the class must provide

dynamicArray objects should support the operations listed in the following table.

Function prototype                                    Example use                                Explanation
dynamicArray(void)                                    dynamicArray v;                            Creates an object with no items.
length_pos                                            dynamicArray::length_pos i = v.size();     Type specifying array's length or a position within array.
item_type                                             dynamicArray::item_type i = v.pop_back();  Type specifying an array element.
void push_back(const item_type & item)                v.push_back(17);                           Appends the given item to the end of the array.
item_type pop_back(void)                              int i = v.pop_back();                      Removes the last element of the array and returns it.
length_pos size(void)                                 dynamicArray::length_pos i = v.size();     Returns the number of items in array.
item_type get(const length_pos i)                     int i = v.get(0);                          Returns the item stored at the specified position.
void set(const length_pos i, const item_type & item)  v.set(0,3);                                Stores the second parameter in the position specified by the first parameter.

If in doubt about a function's semantics, read about the corresponding function in the STL vector class. push_back() increases the number of items stored in the dynamicArray object, while pop_back() decreases the number of items. Using pop_back() on an array with no elements is undefined; you choose whether to check for this case or not. (This might be a good place to use assert().) Similarly, get() and set() can optionally check for acceptable position values (i.e., less than the array's size). Define a copy constructor, an assignment operator, and a destructor if necessary. (Hint: They are necessary.)
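To illustrate why the copy constructor, assignment operator, and destructor are needed, here is a minimal sketch of a class that owns a dynamically-allocated array. The class name ownedInts and its fixed-size constructor are hypothetical and are not part of the assignment's interface; the same pattern should carry over to dynamicArray. The sketch also shows one way to use assert() for the optional position checks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper class: owns an array of n ints allocated with new[].
class ownedInts {
public:
    explicit ownedInts(std::size_t n) : n_(n), items_(new int[n]) {
        for (std::size_t i = 0; i < n_; ++i) items_[i] = 0;
    }

    // Copy constructor: allocate a separate array and copy the elements,
    // so the two objects never share (and never double-delete) storage.
    ownedInts(const ownedInts & other)
        : n_(other.n_), items_(new int[other.n_]) {
        for (std::size_t i = 0; i < n_; ++i) items_[i] = other.items_[i];
    }

    // Assignment: build the copy first, then release the old array.
    ownedInts & operator=(const ownedInts & other) {
        if (this != &other) {
            int * fresh = new int[other.n_];
            for (std::size_t i = 0; i < other.n_; ++i) fresh[i] = other.items_[i];
            delete [] items_;
            items_ = fresh;
            n_ = other.n_;
        }
        return *this;
    }

    // Destructor: return the array to the heap.
    ~ownedInts() { delete [] items_; }

    std::size_t size() const { return n_; }

    // Optional range checks: valid positions are 0 .. n_ - 1.
    int get(std::size_t i) const { assert(i < n_); return items_[i]; }
    void set(std::size_t i, int item) { assert(i < n_); items_[i] = item; }

private:
    std::size_t n_;
    int * items_;
};
```

Without the copy constructor and assignment operator, the compiler-generated versions would copy only the pointer, and both objects' destructors would then delete the same array.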

Assume the dynamic array stores ints, but notice how easy it will be to change to a different type by changing the definition of item_type. Positions are numbered just as for ordinary arrays and STL vectors: The ``leftmost'' element is numbered 0 and the ``rightmost'' element has number n - 1 if the array has n items.
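One possible shape for the class declaration is sketched below, to show where the two types fit. The choice of std::size_t for length_pos, the const qualifiers, and the private member names are assumptions; only the public names and parameter lists come from the table above.

```cpp
#include <cstddef>

class dynamicArray {
public:
    typedef std::size_t length_pos;  // assumption: an unsigned integer type
    typedef int item_type;           // change this line to store another type

    dynamicArray(void);
    dynamicArray(const dynamicArray & other);
    dynamicArray & operator=(const dynamicArray & other);
    ~dynamicArray(void);

    void push_back(const item_type & item);
    item_type pop_back(void);
    length_pos size(void) const;               // const is an addition here
    item_type get(const length_pos i) const;   // const is an addition here
    void set(const length_pos i, const item_type & item);

private:
    item_type * items_;   // hypothetical member names
    length_pos count_;
};
```

Because every member function is written in terms of item_type, switching the element type should require changing only the one typedef.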

Implementation

Your mission in this assignment is to define a class dynamicArray that provides the functions and types described in the preceding section. You are to implement the class using new, delete, and arrays, not using vectors or other library classes. You may, however, use the simple array class presented in lecture (dArray.h) as a starting point.

The next sections discuss two possible implementation strategies. The first is somewhat less trouble to implement; the second is more efficient. You may use either strategy, or for extra credit (up to 20 points) you may produce and submit two implementations, one for each strategy. (If you do this, you will need to be a little careful about maintaining two copies of your class definition, since the test and timing programs assume your class is defined in a file called dynamicArray.h. A reasonable approach is to keep each implementation in a separate subdirectory.)

Naive implementation strategy

Our naive implementation strategy for the dynamicArray class is that each object should store its n items in a dynamically-allocated array of size n. For example, an object holding 17 items will store them in a dynamically-allocated array of size 17. (Dynamic memory is sometimes called the heap or colloquially ``the bit bucket.'') To add one item to the end of the array, an array of size n + 1 is allocated, the existing n items are copied to the new array, and the old array is returned to the bit bucket. The procedure to remove one item from the end of the array is similar. dynamicArray objects only support adding and removing elements from the ``right'' end of the array (not from the left end or middle).
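The allocate, copy, and release steps just described might be sketched as follows. This is only an illustration using free functions on a plain int array; the function names are hypothetical, and an empty array is represented here by a null pointer, which is an assumption rather than a requirement.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns a new array of exactly new_n ints holding the
// first min(old_n, new_n) items of old_items, and frees old_items.
int * resize_exact(int * old_items, std::size_t old_n, std::size_t new_n) {
    int * fresh = new int[new_n];
    std::size_t keep = (old_n < new_n) ? old_n : new_n;
    for (std::size_t i = 0; i < keep; ++i) fresh[i] = old_items[i];
    delete [] old_items;   // deleting a null pointer is harmless
    return fresh;
}

// Naive push: grow by exactly one slot, then store the new item in it.
int * naive_push_back(int * items, std::size_t & n, int item) {
    items = resize_exact(items, n, n + 1);
    items[n] = item;
    ++n;
    return items;
}

// Naive pop: save the last item, then shrink by exactly one slot.
int * naive_pop_back(int * items, std::size_t & n, int & removed) {
    assert(n > 0);   // popping from an empty array is undefined
    removed = items[n - 1];
    items = resize_exact(items, n, n - 1);
    --n;
    return items;
}
```

In a class, the pointer and the count would be data members, and the copying loop would live in a single private member function.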

Powers-of-two implementation strategy

The naive implementation allocates an array exactly the same size as the number of elements to hold. Thus, every time an element is added or removed, an entirely new array is allocated. If we permit an array to have a size different from its number of elements, we can do better using the ``double or halve'' heuristic, a common computer science rule of thumb:

When allocating a new array, either double or halve its size.
Initially, the array should have size one even though it has no elements. Whenever adding an additional element requires reallocating the array, double the array's size. Whenever an array becomes less than one-quarter full, halve its size. Thus, the array's size will always be a power of two.

For example, consider the following sequence of operations:

Operation              Number of elements  Array size  Comments
dynamicArray v;        0                   1           makes a new array
v.push_back(3);        1                   1
v.push_back(4);        2                   2           makes a new array
v.push_back(5);        3                   4           makes a new array
v.push_back(1);        4                   4
int i = v.pop_back();  3                   4
int i = v.pop_back();  2                   4
int i = v.pop_back();  1                   4
int i = v.pop_back();  0                   2           makes a new array

Note that the size() member function should continue to return the number of elements actually stored in the array, which may be different for this strategy from the array's physical size.
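The resizing rule can be written as two small functions that compute the array size an object should have. This is a sketch; the function names are hypothetical, and the decision never to shrink below size one is an assumption that follows from the initial size of one.

```cpp
#include <cassert>
#include <cstddef>

// Array size needed to push one more item.
// count is the number of elements BEFORE the push.
std::size_t capacity_for_push(std::size_t count, std::size_t capacity) {
    // Only a full array forces a reallocation; then the size doubles.
    return (count == capacity) ? capacity * 2 : capacity;
}

// Array size to use after popping one item.
// count_after is the number of elements AFTER the pop.
std::size_t capacity_after_pop(std::size_t count_after, std::size_t capacity) {
    // "Less than one-quarter full" means 4 * count_after < capacity.
    // Assumption: the array never shrinks below size 1.
    if (capacity > 1 && 4 * count_after < capacity) return capacity / 2;
    return capacity;
}
```

Applied to the sequence in the table, four pushes take the size from 1 to 1, 2, 4, 4; the first three pops leave it at 4, and the last pop (0 elements in an array of size 4) halves it to 2.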

Timing comparisons

A theoretically-minded friend claims that the powers-of-2 implementation runs asymptotically faster than the naive implementation, and in fact makes an even more precise claim:

Consider a sequence of n push_back() operations. The naive implementation requires n reallocations and copying of n(n - 1)/2 elements. The powers-of-2 implementation requires log_2 n reallocations and a maximum of 2n element copies.
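Where these numbers come from can be worked out as follows. The derivation assumes that the array allocated by the constructor is not counted as a reallocation, that storing the newly pushed item is not counted as a copy, and that the logarithm is rounded up when n is not a power of two.

```latex
% Naive strategy: the k-th push_back copies the k-1 items already stored.
\[
  \text{reallocations} = n, \qquad
  \text{copies} = \sum_{k=1}^{n} (k-1) = \frac{n(n-1)}{2}.
\]
% Powers-of-two strategy, starting from an array of size 1: a reallocation
% happens each time the array is full, at sizes 1, 2, 4, \dots, 2^{m-1},
% where m = \lceil \log_2 n \rceil, and each one copies a full array.
\[
  \text{reallocations} = m = \lceil \log_2 n \rceil, \qquad
  \text{copies} = \sum_{j=0}^{m-1} 2^{j} = 2^{m} - 1 < 2n.
\]
```

The last inequality holds because the largest array that gets copied has size 2^{m-1}, which is less than n.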

Experimentally prove the part(s) of this claim relevant to the implementation approach(es) you choose: Instrument your class(es) to count the number of array reallocations and associated element copies. A reallocation is the process of allocating a new array, copying the existing array's elements to the new array, and destroying the old array. If you write your code in a modular form, this should occur in only one function. Also add a member function show_op_counts() to both dynamicArray implementations that, given an ostream and a string representing the array's name, prints the array's statistics (number of reallocations and element copies).
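As an illustration of this kind of instrumentation, here is a sketch of a push-only array that uses the powers-of-two rule and counts its own reallocations and copies. The class name countedArray, the two accessor functions, and the exact signature and output format of show_op_counts() are assumptions; check time-dynamicArray.cpp for what it actually expects. Copying is simply disabled to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

class countedArray {   // hypothetical name
public:
    countedArray() : items_(new int[1]), count_(0), capacity_(1),
                     reallocations_(0), copies_(0) {}
    ~countedArray() { delete [] items_; }

    void push_back(int item) {
        if (count_ == capacity_) reallocate(capacity_ * 2);
        items_[count_] = item;
        ++count_;
    }

    std::size_t size() const { return count_; }
    unsigned long reallocations() const { return reallocations_; }
    unsigned long copies() const { return copies_; }

    // Prints the statistics, labeled with the array's name (a C-style string).
    void show_op_counts(std::ostream & out, const char * name) const {
        out << name << ": " << reallocations_ << " reallocations, "
            << copies_ << " element copies\n";
    }

private:
    // The only place an array is replaced, so the only place that counts.
    void reallocate(std::size_t new_capacity) {
        assert(new_capacity >= count_);
        int * fresh = new int[new_capacity];
        for (std::size_t i = 0; i < count_; ++i) {
            fresh[i] = items_[i];
            ++copies_;
        }
        delete [] items_;
        items_ = fresh;
        capacity_ = new_capacity;
        ++reallocations_;
    }

    countedArray(const countedArray &);              // copying disabled in this sketch
    countedArray & operator=(const countedArray &);  // (declared, never defined)

    int * items_;
    std::size_t count_;
    std::size_t capacity_;
    unsigned long reallocations_;
    unsigned long copies_;
};
```

After 1000 calls to push_back(), this sketch reports 10 reallocations and 1023 element copies, which is within the bound of 2n = 2000 copies.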

Run the timing program time-dynamicArray.cpp using your implementation(s). This program allows you to measure running time and the numbers of reallocations and copies required for a specified number of push_back() operations, say n. Use it to collect this information for several values of n (enough to see how running time, etc., changes as n increases); plot the results in the form of three separate graphs, one each for running time, number of reallocations, and number of copies. The running-time graph should plot running time (the y axis) versus number of push_back() operations (the x axis); the other two graphs should be similar. You may do this by hand or using any program that provides appropriate functionality. (I use gnuplot.) It might be interesting to try using a number of push_back()s comparable to 16000, 32000,..., 128000.

Checking for memory leaks

In lecture, we emphasized the importance of returning all allocated memory to the bit bucket when finished. There are commercial tools available for this purpose, but some are quite expensive. A free, if primitive, alternative is a tool called mtrace. You can use this tool by following these steps:

  1. Compile the program to be checked for memory leaks (test-dynamicArray.cpp, for example). Suppose the executable is named a.out.

  2. Before running the executable, type
    declare -x MALLOC_TRACE=foo.txt
    (You can replace foo.txt with any filename you like.)

  3. Run the executable as you usually do. Memory allocation and deallocation information is stored in the file foo.txt (or whatever filename you chose in the step above).

  4. To print memory leak information, type
    mtrace a.out $MALLOC_TRACE
    (If your executable is not called a.out, replace a.out in the above command with the name of your executable.)

For more information, see the info pages. (``info pages'', like ``man pages'', are a standard form of Unix/Linux documentation.) You can access the relevant pages by first typing info (to start a text-based program to browse info pages) and then typing m libc, m memory allocation, and m allocation debugging.

Some caveats:

  1. Ignore any leak information not involving the words ``new'' or ``delete.'' For example, ignore any leaks involving the word ``exit.''

  2. Ignore the indicated line numbers. Just check your code for news without corresponding deletes and vice versa.

  3. Using STL strings may cause memory leak errors; the algorithm the STL string class uses to allocate memory does not clean up after itself properly. Thus, for show_op_counts, I recommend using a C-style string (array of characters) rather than an STL string.

Please consider testing your program for memory leaks. If mtrace indicates memory leaks, I will hand-inspect your program for memory leaks and deduct points if I find any.

What files do I need?

There are two provided files:

You may also find the following sample programs useful:

What to turn in

Source code

Submit your implementation(s) in files named naive-dynamicArray.h and/or powers-dynamicArray.h. Note that you are only required to submit one implementation (whichever one you choose), but you can submit a second implementation for extra credit (up to 20 points). You do not need to submit a main program; I will test your code using my own main() function. Submit this source code as described in the Guidelines for Programming Assignments. For this assignment use a subject line of ``cs1321 hw 5''.

Experimental evidence (graphs)

It is probably easiest if you submit your graphs on paper. You may also submit them electronically (i.e., via e-mail, as you do source code) if you choose a format I can read. Formats I know I can read are PostScript, PDF, GIF, and Microsoft Word (with some hassle). If you want to submit graphs in some other format, check with me first to make sure I will be able to read it.



Footnotes

© 2001 Jeffrey D. Oldham and Berna L. Massingill. All rights reserved. This document may not be redistributed in any form without the express permission of at least one of the authors.


Berna Massingill
2001-03-09