Solution
Approach 1: Topological Sort
Before discussing the required design, we'll discuss some prerequisites to help ease the understanding of the solution.
Firstly, we can note that once a formula is applied to any cell in excel, let's say , if any change is made to or , the result to be put into needs to be evaluated again based on the new values of and . Further, suppose some other cell, say is also dependent on due to some prior formula applied to . Then, when any change is made to, say, , we re-evaluate 's value. Furhter, since is dependent on , we need to re-evaluate 's value as well.
Thus, whenever, we make any change to any cell, , we need to determine the cells which are dependent on , and update these cells, and further determine the cells which are dependent on the changed cells and so on. We can assume that no cycles are present in the formulas, i.e. Any cell's value won't directly or indirectly be dependent on its own value.
But, while doing these set of evaluations of the cells to determine their updated values, we need to update the cells in such an order that the cell on which some other cell is dependent is always evaluated prior to the cell which is dependent on the former cell.
In order to do so, we can view the dependence between the cells in the form of a dependency graph, which can be a Directed Graph. Since, no cycles are allowed between the formulas, the graph reduces to a Directed Acyclic Graph. Now, to solve the problem of evaluating the cells in the required order, we can make use of a very well known method specifically used for such problems in Directed Acyclic Graphs, known as the Topological Sorting.
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge , vertex comes before in the ordering. For example, a topological sorting of the following graph is 5 4 2 3 1 0
.
There can be more than one topological sorting for a graph. For example, another topological sorting of the following graph is 4 5 2 3 1 0
. The first vertex in topological sorting is always a vertex with in-degree as 0 (a vertex with no in-coming edges).
Topological Sorting can be done if we modify the Depth First Search to some extent. In Depth First Search, we start from a vertex, we first print it and then recursively call DFS for its adjacent vertices. Thus, the DFS obtained for the graph above, starting from node 5, will be 5 2 3 1 0 4
. But, in the case of a topological sort, we can't print a node until all the nodes on which it is dependent have already been printed.
To solve this problem, we make use of a temporary stack. We do the traversals in the same manner as in DFS, but we don’t print the current node immediately. Instead, for the current node we do as follows:
Recursively call topological sorting for all the nodes adjacent to the current node.
Push the current node onto a stack.
Repeat the above process till all the nodes have been considered atleast once.
Print the contents of the stack.
Note that a vertex is pushed to stack only when all of its adjacent(dependent) vertices (and their adjacent(dependent) vertices and so on) are already in stack. Thus, we obtain the correct ordering of the vertices.
The following animation shows an example of topological sorting for the graph above.
We can make use of the same concept while evaluating the cell values to determine the order in which they need to be evaluated.
Now, let's discuss how we implement the various required functions. We make use of a simple structure(Class), , which contains two elements. First, the value of the cell which it represents, , and a HashMap, . It is a list of cells on which the current cell's value is dependent. This hashmap stores the data in the form . has the format . refers to the number of times the current cell directly or indirectly comes in the current cell's formulas. e.g. . In this case, the frequency of is 1 and that of is 2.
Excel(int H, char W)
: We simply need to initialize an array of with rows and the required number of columns corresponding to .set(int row, char column, int val)
: For setting the value of the cell corresponding to the given and , we can simply change the value , , in the array at the indices corresponding to the current cell. Further, if any new formula is applied to a particular cell, we need to remove the previously applied formulas on the same cell. This is because two formulas can't be used to determine the value of a cell simultaneously. Now, setting a cell to a particular value can also be seen as a formula e.g. . Thus, we remove all the in the for the current cell. Further, when the current cell's value is changed, all the other cells which are dependent on it also need to be evaluated in the correct order. Thus, we make use of Topological Sorting starting with the current cell. We make use of a functiontopologicalSort(r, c)
for this purpose.
topologicalSort(r, c)
: In every call to this function, we traverse over all the cells in the array and further apply topological sorting to all the cells which are dependent on the current cell(row=r, column=c). To find these cells, we can check the in the associated with each cell and check if the current cell lies in it. After applying Topological sorting to all these dependent cells, we put the current cell onto a .
After doing the topological sorting, the cells on the lie in the order in which their values should be evaluated given the current dependency chain based on the formulas applied. Thus, we pick up these cells one by one, and evaluate their values. To do the evaluation, we make use of calculate_sum(r, c, cells)
. In this function, we traverse over all the in the of the current cell(row=r, column=c), and keep on adding their values. When this summing has been done, we update the current cell's value, , to the sum just obtained. We keep on doing so till all the cells in the have been evaluated.
get(int row, char column)
: We can simply obtain the value() associated with the current cell's . If the cell has never been initialized previously, we can return a 0 value.sum(int row, char column, List of Strings : numbers)
: To implement this function, firstly, we need to expand the given to obtain all the cells which need to be added in the current formula. We obtain them, by making use of aconvert
function, which extracts all these cells by doing appropriate expansions based on:
values. We put all these cells in the associated with the current cell's . We also need to set the current cell's value to a new value based on the current formula added. For this, we make use ofcalculate_sum
function as discussed above. We also need to do the topological sorting and evaluate all the cells dependent on the current cell. This is done in the same manner as in theset
function discussed above. We also need to return the value to which the current cell has been set.
Performance Analysis
set
takes time. Here, and refer to the number of rows and columns in the current Excel Form. There can be a maximum of formulas for an Excel Form with rows and columns. For each formula, time will be needed to find the dependent nodes. Thus, in the worst case, a total of will be needed.sum
takes time. Here, refers to the number of elements in the the list of strings used for obtaining the cells required for the current sum. In the worst case, the expansion of each such element requires time, leading to time for expanding such elements. After doing the expansion,calculate_sum
itself requires time for traversing over the required elements for obtaining the sum. After this, we need to update all the dependent cells, which requires the use ofset
which itself requires time.get
takes time.The space required will be in the worst case. space will be required for the Excel Form itself. For each cell in this form, the list can contain cells.
Comments
Post a Comment