346. Moving Average from Data Stream
Solution
Approach 1: Array or List
Intuition
Following the description of the problem, we could simply keep track of all the incoming values with the data structure of Array or List. Then from the data structure, later we retrieve the necessary elements to calculate the average.
Algorithm
First, we initialize a variable
queue
to store the values from the data stream, and the variablen
for the size of the moving window.At each invocation of
next(val)
, we first append the value to the queue. We then retrieve the lastn
values from the queue, in order to calculate the average.
Complexity
- Time Complexity: where is the size of the moving window, since we need to retrieve elements from the queue at each invocation of
next(val)
function. - Space Complexity: , where is the length of the queue which would grow at each invocation of the
next(val)
function.
Approach 2: Double-ended Queue
Intuition
We could do better than the first approach in both time and space complexity.
First of all, one might notice that we do not need to keep all values from the data stream, but rather the last
n
values which falls into the moving window.
By definition of the moving window, at each step, we add a new element to the window, and at the same time we remove the oldest element from the window. Here, we could apply a data structure called double-ended queue (a.k.a deque) to implement the moving window, which would have the constant time complexity () to add or remove an element from both its ends. With the deque, we could reduce the space complexity down to where is the size of the moving window.
Secondly, to calculate the sum, we do not need to reiterate the elements in the moving window.
We could keep the sum of the previous moving window, then in order to obtain the sum of the new moving window, we simply add the new element and deduce the oldest element. With this measure, we then can reduce the time complexity to constant.
Algorithm
Here is the definition of the deque from Python. We have similar implementation of deque in other programming languages such as Java.
Deques are a generalization of stacks and queues (the name is pronounced
deck
and is short fordouble-ended queue
). Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.
Follow the intuition, we replace the queue with the deque and add a new variable window_sum
in order to calculate the sum of moving window in constant time.
Complexity
- Time Complexity: , as we explained in intuition.
- Space Complexity: , where is the size of the moving window.
Approach 3: Circular Queue with Array
Intuition
Other than the deque data structure, one could also apply another fun data structure called circular queue
, which is basically a queue with the circular shape.
- The major advantage of circular queue is that by adding a new element to a full circular queue, it automatically discards the oldest element. Unlike deque, we do not need to explicitly remove the oldest element.
- Another advantage of circular queue is that a single index suffices to keep track of both ends of the queue, unlike deque where we have to keep a pointer for each end.
Algorithm
No need to resort to any library, one could easily implement a circular queue with a fixed-size array. The key to the implementation is the correlation between the index of head
and tail
elements, which we could summarize in the following formula:
In other words, the tail
element is right next to the head
element. Once we move the head forward, we would overwrite the previous tail element.
Complexity
Time Complexity: , as we can see that there is no loop in the
next(val)
function.Space Complexity: , where is the size of the circular queue.
I would mention a queue instead of a Deque here in the explanation. We aren't really using both the stack and queue aspects of this data structure. Instead we are simply using the deque as a queue.
We don't need to keep a separate
count
variable,len(self.queue)
has a time complexity of O(1).I think in Java solution of "Approach 2: Double-ended Queue" it would be clearer to call the result of
queue.poll()
as head (as per javadoc) rather thantail. Just to avoid confusion among those, who are not sure how java Deque does work internally. Minor though.class MovingAverage { private final Queue<Integer> window; private final int maxSize; private double sum = 0.0; public MovingAverage(int maxSize) { this.window = new ArrayDeque<>(maxSize + 1); this.maxSize = maxSize; } public double next(int val) { window.add(val); sum += val; if (window.size() > maxSize) { sum -= window.poll(); } return sum / window.size(); } }
Isn't the circular queue solution backwards regarding the head and tail? The head represents the front of the queue. When you automatically dequeue you are dequeuing the head. Enqueues occur on the tail.
public double Next(int val) { _count++; var head = (_tail + 1) % _size; _windowSum = _windowSum - _queue[head] + val; _tail = (_tail + 1) % _size; _queue[_tail] = val; return (double)_windowSum / Math.Min(_size, _count); }
All the approaches given are broken due to integer overflow. Try having 2147483647 as part of the input. You can keep the mean instead of the sum (windowSum), or keep the sum as a double, but then you can have issues with floating point precision.
Here's a simple approach in Python. It uses a queue (specifically a deque that's limited to queue operations) to store the most-recent numbers in the sliding window. It calculates the sum(queue) / len(queue) at each pass.
This isn't quite as efficient as Approach 2, since it needs to sum the k numbers in the queue during each pass. However, I'd argue that it's shorter and easier to follow.
class MovingAverage: def __init__(self, size: int): self.size = size self.queue = deque() def next(self, val: int) -> float: if len(self.queue) > self.size - 1: self.queue.popleft() self.queue.append(val) return sum(self.queue) / len(self.queue)
Update: Here's a revised version that's more performant (next is O(1), not O(k)). It's a modification of Approach 2.
class MovingAverage: def __init__(self, size: int): self.queue = collections.deque() self.size = size self.total = 0 def next(self, val: int) -> float: self.queue.appendleft(val) self.total += val if len(self.queue) > self.size: removed_val = self.queue.pop() self.total -= removed_val return self.total / len(self.queue)
C# queue...
private int size; private Queue<int> values; private double sum = 0; public MovingAverage(int size) { this.size = size; this.values = new Queue<int>(size); } public double Next(int val) { if(this.values.Count == this.size) { this.sum -= this.values.Dequeue(); } this.values.Enqueue(val); this.sum += val; return this.sum / this.values.Count; }
class MovingAverage {
protected :
list l;
int sum;
int n;
public:
/** Initialize your data structure here. */
MovingAverage(int size) {
l.resize(size);
sum=0;
n=0;
}
double next(int val) { if(n< l.size()){ sum+=val; n++; l.push_front(val); } else { cout<<l.back()<<endl; sum-=l.back(); l.pop_back(); l.push_front(val); sum+=val; } return (double)sum/(double)n; }
};
Is there any
O(1)
time complexity in javascript? If we use an array then when weshift
is that then anO(n)
operation as all the elements must be shifted?var MovingAverage = function(size) { this.window = []; this.sum = 0; this.size = size; }; /** * @param {number} val * @return {number} */ MovingAverage.prototype.next = function(val) { this.window.push(val); this.sum += val; if (this.window.length > this.size) { // console.log(11, this.window) const front = this.window.shift(); // console.log(22, this.window, front) this.sum -= front; } return this.sum / this.window.length; };