Skip to main content

157. Read N Characters Given Read4

 

157. Read N Characters Given Read4

Solution


Overview

Interview Tendencies: Google and Facebook

A long time ago, long ago, so long ago that no one can remember, algorithm interview questions were less popular. Ten years ago big companies mainly filtered the candidates by the university ranks, and the interview questions were like please describe how DDR memory works.

Nowadays there are some tendencies to merge this "old-style interview" with the modern algorithm questions interview. The idea is to ask a question which sounds like algorithmic but checks your knowledge of how do computers work: Round-robin CPU schedulingC10k problem first solved by nginx, etc.

Is it good or bad? That's a reality to deal with, especially if we speak about Google or Facebook interviews.

Read N Characters Given Read4

Back to the problem, the question is "how does the memory work":

  • Because of the physical implementation, loading 4 bytes in DDR is faster than to load 1 byte 4 times.

  • On the majority of computers today, collection of 4 bytes, or 32 bits, is called a word. Most modern CPUs are optimized for the operations with words.

To sum up, the problem is a practical low-level question. The standard approach (Approach 1) to solve it using the internal buffer of 4 characters:

File -> Internal Buffer of 4 Characters -> Buffer of N Characters.

imgFigure 1. Approach 1: solution with internal buffer.

Once it's done, and you show your understanding of memory operations, the follow-up question is how to speed up. The answer (Approach 2) is quite straightforward. If it's possible, do not use the internal buffer of 4 characters to avoid the double copy:

File -> Buffer of N Characters.

imgFigure 2. Approach 2: solution without internal buffer.




Approach 1: Use Internal Buffer of 4 Characters

imgFigure 3. Solution with internal buffer.

Algorithm

Let's use an internal buffer of 4 characters to solve this problem:

File -> Internal Buffer of 4 Characters -> Buffer of N Characters.

  • Initialize the number of copied characters copiedChars = 0, and the number of read characters: readChars = 4. It's convenient initialize readChars to 4 and then use readChars != 4 as EOF marker.

  • Initialize an internal buffer of 4 characters: buf4.

  • While number of copied characters is less than N: copiedChars < n and there are still characters in the file: readChars == 4:

    • Read from file into internal buffer: readChars = read4(buf4).

    • Copy the characters from internal buffer buf4 into main buffer buf one by one. Increase copiedChars after each character. In the number of copied characters is equal to N: copiedChars == n, interrupt the copy process and return copiedChars.

Implementation

Complexity Analysis

  • Time complexity: \mathcal{O}(N) to copy N characters.

  • Space complexity: \mathcal{O}(1) to keep buf4 of 4 elements.




Approach 2: Speed Up: No Internal Buffer

imgFigure 4. Solution without internal buffer.

This solution is mainly suitable for the languages (C, C++, Golang) where pointers allow to append directly to the primary buffer buf.

Algorithm

  • Initialize the number of copied characters copiedChars = 0, and the number of read characters: readChars = 4.

  • While number of copied characters is less than N: copiedChars < n and there are still characters in the file: readChars == 4:

    • Read from file directly into buffer: read4(buf + copiedChars).

    • Increase copiedCharscopiedChars += readChars.

  • Now buf contains at least N characters. Return min(n, copiedChars).

Implementation

Complexity Analysis

  • Time complexity: \mathcal{O}(N) to copy N characters.

  • Space complexity: \mathcal{O}(1).



Comments: 
12
  September 2, 2020 11:45 AM

Took some time to even to completely get the question :(

  August 11, 2020 8:14 PM

the Approach 2 might overflow the buffer since the buf is only guaranteed to fit n characters. :|
For example, file is "abcdefgh" and you want to read 5 chars, than buf will overflow with 3 additional bytes.

  September 1, 2020 10:50 PM

Old school question. The rational stumped me at first, but it's a good question.

  August 6, 2020 5:50 AM

I saw that question in Facebook interview, I couldn't solve that time. Here my two cents (Java)
Thanks for the explanation.

// Global variable
int ans  = 0;
int ans4 = 0;
int idx4 = 0;
boolean eof = false;
char[] buf4 = new char[4];

public int read(char[] buf, int n) {
    ans = 0; // Reset to reuse read()

    while(ans < n){
        while(ans < n && idx4 < ans4) buf[ans++] = buf4[idx4++];
        if(ans == n || eof) return ans;
        res4 = read4(buf4); // Read other chunk
        idx4 = 0; // Reset to reuse
        if(res4 < 4) eof = true;
    }

    return ans;
}

  September 6, 2020 4:16 PM

Am I wrong in understanding this question ? At first question looked tough but it turned out to be very simple one. I didn't get the rationale behind this simple stupid question!!!

  August 16, 2020 8:24 PM

FYI, word is not 4 bytes, word is 2 bytes, dword is 4 bytes.
Please correct explanation

  October 6, 2020 12:16 PM

Good intro to solution. But definitely 13 years ago at least companies didn't select candidates based on DDR questions. But yes questions used to be much simpler.

  September 26, 2020 8:50 AM

The example solutions does not consider consecutive read-by-N, unfortunately.

Consider this case: file = "abcdefg", n = 5. If we call the read() twice, what should the second read() get? If it is real world problem, we would expect the second read() to get 'fg', but the solutions here would get empty string "".

The first read() will call read4 twice, and get 'abcd', 'efg', then the solution returned 'abcde' 5 bytes, and discard 'fg'. The second read() calls read4, and the read4 maintained file pointer is already pointing to the end of file, then read4 returns empty. A complete solution should buffer the extra bytes internally, and prefix the left-over to the next read().

  September 8, 2020 12:47 AM

Approach 2 with protection against overflowing primary buffer - allocate an internal buffer for the last read if it might overflow.

class Solution {
public:
    int read(char *buf, int n) {
        int copiedChars = 0, readChars = 4;
        
        while (copiedChars < n && readChars == 4) {
            if (n - copiedChars < 4) {
                // internal buffer for final read that may overflow buf
                char buf4[4];
                readChars = read4(buf4);
                for (int i = 0; i < readChars; ++i) {
                    buf[copiedChars] = buf4[i];
                    ++copiedChars;
                }
            } else {
                readChars = read4(buf + copiedChars);
                copiedChars += readChars;
            }
        }
        return min(n, copiedChars);
    }
};

  September 3, 2020 4:08 PM

I agree with the approach 2, but definitely not the implementation of approach 2, which takes the risk of overflowing the given buffer.

Comments

Popular posts from this blog

Python - List - Append, Count, Extend, Index, Insert, Pop, Remove, Reverse, Sort

🐍 Advance List List is widely used and it's functionalities are heavily useful. Append Adds one element at the end of the list. Syntax list1.append(value) Input l1 = [1, 2, 3] l1.append(4) l1 Output [1, 2, 3, 4] append can be used to add any datatype in a list. It can even add list inside list. Caution: Append does not return anything. It just appends the list. Count .count(value) counts the number of occurrences of an element in the list. Syntax list1.count(value) Input l1 = [1, 2, 3, 4, 3] l1.count(3) Output 2 It returns 0 if the value is not found in the list. Extend .count(value) counts the number of occurrences of an element in the list. Syntax list1.extend(list) Input l1 = [1, 2, 3] l1.extend([4, 5]) Output [1, 2, 3, 4, 5] If we use append, entire list will be added to the first list like one element. Extend, i nstead of considering a list as one element, it joins the two lists one after other. Append works in the following way. Input l1 = [1, 2, 3] l1.append([4, 5]) Output...

Difference between .exec() and .execPopulate() in Mongoose?

Here I answer what is the difference between .exec() and .execPopulate() in Mongoose? .exec() is used with a query while .execPopulate() is used with a document Syntax for .exec() is as follows: Model.query() . populate ( 'field' ) . exec () // returns promise . then ( function ( document ) { console . log ( document ); }); Syntax for .execPopulate() is as follows: fetchedDocument . populate ( 'field' ) . execPopulate () // returns promise . then ( function ( document ) { console . log ( document ); }); When working with individual document use .execPopulate(), for model query use .exec(). Both returns a promise. One can do without .exec() or .execPopulate() but then has to pass a callback in populate.

683 K Empty Slots

  Approach #1: Insert Into Sorted Structure [Accepted] Intuition Let's add flowers in the order they bloom. When each flower blooms, we check it's neighbors to see if they can satisfy the condition with the current flower. Algorithm We'll maintain  active , a sorted data structure containing every flower that has currently bloomed. When we add a flower to  active , we should check it's lower and higher neighbors. If some neighbor satisfies the condition, we know the condition occurred first on this day. Complexity Analysis Time Complexity (Java):  O(N \log N) O ( N lo g N ) , where  N N  is the length of  flowers . Every insertion and search is  O(\log N) O ( lo g N ) . Time Complexity (Python):  O(N^2) O ( N 2 ) . As above, except  list.insert  is  O(N) O ( N ) . Space Complexity:  O(N) O ( N ) , the size of  active . Approach #2: Min Queue [Accepted] Intuition For each contiguous block ("window") of  k  po...