157. Read N Characters Given Read4

327

2218

157. Read N Characters Given Read4

Solution

Overview

Interview Tendencies: Google and Facebook

A long time ago, long ago, so long ago that no one can remember, algorithm interview questions were less popular. Ten years ago big companies mainly filtered the candidates by the university ranks, and the interview questions were like please describe how DDR memory works.

Nowadays there are some tendencies to merge this "old-style interview" with the modern algorithm questions interview. The idea is to ask a question which sounds like algorithmic but checks your knowledge of how do computers work: Round-robin CPU scheduling, C10k problem first solved by nginx, etc.

Is it good or bad? That's a reality to deal with, especially if we speak about Google or Facebook interviews.

Read N Characters Given Read4

Back to the problem, the question is "how does the memory work":

Because of the physical implementation, loading 4 bytes in DDR is faster than to load 1 byte 4 times.
On the majority of computers today, collection of 4 bytes, or 32 bits, is called a word. Most modern CPUs are optimized for the operations with words.

To sum up, the problem is a practical low-level question. The standard approach (Approach 1) to solve it using the internal buffer of 4 characters:

File -> Internal Buffer of 4 Characters -> Buffer of N Characters.

Figure 1. Approach 1: solution with internal buffer.

Once it's done, and you show your understanding of memory operations, the follow-up question is how to speed up. The answer (Approach 2) is quite straightforward. If it's possible, do not use the internal buffer of 4 characters to avoid the double copy:

File -> Buffer of N Characters.

Figure 2. Approach 2: solution without internal buffer.

Approach 1: Use Internal Buffer of 4 Characters

Figure 3. Solution with internal buffer.

Algorithm

Let's use an internal buffer of 4 characters to solve this problem:

File -> Internal Buffer of 4 Characters -> Buffer of N Characters.

Initialize the number of copied characters copiedChars = 0, and the number of read characters: readChars = 4. It's convenient initialize readChars to 4 and then use readChars != 4 as EOF marker.
Initialize an internal buffer of 4 characters: buf4.
While number of copied characters is less than N: copiedChars < n and there are still characters in the file: readChars == 4:
- Read from file into internal buffer: readChars = read4(buf4).
- Copy the characters from internal buffer buf4 into main buffer buf one by one. Increase copiedChars after each character. In the number of copied characters is equal to N: copiedChars == n, interrupt the copy process and return copiedChars.

Implementation

Complexity Analysis

Time complexity: $\mathcal{O}(N)$ to copy N characters.
Space complexity: $\mathcal{O}(1)$ to keep buf4 of 4 elements.

Approach 2: Speed Up: No Internal Buffer

Figure 4. Solution without internal buffer.

This solution is mainly suitable for the languages (C, C++, Golang) where pointers allow to append directly to the primary buffer buf.

Algorithm

Initialize the number of copied characters copiedChars = 0, and the number of read characters: readChars = 4.
While number of copied characters is less than N: copiedChars < n and there are still characters in the file: readChars == 4:
- Read from file directly into buffer: read4(buf + copiedChars).
- Increase copiedChars: copiedChars += readChars.
Now buf contains at least N characters. Return min(n, copiedChars).

Implementation

Complexity Analysis

Time complexity: $\mathcal{O}(N)$ to copy N characters.
Space complexity: $\mathcal{O}(1)$ .

Comments:

Took some time to even to completely get the question :(

the Approach 2 might overflow the buffer since the buf is only guaranteed to fit n characters. :|
For example, file is "abcdefgh" and you want to read 5 chars, than buf will overflow with 3 additional bytes.

SHOW 3 REPLIES

Old school question. The rational stumped me at first, but it's a good question.

I saw that question in Facebook interview, I couldn't solve that time. Here my two cents (Java)
Thanks for the explanation.

// Global variable
int ans  = 0;
int ans4 = 0;
int idx4 = 0;
boolean eof = false;
char[] buf4 = new char[4];

public int read(char[] buf, int n) {
    ans = 0; // Reset to reuse read()

    while(ans < n){
        while(ans < n && idx4 < ans4) buf[ans++] = buf4[idx4++];
        if(ans == n || eof) return ans;
        res4 = read4(buf4); // Read other chunk
        idx4 = 0; // Reset to reuse
        if(res4 < 4) eof = true;
    }

    return ans;
}

SHOW 4 REPLIES

Am I wrong in understanding this question ? At first question looked tough but it turned out to be very simple one. I didn't get the rationale behind this simple stupid question!!!

FYI, word is not 4 bytes, word is 2 bytes, dword is 4 bytes.
Please correct explanation

SHOW 2 REPLIES

Good intro to solution. But definitely 13 years ago at least companies didn't select candidates based on DDR questions. But yes questions used to be much simpler.

The example solutions does not consider consecutive read-by-N, unfortunately.

Consider this case: file = "abcdefg", n = 5. If we call the read() twice, what should the second read() get? If it is real world problem, we would expect the second read() to get 'fg', but the solutions here would get empty string "".

The first read() will call read4 twice, and get 'abcd', 'efg', then the solution returned 'abcde' 5 bytes, and discard 'fg'. The second read() calls read4, and the read4 maintained file pointer is already pointing to the end of file, then read4 returns empty. A complete solution should buffer the extra bytes internally, and prefix the left-over to the next read().

SHOW 1 REPLY

Approach 2 with protection against overflowing primary buffer - allocate an internal buffer for the last read if it might overflow.

class Solution {
public:
    int read(char *buf, int n) {
        int copiedChars = 0, readChars = 4;
        
        while (copiedChars < n && readChars == 4) {
            if (n - copiedChars < 4) {
                // internal buffer for final read that may overflow buf
                char buf4[4];
                readChars = read4(buf4);
                for (int i = 0; i < readChars; ++i) {
                    buf[copiedChars] = buf4[i];
                    ++copiedChars;
                }
            } else {
                readChars = read4(buf + copiedChars);
                copiedChars += readChars;
            }
        }
        return min(n, copiedChars);
    }
};

I agree with the approach 2, but definitely not the implementation of approach 2, which takes the risk of overflowing the given buffer.

SHOW 1 REPLY

user9633P

Explore Problems

Contest Discuss

Redeem My List My Playground Notebook Submissions Sessions Progress Points Subscription Sign out

Python - List - Append, Count, Extend, Index, Insert, Pop, Remove, Reverse, Sort

🐍 Advance List List is widely used and it's functionalities are heavily useful. Append Adds one element at the end of the list. Syntax list1.append(value) Input l1 = [1, 2, 3] l1.append(4) l1 Output [1, 2, 3, 4] append can be used to add any datatype in a list. It can even add list inside list. Caution: Append does not return anything. It just appends the list. Count .count(value) counts the number of occurrences of an element in the list. Syntax list1.count(value) Input l1 = [1, 2, 3, 4, 3] l1.count(3) Output 2 It returns 0 if the value is not found in the list. Extend .count(value) counts the number of occurrences of an element in the list. Syntax list1.extend(list) Input l1 = [1, 2, 3] l1.extend([4, 5]) Output [1, 2, 3, 4, 5] If we use append, entire list will be added to the first list like one element. Extend, i nstead of considering a list as one element, it joins the two lists one after other. Append works in the following way. Input l1 = [1, 2, 3] l1.append([4, 5]) Output...

Vinit Khandelwal

Search This Blog

157. Read N Characters Given Read4

157. Read N Characters Given Read4

Solution

Overview

Approach 1: Use Internal Buffer of 4 Characters

Approach 2: Speed Up: No Internal Buffer

Comments

Post a Comment

Popular posts from this blog

Resolve: Uncaught TypeError: firebase.database is not a function

Python - List - Append, Count, Extend, Index, Insert, Pop, Remove, Reverse, Sort

Difference between .exec() and .execPopulate() in Mongoose?