How to find the first non-repeating character in a stream efficiently?

Quen Wi
Updated 2 days ago in

Here’s a clean version you can use:

Short Question:
How do you design a data structure to efficiently track and retrieve the first non-repeating character in a stream?

Description:
In many data stream problems, characters arrive one by one, and we need to continuously determine the first non-repeating character at any given point.

The challenge is to design a data structure that:

  • Supports real-time updates as new characters arrive

  • Efficiently tracks frequencies of characters

  • Quickly returns the first character that has appeared only once so far

A brute-force approach would re-scan the entire stream after each insertion, which is inefficient. Instead, we need an optimized approach using a combination of data structures to maintain order and frequency.

Approach:
Use:

  • A hash map to store frequency of each character

  • A queue to maintain insertion order

Python Code:

from collections import deque

class FirstNonRepeating:
    def __init__(self):
        self.freq = {}
        self.queue = deque()

    def add(self, char):
        # Update frequency
        self.freq[char] = self.freq.get(char, 0) + 1
        
        # Add to queue
        self.queue.append(char)
        
        # Remove repeating characters from front
        while self.queue and self.freq[self.queue[0]] > 1:
            self.queue.popleft()

    def get_first_non_repeating(self):
        return self.queue[0] if self.queue else None


# Example usage
stream = FirstNonRepeating()

for ch in "aabcbd":
    stream.add(ch)
    print(stream.get_first_non_repeating())
  • 1
  • 19
  • 2 days ago
 
22 hours ago

To find the first non-repeating character in a stream efficiently, the key is to process characters in real time while keeping track of both frequency and order.

The idea

You need two things working together:

  1. A way to count how many times each character appears

  2. A way to remember the order in which characters arrived

Efficient approach

Use:

  • A hash map (dictionary) → to store frequency of each character

  • A queue → to track characters in the order they appear

How it works step by step

For every incoming character in the stream:

  1. Increase its count in the hash map

  2. Add it to the queue

  3. While the front of the queue has frequency > 1, remove it

  4. The front of the queue (if any) is your answer

Example

Stream: a a b c b

  • After a → first non-repeating = a

  • After second a → none

  • After bb

  • After cb

  • After second bc

Python implementation

from collections import deque

def first_non_repeating(stream):
    freq = {}
    q = deque()
    
    result = []
    
    for ch in stream:
        freq[ch] = freq.get(ch, 0) + 1
        q.append(ch)
        
        while q and freq[q[0]] > 1:
            q.popleft()
        
        result.append(q[0] if q else None)
    
    return result

Complexity

  • Time: O(n)

  • Space: O(k), where k is number of unique characters

Why this works well

  • You process each character only once

  • You never re-scan the stream

  • You always have the answer ready in constant time

Real-world relevance

This pattern is useful in:

  • Real-time log processing

  • Streaming data pipelines

  • Event-driven systems

If you’re building systems that deal with continuous data, this kind of approach becomes extremely powerful.

  • Liked by
Reply
Cancel
Loading more replies