Why does my neural network overfit despite using dropout and early stopping?

Tariq
Updated on March 17, 2026

I’m training a simple deep learning model, but it still overfits even after applying dropout and early stopping. Training accuracy is high, but validation performance drops.

 
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(20,)),
    layers.Dropout(0.5),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=50,
                    batch_size=32)

 

What are the common reasons this still happens in practice, and how can it be mitigated beyond basic regularization?

March 29, 2026

Overfitting can still happen even with dropout and early stopping if the model capacity is too high or the data is limited.

You can combine multiple strategies instead of relying on just those two:

import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

model = models.Sequential([
    layers.Dense(128, activation='relu', 
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.BatchNormalization(),
    layers.Dropout(0.5),

    layers.Dense(64, activation='relu', 
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.BatchNormalization(),
    layers.Dropout(0.5),

    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=5,
    restore_best_weights=True
)

history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=100,
    batch_size=32,
    callbacks=[early_stop]
)
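
To confirm the changes actually help, compare the final train/validation accuracy gap in the returned `History` object. A minimal sketch, using a made-up history dict as a stand-in for `history.history` from the fit call above:

```python
# Sketch: quantify the generalization gap from a Keras-style history dict.
# The dict below is invented for illustration; in practice pass history.history.
def generalization_gap(hist):
    """Return final train accuracy, final val accuracy, and their gap."""
    train_acc = hist['accuracy'][-1]
    val_acc = hist['val_accuracy'][-1]
    return train_acc, val_acc, train_acc - val_acc

train_acc, val_acc, gap = generalization_gap(
    {'accuracy': [0.70, 0.95], 'val_accuracy': [0.68, 0.80]}
)
print(round(gap, 2))  # → 0.15
```

A gap that shrinks across reruns is a better signal than validation accuracy alone, since both numbers can move together.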

Key things added here:

  • L2 regularization to penalize large weights

  • Batch normalization for more stable learning

  • A second dropout layer, plus an EarlyStopping callback that restores the best weights
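
For intuition, the L2 term just adds λ·Σw² to the loss, so large weights become expensive. A minimal sketch, with λ matching the 0.001 passed to `regularizers.l2` above:

```python
# Minimal sketch of the penalty L2 regularization adds to the loss.
def l2_penalty(weights, lam=0.001):
    """lam * sum of squared weights, as added to the training loss."""
    return lam * sum(w * w for w in weights)

print(round(l2_penalty([3.0, -2.0, 0.5]), 5))  # → 0.01325
```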

Also worth checking:

  • Data leakage between train and validation

  • Feature quality and noise

  • Whether your dataset is large enough for the model
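
Data leakage in particular can be checked mechanically. A rough sketch, assuming `X_train` and `X_val` are NumPy arrays with the same dtype and column order:

```python
import numpy as np

def leaked_rows(X_train, X_val):
    """Count validation rows that appear verbatim in the training set.
    Assumes both arrays share dtype and column order; near-duplicates
    (e.g. after float noise) would need a fuzzier comparison."""
    train_rows = {row.tobytes() for row in np.asarray(X_train)}
    return sum(row.tobytes() in train_rows for row in np.asarray(X_val))

# Tiny example: the first validation row duplicates a training row.
X_train = np.array([[1.0, 2.0], [3.0, 4.0]])
X_val = np.array([[3.0, 4.0], [5.0, 6.0]])
print(leaked_rows(X_train, X_val))  # → 1
```

Any count above zero means the validation score is optimistic and overfitting is being partly masked, not measured.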

Sometimes the fix isn’t more regularization but a simpler model or better data.
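
On the "simpler model" point, it helps to see how quickly dense layers accumulate parameters. A back-of-the-envelope sketch comparing the 20→128→64→1 network above with a hypothetical 20→16→1 baseline:

```python
# Weights + biases for a chain of fully connected (Dense) layers.
def dense_params(sizes):
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

print(dense_params([20, 128, 64, 1]))  # → 11009 (the model above)
print(dense_params([20, 16, 1]))       # → 353 (a far smaller baseline)
```

If the training set has only a few thousand rows, an 11k-parameter model has ample capacity to memorize it, which is exactly when regularization alone stops being enough.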
