Give Me 5 Minutes and I’ll Explain Why You’re Splitting Functions Wrong

Everything You Need to Know About When to Split Functions for Clean Code

Feb 07, 2025


Just a couple of days ago, I was refactoring a complex greedy algorithm in Cresta’s codebase.

I made the mistake of splitting the algorithm into too many functions, which made it harder to understand because I had to jump between them constantly. Despite having reviewed nearly 1,000 pull requests, I realized there’s always something new to learn about functions.

In this article, we’ll explore how to recognize when it’s a good idea to split a function and dive into a real-world code example from React’s codebase.

To Split Or Not To Split?

Software engineers often break up functions too much. Length alone isn’t a reliable indicator of whether a function should be split.

Here are some guidelines to help you decide:

When to Split:

Multiple Responsibilities: If you’re trying to name a function and find yourself using more than one verb (e.g., validateAndProcessData), it’s a sign that the function is doing more than one thing.
Reusability: If a block of code is repeated in multiple places, extract it into a separate function to avoid duplication.
Complexity: If a function becomes so long or complex that it is hard to follow (e.g., deeply nested loops or conditionals), split it into smaller, more manageable functions.
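To make the first guideline concrete, here is a minimal Python sketch of what splitting a hypothetical validate_and_process_order function might look like. The function names and the shape of the order dictionary are assumptions made up for this illustration.

```python
def validate_order(order):
    """Raise ValueError if the order is missing required data."""
    if not order.get("items"):
        raise ValueError("order has no items")
    for item in order["items"]:
        if item["quantity"] <= 0:
            raise ValueError(f"invalid quantity for {item['name']}")


def compute_total(order):
    """Return the total price of an already validated order."""
    return sum(item["price"] * item["quantity"] for item in order["items"])


def checkout(order):
    # Each helper has a single verb in its name and can be reused on its own,
    # while the caller still sees one entry point.
    validate_order(order)
    return compute_total(order)
```

Both helpers can be read, tested, and reused without knowing anything about each other, which is what makes this split worthwhile.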

When Not to Split:

Avoid Over-Engineering: Don’t split functions just for the sake of splitting. Over-engineering can introduce unnecessary complexity.
Tight Coupling: If splitting a function results in tightly coupled functions (e.g., one function depends heavily on the internal logic of another), it’s better to keep them together.
Conjoined Functions: Each function should be understandable independently. If you can’t grasp the implementation of one function without also understanding another, that’s a red flag.
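Here is a small sketch of what conjoined functions can look like, next to a version that keeps the logic together. The key=value parsing task and all function names are hypothetical and only serve as an example.

```python
# Conjoined: _find_separator returns an index that only makes sense once you
# know how _split_at interprets it (including that -1 means "no value").
def _find_separator(line):
    return line.find("=")


def _split_at(line, index):
    if index == -1:
        return line.strip(), None
    return line[:index].strip(), line[index + 1:].strip()


def parse_setting_conjoined(line):
    return _split_at(line, _find_separator(line))


# Kept together: the whole rule is visible in one place.
def parse_setting(line):
    key, separator, value = line.partition("=")
    if not separator:
        return key.strip(), None
    return key.strip(), value.strip()
```

Both versions return the same results, but you can't review _split_at without reading _find_separator first. The single function is shorter and needs no extra context.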

How Splitting Functions Affects the Codebase

Let’s take a look at the image below, which visualizes how a function can be split and the side effects of doing so.

Splitting functions can have significant implications for your codebase:

More Functions Mean More Interfaces: Each new function introduces an additional interface that needs to be documented, learned, and maintained.
Loss of Independence: If functions are made too small, they can lose their independence, resulting in conjoined functions that must be read and understood together.
Depth Over Length: Focus on making functions deep (i.e., encapsulating meaningful logic) before worrying about making them short. Once they’re deep, aim to make them concise enough to be easily readable.

A function (Case A) can be split in two ways:

Extracting a Subtask (Case B): This maintains a single interface for the caller while breaking down the logic into smaller, reusable pieces.
Dividing Functionality into Separate Methods (Case C): This can be useful if the tasks are distinct and reusable.

However, a function should not be split if it results in shallow methods (Case D), where the caller has to manage too many small, interdependent interfaces.

Case A: Good — Single Interface

# Example word lists (illustrative only; swap in a real lexicon)
POSITIVE_WORDS = {"good", "great", "love"}
NEGATIVE_WORDS = {"bad", "awful", "hate"}

def analyze_text_sentiment(text):
    # Single interface that handles everything internally
    words = text.lower().split()
    sentiment_score = 0

    for word in words:
        if word in POSITIVE_WORDS:
            sentiment_score += 1
        elif word in NEGATIVE_WORDS:
            sentiment_score -= 1

    return {
        'overall_sentiment': 'positive' if sentiment_score > 0 else 'negative',
        # Guard against empty input to avoid dividing by zero
        'confidence': abs(sentiment_score) / len(words) if words else 0.0,
        'word_count': len(words),
        'sentiment_score': sentiment_score
    }

Case B: Good — Subtask Extraction but Still Single Interface

def calculate_sentiment_metrics(text):
    words = text.lower().split()
    sentiment_score = 0
    
    for word in words:
        if word in POSITIVE_WORDS:
            sentiment_score += 1
        elif word in NEGATIVE_WORDS:
            sentiment_score -= 1
            
    return sentiment_score, len(words)

def analyze_text_sentiment(text):
    # Caller still has a single, clean interface
    sentiment_score, word_count = calculate_sentiment_metrics(text)
    return {
        'overall_sentiment': 'positive' if sentiment_score > 0 else 'negative',
        'confidence': abs(sentiment_score) / word_count if word_count else 0.0,
        'word_count': word_count,
        'sentiment_score': sentiment_score
    }
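Case C, dividing functionality into separate functions, could look like the following sketch. It assumes that scoring and word counting are each useful on their own. The function names and the word lists are hypothetical, and the lists are repeated here only so the sketch runs by itself.

```python
# Hypothetical word lists, used only to make this sketch runnable.
POSITIVE_WORDS = {"good", "great", "love"}
NEGATIVE_WORDS = {"bad", "awful", "hate"}


def score_sentiment(text):
    """Return the net sentiment score (positive minus negative words)."""
    score = 0
    for word in text.lower().split():
        if word in POSITIVE_WORDS:
            score += 1
        elif word in NEGATIVE_WORDS:
            score -= 1
    return score


def count_words(text):
    """Return the number of whitespace-separated words in the text."""
    return len(text.split())
```

Unlike the shallow helpers in Case D, each of these functions does a complete job and can be called without knowing about the other.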

Case D: Avoid — Multiple Shallow Interfaces

# Multiple shallow interfaces that the caller needs to manage
def normalize_text(text):
    return text.lower()

def split_text(text):
    return text.split()

def check_word_sentiment(word):
    if word in POSITIVE_WORDS:
        return 1
    elif word in NEGATIVE_WORDS:
        return -1
    return 0

def calculate_confidence(score, count):
    return abs(score) / count

# Caller now has to manage many small interfaces
def process_text(text):
    normalized = normalize_text(text)
    words = split_text(normalized)
    sentiment = 0
    for word in words:
        sentiment += check_word_sentiment(word)
    # The caller even has to remember to guard against empty input itself
    confidence = calculate_confidence(sentiment, len(words)) if words else 0.0
    # This is the worst case: too many small interfaces to manage
    return sentiment, confidence

To get a better sense of when to split functions and when to keep them as a single unit, let’s investigate React’s open-source code.

Real World Example 1: Split!

The function runWithEnvironment is a prime example of a function that should have been split:

It performs multiple distinct operations (pruning, merging, validation, code generation)
Each operation has its own logging step
The operations are loosely coupled — they could be understood independently
The function is quite long and handles multiple levels of abstraction

Take a look at the long function below:

Source: https://github.com/facebook/react

If we split the function it looks like this:

runWithEnvironment split into several functions

Splitting this function gives us the following benefits:

Clear Phases: Each function represents a distinct phase in the compilation pipeline.
Better Testing: Each phase can be tested independently
Improved Readability: Each function has a clear single responsibility and is easier to understand
Easier Maintenance: Changes to one phase don’t require understanding the entire pipeline
Better Error Handling: Each phase could potentially have its own error handling strategy
Documentation: The function names themselves serve as documentation of the compilation pipeline stages
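To show the shape of such a split without reproducing React's code, here is a simplified Python sketch of a phase-based pipeline. This is not the React compiler's implementation: the phase functions, the node dictionaries, and the log messages are all hypothetical and only mirror the four kinds of operations listed earlier (pruning, merging, validation, code generation).

```python
def prune_unused(nodes):
    """Phase 1: drop nodes marked as unused."""
    return [n for n in nodes if not n.get("unused")]


def merge_adjacent(nodes):
    """Phase 2: merge consecutive nodes that share the same scope."""
    merged = []
    for node in nodes:
        if merged and merged[-1]["scope"] == node["scope"]:
            merged[-1] = {**merged[-1], "ops": merged[-1]["ops"] + node["ops"]}
        else:
            merged.append(dict(node))
    return merged


def validate(nodes):
    """Phase 3: fail early if any node has no operations."""
    for node in nodes:
        if not node["ops"]:
            raise ValueError(f"empty node in scope {node['scope']}")


def generate_code(nodes):
    """Phase 4: render each node as one line of output."""
    return "\n".join(f"{n['scope']}: {' '.join(n['ops'])}" for n in nodes)


def run_pipeline(nodes, log=print):
    # The top-level function reads like a table of contents for the pipeline.
    nodes = prune_unused(nodes)
    log(f"pruned to {len(nodes)} nodes")
    nodes = merge_adjacent(nodes)
    log(f"merged to {len(nodes)} nodes")
    validate(nodes)
    log("validation passed")
    return generate_code(nodes)
```

Each phase takes plain data in and returns plain data out, so it can be tested without running the phases around it.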

Real World Example 2: Don’t Split!

Let’s take a look at a function that should not be split. The warnForMissingKey function from the react-reconciler package is a good example of this:

Here’s why the function should stay as a single unit despite its length:

Single Purpose: The function has one clear responsibility — warning about missing keys in React children. All its logic is dedicated to building that single warning message.
State Flow: The function follows a natural progression that includes early returns for invalid cases, store validation checks, component name resolution, error message building with context, and the final error output. Splitting it up would disrupt this intuitive flow.
Shared Context: It accumulates all necessary context information (componentName, currentComponentErrorInfo, childOwnerAppendix) to generate the final error message. Separating the function would force you to pass this shared context around unnecessarily.
Progressive Enhancement: The error message is built in layers — starting with basic component info, then adding parent component context, and finally including child owner context — making it more detailed as conditions are met.
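The points above can be illustrated with a simplified Python sketch. It is not React's warnForMissingKey: the function name, the child dictionary, the warned set, and the message wording are all assumptions, loosely modeled on the flow described above.

```python
def build_missing_key_warning(child, parent_name=None, warned=None):
    """Return a warning string for a child without a key, or None."""
    warned = set() if warned is None else warned

    # Early returns: nothing to report, or already reported for this parent.
    if child.get("key") is not None:
        return None
    if parent_name in warned:
        return None
    warned.add(parent_name)

    # The message is built in layers, and every layer reads the same locals.
    message = 'Each child in a list should have a unique "key" prop.'
    if parent_name:
        message += f" Check the render method of `{parent_name}`."
    owner = child.get("owner")
    if owner and owner != parent_name:
        message += f" It was passed a child from `{owner}`."
    return message
```

Pulling each layer into its own helper would mean passing message, parent_name, and owner back and forth, and no helper would make sense on its own.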

Real World Example 3: Split!

The getOwnerStackByFiberInDev function is a perfect candidate for a split.

This function should be split because it does multiple distinct operations:

Initial component type handling
Owner stack processing
Debug stack formatting
Server component handling

Take a look at the split below:

Here is why it is better:

Clearer Responsibilities: Each function handles one specific type of stack processing
Better Error Isolation: Each part can be tested and debugged independently
Improved Readability: The main function shows the high-level flow while delegating details
Easier Maintenance: Changes to server or client stack handling can be made independently
Better Type Safety: Each function can have more specific type signatures

Final Words

It takes some time to learn when you should split a function, and no one is perfect at it. And as you can see from the real-world examples, even people who get paid $160K+ per year make mistakes.

As a key takeaway, keep this in mind:

Each function should be independently understandable. Needing another function’s context to grasp one is a red flag.

Cheers,
Lorenz

