What Is a Diff? A Beginner's Guide to Text Comparison

The word "diff" gets thrown around constantly in software development, document editing, and data work. But what does it actually mean, how does it work, and when would you use one? This guide explains everything from scratch — no technical background required.

The Simple Definition

A diff (short for "difference") is a comparison between two versions of a text file that shows exactly what changed between them. It highlights what was added, what was removed, and — depending on the tool — what was modified.

The word comes from the Unix command diff, first written in the early 1970s. But the concept is universal and now used across every field that deals with text: programming, writing, law, data analysis, system administration, and more.

Think of it like the "track changes" feature in Microsoft Word, but for any plain-text file — not just Word documents. A diff shows you the delta between two states: what the file looked like before, and what it looks like after.

A Concrete Example

Suppose you have two versions of the same sentence:

Version A: The quick brown fox jumps over the lazy dog.

Version B: The quick brown fox leaps over the sleeping dog.

A diff of these two lines would tell you:

  • Removed: "jumps", "lazy"
  • Added: "leaps", "sleeping"
  • Unchanged: everything else

At line level, the whole sentence changed. At word level, only two words changed. A good diff tool shows you both views, so you can zoom in or out depending on what you need to understand.
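
As a rough illustration, both views can be reproduced with Python's standard difflib module. The variable names are made up for this sketch, and splitting on whitespace is just one simple way to define a "word".

```python
import difflib

version_a = "The quick brown fox jumps over the lazy dog."
version_b = "The quick brown fox leaps over the sleeping dog."

# Line-level view: the two lines are not identical, so the whole line counts as changed.
line_changed = version_a != version_b

# Word-level view: split each sentence into words and compare the word sequences.
words_a = version_a.split()
words_b = version_b.split()
matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)

removed, added = [], []
for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
    if tag in ("replace", "delete"):
        removed.extend(words_a[a_start:a_end])
    if tag in ("replace", "insert"):
        added.extend(words_b[b_start:b_end])

print("Line changed:", line_changed)  # True
print("Removed:", removed)            # ['jumps', 'lazy']
print("Added:", added)                # ['leaps', 'sleeping']
```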

How Diff Algorithms Actually Work

You don't need to understand the algorithm to use a diff tool, but it helps to have a mental model of what's happening under the hood.

The core problem a diff algorithm solves is: given two sequences of lines (or words, or characters), what's the minimum set of changes needed to turn sequence A into sequence B? This is a version of the edit distance problem in which the only allowed changes are inserting an item and deleting an item.

The most widely used approach is finding the Longest Common Subsequence (LCS) — the longest sequence of lines that appear in both files in the same order, even if they're not contiguous. Everything in the LCS is "unchanged". Everything in file A that's not in the LCS was "removed". Everything in file B that's not in the LCS was "added".
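
The LCS idea can be sketched in a few lines of Python. This is a textbook dynamic-programming version written for clarity, not the optimised code a real diff tool would ship. The function name lcs_diff and the sample lists are invented for illustration.

```python
def lcs_diff(a, b):
    """Return (tag, line) pairs, where tag is ' ' (unchanged), '-' (removed) or '+' (added)."""
    n, m = len(a), len(b)
    # table[i][j] holds the length of the LCS of a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk both sequences from the top, using the table to pick each step.
    result = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            result.append((" ", a[i]))   # part of the LCS: unchanged
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            result.append(("-", a[i]))   # only in A: removed
            i += 1
        else:
            result.append(("+", b[j]))   # only in B: added
            j += 1
    # Whatever is left over on either side is a pure removal or addition.
    result.extend(("-", line) for line in a[i:])
    result.extend(("+", line) for line in b[j:])
    return result


old = ["apple", "banana", "cherry", "date"]
new = ["apple", "blueberry", "cherry", "date", "elderberry"]
for tag, line in lcs_diff(old, new):
    print(tag, line)
```

In this sample, "apple", "cherry" and "date" form the LCS, so they are reported as unchanged; "banana" is removed, and "blueberry" and "elderberry" are added. The table needs memory proportional to the product of the two lengths, which is one reason practical tools use faster algorithms.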

Modern diff tools use more sophisticated algorithms, such as the Myers diff algorithm (Git's default) or Patience diff, that are faster on large files or produce more human-readable output. TextFileCompare uses the Myers algorithm for exactly this reason.
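
For the curious, here is a compact sketch of the core loop of the Myers algorithm. It is a simplified illustration that only counts the insertions and deletions needed. Real implementations also record the path so they can print the actual changes, and nothing here should be read as the code Git or TextFileCompare actually runs. The function name myers_edit_count is hypothetical.

```python
def myers_edit_count(a, b):
    """Return the minimum number of insertions plus deletions that turn a into b."""
    n, m = len(a), len(b)
    # furthest[k] is the largest x reached so far on diagonal k, where k = x - y.
    furthest = {1: 0}
    for d in range(n + m + 1):              # d = number of edits tried so far
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and furthest[k - 1] < furthest[k + 1]):
                x = furthest[k + 1]         # step down: take an item from b (insertion)
            else:
                x = furthest[k - 1] + 1     # step right: skip an item of a (deletion)
            y = x - k
            # Follow the diagonal for free while the items match.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest[k] = x
            if x >= n and y >= m:
                return d
    return n + m


words_a = "The quick brown fox jumps over the lazy dog.".split()
words_b = "The quick brown fox leaps over the sleeping dog.".split()
print(myers_edit_count(words_a, words_b))  # 4: two deletions plus two insertions
```

The speed-up comes from exploring only as many edits as are actually needed: two nearly identical files finish after a handful of rounds, however long they are.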

Reading Diff Output

If you've ever seen raw diff output in a terminal, it can look cryptic. Here's what it means:

- The quick brown fox jumps over the lazy dog.
+ The quick brown fox leaps over the sleeping dog.

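Lines starting with - (in red in most tools) were in the original file and are gone in the new version. Lines starting with + (in green) were added in the new version. Lines with neither marker are unchanged context; in the common unified format they begin with a single space. A - line followed directly by a + line, as above, usually means the line was modified rather than truly deleted and re-added.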
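
Output in this format can be generated with Python's standard difflib module. The file names old.txt and new.txt and the sample lines are placeholders for this sketch.

```python
import difflib

old_lines = [
    "Dear team,",
    "The quick brown fox jumps over the lazy dog.",
    "Regards",
]
new_lines = [
    "Dear team,",
    "The quick brown fox leaps over the sleeping dog.",
    "Regards",
]

# lineterm="" because the sample lines carry no trailing newline characters.
diff_lines = list(
    difflib.unified_diff(
        old_lines, new_lines, fromfile="old.txt", tofile="new.txt", lineterm=""
    )
)
for line in diff_lines:
    print(line)

# Printed output:
# --- old.txt
# +++ new.txt
# @@ -1,3 +1,3 @@
#  Dear team,
# -The quick brown fox jumps over the lazy dog.
# +The quick brown fox leaps over the sleeping dog.
#  Regards
```

The --- and +++ lines name the two files, and the @@ line says which line ranges of each file the block covers (here, lines 1 to 3 of both).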

Visual diff tools like TextFileCompare translate all of this into a side-by-side layout with colour coding, making it much easier to read without needing to parse the raw format.

Line-Level vs Word-Level vs Character-Level Diffs

Line-level diff

The default in most tools. The algorithm compares files line by line. Good for code, where each line is a discrete unit of meaning. Less useful for prose, where a paragraph is often stored as one long line, so changing a single word marks the entire paragraph as changed.
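
A line-level comparison can be sketched by handing lists of lines to difflib's SequenceMatcher. The helper name changed_lines and the sample snippet are illustrative assumptions.

```python
import difflib

def changed_lines(old_text, new_text):
    """Return (tag, old_lines, new_lines) for every stretch of lines that differs."""
    old = old_text.splitlines()
    new = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

old_text = "def area(w, h):\n    return w * h\nprint(area(2, 3))\n"
new_text = "def area(w, h):\n    return w * h / 2\nprint(area(2, 3))\n"

for tag, before, after in changed_lines(old_text, new_text):
    print(tag, before, after)
# replace ['    return w * h'] ['    return w * h / 2']
```

Only the second line is reported, and it is reported as a whole: a line-level comparison does not say which part of the line changed.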

Word-level diff

Compares files word by word within changed lines. Excellent for prose, legal text, and documentation — you see exactly which words were substituted or added. TextFileCompare highlights within-line word changes so you can see not just that a line changed, but precisely what within it changed.
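
One way to picture within-line highlighting is to mark removed and added words inline. The sketch below uses [-...-] and {+...+} markers, similar in spirit to Git's word-diff output. The function name word_diff and the sample sentence are made up, and this is not a description of how TextFileCompare renders changes.

```python
import difflib

def word_diff(old_line, new_line):
    """Render word-level changes inline as [-removed-] and {+added+}."""
    old_words = old_line.split()
    new_words = new_line.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(old_words[i1:i2]))
        if tag in ("replace", "delete"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("replace", "insert"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)

print(word_diff(
    "The tenant shall pay rent monthly in advance.",
    "The tenant shall pay rent quarterly in advance.",
))
# The tenant shall pay rent [-monthly-] {+quarterly+} in advance.
```

Because the sketch splits on whitespace, it treats runs of spaces as a single separator and keeps punctuation attached to the neighbouring word.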

Character-level diff

The most granular level — compares character by character. Useful for catching subtle typos or single-character changes that might be hard to spot even with word-level highlighting.
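
A character-level comparison can be sketched the same way, by treating each string as a sequence of characters. The function name char_changes and the sample strings (a letter O typed in place of a zero) are illustrative.

```python
import difflib

def char_changes(old, new):
    """Return (position, old_chars, new_chars) for each differing stretch."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

print(char_changes("user_id = 1O24", "user_id = 1024"))
# [(11, 'O', '0')]
```

A word-level view would flag the whole token 1O24; the character-level view points at the single offending character.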

When Would You Actually Use a Diff?

In software development

Developers use diffs constantly: in code review (what did this pull request actually change?), in debugging (what changed between the version that worked and the one that doesn't?), and in merge conflict resolution. Git, the most popular version control system, relies on diffs to show what changed between any two versions of a file.

In writing and editing

Writers comparing drafts, editors reviewing revisions, and anyone proofreading a document before submitting can use a diff to instantly see every change between two versions — without having to read both documents from start to finish.

In legal and compliance work

Contract redlining — comparing an original contract to a revised version — is a daily task in legal work. Diff tools make this faster and more accurate than manual review, especially for long documents with many small changes.

In data and configuration work

System administrators compare configuration files before and after a change to understand what was modified. Data engineers compare data exports to verify that a transformation didn't alter anything unexpected.

Unified Diff vs Side-by-Side Diff

A unified diff shows both versions interleaved in a single column, with removed lines marked - (usually shown in red) and added lines marked + (usually in green). This is compact and useful for quickly scanning changes.

A side-by-side diff shows the original version in the left column and the new version in the right column, with corresponding lines aligned. This makes it much easier to compare individual lines directly and understand the context of a change. TextFileCompare defaults to side-by-side, which most users find more readable — particularly for longer files or when comparing prose.
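
A bare-bones text version of a side-by-side layout can be sketched as follows. The function name side_by_side, the column width and the sample lines are assumptions made for illustration; the |, < and > markers loosely follow the side-by-side mode of the Unix diff command, and a visual tool would use colour instead.

```python
import difflib

# " " = unchanged, "|" = changed, "<" = only in the old file, ">" = only in the new file.
MARKERS = {"equal": " ", "replace": "|", "delete": "<", "insert": ">"}

def side_by_side(old_lines, new_lines, width=20):
    """Return text rows with the old line on the left and the new line on the right."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    rows = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left = old_lines[i1:i2]
        right = new_lines[j1:j2]
        # Pad the shorter side with blanks so corresponding lines stay aligned.
        height = max(len(left), len(right))
        left += [""] * (height - len(left))
        right += [""] * (height - len(right))
        for old_line, new_line in zip(left, right):
            rows.append(f"{old_line:<{width}} {MARKERS[tag]} {new_line}")
    return rows

old = ["alpha", "beta", "gamma"]
new = ["alpha", "gamma", "delta"]
for row in side_by_side(old, new, width=8):
    print(row)
```

For this sample the rows show alpha and gamma on both sides, beta only on the left (marked <) and delta only on the right (marked >). The sketch does not truncate or wrap lines longer than the column width.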

Summary

A diff is simply a structured way to see what changed between two versions of a text file. Diff algorithms find the minimum set of additions and deletions needed to turn one version into another. Visual diff tools present this in a readable side-by-side layout with colour coding. They're used by developers, writers, lawyers, data professionals, and anyone who works with text and needs to understand what changed.
