find, locate, and xargs#

Concepts#

find — Search for Files by Criteria#

find walks a directory tree and tests each file against your criteria. It is flexible and available on virtually every Unix-like system.

find [starting-path] [tests] [actions]

Search by Name#

# Find by exact name
find /etc -name "hosts"

# Case-insensitive
find /home -iname "readme*"

# Wildcards (must be quoted to prevent shell expansion)
find . -name "*.log"
find . -name "*.txt" -o -name "*.md"     # -o = OR

Search by Type#

find /var -type f          # regular files
find /var -type d          # directories
find /var -type l          # symbolic links

Search by Size#

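find /var/log -size +10M          # larger than 10 MiB
find . -size -1024c               # smaller than 1 KiB (-1k would match only empty files)
find . -size +100M -size -1024M   # larger than 100 MiB and just under 1 GiB (-1G would match only empty files)
# Units: c=bytes, k=KiB, M=MiB, G=GiB. Sizes are rounded up to whole units before comparing.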
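
The rounding rule is easy to get wrong, so here is a minimal sketch of it, assuming GNU find. The scratch directory and file names are made up for illustration.

```shell
# Scratch directory with three files of known size (names are illustrative).
dir=$(mktemp -d)
: > "$dir/empty.dat"                          # 0 bytes
head -c 500  /dev/zero > "$dir/small.dat"     # 500 bytes
head -c 2048 /dev/zero > "$dir/two_kib.dat"   # 2048 bytes

# -size -1k: sizes are rounded up to whole KiB first, so 500 bytes counts
# as 1k and only the empty file is "less than 1k".
rounded=$(find "$dir" -type f -size -1k | wc -l)

# -size -1024c: compares in bytes, so the 0-byte and 500-byte files match.
exact=$(find "$dir" -type f -size -1024c | wc -l)

echo "matches for -size -1k:    $rounded"
echo "matches for -size -1024c: $exact"
rm -rf "$dir"
```

When you need a "smaller than" boundary, expressing it in the next smaller unit (bytes for KiB, MiB for GiB) avoids the surprise.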

Search by Time#

# Modified time (days)
find /tmp -mtime -1       # modified in the last 24 hours
find /tmp -mtime +30      # modified at least 31 days ago (age is truncated to whole days)

# Accessed time
find . -atime -7          # accessed in the last 7 days

# Modified time (minutes)
find . -mmin -60          # modified in the last 60 minutes

# Newer than a file
find . -newer reference.txt
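
To see how the time tests behave, the sketch below backdates a few files and queries them. It assumes GNU touch (for the `-d '3 days ago'` syntax) and GNU find; the file names are illustrative.

```shell
dir=$(mktemp -d)
touch -d '3 days ago' "$dir/old.txt"   # GNU touch relative-date syntax (assumption)
touch -d '2 days ago' "$dir/ref.txt"
touch "$dir/new.txt"                   # modified just now

# -mtime -1: age in whole days is less than 1, so only new.txt matches.
recent=$(find "$dir" -type f -mtime -1 -exec basename {} \;)

# -mtime +2: age in whole days is greater than 2. ref.txt is 2 days old
# after truncation, so it does not match; only old.txt does.
stale=$(find "$dir" -type f -mtime +2 -exec basename {} \;)

# -newer: modified more recently than the reference file.
newer=$(find "$dir" -type f -newer "$dir/ref.txt" -exec basename {} \;)

echo "recent: $recent"
echo "stale:  $stale"
echo "newer:  $newer"
rm -rf "$dir"
```

Because fractional days are dropped before the comparison, `+N` effectively means "at least N+1 days old".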

Search by Permissions and Ownership#

# Exact permission
find /usr -perm 755

# At least these permissions set
find . -perm -644

# Owned by user
find / -user kmiguel 2>/dev/null

# Owned by group
find / -group www-data 2>/dev/null

# Files with no owner (orphaned)
find / -nouser 2>/dev/null
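
The difference between the permission forms is clearer with concrete modes. This is a small sketch with illustrative file names; the `-perm /mode` form ("any of these bits") is not covered above and is assumed to be available, as it is in GNU find.

```shell
dir=$(mktemp -d)
touch "$dir/private" "$dir/shared" "$dir/script"
chmod 600 "$dir/private"
chmod 644 "$dir/shared"
chmod 755 "$dir/script"

# -perm 644: the mode is exactly 644 (only "shared").
exact=$(find "$dir" -type f -perm 644 | wc -l)

# -perm -644: every bit of 644 is set; extra bits are allowed
# ("shared" and "script", since 755 includes all of 644's bits).
all_bits=$(find "$dir" -type f -perm -644 | wc -l)

# -perm /111: at least one execute bit is set (only "script").
any_exec=$(find "$dir" -type f -perm /111 | wc -l)

echo "exact 644: $exact, at least 644: $all_bits, any exec bit: $any_exec"
rm -rf "$dir"
```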

Search by Depth#

find . -maxdepth 1 -name "*.txt"    # current directory only
find . -maxdepth 2 -type d          # at most 2 levels deep
find . -mindepth 2 -name "*.conf"   # skip the first level
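
Depth counting starts at 0 for the starting path itself, which the following sketch makes visible. The directory layout is invented for the example, and -maxdepth/-mindepth are assumed to be supported, as they are in GNU and BSD find.

```shell
dir=$(mktemp -d)
mkdir -p "$dir/sub/deep"
touch "$dir/top.conf" "$dir/sub/mid.conf" "$dir/sub/deep/low.conf"

# Depth 1 only: top.conf
top_only=$(find "$dir" -maxdepth 1 -name "*.conf" | wc -l)

# Depth 2 and deeper: mid.conf and low.conf
below_top=$(find "$dir" -mindepth 2 -name "*.conf" | wc -l)

# Exactly depth 2: mid.conf
level_two=$(find "$dir" -mindepth 2 -maxdepth 2 -name "*.conf" | wc -l)

echo "top only: $top_only, below top: $below_top, exactly level two: $level_two"
rm -rf "$dir"
```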

Combining Tests#

# AND (default) — both must be true
find . -name "*.log" -size +1M

# OR
find . -name "*.jpg" -o -name "*.png"

# NOT
find . ! -name "*.tmp"
find . -not -user root

# Grouping with parentheses (must be escaped)
find . \( -name "*.log" -o -name "*.tmp" \) -mtime +30
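
Parentheses matter most once an explicit action is present, because AND binds more tightly than OR. A minimal sketch of the pitfall, using illustrative file names:

```shell
dir=$(mktemp -d)
touch "$dir/app.log" "$dir/cache.tmp" "$dir/notes.txt"

# Without parentheses this parses as:
#   -name "*.log"  OR  ( -name "*.tmp" AND -print )
# so app.log matches but is never printed.
ungrouped=$(find "$dir" -name "*.log" -o -name "*.tmp" -print | wc -l)

# With parentheses, -print applies to either name.
grouped=$(find "$dir" \( -name "*.log" -o -name "*.tmp" \) -print | wc -l)

echo "ungrouped prints $ungrouped file(s); grouped prints $grouped file(s)"
rm -rf "$dir"
```

The same applies to -delete and -exec: group the OR-ed tests first, then attach the action.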

Actions#

# Default action: -print (show the path)
find . -name "*.txt"

# Delete matching files (keep -delete last; swap in -print first to preview)
find /tmp -name "*.tmp" -delete

# Execute a command on each result
find . -name "*.log" -exec ls -lh {} \;
# {} is replaced by the filename. \; ends the command.

# Execute with confirmation
find . -name "*.bak" -ok rm {} \;

# More efficient: pass multiple files at once (like xargs)
find . -name "*.txt" -exec grep -l "TODO" {} +
# {} + passes as many files as possible in one command invocation
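
One way to observe the difference between the two terminators is to count invocations. In this sketch each echo call prints one line, so the line count equals the number of times the command ran. File names are illustrative.

```shell
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# \; runs echo once per file: three invocations, three lines.
per_file=$(find "$dir" -name "*.txt" -exec echo {} \; | wc -l)

# + runs echo once with all three paths as arguments: one line.
batched=$(find "$dir" -name "*.txt" -exec echo {} + | wc -l)

echo "invocations with \\;: $per_file"
echo "invocations with +:  $batched"
rm -rf "$dir"
```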

locate: Search a Pre-built Database#

locate searches a pre-built database, which makes it much faster than find, but the database may be stale.

# Install (not always pre-installed)
sudo apt install -y plocate     # Ubuntu 22+/Debian 12+
# (older systems: sudo apt install mlocate)

# Update the database
sudo updatedb

# Search
locate hosts
locate -i readme          # case-insensitive
locate -c "*.conf"        # count matches
locate -r '\.py$'         # regex
locate --existing hosts   # only show files that still exist

The database is usually refreshed automatically once a day by a cron job or systemd timer. To include files created since the last refresh, run sudo updatedb.

find vs locate#

            find                                  locate
Speed       Slower (walks filesystem)             Very fast (searches database)
Freshness   Always current                        May be stale
Criteria    Name, size, time, permissions, etc.   Name/path only
Actions     Can delete, exec commands             Search only

Use locate to quickly find files by name. Use find when you need criteria beyond the name or guaranteed current results.

xargs — Build Commands from stdin#

xargs reads items from stdin and executes a command with those items as arguments. It is often used with find and grep.

# Basic usage: pass stdin as arguments
echo "file1.txt file2.txt file3.txt" | xargs rm

# With find (alternative to -exec; breaks on names containing spaces or newlines)
find . -name "*.log" | xargs ls -lh

# Handle filenames with spaces (null-delimited)
find . -name "*.txt" -print0 | xargs -0 rm
# -print0: find outputs null-separated names
# -0: xargs reads null-separated input
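
The effect of null-delimited input can be checked with a single file whose name contains a space. This is a sketch with an illustrative file name, assuming the scratch directory path itself contains no whitespace.

```shell
dir=$(mktemp -d)
touch "$dir/my notes.txt"

# Default splitting: xargs breaks the path at the space, so echo receives
# two separate items.
split=$(find "$dir" -name "*.txt" | xargs -n 1 echo | wc -l)

# Null-delimited: the path arrives as one item.
whole=$(find "$dir" -name "*.txt" -print0 | xargs -0 -n 1 echo | wc -l)

echo "items seen without -0: $split"
echo "items seen with -0:    $whole"
rm -rf "$dir"
```

With a destructive command such as rm in place of echo, the split case would try to remove two paths that were never matched.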

Controlling Argument Placement#

# -I {} — replace {} with each input item (one at a time)
find . -name "*.conf" | xargs -I {} cp {} /backup/

# Process one item at a time
echo "a b c" | xargs -n 1 echo
# a
# b
# c

# Process two items at a time
echo "a b c d" | xargs -n 2 echo
# a b
# c d
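
With -I, xargs treats each input line as one item and may substitute it more than once in the command. The sketch below only echoes the commands it would run, so nothing is copied; the file names and the /backup/ target are illustrative.

```shell
# Two input lines; the second contains a space on purpose.
plan=$(printf '%s\n' "app.conf" "my site.conf" |
  xargs -I {} echo "cp {} /backup/{}")

echo "$plan"
```

Echoing first is a cheap dry run: once the printed commands look right, drop the echo and the quotes around the command.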

Parallel Execution#

# Run up to 4 processes in parallel
find . -name "*.gz" -print0 | xargs -0 -P 4 gunzip

# Combine with -n for parallel, one-item-at-a-time processing
xargs -n 1 -P 8 curl -O < urls.txt
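
Parallel jobs finish in no particular order, so any output that needs a stable order has to be sorted afterwards. A small sketch, using a stand-in job (an echo inside sh -c) rather than a real workload:

```shell
# Run 8 stand-in jobs, at most 4 at a time. The "_" fills $0 for sh,
# so each input item arrives as $1.
results=$(seq 1 8 |
  xargs -n 1 -P 4 sh -c 'echo "job $1 done"' _ |
  sort -k2,2n)   # completion order varies, so sort by job number

echo "$results"
```

The sh -c wrapper is one way to run more than a single command per item; for a single command such as gunzip it is not needed.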

Practical Examples#

# Delete all .tmp files (safe with spaces in names)
find /tmp -name "*.tmp" -print0 | xargs -0 rm -f

# Grep across all Python files
find . -name "*.py" -print0 | xargs -0 grep -l "import os"

# Count lines in all shell scripts
find . -name "*.sh" -print0 | xargs -0 wc -l

# Change permissions on all directories
find . -type d -print0 | xargs -0 chmod 755

# Change permissions on all files
find . -type f -print0 | xargs -0 chmod 644

# Show sizes of all found files
find /var/log -name "*.log" -print0 | xargs -0 du -sh

# Rename files (using -I for placeholder; null-delimited so odd names survive)
printf '%s\0' *.txt | xargs -0 -I {} mv {} {}.bak
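
One caveat for pipelines like these: GNU xargs runs the command once even when the input is empty, while some other implementations do not. The sketch below assumes GNU xargs and shows the -r (--no-run-if-empty) flag that suppresses the extra run.

```shell
dir=$(mktemp -d)   # empty directory, so find produces no output

# Without -r, GNU xargs still runs echo once with no file arguments.
without_r=$(find "$dir" -name "*.tmp" -print0 | xargs -0 echo "command ran" | wc -l)

# With -r, the command is skipped when there is nothing to process.
with_r=$(find "$dir" -name "*.tmp" -print0 | xargs -0 -r echo "command ran" | wc -l)

echo "runs without -r: $without_r"
echo "runs with -r:    $with_r"
rm -rf "$dir"
```

This matters for commands that do something when given no arguments, such as ls listing the current directory or wc waiting on stdin.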

Why Use xargs Instead of find -exec?#

  • xargs can run commands in parallel (-P)
  • xargs batches arguments by default (fewer command invocations than -exec ... \;)
  • xargs works with any input, not just find

For simple cases, find -exec ... {} + is equivalent and simpler.


Lab#

Exercise 1: find Basics#

# Find all .conf files in /etc
find /etc -name "*.conf" 2>/dev/null | head -10

# Find directories in your home
find ~ -maxdepth 2 -type d | head -20

# Find files modified in the last hour
find /tmp -mmin -60 -type f 2>/dev/null

# Find files larger than 10MB
find /var -size +10M -type f 2>/dev/null

Exercise 2: find with Actions#

# Create test files
mkdir -p /tmp/findlab
touch /tmp/findlab/{a,b,c}.txt /tmp/findlab/{x,y}.log

# Find .txt files and list them in detail
find /tmp/findlab -name "*.txt" -exec ls -l {} \;

# Find .log files and show their names
find /tmp/findlab -name "*.log" -exec basename {} \;

# Delete .log files
find /tmp/findlab -name "*.log" -delete
ls /tmp/findlab

# Clean up
rm -rf /tmp/findlab

Exercise 3: locate#

# Update the database
sudo updatedb

# Find all files named "bashrc"
locate bashrc

# Count all .service files (systemd units)
locate -c "*.service"

# Case-insensitive search
locate -i readme | head -10

Exercise 4: xargs#

# Create test files
mkdir -p /tmp/xargslab
for i in {1..5}; do echo "line $i" > /tmp/xargslab/file$i.txt; done

# Count lines in all files using xargs
find /tmp/xargslab -name "*.txt" | xargs wc -l

# Safe version with spaces
find /tmp/xargslab -name "*.txt" -print0 | xargs -0 wc -l

# Use xargs to create backup copies
find /tmp/xargslab -name "*.txt" | xargs -I {} cp {} {}.bak
ls /tmp/xargslab

# Clean up
rm -rf /tmp/xargslab

Review#

1. How do you find all .log files larger than 10MB?

Use find / -name "*.log" -size +10M, which combines -name for the pattern with -size +10M for the size filter. Both tests must match because AND is the default.

2. What is the difference between find and locate?

find walks the filesystem in real-time — always current but slower. locate searches a pre-built database — very fast but may be stale. locate can only search by name/path; find supports size, time, permissions, and more.

3. How do you handle filenames with spaces when using find and xargs?

Pipe find ... -print0 into xargs -0 so that names are separated by null bytes instead of whitespace: -print0 makes find output null-delimited filenames, and -0 tells xargs to read null-delimited input.

4. What does find's -exec do?

Runs a command on each matching file. {} is replaced by the filename, and the command ends with \; (one invocation per file) or + (batch multiple files into one invocation).

5. How do you run xargs in parallel?

Use the -P flag: xargs -P 4 runs up to 4 processes simultaneously. Combine with -n 1 to process one item per invocation.

