Claude Code for Non-Developers
Working with Data

Your Data Analysis Toolkit

A summary of the data skills from this module, with a prompt quick reference, project folder setup, and guidance on when to use Claude Code versus other tools.

What you've covered

This module took you from opening a data file to building reports and catching errors. You explored CSV, Excel, and JSON files. You asked business questions and drilled into the answers. You cleaned messy data, combined files from different sources, created charts, and generated summary reports. And you learned the three ways Claude Code gets data analysis wrong — column misinterpretation, silent sampling, and hallucinated statistics.

That's a lot of ground. Before moving on, let's organize it into something you can grab when you're sitting down with a real data file and need to get something done.

The data analysis workflow

Every data task follows roughly the same path:

  1. Open the file. Ask what's in it — how many rows, what columns, any obvious problems.
  2. Fix what's broken. Standardize values, remove duplicates, handle missing data. Always work on a copy.
  3. Ask what the data tells you. Start broad, then drill down.
  4. Reshape the data. Combine files, add calculated columns, and restructure the data to match the question you're asking.
  5. Create charts or summary reports. Iterate until it communicates clearly.
  6. Check the numbers. Compare against something you know. Ask Claude Code to show its work.

Not every task uses every step. Sometimes you jump straight from step 1 to step 3. Sometimes you clean and reshape before asking anything. Treat it as a guide, not a checklist.
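If you're curious what step 1 looks like behind the scenes, here is a minimal sketch of the kind of script Claude Code might write when you ask what's in a file. It is an illustration, not the exact code Claude Code produces. The sample rows and the function name `summarize_csv` are made up for this example, and only Python's built-in `csv` module is used.

```python
import csv
import io


def summarize_csv(text):
    """Return the row count, column names, and missing-value count per column."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            value = row.get(name)
            # Treat absent fields and blank strings as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {"rows": row_count, "columns": columns, "missing": missing}


# Hypothetical sample data, invented for illustration only.
SAMPLE = """order_id,region,total
1001,West,250.00
1002,,99.50
1003,East,
"""

summary = summarize_csv(SAMPLE)
print(summary)
```

The script reads every row rather than a sample, which is the behavior you want to confirm whenever Claude Code reports a row count.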

Prompt quick reference

These are the most useful data prompts from this module, all in one place. You don't need to type them word for word — adapt them to your own data.

| What you want to do | What to type |
| --- | --- |
| Get an overview of a file | "Summarize this file. How many rows and columns? What are the column names? Are there missing values?" |
| Ask a business question | "Show me the top 10 [items] by [metric]. Break it down by [group]" |
| Compare time periods | "Compare [metric] between [period 1] and [period 2]. Show the change as a percentage" |
| Find problems in the data | "Run a data quality check. Look for missing values, duplicates, and inconsistent formats" |
| Clean inconsistent values | "Standardize the [column] to use these values: [list]. Show me unique values first" |
| Remove duplicates | "Find and remove duplicate rows based on [columns]. Show me the duplicates before removing them" |
| Combine multiple files | "Combine all CSV files in this folder into one file. Stack the rows; they have the same columns" |
| Merge on a shared column | "Merge [file 1] and [file 2] on the [column] column. Show me any rows that didn't match" |
| Add a calculated column | "Add a column called [name] that calculates [formula]. Show me the first 10 rows to verify" |
| Create a chart | "Create a bar chart showing [metric] by [category]. Save it as a PNG" |
| Build a full report | "Write a summary report of [topic]. Include key metrics, trends, and a chart. Save it as an HTML file" |
| Verify a number | "Show me the exact calculation for [number]. Which rows and columns did you use?" |

Bookmark this page or paste these into a note. After a few sessions, you'll stop reaching for the templates — you'll have your own phrasing.
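To see why the "Merge on a shared column" prompt asks for the rows that didn't match, here is a rough sketch of a merge written out in plain Python. The function name `merge_on_key`, the `customer_id` column, and the sample records are all hypothetical, and the sketch assumes each key appears only once in the second file.

```python
def merge_on_key(left_rows, right_rows, key):
    """Join two lists of dicts on `key`; also return left rows with no match."""
    # Assumes each key value appears at most once in right_rows.
    right_by_key = {row[key]: row for row in right_rows}
    merged, unmatched = [], []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is None:
            unmatched.append(row)  # Keep these so you can inspect them.
        else:
            merged.append({**row, **match})
    return merged, unmatched


# Hypothetical records, invented for illustration only.
orders = [
    {"order_id": "1001", "customer_id": "C1"},
    {"order_id": "1002", "customer_id": "C9"},
]
customers = [{"customer_id": "C1", "customer_name": "Ada Lopez"}]

merged, unmatched = merge_on_key(orders, customers, "customer_id")
print("Merged:", merged)
print("Did not match:", unmatched)
```

Without the unmatched list, order 1002 would quietly disappear from the result. Asking to see those rows is how you catch that.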

Setting up a data project folder

When you're doing more than a one-off question, a dedicated project folder saves time:

my-data-project/
├── CLAUDE.md          # Tells Claude Code about your data
├── data/              # Your input files (original copies)
├── output/            # Charts, reports, and cleaned files go here
└── working/           # Copies Claude Code works on

You don't have to use this exact layout. The point is keeping your original files separate from what Claude Code produces, so you always have an untouched copy.
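You can create these folders by hand or ask Claude Code to do it. As a sketch of what that might involve, the Python below builds the layout and copies the originals into working/ so the files in data/ stay untouched. The function names are hypothetical, and the demo runs in a temporary directory so it changes nothing on your computer.

```python
import shutil
import tempfile
from pathlib import Path


def set_up_project(root):
    """Create data/, output/, and working/ under root, plus a starter CLAUDE.md."""
    root = Path(root)
    for name in ("data", "output", "working"):
        (root / name).mkdir(parents=True, exist_ok=True)
    claude_md = root / "CLAUDE.md"
    if not claude_md.exists():
        claude_md.write_text("# Project notes for Claude Code\n", encoding="utf-8")
    return root


def copy_originals_to_working(root):
    """Copy every file in data/ into working/, leaving the originals as they are."""
    root = Path(root)
    copied = []
    for source in sorted((root / "data").iterdir()):
        if source.is_file():
            shutil.copy2(source, root / "working" / source.name)
            copied.append(source.name)
    return copied


# Demo in a temporary directory with a made-up input file.
with tempfile.TemporaryDirectory() as tmp:
    project = set_up_project(Path(tmp) / "my-data-project")
    sample = project / "data" / "customer_orders.csv"
    sample.write_text("order_id,total\n1001,250.00\n", encoding="utf-8")
    copied = copy_originals_to_working(project)
    layout = sorted(p.name for p in project.iterdir())
    print(layout, copied)
```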

The CLAUDE.md is the part that matters most. Even a short one makes a real difference:

# Sales data project

## Files
- data/customer_orders.csv: All orders from 2024-2025, about 2,800 rows

## Column definitions
- order_id: Unique order number
- customer_name: Buyer's full name
- order_date: Date placed (YYYY-MM-DD)
- total: Dollar amount — this is revenue
- region: Sales territory (West, East, Central, South)
- status: Shipped, Pending, or Cancelled

## Rules
- Analyze all rows. Do not sample.
- Exclude Cancelled orders when calculating revenue.
- Save all output files to the output/ folder.

Claude Code reads this file at the start of every session. Write it once, and every session starts with Claude Code already knowing what your columns mean, what format your dates use, and where to put the results.

If you're working with the same data regularly — weekly reports, monthly analyses — this setup turns 20 minutes into two, because you skip the explanation every time.
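To make the Rules section concrete, here is a minimal sketch of a revenue calculation that follows them: it reads every row and skips Cancelled orders. The column names come from the example CLAUDE.md above. The sample rows and the function name `revenue_by_region` are invented for illustration, and the code Claude Code actually writes will likely look different.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def revenue_by_region(csv_text):
    """Sum `total` per region over all rows, excluding Cancelled orders."""
    totals = defaultdict(Decimal)
    rows_read = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows_read += 1  # Count every row to confirm nothing was sampled.
        if row["status"].strip().lower() == "cancelled":
            continue  # Rule: Cancelled orders are not revenue.
        totals[row["region"]] += Decimal(row["total"])
    return dict(totals), rows_read


# Hypothetical sample data, invented for illustration only.
SAMPLE = """order_id,customer_name,order_date,total,region,status
1001,Ada Lopez,2024-03-01,250.00,West,Shipped
1002,Ben Cho,2024-03-02,100.00,East,Cancelled
1003,Cy Diaz,2024-03-05,75.50,West,Pending
1004,Di Evans,2025-01-10,40.00,East,Shipped
"""

totals, rows_read = revenue_by_region(SAMPLE)
print(f"Rows read: {rows_read}")
for region, amount in sorted(totals.items()):
    print(f"{region}: {amount}")
```

Printing the number of rows read gives you something to compare against the row count you already know, which is the verification habit from step 6 of the workflow.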

When to use Claude Code vs. other tools

Claude Code is good at data work, but it's not always the right pick.

Claude Code works best when:

  • Your questions keep changing. "Show me this by region. Now by quarter. Now add a trend line." In a spreadsheet, each pivot means rebuilding. Claude Code just adapts.
  • You're combining data from multiple files. Merging and cross-referencing across sources is where it saves the most time.
  • Your data is messy. Standardizing thousands of inconsistent values in a spreadsheet is tedious. Claude Code does it in one prompt.
  • You want a chart or report quickly. Describe what you want, get a file you can open.
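  • Your file is large. Spreadsheets tend to slow down as files grow into the hundreds of thousands of rows, and Excel caps a single worksheet at 1,048,576 rows. Claude Code processes the file with scripts, so it can usually handle millions of rows.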

A spreadsheet is better when:

  • The task is quick. A sum, an average, a basic filter — no setup overhead needed.
  • You need to edit individual cells by hand. Spreadsheets are built for that. Claude Code is not.
  • Your team collaborates in Google Sheets. Introducing a different tool adds friction.
  • The deliverable is a spreadsheet. If someone expects an .xlsx file they can open and modify, just give them one.

A BI tool (Tableau, Power BI, Looker) is better when:

  • Multiple people need the same live data, updated automatically.
  • You need a recurring dashboard that refreshes on its own.
  • Your organization already has one set up with clean data and agreed-on definitions.

Ask a data analyst when:

  • The stakes are high. If real money or strategy depends on the numbers, get a human to validate.
  • The mess requires business context. Missing values that need judgment calls, duplicates that need institutional knowledge, definitions that vary by department.
  • You need to turn numbers into a recommendation. That's a human skill.

The honest version of this guide: use Claude Code for the grunt work — the cleaning, the formatting, the initial exploration — and spend your time on the part that matters, which is deciding what the numbers mean and what to do about them.

The 80/20 of data work

Data analysts joke that 80% of their time goes to preparing data and 20% to actually analyzing it. The exact split varies, but the imbalance is real. Cleaning, formatting, converting, deduplicating: that's hours of work before you get to the interesting part.

Claude Code flips that ratio. The tedious 80% — predictable, repeatable data manipulation — is exactly what it handles best. That leaves the 20% that matters: asking the right questions and deciding what the numbers mean.

You don't need to become a data expert. You need to know what questions to ask and how to check the answers. This module gave you both.

What's ahead

Module 4 is about automating repetitive tasks — taking something you do every week and turning it into something Claude Code handles in seconds. You've already seen a preview: the reusable report script from page 6. Next, you'll go deeper into spotting automation opportunities, building reusable scripts, and running bulk operations across hundreds of files.

The data skills from this module carry straight into that work. Weekly reports, regular cleanups, recurring file processing — same conversation skills, same verification habits, just applied to work that repeats.
