Claude Code for Non-Developers
Working with Data

Asking Questions About Your Data

How to move from exploring a data file to asking real business questions with Claude Code

From exploration to analysis

On the previous page, you learned how to open a data file and get a summary: how many rows, what columns, any missing data. That tells you what's in the file. Now you'll ask what the data means.

This is where Claude Code starts replacing the spreadsheet juggling — the pivot tables, the VLOOKUP formulas, the "sort by column G then filter column B" routine. You describe what you want to know in plain language, and Claude Code figures out how to answer it.

Asking your first real question

Say you still have that customer_orders.csv file from the previous page, with 2,847 orders across 12 columns. You want to know which products are selling the most.

Type this:

What are the top 10 products by total revenue in customer_orders.csv?

Claude Code responds with something like:

Top 10 Products by Total Revenue:

1. Enterprise License    $284,500
2. Professional Plan     $198,200
3. Starter Kit           $142,800
4. Add-on: Storage       $97,300
5. Training Package      $89,100
6. Consulting Hours      $76,400
7. Support Upgrade       $62,900
8. API Access            $58,200
9. Custom Integration    $45,600
10. Onboarding Service   $38,700

In a spreadsheet, that same answer takes several steps: group the rows by product name, add up the revenue for each product, sort the totals from highest to lowest, then keep the first ten. Claude Code did all of that behind the scenes.
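You never need to write or read code to use Claude Code, but if you're curious, here is a minimal Python sketch of the kind of steps involved in a "top products by revenue" question. It is an illustration, not what Claude Code actually runs. The sample rows and the column names product and total are assumptions made up for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows standing in for customer_orders.csv.
# The column names "product" and "total" are assumptions for this sketch.
SAMPLE_CSV = """order_id,product,total
1001,Enterprise License,12000.00
1002,Starter Kit,450.00
1003,Enterprise License,9500.00
1004,Professional Plan,3200.00
1005,Starter Kit,450.00
"""


def top_products_by_revenue(csv_file, n=10):
    """Return the n products with the highest summed revenue."""
    revenue = defaultdict(float)
    for row in csv.DictReader(csv_file):
        # Group by product name and add up the totals.
        revenue[row["product"]] += float(row["total"])
    # Sort from highest to lowest total, then keep the first n.
    ranked = sorted(revenue.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


ranking = top_products_by_revenue(io.StringIO(SAMPLE_CSV))
for rank, (product, total) in enumerate(ranking, start=1):
    print(f"{rank}. {product:<20} ${total:,.0f}")
```

The point is the shape of the work: group, add up, sort, trim. Your one-sentence question stands in for all four steps.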

Follow-up questions

Here's what makes this different from a one-off spreadsheet formula: Claude Code remembers the conversation. You can build on your last question without starting over.

After seeing the top products, you might want more detail:

Break that down by region. Which regions buy the most Enterprise Licenses?

Claude Code shows you a breakdown. Then you follow up:

How does that compare between Q1 and Q2?

And then:

Are there any regions where Enterprise License sales dropped more than 20% between quarters?

Each question builds on the last. You're drilling down into your data the way you'd talk through a problem with an analyst — except the analyst responds in seconds.

You don't need to re-explain your data each time. Claude Code already knows the file, the columns, and the questions you've already asked.
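For the curious, the last follow-up ("dropped more than 20% between quarters") boils down to a percent-change calculation per region. The sketch below shows one way that could look in Python. The sample rows, the column names (order_date, region, product, total), and the helper names are all assumptions for illustration, and Claude Code may approach it differently.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; the column names are assumptions for this sketch.
SAMPLE_CSV = """order_date,region,product,total
2025-01-15,West,Enterprise License,10000
2025-02-20,East,Enterprise License,8000
2025-03-05,West,Starter Kit,500
2025-04-10,West,Enterprise License,7000
2025-05-12,East,Enterprise License,8400
"""


def quarter_of(iso_date):
    """Map an ISO date string such as 2025-04-10 to a label such as Q2."""
    month = date.fromisoformat(iso_date).month
    return f"Q{(month - 1) // 3 + 1}"


def regions_with_drop(csv_file, product, threshold=0.20):
    """Return regions where Q1-to-Q2 revenue for a product fell by more than threshold."""
    revenue = defaultdict(lambda: defaultdict(float))
    for row in csv.DictReader(csv_file):
        if row["product"] == product:
            quarter = quarter_of(row["order_date"])
            revenue[row["region"]][quarter] += float(row["total"])

    drops = {}
    for region, by_quarter in revenue.items():
        q1, q2 = by_quarter["Q1"], by_quarter["Q2"]
        if q1 > 0:  # Skip regions with no Q1 sales to avoid dividing by zero.
            change = (q2 - q1) / q1
            if change < -threshold:
                drops[region] = change
    return drops


drops = regions_with_drop(io.StringIO(SAMPLE_CSV), "Enterprise License")
for region, change in drops.items():
    print(f"{region}: {change:.0%}")
```

Notice how many small decisions hide inside one plain-language question: which product, which quarters, and what counts as a drop. That is why stating them explicitly in your prompt matters.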

Common types of questions

Most business data questions fall into a few categories. Here are examples of each, so you can see the kind of language that works.

Ranking and top/bottom lists:

What are the 5 highest-value orders this quarter?
Which customer has the most orders?
What's the lowest-performing region by revenue?

Filtering to a specific slice:

Show me all orders from January 2025 with a total over $500.
Which orders are still in Pending status?
List all customers in the West region who paid by credit card.

Grouping and totals:

What's the total revenue by region?
How many orders came in each month?
What's the average order value by payment method?

Comparing time periods:

How do sales this quarter compare to last quarter?
Show me monthly revenue for the past 12 months.
Which month had the highest number of new customers?

Spotting patterns:

Are there any days of the week when we get more orders?
Does the average order size differ from region to region?
Which products are often purchased together?

You don't need to memorize these. They're a starting point. The real skill is describing what you want to know, in your own words, about your own data.
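If you want a feel for what a "spotting patterns" question involves, here is a rough Python sketch of counting orders by day of the week. It is only an illustration under assumed names: the sample rows and the order_date column are made up, and you would normally just ask Claude Code the question in plain language.

```python
import csv
import io
from collections import Counter
from datetime import date

# Hypothetical sample rows; "order_date" is an assumed column name.
SAMPLE_CSV = """order_id,order_date
1,2025-01-06
2,2025-01-06
3,2025-01-07
4,2025-01-13
5,2025-01-17
"""

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def orders_by_weekday(csv_file):
    """Count how many orders fall on each day of the week."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        weekday_index = date.fromisoformat(row["order_date"]).weekday()
        counts[WEEKDAYS[weekday_index]] += 1
    return counts


counts = orders_by_weekday(io.StringIO(SAMPLE_CSV))
for day, number_of_orders in counts.most_common():
    print(f"{day}: {number_of_orders}")
```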

Describing what you want clearly

Claude Code understands natural language, but vague questions get vague answers.

Compare these two prompts:

Vague: "Tell me about the sales data."

Specific: "What's the total revenue by region for Q2, and which region had the biggest increase compared to Q1?"

The vague version gets you a general summary. The specific version gets you the exact numbers you need for your meeting tomorrow.

What makes a data question effective usually comes down to a few things.

Name the metric you care about. "Revenue" is clearer than "sales." "Average order value" is clearer than "how orders look." If your file has a column called total, tell Claude Code that's the revenue column instead of making it guess.

Set the scope. "In Q2" is clearer than "recently." "The West region" is clearer than "out west." Use the same values that appear in your data: if the column says "West" not "Western," use "West."

State what you're comparing. "Compared to Q1" tells Claude Code you want two numbers side by side. "Over time" tells it you want a trend. Without a comparison, you often get a single number with no context.

When Claude Code gets it wrong

Claude Code doesn't always interpret your data correctly. This is worth slowing down for, because wrong answers from Claude Code can look convincing.

The biggest problem is column misinterpretation. Claude Code guesses what each column means based on the column name and the data inside it. Usually it guesses right. Sometimes it doesn't.

A column called status might contain order statuses (Shipped, Pending, Cancelled) or it might contain employee statuses (Active, On Leave, Terminated). A column called value might be revenue or it might be a customer satisfaction score. If Claude Code guesses wrong, every number it gives you will be wrong too, and the numbers will still look reasonable. That's the tricky part.

The easiest way to catch it is to verify one number you already know. If you know last month's total revenue was roughly $150,000, and Claude Code tells you it was $1.5 million, something is off. Check whether it used the right column, the right time range, and the right calculation.

You can also ask Claude Code to show its work:

Show me the calculation you used to get that revenue number. Which column did you sum, and did you apply any filters?

If it summed the wrong column or misunderstood a filter, you'll see it right away.
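To make the "wrong column" problem concrete, here is a small Python sketch of a spot-check: it sums several numeric columns for one month and compares each against a figure you already know. The sample rows, the column names, the 10% tolerance, and the known revenue figure are all hypothetical, chosen only to show how different the numbers look when the wrong column is summed.

```python
import csv
import io

# Hypothetical sample rows; all column names are assumptions for this sketch.
SAMPLE_CSV = """order_date,total,quantity,satisfaction_score
2025-01-05,1200.00,3,9
2025-01-18,800.00,1,7
2025-02-02,950.00,2,8
"""


def column_sums(csv_file, columns, month_prefix):
    """Sum each named column over rows whose date starts with month_prefix."""
    sums = {name: 0.0 for name in columns}
    for row in csv.DictReader(csv_file):
        if row["order_date"].startswith(month_prefix):
            for name in columns:
                sums[name] += float(row[name])
    return sums


def is_close_to_known(value, known, tolerance=0.10):
    """True if value is within the given fraction of the known figure."""
    return abs(value - known) <= tolerance * abs(known)


known_revenue = 2000.0  # A figure you already trust (made up for this example).
sums = column_sums(
    io.StringIO(SAMPLE_CSV),
    ["total", "quantity", "satisfaction_score"],
    "2025-01",
)
for name, value in sums.items():
    verdict = "plausible" if is_close_to_known(value, known_revenue) else "does not match"
    print(f"{name}: {value:,.2f} ({verdict})")
```

Only one column lands near the known figure here. In real data the gap is not always this obvious, which is why asking which column was summed is worth the extra prompt.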

The best way to prevent this altogether is to tell Claude Code what your columns mean before you start asking questions. At the beginning of a data analysis session, type something like:

Before we start, here's what the columns mean in customer_orders.csv:
- order_id: unique ID for each order
- customer_name: the buyer's name
- total: the dollar amount of the order (this is revenue)
- region: sales territory (West, East, Central, South)
- status: where the order is in fulfillment (Shipped, Pending, Cancelled)

You don't need to describe every column. Just the ones that could be misread. This takes 30 seconds and saves you from getting wrong answers you might not catch.

Heads up: When Claude Code gives you a number, ask yourself: "Does this seem right based on what I already know?" If you have no idea whether the answer is reasonable, that's a sign to spot-check against a known value before trusting the result.

Running the same question twice

Here's a verification trick worth knowing. If you get a number that matters — one you're going to put in a report or share with your team — run the same question a second time.

Start a new conversation (type /clear to wipe the current one) and ask the same question again. If you get the same answer both times, that's a good sign the result is stable. It isn't proof, though: a misread column can produce the same wrong number twice, so this check works best alongside a spot-check against a value you already know. If the numbers differ, something is unstable in how Claude Code is interpreting the data. That's your signal to provide more context: a column description, a specific date range, or a clearer question.

You don't need to double-check every answer. Save this for numbers that will be seen by others or used to make decisions.

Useful data query prompts

Here are prompts you can adapt for your own data files. Replace the bracketed words with your own column names and values.

What you want       What to type
Top performers      "What are the top 10 [items] by [metric]?"
Time comparison     "Compare [metric] between [period 1] and [period 2]"
Group summary       "What's the total [metric] grouped by [category]?"
Filter and count    "How many [items] match [condition]?"
Trend               "Show me [metric] by month for the past year"
Outliers            "Are there any unusually high or low values in [column]?"
Distribution        "What's the breakdown of [category column]? How many of each?"
Verify              "Show me the calculation you used to get that number"

Next, you'll deal with the reality that most data files aren't clean when you first get them — inconsistent names, missing values, duplicate rows — and how to fix all of that before you start analyzing.
