use-cases/clean-messy-data.md +73 −0 added
1---
2name: Clean and prepare messy data
3tagline: Process tabular data without affecting the original.
4summary: Drag in or mention a messy CSV or spreadsheet, describe the problems
5 you see, and ask Codex to write a cleaned copy while keeping the original file
6 unchanged.
7skills:
8 - token: $spreadsheet
9 description: Inspect tabular files, clean columns, and produce reviewable outputs.
10bestFor:
11 - CSV or spreadsheet exports with mixed dates, currencies, duplicates, summary
12 rows, or missing values.
13 - Teams who work with data from multiple sources.
14starterPrompt:
15 title: Clean a Copy
16 body: >-
17 Clean @marketplace-risk-rollout-export.csv.
18
19
20 What's wrong:
21
22 - dates are mixed between MM/DD/YYYY and YYYY-MM-DD
23
24 - currency values include $, commas, and blank cells
25
26 - a few duplicate customer rows came from repeated exports
27
28 - region and category names use several aliases
29
30 - there are pasted summary rows mixed into the data
31
32
33 What I want:
34
35 - write a cleaned CSV
36
37 - keep the original file unchanged
38
39 - use one date format
40
41 - keep blank currency cells blank
42
43 - preserve source row IDs when possible
44
45 - add a short data-quality note with rows you changed, removed, or could not
46 clean confidently
47 suggestedEffort: low
48relatedLinks:
49 - label: Analyze data with Codex
50 url: /codex/use-cases/analyze-data-export
51 - label: File inputs
52 url: /api/docs/guides/file-inputs
53 - label: Agent skills
54 url: /codex/skills
55---
56
57## Introduction
58
59Codex is great at systematically cleaning tabular data.
60When a CSV or spreadsheet has mixed dates, duplicate rows, currency strings, blank cells, aliases, or pasted summary rows, ask Codex to clean a copy and leave the original file unchanged.
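
To make the kinds of fixes above concrete, here is a minimal Python sketch of two cell-level normalizers, one for mixed dates and one for currency strings. It is only an illustration, not the code Codex will write: the function names are hypothetical, the two date formats are the ones named in the starter prompt, and values that cannot be parsed are returned untouched so they can be listed in a data-quality note.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed input formats, taken from the starter prompt's example problems.
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def normalize_date(value: str) -> str:
    """Return the date as YYYY-MM-DD, or the original text if no format matches."""
    text = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return value  # unparseable: leave as-is and flag it in the data-quality note


def normalize_currency(value: str) -> str:
    """Strip $ and thousands separators; keep blank cells blank."""
    text = value.strip()
    if not text:
        return ""
    cleaned = text.replace("$", "").replace(",", "")
    try:
        Decimal(cleaned)  # validate only; the digits themselves are not changed
    except InvalidOperation:
        return value  # not a number: leave as-is and flag it
    return cleaned
```

For example, `normalize_date("03/07/2024")` returns `"2024-03-07"` and `normalize_currency("$1,234.50")` returns `"1234.50"`, while a blank currency cell stays blank.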
61
62## How to use
63
64
65
661. Drag the file into Codex or mention it in your prompt, such as `@customer-export.csv`.
672. Describe the problems you already see.
683. Tell Codex what the cleaned version should be: CSV, spreadsheet tab, or upload-ready file.
694. Review the cleaned copy before using it.
70
71
72
73Use the starter prompt on this page for the first cleaning pass. Replace the file name and bullets with your own. The details that matter most are the problems you already see and the output you need next: a cleaned CSV, a clean spreadsheet tab, or an upload-ready file. After Codex writes the clean copy, open the cleaned file and the data-quality note from the thread before using the data downstream.
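
If you want a quick check of your own during review, the following Python sketch compares the cleaned copy against the original. It is a hedged example under stated assumptions: both files are UTF-8 CSVs, and both keep a source row ID column, here given the hypothetical name `row_id`. The helper names are also illustrative.

```python
import csv
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file so you can confirm the original was not modified."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def read_rows(path: Path) -> list[dict[str, str]]:
    """Load a CSV file as a list of row dictionaries keyed by header."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def review(original: Path, cleaned: Path, id_column: str = "row_id") -> dict:
    """Summarize row counts and which source row IDs are missing from the cleaned copy."""
    original_rows = read_rows(original)
    cleaned_rows = read_rows(cleaned)
    cleaned_ids = {row[id_column] for row in cleaned_rows}
    return {
        "original_rows": len(original_rows),
        "cleaned_rows": len(cleaned_rows),
        # IDs present in the original but absent from the cleaned copy
        "removed_ids": [
            row[id_column] for row in original_rows if row[id_column] not in cleaned_ids
        ],
    }
```

Recording `file_sha256` for the original before the cleaning pass and comparing it afterward confirms the original file is unchanged, and the `removed_ids` list can be compared against the rows the data-quality note says were removed.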