name: Clean and prepare messy data
tagline: Process tabular data without affecting the original.
summary: Drag in or mention a messy CSV or spreadsheet, describe the problems
  you see, and ask Codex to write a cleaned copy while keeping the original file
  unchanged.
skills:
  - token: $spreadsheet
    description: Inspect tabular files, clean columns, and produce reviewable outputs.
bestFor:
  - CSV or spreadsheet exports with mixed dates, currencies, duplicates, summary
    rows, or missing values.
  - Teams who work with data from multiple sources.
starterPrompt:
  title: Clean a Copy
  body: >-
    Clean @marketplace-risk-rollout-export.csv.


    What's wrong:

    - dates are mixed between MM/DD/YYYY and YYYY-MM-DD

    - currency values include $, commas, and blank cells

    - a few duplicate customer rows came from repeated exports

    - region and category names use several aliases

    - there are pasted summary rows mixed into the data


    What I want:

    - write a cleaned CSV

    - keep the original file unchanged

    - use one date format

    - keep blank currency cells blank

    - preserve source row IDs when possible

    - add a short data-quality note with rows you changed, removed, or could not
    clean confidently
  suggestedEffort: low
relatedLinks:
  - label: Analyze data with Codex
    url: /codex/use-cases/analyze-data-export
  - label: File inputs
    url: /api/docs/guides/file-inputs
  - label: Agent skills
    url: /codex/skills

# Clean and prepare messy data

Process tabular data without affecting the original.

Difficulty **Easy**

Time horizon **5m**

Drag in or mention a messy CSV or spreadsheet, describe the problems you see, and ask Codex to write a cleaned copy while keeping the original file unchanged.

## Related links

- [Analyze data with Codex](https://developers.openai.com/codex/use-cases/analyze-data-export)
- [File inputs](https://developers.openai.com/api/docs/guides/file-inputs)
- [Agent skills](https://developers.openai.com/codex/skills)

## Best for

- CSV or spreadsheet exports with mixed dates, currencies, duplicates, summary rows, or missing values.
- Teams who work with data from multiple sources.

## Skills & Plugins

| Skill | Why use it |
| --- | --- |
| Spreadsheet | Inspect tabular files, clean columns, and produce reviewable outputs. |

## Starter prompt

Clean @marketplace-risk-rollout-export.csv.

What's wrong:

- dates are mixed between MM/DD/YYYY and YYYY-MM-DD
- currency values include $, commas, and blank cells
- a few duplicate customer rows came from repeated exports
- region and category names use several aliases
- there are pasted summary rows mixed into the data

What I want:

- write a cleaned CSV
- keep the original file unchanged
- use one date format
- keep blank currency cells blank
- preserve source row IDs when possible
- add a short data-quality note with rows you changed, removed, or could not clean confidently

[Open in the Codex app](codex://new?prompt=Clean+%40marketplace-risk-rollout-export.csv.%0A%0AWhat%27s+wrong%3A%0A-+dates+are+mixed+between+MM%2FDD%2FYYYY+and+YYYY-MM-DD%0A-+currency+values+include+%24%2C+commas%2C+and+blank+cells%0A-+a+few+duplicate+customer+rows+came+from+repeated+exports%0A-+region+and+category+names+use+several+aliases%0A-+there+are+pasted+summary+rows+mixed+into+the+data%0A%0AWhat+I+want%3A%0A-+write+a+cleaned+CSV%0A-+keep+the+original+file+unchanged%0A-+use+one+date+format%0A-+keep+blank+currency+cells+blank%0A-+preserve+source+row+IDs+when+possible%0A-+add+a+short+data-quality+note+with+rows+you+changed%2C+removed%2C+or+could+not+clean+confidently "Open in the Codex app")
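
To make the date and currency requests in the prompt concrete, here is a minimal Python sketch of the kind of cell-level normalization a cleaning pass might apply. It is not what Codex actually generates. The column names `row_id`, `order_date`, and `amount`, the sample rows, and the choice of YYYY-MM-DD as the single output format are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal, InvalidOperation

# The two input formats named in the prompt: MM/DD/YYYY and YYYY-MM-DD.
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def normalize_date(value: str) -> str:
    """Return the date as YYYY-MM-DD, or the original text if it cannot be parsed."""
    text = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return value  # unparsed values belong in the data-quality note


def normalize_currency(value: str) -> str:
    """Strip $ and commas; keep blank cells blank."""
    text = value.strip()
    if not text:
        return ""
    try:
        return str(Decimal(text.replace("$", "").replace(",", "")))
    except InvalidOperation:
        return value  # leave anything unexpected untouched


def clean_cells(source, date_col="order_date", amount_col="amount"):
    """Yield a cleaned copy of each row; the source rows are not modified."""
    for row in csv.DictReader(source):
        cleaned = dict(row)
        cleaned[date_col] = normalize_date(row[date_col])
        cleaned[amount_col] = normalize_currency(row[amount_col])
        yield cleaned


# Hypothetical sample data standing in for a messy export.
sample = io.StringIO(
    'row_id,order_date,amount\n'
    '1,03/07/2024,"$1,250.00"\n'
    '2,2024-03-08,\n'
)
cleaned = list(clean_cells(sample))
```

Values that match neither date format, or that are not numeric after stripping `$` and commas, are returned as they were, which matches the prompt's request to report rows that could not be cleaned confidently rather than guess.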

## Introduction

Codex is well suited to cleaning tabular data systematically.
When a CSV or spreadsheet has mixed dates, duplicate rows, currency strings, blank cells, aliases, or pasted summary rows, ask Codex to clean a copy and leave the original file unchanged.
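
As an illustration of the row-level problems listed above (aliases, duplicate rows, and pasted summary rows), the following Python sketch shows one way such a pass could work. The alias table, the column names, and the rule that a summary row has a label such as "Total" in place of a row ID are assumptions for this example, not behavior documented for Codex.

```python
# Hypothetical alias table; a real one would come from inspecting the file.
REGION_ALIASES = {
    "emea": "EMEA",
    "europe": "EMEA",
    "na": "North America",
    "north america": "North America",
}


def canonical_region(value: str) -> str:
    """Map a known alias to one spelling; unknown values pass through trimmed."""
    text = value.strip()
    return REGION_ALIASES.get(text.lower(), text)


def is_summary_row(row: dict) -> bool:
    """Assumption: pasted summary rows carry a label such as 'Total' instead of an ID."""
    label = row["row_id"].strip().lower()
    return label.startswith(("total", "subtotal"))


def drop_noise_rows(rows):
    """Return (kept_rows, notes) without mutating the input rows."""
    kept, notes, seen = [], [], set()
    for row in rows:
        if is_summary_row(row):
            notes.append(("removed summary row", row["row_id"]))
            continue
        cleaned = {**row, "region": canonical_region(row["region"])}
        # Compare every column except the row ID, so repeated exports match.
        key = tuple(value for name, value in cleaned.items() if name != "row_id")
        if key in seen:
            notes.append(("removed duplicate", row["row_id"]))
            continue
        seen.add(key)
        kept.append(cleaned)
    return kept, notes


# Hypothetical sample rows.
rows = [
    {"row_id": "1", "customer": "Acme", "region": "emea"},
    {"row_id": "2", "customer": "Acme", "region": "Europe"},
    {"row_id": "Total", "customer": "", "region": ""},
    {"row_id": "3", "customer": "Birch", "region": "NA"},
]
kept, notes = drop_noise_rows(rows)
```

The `notes` list records each removed row with its source row ID, which is the raw material for a short data-quality note.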

[Watch the video](https://cdn.openai.com/codex/docs/developers-website/use-cases/data-analysis-cleaning-csv.mp4)

## How to use

1. Drag the file into Codex or mention it in your prompt, such as `@customer-export.csv`.
2. Describe the problems you already see.
3. Tell Codex what the cleaned version should be: CSV, spreadsheet tab, or upload-ready file.
4. Review the cleaned copy before using it.

Use the starter prompt on this page for the first cleaning pass. Replace the file name and bullets with your own. The useful details are the problems you already see and the file you need next: a cleaned CSV, a clean spreadsheet tab, or an upload-ready file. After Codex writes the clean copy, open the cleaned file and the data-quality note from the thread before using the data downstream.
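
Part of the review step can be scripted. The sketch below is one possible check, under stated assumptions: it hashes the original before and after the cleaning pass to confirm the file is unchanged, then compares row counts to see how many rows were dropped. The output file name `customer-export.cleaned.csv` and the sample contents are hypothetical, and the two `write_text` calls stand in for the real export and the real cleaning pass.

```python
import csv
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash the file's bytes so any change to the original is detectable."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def data_row_count(path: Path) -> int:
    """Count data rows, excluding the header line."""
    with path.open(newline="") as handle:
        return sum(1 for _ in csv.DictReader(handle))


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "customer-export.csv"
    cleaned = Path(tmp) / "customer-export.cleaned.csv"  # hypothetical output name

    # Stand-in for the messy export: one duplicate row and one pasted summary row.
    original.write_text("row_id,amount\n1,$5\n1,$5\nTotal,$10\n")
    before = file_sha256(original)

    # Stand-in for the cleaning pass, which writes a separate file.
    cleaned.write_text("row_id,amount\n1,5\n")

    unchanged = file_sha256(original) == before
    removed = data_row_count(original) - data_row_count(cleaned)
```

If `removed` does not match the number of rows listed in the data-quality note, that is a signal to look more closely before using the cleaned file downstream.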

## Related use cases

### [Query tabular data](https://developers.openai.com/codex/use-cases/analyze-data-export)

Use Codex with a CSV, spreadsheet, dashboard export, Google Sheet, or local data file to...

Data, Knowledge Work

### [Turn feedback into actions](https://developers.openai.com/codex/use-cases/feedback-synthesis)

Connect Codex to multiple data sources such as Slack, GitHub, Linear, or Google Drive to...

Data, Integrations

### [Coordinate new-hire onboarding](https://developers.openai.com/codex/use-cases/new-hire-onboarding)

Use Codex to gather approved new-hire context, stage tracker updates, draft team-by-team...

Integrations, Data