use-cases/clean-messy-data.md +41 −97
---
name: Clean and prepare messy data
tagline: Process tabular data without affecting the original.
summary: Drag in or mention a messy CSV or spreadsheet, describe the problems
  you see, and ask Codex to write a cleaned copy while keeping the original file
  unchanged.
skills:
  - token: $spreadsheet
    description: Inspect tabular files, clean columns, and produce reviewable outputs.
bestFor:
  - CSV or spreadsheet exports with mixed dates, currencies, duplicates, summary
    rows, or missing values.
  - Teams who work with data from multiple sources.
starterPrompt:
  title: Clean a Copy
  body: >-
    Clean @marketplace-risk-rollout-export.csv.


    What's wrong:

    - dates are mixed between MM/DD/YYYY and YYYY-MM-DD

    - currency values include $, commas, and blank cells

    - a few duplicate customer rows came from repeated exports

    - region and category names use several aliases

    - there are pasted summary rows mixed into the data


    What I want:

    - write a cleaned CSV

    - keep the original file unchanged

    - use one date format

    - keep blank currency cells blank

    - preserve source row IDs when possible

    - add a short data-quality note with rows you changed, removed, or could not
    clean confidently
  suggestedEffort: low
relatedLinks:
  - label: Analyze data with Codex
    url: /codex/use-cases/analyze-data-export
  - label: File inputs
    url: /api/docs/guides/file-inputs
  - label: Agent skills
    url: /codex/skills
---

# Clean and prepare messy data

Process tabular data without affecting the original.

Difficulty **Easy**

Time horizon **5m**

Drag in or mention a messy CSV or spreadsheet, describe the problems you see, and ask Codex to write a cleaned copy while keeping the original file unchanged.

Related links

- [Analyze data with Codex](https://developers.openai.com/codex/use-cases/analyze-data-export)
- [File inputs](https://developers.openai.com/api/docs/guides/file-inputs)
- [Agent skills](https://developers.openai.com/codex/skills)

## Best for

- CSV or spreadsheet exports with mixed dates, currencies, duplicates, summary rows, or missing values.
- Teams who work with data from multiple sources.

## Skills & Plugins

| Skill | Why use it |
| --- | --- |
| Spreadsheet | Inspect tabular files, clean columns, and produce reviewable outputs. |

## Starter prompt

Clean @marketplace-risk-rollout-export.csv.

What's wrong:

- dates are mixed between MM/DD/YYYY and YYYY-MM-DD
- currency values include $, commas, and blank cells
- a few duplicate customer rows came from repeated exports
- region and category names use several aliases
- there are pasted summary rows mixed into the data

What I want:

- write a cleaned CSV
- keep the original file unchanged
- use one date format
- keep blank currency cells blank
- preserve source row IDs when possible
- add a short data-quality note with rows you changed, removed, or could not clean confidently

[Open in the Codex app](codex://new?prompt=Clean+%40marketplace-risk-rollout-export.csv.%0A%0AWhat%27s+wrong%3A%0A-+dates+are+mixed+between+MM%2FDD%2FYYYY+and+YYYY-MM-DD%0A-+currency+values+include+%24%2C+commas%2C+and+blank+cells%0A-+a+few+duplicate+customer+rows+came+from+repeated+exports%0A-+region+and+category+names+use+several+aliases%0A-+there+are+pasted+summary+rows+mixed+into+the+data%0A%0AWhat+I+want%3A%0A-+write+a+cleaned+CSV%0A-+keep+the+original+file+unchanged%0A-+use+one+date+format%0A-+keep+blank+currency+cells+blank%0A-+preserve+source+row+IDs+when+possible%0A-+add+a+short+data-quality+note+with+rows+you+changed%2C+removed%2C+or+could+not+clean+confidently "Open in the Codex app")

## Introduction

Codex is great at systematically cleaning tabular data.
When a CSV or spreadsheet has mixed dates, duplicate rows, currency strings, blank cells, aliases, or pasted summary rows, ask Codex to clean a copy and leave the original file unchanged.
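
To make two of these problems concrete, here is a minimal Python sketch of the value-level normalization a cleaning pass might apply to mixed dates and currency strings. It is an illustration under stated assumptions, not the script Codex actually writes: the function names `normalize_date` and `normalize_currency` are hypothetical, and the two date formats are the ones named in the starter prompt. Values that cannot be parsed come back as `None` so they can be listed in a data-quality note instead of being guessed.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed input formats, taken from the starter prompt's example problems.
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def normalize_date(value):
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    text = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next assumed format
    return None  # flag for the data-quality note rather than guessing


def normalize_currency(value):
    """Strip '$' and commas; keep blank cells blank; None if unparseable."""
    text = value.strip()
    if not text:
        return ""  # blank stays blank, as the starter prompt requests
    cleaned = text.replace("$", "").replace(",", "")
    try:
        return str(Decimal(cleaned))
    except InvalidOperation:
        return None  # could not clean confidently
```

Returning `None` for unparseable values, instead of a default such as zero, keeps uncertain cells visible for review.
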

[Video](https://cdn.openai.com/codex/docs/developers-website/use-cases/data-analysis-cleaning-csv.mp4)

## How to use

1. Drag the file into Codex or mention it in your prompt, such as `@customer-export.csv`.
2. Describe the problems you already see.
3. Tell Codex what the cleaned version should be: CSV, spreadsheet tab, or upload-ready file.
4. Review the cleaned copy before using it.

Use the starter prompt on this page for the first cleaning pass. Replace the file name and bullets with your own. The useful details are the problems you already see and the file you need next: a cleaned CSV, a clean spreadsheet tab, or an upload-ready file. After Codex writes the clean copy, open the cleaned file and the data-quality note from the thread before using the data downstream.
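
The review step can be partly checked with a short script. The sketch below is one possible approach, not a Codex feature: it hashes the original file so you can confirm it was left unchanged, and it compares row IDs between the original and the cleaned copy to list the rows that were removed. The helper names and the `row_id` column are hypothetical; substitute whatever identifier column your export uses.

```python
import csv
import hashlib


def file_sha256(path):
    """Hash a file so you can confirm the original was not modified."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_rows(path):
    """Load a CSV file as a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def summarize_changes(original_rows, cleaned_rows, id_column="row_id"):
    """Compare row counts and list source IDs missing from the cleaned copy.

    `id_column` is a hypothetical name for the preserved source row ID.
    """
    cleaned_ids = {row[id_column] for row in cleaned_rows}
    removed = [
        row[id_column] for row in original_rows
        if row[id_column] not in cleaned_ids
    ]
    return {
        "original_rows": len(original_rows),
        "cleaned_rows": len(cleaned_rows),
        "removed_ids": removed,
    }
```

Record `file_sha256` for the original before the cleaning pass and compare it afterward. Then check the removed IDs from `summarize_changes` against the rows listed in the data-quality note.
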

## Related use cases

- [Query tabular data](https://developers.openai.com/codex/use-cases/analyze-data-export): Use Codex with a CSV, spreadsheet, dashboard export, Google Sheet, or local data file to... (Data, Knowledge Work)
- [Turn feedback into actions](https://developers.openai.com/codex/use-cases/feedback-synthesis): Connect Codex to multiple data sources such as Slack, GitHub, Linear, or Google Drive to... (Data, Integrations)
- [Coordinate new-hire onboarding](https://developers.openai.com/codex/use-cases/new-hire-onboarding): Use Codex to gather approved new-hire context, stage tracker updates, draft team-by-team... (Integrations, Data)