CSV Encoding & Delimiter Fixer
Free in-browser tool to fix CSV delimiter and encoding problems before import — detect UTF-8 vs Latin-1, commas vs semicolons, and BOM issues. No upload.
A supplier price list that opens cleanly on one machine and turns into mojibake on the next is almost always a CSV that disagrees with the program reading it. This tool helps you fix CSV delimiter and encoding problems before import: it detects the real field separator, the character encoding, and the row structure of any file you paste or open from your device, then tells you in plain language what to change.
The interactive version of this tool is coming soon. It will run entirely in your browser — no login, no upload limits.
Need this now? Talk to Claro
What it checks
- Delimiter detection — whether fields are separated by commas, semicolons, tabs, or pipes, and whether that choice is consistent across every row. A German MRO supplier exporting from a semicolon locale is the classic source of a single-column mess.
- Character encoding — distinguishes UTF-8 (with or without a byte-order mark), UTF-16, and legacy single-byte encodings such as Windows-1252 and ISO-8859-1, the usual cause of garbled accents in furniture and CPG brand names.
- Byte-order mark (BOM) — flags a leading BOM that can corrupt the first header (for example turning SKU into SKU preceded by an invisible U+FEFF character) and break column mapping on import.
- Quoting and escaping — checks that fields containing the delimiter, line breaks, or quote characters are wrapped and escaped per the standard, so a product description with a comma does not shift every later column.
- Row width consistency — counts fields per row and reports rows with too many or too few columns, the signature of an unescaped delimiter inside a free-text field.
- Line endings — identifies LF, CRLF, or mixed terminators that cause trailing-whitespace or phantom-row errors in stricter importers.
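As a rough illustration of the BOM and line-ending checks listed above, here is a minimal Python sketch. The function names `detect_bom` and `count_line_endings` are hypothetical and are not the tool's actual code. The byte-level line-ending count assumes an ASCII-compatible encoding such as UTF-8 or Windows-1252, and it also counts line breaks that sit inside quoted fields.

```python
import codecs

# Known byte-order marks, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def detect_bom(raw: bytes):
    """Return (encoding, bom_length) if raw starts with a BOM, else (None, 0)."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name, len(bom)
    return None, 0

def count_line_endings(raw: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR terminators in the raw bytes."""
    crlf = raw.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": raw.count(b"\n") - crlf,
        "CR": raw.count(b"\r") - crlf,
    }

# A UTF-8 file with a BOM and mixed CRLF / LF terminators.
sample = codecs.BOM_UTF8 + b"SKU,Name\r\nA1,Bolt\nA2,Nut\r\n"
print(detect_bom(sample))          # ('utf-8', 3)
print(count_line_endings(sample))  # {'CRLF': 2, 'LF': 1, 'CR': 0}
```

More than one non-zero entry in the line-ending counts is the "mixed terminators" case that stricter importers tend to reject.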
How it fixes CSV delimiter and encoding problems
CSV looks trivial but has no single enforced standard; the common reference is RFC 4180, and most real files deviate from it. The tool reads the raw bytes of your file and applies the same heuristics a careful parser would. For the delimiter, it tallies how often each candidate separator appears per line and picks the one that produces the most uniform column count. For encoding, it inspects the byte signature — a BOM, valid UTF-8 multi-byte sequences, or high bytes that only make sense as Windows-1252 — and reports the most likely match rather than guessing silently the way a spreadsheet auto-import does.
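The two heuristics described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the tool's implementation: `guess_encoding` and `guess_delimiter` are hypothetical names, the delimiter tally ignores quoting, and any byte sequence that is not valid UTF-8 is assumed to be Windows-1252.

```python
import codecs
from collections import Counter

CANDIDATES = [",", ";", "\t", "|"]

def guess_encoding(raw: bytes) -> str:
    """Guess an encoding from the byte signature (BOM, then UTF-8 validity)."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # High bytes that are not valid UTF-8: assume a legacy single-byte set.
        return "cp1252"

def guess_delimiter(text: str, max_lines: int = 50) -> str:
    """Pick the candidate whose per-line count is most uniform and non-zero."""
    lines = [ln for ln in text.splitlines() if ln.strip()][:max_lines]
    best, best_score = ",", (-1.0, 0)
    for delim in CANDIDATES:
        counts = [ln.count(delim) for ln in lines]
        if not counts or max(counts) == 0:
            continue
        mode_count, freq = Counter(counts).most_common(1)[0]
        if mode_count == 0:
            continue
        # Share of lines agreeing with the most common count, then that count.
        score = (freq / len(counts), mode_count)
        if score > best_score:
            best, best_score = delim, score
    return best

# A semicolon-delimited export with decimal commas, saved as Windows-1252.
raw = "SKU;Name;Price\nA1;Café table;1,50\nA2;Nut;0,75\n".encode("cp1252")
encoding = guess_encoding(raw)
text = raw.decode(encoding)
print(encoding, repr(guess_delimiter(text)))  # cp1252 ';'
```

In the sample, the comma appears in only two of three lines while the semicolon appears twice on every line, so the semicolon wins on uniformity even though commas are present.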
This matters because the failure is usually invisible until late. A 40,000-row CPG feed imports “successfully,” and only weeks later does someone notice every price after row 12 is off by one column because one product name contained a stray comma. Catching it at the file level is far cheaper than reconciling it downstream.
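The column-shift failure is easy to catch by counting fields per row before import. The sketch below uses Python's standard `csv` module. `find_ragged_rows` is a hypothetical helper name, and it assumes the delimiter is already known.

```python
import csv
import io
from collections import Counter

def find_ragged_rows(text: str, delimiter: str = ","):
    """Return (expected_width, [(line_number, width), ...]) for deviating rows."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    widths = []
    for row in reader:
        if row:  # csv.reader yields an empty list for blank lines
            widths.append((reader.line_num, len(row)))
    if not widths:
        return 0, []
    # Treat the most common width as the intended column count.
    expected = Counter(w for _, w in widths).most_common(1)[0][0]
    return expected, [(n, w) for n, w in widths if w != expected]

text = (
    "sku,name,price\n"
    "A1,Bolt M8,0.12\n"
    "A2,Washer, zinc plated,0.03\n"   # stray comma, not quoted
    'A3,"Nut, hex",0.05\n'            # same comma, correctly quoted
)
print(find_ragged_rows(text))  # (3, [(3, 4)])
```

Only the unquoted row is reported. The quoted row parses to three fields, which is why quoting free-text fields at export time prevents the shift.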
| Symptom | Likely cause | Fix |
|---|---|---|
| Whole row in one cell | Wrong delimiter (semicolon read as comma) | Re-export or convert to the expected separator |
| Café shows as Caf├⌐ | Latin-1 / Windows-1252 read as UTF-8 | Convert the file to UTF-8 |
| First header has hidden chars | Leading byte-order mark | Strip the BOM |
| Columns shift mid-file | Unescaped delimiter in a text field | Quote and escape affected fields |
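Several of the fixes in the table can be applied in one pass once the source encoding and delimiter are known. The following Python sketch is one possible approach, with `normalize_csv` as a hypothetical name. It re-encodes to UTF-8 without a BOM, switches to commas, and re-quotes fields as needed. It cannot repair rows where a delimiter was left unescaped in the source, so those still need manual review.

```python
import csv
import io

def normalize_csv(raw: bytes, source_encoding: str, source_delimiter: str) -> bytes:
    """Re-encode as UTF-8 without BOM, comma-delimited, with minimal quoting."""
    text = raw.decode(source_encoding)
    if text.startswith("\ufeff"):  # drop a decoded BOM if one survived decoding
        text = text[1:]
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=source_delimiter)
    out = io.StringIO(newline="")
    writer = csv.writer(
        out, delimiter=",", quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n"
    )
    writer.writerows(reader)
    return out.getvalue().encode("utf-8")

# Semicolon-delimited, Windows-1252, with commas inside two fields.
raw = "SKU;Name;Price\r\nA1;Café table, oak;149,00\r\n".encode("cp1252")
fixed = normalize_csv(raw, "cp1252", ";")
print(fixed.decode("utf-8"))
```

Fields that contain a comma, such as the product name and the decimal-comma price, come out wrapped in double quotes so that the new delimiter does not split them.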
Related resources
Glossary
What Is Schema Mapping?
Once the file parses cleanly, you still have to line its columns up with your own attributes.
Playbook
Map Supplier Attributes to Your Schema
A step-by-step approach to turning a clean supplier export into structured catalog records.
Guide
Supplier Onboarding Checklist for Distributors
Where file hygiene fits in the wider onboarding sequence, from intake to publish.
Tool
Shopify Product CSV Validator
Validate a product CSV against Shopify's required columns after you have fixed encoding and delimiters.
Tool
Product JSON / JSONL Schema Validator
Moving to a structured feed instead of CSV? Validate it against your schema.
Claro
Automate the whole intake
See how Claro ingests messy supplier files and resolves them into canonical product records.
FAQ
Why does my CSV open as a single column in Excel or Google Sheets?
The file uses a delimiter the program does not expect for your locale — most often semicolons where a comma is assumed, or vice versa. Spreadsheet apps pick a separator based on regional settings and rarely tell you when the guess is wrong. Run the file through the detector above to confirm the real delimiter, then re-export or convert it to match what your importer expects.
How do I fix garbled characters like é or ’ in a CSV?
Those artifacts mean the file is encoded in a single-byte set such as Windows-1252 or ISO-8859-1 but is being read as UTF-8. The fix is to convert the file to UTF-8 (without a BOM, for most importers). The tool reports the detected encoding so you know what you are converting from before you re-save.
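A short Python round trip shows how this kind of garbling arises and when it is reversible. This is a sketch for illustration only. The repair step assumes the text was mis-decoded exactly once as Windows-1252 and that no bytes were lost along the way.

```python
original = "Café’s"

# UTF-8 bytes decoded as Windows-1252 produce the familiar two- and
# three-character artifacts.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)  # CafÃ©â€™s

# Reversing the wrong decode recovers the text, as long as nothing was lost.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)  # True

# The opposite mismatch is lossy: Windows-1252 bytes are not valid UTF-8,
# so a lenient decoder substitutes U+FFFD replacement characters.
lossy = original.encode("cp1252").decode("utf-8", errors="replace")
print("\ufffd" in lossy)  # True
```

Once replacement characters have been written back to disk the original bytes are gone, so it is safer to convert from the source file than from a copy that has already been opened and re-saved.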
What is a BOM and should I remove it?
A byte-order mark is a few invisible bytes at the start of a file that signal its encoding. Many importers do not expect one and will fold it into your first header — turning SKU into something that no longer matches your column mapping. For most CSV pipelines, saving as “UTF-8 without BOM” is the safest choice.
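In Python, the effect on the first header and one way to avoid it look like this. The snippet is a minimal sketch. `utf-8-sig` is a standard Python codec that drops a leading BOM if present and otherwise behaves like plain UTF-8.

```python
import codecs
import csv
import io

raw = codecs.BOM_UTF8 + b"SKU,Name\r\nA1,Bolt\r\n"

# Plain "utf-8" keeps the BOM as an invisible U+FEFF glued to the first header.
naive_header = next(csv.reader(io.StringIO(raw.decode("utf-8"), newline="")))
print(naive_header[0] == "SKU")  # False

# "utf-8-sig" strips the BOM, so the header matches the column mapping again.
clean_header = next(csv.reader(io.StringIO(raw.decode("utf-8-sig"), newline="")))
print(clean_header[0] == "SKU")  # True
```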
Is there a maximum file size, and is my data safe?
There is no enforced size limit because processing happens entirely in your browser; the practical ceiling is your device’s memory. Because nothing is uploaded, confidential price lists and full catalog exports stay on your machine.
What is the difference between comma, semicolon, and tab delimited files?
They are all “CSV-style” flat files that differ only in the character separating fields. Commas are the English-locale default, semicolons are common in European exports (where commas are decimal separators), and tabs avoid clashing with either. None is more correct — what matters is that the file’s delimiter matches what the receiving system expects.
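For example, a semicolon-delimited export with decimal commas can be read in Python by naming the delimiter explicitly and converting the numbers afterwards. This is a sketch with made-up sample data, and it assumes prices contain no thousands separators.

```python
import csv
import io
from decimal import Decimal

text = "SKU;Name;Price\nA1;Bolt;1,50\nA2;Nut;0,75\n"

# Tell the parser which separator the file uses instead of relying on a default.
rows = list(csv.DictReader(io.StringIO(text, newline=""), delimiter=";"))

# Decimal commas are plain text to the parser; convert them explicitly.
prices = [Decimal(row["Price"].replace(",", ".")) for row in rows]
print(prices)  # [Decimal('1.50'), Decimal('0.75')]
```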