CSV Encoding & Delimiter Fixer

Free in-browser tool to fix CSV delimiter and encoding problems before import — detect UTF-8 vs Latin-1, commas vs semicolons, and BOM issues. No upload.

A supplier price list that opens cleanly on one machine and turns into mojibake on the next is almost always a CSV that disagrees with the program reading it. This tool helps you fix CSV delimiter and encoding problems before import: it detects the real field separator, the character encoding, and the row structure of any file you paste or open locally, then tells you in plain language what to change.

The interactive version of this tool is coming soon. It will run entirely in your browser — no login, no upload limits.

What it checks

  • Delimiter detection — whether fields are separated by commas, semicolons, tabs, or pipes, and whether that choice is consistent across every row. A German MRO supplier exporting from a semicolon locale is the classic source of a single-column mess.
  • Character encoding — distinguishes UTF-8 (with or without a byte-order mark), UTF-16, and legacy single-byte encodings such as Windows-1252 and ISO-8859-1, the usual cause of garbled accents in furniture and CPG brand names.
  • Byte-order mark (BOM) — flags a leading BOM that can corrupt the first header (for example, turning SKU into ï»¿SKU when the bytes are displayed as Windows-1252) and break column mapping on import.
  • Quoting and escaping — checks that fields containing the delimiter, line breaks, or quote characters are wrapped and escaped per the standard, so a product description with a comma does not shift every later column.
  • Row width consistency — counts fields per row and reports rows with too many or too few columns, the signature of an unescaped delimiter inside a free-text field.
  • Line endings — identifies LF, CRLF, or mixed terminators that cause trailing-whitespace or phantom-row errors in stricter importers.
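
The row-width and line-ending checks in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual code: the function name check_structure and its return shape are hypothetical, the delimiter and encoding are assumed to be known already, and the header row is assumed to define the expected width.

```python
import csv
import io


def check_structure(raw: bytes, delimiter: str = ",", encoding: str = "utf-8") -> dict:
    """Report line-ending styles and records whose field count differs from the header's."""
    # Count terminators on the raw bytes so decoding cannot normalise them away.
    # Note: line breaks inside quoted fields are counted too.
    crlf = raw.count(b"\r\n")
    endings = {
        "CRLF": crlf,
        "LF": raw.count(b"\n") - crlf,
        "CR": raw.count(b"\r") - crlf,
    }
    styles = sorted(name for name, n in endings.items() if n > 0)

    # newline="" leaves line breaks untouched so the csv module can keep
    # quoted multi-line fields together as one record.
    text = raw.decode(encoding)
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    expected = len(rows[0]) if rows else 0

    # Record numbers are 1-based and count parsed records, not physical lines.
    bad_rows = [
        (number, len(row))
        for number, row in enumerate(rows, start=1)
        if row and len(row) != expected
    ]
    return {"line_endings": styles, "expected_fields": expected, "bad_rows": bad_rows}
```

More than one entry in line_endings means mixed terminators, and each entry in bad_rows pairs a record number with the field count actually found there.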

How it works

CSV looks trivial but has no single enforced standard; the common reference is RFC 4180, and most real files deviate from it. The tool reads the raw bytes of your file and applies the same heuristics a careful parser would. For the delimiter, it tallies how often each candidate separator appears per line and picks the one that produces the most uniform column count. For encoding, it inspects the byte signature — a BOM, valid UTF-8 multi-byte sequences, or high bytes that only make sense as Windows-1252 — and reports the most likely match rather than guessing silently the way a spreadsheet auto-import does.
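
A rough Python sketch of the two heuristics described above follows. The names guess_encoding, guess_delimiter, and CANDIDATES are hypothetical, and several simplifications are assumptions rather than the tool's behavior: anything that is not valid UTF-8 is treated as Windows-1252, UTF-32 is not handled, and delimiter counts ignore quoting.

```python
import codecs
from collections import Counter

CANDIDATES = [",", ";", "\t", "|"]

BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(raw: bytes) -> str:
    """Return a likely codec name: BOM first, then strict UTF-8, then cp1252."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name
    try:
        raw.decode("utf-8")  # strict: any invalid multi-byte sequence raises
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: fall back to Windows-1252 for any other high bytes.
        return "cp1252"


def guess_delimiter(text: str, sample_lines: int = 50) -> str:
    """Pick the candidate whose per-line count is non-zero and most uniform."""
    lines = [ln for ln in text.splitlines()[:sample_lines] if ln.strip()]
    if not lines:
        return ","
    best, best_score = ",", (0.0, 0)
    for cand in CANDIDATES:
        counts = Counter(ln.count(cand) for ln in lines)
        mode, freq = counts.most_common(1)[0]
        if mode == 0:
            continue  # most lines do not contain this candidate at all
        # Share of lines that agree on the count, then the count itself.
        score = (freq / len(lines), mode)
        if score > best_score:
            best, best_score = cand, score
    return best
```

For a semicolon export with decimal commas, the semicolon wins because its count is the same on every line while the comma count varies from row to row.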

This matters because the failure is usually invisible until late. A 40,000-row CPG feed imports “successfully,” and only weeks later does someone notice that prices are off by one column in every row whose product name contains an unquoted comma. Catching it at the file level is far cheaper than reconciling it downstream.

Symptom Likely cause Fix
Whole row in one cell Wrong delimiter (semicolon read as comma) Re-export or convert to the expected separator
Café shows as Caf├⌐ Latin-1 / Windows-1252 read as UTF-8 Convert the file to UTF-8
First header has hidden chars Leading byte-order mark Strip the BOM
Columns shift mid-file Unescaped delimiter in a text field Quote and escape affected fields

FAQ

Why does my CSV open as a single column in Excel or Google Sheets?

The file uses a delimiter the program does not expect for your locale — most often semicolons where a comma is assumed, or vice versa. Spreadsheet apps pick a separator based on regional settings and rarely tell you when the guess is wrong. Run the file through the detector above to confirm the real delimiter, then re-export or convert it to match what your importer expects.

How do I fix garbled characters like é or ’ in a CSV?

Those artifacts mean the file is encoded in a single-byte set such as Windows-1252 or ISO-8859-1 but is being read as UTF-8. The fix is to convert the file to UTF-8 (without a BOM, for most importers). The tool reports the detected encoding so you know what you are converting from before you re-save.
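
To make the mismatch concrete, here is a small Python sketch. It assumes the damage is a single round of UTF-8 bytes decoded as Windows-1252 (Python codec name cp1252), which is what produces sequences such as Ã© and â€™. The helper names repair_mojibake and to_utf8 are hypothetical.

```python
def repair_mojibake(garbled: str, wrong_codec: str = "cp1252") -> str:
    """Undo one round of UTF-8 bytes having been decoded with wrong_codec."""
    try:
        return garbled.encode(wrong_codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (or already clean): leave the text unchanged.
        return garbled


def to_utf8(raw: bytes, source_codec: str) -> bytes:
    """Re-encode a file's bytes from a known legacy codec to UTF-8 (no BOM)."""
    return raw.decode(source_codec).encode("utf-8")


# How the damage arises: UTF-8 bytes read with a single-byte codec.
garbled = "Café".encode("utf-8").decode("cp1252")
print(garbled)                   # CafÃ©
print(repair_mojibake(garbled))  # Café
```

Repairing text after the fact is a fallback. Where the original bytes are still available, decoding them with the right codec once, as to_utf8 does, is the more reliable route.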

What is a BOM and should I remove it?

A byte-order mark is a few invisible bytes at the start of a file that signal its encoding. Many importers do not expect one and will fold it into your first header — turning SKU into something that no longer matches your column mapping. For most CSV pipelines, saving as “UTF-8 without BOM” is the safest choice.
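
As an illustration, the UTF-8 BOM is the three bytes EF BB BF. The Python sketch below shows how it ends up inside the first header and two ways to drop it. The helper name strip_utf8_bom is hypothetical, while codecs.BOM_UTF8 and the utf-8-sig codec are part of the standard library.

```python
import codecs


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return raw without a leading UTF-8 byte-order mark, if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


raw = codecs.BOM_UTF8 + b"SKU,price\nA1,10\n"

# Decoding as plain UTF-8 keeps the BOM as an invisible U+FEFF in the header.
first_header = raw.decode("utf-8").split(",")[0]
print(first_header == "SKU")  # False: the header is "\ufeffSKU"

# Either strip the bytes first, or let the utf-8-sig codec skip the BOM.
print(strip_utf8_bom(raw).decode("utf-8").split(",")[0] == "SKU")  # True
print(raw.decode("utf-8-sig").split(",")[0] == "SKU")              # True
```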

Is there a maximum file size, and is my data safe?

There is no enforced size limit because processing happens entirely in your browser; the practical ceiling is your device’s memory. Because nothing is uploaded, confidential price lists and full catalog exports stay on your machine.

What is the difference between comma, semicolon, and tab delimited files?

They are all “CSV-style” flat files that differ only in the character separating fields. Commas are the English-locale default, semicolons are common in European exports (where commas are decimal separators), and tabs avoid clashing with either. None is more correct — what matters is that the file’s delimiter matches what the receiving system expects.
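
If the receiving system expects a different separator, a conversion along these lines is one option. This is a minimal Python sketch using the standard csv module; the function name convert_delimiter and its defaults are assumptions. Re-parsing and re-writing, rather than doing a plain find-and-replace, is what keeps decimal commas and commas inside text fields from becoming extra columns.

```python
import csv
import io


def convert_delimiter(text: str, source: str = ";", target: str = ",") -> str:
    """Re-write CSV text with a new delimiter, quoting fields where needed."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=source)
    out = io.StringIO(newline="")
    writer = csv.writer(
        out,
        delimiter=target,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields containing special characters
        lineterminator="\n",
    )
    writer.writerows(reader)
    return out.getvalue()


semicolon_csv = "sku;name;price\nA1;Chair, oak;10,50\n"
print(convert_delimiter(semicolon_csv))
# sku,name,price
# A1,"Chair, oak","10,50"
```

The decimal comma in 10,50 survives as quoted text. Whether the importer then parses it as a number still depends on that importer's locale settings.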