Which Table Format Do LLMs Understand Best? (Results for 11 Formats)

When discussing the reliability of AI-based systems, there’s something fundamental that doesn’t get enough attention: what’s the best format for passing tables of data to an LLM?

Should you use markdown tables or CSV?

JSON or YAML?

Or does some other format work better than any of these?

Why This Question Matters

As AI systems become integral to data analysis, business intelligence, and decision-making processes, understanding format sensitivity is crucial for:

Data Pipeline Architecture: Structuring data workflows for maximum AI comprehension
Performance Optimization: Reducing processing overhead while maintaining accuracy
Cost Management: Minimizing token usage and API costs in production systems

Many RAG pipelines involve ingesting documents that contain tables of data. If we’re not formatting that data in a way that is easy for an LLM to consume, then we may be needlessly hurting the accuracy of the overall system.

Our Methodology

We designed a controlled experiment to test how the formatting of a set of data would affect how accurately an LLM could answer questions about that data.

Our tests involved passing 1,000 records to an LLM and asking it to answer a question based on the data. We then evaluated whether it answered correctly in each case.

We repeated this process for 1,000 questions, using each of 11 different data formats.

Dataset: 1,000 synthetic employee records with 8 attributes each (ID, name, age, city, department, salary, experience, project count)
Questions: 1,000 randomized queries about specific data points
Model: GPT-4.1-nano
Formats Tested: 11 different data representation formats
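
To make this concrete, here is a minimal sketch of how such an evaluation loop might be written (it assumes the OpenAI Python client; the function names, prompt wording and exact-match scoring are illustrative simplifications, not our production harness):

```python
from openai import OpenAI

client = OpenAI()

def ask(serialized_data: str, question: str, model: str = "gpt-4.1-nano") -> str:
    """Send the full serialized dataset plus one question; return the raw answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Answer using only the data provided."},
            {"role": "user", "content": f"{serialized_data}\n\n{question}"},
        ],
    )
    return response.choices[0].message.content.strip()

def accuracy(serialized_data: str, qa_pairs: list[tuple[str, str]]) -> float:
    """Fraction of questions answered with an exact string match."""
    correct = sum(ask(serialized_data, q) == a for q, a in qa_pairs)
    return correct / len(qa_pairs)
```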

Example Question-Answer Pairs

Q. “How many years of experience does Grace X413 have? (Return just the number, e.g. ‘12’.)”
A. “15”

Q. “What is Alice W204’s salary? (Return just the number, e.g. ‘85200’.)”
A. “131370”

Notes on Methodology

We opted to pass a relatively large number of records to the LLM in order to test its limits. In practice, with a large structured dataset, you’ll often want to chunk it up and/or query it in some way in order to extract just the most relevant records or information, and pass only that reduced context to the LLM.
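
As a rough illustration of that kind of pre-filtering (the field name and matching rule below are hypothetical, not part of our experiment):

```python
def relevant_records(records: list[dict], question: str) -> list[dict]:
    """Naive relevance filter: keep only records whose name appears in the question."""
    return [r for r in records if r["name"] in question]

# For "What is Alice W204's salary?", only Alice W204's record survives,
# so the prompt shrinks from 1,000 records to a handful before serialization.
```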

When using formats such as CSV, HTML tables and markdown tables that involve headers, you may want to repeat those headers on a regular basis (e.g. every 100 records) to help with understanding. For simplicity, we didn’t do that here.

How Well Did the LLM Understand Each Format?

| Format | Accuracy | 95% Confidence Interval | Tokens |
| --- | --- | --- | --- |
| Markdown-KV | 60.7% | 57.6% – 63.7% | 52,104 |
| XML | 56.0% | 52.9% – 59.0% | 76,114 |
| INI | 55.7% | 52.6% – 58.8% | 48,100 |
| YAML | 54.7% | 51.6% – 57.8% | 55,395 |
| HTML | 53.6% | 50.5% – 56.7% | 75,204 |
| JSON | 52.3% | 49.2% – 55.4% | 66,396 |
| Markdown-Table | 51.9% | 48.8% – 55.0% | 25,140 |
| Natural-Language | 49.6% | 46.5% – 52.7% | 43,411 |
| JSONL | 45.0% | 41.9% – 48.1% | 54,407 |
| CSV | 44.3% | 41.2% – 47.4% | 19,524 |
| Pipe-Delimited | 41.1% | 38.1% – 44.2% | 43,098 |

Highlights

Format seems important: we saw significant differences in understanding between the different formats.
CSV and JSONL performed poorly: there’s potential for a quick win if you’re currently using one of these formats by default.
Markdown-KV came out top, hitting 60.7% accuracy and landing roughly 16 points ahead of CSV. (Markdown-KV is our term for a non-standardised format featuring “key: value” pairs in markdown; there’s a short sketch of it after this list.)
Accuracy cost tokens: the top-performing Markdown-KV format used 2.7 times as many tokens as the most token-efficient format, CSV.
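
To show what the top and bottom formats actually look like, here is a minimal sketch that renders one record in both Markdown-KV and CSV (the field names come from the dataset description above; the record values and exact layout are illustrative, and the serialization used in the benchmark may differ in detail):

```python
def to_markdown_kv(record: dict) -> str:
    """One 'key: value' line per field; records are separated by blank lines."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())

def to_csv_row(record: dict) -> str:
    """A single comma-separated row (a real CSV file would also carry a header row)."""
    return ",".join(str(value) for value in record.values())

# Illustrative record (values are made up, not taken from the benchmark data).
record = {
    "id": 1, "name": "Grace X413", "age": 34, "city": "Berlin",
    "department": "Engineering", "salary": 98000,
    "years_experience": 15, "project_count": 4,
}

print(to_markdown_kv(record))  # id: 1 / name: Grace X413 / ... one line per field
print(to_csv_row(record))      # 1,Grace X413,34,Berlin,Engineering,98000,15,4
```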
