Dataset Health Analyzer: Score & Export a Data Quality Report
Score any CSV's data quality in seconds and export a shareable report as Markdown or PDF. Free, in-browser dataset health analyzer — no uploads, no signup.
Before a dataset becomes a dashboard, a model, or a database table, one question decides whether the result is trustworthy: is the data actually any good? The Dataset Health Analyzer answers it in seconds — it reads a CSV, grades its quality from 0 to 100, explains exactly what's wrong, and lets you export the whole report as Markdown or PDF to share with your team.
Everything runs in your browser. Paste a CSV or drop a .csv file and you get a full health report without a single byte leaving your device.
What the Dataset Health Analyzer does
Instead of running six separate checks by hand, the analyzer profiles the whole file in one pass and rolls the findings into a single graded score. At the top you get the headline numbers — health score and letter grade, row and column counts, missing cells, duplicates, type mismatches, whitespace, outliers, and constant or empty columns — then a detailed breakdown beneath.
The health score: six quality dimensions
The 0–100 score is a weighted blend of six dimensions, so a single number reflects the things that actually break downstream work:
- Completeness (25%) — how few cells are missing or null-like.
- Validity (20%) — how well values match their column's dominant type.
- Structure (15%) — consistent row width, no ragged or empty columns.
- Uniqueness (15%) — how few rows are exact duplicates.
- Consistency (15%) — no stray whitespace, mixed casing, or numeric outliers.
- Headers (10%) — every column is named and unique.
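To make the weighting concrete, here is a minimal sketch of how six dimension scores could be blended into one number. The weights come from the list above; the function names, the assumption that each dimension is itself scored 0 to 100, and the letter-grade cut-offs are illustrative guesses, not the analyzer's documented internals.

```python
# Weights taken from the list above; they sum to 100.
WEIGHTS = {
    "completeness": 25,
    "validity": 20,
    "structure": 15,
    "uniqueness": 15,
    "consistency": 15,
    "headers": 10,
}


def health_score(dimension_scores: dict[str, float]) -> float:
    """Blend per-dimension scores (each assumed to be 0-100) into one 0-100 score."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    return round(weighted / total_weight, 1)


def letter_grade(score: float) -> str:
    """Hypothetical cut-offs; the analyzer's real grade thresholds are not stated."""
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return grade
    return "F"


# Made-up dimension scores for illustration only.
example = {
    "completeness": 92, "validity": 88, "structure": 100,
    "uniqueness": 97, "consistency": 75, "headers": 100,
}
print(health_score(example), letter_grade(health_score(example)))
```

Because completeness carries the largest weight, a drop of ten points there costs 2.5 points overall, while the same drop in headers costs only 1.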
Per-column profiling
The score tells you something is wrong; the column table tells you where. Every column is profiled and flagged so you can jump straight to the problem field:
- Inferred type (integer, float, date, email, UUID, string, and more) with a consistency percentage.
- Missing count and percentage, plus cardinality — how many distinct values and the most common one.
- Numeric statistics: min, max, mean, median, standard deviation, and outliers beyond 1.5×IQR.
- Flags like sparse, whitespace, mixed case, constant, empty, outliers, or unique key (a likely identifier).
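Two of the checks above, the type consistency percentage and the 1.5×IQR outlier rule, can be sketched in a few lines. This is an illustration under stated assumptions: the helper names are hypothetical, the type guesser only knows integer, float, and string, and the quartile convention (Python's "inclusive" method) may differ from the one the analyzer uses.

```python
import statistics
from collections import Counter


def classify(value: str) -> str:
    """Tiny type guesser (assumption: only integer, float, string)."""
    text = value.strip()
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "string"


def dominant_type(values: list[str]) -> tuple[str, float]:
    """Most common type among non-empty cells, plus the % of cells matching it."""
    present = [v for v in values if v.strip() != ""]
    if not present:
        return ("empty", 0.0)
    counts = Counter(classify(v) for v in present)
    name, hits = counts.most_common(1)[0]
    return (name, round(100 * hits / len(present), 1))


def iqr_outliers(numbers: list[float], k: float = 1.5) -> list[float]:
    """Values lying more than k * IQR below Q1 or above Q3."""
    if len(numbers) < 4:
        return []  # too few points for meaningful quartiles
    q1, _, q3 = statistics.quantiles(numbers, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [x for x in numbers if x < low or x > high]


print(dominant_type(["10", "12", "11", "n/a", "13"]))  # ('integer', 80.0)
print(iqr_outliers([10, 12, 11, 13, 12, 11, 95]))      # [95]
```

In the second call the quartiles are 11 and 12.5, so the fences sit at 8.75 and 14.75 and only 95 falls outside them.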
Insights and recommendations
Below the tables, the analyzer turns its findings into a prioritized to-do list: drop these empty columns, deduplicate that many rows, trim whitespace in these cells, standardize casing in that column, review these numeric outliers. It's the difference between knowing the data is messy and knowing exactly what to fix first.
Score your dataset's health now: free, no signup, in your browser.
Run a health check step by step
1. Open the Dataset Health Analyzer and paste your CSV, or drag a .csv file onto the input.
2. Confirm the delimiter (commas, semicolons, tabs, and pipes are auto-detected).
3. Read the score header and summary cards for the headline verdict.
4. Scan the findings, insights, and per-column tables to see exactly what needs fixing.
5. Download the report as Markdown or PDF to share or attach to a ticket.
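For readers curious how delimiter auto-detection in the steps above might work, here is one possible heuristic, offered as a sketch rather than the analyzer's actual method: try each candidate delimiter and keep the one that splits the most sample rows into the same number of fields. The function name and the 20-line sample size are assumptions.

```python
import csv
from collections import Counter

# The four delimiters the article says are auto-detected.
CANDIDATES = [",", ";", "\t", "|"]


def detect_delimiter(text: str, sample_lines: int = 20) -> str:
    """Pick the candidate that yields the most consistent multi-column split."""
    lines = [ln for ln in text.splitlines() if ln.strip()][:sample_lines]
    best, best_score = ",", (0, 0)  # fall back to comma when nothing splits
    for delim in CANDIDATES:
        # csv.reader honours quoted fields, so "a,b" inside quotes stays one cell.
        widths = [len(row) for row in csv.reader(lines, delimiter=delim)]
        if not widths:
            continue
        width, rows = Counter(widths).most_common(1)[0]
        # Ignore delimiters that never split a row (modal width of 1).
        if width > 1 and (rows, width) > best_score:
            best, best_score = delim, (rows, width)
    return best


sample = "id;name;city\n1;Ada;London\n2;Linus;Helsinki\n"
print(repr(detect_delimiter(sample)))  # ';'
```

A heuristic like this can be fooled by short or unusual files, which is a good reason to confirm the detected delimiter before reading the score.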
Export a report as Markdown or PDF
A health check is most useful when you can hand it to someone else. Download the report as a Markdown file to drop into a pull request, wiki, or data-catalog entry — or export a PDF that mirrors the on-screen report exactly, score and tables included, ready to attach to a ticket or send to a stakeholder.
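As a rough picture of what a Markdown export contains, the sketch below renders a score and a per-column table as Markdown text. The layout, the function name, and the field names (`name`, `type`, `missing_pct`, `flags`) are hypothetical; the analyzer's real export format may differ.

```python
def to_markdown(score: float, grade: str, columns: list[dict]) -> str:
    """Render a small health report as Markdown (hypothetical layout)."""
    lines = [
        "# Dataset health report",
        "",
        f"**Score:** {score} / 100 (grade {grade})",
        "",
        "| Column | Type | Missing % | Flags |",
        "| --- | --- | --- | --- |",
    ]
    for col in columns:
        # Escape pipes so a column name cannot break the table row.
        name = str(col["name"]).replace("|", "\\|")
        flags = ", ".join(col["flags"]) or "none"
        lines.append(f"| {name} | {col['type']} | {col['missing_pct']} | {flags} |")
    return "\n".join(lines) + "\n"


# Made-up profile data for illustration only.
report = to_markdown(
    91.4,
    "A",
    [
        {"name": "id", "type": "integer", "missing_pct": 0.0, "flags": ["unique key"]},
        {"name": "city", "type": "string", "missing_pct": 12.5, "flags": ["whitespace", "mixed case"]},
        {"name": "notes", "type": "string", "missing_pct": 0.0, "flags": []},
    ],
)
print(report)
```

Plain Markdown like this pastes cleanly into a pull request description or wiki page, which is the use case the export is aimed at.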
Pair it with deeper tools
When the health report points at a specific problem, drill in with a focused tool: get a per-column missing-value breakdown, or run full statistics to investigate the outliers and spread behind a flagged numeric column.
Break down missing values column by column with the Missing Data Report.
Investigate outliers and spread with Column Statistics.