Data Preparation

Missing Value Imputation on Lattice

When your dataset has empty cells, missing value imputation fills those gaps so your analysis can continue. Lattice leaves your original data intact and writes the filled values to a new, prepared dataset for your next step.

Handling Empty Data Safely

Empty cells are common in real-world data collection, but they often stop analysis tools from working correctly. Instead of manually cleaning your file, you can use missing value imputation to address these gaps systematically.

Lattice handles this by creating a new dataset based on your chosen strategy. This ensures that you always have a clear trail back to your original, raw data if you need to compare results or re-run your process with different settings.
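
The idea of a derived dataset can be sketched in a few lines of plain Python. This is an illustration only, not Lattice's implementation: the function name fill_constant and the list-of-dicts layout are assumptions made for the example.

```python
from copy import deepcopy

def fill_constant(rows, column, value):
    """Return a new list of rows with None in `column` replaced by `value`.

    Hypothetical helper for illustration; the source rows are never mutated.
    """
    derived = deepcopy(rows)  # copy first so the raw data stays untouched
    for row in derived:
        if row[column] is None:
            row[column] = value
    return derived

raw = [{"region": "north"}, {"region": None}]
prepared = fill_constant(raw, "region", "unknown")

print(raw[1]["region"])       # None (original is unchanged)
print(prepared[1]["region"])  # unknown
```

Because the raw rows survive, the same input can be re-run with a different fill value and the two results compared side by side.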

Choosing Your Filling Strategy

Different data types call for different strategies. For numeric columns, use 'mean' when the values are roughly symmetric, or 'median' when extreme values would skew the average. For categorical or grouping columns, a 'constant' value (for example, a label such as "Unknown") marks missing entries explicitly instead of guessing a category.

If your data tracks changes over time, forward fill copies the most recent preceding observation into a gap, and backward fill copies the next following one. Both assume the value did not change between observations.
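
The strategies above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the Lattice engine: the function impute, the use of None for an empty cell, and the strategy names "forward" and "backward" are all hypothetical.

```python
from statistics import mean, median

def impute(values, strategy, fill_value=None):
    """Return a new list with None entries filled according to `strategy`.

    Illustrative only; strategy names "forward"/"backward" are assumptions.
    """
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
        return [fill if v is None else v for v in values]
    if strategy == "median":
        fill = median(observed)
        return [fill if v is None else v for v in values]
    if strategy == "constant":
        return [fill_value if v is None else v for v in values]
    if strategy in ("forward", "backward"):
        # Backward fill is a forward fill over the reversed sequence.
        ordered = values if strategy == "forward" else values[::-1]
        filled, last = [], None
        for v in ordered:
            if v is not None:
                last = v
            filled.append(last)  # stays None until a value has been seen
        return filled if strategy == "forward" else filled[::-1]
    raise ValueError(f"unknown strategy: {strategy!r}")

incomes = [10, 12, None, 11, 100]          # 100 is an extreme value
print(impute(incomes, "mean"))             # [10, 12, 33.25, 11, 100]
print(impute(incomes, "median"))           # [10, 12, 11.5, 11, 100]

readings = [None, 5, None, None, 7]
print(impute(readings, "forward"))         # [None, 5, 5, 5, 7]
print(impute(readings, "backward"))        # [5, 5, 7, 7, 7]
```

In the income example the single extreme value pulls the mean fill up to 33.25, while the median fill of 11.5 stays close to the typical observations. The forward fill leaves the first reading empty because nothing precedes it.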

Auditability and Data Integrity

Every time you perform an imputation, Lattice generates a summary report. This report tracks exactly how many values were filled in each column and identifies if any missing values remain, such as when a forward-fill at the very beginning of a dataset has no prior value to reference.
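
A report of this kind could be computed roughly as follows. The sketch assumes a column-oriented dict with None for empty cells; the function name and the report keys "filled" and "remaining" are hypothetical and do not reflect Lattice's actual report format.

```python
def forward_fill_with_report(columns):
    """Forward-fill each column and count filled and still-missing cells.

    Hypothetical sketch; returns (new_columns, report) and leaves the
    input dict unmodified.
    """
    filled_columns, report = {}, {}
    for name, values in columns.items():
        filled, last = [], None
        for v in values:
            if v is not None:
                last = v
            filled.append(last)
        missing_before = sum(v is None for v in values)
        remaining = sum(v is None for v in filled)
        report[name] = {
            "filled": missing_before - remaining,
            "remaining": remaining,
        }
        filled_columns[name] = filled
    return filled_columns, report

data = {
    "temp": [None, 21.5, None, 22.0],  # leading gap has no prior value
    "site": ["A", None, None, "B"],
}
prepared, report = forward_fill_with_report(data)
print(report["temp"])  # {'filled': 1, 'remaining': 1}
print(report["site"])  # {'filled': 2, 'remaining': 0}
```

The "temp" column shows the edge case described above: its first cell stays empty after a forward fill, so the report lists one remaining missing value.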

By enforcing strict rules, such as requiring numeric input for mean calculations, Lattice ensures that you do not accidentally introduce incorrect data types into your analysis. This keeps your workflow transparent and prevents common errors that occur when mixing numeric and non-numeric content.
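
One way such a rule might be enforced is a type check before any arithmetic runs. The sketch below is an assumption about how a check like this could look, not Lattice's code; mean_fill_strict is a hypothetical name.

```python
from numbers import Real
from statistics import mean

def mean_fill_strict(values):
    """Fill None with the mean, rejecting any non-numeric observation.

    Hypothetical validation sketch for illustration.
    """
    observed = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so exclude it explicitly.
    bad = [v for v in observed if isinstance(v, bool) or not isinstance(v, Real)]
    if bad:
        raise TypeError(
            f"mean strategy requires numeric values, found {bad[0]!r}"
        )
    if not observed:
        raise ValueError("cannot compute a mean from an all-empty column")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

print(mean_fill_strict([1.0, None, 3.0]))  # [1.0, 2.0, 3.0]

try:
    mean_fill_strict([1.0, None, "n/a"])
except TypeError as err:
    print(err)  # mean strategy requires numeric values, found 'n/a'
```

Failing early with a clear message keeps a stray text value from silently turning a numeric column into mixed content.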

1 · Intent → method

An LLM picks data_fillna from a fixed catalog.

2 · Method → numbers

A deterministic Python engine runs the math. Same input → same output.

3 · Numbers → plain language

A second LLM translates the result into your domain’s vocabulary.

  • Does this tool overwrite my original data?

    No. Lattice never modifies your original file. When you use missing value imputation, the platform creates a new, derived dataset that contains your changes, leaving the source data untouched for auditability.

  • Why can't I use the 'mean' strategy on a text column?

    The mean and median strategies rely on arithmetic, which only applies to numeric data. Lattice enforces this rule to prevent calculation errors and to keep your analysis results statistically valid.
