Data Preparation

Data Transformation: Log and Box-Cox on Lattice

Use this tool to adjust skewed data distributions so your analyses better reflect underlying patterns. If your data values are bunched at one end of a range rather than spread evenly, this method creates a new, transformed version of your dataset while keeping your original information untouched for future verification.

Understanding Data Transformation

In many datasets, values are not spread out evenly. Some columns might have a 'tail' of high values that pulls the average away from the center. This skew can make it difficult for statistical methods to identify clear trends.

The log and Box-Cox transformations work by mathematically re-scaling your data. Both compress large values more than small ones, which pulls a long right tail in and gives the distribution a more symmetric shape. Many statistical methods assume roughly symmetric data, so comparisons between groups and estimates of relationships tend to be more reliable after the transformation.
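The compression described above can be seen in a few lines of plain Python. This is only an illustration with made-up values, not output from Lattice: the natural log turns values spanning three orders of magnitude into evenly spaced numbers.

```python
import math

# Hypothetical right-skewed values spanning three orders of magnitude.
values = [1.0, 10.0, 100.0, 1000.0]
logged = [math.log(v) for v in values]

raw_range = max(values) - min(values)
log_range = max(logged) - min(logged)

# Equal ratios in the raw data become equal gaps after the log.
gaps = [b - a for a, b in zip(logged, logged[1:])]

print(f"raw range: {raw_range:.1f}, log range: {log_range:.2f}")
print("gaps between consecutive logged values:", [round(g, 3) for g in gaps])
```

Each tenfold jump in the raw data becomes the same fixed step on the log scale, which is why a long tail of high values stops dominating the picture.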

Safe Transformation Methods

Lattice provides two primary ways to adjust your data: log transformation and Box-Cox. Log transformation is a standard way to shrink the scale of positive numbers. Box-Cox is an automated approach that tries different values of a power parameter (usually written λ) and keeps the one that best stabilizes the variance and reduces skew.
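To make the idea of testing different power settings concrete, here is a minimal sketch of a Box-Cox search in plain Python. It assumes the textbook definition of the transform and scores each candidate λ with the standard profile log-likelihood on a coarse grid. The function names and the grid are illustrative assumptions; the source does not describe how Lattice's engine selects λ.

```python
import math

def box_cox(values, lam):
    """Textbook Box-Cox transform: log(x) at lam == 0, else (x**lam - 1) / lam."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def log_likelihood(values, lam):
    """Profile log-likelihood of lam under a normal model for the transformed data."""
    y = box_cox(values, lam)
    n = len(y)
    mean = sum(y) / n
    var = sum((yi - mean) ** 2 for yi in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)

def best_lambda(values, grid=None):
    """Return the grid value of lam with the highest log-likelihood."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # -2.0 to 2.0 in steps of 0.1
    return max(grid, key=lambda lam: log_likelihood(values, lam))

# Hypothetical data whose logarithms are symmetric around zero.
skewed = [math.exp(z) for z in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
chosen = best_lambda(skewed)
print("selected lambda:", chosen)
```

For data like this, whose logarithms are already symmetric, the search lands on λ = 0, which is the plain log transformation. Production libraries typically use a continuous optimizer rather than a grid.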

Both methods require strictly positive numbers, so Lattice includes built-in checks. If your data contains zeros or negative numbers, the tool will pause and prompt you to add a 'shift' value, a constant added to every value so that the calculation is valid.
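The positivity check and shift can be sketched as follows. The helper names `suggest_shift` and `safe_log`, and the default margin of 1.0, are hypothetical choices for illustration; the source does not say how Lattice picks or validates a shift.

```python
import math

def suggest_shift(values, margin=1.0):
    """Return a constant that moves the smallest value up to `margin`.

    Returns 0.0 when every value is already strictly positive.
    """
    lowest = min(values)
    if lowest > 0:
        return 0.0
    return margin - lowest

def safe_log(values, shift=0.0):
    """Apply a natural log after an optional shift, refusing invalid input."""
    shifted = [v + shift for v in values]
    if min(shifted) <= 0:
        raise ValueError(
            "log requires strictly positive values; "
            f"try a shift of at least {suggest_shift(values)}"
        )
    return [math.log(v) for v in shifted]

# Hypothetical column containing a negative number and a zero.
column = [-3.0, 0.0, 5.0]
shift = suggest_shift(column)
transformed = safe_log(column, shift=shift)
print("shift:", shift, "transformed:", [round(t, 3) for t in transformed])
```

Keep in mind that a shift changes the shape of the result, so the same shift should be recorded and reused whenever the transformed values are compared or back-transformed.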

Maintaining Data Integrity

Every operation you perform on Lattice creates a 'derived dataset.' This means that your original CSV remains exactly as you uploaded it, acting as a permanent record. When you run a transformation, you receive a new dataset ID specifically for the adjusted values.

This approach allows you to chain steps. You can filter your data, transform it, and then run a test, all while maintaining a clear audit trail that shows exactly what happened to your numbers at every stage of the process.
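One way to picture derived datasets and the audit trail is an immutable record that points back to its parent. The sketch below is an assumption about the general pattern, not Lattice's data model: the `Dataset` class, the `derive` helper, and the `ds-N` ID format are all invented for illustration.

```python
import math
from dataclasses import dataclass
from itertools import count
from typing import Optional

_ids = count(1)  # hypothetical ID generator

@dataclass(frozen=True)
class Dataset:
    dataset_id: str
    rows: tuple
    parent_id: Optional[str] = None
    operation: str = "upload"

def derive(parent, operation, fn):
    """Build a new dataset from `parent` without touching the parent's rows."""
    return Dataset(
        dataset_id=f"ds-{next(_ids)}",
        rows=tuple(fn(parent.rows)),
        parent_id=parent.dataset_id,
        operation=operation,
    )

original = Dataset(dataset_id=f"ds-{next(_ids)}", rows=(0.5, 2.0, 40.0, 900.0))
filtered = derive(original, "filter: value >= 1", lambda rows: [r for r in rows if r >= 1])
logged = derive(filtered, "log transform", lambda rows: [math.log(r) for r in rows])

# Print the chain from the original upload to the latest derived dataset.
for ds in (original, filtered, logged):
    print(ds.dataset_id, "<-", ds.parent_id, "|", ds.operation)
```

Because each record is frozen and carries its parent's ID, the original rows cannot be altered by later steps and every derived dataset can be traced back to the upload.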

Interpreting Your Results

After applying a transformation, Lattice provides a summary comparing the 'before' and 'after' states. You can look at metrics like skewness and kurtosis to see the effect of the tool immediately.

For example, if your skewness drops from a large positive value toward zero, the transformation has made your distribution more symmetric. This feedback loop helps you confirm that the transformation worked as expected before moving forward with your analysis.
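A before-and-after comparison of this kind can be reproduced with the standard moment-based formulas. The sketch below uses population skewness and excess kurtosis (kurtosis minus 3); whether Lattice reports these exact variants is an assumption, and the sample values are made up.

```python
import math

def shape_summary(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# Hypothetical right-skewed column and its log-transformed version.
before = [math.exp(z) for z in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
after = [math.log(v) for v in before]

skew_before, kurt_before = shape_summary(before)
skew_after, kurt_after = shape_summary(after)

print(f"before: skewness={skew_before:.3f}, excess kurtosis={kurt_before:.3f}")
print(f"after:  skewness={skew_after:.3f}, excess kurtosis={kurt_after:.3f}")
```

Here the raw column has clearly positive skewness, while the logged column is symmetric and its skewness is essentially zero.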

1 · Intent → method

An LLM picks data_transform from a fixed catalog.

2 · Method → numbers

Deterministic Python engine runs the math. Same input → same output.

3 · Numbers → plain language

A second LLM translates the result into your domain’s vocabulary.

  • Will this tool change my original file?

    No. Lattice follows a strict 'never-modify' policy. Every log or Box-Cox transformation generates a new derived dataset, so your original uploaded file remains intact for audit and reference.

  • How do I choose between log and Box-Cox?

    Use the log method for simple positive skew. Box-Cox is more flexible: it covers the log transformation as a special case (a power parameter of zero) and can automatically determine the adjustment that makes your data distribution most symmetric.
