Linear discriminant analysis helps you classify items into distinct groups based on numeric data. Use this method when you want to understand which variables most effectively separate categories, for example when predicting a customer segment or diagnosing a condition, and you need a straightforward, stable boundary to visualize how those groups differ.
Understanding Group Separation
Linear discriminant analysis identifies the axes, called linear discriminants, that best separate your defined groups. It compares the mean of each feature across classes against the spread within each class, so variables whose group means differ widely relative to that spread contribute most to pulling the groups apart.
When you run this on Lattice, it calculates the 'explained variance' for these discriminants. This tells you what share of the between-group difference each discriminant accounts for, so you can see whether the first axis carries most of the separation or whether later axes also matter.
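To make the explained-variance idea concrete, here is a minimal sketch using scikit-learn. It is an illustration only: the choice of scikit-learn and of the bundled iris dataset are assumptions, and nothing here describes how Lattice computes its own figures.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Iris is a stand-in dataset (assumption): 150 rows, 4 numeric features, 3 groups.
X, y = load_iris(return_X_y=True)

# With 3 groups, LDA yields at most 3 - 1 = 2 discriminant axes.
lda = LinearDiscriminantAnalysis(solver="svd")
lda.fit(X, y)

# Share of between-group variance captured by each axis, largest first.
ratios = lda.explained_variance_ratio_
for i, share in enumerate(ratios, start=1):
    print(f"LD{i}: {share:.1%} of between-group variance")
```

If the first share is close to 100%, a single axis carries nearly all of the separation and the remaining axes add little.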
Deterministic Categorization
Because Lattice executes this method with a fixed seed, your classification results remain consistent every time you run the tool on the same data. The output includes a confusion matrix, which shows how many members of each group the model identified correctly and which groups the misclassified members were assigned to.
You will receive accuracy scores based on both your training data and cross-validation folds. This helps ensure that the boundaries identified by the model are reliable and not simply artifacts of the specific data points included in your initial set.
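The sketch below shows one way these outputs can be produced with scikit-learn. The library, the iris stand-in data, the five-fold split, and the seed value 0 are all assumptions made for illustration, not a description of the Lattice engine.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)  # stand-in dataset, not Lattice data
lda = LinearDiscriminantAnalysis()

# Training-set view: fit on every row, then score the same rows.
lda.fit(X, y)
train_pred = lda.predict(X)
train_accuracy = accuracy_score(y, train_pred)
cm = confusion_matrix(y, train_pred)  # rows = true group, columns = predicted

# Cross-validated view: a fixed random_state pins the fold assignment,
# so repeated runs score exactly the same splits.
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=folds)
cv_scores_again = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=folds)

print(cm)
print(f"training accuracy: {train_accuracy:.3f}")
print(f"cross-validated accuracy: {cv_scores.mean():.3f}")
print("repeat run identical:", np.array_equal(cv_scores, cv_scores_again))
```

A training accuracy well above the cross-validated accuracy suggests the boundary fits the initial data points more closely than it would fit new ones.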
Visualizing Boundaries
The tool translates high-dimensional data into a two-dimensional plot by projecting your data points onto a lower-dimensional subspace defined by the discriminants. This makes it easy to see whether your groups are clustered tightly or whether the characteristics of different classes overlap.
This projection is particularly useful for detecting outliers or identifying subsets of your data that do not fit the established pattern. It turns abstract numerical relationships into a clear visual map of how your data is structured.
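As a rough illustration of the projection, the following sketch maps four features onto two discriminant axes and flags points that sit far from their own group's centre. scikit-learn, the iris stand-in data, and the 97.5th-percentile cutoff are assumptions chosen for the example; the cutoff in particular is a hypothetical rule of thumb, not a Lattice setting.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)  # stand-in dataset (assumption)

# Project 4 features onto the first two discriminant axes.
lda = LinearDiscriminantAnalysis(n_components=2)
Z = lda.fit_transform(X, y)  # shape: (n_rows, 2)

# Distance of each point from its own group's centre in the projected plane.
dist = np.empty(len(Z))
for label in np.unique(y):
    members = y == label
    centre = Z[members].mean(axis=0)
    dist[members] = np.linalg.norm(Z[members] - centre, axis=1)

# Hypothetical rule of thumb: flag points beyond the 97.5th percentile.
threshold = np.percentile(dist, 97.5)
flagged = dist > threshold

print("projected shape:", Z.shape)
print("rows flagged for review:", np.flatnonzero(flagged))
```

The two columns of Z are the coordinates you would place on the horizontal and vertical axes of a scatter plot, coloured by group.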
1 · Intent → method
An LLM picks stats_lda from a fixed catalog.
2 · Method → numbers
Deterministic Python engine runs the math. Same input → same output.
3 · Numbers → plain language
A second LLM translates the result into your domain’s vocabulary.
How is this different from a standard regression?
Linear discriminant analysis is specifically designed for classification tasks where the goal is to define boundaries between groups, whereas regression is typically used to predict a continuous numerical value.
What happens if my data groups have different variances?
This method assumes that all groups share the same covariance structure, meaning similar spreads and similar correlations between variables. If your groups have significantly different spreads, the linear boundary can be misplaced and more items may be misclassified.
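One informal way to check this before running the analysis is to compare the overall spread of each group. The sketch below does so with NumPy on synthetic data. The data, the group names, and the use of total variance as the yardstick are all assumptions for illustration, and how large a ratio is too large remains a judgment call.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Synthetic stand-in data: group B is spread three times wider than group A.
group_a = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
group_b = rng.normal(loc=3.0, scale=3.0, size=(200, 2))


def total_variance(rows):
    """Sum of per-feature variances (trace of the covariance matrix)."""
    return float(np.trace(np.cov(rows, rowvar=False)))


spread_a = total_variance(group_a)
spread_b = total_variance(group_b)
ratio = max(spread_a, spread_b) / min(spread_a, spread_b)

print(f"spread A: {spread_a:.2f}, spread B: {spread_b:.2f}, ratio: {ratio:.1f}")
```

A ratio near 1 is consistent with the shared-variance assumption, while a ratio far above 1, as in this synthetic case, signals that the groups differ in spread.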