Error 190: SIZES_OF_ARRAYS_DONT_MATCH
This error occurs when array functions that require equal-length arrays receive arrays of different sizes.
This commonly happens with functions like arrayMap, arrayZip, higher-order array functions, and array distance functions (like arrayL2Distance) that operate on corresponding elements from multiple arrays.
Most common causes
-
Array functions with mismatched input arrays
- Using
arrayMapwith lambda functions that require multiple arrays of different lengths - Passing arrays of different sizes to
arrayZipwhen it expects equal-length inputs - Using higher-order functions like
arrayFilter,arrayExists, orarraySplitwith multiple arrays of different sizes - Array distance functions receiving embeddings or vectors of different dimensions
- Using
-
Misleading error messages in recent versions (24.2+)
- In ClickHouse 24.2+, the error message may report incorrect array sizes (e.g., "Argument 2 has size 1, but expected 1")
- The reported sizes in the error message may not accurately reflect the actual array dimensions
- This makes debugging more difficult on large queries where the actual mismatch is unclear
-
Version-specific issues with array functions
- After migrating from 24.1.4.20 to 24.2.1.2248, functions like
arrayL2Distancemay fail with this error - Can occur when processing embeddings or vector data with inconsistent dimensions
- Bitmap transformation functions may trigger internal array mismatches
- After migrating from 24.1.4.20 to 24.2.1.2248, functions like
-
Context-dependent evaluation with untuple and arrayZip
- Using
arrayZip(untuple(...))with certain table engines (ReplicatedMergeTree) may fail - Adding WHERE clauses can trigger unexpected behavior with empty untuple results
- Works differently on Memory engine vs. ReplicatedMergeTree
- Using
-
Data quality issues
- Inconsistent data ingestion creating arrays of varying lengths
- Nested structures where inner arrays have different sizes across rows
- NULL or empty arrays mixed with populated arrays in multi-array operations
Common solutions
1. Verify array lengths before calling array functions
2. Use arrayZipUnaligned for arrays of different lengths
3. Validate embedding dimensions before distance calculations
4. Handle version-specific issues (24.2+ misleading errors)
5. Fix untuple issues with ReplicatedMergeTree (use PREWHERE or experimental analyzer)
6. Handle bitmap transform operations carefully
7. Debug complex queries with multiple arrays
Prevention tips
- Always validate array dimensions: Before passing arrays to functions that require equal sizes, check their lengths using
length()function or add assertions in your queries. Consider adding CHECK constraints on array columns if appropriate. - Be cautious after version upgrades: When upgrading ClickHouse (especially to 24.2+), test queries involving array functions as error messages may be misleading and behavior might have changed. Keep a test suite of array operations.
- Use appropriate array functions: Choose
arrayZipUnalignedwhen you need to handle arrays of different lengths, andarrayZiponly when you're certain arrays are equal-sized. - Validate embedding data pipelines: If using vector embeddings, implement validation checks in your data ingestion pipeline to ensure all vectors have consistent dimensions before insertion. Reject or pad vectors at the source.
- Consider table engine differences: Be aware that some array operations may behave differently on Memory engine vs. ReplicatedMergeTree, especially with complex expressions like
untuple. Test on the target engine type. - Add data quality checks: Implement regular data quality monitoring to detect when arrays of varying lengths are being inserted:
-
Document expected array sizes: In table schemas and application code, clearly document the expected sizes of arrays, especially for ML embeddings or fixed-size data structures.
-
Use materialized columns for validation: Create materialized columns that compute and store array lengths for quick validation: