Problem
ClickHouse data types are case-sensitive and require PascalCase (e.g., String, Int32, Nullable). However, the sqlparser-rs library's Display implementation for DataType converts certain types to uppercase, causing UNKNOWN_TYPE errors when round-tripping SQL through ClickHouse.
Example
use sqlparser::dialect::ClickHouseDialect;
use sqlparser::parser::Parser;
let sql = "CREATE TABLE t (col Nullable(String))";
let dialect = ClickHouseDialect {};
let ast = Parser::parse_sql(&dialect, sql).unwrap();
// Round-trip: parse and convert back to string
let regenerated = ast[0].to_string();
// Result: "CREATE TABLE t (col Nullable(STRING))"
//                                       ^^^^^^ uppercase!
When this regenerated SQL is executed against ClickHouse, it fails with:
Code: 47. DB::Exception: Unknown type STRING. (UNKNOWN_TYPE)
Affected Types
| Type | Current Output | ClickHouse Requires |
| --- | --- | --- |
| DataType::Int8 | INT8 | Int8 |
| DataType::Int64 | INT64 | Int64 |
| DataType::Float64 | FLOAT64 | Float64 |
| DataType::String | STRING | String |
| DataType::Bool | BOOL | Bool |
| DataType::Date | DATE | Date |
| DataType::Datetime | DATETIME | DateTime |
Types already correct (PascalCase):
Int16, Int32, Int128, Int256
UInt8, UInt16, UInt32, UInt64, UInt128, UInt256
Float32
Nullable, LowCardinality, Array, Map, Tuple, Nested
Root Cause
The Display trait implementation for DataType uses uppercase for type names (e.g., write!(f, "STRING")), which is standard for most SQL dialects but incorrect for ClickHouse.
The challenge is that Display doesn't have access to dialect context, so it can't conditionally format based on the active dialect.
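To make the constraint concrete, here is a minimal sketch using a toy enum. `ToyDataType` is a hypothetical, trimmed-down stand-in and not sqlparser's actual `DataType`; only the shape of the `Display` impl is meant to mirror the behaviour described above.

```rust
use std::fmt;

// Hypothetical stand-in for sqlparser's `DataType`. The real enum has many
// more variants, and some of them carry arguments.
#[derive(Debug, Clone, PartialEq)]
enum ToyDataType {
    Int32,
    String,
    Nullable(Box<ToyDataType>),
}

impl fmt::Display for ToyDataType {
    // `fmt` receives only `self` and the formatter, so the spelling of each
    // type name is fixed here, with no knowledge of the target dialect.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToyDataType::Int32 => write!(f, "Int32"),
            ToyDataType::String => write!(f, "STRING"),
            ToyDataType::Nullable(inner) => write!(f, "Nullable({})", inner),
        }
    }
}

// Renders the column type from the example at the top of this issue.
fn render_nullable_string() -> String {
    ToyDataType::Nullable(Box::new(ToyDataType::String)).to_string()
}
```

Whatever dialect parsed the SQL, `render_nullable_string()` produces the same text, because nothing in the call chain can tell `fmt` which dialect is in use.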
The Problem with Display
Most users serialize SQL by calling Display on top-level AST types:
let ast = Parser::parse_sql(&dialect, sql).unwrap();
let regenerated = ast[0].to_string(); // Uses Display on Statement
// or
let regenerated = format!("{}", query); // Uses Display on Query
The Display trait signature doesn't allow passing context:
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result
Proposed Solution
To solve this, we need dialect-aware serialization throughout the AST hierarchy:
Add to_sql(&dyn Dialect) to all AST types
- Add a to_sql(&dyn Dialect) -> String method to Statement, Query, Expr, ColumnDef, and other AST types
- Each type's implementation calls to_sql() on its children, propagating the dialect
- Keep existing Display implementations unchanged for backwards compatibility
// Example usage after fix:
let ast = Parser::parse_sql(&dialect, sql).unwrap();
let regenerated = ast[0].to_sql(&dialect); // Correct PascalCase for ClickHouse
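The propagation idea can be sketched with toy types. Everything below is an assumption made for illustration: `ToyDialect`, its `pascal_case_types` hook, the `ToSql` trait, and the trimmed-down `DataType` and `ColumnDef` are hypothetical stand-ins, not sqlparser's real definitions, and the actual design may well differ.

```rust
// Hypothetical stand-in for a dialect trait. The `pascal_case_types` hook is
// an assumption of this sketch, not an existing sqlparser method.
trait ToyDialect {
    fn pascal_case_types(&self) -> bool {
        false
    }
}

struct ToyGeneric;
impl ToyDialect for ToyGeneric {}

struct ToyClickHouse;
impl ToyDialect for ToyClickHouse {
    fn pascal_case_types(&self) -> bool {
        true
    }
}

// Hypothetical trait carrying the proposed method.
trait ToSql {
    fn to_sql(&self, dialect: &dyn ToyDialect) -> String;
}

// Toy versions of two AST node types.
enum DataType {
    String,
    Datetime,
    Nullable(Box<DataType>),
}

struct ColumnDef {
    name: String,
    data_type: DataType,
}

// Picks the spelling that matches the dialect's casing preference.
fn spell(dialect: &dyn ToyDialect, pascal: &str, upper: &str) -> String {
    if dialect.pascal_case_types() {
        pascal.to_string()
    } else {
        upper.to_string()
    }
}

impl ToSql for DataType {
    fn to_sql(&self, dialect: &dyn ToyDialect) -> String {
        match self {
            DataType::String => spell(dialect, "String", "STRING"),
            DataType::Datetime => spell(dialect, "DateTime", "DATETIME"),
            // The wrapper passes the same dialect down to its child.
            DataType::Nullable(inner) => format!("Nullable({})", inner.to_sql(dialect)),
        }
    }
}

impl ToSql for ColumnDef {
    fn to_sql(&self, dialect: &dyn ToyDialect) -> String {
        // The parent node forwards the dialect to the data type.
        format!("{} {}", self.name, self.data_type.to_sql(dialect))
    }
}

// Builds the column from the example at the top of this issue.
fn example_column() -> ColumnDef {
    ColumnDef {
        name: "col".to_string(),
        data_type: DataType::Nullable(Box::new(DataType::String)),
    }
}
```

With these toy types, `example_column().to_sql(&ToyClickHouse)` keeps the PascalCase spelling, while `example_column().to_sql(&ToyGeneric)` produces the same uppercase text that `Display` emits today.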
Implementation Scope
The affected types include (non-exhaustive):
Statement (top-level)
Query, SetExpr, Select
Expr (especially Cast, TryCast, SafeCast)
ColumnDef, ColumnOption
TableConstraint
AlterTableOperation
FunctionArg, FunctionArgExpr
This is a significant change but provides the cleanest API and maintains backwards compatibility.
Workarounds
Currently, users must post-process the SQL string output to fix casing. See 514-labs/moosestack#3152 for an example regex-based workaround.
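For illustration, a post-processing pass can also be written with the standard library alone. This is a minimal sketch and not the regex from the linked workaround; `fix_clickhouse_type_casing` and `flush_word` are hypothetical helper names. It only looks at whole words, so it will also rewrite a matching word inside a string literal or a column that is literally named `DATE`.

```rust
// Uppercase spellings from the "Affected Types" table, paired with the
// casing ClickHouse expects.
const FIXES: [(&str, &str); 7] = [
    ("INT8", "Int8"),
    ("INT64", "Int64"),
    ("FLOAT64", "Float64"),
    ("STRING", "String"),
    ("BOOL", "Bool"),
    ("DATE", "Date"),
    ("DATETIME", "DateTime"),
];

// Hypothetical helper: scans the SQL text word by word and replaces any word
// that exactly matches one of the uppercase type names. It has no knowledge
// of SQL syntax, so quoting and comments are not respected.
fn fix_clickhouse_type_casing(sql: &str) -> String {
    let mut out = String::with_capacity(sql.len());
    let mut word = String::new();
    for ch in sql.chars() {
        if ch.is_ascii_alphanumeric() || ch == '_' {
            // Still inside an identifier-like word.
            word.push(ch);
        } else {
            // Word boundary: emit the (possibly fixed) word, then the separator.
            flush_word(&mut word, &mut out);
            out.push(ch);
        }
    }
    flush_word(&mut word, &mut out);
    out
}

// Appends `word` to `out`, swapping in the PascalCase spelling on an exact
// match, and then clears `word` for the next token.
fn flush_word(word: &mut String, out: &mut String) {
    let replacement = FIXES
        .iter()
        .find(|(upper, _)| *upper == word.as_str())
        .map(|(_, pascal)| *pascal);
    out.push_str(replacement.unwrap_or(word.as_str()));
    word.clear();
}
```

Matching whole words rather than substrings keeps identifiers such as `UPDATED_DATE` intact and avoids turning `DATETIME` into `DateTIME`.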