vortex-data/vortex

Performance History

Latest Results

Add VectorOps::clear (#5409)
Signed-off-by: Nicholas Gates <nick@nickgates.com>
develop
3 hours ago
clippy
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
aduffy/filter-pushdown-fix
5 hours ago
fix: filter pushdown for nested fields

In #5295, we accidentally broke nested filter pushdown. The issue is that `FileSource::try_pushdown_filters` seems to be meant to evaluate against the whole file schema, rather than any projected schema. As an example, the GitHub Archive benchmark dataset has the following query, which should trivially push down and be pruned, executing in about 30 ms:

```sql
SELECT COUNT(*) FROM events WHERE payload.ref = 'refs/head/main'
```

However, after this change, pushdown of this field was failing, pushing query time up 100x.

The root cause is that the old logic attempted to apply the file schema to the source_expr directly. Concretely, for the gharchive query, the whole expression is something like:

```text
BinaryExpr {
    lhs: GetField {
        source_expr: Column { name: "payload", index: 0 },
        field_expr: Literal { value: "ref" },
    },
    rhs: Literal { value: "refs/head/main" },
    operator: Eq,
}
```

The issue is that the column index 0 is wrong for the whole file. Instead, we need to recursively ensure that the source_expr is a valid sequence of Column and GetField expressions that resolve properly. Note that we were already doing this when checking whether a standalone Column expression can be pushed down:

```rust
} else if let Some(col) = expr.downcast_ref::<df_expr::Column>() {
    schema
        .field_with_name(col.name())
        .ok()
        .is_some_and(|field| supported_data_types(field.data_type()))
```

Signed-off-by: Andrew Duffy <andrew@a10y.dev>
aduffy/filter-pushdown-fix
7 hours ago
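The recursive resolution the commit message describes can be sketched as follows. This is a minimal, self-contained model, not the actual Vortex/DataFusion code: the `Expr` and `DataType` types and the `resolves_against_schema` helper are illustrative stand-ins for the real physical-expression tree and Arrow schema. The key point it demonstrates is resolving the base `Column` by name against the file schema (ignoring any stale projected index) and then stepping through each `GetField` recursively:

```rust
use std::collections::HashMap;

// A tiny stand-in for the physical expression tree.
enum Expr {
    Column { name: String },
    GetField { source: Box<Expr>, field: String },
}

// A tiny stand-in for a schema: each field is a leaf or a struct
// with named children.
enum DataType {
    Leaf,
    Struct(HashMap<String, DataType>),
}

// Returns true if `expr` is a valid chain of GetField expressions
// bottoming out in a Column that resolves against the file schema.
fn resolves_against_schema(expr: &Expr, schema: &HashMap<String, DataType>) -> bool {
    resolve(expr, schema).is_some()
}

// Walk the chain recursively, returning the data type the expression names.
fn resolve<'a>(expr: &Expr, schema: &'a HashMap<String, DataType>) -> Option<&'a DataType> {
    match expr {
        // Base case: look the column up by name, not by (possibly projected) index.
        Expr::Column { name } => schema.get(name),
        // Recursive case: resolve the source first, then step into the struct.
        Expr::GetField { source, field } => match resolve(source, schema)? {
            DataType::Struct(children) => children.get(field),
            DataType::Leaf => None,
        },
    }
}

fn main() {
    // Schema shaped like the gharchive example: events has a struct
    // column `payload` with a child field `ref`.
    let mut payload = HashMap::new();
    payload.insert("ref".to_string(), DataType::Leaf);
    let mut schema = HashMap::new();
    schema.insert("payload".to_string(), DataType::Struct(payload));

    // payload.ref, as in the benchmark query: resolves, so it may be pushed down.
    let expr = Expr::GetField {
        source: Box::new(Expr::Column { name: "payload".to_string() }),
        field: "ref".to_string(),
    };
    assert!(resolves_against_schema(&expr, &schema));

    // A field that does not exist must not be pushed down.
    let bad = Expr::GetField {
        source: Box::new(Expr::Column { name: "payload".to_string() }),
        field: "sha".to_string(),
    };
    assert!(!resolves_against_schema(&bad, &schema));
}
```

In a real implementation the same walk would also check that the leaf's data type is supported for pushdown, mirroring the standalone-Column check quoted in the commit message.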
u
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
ji/opt-replace-transform
9 hours ago

Active Branches

fix: filter pushdown for nested fields (last run 5 hours ago)
#5406: 0%
#5399: 0%
© 2025 CodSpeed Technology