Feat/c python timeseries metadata #767

Merged
hongzhi-gao merged 7 commits into develop from feat/c-python-timeseries-metadata
Apr 7, 2026

@hongzhi-gao hongzhi-gao commented Apr 3, 2026

Summary

Adds C-wrapper support for listing devices and reading per-device timeseries metadata, with Python TsFileReader bindings. Device identity is a single DeviceID (path, table name, segments). Chunk statistics are exposed as a tagged layout: common header TsFileStatisticBase (has_statistic, TSDataType type, row_count, start_time, end_time) plus a TimeseriesStatisticUnion arm selected by base.type. Numeric arms expose sum (boolean sum of true values as double, integer sum as double, float/double sum as double), ranges, and first/last where applicable; STRING exposes lexicographic min/max and time-ordered first/last strings; TEXT exposes first/last strings only (no min/max).

C API

Types: TsFileStatisticBase, TsFileBoolStatistic, TsFileIntStatistic, TsFileFloatStatistic, TsFileStringStatistic, TsFileTextStatistic, TimeseriesStatisticUnion, TimeseriesStatistic, TimeseriesMetadata, DeviceID, DeviceTimeseriesMetadataEntry (embeds device only), DeviceTimeseriesMetadataMap.

Device listing: tsfile_reader_get_all_devices → DeviceID*; free with tsfile_free_device_id_array. For a single struct, clear heap fields with tsfile_device_id_free_contents.

Metadata: tsfile_reader_get_timeseries_metadata_all (whole file); tsfile_reader_get_timeseries_metadata_for_devices (length == 0 → empty map; devices == NULL && length > 0 → invalid arg; when length > 0, each entry’s path is required). Free maps with tsfile_free_device_timeseries_metadata_map only (also frees nested device strings and statistic heap strings).

Read common statistic fields via tsfile_statistic_base(&statistic); read sum and type-specific min/max/first/last via statistic.u.bool_s, .int_s, .float_s, .string_s, or .text_s according to base.type.
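
To make the tagged layout concrete, here is a minimal sketch that mirrors it with Python's ctypes. It is not the actual header: the enum codes, the field names inside each arm, and the mapping from data type to arm are assumptions chosen for illustration. Only the base fields and the arm names (bool_s, int_s, float_s, string_s, text_s) come from the description above.

```python
import ctypes

# Assumed TSDataType codes, for illustration only; check the TsFile headers.
BOOLEAN, INT32, INT64, FLOAT, DOUBLE, TEXT, STRING = 0, 1, 2, 3, 4, 5, 11


class StatisticBase(ctypes.Structure):
    # Common header: fields as listed in the PR summary.
    _fields_ = [
        ("has_statistic", ctypes.c_bool),
        ("type", ctypes.c_int),
        ("row_count", ctypes.c_int32),
        ("start_time", ctypes.c_int64),
        ("end_time", ctypes.c_int64),
    ]


# Arm field names below are hypothetical.
class BoolArm(ctypes.Structure):
    _fields_ = [("sum", ctypes.c_double),
                ("first", ctypes.c_bool), ("last", ctypes.c_bool)]


class IntArm(ctypes.Structure):
    _fields_ = [("sum", ctypes.c_double),
                ("min", ctypes.c_int64), ("max", ctypes.c_int64),
                ("first", ctypes.c_int64), ("last", ctypes.c_int64)]


class FloatArm(ctypes.Structure):
    _fields_ = [("sum", ctypes.c_double),
                ("min", ctypes.c_double), ("max", ctypes.c_double),
                ("first", ctypes.c_double), ("last", ctypes.c_double)]


class StringArm(ctypes.Structure):
    _fields_ = [("min", ctypes.c_char_p), ("max", ctypes.c_char_p),
                ("first", ctypes.c_char_p), ("last", ctypes.c_char_p)]


class TextArm(ctypes.Structure):
    # TEXT has first/last only, no min/max.
    _fields_ = [("first", ctypes.c_char_p), ("last", ctypes.c_char_p)]


class StatisticUnion(ctypes.Union):
    _fields_ = [("bool_s", BoolArm), ("int_s", IntArm), ("float_s", FloatArm),
                ("string_s", StringArm), ("text_s", TextArm)]


class Statistic(ctypes.Structure):
    _fields_ = [("base", StatisticBase), ("u", StatisticUnion)]


# Assumed mapping from data type to union arm.
ARM_BY_TYPE = {BOOLEAN: "bool_s", INT32: "int_s", INT64: "int_s",
               FLOAT: "float_s", DOUBLE: "float_s",
               STRING: "string_s", TEXT: "text_s"}


def read_statistic(stat):
    """Return the base fields plus the fields of the arm selected by base.type."""
    base = stat.base
    if not base.has_statistic:
        return None
    arm = getattr(stat.u, ARM_BY_TYPE[base.type])
    out = {"row_count": base.row_count,
           "start_time": base.start_time,
           "end_time": base.end_time}
    for name, _ctype in arm._fields_:
        out[name] = getattr(arm, name)
    return out
```

The point of the sketch is the read order: check has_statistic, then use base.type to pick exactly one arm, and never read the other arms, since they share the same storage.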

Python

  • get_all_devices() → list[DeviceID] (tsfile.schema.DeviceID: path, table name, segments).
  • get_timeseries_metadata(device_ids=None) → dict[str, DeviceTimeseriesMetadataGroup]: None = all devices, [] = empty map, non-empty list filters by device path. Elements may be DeviceID (uses .path) or any object accepted as path via str(...).
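
The device_ids rules above can be modelled without a TsFile at hand. The sketch below is an assumption-laden stand-in: the DeviceID dataclass, to_paths, and filter_metadata are hypothetical helpers that imitate the described semantics over a plain dict, not the binding's implementation.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class DeviceID:
    # Local stand-in for tsfile.schema.DeviceID; only .path is used here.
    path: str
    table_name: str = ""
    segments: tuple = ()


def to_paths(device_ids: Optional[Iterable[object]]) -> Optional[List[str]]:
    """None means "all devices"; otherwise each element becomes a path string."""
    if device_ids is None:
        return None
    return [d.path if isinstance(d, DeviceID) else str(d) for d in device_ids]


def filter_metadata(all_metadata: Dict[str, object],
                    device_ids: Optional[Iterable[object]] = None) -> Dict[str, object]:
    """Apply the None / [] / non-empty list rules to an in-memory map."""
    paths = to_paths(device_ids)
    if paths is None:
        return dict(all_metadata)      # None: every device
    wanted = set(paths)                # []: nothing matches, so the map is empty
    return {p: g for p, g in all_metadata.items() if p in wanted}
```

Note that None and [] are deliberately different: a caller that builds the filter list dynamically and ends up with no devices gets an empty map rather than the whole file.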

Statistic types map to TimeseriesStatistic plus subclasses IntTimeseriesStatistic, FloatTimeseriesStatistic, BoolTimeseriesStatistic, StringTimeseriesStatistic, and TextTimeseriesStatistic (with sum only on the numeric/boolean subclasses).
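
As a rough illustration of "sum only on the numeric/boolean subclasses", the sketch below defines local stand-in classes that reuse two of the names above. The field names other than sum, and the helper total_or_none, are assumptions; the classes shipped in the package may be shaped differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimeseriesStatistic:
    # Common fields shared by every statistic (stand-in, not the real class).
    row_count: int
    start_time: int
    end_time: int


@dataclass
class IntTimeseriesStatistic(TimeseriesStatistic):
    sum: float          # integer sum is exposed as a double
    min_value: int      # hypothetical field name
    max_value: int      # hypothetical field name


@dataclass
class TextTimeseriesStatistic(TimeseriesStatistic):
    first_value: str    # hypothetical field name
    last_value: str     # hypothetical field name


def total_or_none(stat: TimeseriesStatistic) -> Optional[float]:
    """Return the sum when the subclass carries one, otherwise None."""
    return getattr(stat, "sum", None)
```

Callers that aggregate across mixed columns can use a check like total_or_none (or isinstance) instead of assuming every statistic has a sum.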

Tests

  • cpp/test/cwrapper/cwrapper_metadata_test.cc
  • python/tests/test_reader_metadata.py

codecov-commenter commented Apr 3, 2026

Codecov Report

❌ Patch coverage is 45.01425% with 193 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.59%. Comparing base (b6249d9) to head (a392dd7).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cpp/src/cwrapper/tsfile_cwrapper.cc | 45.01% | 157 Missing and 36 partials ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #767      +/-   ##
===========================================
- Coverage    62.73%   62.59%   -0.15%     
===========================================
  Files          705      705              
  Lines        42165    42516     +351     
  Branches      6223     6285      +62     
===========================================
+ Hits         26453    26613     +160     
- Misses       14781    14936     +155     
- Partials       931      967      +36     

@hongzhi-gao hongzhi-gao requested a review from ColinLeeo April 7, 2026 01:31
char* str_max;
char* str_first;
char* str_last;
} TimeseriesStatistic;
Contributor

typedef struct BaseStatistics {
    TSDataType type;
    int32_t row_count;
    int64_t start_time;
    int64_t end_time;
} BaseStatistic;

typedef struct FloatStatistics {
    BaseStatistic _base;
    double min_float64;
    double max_float64;
    double first_float64;
    double last_float64;
} FloatStatistic;

Could we consider a pattern like this?
Otherwise, the memory footprint of the statistics will be much higher.
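
A quick way to see the footprint argument is to compare struct sizes with ctypes. In the sketch below, FlatStatistic is a hypothetical "one struct carries every type's fields" layout (only str_max, str_first, and str_last are taken from the reviewed snippet; the rest are guesses), while BaseStatistic and FloatStatistic follow the suggestion above, with the embedded member named base here and TSDataType assumed to be int-sized.

```python
import ctypes


class FlatStatistic(ctypes.Structure):
    # Hypothetical flat layout: every type's fields are always present.
    _fields_ = [
        ("type", ctypes.c_int),
        ("row_count", ctypes.c_int32),
        ("start_time", ctypes.c_int64),
        ("end_time", ctypes.c_int64),
        ("sum", ctypes.c_double),
        ("bool_first", ctypes.c_bool), ("bool_last", ctypes.c_bool),
        ("int_min", ctypes.c_int64), ("int_max", ctypes.c_int64),
        ("int_first", ctypes.c_int64), ("int_last", ctypes.c_int64),
        ("float_min", ctypes.c_double), ("float_max", ctypes.c_double),
        ("float_first", ctypes.c_double), ("float_last", ctypes.c_double),
        ("str_min", ctypes.c_char_p), ("str_max", ctypes.c_char_p),
        ("str_first", ctypes.c_char_p), ("str_last", ctypes.c_char_p),
    ]


class BaseStatistic(ctypes.Structure):
    # Common header from the review suggestion.
    _fields_ = [
        ("type", ctypes.c_int),
        ("row_count", ctypes.c_int32),
        ("start_time", ctypes.c_int64),
        ("end_time", ctypes.c_int64),
    ]


class FloatStatistic(ctypes.Structure):
    # Per-type struct: header plus only the fields a float series needs.
    _fields_ = [
        ("base", BaseStatistic),
        ("min_float64", ctypes.c_double),
        ("max_float64", ctypes.c_double),
        ("first_float64", ctypes.c_double),
        ("last_float64", ctypes.c_double),
    ]


def footprint_bytes():
    """Size in bytes of each layout on the current platform."""
    return {
        "flat": ctypes.sizeof(FlatStatistic),
        "float_only": ctypes.sizeof(FloatStatistic),
    }
```

The exact numbers depend on the platform's alignment and pointer width, but the flat layout pays for every type's fields on every timeseries, which adds up when a file holds many series.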

Contributor Author

fixed

Comment on lines +979 to +980
if segs[i] == NULL:
    out.append("")
Contributor

Use None?
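
The difference matters because an empty string is itself a possible segment value. A minimal sketch of the idea follows, assuming segments arrive as a sequence of optional byte strings; decode_segments is a hypothetical helper, not the binding's code.

```python
from typing import List, Optional, Sequence


def decode_segments(raw: Sequence[Optional[bytes]]) -> List[Optional[str]]:
    """Decode segments, keeping a missing (NULL) segment as None rather than ""."""
    return [None if seg is None else seg.decode("utf-8") for seg in raw]
```

With this mapping a caller can tell a missing segment apart from one that is present but empty, which appending "" for NULL would hide.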

Contributor Author

fixed

@hongzhi-gao hongzhi-gao merged commit af35aee into develop Apr 7, 2026
16 checks passed

3 participants