Relational data stream management system

A relational data stream management system (RDSMS) is a distributed, in-memory data stream management system (DSMS) designed to process unstructured and structured data streams in real time using standards-compliant SQL queries. Unlike SQL queries executed in a traditional RDBMS, which return a result and exit, SQL queries executed in an RDSMS do not exit; they generate results continuously as new data become available. Continuous SQL queries in an RDSMS use SQL window functions to analyze, join and aggregate data streams over fixed or sliding windows. Windows can be specified as time-based or row-based.

RDSMS SQL Query Examples

Continuous SQL queries in an RDSMS conform to the ANSI SQL standards. The most common RDSMS SQL query is performed with the declarative <code>SELECT</code> statement. A continuous SQL <code>SELECT</code> operates on data across one or more data streams, with optional keywords and clauses that include <code>FROM</code> with an optional <code>JOIN</code> subclause to specify the rules for joining multiple data streams, the <code>WHERE</code> clause and comparison predicate to restrict the records returned by the query, <code>GROUP BY</code> to project streams with common values into a smaller set, <code>HAVING</code> to filter records resulting from a <code>GROUP BY</code>, and <code>ORDER BY</code> to sort the results.

The following is an example of a continuous data stream aggregation using a <code>SELECT</code> query that aggregates a sensor stream from a weather monitoring station. The <code>SELECT</code> query aggregates the minimum, maximum and average temperature values over a one-second time period, returning a continuous stream of aggregated results at one-second intervals.
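A minimal sketch of such a query, written in the streaming SQL style used by engines such as SQLstream and Apache Calcite. The stream name <code>WEATHERSTREAM</code>, the column <code>TEMP</code> and the timestamp column <code>ROWTIME</code> are assumptions made for illustration, and the exact syntax varies between products.

```sql
-- Tumbling one-second aggregation: emits one result row per second.
-- WEATHERSTREAM, TEMP and ROWTIME are assumed names, not part of the SQL standard.
SELECT STREAM
    FLOOR(ROWTIME TO SECOND) AS FLOOR_SECOND,  -- start of the one-second bucket
    MIN(TEMP) AS MIN_TEMP,
    MAX(TEMP) AS MAX_TEMP,
    AVG(TEMP) AS AVG_TEMP
FROM WEATHERSTREAM
GROUP BY FLOOR(ROWTIME TO SECOND);
```

Grouping on the timestamp rounded down to the second is what turns the unbounded stream into a sequence of finite one-second groups; a result for a given second can only be emitted once that second has ended.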

RDSMS SQL queries also operate on data streams over time-based or row-based windows. The following example shows a second continuous SQL query using the <code>WINDOW</code> clause with a one-second duration. The <code>WINDOW</code> clause changes the behavior of the query so that it outputs a result for each new record as it arrives. The output is therefore a stream of incrementally updated results, emitted as each record arrives rather than at the end of each interval.
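A sketch of what such a windowed query might look like, under the same assumptions as before: <code>WEATHERSTREAM</code>, <code>TEMP</code> and <code>ROWTIME</code> are illustrative names, and <code>W1</code> is an arbitrary window name. Some engines treat the ordering on the stream timestamp as implicit and allow the <code>ORDER BY</code> inside the window definition to be omitted.

```sql
-- Sliding one-second window: emits one result row per input row.
-- Each aggregate covers the current row and all rows from the preceding second.
SELECT STREAM
    ROWTIME,
    MIN(TEMP) OVER W1 AS WMIN_TEMP,
    MAX(TEMP) OVER W1 AS WMAX_TEMP,
    AVG(TEMP) OVER W1 AS WAVG_TEMP
FROM WEATHERSTREAM
WINDOW W1 AS (ORDER BY ROWTIME RANGE INTERVAL '1' SECOND PRECEDING);
```

A row-based window would use a frame such as <code>ROWS 10 PRECEDING</code> in place of the <code>RANGE INTERVAL</code> frame, so that each result covers a fixed number of records instead of a fixed span of time.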
