API Concepts

NebulaStream offers a set of stream processing and complex event processing operators to analyze and manipulate data streams. To submit queries, NebulaStream provides clients for C++ and Java, as well as a REST interface. In the following, we discuss the individual operators in detail and provide examples in the different query languages.

Streams

NebulaStream processes data in the form of streams. Queries operate on streams, process individual records, and produce a result stream. In general, a stream is a conceptually infinite sequence of records. However, the system does not guarantee the order of individual records. Furthermore, all records in a stream have to follow a common schema. To this end, the user has to define the schema when registering a logical stream in the stream catalog, similar to defining a table in a relational database (see defining data sources). Each logical stream has a name and a schema. For example, a logical stream named VEHICLES can have the following schema, which consists of the three fields id, velocity, and type.

erDiagram
    VEHICLES {
        UINT64 id
        DOUBLE velocity
        CHAR type
    }

For further details about the possible field data types, see Data Types.
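
To make the idea of a common schema concrete, the following is a minimal C++ sketch of the VEHICLES schema and a check that a record conforms to it. All names here (FieldType, Field, Schema, Record, vehiclesSchema, conformsTo) are hypothetical stand-ins chosen for illustration and are not part of the NebulaStream API; the real system defines schemas through its clients and the stream catalog.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for illustration; these are not NebulaStream types.
enum class FieldType { UINT64, DOUBLE, CHAR };

struct Field {
    std::string name;
    FieldType type;
};

using Schema = std::vector<Field>;
// The alternatives are listed in the same order as the FieldType enumerators.
using Value = std::variant<std::uint64_t, double, char>;
using Record = std::vector<Value>;

// The VEHICLES schema from the example above: id, velocity, type.
inline Schema vehiclesSchema() {
    return {{"id", FieldType::UINT64},
            {"velocity", FieldType::DOUBLE},
            {"type", FieldType::CHAR}};
}

// Returns true if the record has exactly one value per schema field and
// each value holds the type declared for that field.
inline bool conformsTo(const Schema& schema, const Record& record) {
    if (schema.size() != record.size()) {
        return false;
    }
    for (std::size_t i = 0; i < schema.size(); ++i) {
        // The active variant index identifies the value's type because the
        // variant alternatives mirror the FieldType order.
        if (record[i].index() != static_cast<std::size_t>(schema[i].type)) {
            return false;
        }
    }
    return true;
}
```

In this sketch, a record holding a UINT64, a DOUBLE, and a CHAR in that order conforms to the VEHICLES schema, while a record with the fields swapped or missing does not.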

Data Sources

All queries in NebulaStream process data from a specific logical stream. Depending on the query API, we use the from method (C++) or the readFromSource method (Java) to create a new stream object.

💡 Currently, it is not possible to connect to a data source in an ad-hoc fashion. Thus, the logical stream has to first be registered in the stream catalog. See the discussion of the individual clients for further details.

// C++: creates a stream object based on a specific logical stream
auto stream = Query::from("logical_stream_name");

// Java: creates a stream object based on a specific logical stream
Stream stream = nebulaStreamRuntime.readFromSource("logical_stream_name");
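
The requirement that a logical stream be registered before a query can read from it can be pictured with a small name-to-schema lookup. The sketch below is an assumption-laden illustration in C++: the StreamCatalog class and its registerLogicalStream and lookup methods are hypothetical and do not reflect the actual stream catalog interface, which is described in the discussion of the individual clients.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical catalog for illustration only; this is not the NebulaStream
// stream catalog API.
class StreamCatalog {
public:
    // Registers a logical stream under a name together with its field names.
    // Returns false if a stream with that name is already registered.
    bool registerLogicalStream(const std::string& name,
                               std::vector<std::string> fields) {
        return streams_.emplace(name, std::move(fields)).second;
    }

    // Looks up a logical stream by name. The result is empty if the stream
    // was never registered, mirroring the lack of ad-hoc sources.
    std::optional<std::vector<std::string>> lookup(const std::string& name) const {
        auto it = streams_.find(name);
        if (it == streams_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::unordered_map<std::string, std::vector<std::string>> streams_;
};
```

Under these assumptions, a lookup of an unregistered name yields no schema, so a query over that name has nothing to bind to until the stream is registered.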