Persistence & Indexing - StreamAnalytix


Persistence lets you store data in a NoSQL store such as HBase or Cassandra. Indexing lets you index the data in Solr or Elasticsearch.


You can specify the persistence store configurations at the group level.

You can also specify the target table name using an expression:

  • For Cassandra: the prefix in the persistence Table Name Expression is "ns_"+tenantId+"_".
  • For HBase: the prefix in the persistence Table Name Expression is "ns_"+tenantId+":".
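
As a rough illustration of how such a Table Name Expression might evaluate, the sketch below builds both prefixed names in plain JavaScript. The `tenantId` value and the `events` base name are hypothetical; which variables are actually available to the expression depends on the StreamAnalytix runtime.

```javascript
// Hypothetical values; in the product these would come from the runtime context.
const tenantId = 42;
const baseName = "events";

// Cassandra: the prefix ends with an underscore.
const cassandraTable = "ns_" + tenantId + "_" + baseName;

// HBase: the prefix ends with a colon.
const hbaseTable = "ns_" + tenantId + ":" + baseName;

console.log(cassandraTable); // ns_42_events
console.log(hbaseTable);     // ns_42:events
```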

If the store is HBase, you can specify Region Splitting Definition and HBase Region Boundaries.

  • Table Name Expression: the JavaScript expression that is evaluated to produce the persistence store table name
  • Compression: compresses the message before storing it
  • Region Splitting Definition: defines how the HBase tables should be pre-split
    • Default (No Pre-Split): only one region is created initially
    • Based on Region Boundaries: regions are created from the given key boundaries. For example, if your key is hexadecimal and you provide the value '4, 8, d', four regions are created: the 1st for keys below 4; the 2nd for keys from 4 up to, but not including, 8; the 3rd for keys from 8 up to, but not including, d; and the 4th for keys from d onward
  • HBase Region Boundaries: the comma-separated boundary values of the persistence row key used for pre-splitting. For example, if the input is "d, k, r, v", pre-splitting creates 5 regions with the ranges <d, d-k, k-r, r-v and >v
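
The relationship between boundary values and regions can be sketched as follows. This is only an illustration of the counting rule (N boundaries give N+1 regions), not the product's implementation; `regionRanges` is a hypothetical helper. It follows the HBase convention that a region's start key is inclusive and its end key is exclusive, with an empty string meaning "unbounded".

```javascript
// Turn a comma-separated boundary string into a list of region ranges.
// An empty start or end means the region is unbounded on that side.
function regionRanges(boundaries) {
  const keys = boundaries
    .split(",")
    .map((k) => k.trim())
    .filter((k) => k.length > 0);

  const ranges = [];
  for (let i = 0; i <= keys.length; i++) {
    ranges.push({
      start: i === 0 ? "" : keys[i - 1],      // inclusive
      end: i === keys.length ? "" : keys[i],  // exclusive
    });
  }
  return ranges;
}

const regions = regionRanges("d, k, r, v");
console.log(regions.length); // 5
console.log(regions[1]);     // { start: 'd', end: 'k' }
```

With no boundaries at all, the same function returns a single unbounded range, which corresponds to the default of no pre-split.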


You can specify the Index store configurations at the group level. You can also enter the target index name using an expression, and can enable replication, shards, full text search and custom routing. Additionally, you can enable source if Elasticsearch is configured as the Index store.

  • Index Name Expression: the JavaScript expression that is evaluated to produce the index name
  • Number of Shards: the number of shards to be created per index in the index store
  • Search Across Field: enable this if you want full text search over the data of all fields. This uses additional space in the index store because the values of all fields are indexed
  • Replication Factor: the number of additional copies of the data to be kept
  • Custom Routing Policy: select this if custom routing is required for indexing data. Selecting it also enables the Routing policy option, where you must specify how data is distributed across shards.
  • Routing policy:


Here, “0” is the timestamp from which the routing policy takes effect; a value of 0 means the policy applies to all timestamps greater than 0. “name” is the message field to which the routing policy applies. The policy defines the relative weightage of that field’s values, and the sharding key used for querying and indexing is calculated on that basis.

  • Source Enable: available only if indexer.type=elasticsearch. It stores the actual JSON that was used as the indexed document. This JSON is not indexed (searchable), only stored.

*Note: For the LogMonitoring sub-system, when creating a group for the system message, the Index Name Expression should look like this:

Index Name Expression: 'GroupName_'+Math.round(timestamp/(3600*1000*24*7))

Here, GroupName is the name of the group, and timestamp is the alias of the timestamp column defined when the message is created.
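
To see what this expression produces, the sketch below evaluates it outside the product. It assumes `timestamp` holds epoch milliseconds, which the divisor 3600*1000*24*7 (the number of milliseconds in one week) suggests; `indexName` is a hypothetical helper, not part of StreamAnalytix.

```javascript
// Milliseconds in one week, the same divisor as in the expression above.
const WEEK_MS = 3600 * 1000 * 24 * 7; // 604800000

// Hypothetical helper mirroring the documented expression.
function indexName(groupName, timestamp) {
  return groupName + "_" + Math.round(timestamp / WEEK_MS);
}

// 2021-01-01T00:00:00Z as epoch milliseconds (1609459200000).
const ts = Date.UTC(2021, 0, 1);
console.log(indexName("GroupName", ts)); // GroupName_2661
```

Each distinct rounded value yields a separate index, so the data rolls over to a new index once a week. Because the expression uses Math.round rather than Math.floor, the rollover falls midway through each 7-day interval counted from the epoch.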

Fields Configuration

You can configure each field of a message for indexing and persistence. By default, all fields are marked for both. Once you make this choice, it cannot be overridden for the same message definition.
