11.9. Tables

All data in the application framework is associated with a Table. Fundamentally, a Table is a Django model instance that defines the columns and attributes of the data to collect. In many respects, a Table definition is similar to a SQL TABLE construct.

11.9.1. Overview

Tables are defined as part of report configuration source files. All tables are defined by instantiating a subclass of DatasourceTable that is specific to a particular data source.

Defining a table involves the following concepts:

Tables

Tables are the fundamental mechanism for defining the data that is to be collected. Each table is specific to a data source and is tied to a function in that data source that generates data based on the table's columns as well as any static and dynamic criteria.

Columns

A table is fundamentally a two-dimensional construct of rows and columns. The report configuration source file defines the columns associated with a table. Each column binds a label and data type to data-source-specific fields that the table query function uses to populate data. As in a SQL table, some columns are key columns; the rest are data or metric columns.

Table Fields

Table fields define the required and optional criteria associated with a table. Each data source can define a set of table fields that are required or optional for each type of table.

Note that while tables are defined in the context of a report configuration source file, tables are only loosely coupled with a report via widgets. In fact, it is possible to run tables at the command line without running an entire report.

11.9.1.1. Sample Table Definition

The following table is taken from the Wave sample plugin:

from steelscript.appfwk.apps.datasource.models import Column
import steelscript.wave.appfwk.datasources.wave_source as wave

# Define a waves table with 3 separate waves.
table = wave.WaveTable.create(
    name='wave-table', duration='15min', resolution='1s', beta=4)

# Add columns for time and the related waves
table.add_column('time', 'Time', datatype=Column.DATATYPE_TIME, iskey=True)
table.add_column('sin1', 'Sine Wave 1', func='sin', period='5min', alpha=3)
table.add_column('sin2', 'Sine Wave 2', func='sin', period='8min', alpha=5)
table.add_column('cos',  'Cosine Wave', func='cos', period='3min', alpha=2.5)

11.9.2. Creating Tables

Table objects are Django model instances backed by the database. Tables are created by calling the create class method of the specific table class of interest. Each table type is programmed to generate data differently.

The term “data source” is intentionally vague, as all that is required of a given type of table is that it can, on demand, produce a data set – a two-dimensional set of rows and columns that matches the requested table configuration (options and columns) as well as dynamic user-provided criteria. The following are some examples of valid data sources:

  • leverage configured devices to run queries on remote machines
  • generate data based on some algorithm
  • read data from a file or database
  • merge data from other tables or sources and produce a modified table

The following table lists some of the data source tables available:

DatasourceTable Subclass Name   Package
WaveTable                       steelscript.wave.appfwk.datasources.wave_source
AnalysisTable                   steelscript.appfwk.apps.datasource.modules.analysis
HTMLTable                       steelscript.appfwk.apps.datasource.modules.html
SharepointTable                 steelscript.appfwk.apps.plugins.builtin.sharepoint.datasources.sharepoint
SolarwindsTable                 steelscript.appfwk.apps.plugins.builtin.solarwinds.datasources.solarwinds
NetProfilerTable                steelscript.netprofiler.appfwk.datasources.netprofiler
NetProfilerTimeSeriesTable      steelscript.netprofiler.appfwk.datasources.netprofiler
NetProfilerGroupbyTable         steelscript.netprofiler.appfwk.datasources.netprofiler
NetProfilerDeviceTable          steelscript.netprofiler.appfwk.datasources.netprofiler_devices
NetProfilerTemplateTable        steelscript.netprofiler.appfwk.datasources.netprofiler_template
NetSharkTable                   steelscript.netshark.appfwk.datasources.netshark
WiresharkTable                  steelscript.wireshark.appfwk.datasources.wireshark_source

Tables are created by calling the create class method of the DatasourceTable subclass:

from <package> import <cls>
table = <cls>.create(name, [table_options], [field_options])

classmethod DatasourceTable.create(name, **kwargs)

Create a table.

Parameters:
  • name (str) – Unique identifier for this table

Standard keyword arguments:

Parameters:
  • rows (int) – set maximum number of rows to save after sorting (defaults to -1, no maximum)
  • resample (bool) – if True, timeseries data returned by the data source will be resampled to criteria.resample_resolution or criteria.resolution
  • field_map (dict) –

    a dictionary mapping fields by keyword to either a new keyword or to a map of field values to customize for the given field

    field_map = {'qos': 'qos_1'}

    or

    field_map = {'qos': {'keyword': 'qos_1',
                         'label': 'QoS 1',
                         'default': 'AF'}}

    This is used to remap fields defined by standard tables so that the same table can be used multiple times in the same report with different criteria via different keywords, as the sketch below illustrates.
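
A minimal sketch of this remapping, assuming (as the sample above suggests) that WaveTable exposes a duration criteria field; the names wave-east, duration_east, wave-west, and duration_west are hypothetical:

import steelscript.wave.appfwk.datasources.wave_source as wave

# First copy: remap only the field keyword.
east = wave.WaveTable.create(
    name='wave-east', duration='15min', resolution='1s',
    field_map={'duration': 'duration_east'})

# Second copy: remap the keyword and customize the label and default.
west = wave.WaveTable.create(
    name='wave-west', duration='15min', resolution='1s',
    field_map={'duration': {'keyword': 'duration_west',
                            'label': 'West Duration',
                            'default': '30min'}})

Each table now reads its duration from a separate criteria keyword, so both can appear in one report with independent values.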

Additional table and field option keyword arguments may be provided that are unique to the specific data source table being instantiated:

table_options

These are subclass-specific options that allow customization of a table instance. Table options are not visible to users running reports via the UI. Option values are provided at table creation and are considered static for the life of this table instance.

field_options

Most tables are designed to take input from the user via table fields. The user fills in values for each table field, and the set of inputs becomes the report criteria.

Field options allow the report writer to customize aspects of table fields such as the initial value of a form field or the list of choices in a drop-down selection field.
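
Referring back to the WaveTable sample, the comments below mark which keywords read as table options versus field options. This split is an inference from the sample, not a documented contract; consult wave_source for the authoritative list:

import steelscript.wave.appfwk.datasources.wave_source as wave

table = wave.WaveTable.create(
    name='wave-custom',
    beta=4,             # table option: static for the life of this table
    duration='15min',   # field option: initial value, editable in the UI
    resolution='1s')    # field option: initial value, editable in the UI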

11.9.2.1. Adding Columns

Columns define the keys and values of the data set that this table will collect. They are added to a table using DatasourceTable.add_column().

When a query is run, the data source associated with a table inspects the list of key and value columns and generates a data table matching the requested column set.

DatasourceTable.add_column(name, label=None, sortasc=False, sortdesc=False, **kwargs)

Add a column to this table.

Parameters:
  • name (str) – Unique identifier for this column
  • label (str) – Display string, defaults to name
  • sortasc, sortdesc (bool) – Sort the table based on this column's data

Standard keyword arguments:

Parameters:
  • iskey (bool) – Set True for key columns
  • datatype (enum) –

    Type of data this column contains, defaults to 'float':

    • float
    • integer
    • time
    • string
    • html
    • date
  • units (enum) –

    Units for data in this column, defaults to none:

    • none - no units
    • s - seconds
    • ms - milliseconds
    • B - bytes
    • B/s - bytes per second
    • b - bits
    • b/s - bits per second
    • pct - percentage
  • position (int) – Display position relative to other columns, automatically computed by default
  • synthetic (bool) – Set True to compute this column's value according to compute_expression
  • compute_expression (str) – Computation expression for synthetic columns
  • compute_post_resample (bool) – If True, compute this synthetic column after resampling (time series only)
  • resample_operation (str) – Operation to use on this column to aggregate multiple rows during resampling, defaults to sum (see the example after this list)
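
As an illustration of these keyword arguments, the hypothetical columns below (names invented for the example) combine datatype, units, sorting, and a resampling operation. Whether plain strings such as 'ms' are accepted for units, or constants analogous to Column.DATATYPE_TIME are required, may vary by framework version:

# Key column holding the timestamp for each row.
table.add_column('time', 'Time', datatype=Column.DATATYPE_TIME, iskey=True)

# Value columns; datatype defaults to 'float'.
table.add_column('avg_bytes', 'Avg Bytes/s', units='B/s', sortdesc=True)
table.add_column('peak_rtt', 'Peak RTT', units='ms',
                 resample_operation='max')  # aggregate peaks with 'max'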

11.9.2.2. Synthetic Columns

In addition to columnar data generated by a data source, additional synthetic columns may be attached to a table. Synthetic columns provide an easy way to perform computations on other data columns in the same table.

This is best explained by an example based on the WaveTable above:

table.add_column('sin1', 'Sine Wave 1', func='sin', period='5min', alpha=3)
table.add_column('sin1-doubled', synthetic=True, compute_expression='2*{sin1}')

The first is a normal column whose data will be provided by the wave data source. The second is a synthetic column that is simply the 'sin1' column multiplied by 2.

The compute_expression column keyword defines the operation to perform:

  • Other column values are referenced using the syntax {name}, where name is the name assigned to another column in the same table. Any number of other columns may be referenced.
  • Standard mathematical operators may be used: +, -, *, /, and others.
  • {name} is actually a Python Pandas Series object, and thus functions on series data can be leveraged either by methods on the series object or by using the full package path:
    • {name}.quantile(0.95) will compute the 95th percentile for the data in column {name}
    • pandas.stats.moments.ewma({name}, span=20) will compute the EWMA (exponentially weighted moving average) of the {name} column using a span of 20 data points (see the sketch below).
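
For instance, building on the WaveTable sample, a smoothed copy of the sin1 column could be added with a sketch like this (note that recent pandas versions replace pandas.stats.moments.ewma with the Series.ewm(span=20).mean() method):

# Synthetic column: EWMA-smoothed copy of 'sin1' over a 20-point span.
table.add_column('sin1-ewma', 'Sine Wave 1 (EWMA)', synthetic=True,
                 compute_expression='pandas.stats.moments.ewma({sin1}, span=20)')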

For more advanced analysis techniques, see Analysis Tables.

11.9.2.3. Resampling Time Series Tables

When working with time series data, a common operation is to resample that data:

  • Incoming data is at 1-minute resolution, but the output needs to be at 5-minute resolution
  • Incoming data has erratic, non-normalized timestamps; the output should be graphed at a steady 1-minute resolution

The application framework will automatically resample timeseries data when resample=True is passed at table creation. In addition, there must be a criteria field named either resample_resolution or just resolution, which sets the target resample interval.

When resampling, data from multiple rows must be aggregated (each row represents a timestamp or time interval). The aggregation operation is different for different types of data:

  • Counted metrics such as “total bytes” require computing the “sum” of all rows covered by the new interval.
  • Peak metrics such as “peak network RTT” require computing the “max” of those rows.

Each data column may be set up with a different resample_operation based on the type of data it contains. The default is sum, but this is not appropriate for all data types.
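
Putting this together, a minimal sketch based on the WaveTable sample, assuming its resolution criteria field satisfies the requirement above:

import steelscript.wave.appfwk.datasources.wave_source as wave
from steelscript.appfwk.apps.datasource.models import Column

# Enable resampling; the 'resolution' criteria field supplies the
# target interval.
table = wave.WaveTable.create(
    name='wave-resampled', duration='15min', resolution='1s',
    resample=True)

table.add_column('time', 'Time', datatype=Column.DATATYPE_TIME, iskey=True)
# The default operation ('sum') suits counted metrics ...
table.add_column('sin-sum', 'Sine (summed)', func='sin', period='5min')
# ... while peak metrics should aggregate with 'max'.
table.add_column('sin-peak', 'Sine (peak)', func='sin', period='5min',
                 resample_operation='max')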

Note that when using synthetic columns as described above, you can choose to compute the synthetic columns before or after resampling by setting compute_post_resample.