11.9. Tables¶
All data in the application framework is associated with a Table. Fundamentally, a Table is a Django model instance that defines the columns and attributes of a data set. In many senses, a Table definition is similar to a SQL TABLE construct.
11.9.1. Overview¶
Tables are defined as part of report configuration source files. All
tables are defined by instantiating a subclass of DatasourceTable
that is specific to a particular data source.
Defining a table involves the following concepts:
Tables
Tables are the fundamental mechanism for defining the data that is to be collected. Each table is specific to a data source and is tied to a function in that data source that generates data based on the table's columns as well as any static and dynamic criteria.
Columns
A table is fundamentally a two-dimensional construct of rows and columns. The report configuration source file defines the columns associated with a table. Each column binds a label and data type to data-source-specific fields that the table query function uses to populate data. As in a SQL table, some columns are key columns; the rest are data or metric columns.
Table Fields
Table fields define the required and optional criteria associated with a table. Each data source can define a set of table fields that are required or optional for each type of table.
Note that while tables are defined in the context of a report configuration source file, tables are only loosely coupled with a report via widgets. In fact, it is possible to run tables at the command line without running an entire report.
11.9.1.1. Sample Table Definition¶
The following table is taken from the Wave sample plugin:
import steelscript.wave.appfwk.datasources.wave_source as wave
from steelscript.appfwk.apps.datasource.models import Column
# Define a waves table with 3 separate waves.
table = wave.WaveTable.create(
    name='wave-table', duration='15min', resolution='1s', beta=4)
# Add columns for time and the related waves
table.add_column('time', 'Time', datatype=Column.DATATYPE_TIME, iskey=True)
table.add_column('sin1', 'Sine Wave 1', func='sin', period='5min', alpha=3)
table.add_column('sin2', 'Sine Wave 2', func='sin', period='8min', alpha=5)
table.add_column('cos', 'Cosine Wave', func='cos', period='3min', alpha=2.5)
11.9.2. Creating Tables¶
Table objects are Django model instances backed by the database.
Tables are created by calling the create class method of the specific table class of interest. Each table type is programmed to generate data differently.
The term “data source” is intentionally vague, as all that is required of a given type of table is that it can, on demand, produce a data set – a two-dimensional set of rows and columns that matches the requested table configuration (options and columns) as well as dynamic, user-provided criteria. The following are some examples of valid data sources; a minimal sketch of the pattern follows the list:
- leverage configured devices to run queries on remote machines
- generate data based on some algorithm
- read data from a file or database
- merge data from other tables or sources and produce a modified table
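To make the contract concrete, here is a minimal sketch of a custom data source, modeled loosely on the pattern the bundled data sources appear to follow. The TableQueryBase base class, the _query_class attribute, and the run() hook are assumptions based on that pattern, not a verified API reference:

import pandas

from steelscript.appfwk.apps.datasource.models import \
    DatasourceTable, TableQueryBase


class ConstantTable(DatasourceTable):
    """Hypothetical table whose query fills every column with a constant."""
    class Meta:
        proxy = True                    # assumed: appfwk tables are Django proxy models

    _query_class = 'ConstantQuery'      # assumed hook naming the query class
    TABLE_OPTIONS = {'value': 0}        # becomes a static table option


class ConstantQuery(TableQueryBase):
    def run(self):
        # Build a one-row DataFrame matching the table's defined columns
        cols = [c.name for c in self.table.get_columns()]
        self.data = pandas.DataFrame([[self.table.options.value] * len(cols)],
                                     columns=cols)
        return True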
The following table lists some of the data source tables available:
DatasourceTable Subclass | Package
---|---
WaveTable | steelscript.wave.appfwk.datasources.wave_source |
AnalysisTable | steelscript.appfwk.apps.datasource.modules.analysis |
HTMLTable | steelscript.appfwk.apps.datasource.modules.html |
SharepointTable | steelscript.appfwk.apps.plugins.builtin.sharepoint.datasources.sharepoint |
SolarwindsTable | steelscript.appfwk.apps.plugins.builtin.solarwinds.datasources.solarwinds |
NetProfilerTable | steelscript.netprofiler.appfwk.datasources.netprofiler |
NetProfilerTimeSeriesTable | steelscript.netprofiler.appfwk.datasources.netprofiler |
NetProfilerGroupbyTable | steelscript.netprofiler.appfwk.datasources.netprofiler |
NetProfilerDeviceTable | steelscript.netprofiler.appfwk.datasources.netprofiler_devices |
NetProfilerTemplateTable | steelscript.netprofiler.appfwk.datasources.netprofiler_template |
NetSharkTable | steelscript.netshark.appfwk.datasources.netshark |
WiresharkTable | steelscript.wireshark.appfwk.datasources.wireshark_source |
Tables are created by calling the create class method of the DatasourceTable subclass:
from <package> import <cls>
table = <cls>.create(name, [table_options], [field_options])
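For example, using the WaveTable from the sample plugin above, a table might be created with a row limit and resampling enabled. Here rows and resample are the standard keyword arguments documented below, while duration, resolution, and beta are WaveTable-specific; the table name is illustrative:

import steelscript.wave.appfwk.datasources.wave_source as wave

# Limit saved output to 500 rows and resample the timeseries
table = wave.WaveTable.create(name='wave-resampled',
                              duration='15min', resolution='1s', beta=4,
                              rows=500, resample=True)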
classmethod DatasourceTable.create(name, **kwargs)¶
Create a table.
Parameters: name (str) – Unique identifier for this table
Standard keyword arguments:
Parameters: - rows (int) – set maximum number of rows to save after sorting (defaults to -1, no maximum)
- resample (bool) – if True, timeseries data returned by the data source will be resampled to criteria.resample_resolution or criteria.resolution
- field_map (dict) – a dictionary mapping fields by keyword to either a new keyword or to a map of field values to customize for the given field:
  field_map = {'qos': 'qos_1'}
  or
  field_map = {'qos': {'keyword': 'qos_1', 'label': 'QoS 1', 'default': 'AF'}}
This is used to remap fields defined by standard tables to allow the same table to be used multiple times in the same report but with different criteria via different keywords.
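As a hedged illustration of field_map, the same table type could be instantiated twice in one report, each copy exposing its qos field under a distinct keyword. SomeTable and its qos field are hypothetical placeholders:

# Each instance gets its own criteria keyword, so a single report
# form can present 'qos_1' and 'qos_2' fields side by side
t1 = SomeTable.create(name='qos-table-1',
                      field_map={'qos': {'keyword': 'qos_1',
                                         'label': 'QoS 1'}})
t2 = SomeTable.create(name='qos-table-2',
                      field_map={'qos': {'keyword': 'qos_2',
                                         'label': 'QoS 2'}})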
Additional table and field option keyword arguments may be provided that are unique to the specific data source table being instantiated:
table_options
These options define subclass-specific options that allow customization of a table instance. Table options are not visible to users running reports via the UI. Option values are provided at table creation and are considered static for the life of this table instance.
field_options
Most tables are designed to take input from the user via table fields. The user fills in values for each table field, and that set of input values becomes the report criteria.
Field options allow the report writer to customize aspects of table fields, such as the initial value of a form field or the list of choices in a drop-down selection field.
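For instance, assuming the WaveTable's duration and resolution are table fields as in the sample above, a second instance could present different initial values to the user, who may still override them in the report form:

# Same table type, different starting criteria
table_hourly = wave.WaveTable.create(name='wave-table-hourly',
                                     duration='1h', resolution='1min',
                                     beta=4)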
11.9.2.1. Adding Columns¶
Columns define the keys and values of the data set that this table will collect. They are added to a table using DatasourceTable.add_column().
When a query is run, the data source associated with a table inspects the list of key and value columns and generates a data table matching the requested column set.
DatasourceTable.add_column(name, label=None, sortasc=False, sortdesc=False, **kwargs)¶
Add a column to this table.
Parameters: - name (str) – Unique identifier for this column
- label (str) – Display string, defaults to name
- sortasc, sortdesc (bool) – Sort the table based on this column's data
Standard keyword arguments:
Parameters: - iskey (bool) – Set True for key columns
- datatype (enum) – type of data this column contains, defaults to 'float':
- float
- integer
- time
- string
- html
- date
- units (enum) –
Units for data in this column, defaults to none:
- none - no units
- s - seconds
- ms - milliseconds
- B - bytes
- B/s - bytes per second
- b - bits
- b/s - bits per second
- pct - percentage
- position (int) – Display position relative to other columns, automatically computed by default
- synthetic (bool) – Set True to compute this column's value according to compute_expression
- compute_expression (str) – Computation expression for synthetic columns
- compute_post_resample (bool) – If true, compute this synthetic column after resampling (time series only)
- resample_operation (str) – Operation to use on this column to aggregate multiple rows during resampling, defaults to sum
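Combining a few of these keyword arguments, a short illustrative sketch follows. The column names are invented, and Column.DATATYPE_STRING is assumed to follow the same naming pattern as the Column.DATATYPE_TIME constant used in the sample above:

# A string-typed key column plus a byte-rate metric column that the
# table will be sorted on, descending
table.add_column('host', 'Host Name', iskey=True,
                 datatype=Column.DATATYPE_STRING)
table.add_column('avg_bytes', 'Avg Bytes/s', units='B/s', sortdesc=True)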
11.9.2.2. Synthetic Columns¶
In addition to columnar data generated by a data source, additional synthetic columns may be attached to a table. Synthetic columns provide an easy way to perform computations on other data columns in the same table.
This is best explained by an example based on the WaveTable above:
table.add_column('sin1', 'Sine Wave 1', func='sin', period='5min', alpha=3)
table.add_column('sin1-doubled', synthetic=True, compute_expression='2*{sin1}')
The first is a normal column whose data will be provided by the wave data source. The second column is a synthetic column that is simply the ‘sin1’ column multiplied by 2.
The compute_expression column keyword defines the operation to perform:
- Other column values are referenced using the syntax {name}, where name is the name assigned to another column in the same table. Any number of other columns may be referenced.
- Standard mathematical operators may be used: +, -, *, /, and others.
- {name} is actually a Python pandas Series object, and thus functions on series data can be leveraged either by methods on the series object or by using the full package path:
  - {name}.quantile(0.95) will compute the 95th percentile for the data in column {name}
  - pandas.stats.moments.ewma({name}, span=20) will compute the EWMA (exponentially weighted moving average) of the {name} column using a span of 20 data points.
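Building on the WaveTable sample above, for example, synthetic columns could combine the wave columns or apply a series method. These column names and expressions are illustrative only:

# Sum all three waves into a single synthetic series
table.add_column('wave-sum', 'Sum of Waves', synthetic=True,
                 compute_expression='{sin1} + {sin2} + {cos}')

# 95th percentile of sin1, using a pandas Series method
table.add_column('sin1-p95', 'Sine Wave 1 p95', synthetic=True,
                 compute_expression='{sin1}.quantile(0.95)')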
For more advanced analysis techniques, see Analysis Tables.
11.9.2.3. Resampling Time Series Tables¶
When working with time series data, a common operation is to resample that data:
- Incoming data is at 1 minute resolution, but the output needs to be at 5 minute resolution
- Incoming data has erratic, non-normalized timestamps, but the output should be graphed at a steady 1 minute resolution
The application framework will automatically resample timeseries data when resample=True is set at table creation. In addition, there must be a criteria field named either resample_resolution or just resolution, which sets the target resample interval.
When resampling, data from multiple rows must be aggregated (each row represents a timestamp or time interval). The aggregation operation is different for different types of data:
- Counted metrics such as “total bytes” involve computing the “sum” of all rows covered by the new interval.
- Peak metrics such as “peak network RTT” require computing the “max” of the rows instead.
Each data column may be set up with a different resample_operation. The default is sum, but this is not always appropriate for all data types.
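For instance, a peak metric column could be declared so that resampling keeps the maximum rather than the sum. The column name and units here are illustrative:

# When rows are merged into a coarser interval, keep the maximum
# observed value rather than summing across rows
table.add_column('peak_rtt', 'Peak RTT', units='ms',
                 resample_operation='max')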
Note that when using synthetic columns as described above, you can choose to compute the synthetic columns before or after resampling by setting compute_post_resample.
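As an illustrative sketch (the retrans and total columns are hypothetical), a percentage that must be derived from already-aggregated totals would set compute_post_resample:

# Compute the ratio after rows have been aggregated; summing a
# precomputed per-row ratio would give the wrong answer
table.add_column('pct_retrans', 'Retransmit %', synthetic=True,
                 compute_expression='100 * {retrans} / {total}',
                 compute_post_resample=True)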