6.2. SteelScript AppResponse Report Tutorial
This tutorial will show you how to run a report against an AppResponse appliance using SteelScript for Python. This tutorial assumes a basic understanding of Python.
The tutorial has been organized so you can follow it sequentially.
Throughout the example, you will be expected to fill in details specific to your environment. These will be called out using a dollar sign $<name> – for example, $host indicates you should fill in the host name or IP address of an AppResponse appliance.
Whenever you see >>>, this indicates an interactive session using the Python shell. The command that you are expected to type follows the >>>, and the result of the command follows it. Any lines starting with # are just comments describing what is happening. In many cases the exact output will depend on your environment, so it may not match precisely what you see in this tutorial.
6.2.1. AppResponse Object
Interacting with an AppResponse appliance leverages two key classes:
AppResponse - provides the primary interface to the appliance, handling initialization, setup, and communication via REST API calls.
Report - talks through the AppResponse object to create a new report and pull data when the report is completed.
To begin, start Python from the shell or command line:
$ python
Python 2.7.13 (default, Apr 4 2017, 08:47:57)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
Once in the Python shell, let’s create an AppResponse object:
>>> from steelscript.appresponse.core.appresponse import AppResponse
>>> from steelscript.common import UserAuth
>>> ar = AppResponse('$host', auth=UserAuth('$username', '$password'))
In the above code snippet, we have created an AppResponse object, which represents a connection to an AppResponse appliance. The first argument is the hostname or IP address of the AppResponse appliance. The second argument is a named parameter and identifies the authentication method to use – in this case, simple username/password is used.
As soon as the AppResponse object is created, a connection is established to the AppResponse appliance, and the authentication credentials are validated. If the username and password are not correct, you will immediately see an exception.
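Because validation happens at construction time, a script can catch this failure up front. Below is a minimal sketch; it catches the generic Exception, since the specific exception class raised depends on your SteelScript version:
from steelscript.appresponse.core.appresponse import AppResponse
from steelscript.common import UserAuth

try:
    ar = AppResponse('$host', auth=UserAuth('$username', '$password'))
except Exception as e:
    # Replace Exception with the specific SteelScript exception if known
    print('Failed to connect or authenticate: %s' % e)
    raise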
The ar object is the basis for all communication with the AppResponse appliance, whether that is running a report, updating host groups or downloading a pcap file. Now let’s take a look at the basic information of the AppResponse appliance that we just connected to:
>>> info = ar.get_info()
>>> info['model']
u'VSCAN-2000'
>>> info['sw_version']
u'11.2.0 #13859'
# Let's see the entire info structure
>>> info
{u'device_name': u'680-valloy1',
u'hw_version': u'',
u'mgmt_addresses': [u'10.33.158.77'],
u'model': u'VSCAN-2000',
u'serial': u'',
u'sw_version': u'11.2.0 #13859'}
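Since get_info() returns a plain Python dictionary, its fields can be inspected like any other dict:
>>> for field in sorted(info):
...     print('%s: %s' % (field, info[field]))
...
device_name: 680-valloy1
hw_version:
mgmt_addresses: [u'10.33.158.77']
model: VSCAN-2000
serial:
sw_version: 11.2.0 #13859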
6.2.2. Creating a Report Script
Let’s create our first script. We’re going to write a simple script that runs a report against a packet capture job on our AppResponse appliance.
This script will get packets from a running packet capture job. To start, make sure the targeted AppResponse appliance has a running packet capture job.
Now create a file called report.py and insert the following code:
import pprint
from steelscript.appresponse.core.appresponse import AppResponse
from steelscript.common import UserAuth
from steelscript.appresponse.core.reports import DataDef, Report
from steelscript.appresponse.core.types import Key, Value, TrafficFilter
from steelscript.appresponse.core.reports import SourceProxy
# Fill these in with appropriate values
host = '$host'
username = '$username'
password = '$password'
# Open a connection to the appliance and authenticate
ar = AppResponse(host, auth=UserAuth(username, password))
packets_source = ar.get_capture_job_by_name('default_job')
source = SourceProxy(packets_source)
columns = [Key('start_time'), Value('sum_tcp.total_bytes'), Value('avg_frame.total_bytes')]
granularity = '10'
resolution = '20'
time_range = 'last 1 minute'
data_def = DataDef(source=source, columns=columns, granularity=granularity,
resolution=resolution, time_range=time_range)
data_def.add_filter(TrafficFilter('tcp.port==80'))
report = Report(ar)
report.add(data_def)
report.run()
pprint.pprint(report.get_data())
Be sure to fill in appropriate values for $host, $username and $password. Run this script as follows and you should see something like the following:
$ python report.py
[(1510685000, 3602855, 772.979),
(1510685020, 4109306, 754.001),
(1510685040, 657524, 779.057)]
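Each row is a tuple whose first element is the start_time key, returned as a Unix timestamp. If you prefer human-readable times, the standard library can convert it; a minimal sketch, appended after report.run() in the script:
from datetime import datetime

for start_time, total_bytes, avg_len in report.get_data():
    print('%s  %s  %s' % (datetime.utcfromtimestamp(start_time),
                          total_bytes, avg_len))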
Let’s take a closer look at what this script is doing.
6.2.2.1. Importing Classes
The first few lines are simply importing a few classes that we will be using:
import pprint
from steelscript.appresponse.core.appresponse import AppResponse
from steelscript.common import UserAuth
from steelscript.appresponse.core.reports import DataDef, Report
from steelscript.appresponse.core.types import Key, Value, TrafficFilter
from steelscript.appresponse.core.reports import SourceProxy
6.2.2.2. Creating an AppResponse Object
Next, we create an AppResponse object that establishes our connection to the target appliance:
# Open a connection to the appliance and authenticate
ar = AppResponse(host, auth=UserAuth(username, password))
6.2.2.3. Creating a Data Definition Object
This section describes how to create a data definition object.
6.2.2.3.1. Creating a SourceProxy Object
Now we need to create a SourceProxy object, which identifies the source from which the data will be fetched.
packets_source = ar.get_capture_job_by_name('default_job')
source = SourceProxy(packets_source)
We first obtain a packet capture job object by using the name of the capture job.
packets_source = ar.get_capture_job_by_name('default_job')
To run a report against a pcap file source, the file object can be obtained as below:
packets_source = ar.get_file_by_id('$file_id')
Then we need to initialize a SourceProxy object as below:
source_proxy = SourceProxy(packets_source)
To run a report against a non-packets source, the SourceProxy object is initialized by just using the name of the source as below:
source_proxy = SourceProxy(name='$source_name')
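For instance, for the aggregates source shown in the listing below, the proxy would be created as:
source_proxy = SourceProxy(name='aggregates')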
To see the available source names, just execute the following command in a shell:
$ steel appresponse sources $host -u $username -p $password
Name Groups Filters Supported on Metric Columns Granularities in Seconds
------------------------------------------------------------------------------------------------------------------------------------------------------
packets Packets False 0.1, 0.01, 0.001, 1, 10, 60, 600, 3600, 86400
aggregates Application Stream Analysis, Web True 60, 300, 3600, 21600, 86400
Transaction Analysis, UC Analysis
dbsession_summaries DB Analysis False 60, 300, 3600, 21600, 86400
sql_summaries DB Analysis False 60, 300, 3600, 21600, 86400
It shows that there are 4 supported sources in total. Note the following:
- Source aggregates belongs to 3 groups: Application Stream Analysis, Web Transaction Analysis and UC Analysis.
- Filters can be applied on the metric columns for the source aggregates.
- Filters are not supported on metric columns for the sources packets, dbsession_summaries and sql_summaries.
We will support native methods for accessing source information via Python in an upcoming release.
6.2.2.3.2. Choosing Columns
Then we select the set of columns that we are interested in collecting. Note that AppResponse supports multiple sources, and each source supports a different set of columns. Each column can be either a key column or a value column. Each row of data will be aggregated according to the set of key columns selected, and the value columns define the set of additional data to collect per row. In this example, we are asking to collect total bytes for TCP packets and the average total packet length for each resolution bucket.
To help identify which columns are available, just execute the helper command below at your shell prompt:
$ steel appresponse columns $host -u $username -p $password --source $source_name
For instance, to see the available columns for the packets source, we execute the command in a shell as:
$ steel appresponse columns $host -u $username -p $password --source packets
ID Description Type Metric Key/Value
----------------------------------------------------------------------------------------------------------------------------------
...
avg_frame.total_bytes Total packet length number True Value
...
start_time Used for time series data. Indicates the timestamp ---- Key
beginning of a resolution bucket.
...
sum_tcp.total_bytes Number of total bytes for TCP traffic integer True Value
Note that it would be better to pipe the output through | more, as there can be more than 1000 rows.
Construct a list of columns, including both key columns and value columns in your script as shown below.
columns = [Key('start_time'), Value('sum_tcp.total_bytes'), Value('avg_frame.total_bytes')]
We will support native methods for accessing column information via Python in an upcoming release.
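In the meantime, if a script builds several reports, a small helper can cut down on repetition. The function below is hypothetical (not part of SteelScript); it only wraps the Key and Value classes shown above:
from steelscript.appresponse.core.types import Key, Value

def build_columns(key_names, value_names):
    # Wrap plain column IDs in Key/Value objects for a DataDef
    return ([Key(name) for name in key_names] +
            [Value(name) for name in value_names])

columns = build_columns(['start_time'],
                        ['sum_tcp.total_bytes', 'avg_frame.total_bytes'])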
6.2.2.3.3. Setting Time Fields
Now it is time to set the time-related criteria fields. First, we need to see which granularity values the source of interest supports, by running the below command in a shell:
$ steel appresponse sources $host -u $username -p $password
Name Groups Filters Supported on Metric Columns Granularities in Seconds
------------------------------------------------------------------------------------------------------------------------------------------------------
packets Packets False 0.1, 0.01, 0.001, 1, 10, 60, 600, 3600, 86400
aggregates Application Stream Analysis, Web True 60, 300, 3600, 21600, 86400
Transaction Analysis, UC Analysis
dbsession_summaries DB Analysis False 60, 300, 3600, 21600, 86400
sql_summaries DB Analysis False 60, 300, 3600, 21600, 86400
As can be seen, the packets source supports granularity values of 0.1, 0.01, 0.001, 1, 10, 60, 600, 3600 and 86400 (in seconds).
granularity = '10'
resolution = '20'
time_range = 'last 1 minute'
Setting granularity to 10 means the data source computes a summary of the metrics it received based on intervals of 10 seconds.
Resolution is a setting in addition to granularity that tells the data source to aggregate the data further. Its numeric value must be a multiple of the requested granularity value. In this script, the data will be aggregated into 20-second intervals. Setting resolution is optional.
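Because of this constraint, a quick sanity check before building the DataDef can catch mistakes early; a minimal sketch (the tutorial passes these values as strings, so they are converted first):
granularity = '10'
resolution = '20'

# resolution must be an integer multiple of granularity
assert float(resolution) % float(granularity) == 0, \
    'resolution must be a multiple of granularity'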
If resolution is removed from the script, the output will consist of 10-second summaries instead of 20-second aggregated records, similar to the following:
$ python report.py
[(1510687770, 911456, 784.386),
(1510687780, 1672581, 780.85),
(1510687790, 1709843, 776.143),
(1510687800, 1338178, 797.484),
(1510687810, 1368713, 771.541),
(1510687820, 545244, 791.356)]
The parameter time_range specifies the time range for which the data source computes the metrics. Other valid formats include “this minute”, “previous hour” and “06/05/17 17:09:00 to 06/05/17 18:09:00”.
6.2.2.3.4. Initializing Data Definition Object
With all the above values derived, we can now create a DataDef object as below.
data_def = DataDef(source=source, columns=columns, granularity=granularity,
                   resolution=resolution, time_range=time_range)
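The constructor accepts any of the time_range formats mentioned above. For example, a report over an explicit window could be defined as below (a sketch reusing the source and columns objects defined earlier; granularity 60 is one of the values supported by the packets source):
dd_window = DataDef(source=source, columns=columns, granularity='60',
                    time_range='06/05/17 17:09:00 to 06/05/17 18:09:00')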
6.2.2.3.5. Adding Traffic Filters
To filter the data, it is easy to add traffic filters to the DataDef object. First, let us create a traffic filter as below.
tf = TrafficFilter('tcp.port==80')
The above filter is a steelfilter traffic filter that outputs records matching tcp.port==80.
Note that running the sources command shown earlier indicates whether filters can be applied on metric columns for each source.
It is worth mentioning that the packets source also supports bpf filters and wireshark filters. Each has its own syntax and set of filter fields; other sources support neither bpf filters nor wireshark filters. bpf and wireshark filters can be created as below:
bpf_filter = TrafficFilter('port 80', type_='bpf')
wireshark_filter = TrafficFilter('tcp.port==80', type_='wireshark')
Now we can add the filter to the DataDef object.
data_def.add_filter(tf)
You can create multiple filters and add them to the DataDef object one by one using the above method, as in the sketch below.
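A minimal sketch (the second filter expression is only illustrative; confirm the filter fields your source supports):
filters = [TrafficFilter('tcp.port==80'),
           TrafficFilter('ip.addr==10.0.0.1')]  # hypothetical second filter

for f in filters:
    data_def.add_filter(f)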
6.2.2.3.6. Running a Report
After creating the data definition object, we are ready to run a report as below:
# Initialize a new report
report = Report(ar)
# Add one data definition object to the report
report.add(data_def)
# Run the report
report.run()
# Grab the data
pprint.pprint(report.get_data())
Currently, only one data definition is supported per report instance. The next release will add the ability to run multiple data definitions per report instance; the benefit of doing so is that data definitions can reuse the same data source, yielding a significant performance gain.
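Until then, a simple workaround is to run a separate report for each data definition. A sketch, assuming two DataDef objects built as shown earlier (data_def1 and data_def2 are hypothetical names):
results = []
for dd in (data_def1, data_def2):  # hypothetical DataDef objects
    report = Report(ar)
    report.add(dd)
    report.run()
    results.append(report.get_data())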
6.2.3. Extending the Example
As a last item to help get started with your own scripts, we will extend our example with one helpful feature: table outputs.
Rather than show how to update your existing example script, we will post the new script, then walk through key differences that add the feature.
Let us create a file called table_report.py and insert the following code:
from steelscript.appresponse.core.appresponse import AppResponse
from steelscript.common import UserAuth
from steelscript.appresponse.core.reports import DataDef, Report
from steelscript.appresponse.core.types import Key, Value, TrafficFilter
from steelscript.appresponse.core.reports import SourceProxy
# Import the Formatter class to output data in a table format
from steelscript.common.datautils import Formatter
# Fill these in with appropriate values
host = '$host'
username = '$username'
password = '$password'
# Open a connection to the appliance and authenticate
ar = AppResponse(host, auth=UserAuth(username, password))
packets_source = ar.get_capture_job_by_name('default_job')
source_proxy = SourceProxy(packets_source)
columns = [Key('start_time'), Value('sum_tcp.total_bytes'), Value('avg_frame.total_bytes')]
granularity = '10'
resolution = '20'
time_range = 'last 1 minute'
data_def = DataDef(source=source_proxy, columns=columns, granularity=granularity,
resolution=resolution, time_range=time_range)
data_def.add_filter(TrafficFilter('tcp.port==80'))
report = Report(ar)
report.add(data_def)
report.run()
# Get the header of the table
header = report.get_legend()
data = report.get_data()
# Output the data in a table format
Formatter.print_table(data, header)
Be sure to fill in appropriate values for $host, $username and $password. Run this script as follows and you should see the report result rendered in a table format like the following:
$ python table_report.py
start_time sum_tcp.total_bytes avg_frame.total_bytes
--------------------------------------------------------------
1510685000 3602855 772.979
1510685020 4109306 754.001
1510685040 657524 779.057
As can be seen from the script, there are three differences.
First, we import the Formatter class as below:
from steelscript.common.datautils import Formatter
After the report has finished running, we obtain the header of the table, which is essentially the list of column names matching the report result:
header = report.get_legend()
Finally, the Formatter class is used to render the report result in a nice table format:
Formatter.print_table(data, header)
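If machine-readable output is preferred over a table, the same data and legend can be written as CSV using only the Python standard library; a minimal sketch, assuming get_legend() yields plain column-name strings as the table output above suggests:
import csv
import sys

writer = csv.writer(sys.stdout)
writer.writerow(header)   # column names from report.get_legend()
writer.writerows(data)    # rows from report.get_data()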