Summarize Attributes

Summarize Attributes diagram


This tool summarizes all the matching values in one or more fields and calculates statistics on them. The most basic statistic is the count of features that have been summarized together, but you can calculate more advanced statistics as well. You can optionally choose to summarize into time steps, or summarize all features without grouping them.

For example, suppose you have point features of store locations with a field named DISTRICT_MANAGER_NAME, and you want to summarize coffee sales by manager. You can specify DISTRICT_MANAGER_NAME as the field to dissolve on, and all rows of data representing each individual manager will be summarized. This means all store locations that are managed by Manager1 will be summarized into one row, with summary statistics calculated. In this instance, statistics like the count of stores and the sum of TOTAL_SALES for all stores that Manager1 manages would be calculated, as well as for any other manager listed in the DISTRICT_MANAGER_NAME field.
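
The grouping described in this example can be sketched in plain Python. This is only an illustration of the idea, not the tool's implementation: the sample rows, the summarize_by_field helper, and the output field names COUNT and SUM_TOTAL_SALES are assumptions made for the sketch.

```python
from collections import defaultdict

# Hypothetical store records; only the field names come from the example.
stores = [
    {"DISTRICT_MANAGER_NAME": "Manager1", "TOTAL_SALES": 1200.0},
    {"DISTRICT_MANAGER_NAME": "Manager1", "TOTAL_SALES": 800.0},
    {"DISTRICT_MANAGER_NAME": "Manager2", "TOTAL_SALES": 500.0},
]


def summarize_by_field(rows, group_field, value_field):
    """Return one summary row per distinct value of group_field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row[value_field])
    # Output field names are illustrative, not the tool's naming scheme.
    return {
        key: {"COUNT": len(values), "SUM_" + value_field: sum(values)}
        for key, values in groups.items()
    }


summary = summarize_by_field(stores, "DISTRICT_MANAGER_NAME", "TOTAL_SALES")
```

Here the two Manager1 stores collapse into a single summary row, and Manager2 gets a row of its own.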

Choose the input to summarize


The layer that contains the fields that will be summarized.

In addition to choosing a layer from your map, you can choose Choose Analysis Layer at the bottom of the drop-down list to browse to your contents for a big data file share dataset or feature layer. You may optionally apply a filter on your input layer or apply a selection on hosted layers added to your map. Filters and selections are only applied for analysis.

Choose how to summarize your data


There are two ways to summarize your data:

  • All features into a single feature—Calculates statistics on all features.
  • By fields—Groups features by like values in specified fields.

For example, suppose you have a dataset of trees with a field treetype containing the values Maple, Fir, and Pine, and a field treeheight, and you are interested in finding the mean treeheight. If you summarize all features into a single feature, you will find the mean height of all trees, which results in one mean value. If you summarize by the field treetype, you will end up with a mean for maple trees, a mean for fir trees, and a mean for pine trees.
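
The difference between the two options can be sketched as follows. The tree records and heights are made-up sample values, and the code only illustrates the concept rather than how the tool computes its results.

```python
from statistics import mean

# Hypothetical tree records; only the field names come from the example.
trees = [
    {"treetype": "Maple", "treeheight": 10.0},
    {"treetype": "Maple", "treeheight": 14.0},
    {"treetype": "Fir", "treeheight": 30.0},
    {"treetype": "Pine", "treeheight": 20.0},
    {"treetype": "Pine", "treeheight": 26.0},
]

# All features into a single feature: one mean for the whole dataset.
overall_mean = mean(t["treeheight"] for t in trees)

# By fields: one mean per distinct treetype value.
heights_by_type = {}
for t in trees:
    heights_by_type.setdefault(t["treetype"], []).append(t["treeheight"])
mean_by_type = {kind: mean(heights) for kind, heights in heights_by_type.items()}
```

The first option yields a single value, while grouping by treetype yields one value per tree type.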

Add statistics (optional)


You can calculate statistics on features that are summarized. You can calculate the following on numeric fields:

  • Count—Calculates the number of nonnull values. It can be used on numeric fields or strings. The count of [null, 0, 2] is 2.
  • Sum—The sum of numeric values in a field. The sum of [null, null, 3] is 3.
  • Mean—The mean of numeric values. The mean of [0, 2, null] is 1.
  • Min—The minimum value of a numeric field. The minimum of [0, 2, null] is 0.
  • Max—The maximum value of a numeric field. The maximum value of [0, 2, null] is 2.
  • Range—The range of a numeric field. This is calculated as the minimum value subtracted from the maximum value. The range of [0, null, 1] is 1. The range of [null, 4] is 0.
  • Variance—The variance of a numeric field. The variance of [1] is null. The variance of [null, 1, 0, 1, 1] is 0.25.
  • Standard deviation—The standard deviation of a numeric field. The standard deviation of [1] is null. The standard deviation of [null, 1,0,1,1] is 0.5.

You can calculate the following on string fields:

  • Count—The number of nonnull strings.
  • Any—Returns a randomly sampled string value from the specified field.

All statistics are calculated on nonnull values. The resulting layer will contain a new field for each statistic calculated. Any number of statistics can be added by choosing an attribute and statistic.
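
The null handling in the numeric examples above can be reproduced with a short sketch. It assumes that nulls are represented as None and that variance and standard deviation are the sample (n - 1) forms, which is consistent with the listed results (null for a single value, 0.25 and 0.5 for [null, 1, 0, 1, 1]). The summarize_numeric function is a hypothetical helper, not part of any ArcGIS API.

```python
import math
from statistics import variance


def summarize_numeric(values):
    """Compute the listed statistics over the nonnull (non-None) values."""
    data = [float(v) for v in values if v is not None]
    if not data:
        # Assumption: the behavior for all-null input is not documented here.
        return {"count": 0}
    # Sample variance needs at least two values; otherwise report null (None).
    var = variance(data) if len(data) > 1 else None
    return {
        "count": len(data),
        "sum": sum(data),
        "mean": sum(data) / len(data),
        "min": min(data),
        "max": max(data),
        "range": max(data) - min(data),
        "variance": var,
        "stddev": math.sqrt(var) if var is not None else None,
    }


stats = summarize_numeric([None, 1, 0, 1, 1])
```

For the input [None, 1, 0, 1, 1], the null is dropped before any statistic is computed, so the count is 4 rather than 5.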

Summarize using time steps (optional)


If time is enabled on the input layer and it is of type instant, you can analyze using time stepping. There are three parameters you can set when you use time:

  • Time step interval
  • How often to repeat the time step
  • Time to align the time steps to

For example, if you have data that represents a year in time and you want to analyze it using weekly steps, set Time step interval to 1 week.

As another example, if you have data that represents a year in time and you want to analyze it using the first week of each month, set Time step interval to 1 week and How often to repeat the time step to 1 month, and use January 1 at 12:00 a.m. for Time to align the time steps to.

Time step interval


The interval of time used for generating time steps. Time step interval can be used alone or with the How often to repeat the time step or Time to align the time steps to parameter.

For example, if you want to create time steps that take place every Monday from 9:00 a.m. until 10:00 a.m., set Time step interval to 1 hour and How often to repeat the time step to 1 week, and use 9:00:00 a.m. on a Monday for Time to align the time steps to.

How often to repeat the time step


How often a time step recurs. How often to repeat the time step can be used alone, with Time step interval, with Time to align the time steps to, or with both Time step interval and Time to align the time steps to.

For example, if you want to create time steps that take place every Monday from 9:00 a.m. until 10:00 a.m., set Time step interval to 1 hour and How often to repeat the time step to 1 week, and use 9:00:00 a.m. on a Monday for Time to align the time steps to.

Time to align the time steps to


The date and time used to align the time steps. Time stepping will start at and continue backward from this time. If no time is specified, time stepping will align to January 1, 1970.

For example, if you want to create time steps that take place every Monday from 9:00 a.m. until 10:00 a.m., set Time step interval to 1 hour and How often to repeat the time step to 1 week, and use 9:00:00 a.m. on a Monday for Time to align the time steps to.
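
The way the three time parameters interact in the Monday example can be sketched as below. This is a conceptual illustration only: the time_step_for helper is hypothetical, the step end is treated as exclusive, and steps are generated on both sides of the alignment time. All of these are assumptions, not documented behavior of the tool.

```python
from datetime import datetime, timedelta


def time_step_for(timestamp, interval, repeat, reference):
    """Return the (start, end) of the step containing timestamp, or None.

    Steps of length `interval` begin every `repeat`, aligned so that one
    step starts exactly at `reference`.
    """
    # Floor division also handles timestamps earlier than the reference.
    periods = (timestamp - reference) // repeat
    start = reference + periods * repeat
    end = start + interval
    # Assumption: the end of a step is exclusive.
    return (start, end) if start <= timestamp < end else None


interval = timedelta(hours=1)         # Time step interval: 1 hour
repeat = timedelta(weeks=1)           # How often to repeat the time step: 1 week
reference = datetime(2024, 1, 1, 9)   # Alignment time: a Monday at 9:00 a.m.

monday_hit = time_step_for(datetime(2024, 1, 15, 9, 30), interval, repeat, reference)
tuesday_miss = time_step_for(datetime(2024, 1, 16, 9, 30), interval, repeat, reference)
earlier_hit = time_step_for(datetime(2023, 12, 25, 9, 15), interval, repeat, reference)
```

A feature time-stamped on a Monday between 9:00 a.m. and 10:00 a.m. falls into a step, while one on a Tuesday at the same time of day falls into none. Calendar-based repeats such as 1 month are not covered by this sketch, because timedelta has no month unit.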

Choose datastore


GeoAnalytics results are stored to a data store and exposed as a feature layer in Portal for ArcGIS. In most cases, results should be stored to the spatiotemporal data store, and this is the default. In some cases, saving results to the relational data store is a good option. The following are reasons you may want to store results to the relational data store:

  • You can use results in portal-to-portal collaboration.
  • You can enable sync capabilities with your results.

Result layer name


The name of the layer that will be created. If you are writing to an ArcGIS Data Store, your results will be saved in My Content and added to the map. If you are writing to a big data file share, your results will be stored in the big data file share and added to its manifest. They will not be added to the map. The default name is based on the tool name and the input layer name. If the layer already exists, the tool will fail.

When writing to ArcGIS Data Store (relational or spatiotemporal big data store), you can use the Save result in drop-down box to specify the name of a folder in My Content where the result will be saved.