◆ Glossary

Data & Datasets

  • Dataset

    A dataset is an organized collection of records, each of which includes data on a common set of attributes. A dataset can be thought of as a single table, or a single sheet in a spreadsheet application.

  • Record (also called: Entity, Item)

    A record is a single observed entity in a dataset. Such observations can be of things, people, resources, and so on, each of which is characterized by attributes. If a dataset is stored in a table or spreadsheet, each row is a record.

  • Attribute (also called: Variable, Column, Feature)

    An attribute is a specific measurement of a record, such as the cost of a project, the gender of a person, or the GDP trend of a country. Attributes can have a variety of data types, such as categories, text, numbers, time (timestamp), and numeric time-series. If a dataset is stored in a table or spreadsheet, each column is commonly an attribute.
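
    As a rough illustration of how datasets, records, and attributes relate, the sketch below models a small table in plain Python. The `projects` data and the helper names `attribute_names` and `column` are hypothetical and are not part of Keshif.

```python
# Hypothetical dataset: each dict is one record (a row);
# each key is an attribute (a column).
projects = [
    {"name": "Bridge repair", "cost": 120000, "type": "Infrastructure"},
    {"name": "School lunch", "cost": 45000, "type": "Education"},
    {"name": "Road paving", "cost": 80000, "type": "Infrastructure"},
]


def attribute_names(dataset):
    """Return the attributes shared by every record, in first-record order."""
    first = dataset[0]
    return [key for key in first if all(key in record for record in dataset)]


def column(dataset, attribute):
    """Return one attribute's value for every record (one column of the table)."""
    return [record[attribute] for record in dataset]
```

    Here `column(projects, "cost")` collects a numeric attribute, and `column(projects, "type")` collects a categorical one.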

    • Categorical attribute

      Categorical attributes capture text data, where each unique text value becomes a category. For example, the gender of a person, the city of an event, and survey questions answered from option lists all produce categorical data. Keshif detects the unique categories in such attributes.
      Learn more.

      • Multi-valued categorical attribute

        Multi-valued categorical attributes capture multiple values (text, categories) within a single attribute. Examples are amenities in Airbnb listings, or answers of multiple-choice survey questions. They affect data analysis options, and enable new chart types.
        Learn more.
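
        One way to picture a multi-valued attribute is a single cell that holds several values separated by a delimiter. The sketch below assumes a comma separator and hypothetical helper names; it does not describe how Keshif itself parses such data.

```python
from collections import Counter


def split_values(cell, separator=","):
    """Split one multi-valued cell into a list of trimmed, non-empty values."""
    return [part.strip() for part in cell.split(separator) if part.strip()]


def category_counts(cells, separator=","):
    """Count how many records mention each unique category."""
    counts = Counter()
    for cell in cells:
        # set() so a value repeated inside one record is counted only once
        counts.update(set(split_values(cell, separator)))
    return counts


# Hypothetical amenities attribute for three listings.
amenities = ["Wifi, Kitchen", "Wifi", "Kitchen, Parking, Wifi"]
```

        Because one record can fall into several categories, the per-category counts can add up to more than the number of records.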

    • Numeric attribute

      Numeric attributes capture a numeric property, such as age of a person, price of an item, duration of a project, and so on. If the numeric data is recorded at multiple times, it can be converted to a time-series attribute (see below).

    • Timestamp attribute

      Timestamp attributes capture when a certain activity happened. For example, the birth date of a person and the start (or end) date of a project are stored as timestamp data. Such data may be stored at different resolutions: the year, the date, or even the time within the day. Keshif's visual analytics automatically adjust to the resolution of a timestamp attribute.

    • Time-series attribute

      Time-series attributes capture observations of a numeric value over time. Each observation is made at a different time, such as each year or each day. In a spreadsheet, each time point would be stored in a separate column.
      Learn more.
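
      The spreadsheet layout described above, with one column per time point, can be gathered into a single time-series value per record. This is a minimal sketch with hypothetical names (`to_time_series`, `row`); skipping missing observations is an assumption made for the example.

```python
def to_time_series(record, time_columns):
    """Collect the listed columns into ordered (time point, value) pairs.

    Missing observations (None or absent columns) are skipped.
    """
    return [
        (column, record[column])
        for column in time_columns
        if record.get(column) is not None
    ]


# Hypothetical record with one column per year.
row = {"country": "A", "2019": 1.2, "2020": None, "2021": 1.5}
series = to_time_series(row, ["2019", "2020", "2021"])
```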

      • Time-point (Time-key)

        A time-key is a specific time of observation, such as a specific year in a time-series that measures trends annually. A time-key can be used in basic visualizations (for example, to apply colors or sorting), or as a parameter for time-indexed time-series charts.


  • Dashboard

    A Keshif dashboard is a fully interactive visual data analysis and exploration environment for a dataset.

  • Dashboard Mode

    Dashboard modes change which key functionalities are available, and target different use cases of visual analytics.
    Learn more.

    • Authoring

      Authoring enables adding charts to, and removing charts from, the dashboard.
      Learn more.

      • Attribute Panel

        The attribute panel includes an organized list of all data attributes, also called variables, columns, or features, in a dataset.
        Learn more.

  • Record Details Panel

    The record details panel shows all the data in a selected record from the record chart.
    Learn more.

  • Data Status Panel

    The data status panel summarizes all of the settings in use across the dashboard.
    Learn more.

    • Data Selection Crumbs

      Crumbs show the active filtering/comparison state of the dashboard in the top data status panel.


  • Aggregate (Record Groups) & Aggregate Charts

    Aggregates enable visual analytics by grouping records that share the same characteristics, such as similar cost, age, type, and so on. Aggregate charts create aggregates and visualize their characteristics to enable high-level analysis of a dataset composed of many records with different characteristics.

    • Category Aggregate

      A category is a text value that can be shared across different records in a dataset, such as the gender of a person or the type of a project. Categories are visually analyzed using categorical charts.
      Learn more.

    • Interval (Bin) Aggregate

      Intervals (or bins) are used to create data aggregates (groups) for numeric and time (timestamp) attributes. This approach groups records with similar values, enabling high-level analysis. Binning can be controlled by adjusting axis type, ranges, and zooming. Histograms and line charts use these aggregation types.
      Learn more.
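
      To make the idea of interval aggregates concrete, the sketch below groups numeric values into equal-width bins and counts the records in each. Equal-width binning and the names `bin_index` and `interval_aggregates` are assumptions for illustration; Keshif's own binning may differ.

```python
import math
from collections import Counter


def bin_index(value, start, width):
    """Return the index of the half-open interval [start + i*width, start + (i+1)*width)."""
    return math.floor((value - start) / width)


def interval_aggregates(values, start, width):
    """Count values per interval; keys are (lower, upper) bounds of non-empty bins."""
    counts = Counter(bin_index(value, start, width) for value in values)
    return {
        (start + index * width, start + (index + 1) * width): count
        for index, count in sorted(counts.items())
    }


# Hypothetical numeric attribute.
ages = [3, 12, 18, 25, 27, 41]
bins = interval_aggregates(ages, start=0, width=10)
```

      Changing `width` plays the role of zooming: narrower bins reveal more detail, wider bins give a coarser overview.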

  • Data Selection

    Data selections let you select groups of data to dive deeper, and enable cross-analysis across the features of a dataset. Selection types include highlighting, filtering, and comparison.
    Learn more.

  • Measurement (for aggregates)

    Aggregate charts, such as categorical bar charts or histograms, create groups (aggregates) of records and apply a measurement to each group, combining all of its data into one value. The resulting numbers and trends are then visualized according to the chart type for intuitive visual understanding of the data.

    • Measurement Mode

      Measurement mode defines how the records in each aggregate are combined for analysis. You can analyze data using the count of records, or using the sum or average of a numeric attribute in a dataset.
      Learn more.
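
      The three measurements named above (count, sum, and average) can be sketched as a single grouping function. The function name `measure`, its parameters, and the sample records are hypothetical.

```python
from collections import defaultdict


def measure(records, group_by, mode="count", attribute=None):
    """Group records by a categorical attribute and reduce each group to one value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_by]].append(record)

    result = {}
    for category, members in groups.items():
        if mode == "count":
            result[category] = len(members)
            continue
        values = [member[attribute] for member in members]
        if mode == "sum":
            result[category] = sum(values)
        elif mode == "average":
            result[category] = sum(values) / len(values)
        else:
            raise ValueError(f"unknown measurement mode: {mode}")
    return result


# Hypothetical records.
records = [
    {"type": "Infrastructure", "cost": 100},
    {"type": "Infrastructure", "cost": 300},
    {"type": "Education", "cost": 50},
]
```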

    • Breakdown Mode (Absolute / Percentage modes)

      Breakdown modes apply to aggregate charts, and define how the values in each aggregate are computed. You can view trends as absolute (raw) values or as percentages; three percentage modes use different computational approaches to generate the percentage values.
      Learn more.
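
      This entry does not spell out the three percentage modes, so the sketch below shows only one plausible approach: each aggregate's value as a share of the total across all aggregates. The choice of denominator and the name `as_percent_of_total` are assumptions, not a description of Keshif's modes.

```python
def as_percent_of_total(measures):
    """Convert absolute per-aggregate values to percentages of their total."""
    total = sum(measures.values())
    if total == 0:
        # Avoid dividing by zero when every aggregate is empty.
        return {key: 0.0 for key in measures}
    return {key: 100.0 * value / total for key, value in measures.items()}


# Hypothetical absolute values for two aggregates.
absolute = {"A": 30, "B": 10}
percent = as_percent_of_total(absolute)
```

      Other percentage modes would keep the same shape and change only what goes into the denominator.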

    • Measurement Axis Scale (Type & Extent)

      The measurement axis defines how values are mapped to visual characteristics, such as bar size, line position, or color. This mapping can be controlled using the scale type (linear / log) and the scale extent (fit / sync / full).
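
      The difference between the linear and log scale types can be sketched as two ways of mapping a value to a bar length. The function `scale` and its parameters are hypothetical, and the scale extent options (fit / sync / full) are not modeled here.

```python
import math


def scale(value, domain_max, range_max, scale_type="linear", domain_min=1.0):
    """Map a measured value to a bar length between 0 and range_max.

    The linear scale starts at 0. The log scale needs positive values,
    so it starts at domain_min, which maps to length 0.
    """
    if scale_type == "linear":
        return range_max * value / domain_max
    if scale_type == "log":
        return range_max * math.log(value / domain_min) / math.log(domain_max / domain_min)
    raise ValueError(f"unknown scale type: {scale_type}")
```

      With a maximum of 100 mapped to 200 pixels, a value of 10 gets a 20-pixel bar on the linear scale but roughly a 100-pixel bar on the log scale, which is why log scales help when a few large values would otherwise dwarf the rest.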


  • Bar Chart

    Bar charts visualize numeric trends of aggregates or records using bars whose length corresponds to the measured values. Keshif also offers rich and intuitive interaction methods to use the bars for data selection.

    • Aggregate Bar Charts

      These bar charts visualize categorical data. Each bar corresponds to a category in the dataset. Analytics settings control how visualizations are generated.
      Learn more.

    • Record Bar Charts

      These bar charts visualize numeric record data. Each bar corresponds to a record in the dataset, and these bars are shown in record list charts.
      Learn more.

  • Histogram Chart

    Histogram charts visualize high-level (aggregate) numeric data trends, and enable rich and intuitive interactions to filter the dataset.
    Learn more.

  • Line Chart (Timestamp Data)

    Line charts visualize high-level (aggregate) timestamp data trends.
    Learn more.

  • Stacked Charts

    Stacked charts are used to compare data and show breakdown distributions in a stacked visual form for data sub-groups that dissect the dataset. This setting applies to all main aggregate chart types.
    Learn more.

  • Side-by-Side Charts

    Side-by-side charts are used to compare the data distributions of multiple data sub-groups by placing a visualization of each sub-group next to the others. This setting applies to all main aggregate chart types.
    Learn more.

  • Maps

    Maps show the geographic distribution of the data, using attributes that describe geographic (spatial) characteristics of records in a dataset.

    • Choropleth Maps

      Choropleth maps visualize regional distribution of data and commonly show administrative boundaries, such as countries, states, districts, cities and neighborhoods.

      • Record Choropleth Map

        These maps visualize regional geographic features that are unique for each record. 
        Learn more.

      • Aggregate Choropleth Map

        These maps visualize the aggregated distribution of an underlying categorical data feature, which in turn describes a geographic region.
        Learn more.

    • Record Cluster Map

      Cluster maps are specific to point-based location data features that are unique for each record. In an event database, these can show the latitude and longitude of individual events. Keshif automatically enables aggregation of nearby points, and dynamically adjusts these clusters based on the map viewing configuration, such as the zoom level.
      Learn more.
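
      To give a feel for zoom-dependent clustering, the sketch below groups points into grid cells that shrink as the zoom level grows. Grid-based clustering is a stand-in technique chosen for simplicity; this glossary does not say which clustering method Keshif uses, and `cluster_points` is a hypothetical name.

```python
from collections import defaultdict


def cluster_points(points, zoom):
    """Group (lat, lon) points into square grid cells whose side halves per zoom level.

    Returns one cluster per non-empty cell, with its record count and mean position.
    """
    cell = 360.0 / (2 ** zoom)  # cell side in degrees
    cells = defaultdict(list)
    for lat, lon in points:
        cells[(int(lat // cell), int(lon // cell))].append((lat, lon))
    return [
        {
            "count": len(members),
            "lat": sum(p[0] for p in members) / len(members),
            "lon": sum(p[1] for p in members) / len(members),
        }
        for members in cells.values()
    ]


# Hypothetical event locations: two close together, one far away.
events = [(10.0, 10.0), (10.5, 10.5), (50.0, 50.0)]
```

      At a low zoom level all three events collapse into one cluster; zooming in first separates the distant event and eventually splits the two nearby ones.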

  • Time-series chart

    Time-series charts visualize a time-series attribute, and plot each record separately on a chart that shows observed values over a horizontal time-axis.
    Learn more.

    • Slope chart

      Slope charts are time-series charts that only show a start and an end time. Any other data measurements between these points are removed to focus on the change between the two selected time points (time keys).
      Learn more.
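
      The reduction a slope chart performs can be sketched as keeping only the values at two time keys. The helper `slope_segment` and the sample series are hypothetical.

```python
def slope_segment(series, start_key, end_key):
    """Keep only the values at the two selected time keys and report the change."""
    start, end = series[start_key], series[end_key]
    return {"start": start, "end": end, "change": end - start}


# Hypothetical time-series for one record, keyed by year.
series = {2010: 5, 2015: 9, 2020: 8}
segment = slope_segment(series, 2010, 2020)
```

      The peak in 2015 disappears from the result, which is the point of the chart: only the net change between the two selected time keys remains.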

    • Bump chart

      Bump charts are time-series charts that show the ranking of each record over time based on underlying values (such as the GDP of a country). This chart is a good alternative for a normalized look at the data, where rankings, and changes in rankings, are emphasized.
      Learn more.
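
      The ranking step behind a bump chart can be sketched as sorting the records at each time key. The function names, the sample data, and the tie-breaking rule (alphabetical by record name) are assumptions for illustration.

```python
def ranks_at(values_by_record):
    """Rank records by value, 1 = largest; ties are broken by record name."""
    ordered = sorted(values_by_record, key=lambda name: (-values_by_record[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


def bump_ranks(dataset, time_keys):
    """Return {record name: [rank at each time key, in order]}."""
    ranks = {name: [] for name in dataset}
    for key in time_keys:
        at_key = ranks_at({name: series[key] for name, series in dataset.items()})
        for name, rank in at_key.items():
            ranks[name].append(rank)
    return ranks


# Hypothetical GDP-like values for three records over two years.
gdp = {
    "A": {2019: 10, 2020: 12},
    "B": {2019: 11, 2020: 9},
    "C": {2019: 5, 2020: 20},
}
```

      Record A grows from 10 to 12 yet stays in second place, which shows how a bump chart emphasizes relative position rather than raw values.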

    • Time-Indexed Charts

      Time-indexed charts enable exploration of how the data changes relative to a specific time.
      Learn more.
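
      One common way to index a time-series is to express every value as a percentage of the value at a chosen time key, so the chosen time reads as 100. Whether Keshif uses this exact formula is not stated here; the sketch and the name `index_to` are assumptions.

```python
def index_to(series, base_key):
    """Express each value as a percentage of the value at base_key (base = 100)."""
    base = series[base_key]
    if base == 0:
        raise ValueError("cannot index to a zero base value")
    return {key: 100.0 * value / base for key, value in series.items()}


# Hypothetical time-series for one record, keyed by year.
series = {2019: 50, 2020: 75, 2021: 40}
indexed = index_to(series, 2019)
```

      Indexing puts records of very different magnitudes on a shared scale, so their relative growth or decline since the chosen time can be compared directly.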

    • Sparklines

      Sparklines are compact visualizations that show the trend over time for a single record.
      Learn more.

  • Scatter plot

    A scatter-plot chart reveals trends across two (or more) numeric attributes. Each data record becomes a point (dot) placed at a position based on its values for the two numeric attributes. Color and size can also be used to map two additional numeric attributes.
    Learn more.

    • Connected Scatter Plots

      Connected scatter plots are specialized charts that show, on demand, the journey of a single point over time along the X and Y axis attributes of a scatter plot.
      Learn more.