The High-D User Guide is also available in PDF format.
Introduction
Welcome to the documentation for Macrofocus High-D.
The "Getting Started" chapter provides a brief introduction to the most important features of the application and is a good place to start for new users.
All of the features are explained in detail in the subsequent chapters.
If you don’t find what you are looking for in these pages, then you might want to look at our FAQ list, and of course you may also contact High-D support.
Getting started
This chapter will introduce the core features of Macrofocus High-D and is intended to help new users get started with analyzing their data.
High-D is still lacking proper documentation. Meanwhile, we can suggest the following references:
After you have played around a bit, you will find that it is very easy to quickly access specific values for specific objects. It is much faster than looking them up in a big table or issuing a database query: just click on an object in any of the views. Individual values are not only easy to find, they are also embedded in the overall context, so you immediately see how they relate to other objects.
In addition to quick data access, the different views provide various ways of revealing interesting patterns in the data and allow you to make sense of it.
User interface
Menu and toolbars
File menu
- New
-
Creates a new empty window.
- Open…
-
Load a data file in one of the supported formats.
- Open URL…
-
Load a data file in one of the supported formats from a remote location.
- Open Database…
-
Load a database table or query from one of the supported database systems.
- Open Directory…
-
Create a dataset based on the directory structure.
- Open Google Spreadsheet…
-
Load data from Google Spreadsheet.
- Open Dataset
-
Load a dataset from High-D Server.
- Open Recent
-
Load one of the previously opened datasets.
- Reload
-
Reload the currently opened dataset, possibly retrieving updated data.
- Save
-
Save the active window in native High-D format.
- Save As…
-
Save the active window in native High-D format and give it a new file name.
- Export Graphics…
-
The current view is exported in vector or raster form in one of the following supported formats:
- PDF (Portable Document Format) (*.pdf)
-
The resulting document is ideal for printing or inclusion in a report. It is a vector format and therefore resolution independent.
- Scalable Vector Graphics (*.svg)
-
The resulting document is ideal for further editing and for inclusion into another document. It is a vector format and therefore resolution independent. Scalable Vector Graphics (SVG) can be displayed by many web browsers with an embedded SVG viewer, or edited by any application supporting SVG (such as Adobe Illustrator).
- PostScript (*.ps)
-
A common vector format, and therefore resolution independent. Can be used for printing.
- EMF (Enhanced Metafile) (*.emf)
-
A resolution-independent format common on the Windows platform.
- PNG (Portable Network Graphics) (*.png)
-
A raster format.
- JPEG (*.jpg)
-
A raster format.
- CompuServe GIF (*.gif)
-
A raster format.
- TIFF (Tagged Image File Format) (*.tiff)
-
A raster format.
All the raster export formats allow setting the desired DPI for high-quality output.
- Export Data…
-
The data visible in the Table view can be exported for further processing in spreadsheet programs or other applications. The following formats are supported:
- CSV (Comma Delimited) (*.csv)
-
The comma-separated values (CSV) format stores tabular data (numbers and text) in plain-text form. Most spreadsheet and data management software is able to import data in this format.
- Text (Tab Delimited) (*.txt;*.tsv;*.tab;*.raw)
-
The tab-separated values format is a popular method of data interchange among databases and spreadsheets. It stores tabular data (numbers and text) in plain-text form.
- Microsoft Excel Workbook (*.xls;*.xlsx;*.xlsm)
-
The native Microsoft Excel workbook format, which can be opened directly in Excel and compatible spreadsheet applications.
- Apache Arrow (*.arrow)
-
The Arrow format stores tabular data (numbers and text) in a form that allows data access without serialization overhead.
- Apache Parquet (*.parquet)
-
The Parquet format stores tabular data (numbers and text) in a compressed, efficient columnar data representation. It is popular in the Hadoop ecosystem.
- Import Settings…
-
Apply settings previously saved using Export Settings… to the current dataset.
- Export Settings…
-
Save all the settings so that they can be applied to another dataset using Import Settings….
- Page Setup…
-
Set up the formatting of the page.
- Print…
-
Print the current window.
- Close
-
Close the current window.
- Exit
-
Quit the High-D application.
Edit menu
- Reset
-
Reset the views to their default.
Select menu
- All
-
Select every non-filtered object.
- Inverse
-
Invert the selection.
- None
-
Select nothing.
Filter menu
- Selected
-
Filter out the selected objects.
- None
-
Unfilter what has been previously filtered.
- Reset
-
Reset the filtering.
Paint menu
- Color
-
Paint the selected objects with the given color.
- Reset
-
Reset the coloring.
Interaction menu
- Mode
-
- Selection
-
Selection mode.
- Filter
-
Filtering mode.
- Toggle
-
Toggle selection mode.
- DoNothing
-
Disabled interaction.
Options menu
- Rendering
-
- Density
-
Density-based drawing scheme.
- AlphaBlended
-
Alpha-blended drawing scheme.
- Opaque
-
Opaque drawing scheme.
- Antialiasing
-
Turn antialiasing on or off.
- Show Filtered
-
Show filtered objects.
- Geometry
-
- Polylines
-
Connect the points in the Parallel Coordinates view using polylines.
- Steps
-
Connect the points in the Parallel Coordinates view using steps.
- Polycurves
-
Connect the points in the Parallel Coordinates view using polycurves.
- Look and Feel
-
Change the look and feel of the application.
Create menu
- Scatter Plot
-
Create an additional Scatter Plot.
- Control Chart
-
Create a Control Chart.
Window menu
- Full Screen
-
Go into full-screen mode.
Help menu
- High-D Help
-
Read the High-D documentation.
- Check for Update…
-
Check for a new version of the software.
- Register…
-
Register the license key.
- About High-D…
-
Show information about the current version of High-D.
Status bar
Loading data
High-D can load data in various formats and from multiple data sources. The most common ways of importing your own data are tab-delimited or comma-separated files, as well as Excel workbooks. Connectivity to common relational databases and some on-line data providers is also provided.
File-based data sources
To load data files, either
-
use the Open… entry of the File menu, which opens a dialog to select the file to open,
Figure 2. File chooser dialog for selecting a data file
-
drag and drop a file with a known file extension onto the High-D application frame,
-
or double-click on the file if its extension is registered to High-D.
Macrofocus High-D (*.mhd)
This is the native format used by High-D. It can store a copy of the actual data, its original data source, and all the configurations made using the High-D application. The data are stored in a highly compressed binary format to reduce the file size, and all the configuration information is stored in XML format. For a detailed technical specification of the data format, please contact us.
Text (Tab delimited) (*.txt;*.tsv;*.tab;*.raw)
Loading data from tab-delimited text files is straightforward. High-D expects the first line to contain the name of each column, with the tab character separating the columns.
The tab-separated values format is a popular method of data interchange among databases and spreadsheets. It stores tabular data (numbers and text) in plain-text form. While it is a loosely defined format (even though IANA attempts to standardize it), High-D automatically detects its encoding, the type of data values, and handles smoothly all the most common causes of errors. Tab-delimited files are processed similarly to comma-delimited files, except that they use the tabulator character to separate each column.
High-D expects the first line to be a header containing the names of the columns in the file. These values will be used to name each of the variables. Each record is then located on a separate line. The values of each record are delimited by tabs. Each record "should" contain the same number of tab-separated fields. Any field may be quoted (with double quotes). Fields containing a line break, double quote, and/or tab should be quoted. A (double) quote character in a field must be represented by two (double) quote characters.
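The quoting rules above are the ones most CSV libraries already follow. As a minimal sketch, assuming Python's standard csv module and records invented purely for illustration, the following writes tab-delimited text whose fields contain a double quote, a tab, and a line break, then reads it back:

```python
import csv
import io

# Illustrative records only; the first row is the header naming each column.
rows = [
    ["Name", "Comment"],
    ["Alpha", 'He said "hi"'],        # field containing double quotes
    ["Beta", "two\tparts"],           # field containing a tab
    ["Gamma", "line one\nline two"],  # field containing a line break
]

buffer = io.StringIO()
# QUOTE_MINIMAL quotes only the fields that need it and doubles embedded quotes.
writer = csv.writer(buffer, delimiter="\t", quoting=csv.QUOTE_MINIMAL,
                    lineterminator="\n")
writer.writerows(rows)
text = buffer.getvalue()

# Reading the text back with the same delimiter recovers the original values.
parsed = list(csv.reader(io.StringIO(text), delimiter="\t"))
print(parsed == rows)
```

Plain fields such as "Alpha" stay unquoted, so the file remains readable in a text editor.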
After the file has been loaded, High-D will attempt to detect the data type of each column. Automatically recognized types are text (String), numbers (Integer and Double) and some more specialized types such as dates (supported formats are "MM/dd/yyyy", "MM/dd/yy", "yyyy-MM-dd", "dd.MM.yyyy HH:mm:ss"), URLs, geometries (in WKT format), and binary data (in Base64 format).
As an example, the following text file
Planet	Region	Spherical area	Radius in km	Discovery date	Wikipedia article
Mercury	Inner Solar System	18688458.19	2439		http://en.wikipedia.org/wiki/Mercury_(planet)
Venus	Inner Solar System	115066184.2	6052		http://en.wikipedia.org/wiki/Venus
Earth	Inner Solar System	127796483.1	6378		http://en.wikipedia.org/wiki/Earth
Mars	Inner Solar System	36274097.98	3398		http://en.wikipedia.org/wiki/Mars
Jupiter	Outer Solar System	16014816458	71398		http://en.wikipedia.org/wiki/Jupiter
Saturn	Outer Solar System	11309733553	60000		http://en.wikipedia.org/wiki/Saturn
Uranus	Outer Solar System	2026829916	25400	3/13/1781	http://en.wikipedia.org/wiki/Uranus
Neptune	Outer Solar System	1855079046	24300	9/23/1846	http://en.wikipedia.org/wiki/Neptune
Pluto	Outer Solar System	7547676.35	1550	2/18/1930	http://en.wikipedia.org/wiki/Pluto
will result in the following table being loaded in High-D:
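Planet | Region | Spherical area | Radius in km | Discovery date | Wikipedia article
Mercury | Inner Solar System | 18688458.19 | 2439 | | http://en.wikipedia.org/wiki/Mercury_(planet)
Venus | Inner Solar System | 115066184.2 | 6052 | | http://en.wikipedia.org/wiki/Venus
Earth | Inner Solar System | 127796483.1 | 6378 | | http://en.wikipedia.org/wiki/Earth
Mars | Inner Solar System | 36274097.98 | 3398 | | http://en.wikipedia.org/wiki/Mars
Jupiter | Outer Solar System | 16014816458 | 71398 | | http://en.wikipedia.org/wiki/Jupiter
Saturn | Outer Solar System | 11309733553 | 60000 | | http://en.wikipedia.org/wiki/Saturn
Uranus | Outer Solar System | 2026829916 | 25400 | 3/13/1781 | http://en.wikipedia.org/wiki/Uranus
Neptune | Outer Solar System | 1855079046 | 24300 | 9/23/1846 | http://en.wikipedia.org/wiki/Neptune
Pluto | Outer Solar System | 7547676.35 | 1550 | 2/18/1930 | http://en.wikipedia.org/wiki/Pluto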
While High-D will autodetect the character encoding used for representing international and special characters beyond ASCII characters, it is recommended to use the Unicode standards (typically UTF-8 or UTF-16).
To force High-D to parse the values of a column as a specific data type, an optional second header line can be inserted that gives the type of values to be expected for each column. Possible types are "String" for any type of textual information, "Integer" for numbers without a fractional or decimal component, "Float" and "Double" for single and double precision floating-point numbers, and "Color" to provide color information. Each subsequent line should contain the respective values for each of the columns.
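As an illustration of the optional type line, the sketch below uses Python's standard csv module to produce a small tab-delimited file with a second header row. The column selection and the two records are taken from the planets example; the variable names are hypothetical.

```python
import csv
import io

header = ["Planet", "Region", "Spherical area", "Radius in km"]
types = ["String", "String", "Double", "Integer"]  # optional second header line
records = [
    ["Mercury", "Inner Solar System", "18688458.19", "2439"],
    ["Venus", "Inner Solar System", "115066184.2", "6052"],
]

out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(header)    # line 1: column names
writer.writerow(types)     # line 2: one type name per column
writer.writerows(records)  # remaining lines: the data
typed_tsv = out.getvalue()
print(typed_tsv)
```

The type row must have exactly one entry per column, in the same order as the names on the first line.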
As an example, you can download the Forbes Global 2000 dataset in this format.
After the data file has been loaded into High-D, it will automatically attempt to create a default configuration.
CSV (Comma delimited) (*.csv)
The comma-separated values (CSV) format stores tabular data (numbers and text) in plain-text form. Most spreadsheet and data management software is able to export data in this format. While it is a loosely defined format (even though RFC 4180 attempts to standardize it), High-D automatically detects its encoding, the type of data values, and handles smoothly all the most common causes of errors. Comma-delimited files are processed similarly to tab-delimited files, except that they use a comma (or semicolon) to separate each column.
High-D expects the first line to be a header containing the names of the columns in the file. These values will be used to name each of the variables. Each record is then located on a separate line. The values of each record are delimited by commas (or semicolons). Each record "should" contain the same number of comma-separated fields. Any field may be quoted (with double quotes). Fields containing a line break, double quote, and/or comma should be quoted. A (double) quote character in a field must be represented by two (double) quote characters.
After the file has been loaded, High-D will attempt to detect the data type of each column. Automatically recognized types are text (String), numbers (Integer and Double) and some more specialized types such as dates (supported formats are "MM/dd/yyyy", "MM/dd/yy", "yyyy-MM-dd", "dd.MM.yyyy HH:mm:ss"), URLs, geometries (in WKT format), and binary data (in Base64 format).
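To make the idea of per-column type detection concrete, here is a rough sketch in Python. It is not High-D's algorithm: the order of the checks, the helper names, and the URL test are assumptions, and only a subset of the listed types is covered. The date patterns are the ones quoted above, translated to strptime syntax.

```python
from datetime import datetime

# "MM/dd/yyyy", "MM/dd/yy", "yyyy-MM-dd", "dd.MM.yyyy HH:mm:ss" in strptime syntax.
DATE_FORMATS = ["%m/%d/%Y", "%m/%d/%y", "%Y-%m-%d", "%d.%m.%Y %H:%M:%S"]

def _is_int(value):
    try:
        int(value)
        return True
    except ValueError:
        return False

def _is_float(value):
    try:
        float(value)
        return True
    except ValueError:
        return False

def _is_date(value):
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            pass
    return False

def _is_url(value):
    # Simplistic check, assumed for illustration only.
    return value.startswith(("http://", "https://"))

def guess_column_type(values):
    """Return a type name for a column, ignoring blank (null) values."""
    present = [v for v in values if v != ""]
    if not present:
        return "String"
    checks = [("Integer", _is_int), ("Double", _is_float),
              ("Date", _is_date), ("URL", _is_url)]
    for name, check in checks:
        if all(check(v) for v in present):
            return name
    return "String"

print(guess_column_type(["", "3/13/1781", "9/23/1846"]))  # Date
```

A column is given the first type that every non-blank value satisfies, and falls back to String otherwise.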
As an example, the following text file
Planet,Region,Spherical area,Radius in km,Discovery date,Wikipedia article
Mercury,Inner Solar System,18688458.19,2439,,http://en.wikipedia.org/wiki/Mercury_(planet)
Venus,Inner Solar System,115066184.2,6052,,http://en.wikipedia.org/wiki/Venus
Earth,Inner Solar System,127796483.1,6378,,http://en.wikipedia.org/wiki/Earth
Mars,Inner Solar System,36274097.98,3398,,http://en.wikipedia.org/wiki/Mars
Jupiter,Outer Solar System,16014816458,71398,,http://en.wikipedia.org/wiki/Jupiter
Saturn,Outer Solar System,11309733553,60000,,http://en.wikipedia.org/wiki/Saturn
Uranus,Outer Solar System,2026829916,25400,3/13/1781,http://en.wikipedia.org/wiki/Uranus
Neptune,Outer Solar System,1855079046,24300,9/23/1846,http://en.wikipedia.org/wiki/Neptune
Pluto,Outer Solar System,7547676.35,1550,2/18/1930,http://en.wikipedia.org/wiki/Pluto
will result in the following table being loaded in High-D:
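Planet | Region | Spherical area | Radius in km | Discovery date | Wikipedia article
Mercury | Inner Solar System | 18688458.19 | 2439 | | http://en.wikipedia.org/wiki/Mercury_(planet)
Venus | Inner Solar System | 115066184.2 | 6052 | | http://en.wikipedia.org/wiki/Venus
Earth | Inner Solar System | 127796483.1 | 6378 | | http://en.wikipedia.org/wiki/Earth
Mars | Inner Solar System | 36274097.98 | 3398 | | http://en.wikipedia.org/wiki/Mars
Jupiter | Outer Solar System | 16014816458 | 71398 | | http://en.wikipedia.org/wiki/Jupiter
Saturn | Outer Solar System | 11309733553 | 60000 | | http://en.wikipedia.org/wiki/Saturn
Uranus | Outer Solar System | 2026829916 | 25400 | 3/13/1781 | http://en.wikipedia.org/wiki/Uranus
Neptune | Outer Solar System | 1855079046 | 24300 | 9/23/1846 | http://en.wikipedia.org/wiki/Neptune
Pluto | Outer Solar System | 7547676.35 | 1550 | 2/18/1930 | http://en.wikipedia.org/wiki/Pluto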
While High-D will autodetect the character encoding used for representing international and special characters beyond ASCII characters, it is recommended to use the Unicode standards (typically UTF-8 or UTF-16).
To force High-D to parse the values of a column as a specific data type, an optional second header line can be inserted that gives the type of values to be expected for each column. Possible types are "String" for any type of textual information, "Integer" for numbers without a fractional or decimal component, "Float" and "Double" for single and double precision floating-point numbers, and "Color" to provide color information. Each subsequent line should contain the respective values for each of the columns.
As an example, you can download the Forbes Global 2000 dataset in this format.
After the data file has been loaded into High-D, it will automatically attempt to create a default configuration.
Microsoft Excel Workbook (*.xls;*.xlsx;*.xlsm)
High-D can read files produced by Microsoft Excel, including the recent Office Open XML format, even without having Excel installed on the local computer. The first row is expected to contain the name of each column. If the workbook contains multiple sheets, a dialog lets you choose which one should be loaded by High-D.
To force High-D to parse the values of a column as a specific data type, an optional second header row can be inserted that gives the type of values to be expected for each column. Possible types are "String" for any type of textual information, "Integer" for numbers without a fractional or decimal component, "Float" and "Double" for single and double precision floating-point numbers, and "Color" to provide color information. Each subsequent row should contain the respective values for each of the columns.
As an example, you can download the Forbes Global 2000 dataset in this format.
ODF Spreadsheet (*.ods)
High-D can read files in the native OpenOffice and LibreOffice format.
SPSS (*.sav)
High-D can read files in the native SPSS format.
SAS (*.sas7bdat)
High-D can read files in the native SAS format.
ESRI Shapefile (*.shp)
This is a popular geospatial vector data format for geographic information systems (GIS) software. Shapefiles spatially describe features: points, lines, and polygons, representing, for example, water wells, rivers, and lakes. Each item usually has attributes that describe it, such as name or temperature.
Apache Arrow (*.arrow)
High-D can read files in the Apache Arrow format.
Apache Parquet (*.parquet)
High-D can read files in the Apache Parquet format.
Microsoft Access (*.mdb;*.accdb)
Access database tables can directly be loaded into High-D. However, this is only supported on the Windows platform and requires Microsoft Access or the Microsoft Access Database Engine to be installed.
Database connectivity
High-D can directly import data from popular relational database servers installed on the local computer or on a remote machine. Currently supported are:
-
MySQL
-
Oracle
-
Microsoft SQL Server
-
PostgreSQL
-
IBM DB2
-
SAP MaxDB
-
PostGIS
Please contact support if your database system is not currently supported. Any data source queryable through a JDBC driver can easily be integrated into High-D.
Microsoft Access is also supported, but as a file-based data source.
To start importing data from a database, use the Open Database… entry of the File menu. This will open a dialog to define the required parameters.
On-line data sources
Stock quote data from Yahoo Finance can be accessed directly through the Open Dataset submenu of the File menu, as can all the example datasets provided on our website. This menu entry also provides integration with High-D Server.
Automatic default configuration
By default, High-D automatically assigns the first categorical variable to the label, the second categorical variable (if available) to the grouping, the first numerical variable to the size, and the second numerical variable (if available) to the color.
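The assignment rule can be expressed in a few lines. The following Python sketch only mirrors the rule as stated above; the function name, the role names used as dictionary keys, and the "categorical"/"numerical" tags are hypothetical and not part of High-D.

```python
def default_configuration(columns):
    """columns: list of (name, kind) pairs, kind being "categorical" or "numerical"."""
    categorical = [name for name, kind in columns if kind == "categorical"]
    numerical = [name for name, kind in columns if kind == "numerical"]

    def pick(names, index):
        # None when the dataset has no such variable ("if available").
        return names[index] if len(names) > index else None

    return {
        "label": pick(categorical, 0),
        "grouping": pick(categorical, 1),
        "size": pick(numerical, 0),
        "color": pick(numerical, 1),
    }

config = default_configuration([
    ("Planet", "categorical"),
    ("Region", "categorical"),
    ("Spherical area", "numerical"),
    ("Radius in km", "numerical"),
])
print(config)
```

For the planets example, Planet becomes the label, Region the grouping, Spherical area the size, and Radius in km the color.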
Data types
All data types support null (blank) values. Supported types are:
Text
-
String
-
Represents character strings such as "abc".
-
StringPath
-
Represents an array of character strings. Values should be delimited by commas.
-
HtmlString
-
Represents a tagged string in HTML format.
Numbers
-
Byte
-
The Byte data type is an 8-bit signed two’s complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive).
-
Short
-
The Short data type is a 16-bit signed two’s complement integer. It has a minimum value of -32,768 and a maximum value of 32,767 (inclusive).
-
Integer
-
The Integer data type is a 32-bit signed two’s complement integer. It has a minimum value of -2,147,483,648 and a maximum value of 2,147,483,647 (inclusive). For integral values, this data type is generally the default choice. It will most likely be large enough for the numbers in your data, but if you need a wider range of values, use Long instead.
-
Long
-
The Long data type is a 64-bit signed two’s complement integer. It has a minimum value of -9,223,372,036,854,775,808 and a maximum value of 9,223,372,036,854,775,807 (inclusive). Use this data type when you need a range of values wider than those provided by Integer.
-
Float
-
The Float data type is a single-precision 32-bit IEEE 754 floating-point number. Its range of values is beyond the scope of this discussion, but it offers about 7 significant decimal digits of precision. This data type should never be used for precise values, such as currency. For that, use the BigDecimal type instead.
-
Double
-
The Double data type is a double-precision 64-bit IEEE 754 floating-point number. Its range of values is beyond the scope of this discussion, but it offers about 15 significant decimal digits of precision. For decimal values, this data type is generally the default choice. As mentioned above, this data type should never be used for precise values, such as currency.
-
BigDecimal
-
An arbitrary-precision signed decimal number.
-
StringDouble
-
A Double data type with support for formatting patterns.
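The integer limits and the warning about floating-point types for currency can be checked in any language with IEEE 754 arithmetic. The Python sketch below is an analogy rather than a description of High-D internals: Python's float is a 64-bit double, and decimal.Decimal plays a role comparable to BigDecimal. The helper signed_range is a hypothetical name.

```python
from decimal import Decimal

def signed_range(bits):
    """Minimum and maximum of a signed two's complement integer of the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

for name, bits in [("Byte", 8), ("Short", 16), ("Integer", 32), ("Long", 64)]:
    print(name, signed_range(bits))

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum differs slightly from 0.3.
binary_sum = 0.1 + 0.2
print(binary_sum == 0.3)

# Decimal arithmetic keeps these values exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))
```

This is why an arbitrary-precision decimal type is the safer choice for amounts of money.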
Others
-
Boolean
-
The Boolean data type has only two possible values: true and false. Use this data type for simple flags that track true/false conditions. This data type represents one bit of information.
-
Date
-
Represents a specific instant in time, with millisecond precision.
-
Color
-
The Color data type is used to encapsulate colors in the default sRGB color space. Every color has an implicit alpha value of 1.0 unless an explicit one is provided. The alpha value defines the transparency of a color and can be represented by a float value in the range 0.0 - 1.0 or an integer value in the range 0 - 255. An alpha value of 1.0 or 255 means that the color is completely opaque, and an alpha value of 0.0 or 0 means that the color is completely transparent. The color components are never premultiplied by the alpha component.
-
Icon
-
A small fixed size picture, typically used to decorate components.
-
Image
-
Represents graphical images.
-
URL
-
The URL data type represents a Uniform Resource Locator, a pointer to a "resource" on the World Wide Web. A resource can be something as simple as a file or a directory, or it can be a reference to a more complicated object, such as a query to a database or to a search engine. More information on the types of URLs and their formats can be found in the URL Specification.
-
File
-
A representation of file and directory pathnames.
-
byte[]
-
For binary data.
-
Geometry
-
Represents geometric information, such as points, lines, and polygons.
Axes panel
The Axes panel allows the customization of each of the available variables.
In the upper part, all the variables included in the dataset are listed.
Selecting one variable can be performed by clicking on its row. Adding to or removing variables from the selection can be done by holding the Ctrl key down while selecting them. Multiple adjoining variables can be selected using the mouse while holding down the Alt key.
In the bottom part, you can apply customizations to each of the selected variables:
Visibility
- Categories
-
Whether to show the variables in the Categories view.
- Visualizations
-
Whether to show the variable in the Parallel Coordinates, Table Plot, Distributions, Scatter Plot Matrix, and Parallel Coordinates Matrix views.
Scale
- Minimum
-
The lower end of the scale.
- Maximum
-
The upper end of the scale.
Filter
- Start
-
Items with a value below this threshold will be filtered out.
- End
-
Items with a value above this threshold will be filtered out.
- Set to Visible Range
-
Set the scale to the minimum and maximum values of the non-filtered items.
- Set to Range Slider
-
Set the scale to the current range of the Parallel Coordinates sliders.
- Make Common Range
-
Compute the overall minimum and maximum values of the selected variables.
- Make Symetrical around Mean
-
The center of the scale will be the mean value.
- Make Symetrical Range around 0
-
The center of the scale will be at 0.
- Round Range Values
-
Will round the scale limits using powers of 10.
- Reset to Data Range
-
Will set the scale to the minimum and maximum values found in the data.
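Several of the scale commands above are simple range computations. A minimal sketch in Python follows; the function names are hypothetical and the code makes no claim to mirror High-D's implementation.

```python
from statistics import mean

def data_range(values):
    """Reset to Data Range: minimum and maximum found in the data."""
    return min(values), max(values)

def common_range(*columns):
    """Make Common Range: overall minimum and maximum of several variables."""
    lows, highs = zip(*(data_range(column) for column in columns))
    return min(lows), max(highs)

def symmetric_range(low, high, center=0.0):
    """Widen (low, high) so that `center` ends up in the middle of the scale."""
    half = max(center - low, high - center)
    return center - half, center + half

values = [1, 2, 6]
low, high = data_range(values)
print(symmetric_range(low, high))                        # around 0
print(symmetric_range(low, high, center=mean(values)))   # around the mean
```

Passing the mean as the center corresponds to the symmetrical-around-mean command, and the default center of 0 to the symmetrical-around-0 command.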
Distribution
The following settings control how values are classified into bins. They apply to the Distributions view only.
- Type of binning
-
Auto will automatically attempt to find a good number of bins, or will use the number specified below. All bins have the same size. With Sigma, on the other hand, the size of each bin depends on the standard deviation. A value of 6 will yield the typical Six Sigma (6σ) split often found in process improvement analysis.
- Number of bins
-
A value higher than 0 fixes the number of bins; otherwise the number of bins is determined empirically.
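The two binning types can be sketched as follows. The equal-width case follows directly from the description. The sigma case is only one plausible reading, assumed here for illustration: bins one standard deviation wide, centered on the mean, so that 6 bins span the mean ± 3σ. Both function names are hypothetical.

```python
from statistics import mean, pstdev

def equal_width_edges(values, bins):
    """Bin edges for `bins` bins of identical width covering the data range."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    return [low + i * width for i in range(bins + 1)]

def sigma_edges(values, bins=6):
    """Assumed reading of Sigma binning: each bin is one standard deviation wide."""
    m, s = mean(values), pstdev(values)
    half = bins / 2
    return [m + (i - half) * s for i in range(bins + 1)]

print(equal_width_edges([0, 10], 5))
print(sigma_edges([1, 3]))
```

With six sigma bins, the first and last edges sit three standard deviations below and above the mean.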
Axis reordering
You can reorder the selected axes based on their similarity. To do so, select some variables and hit the Reorder button.
You can also manually reorder the axes by dragging them in the Parallel Coordinates view.
Configuration panel
High-D possesses a powerful layout, data processing, and rendering engine that offers a vast choice of customization possibilities. The configuration panel gives instant access to all the key settings, where each section can be further expanded to expose the full palette of choices to fine-tune the appearance of the various views.
Color
The Color drop down list gives the possibility of selecting which variable should be used for coloring the shapes.
- Import Colormap…
-
Import a colormap definition and apply it to the currently selected color variable.
- Export Colormap…
-
Export the colormap definition of the currently selected color variable.
- Copy Graphics
-
Copy the colormap to the clipboard.
- Export Graphics…
-
Export the colormap to a raster or vector-based graphic format.
- Print…
-
Print the colormap.
Categorical colormap
If a categorical variable is selected, then colors are automatically assigned to each value. Each color can be individually changed by clicking on its color cell.
- Missing Value Color
-
If the data contains missing values, then their color can be edited here.
- Reset
-
Reassigns all the values to their default color.
Predefined colormap
If a numerical variable is selected, High-D offers the possibility of setting the lowest and highest values that should be mapped to the selected colormap. If the variable contains negative values, the range is automatically made symmetric.
- Palette
-
A color palette can be selected from a wide range of predefined color palettes.
- Maximum
-
Sets the upper bound of the colormap.
- Minimum
-
Sets the lower bound of the colormap.
- Set to Data Range
-
Set the minimum and maximum of the colormap to the minimum and maximum values of the data.
- Set to Symmetrical Range around 0
-
Will make the colormap symmetrical should it contain negative values.
- Set to Rounded Range
-
Will round the minimum and maximum values to the next power of 10.
- Number of Steps
-
Can be used to segment the palette into a specified number of discrete colors.
- Inverted
-
Invert the colormap.
- Brightness
-
The color luminance can be adjusted by increasing or decreasing its brightness.
- Saturation
-
The color intensity can be adjusted by increasing or decreasing its saturation.
- Overflow Color
-
If some of the data values fall above the upper threshold, then their color can be edited here.
- Underflow Color
-
If some of the data values fall below the lower threshold, then their color can be edited here.
- Missing Value Color
-
If the data contains missing values, then their color can be edited here.
Custom colormap
For more customization possibilities, it is also possible to define a custom colormap by setting thresholds at given values. High-D will take care of interpolating the colors if Ramps mode is selected, or will make them valid for the whole range in Steps mode.
- Threshold
-
Defines the threshold at which the associated color starts to apply: values equal to or above the threshold are assigned that color. A threshold can be removed by setting its color to None. New thresholds can be added by specifying a value in the last entry of the table. The color associated with the new threshold will be automatically extrapolated from the current colormap definition.
- Color
-
The color associated with each threshold.
- Ramps/Steps
-
Indicates whether the values should be interpolated within the threshold ranges (Ramps) or made discrete (Steps).
- Brightness
-
The color luminance can be adjusted by increasing or decreasing its brightness.
- Saturation
-
The color intensity can be adjusted by increasing or decreasing its saturation.
- Overflow Color
-
If some of the data values fall above the upper threshold, then their color can be edited here.
- Underflow Color
-
If some of the data values fall below the lower threshold, then their color can be edited here.
- Missing Value Color
-
If the data contains missing values, then their color can be edited here.
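The difference between the Ramps and Steps modes can be sketched in Python as below. This is an illustration under assumptions, not High-D's code: the function name is hypothetical, colors are plain (r, g, b) tuples, and interpolation is linear in RGB. Values below the lowest threshold return None here, standing in for the underflow color.

```python
from bisect import bisect_right

def color_at(value, thresholds, colors, mode="Ramps"):
    """thresholds: ascending numbers; colors: matching (r, g, b) tuples."""
    i = bisect_right(thresholds, value) - 1  # index of the last threshold <= value
    if i < 0:
        return None  # below the lowest threshold
    if mode == "Steps" or i == len(thresholds) - 1:
        # Steps: the color is valid for the whole range up to the next threshold.
        return colors[i]
    # Ramps: interpolate linearly towards the color of the next threshold.
    t = (value - thresholds[i]) / (thresholds[i + 1] - thresholds[i])
    return tuple(round(a + t * (b - a)) for a, b in zip(colors[i], colors[i + 1]))

thresholds = [0, 100]
colors = [(0, 0, 0), (200, 100, 0)]
print(color_at(50, thresholds, colors, mode="Steps"))
print(color_at(50, thresholds, colors, mode="Ramps"))
```

In Steps mode the value 50 keeps the color of the threshold at 0, while in Ramps mode it receives a color halfway between the two threshold colors.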
Rendering
More options can be customized in the Rendering pane. Each visualization can be rendered using the AlphaBlended, Density, or Opaque drawing schemes.
- Antialiasing
-
Turns antialiased drawing on or off.
- Show Filtered
-
Makes filtered items visible.
- Geometry
-
Points in the parallel coordinates plots can be connected using Polylines, Steps, or Polycurves.
Legend
The legend shows a graphical depiction of the color scale as well as a textual description of the main options that have been selected. It can be exported through the context menu (opened by right-clicking):
- Copy Graphics
-
Copy the legend to the clipboard.
- Export Graphics…
-
Export the legend to a raster or vector-based graphic format.
- Print…
-
Print the legend.
Parallel Coordinates view
Parallel coordinates work by drawing one vertical axis per data column and displaying each row as a series of connected points along the axes. Using our innate pattern-recognition abilities, this enables spotting multivariate relations in a blink. Parallel coordinates have been popularised and systematically developed by Alfred Inselberg [Inselberg2009]. Thanks to the unique approach taken by High-D to use density-based rendering to avoid overplotting, and the choice between straight and curved geometries, relations and trends emerge immediately.
At the bottom of the user interface, you will find the Parallel Coordinates view that corresponds to the chosen settings in the Configuration and Axes panels. Each item is represented by a polyline or polycurve.
Moving the mouse over a shape will display a pop-up window (also called a tooltip) that shows the values configured in the "Labels" section of the "Configuration" panel.
Probing and selection
Selection can be performed by clicking on a shape. Adding to or removing items from the selection can be done by holding the Ctrl key down while clicking on shapes.
Multiple adjoining items can be selected using a rubberband which is activated by dragging the mouse while holding down the Alt key.
Filtering
Items can be filtered by using the range sliders embedded in the Parallel Coordinates view. The range of an attribute can be specified by moving the handles on the top and bottom of the corresponding range slider. Items whose value for that attribute falls outside of the specified range are filtered out and cannot be interacted with anymore. Their "ghosts" remain visible, though, and appear greyed out. Use a combination of range sliders to dynamically formulate complex queries.
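Combining several range sliders amounts to a conjunction of range conditions: an item stays visible only if it lies inside every slider's range. A small Python sketch, using hypothetical names and a few records from the planets example:

```python
def apply_range_filters(rows, ranges):
    """rows: list of dicts; ranges: {attribute: (low, high)} as set by the sliders."""
    visible, ghosts = [], []
    for row in rows:
        inside = all(low <= row[name] <= high
                     for name, (low, high) in ranges.items())
        (visible if inside else ghosts).append(row)
    return visible, ghosts

planets = [
    {"Planet": "Mercury", "Radius in km": 2439, "Spherical area": 18688458.19},
    {"Planet": "Venus", "Radius in km": 6052, "Spherical area": 115066184.2},
    {"Planet": "Earth", "Radius in km": 6378, "Spherical area": 127796483.1},
    {"Planet": "Jupiter", "Radius in km": 71398, "Spherical area": 16014816458},
]
sliders = {"Radius in km": (2000, 7000), "Spherical area": (0, 120_000_000)}

visible, ghosts = apply_range_filters(planets, sliders)
print([p["Planet"] for p in visible])
print([p["Planet"] for p in ghosts])
```

Earth passes the radius slider but fails the area slider, so it ends up among the greyed-out ghosts together with Jupiter.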
Axes reordering
An axis can be moved to a different position by dragging its label and dropping it at the desired position. Automatic reordering can be performed through the Axes panel.
TablePlot view
The TablePlot view works by having rows sorted by one variable to visually spot how the increase in value is correlate to the values of other variables. To obtain the complete picture, sorting through each variable is necessary.
At the bottom of the user interface, you will find the TablePlot view that corresponds to the chosen settings in the Configuration and Axes panels. For each axis, each item is represented by a line whose width is proportional to its value.
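The principle of sorting by one variable and drawing widths proportional to the values can be sketched as below. The function name `table_plot_rows`, the `max_width` parameter, and the assumption of non-negative values scaled against each column's maximum are all illustrative choices, not a description of how High-D computes its display.

```python
def table_plot_rows(rows, sort_by, columns, max_width=100):
    """Return one list of line widths per item, ordered by `sort_by`.

    Widths are proportional to the value, with the column maximum
    mapped to `max_width`. Values are assumed to be non-negative.
    """
    ordered = sorted(rows, key=lambda r: r[sort_by])
    peaks = {c: max(r[c] for r in rows) for c in columns}
    return [
        [round(max_width * r[c] / peaks[c]) if peaks[c] else 0 for c in columns]
        for r in ordered
    ]


# Hypothetical sample data in which "b" decreases as "a" increases.
data = [{"a": 2, "b": 10}, {"a": 1, "b": 20}, {"a": 4, "b": 5}]
widths = table_plot_rows(data, sort_by="a", columns=["a", "b"])
```

After sorting by `a`, the widths in the `b` column shrink from top to bottom, which is the kind of visual cue that reveals a negative relation.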
Moving the mouse over a shape will display a pop-up window (also called a tooltip) that shows the values configured in the "Labels" section of the "Configuration" panel.
Probing and selection
Selection can be performed by clicking on a shape. Adding to or removing items from the selection can be done by holding the Ctrl key down while clicking on shapes.
Multiple adjoining items can be selected using a rubberband which is activated by dragging the mouse while holding down the Alt key.
Distributions view
The Distributions view shows how values are distributed for each variable.
At the top of the user interface, you will find the Distributions view that corresponds to the chosen settings in the Configuration and Axes panels. For each axis, items are grouped into bins whose width is proportional to the number of values.
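Grouping the values of one variable into bins can be sketched as follows. Equal-width bins between the minimum and maximum are an assumption made for this example; the guide does not specify how High-D chooses its bins, and the function name `bin_counts` is hypothetical.

```python
def bin_counts(values, bin_count):
    """Count how many values fall into each of `bin_count` equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for v in values:
        if width == 0:
            index = 0  # all values are identical
        else:
            # The maximum value is placed in the last bin.
            index = min(int((v - low) / width), bin_count - 1)
        counts[index] += 1
    return counts


# Hypothetical sample values for a single variable.
counts = bin_counts([1, 2, 2, 3, 5, 9, 10], bin_count=3)
```

Each count would then determine the drawn width of the corresponding bin.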
Moving the mouse over a shape will display a pop-up window (also called a tooltip) that shows the values configured in the "Labels" section of the "Configuration" panel.
Probing and selection
Selection can be performed by clicking on a shape. Adding to or removing items from the selection can be done by holding the Ctrl key down while clicking on shapes.
Multiple adjoining items can be selected using a rubberband which is activated by dragging the mouse while holding down the Alt key.
Scatter Plot Matrix view
The Scatter Plot Matrix view allows you to create a view containing a scatter plot for each pairwise combination of variables.
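As a small illustration of what "each pairwise combination" means, the sketch below enumerates the unordered pairs of variables. The function name and the variable names are hypothetical, and whether High-D also shows the mirrored or diagonal cells is not stated in this guide.

```python
from itertools import combinations


def scatter_plot_pairs(variables):
    """Return every unordered pair of variables, one per scatter plot."""
    return list(combinations(variables, 2))


pairs = scatter_plot_pairs(["mpg", "hp", "weight", "year"])
```

With n variables there are n(n-1)/2 such pairs, so the number of plots grows quadratically with the number of variables.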
Parallel Coordinates Matrix view
The Parallel Coordinates Matrix view extends the parallel coordinates idea by providing a view of every pairwise relation between variables. By relying on our innate pattern-recognition abilities, it makes it possible to spot correlations at a glance. Thanks to its density-based rendering, which avoids overplotting, and the choice between straight and curved geometries, relations emerge immediately.
At the bottom of the user interface, you will find the Parallel Coordinates Matrix view that corresponds to the chosen settings in the Configuration and Axes panels. Each item is represented by a polyline or polycurve.
Moving the mouse over a shape will display a pop-up window (also called a tooltip) that shows the values configured in the "Labels" section of the "Configuration" panel.
Probing and selection
Selection can be performed by clicking on a shape. Adding to or removing items from the selection can be done by holding the Ctrl key down while clicking on shapes.
Multiple adjoining items can be selected using a rubberband which is activated by dragging the mouse while holding down the Alt key.
Scatter Plot view
The Scatter Plot view allows you to create a scatter plot of the data. Any combination of numerical variables can be used to map to the x- and y-axes as well as size and color of the glyphs.
Configuration
To configure which of the numerical variables should be mapped to the x- and y-axes, use the drop-down lists located at the end of the axes.
The color and size of the markers are determined in the same way as for the other views, i.e. by the selections made in the drop-down lists of the Configuration panel.
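Mapping a numerical variable to a visual property such as marker size typically amounts to rescaling the data range to an output range. The sketch below uses a linear mapping; this choice, the function name `linear_scale`, and the pixel sizes are assumptions for illustration and are not taken from High-D.

```python
def linear_scale(value, domain, output):
    """Linearly map `value` from the data range `domain` to `output`."""
    d0, d1 = domain
    o0, o1 = output
    if d1 == d0:
        # A constant variable maps to the middle of the output range.
        return (o0 + o1) / 2
    t = (value - d0) / (d1 - d0)
    return o0 + t * (o1 - o0)


# A value of 5 in a variable ranging from 0 to 10, drawn with marker
# sizes between 4 and 20 pixels (hypothetical numbers).
size = linear_scale(5, domain=(0, 10), output=(4, 20))
```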
Zooming
You can zoom in by using the range sliders on the top and to the right of the display area. You can zoom out by double-clicking anywhere on the slider.
And of course the mouse wheel also works.
Probing and selection
Selection can be performed by clicking on a marker. Adding to or removing items from the selection can be done by holding the Ctrl key down while clicking on markers.
Multiple adjoining items can be selected using a rubberband which is activated by dragging the mouse while holding down the Alt key.
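The selection behaviour described above can be modelled with two small functions. This is a sketch under stated assumptions: the function names, the item identifiers, and the reading of Ctrl+click as a toggle of a single item are illustrative and do not reflect High-D's source code.

```python
def rubberband_select(points, corner_a, corner_b):
    """Return the keys of all points inside the dragged rectangle.

    `points` maps an item key to its (x, y) position; the two corners
    may be given in any order.
    """
    (ax, ay), (bx, by) = corner_a, corner_b
    x0, x1 = min(ax, bx), max(ax, bx)
    y0, y1 = min(ay, by), max(ay, by)
    return {k for k, (x, y) in points.items() if x0 <= x <= x1 and y0 <= y <= y1}


def click_select(selection, item, ctrl_down):
    """Plain click replaces the selection; Ctrl+click toggles the item."""
    if not ctrl_down:
        return {item}
    return selection ^ {item}


# Hypothetical marker positions.
markers = {"a": (1, 1), "b": (2, 3), "c": (8, 8)}
selected = rubberband_select(markers, (3, 4), (0, 0))  # drag up and left
selected = click_select(selected, "c", ctrl_down=True)   # add "c"
selected = click_select(selected, "a", ctrl_down=True)   # remove "a"
```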
Multidimensional Scaling view
The Multidimensional Scaling view allows you to create a two-dimensional projection of the multidimensional data that attempts to capture the main relationships: items close together in the view are similar in the high-dimensional space, while dissimilar ones are placed further apart.
Computation
The process is iterative and can be started using the Start button. When the layout has reached the desired stability, hitting the Stop button will terminate the computation.
Several dimensionality reduction algorithms are provided, including Sammon's mapping [Sammon1969], a Spring-based layout [Fruchterman1991], t-Distributed Stochastic Neighbor Embedding (t-SNE) [Maaten2008], and Principal Component Analysis (PCA) [Pearson1901].
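To give an idea of what an iterative layout tries to achieve, the sketch below computes the stress that Sammon's mapping minimizes: the mismatch between pairwise distances in the original space and in the two-dimensional layout, weighted by the inverse of the original distance. The function name `sammon_stress` and the sample points are illustrative, and how High-D itself measures the stability of a layout is not documented here.

```python
from itertools import combinations
from math import dist


def sammon_stress(high_dim, low_dim):
    """Sammon stress of a layout: 0 means all distances are preserved."""
    total, error = 0.0, 0.0
    for i, j in combinations(range(len(high_dim)), 2):
        d_high = dist(high_dim[i], high_dim[j])
        d_low = dist(low_dim[i], low_dim[j])
        if d_high == 0:
            continue  # identical items carry no distance information
        total += d_high
        error += (d_high - d_low) ** 2 / d_high
    return error / total if total else 0.0


# Hypothetical data: three items lying on a line in 3D.
items = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
good_layout = [(0, 0), (1, 0), (2, 0)]   # preserves every distance
bad_layout = [(0, 0), (0, 0), (0, 0)]    # collapses all items
```

An iterative algorithm would move the points of the layout step by step so that this value decreases, and the layout can be considered stable once it no longer changes noticeably.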
Zooming
You can zoom in by using the range sliders on the top and to the right of the display area. You can zoom out by double-clicking anywhere on the slider.
And of course the mouse wheel also works.
Probing and selection
Selection can be performed by clicking on a marker. Adding to or removing items from the selection can be done by holding the Ctrl key down while clicking on markers.
Multiple adjoining items can be selected using a rubberband which is activated by dragging the mouse while holding down the Alt key.
TreeMap view
The TreeMap view shows how values are distributed for each variable.
At the top of the user interface, you will find the TreeMap view that corresponds to the chosen settings in the Configuration and Axes panels. For each axis, items are grouped into bins whose width is proportional to the number of values.
Moving the mouse over a shape will display a pop-up window (also called a tooltip) that shows the values configured in the "Labels" section of the "Configuration" panel.
Probing and selection
Selection can be performed by clicking on a shape. Adding to or removing items from the selection can be done by holding the Ctrl key down while clicking on shapes.
Multiple adjoining items can be selected using a rubberband which is activated by dragging the mouse while holding down the Alt key.
CartoPlot view
When the data contains geographical features (longitude/latitude coordinates, or geometrical objects such as lines and polygons), the CartoPlot view shows the items on top of geographical tiles obtained from online map services.
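Many online map services address their tiles by zoom level and x/y index in the Web Mercator projection. The sketch below shows how a longitude/latitude pair is commonly converted to such a tile index. That the map services used by High-D follow this scheme is an assumption, and the function name `tile_index` is hypothetical.

```python
import math


def tile_index(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles along each side at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# The point at longitude 0, latitude 0 at zoom level 1 (a 2 x 2 tile grid).
tile = tile_index(0.0, 0.0, 1)
```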
Zooming
You can zoom in by using the range sliders on the top and to the right of the display area. You can zoom out by double-clicking anywhere on the slider.
And of course the mouse wheel also works.
Probing and selection
Selection can be performed by clicking on a marker. Adding to or removing items from the selection can be done by holding the Ctrl key down while clicking on markers.
Multiple adjoining items can be selected using a rubberband which is activated by dragging the mouse while holding down the Alt key.
Saving settings, data, and graphics
High-D can save the data along with the settings applied to the visualization in its own data format (file with .mhd extension). For this, use the File > Save or File > Save As… menu entries. To save only the settings and have the data file referenced instead of embedded, you can produce such a file as follows:
-
Open a data file (Excel for example)
-
Modify all the parameters as desired
-
Do
-
Select Macrofocus High-D (*.mhd) as file type
When you open the resulting file (e.g. using .mhd
file. It can be opened using any text editor.
Exporting graphics
You can also export the currently active High-D view using the following schemes:
-
Using
: the current view is exported in vector or raster form in one of the following supported formats:
- PDF (Portable Document Format) (*.pdf)
-
The resulting document is ideal for printing or inclusion in a report. It is a vector format and therefore resolution independent.
- Scalable Vector Graphics (*.svg)
-
The resulting document is ideal for further editing and for inclusion into another document. It is a vector format and therefore resolution independent. Scalable Vector Graphics (SVG) can be displayed by many web browsers with an embedded SVG viewer, or edited by any application supporting SVG (such as Adobe Illustrator).
- PostScript (*.ps)
-
A common vector format and therefore resolution independent. Can be used for printing.
- EMF (Enhanced Metafile) (*.emf)
-
A resolution-independent format common on the Windows platform.
- PNG (Portable Network Graphics) (*.png)
-
A raster format.
- JPEG (*.jpg)
-
A raster format.
- Compuserve GIF (*.gif)
-
A raster format.
- TIFF (Tagged Image File Format) (*.tiff)
-
A raster format.
All raster export formats allow setting the desired DPI for high-quality output.
-
Using
: the current view is put into the clipboard in bitmap format (and can be pasted into applications such as Microsoft Powerpoint).
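For the raster formats, the DPI setting determines how many pixels the exported image contains for a given physical size. As a rough sketch, assuming the target size is known in inches (the function name `raster_size` and the numbers are illustrative and not part of High-D):

```python
def raster_size(width_in, height_in, dpi):
    """Return the pixel dimensions of an image of the given size in inches."""
    return round(width_in * dpi), round(height_in * dpi)


# A 6 x 4 inch figure exported at 300 DPI for print.
print_pixels = raster_size(6, 4, 300)
# The same figure at 96 DPI, a common screen resolution.
screen_pixels = raster_size(6, 4, 96)
```

This helps when deciding which DPI value to choose: the print version above contains almost ten times as many pixels as the screen version.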
Exporting data
The data visible in the TreeTable view can be exported with
for further processing in spreadsheet programs or other applications. The following formats are supported:
- CSV (Comma Delimited) (*.csv)
-
The comma-separated values (CSV) format stores tabular data (numbers and text) in plain-text form. Most spreadsheet and data management software is able to import data in this format.
- Text (Tab Delimited) (*.txt;*.tsv;*.tab;*.raw)
-
The tab-separated values format is a popular method of data interchange among databases and spreadsheets. It stores tabular data (numbers and text) in plain-text form.
- Microsoft Excel Workbook (*.xls;*.xlsx;*.xlsm)
-
The Microsoft Excel workbook format stores tabular data (numbers and text) in the native file format of Microsoft Excel. Most spreadsheet software is able to import data in this format.
- Apache Arrow (*.arrow)
-
The Arrow format stores tabular data (numbers and text) in a form that allows data access without serialization overhead.
- Apache Parquet (*.parquet)
-
The Parquet format stores tabular data (numbers and text) in a compressed, efficient columnar data representation. It is popular in the Hadoop ecosystem.
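As an example of further processing, an exported CSV or tab-delimited file can be read with Python's standard `csv` module. The column names and values below are invented for the illustration, and the text is parsed from a string so that the sketch is self-contained; for a real export you would open the file instead.

```python
import csv
import io

# Hypothetical content of an exported file.
exported = "name,mpg,hp\nCivic,30,70\nImpala,18,130\n"


def read_exported(text, delimiter=","):
    """Parse exported text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


rows = read_exported(exported)
# All fields are read as text, so convert before computing with them.
mean_mpg = sum(float(r["mpg"]) for r in rows) / len(rows)

# A tab-delimited export is handled the same way with delimiter="\t".
tab_rows = read_exported("name\tmpg\nCivic\t30\n", delimiter="\t")
```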
Import/export of settings
All the settings can be exported using
to be applied to another dataset using
Printing
Using
to get a printout of the active High-D view (note that the resulting print job can also be redirected to a file)
Invoking High-D through the command line
High-D can be invoked from the command line, typically for automating and batch processing the production of several visualizations.
If you intend to use this scripting possibility in unattended, automated batch jobs (typically a nightly job running on a remote build server), note that non-human devices that use our software without user interaction are counted as users, and you would then need to order the appropriate number of licenses.
High-D comes bundled with its own optimized Java runtime (which can be found in the jre directory), which is an order of magnitude faster for certain operations than the standard Java runtime. Nevertheless, High-D is fully compatible with Java 11. High-D can be invoked from the command line as follows:
- Windows
-
Start the Command Prompt application and then type:
cd "C:\Program Files\High-D"
jre\bin\java -jar lib/high-d-swing.jar --uiscaling 1.2 data.mhd
- macOS
-
Start the Terminal application and then type:
cd /Applications/High-D/
./.install4j/jre.bundle/Contents/Home/bin/java -jar lib/high-d-swing.jar --uiscaling 1.2 data.xls
- Linux
cd /usr/local/TreeMap
jre/bin/java -jar lib/high-d-swing.jar --uiscaling 1.2 data.csv
Command line options
High-D can be invoked from the command line with the following options:
-
-h, --help
-
Show the help
-
-e, --expert
-
Run High-D in expert mode
-
-f, --lf <argument>
-
Set the look and feel
-
-u, --uiscaling <argument>
-
Scale the UI
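For batch processing, the invocation shown above can be assembled by a small script. The sketch below only builds the argument list for the Linux layout of the bundled runtime and uses the `--expert` and `--uiscaling` options listed above. The function name `build_command`, the installation directory, and the data file names are hypothetical.

```python
from pathlib import PurePosixPath


def build_command(install_dir, data_file, ui_scaling=None, expert=False):
    """Build the argument list for one High-D invocation (Linux layout).

    The jar path is relative, so the command is meant to be run with
    `install_dir` as the working directory, as in the examples above.
    """
    java = str(PurePosixPath(install_dir) / "jre" / "bin" / "java")
    cmd = [java, "-jar", "lib/high-d-swing.jar"]
    if expert:
        cmd.append("--expert")
    if ui_scaling is not None:
        cmd += ["--uiscaling", str(ui_scaling)]
    cmd.append(data_file)  # options come before the data file
    return cmd


# Hypothetical installation directory and data files.
install_dir = "/opt/High-D"
commands = [
    build_command(install_dir, name, ui_scaling=1.2)
    for name in ["sales.csv", "inventory.csv"]
]
# Each command could then be started with, for example:
#   subprocess.run(cmd, cwd=install_dir, check=True)
```

Passing the arguments as a list rather than as a single string avoids quoting problems with file names that contain spaces.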
Bibliography
-
[Fruchterman1991] Thomas M. J. Fruchterman, Edward M. Reingold (1991), "Graph Drawing by Force-Directed Placement", Software – Practice & Experience, Wiley, 21 (11): 1129–1164, doi:10.1002/spe.4380211102.
-
[Inselberg2009] Alfred Inselberg (2009). "Parallel Coordinates: Visual Multidimensional Geometry and its Applications". Springer. ISBN 978-0-387-68628-8.
-
[Maaten2008] L.J.P. van der Maaten and G.E. Hinton (2008). "Visualizing High-Dimensional Data Using t-SNE", Journal of Machine Learning Research 9 (Nov): 2579-2605. https://lvdmaaten.github.io/publications/papers/JMLR_2008.pdf.
-
[Pearson1901] Karl Pearson (1901). "On Lines and Planes of Closest Fit to Systems of Points in Space", Philosophical Magazine. 2 (11): 559–572. doi:10.1080/14786440109462720.
-
[Sammon1969] John W. Sammon (1969). "A nonlinear mapping for data structure analysis", IEEE Transactions on Computers. 18 (5): 401–409, doi:10.1109/t-c.1969.222678.