Sometimes, the data you want to analyze in Spotfire is not in the most appropriate format, or may contain errors. It can therefore be useful to modify the data to get the best results from the analysis.
Tip: Some common data wrangling operations are available directly in The Expanded Data Panel. There, you can easily change the data type for a column, configure a sort order, replace empty values, split the values in a column, etc.
There are also several transformation methods that can be used to transform your data, either when adding a data table, or, as a data transformation step after a data table is added:
Calculate and replace column – replace a column in the data table with a calculated column.
Calculate new column – add a calculated column to the data table.
Change column names – change the name of one or more of the columns in the data table.
Change data types – change the data type for one or more of the columns in the data table.
Data function – use a previously registered data function as a transformation step.
Exclude columns – exclude one or more of the columns from the data table.
Normalization – normalize the data prior to addition of the data table.
Pivot – pivot the data, that is, change the data table from a tall/skinny format to a short/wide format.
Replace specific value – change the value on a specific row in the data table to a different value.
Replace value – change all instances of a value in a column to a different value.
Unpivot – unpivot the data, that is, change the data table from a short/wide format to a tall/skinny format.
Additional transformations may be available to you if these have been added locally.
You can perform transformations on most of the "regular" column types that are loaded into Spotfire, but not on certain column types whose content changes depending on selections you make in the analysis. Calculated columns, columns created by adding tags to the analysis, and columns created using tools like K-means Clustering and Line Similarity are some examples of column types that you cannot apply transformations to. Columns that cannot be transformed will not be available for selection in any of the settings dialogs used for transformations.
Applying transformations on the source or afterwards?
Transformations may be applied either upon data loading, or later on, when the data has already been loaded into Spotfire. Having the transformation directly on the data source is generally only necessary if the original data is very large and the transformation itself reduces or remodels the data to load; for example, if an unpivot operation is required to reduce the number of columns. Most of the time, transformations benefit from being added as a separate step in the data history, on the already loaded data. When the transformation group is added as a separate step, you can easily edit or remove one or more transformations without needing to reload the data.
When you work with data from data connections, you can only transform data if it is imported into Spotfire. Then, you can apply transformations to the data after it has been loaded, by using Insert Transformations, as described below. In-database data from connections, however, cannot be transformed.
To apply transformations when loading data:
Tip: If your data source allows it, it is possible to insert a transformation on the source from the 'Loaded data' step (lower left part) in the source view of the expanded Data panel.
Select File > Add Data Tables....
Click Add and select the type of data to add from the drop-down list.
Specify which file, information link, etc., to add.
Comment: You can also apply transformations when you replace a data table and when you insert columns or rows to an already existing data table.
Click Show transformations to expand the lower part of the dialog, where the settings for adding transformations are located.
In the Transformations drop-down list, select the type of transformation you want to perform on the data.
Click Add... to open a dialog with settings relevant for the selected transformation type. When you have specified the necessary settings, click OK to return to the Add Data Tables dialog.
Repeat steps 5 and 6 to add more transformations.
Tip: Click on the Preview button to see what the result will be with the added transformations.
Click OK in the Add Data Tables dialog to load the data with the added transformations to the analysis.
Tip: If you already have added a transformation to the data table, you can add more transformations by editing the transformation step in the source view of the expanded Data panel.
Select Insert > Transformations....
Select the data table you want to add transformations to.
In the Transformations drop-down list, select the type of transformation you want to perform on the data.
Click Add... to open a dialog with settings relevant for the selected transformation type. When you have specified the necessary settings, click OK to return to the Insert Transformations dialog.
Repeat steps 3 and 4 to add more transformations.
Tip: Click on the Preview button to see what the result will be with the added transformations.
Click OK to apply the selected transformations to the data.
To replace a value:
See Replacing a Value.
To edit a previously added transformation:
To remove a transformation:
See Removing Operations in the Source View.
See also:
Details on Insert Transformations
Details on Show Transformation
Details on Replace Data Table - Select Source