I am pleased to announce that the Dataverse 3.1.3 release is now available.
This GA release extends the script-less node configuration capabilities of the Filter node and Split node to include additional operators for the filter/split criteria. The release provides a number of new (Java-based) nodes that are replacements for some existing (C++ based) nodes, enhancing performance and improving support for Unicode characters. The ‘File Picker’ support has been extended to allow its use when configuring additional nodes. The performance and capabilities of the Data Viewer have been improved, making it more responsive when viewing wide or large data sets. You can now group selected nodes into Composites and import license files post-installation from the GUI. In addition, we have enhanced the capabilities for managing scheduled runs and extended the product’s audit logging functionality. The release also delivers numerous performance, stability and usability improvements.
CSV/Delimited Input Node
This is a replacement for the CSV File node and the Delimited File node, which are now deprecated. The node allows you to specify the source of the data in a number of ways: (a) the filename path of a file; (b) the name of a field on the input containing the filename path - one filename per record; (c) the name of a field on the input whose records contain the delimited data to be imported.
The CSV/Delimited Input node supports a range of common delimited file formats. The chosen format is used to automatically configure the default values for the delimiter character, escape character and quote character - streamlining the configuration of the node when dealing with typical data files.
When the CSV/Delimited Input node is used to import data from multiple files, the data from each file is merged into the output data set. The concatenation mode of the node can be configured to require the fields to be the same for all input files (generating an error if they differ), to perform a Union of the fields from each of the files (populating 'missing' fields with NULL values), or to only output the Intersection of the fields that are common to all data files.
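The three concatenation modes behave like set operations on the field lists of the input files. The following Python sketch is purely illustrative of that behaviour (the function and field names are hypothetical; this is not Dataverse's implementation):

```python
# Illustrative sketch of the three concatenation modes (not Dataverse code).
# Each input file is represented here simply by its list of field names.
def combine_fields(file_fields, mode):
    """Return the output field list for a set of input files.

    mode: 'same'         - all files must share identical fields (else error)
          'union'        - all fields from any file; missing values become NULL
          'intersection' - only fields present in every file
    """
    first = file_fields[0]
    if mode == "same":
        if any(fields != first for fields in file_fields[1:]):
            raise ValueError("input files have differing fields")
        return list(first)
    if mode == "union":
        out = list(first)
        for fields in file_fields[1:]:
            out.extend(f for f in fields if f not in out)
        return out
    if mode == "intersection":
        return [f for f in first
                if all(f in fields for fields in file_fields[1:])]
    raise ValueError(f"unknown mode: {mode}")

files = [["id", "name", "age"], ["id", "name", "email"]]
print(combine_fields(files, "union"))         # ['id', 'name', 'age', 'email']
print(combine_fields(files, "intersection"))  # ['id', 'name']
```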
When the node is configured to use fields from its input pin, fields can be passed through to the output unchanged. You can choose which fields are passed through based on whether or not they are used (used fields are passed through by default).
The node can be configured to set the data type of a field based on information in the header record.
When typed headers are used, the format must be of the form:
<Field Name>:<Data Type>
Note: The data type must be a valid Dataverse data type.
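As an example, a typed header row and a minimal sketch of splitting it into name/type pairs (illustrative Python only, not the node's implementation; the field names and type names shown are examples rather than a definitive list of Dataverse data types):

```python
# Illustrative sketch: splitting typed headers of the form <Field Name>:<Data Type>.
# The header line and type names below are examples only.
header = "CustomerId:long,CustomerName:unicode,Balance:double,Created:datetime"

fields = []
for column in header.split(","):
    name, _, data_type = column.partition(":")
    fields.append((name, data_type))

print(fields)
# [('CustomerId', 'long'), ('CustomerName', 'unicode'),
#  ('Balance', 'double'), ('Created', 'datetime')]
```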
In some cases, data files can include lines before the header record. The CSV/Delimited Input node allows you to specify the number of records to skip prior to attempting to parse the data.
Data files may contain 'messy' data, making them difficult to import. The CSV/Delimited Input node provides two output pins. Successfully imported data are presented on the node's output pin. If errors are encountered when processing data, the records generating errors are output on the ‘errors’ pin by default and the node continues to process the remaining input data.
You can specify the node’s exception behaviour. The node can be configured to Error, Log (default) or Ignore errors. An error threshold can be specified as: (a) A fixed number of errors; (b) A percentage of the records processed.
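The two threshold styles can be thought of as follows (a hedged Python sketch; the function name and parameters are hypothetical, not Dataverse's API):

```python
# Illustrative sketch of an error threshold expressed either as a fixed
# number of errors or as a percentage of records processed.
# (Names are hypothetical; this is not Dataverse code.)
def threshold_exceeded(error_count, records_processed, limit, as_percentage=False):
    if as_percentage:
        if records_processed == 0:
            return False
        return (error_count / records_processed) * 100 > limit
    return error_count > limit

print(threshold_exceeded(5, 1000, 10))                      # False: 5 errors, limit 10
print(threshold_exceeded(25, 1000, 2, as_percentage=True))  # True: 2.5% > 2%
```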
The performance of the node when processing Unicode data has been significantly improved compared with the Delimited File node. Files containing a Unicode byte-order mark (BOM) are handled when using the UTF-16 character set.
Create Data Node
The Create Data node is a replacement for the Static Data node which is now deprecated. The default data in the Create Data node includes a wider range of Dataverse data types.
The delimiter character can now be specified, simplifying the process of populating the node with external data (the default is comma). The node can also be configured to generate multiple copies of the data, simplifying the process of creating a larger test data set.
Head Node and Tail Node
This release includes drop-in replacement nodes for the Head node and Tail node. The performance of the Tail node has been significantly improved, resolving an issue where the node ran out of memory when processing data sets with a large number of records.
Directory List Node
This release includes a replacement node for the existing Directory List node.
The Directory List node now has an optional input pin. This is used to supply a list of multiple directories to be searched by the node (specify one directory path per record).
When an input pin is connected, fields from the input can be passed through to the output. You can choose to pass through None (default)/ All/ Used/ Unused fields. By default, the contents of the input fields used to generate the output record are passed through to the output – e.g. the path of the directory containing the particular file, or the parent directory when the ‘Recurse’ option is used.
When using the ‘Recurse’ option to search subdirectories, the Directory List node now provides additional properties that enable you to:
- Specify a list of patterns matching subdirectories which are to be excluded from the search
- Specify a list of patterns matching filenames which are to be excluded from the search
The Directory List node now outputs additional file information, including:
- File name
- File created time
- File modified time
- File size
Note: the ‘FileName’ property is now output as a ‘unicode’ data type field instead of a ‘string’ data type field. This may affect downstream nodes that expect a ‘string’ data type field.
Filter and Split Node Additional Operators
The script-free configuration options of the Filter and Split nodes now also support ‘Is Null’ and ‘Not Null’ operators, allowing you to easily identify and separate records with missing data.
File Picker Functionality
When working with nodes that have a ‘From Filename’ input option (e.g. JSON Data, XML Data), the file picker is now available – enabling you to browse to the file that you want to import. The File Picker is also available with the QVD File node.
Data Viewer Enhancements
The performance of the Dataverse Data Viewer has been improved. It is now more responsive when scrolling across ‘wide’ data sets and up/down in large data sets.
Multi-line Record Support
The Data Viewer now renders cells containing multiple lines of data.
Composite Node Support
Selected nodes can now be grouped into a composite node. Drag-and-drop actions can also be used to add nodes on the canvas to a composite.
Note: Node state is not maintained when nodes are grouped.
Import License Option
In addition to specifying an application license during installation, you can now import a license into Dataverse from the ‘Licensing’ option of the main menu.
It is not necessary to restart the application when a new license is applied.
Audit Logging
Data flow run and save actions are now included in the Dataverse audit log. This provides an audit trail of the data flows that have been executed and modified by users.
Dataverse Scheduling Features
Continue Failed Scheduled Run
In some situations a scheduled run may fail, e.g. due to a communication problem with an external system. In this case you may want to continue the failed run from the point where it failed. Dataverse now supports this option.
Purge Temporary Data
Dataverse can now be configured to delete temporary data after a configurable number of days. Purging aged temporary data created by successful scheduled runs improves the management of system resources.
The nodes listed below have been deprecated. While these nodes continue to be available in the current release, support will be removed in a future release. Lavastorm recommends that users employ the new nodes in all new projects and, where possible, transition to using the new nodes in existing data flows.
Static Data Node
- The Create Data node is the replacement for the Static Data node.
CSV File node
- The CSV/Delimited Input node is the replacement for the CSV File node.
Delimited File node
- The CSV/Delimited Input node is the replacement for the Delimited File node.
Static Data Node Issues
The new Create Data node which replaces the deprecated Static Data node resolves the following issues:
- The new Create Data node does not allow you to create multiple fields with the same name. LAE-8157
- The new Create Data node resolves the issue where a leading space before a datetime string value or a quoted string value caused the Static Data node to fail. LAE-8702
- The Create Data node allows you to specify a field delimiter. LAE-8701
Output Excel and Append Excel Node Issues
- The Output Excel and Append Excel nodes have a new ‘TrustedSource’ property where you can specify whether you trust any Excel file source used by the node, allowing you to force the node to run successfully when the source is trusted, even if a zip bomb is detected. LAE-8716
Library nodes with similar name
- Previously, the system would fail to import Legacy Node Libraries when multiple library nodes shared the same name except for one or more non-alphanumeric characters, for example "Node name" and "Node name +". This issue has now been fixed. LAE-8700
Hide properties on library nodes
- Resolved issue where hidden properties were still being displayed on library nodes. LAE-8085
Directory List node
- Resolved issue where the output file names on a Directory List node contained a mixture of "/" and "\" file path separators. LAE-8349
Tail node
- Resolved issue that caused the Tail node to run out of memory when used with data sets with a large record count. The performance of this node has been greatly improved. LAE-8697
- Resolved issue where the Tail node failed to produce a valid output pin (missing column headers). LAE-8698
CSV File and Delimited File nodes
The new CSV/Delimited Input node which replaces the deprecated CSV File and Delimited File nodes resolves the following node issues:
- The new CSV/Delimited Input node contains an errors output by default which contains information about any errors, including the full file path of the source file. LAE-8308
- The new CSV/Delimited Input node contains a ‘TypedHeaders’ property where you can specify if the field metadata contains data type information. This resolves the issue of field types not being correctly read on the CSV File and Delimited File nodes. LAE-8303
- The new CSV/Delimited Input node supports single character Unicode delimiters. LAE-8301
- When importing a UTF-8 encoded file, the deprecated Delimited File node would output the fields as string, not as Unicode. This issue has been resolved with the new CSV/Delimited Input node. LAE-8300
- The new CSV/Delimited Input node resolves an issue seen on the deprecated Delimited File node whereby an additional value that was not present on the input was being output if the last record had a blank last field without a field delimiter. LAE-5189
- The new CSV/Delimited Input node supports additional encodings with the UTF-16 character set. LAE-2684
Amazon Redshift JDBC Driver
The Amazon Redshift JDBC driver is not shipped with this release due to a clash with the Postgres driver that is used with the Dataverse database.
Dataverse Desktop Minimum System Requirements
The minimum supported RAM for the Dataverse Desktop editions is now 8GB. This change does not affect users who are using their browser to access a Dataverse Server instance on a remote machine.