Welcome to the RapidMiner 9.1 Release Candidate!

Dear RapidMiner User,

The new RapidMiner 9.1 release is now waiting for you! The release has great new features for both new and experienced users – check out the new Automatic Feature Selection, the Pivot operator, enhanced Time Series capabilities and support for internally signed certificates. We also have some great news for Server and Radoop users: RapidMiner Server is now HA-ready, and we improved the Scheduler and added HDP 3 support. Please try these out and let us know if the new capabilities meet your expectations. Also, please run some of your existing processes to make sure everything is compatible.

What's New

RapidMiner Studio

Automatic Feature Selection and Engineering

Improve your predictive models through automated selection and generation of optimal feature sets. Benefit from the capability in Auto Model or by using the new Automatic Feature Engineering operator in a RapidMiner process.

Improved integration of Turbo Prep and Auto Model into process design

Access Turbo Prep and Auto Model seamlessly from the process canvas by right-clicking on an output port, or by using the buttons in the Results view. Once you are done transforming data in Turbo Prep, you can generate a subprocess that adds the steps you performed in Turbo Prep back into the process you have been working on. This way you can quickly navigate back and forth and benefit from the intuitive and interactive interface of Turbo Prep while keeping the full flexibility of process design.

Further improvements in Turbo Prep and Auto Model

  • Turbo Prep

    • Use bar charts in PIVOT to explore your data.
    • Filter or sort columns in PIVOT tables.
    • Watch the additional educational videos.
  • Auto Model

    • Use Support Vector Machine for prediction tasks.
    • Extract features from date columns.
    • Export results to Repository, Qlik, Excel, CSV or continue directly in Turbo Prep.

New Pivot operator and percentile aggregation

  • Easily aggregate and transform your data with a single, lightning-fast Pivot operator. The new operator supersedes the old Pivot operator, which is now deprecated.
  • Calculate percentiles for a numerical column using the enhanced Aggregate operator. Just pass the desired percentile as the parameter of the "percentile(X)" function.
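
To make the percentile aggregation concrete, here is a minimal Python sketch of what a grouped "percentile(X)" computes. It is an illustration only, not RapidMiner's implementation: the function names, the sample columns and the linear interpolation between neighboring ranks are all assumptions, and the Aggregate operator may interpolate differently.

```python
from collections import defaultdict
import math


def percentile(values, p):
    """Return the p-th percentile (0-100) of values.

    Assumption: linear interpolation between the two closest ranks.
    """
    if not values:
        raise ValueError("cannot compute a percentile of empty data")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower, upper = math.floor(rank), math.ceil(rank)
    if lower == upper:
        return float(ordered[lower])
    weight = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def aggregate_percentile(rows, group_key, value_key, p):
    """Group rows by group_key and compute the p-th percentile of value_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: percentile(vals, p) for group, vals in groups.items()}


# Hypothetical sample data, used only for this illustration.
rows = [
    {"region": "north", "sales": 10},
    {"region": "north", "sales": 20},
    {"region": "north", "sales": 30},
    {"region": "south", "sales": 5},
    {"region": "south", "sales": 15},
]
result = aggregate_percentile(rows, "region", "sales", 50)
print(result)  # {'north': 20.0, 'south': 10.0}
```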

Time Series

Tackle the complexity of time series data with the new time series capabilities: Understand trends and seasonality using the new time series decomposition operators. Forecast with the Holt-Winters method. Process nominal time series data with the Windowing, Process Windows, and Replace Missing Values (Series) operators.
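
For readers unfamiliar with the Holt-Winters method, the following is a minimal Python sketch of its additive variant. It is not the code of the RapidMiner operator: the function name, the parameter names and the simple initialization (level and seasonal components taken from the first season, trend from the first two seasons) are assumptions chosen for brevity.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Forecast `horizon` future values with additive Holt-Winters smoothing."""
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Initial level: mean of the first season.
    level = sum(series[:m]) / m
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    # Initial seasonal components: deviation of the first season from its mean.
    seasonal = [series[i] - level for i in range(m)]

    # Update level, trend and seasonal component for each later observation.
    for t in range(m, len(series)):
        value = series[t]
        season = seasonal[t % m]
        previous_level = level
        level = alpha * (value - season) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (value - level) + (1 - gamma) * season

    # Extrapolate the trend and reuse the matching seasonal component.
    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# A perfectly seasonal series without trend is reproduced exactly.
history = [1, 2, 3, 4] * 3
forecast = holt_winters_additive(history, season_length=4,
                                 alpha=0.5, beta=0.5, gamma=0.5, horizon=4)
print(forecast)  # [1.0, 2.0, 3.0, 4.0]
```

The three smoothing factors control how quickly the level, trend and seasonal estimates react to new observations; values closer to 1 weight recent data more heavily.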

Use internally signed certificates

Connect to an HTTPS server using a custom certificate by copying the certificate to the .RapidMiner/cacerts folder. Studio will make sure that it is added to the Trust Store upon the next start, and that it remains usable until it is removed.
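
The copy step can also be scripted. The sketch below is a hypothetical helper, not part of RapidMiner: it assumes that the .RapidMiner folder lives in your user directory and that the certificate file is already in a format Studio accepts. The helper name and the example file name are made up for illustration.

```python
import shutil
from pathlib import Path


def install_certificate(cert_file, rapidminer_home=None):
    """Copy cert_file into the cacerts folder below the RapidMiner home.

    rapidminer_home defaults to ~/.RapidMiner (an assumption about the
    location of the folder). Returns the path of the copied file.
    """
    home = Path(rapidminer_home) if rapidminer_home else Path.home() / ".RapidMiner"
    target_dir = home / "cacerts"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = target_dir / Path(cert_file).name
    shutil.copy2(cert_file, target)
    return target


# Example call (hypothetical file name); restart Studio afterwards:
# install_certificate("internal-ca.crt")
```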

In-Database Processing

Save time doing data prep on large data with the new In-Database extension. Visually define data prep or ETL workflows in RapidMiner Studio and execute them directly in the database. Reduce data transfer by loading only the data you need after preparation.

How it works

With the new In-Database Processing extension you can design a subprocess with new, but familiar preprocessing operators. Computation of these operators is pushed down into a database, i.e. they are automatically translated into SQL code which is submitted to the database. You can then process the result with other operators just like in a normal RapidMiner process.

The main goal of this extension is to let you limit the data that you read from a database into the memory of RapidMiner Studio or Server. This is especially important when you are using cloud engines like Google BigQuery, where you pay for the amount of data you retrieve. Another goal is to leverage your database's computing power, which matters most when you use distributed, scalable database or cloud engines. All this is done without the need to write SQL code.
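
The push-down idea can be illustrated with a small Python sketch. It is not the extension's actual translation logic: the step names, the `to_sql` helper and the sample table are hypothetical, and an in-memory SQLite database stands in for the supported engines. The point is that a chain of preparation steps becomes a single SQL statement, so only the small, prepared result leaves the database.

```python
import sqlite3


def to_sql(table, steps):
    """Translate a list of (operator, argument) steps into one SELECT statement.

    Sketch only: identifiers and conditions are trusted and not escaped.
    """
    columns, conditions, group_by = "*", [], None
    for operator, argument in steps:
        if operator == "filter":
            conditions.append(argument)
        elif operator == "select":
            columns = ", ".join(argument)
        elif operator == "aggregate":
            group_column, function, value_column = argument
            alias = f"{function.lower()}_{value_column}"
            columns = f"{group_column}, {function}({value_column}) AS {alias}"
            group_by = group_column
        else:
            raise ValueError(f"unsupported step: {operator}")
    sql = f"SELECT {columns} FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(f"({c})" for c in conditions)
    if group_by:
        sql += f" GROUP BY {group_by}"
    return sql


# Hypothetical sample table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 10.0), ("north", 30.0), ("south", 5.0), ("south", 1.0)],
)

steps = [("filter", "amount >= 5"), ("aggregate", ("region", "SUM", "amount"))]
query = to_sql("orders", steps)
print(query)
# SELECT region, SUM(amount) AS sum_amount FROM orders WHERE (amount >= 5) GROUP BY region

# Only the aggregated rows are transferred, not the whole table.
result = conn.execute(query + " ORDER BY region").fetchall()
print(result)  # [('north', 40.0), ('south', 5.0)]
```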

This first version of the extension supports Google BigQuery (via OAuth 2), PostgreSQL, MySQL and H2. Further database and cloud engine support is planned for the future.

Installation

Save the JAR file to your .RapidMiner/extensions directory (located in your user directory). New operators will appear inside the Extensions / In-Database Processing operator group. Always start your process with a Database Nest operator.

RapidMiner Server

New, modern scheduler

Automate the scheduling of your processes and integrate them with external applications using the new and modern scheduler API.

Ready for High-Availability

Avoid downtime and provide high availability to your Data Science projects. RapidMiner Server 9.1 is compatible with HA configurations. See our documentation for how to set it up.

Custom library folder in Job Agents

Extend the execution capabilities of the Job Agents by adding custom libraries with classes that you can use with the "Execute Script" operator.

Security improvement: disable multiple sessions

Avoid unlawful impersonation of your users by preventing concurrent sessions from remote sites.

Execution error reporting

Receive error reports by email. Make sure you are notified whenever an error occurs in your RapidMiner processes.

RapidMiner Radoop

HDP 3 support

Keep up with the changes in the Big Data world where major distributions adopt Hadoop 3 and offer new amazing features like storage-saving erasure coding. RapidMiner Radoop now offers support for Hortonworks HDP 3.

Cloudera 6 beta-support

Cloudera 6 is also adopting Hadoop 3. RapidMiner Radoop 9.1 comes with initial support for Cloudera 6 (limited to certain configurations).

Differences for Release Candidate

The Release Candidate (released December 6th) contains some changes in RapidMiner Studio and RapidMiner Server compared to the Beta:

RapidMiner Studio

  • Improved Studio settings UI
  • Fixed a bug which caused the item count of an X-Means cluster model to be twice the expected size
  • Fixed a bug with Read CSV that prevented automatic type guessing and parsing
  • Fixed an issue where (temporary) Access files could not be deleted in a RapidMiner process
  • Fixed a bug with Read CSV that caused the header row to also be read as the first data row

RapidMiner Server

  • The installer now retains the JWT secret and the ActiveMQ username and password when pointing to an existing RapidMiner home directory during the installation
  • Changed the default `autosave` option for PostgreSQL to `conservative`
  • Memory consumption on the job details page now remains visible after job completion
  • Fixed persistence of processes during repository initialization
  • Fixed streaming of MetaData that was created by an extension which is not loaded
  • Fixed Internet Explorer compatibility issues for the Web App interface
  • Fixed case-sensitive name binding for LDAP

Documentation

We have set up a beta documentation page where you will find additional information about the new features of this release. It is available at docs-beta.rapidminer.com.

Downloads

Below you can download the release candidate of RapidMiner 9.1. Please note that your existing licenses will determine the products and functionality you are able to test.

RapidMiner Studio

Windows

Installation: Extract all contents of the ZIP archive and run RapidMiner-Studio.exe

Note that this release cannot be used to update existing installations!

Mac OS X

Installation: Open the disk image and drag the RapidMiner Studio 9.1 Preview App to your Applications folder.

Other Platforms

Installation: Extract all contents of the ZIP archive and either run RapidMiner-Studio.sh (Linux) or RapidMiner-Studio.bat (Windows).

Note that this release cannot be used to update existing installations!

RapidMiner Server

All Platforms

Installation: See our Installation Walkthrough for details. Note that we strongly advise against upgrading existing installations! Please install this preview release separately from any of your production or backup systems. Please contact us with any questions you may have regarding installation.

RapidMiner Radoop

All Platforms

Installation: Save the JAR file to your .RapidMiner/extensions directory (located in your user directory).

If you need to install RapidMiner Radoop functions (Hive UDFs) manually, please contact us to discuss the beta UDF upgrade on your Hadoop cluster.

In-Database Processing

All Platforms

Installation: Save the JAR file to your .RapidMiner/extensions directory (located in your user directory). New operators will appear inside the Extensions / In-Database Processing operator group. Always start your process with a Database Nest operator.

Feedback

Your feedback is critical to the success of our beta program, and we are looking forward to your comments.

Please send all your feedback – positive or negative – via the “Submit Feedback” button below. Please submit separate reports for each new topic so that we are better able to track and address your comments.

For bugs or errors, please be as specific as possible when submitting your report:

  • How can we reproduce the error?
  • What UI elements were you interacting with?
  • Attach stack trace files that show the error.

    • RapidMiner Studio stores log files in .RapidMiner/rapidminer-studio.log and .RapidMiner/launcher.log. You can also enable the log view in RapidMiner Studio via View > Show Panel.
    • RapidMiner Server logs to ./standalone/log/server.log (relative to its installation directory).
    • RapidMiner Radoop can export logs after a connection test using the Extract Logs... button on the Manage Radoop Connections dialog.
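
If the Studio log files are large, attaching only their most recent lines is often enough. The following Python sketch is a hypothetical helper, not a RapidMiner tool: it assumes that the .RapidMiner folder is located in your user directory and simply reads the last lines of the two Studio log files named above.

```python
from collections import deque
from pathlib import Path


def tail_log(path, max_lines=200):
    """Return the last max_lines lines of a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(deque(handle, maxlen=max_lines))


def collect_studio_logs(rapidminer_home=None, max_lines=200):
    """Collect the tail of each Studio log file that exists.

    rapidminer_home defaults to ~/.RapidMiner (an assumption about the
    location of the folder).
    """
    home = Path(rapidminer_home) if rapidminer_home else Path.home() / ".RapidMiner"
    excerpts = {}
    for name in ("rapidminer-studio.log", "launcher.log"):
        log_path = home / name
        if log_path.is_file():
            excerpts[name] = tail_log(log_path, max_lines)
    return excerpts


# Example call: print how many lines were collected per log file.
# for name, lines in collect_studio_logs().items():
#     print(name, len(lines))
```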