Welcome to RapidMiner 9.7 BETA!

Dear RapidMiner User,

The focus for this release has been on significantly improving collaboration and governance for data-loving teams. These improvements enable team members who are most productive designing processes, and others who excel at writing code, to work together effectively towards a common goal: business impact.

Please try it and share your feedback with us.

What's New

Projects

RapidMiner Studio and Server, as well as JupyterHub, now support the concept of projects. Projects let you structure and isolate your work, and allow multiple users to collaborate while maintaining a consistent state across the entire project.

On top of that, projects are versioned, providing the following cool features:

  • Linear backup: you can always revert to a past state (nothing is lost, no matter what you do).
  • Each snapshot (project version) is fully consistent, so it's easy to answer compliance questions like "which process trained this model".
  • Traceability: snapshots log who did what, when and why (through user-written comments).
  • There's a Git server used as the version control backend. This also enables storing files of arbitrary types like .py or .csv, making your projects whole.

A single identity, across our entire platform

Over the last few releases we have notably grown our platform by integrating great tools such as JupyterHub and Grafana. Now we have closed the gap by providing an enterprise-grade, easy-to-manage and easy-to-use authentication and authorization experience for our containerized platform deployment.

Users will enjoy a true Single Sign-On experience to all deployed components, including Go, Studio, Server, JupyterHub, Grafana, and the revamped platform admin interface.

Administrators will get a handy tool to administer users and groups and to implement fine-grained role-based access control.

Under the hood, we achieve this by integrating and configuring an open source product named Keycloak.

Here are a few cool things you can do:

  • configure federation with any SAML 2.0 or OAuth2/OpenID Connect based identity provider
  • configure self-service password resets, password expiration policies, and two-factor authentication, to ensure compliance with security requirements
  • create and manage groups and roles for fine-grained access control to various platform components

RapidMiner Studio

HDF5 as the new file format

RapidMiner ExampleSets are now written to disk in a new file format: HDF5. This well-established format ensures stability and performance when storing large amounts of data. It also means that Python and RapidMiner Studio can exchange data more easily and faster than ever before.

Improvements in our guided machine learning features

Auto Model:
  • Some processes (e.g. SVM, FLM, or weight calculations) now use 'Target Encoding' (a new operator) instead of one-hot encoding, which reduces memory usage and run times.
  • You can submit multiple Auto Model jobs to RapidMiner Server and use its repository to load the results.
Model Ops:
  • Repositories on RapidMiner Server and RapidMiner Studio can be used as storage locations for deployed models (also known as "deployment locations").
  • Unused and ID columns are now kept in the results after scoring.
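
To illustrate why target encoding can be cheaper than one-hot encoding, here is a minimal Python sketch. It is not the implementation of the 'Target Encoding' operator: the function names, the plain per-category mean and the toy data are assumptions made purely for illustration.

```python
from collections import defaultdict


def one_hot_encode(values):
    """Return one 0/1 column per distinct category (illustrative only)."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


def target_encode(values, targets):
    """Replace each category by the mean target value observed for it.

    Assumption: a plain per-category mean without smoothing; the real
    operator may work differently.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return [means[v] for v in values]


# Hypothetical toy data: one nominal column and a 0/1 label.
colors = ["red", "blue", "red", "green", "blue", "red"]
label = [1, 0, 1, 0, 1, 0]

one_hot = one_hot_encode(colors)
encoded = target_encode(colors, label)

print(len(one_hot), "one-hot columns vs. 1 target-encoded column")
print(encoded)
```

One-hot encoding adds a column per distinct value, so columns with many categories widen the data considerably, while target encoding always yields a single numeric column.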

Miscellaneous improvements

  • For repositories and projects created in 9.7 and later, a folder or a process can now have the same name as a data entry. These repositories and projects can also store any kind of file you may have on your computer (.py, .jpeg, .pdf, etc.).
  • Binomial attributes show the positive and negative value on Statistics view
  • Improved handling of whitespace in repository entry names
  • Improved cleanup of temp files, to reduce disk space clutter when Studio runs for a long time, e.g. in a Server environment
  • Made log tables in Result View behave more like other results, adding more actions and a shortcut to the context menu
  • Process background images use a relative path to the image where possible (only applies to background images created with 9.7 or later).
  • The operators 'Explain Predictions' and 'Model Simulator' now also support grouped models in which arbitrary models have been grouped, not only preprocessing models

Time Series

  • New operator 'Integrate' to integrate time series with different methods (cumulative sum / left and right Riemann sum / trapezoidal rule)
  • Added the option to specify negative lags and a default lag for a set of attributes (selected by an attribute subset selector) to the 'Lag' operator
  • Unfortunately, due to parameter key incompatibilities, the old version of the 'Lag' operator had to be deprecated. A new version with the same name but a different operator key has been added.
  • Added options to use padding for Fast Fourier Transformation and calculate the frequency of the amplitude value.
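
The integration methods named for the 'Integrate' operator can be sketched in a few lines of Python. This is only an illustration of the underlying arithmetic, not the operator's code: the function names, the `method` strings and the choice to start the running integral at 0 are assumptions.

```python
def cumulative_sum(values):
    """Running sum of the values, ignoring the time axis."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


def integrate(times, values, method="trapezoidal"):
    """Running integral of values over times.

    Assumption: the integral is 0 at the first time stamp.
    method is one of "left", "right" or "trapezoidal".
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if not values:
        return []
    out = [0.0]
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        if method == "left":          # left Riemann sum
            height = values[i - 1]
        elif method == "right":       # right Riemann sum
            height = values[i]
        elif method == "trapezoidal": # average of both end points
            height = (values[i - 1] + values[i]) / 2
        else:
            raise ValueError(f"unknown method: {method}")
        out.append(out[-1] + height * dt)
    return out


# Hypothetical series sampled from f(t) = 2t, whose exact integral is t^2.
times = [0, 1, 2, 3]
values = [0, 2, 4, 6]

for name in ("left", "right", "trapezoidal"):
    print(name, integrate(times, values, name))
print("cumulative sum", cumulative_sum(values))
```

For this linear series the left sum underestimates and the right sum overestimates the exact integral, while the trapezoidal rule reproduces it.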

Updated H2O library

We made a big jump forward so our H2O based operators now use the latest version of the library under the hood. Aside from a sizable improvement in stability, robustness on input data and performance, we added some new features, too:

  • Gradient Boosted Trees now support monotonicity constraints
  • Deep Learning now outputs model weights

An important note: we made sure nothing breaks and your old models trained with the previous version will work exactly the same as before. New model training will only be possible using the new library version.

RapidMiner Server

Force stop button

Sometimes it takes a long time to stop a job if it has become unresponsive. The new "force stop" button allows you to immediately kill the job and free the queue.

New Dashboard for the home page

Now the Server home page presents users and admins with useful summary graphs of what's going on in the system:

  • Executed and failed jobs
  • Disk usage in Projects
  • Configured schedules and web services
  • ... and more.

RapidMiner Server mobile app

Designed especially for admins, a new mobile app grants you access to your RapidMiner Server. You can check jobs, schedules and other activities and react as needed. Note that the Server needs to be accessible from the internet.

Become a beta tester (iOS application will follow soon)

RapidMiner Radoop

Minor improvements only this time

These improvements are for convenience and ease of use:

  • Read Database (Radoop) now supports the new Connection framework, so you can easily use your database connections from Studio to get data into your Hadoop cluster
  • Fixed the Spark Script operator's example code

RapidMiner Platform Deployment

Identity and security

As described above, we integrated an open source component named Keycloak to implement a single identity across the platform, along with a Single Sign-On experience across all platform components.

Additional security related improvements at a glance:

  • Pre-configured fine-grained role-based authorization for all platform components.
  • State-of-the-art authentication and authorization architecture greatly reduces the potential attack surface of the platform.
  • Base container images updated to CentOS 8, fixing many identified potential vulnerabilities.

Grafana updates

We updated to Grafana version 7, which brings two great new tools for dashboarding:

  • Customizable, user-friendly table panel for dashboards
  • Grafana-side data filtering and transformations, which should make your dashboard backend RapidMiner processes way simpler

Learn more about the full list of updates on Grafana's website.

Jupyter updates

We integrated a Git extension into our JupyterLab instances, and created a handy UI to easily interact with projects created in RapidMiner Server.

You can use the GUI of this Git extension, or familiar Git commands on the Jupyter terminal, to interact with projects (as well as other Git repositories).

We also updated our notebook template on how to use the brand new Project API in our Python library to interact with your data and processes in RapidMiner projects and repositories.

Streamlined admin tools

We merged the Python Environment Manager and the Real-Time Scoring Admin UI into a single admin tool named the RapidMiner Platform Admin.

Aside from leveraging the Single Sign-On capabilities of the platform (except for the Scoring endpoint authentication, which was kept simple for performance reasons), we also implemented fine-grained role-based authorization for each admin feature.

Changelog for RapidMiner Studio BETA3

  • Updated JxBrowser to version 7.7. Please also test HTML5 visualizations etc. and report any issues!
  • SSO dialogs for projects no longer keep popping up
  • SSO dialog should no longer be able to freeze the Studio UI
  • The passwords stored in the Password Manager are now also encrypted with up-to-date AES256GCM encryption. Automatic migration from the old format happens on Studio startup. The passwords are now stored in the credentials.xml file. The secrets.xml file has been deprecated and will no longer be used in the future.
  • Renamed RapidMiner Server to RapidMiner AI Hub
  • In most cases, a duplicate name conflict in the AI Hub repository is now reported before the upload instead of after it (the check is based on a cache, so the information could be outdated in some cases)
  • The Process panel now opens when a process is opened (e.g. from the Conflicts UI) while in the Design view, to make it more obvious that something happened
  • Added "Connect" button to disconnected projects in the Version History Panel
  • Fixed an error that could prevent remote versions from opening during conflict resolution
  • .json and .xml are now also diffable in the conflict UI by default
  • Added description button to Enterprise SSO checkbox in project/AI Hub repository dialog
  • Fixed replacing of data entries when moving them to a different repository (the entry was duplicated instead of replaced)
  • Improved memory usage for Aggregate and Pivot operators for nominal columns with potentially a lot of unused values
  • Replaced Send Mail operator with new version which supports file attachments
  • Made discard result dialog less scary if nothing was actually discarded
  • Improved Materialize performance in certain scenarios

Documentation

We have set up a beta/RC documentation page where you will find additional information about the new features of this release. It is available at docs-beta.rapidminer.com.

Downloads

Below you can download the Beta/RC version of RapidMiner 9.7. Please note that your existing licenses will determine the products and functionality you are able to test.

RapidMiner Studio

Windows

Installation: Extract all contents of the ZIP archive and run RapidMiner-Studio.exe

Note that this release cannot be used to update existing installations!

Mac OS X

Installation: Open the disk image and drag the RapidMiner Studio 9.7 Preview App to your Applications folder.

Other Platforms

Installation: Extract all contents of the ZIP archive and either run RapidMiner-Studio.sh (Linux) or RapidMiner-Studio.bat (Windows).

Note that this release cannot be used to update existing installations!

RapidMiner Server

All Platforms

Installation: See our Installation Walkthrough for details. Note that we strongly advise against upgrading existing installations! Please install this preview release separately from any of your production or backup systems. Please contact us with any questions you may have regarding installation.

RapidMiner Radoop

All Platforms

Installation: Save the JAR file to your .RapidMiner/extensions directory (located in your user directory).

If you need to install RapidMiner Radoop functions (Hive UDFs) manually, please contact us to discuss the beta UDF upgrade on your Hadoop cluster.

RapidMiner Platform Deployment

All Platforms

Installation: Download the installation package, then follow these steps:

  1. Unzip it on a machine which runs Docker.
  2. Edit the .env file in your favorite editor. Add the public URL of your machine to the PUBLIC_URL and SSO_PUBLIC_URL variables. It can be a hostname or an IP address, just make sure to prefix it with http://
  3. Optionally, provide your Server license key in the SERVER_LICENSE variable.
  4. In a terminal, first create the Docker network needed for the setup by running docker network create jupyterhub-user-net-default.
  5. Issue docker-compose up -d rm-init-svc and wait a few minutes for it to complete.
  6. Issue docker-compose up -d.
  7. Point your browser to the IP address or hostname of your machine that you provided in step 2.
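
Step 2 can also be scripted. The following Python sketch shows one way to set the PUBLIC_URL and SSO_PUBLIC_URL variables in the text of an .env file. The helper names, the sample file layout and the example address are assumptions for illustration; the .env file shipped in the installation package may look different.

```python
def with_http_prefix(host):
    """Prefix a hostname or IP address with http:// unless a scheme is present."""
    if host.startswith(("http://", "https://")):
        return host
    return "http://" + host


def set_env_values(env_text, updates):
    """Return env_text with the given KEY=value pairs replaced or appended."""
    remaining = dict(updates)
    lines = []
    for line in env_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            lines.append(f"{key}={remaining.pop(key)}")
        else:
            lines.append(line)
    # Keys that were not present in the file are appended at the end.
    for key, value in remaining.items():
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical .env content and address, for illustration only.
sample_env = "PUBLIC_URL=\nSSO_PUBLIC_URL=\nSERVER_LICENSE=\n"
url = with_http_prefix("192.0.2.10")
new_env = set_env_values(sample_env, {"PUBLIC_URL": url, "SSO_PUBLIC_URL": url})
print(new_env)
```

To apply this to a real deployment, read the .env file from the unzipped package, pass its text through set_env_values and write the result back before running the docker-compose commands.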

Please deploy this beta release separately from any of your production or backup systems. Please contact us with any questions you may have regarding installation.

Feedback

Your feedback is critical to the success of our beta program and we are looking forward to your comments.

Please send all your feedback – positive or negative – via the “Submit Feedback” button below. Please submit separate reports for each new topic so that we are better able to track and address your comments.

For bugs or errors, please be as specific as possible when submitting your report:

  • How can we reproduce the error?
  • What UI elements were you interacting with?
  • Attach stack traces or log files that show the error.

    • RapidMiner Studio stores log files in .RapidMiner/rapidminer-studio.log and .RapidMiner/launcher.log. You can also enable the log view in RapidMiner Studio via View > Show Panel.
    • RapidMiner Server logs to .../rapidminer-server-home-directory/log/server.log (relative to its installation directory).
    • RapidMiner Radoop can export logs after a connection test using the Extract Logs... button on the Manage Radoop Connections dialog.
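
If you want to attach only the most recent part of the Studio logs, a small Python sketch like the following can collect it. It assumes the .RapidMiner directory is located in your user directory; the function names and the line limit are illustrative choices, not part of any RapidMiner tool.

```python
from pathlib import Path


def studio_log_paths(user_home=None):
    """Studio log files named above, assuming .RapidMiner is in the user directory."""
    home = Path(user_home) if user_home else Path.home()
    base = home / ".RapidMiner"
    return [base / "rapidminer-studio.log", base / "launcher.log"]


def tail(path, max_lines=200):
    """Return the last max_lines lines of a text file, or [] if it is missing."""
    path = Path(path)
    if max_lines <= 0 or not path.is_file():
        return []
    with path.open(encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()[-max_lines:]


for log in studio_log_paths():
    lines = tail(log, max_lines=50)
    print(f"{log}: {len(lines)} line(s) collected")
```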