Dear RapidMiner User,
The focus for this release has been on significantly improving collaboration and governance for data-loving teams. These improvements enable team members who are most productive designing processes, and others who excel at writing code, to work together effectively towards a common goal: business impact.
Please try it and share your feedback with us.
RapidMiner Studio and Server, as well as JupyterHub, now support the concept of projects, which let you structure and isolate your work and allow multiple users to collaborate while maintaining a consistent state across the entire project.
On top of that, projects are versioned, providing the following cool features:
- Linear backup: you can always revert to a past state (nothing is lost, no matter what you do).
- Each snapshot (project version) is fully consistent, so it's easy to answer compliance questions like "which process trained this model".
- Traceability: snapshots log who did what, when and why (through user-written comments).
- There's a Git server used as the version control backend. This also enables storing files of arbitrary types like .py or .csv, making your projects whole.
A single identity, across our entire platform
Over the last few releases we have notably grown our platform by integrating great tools such as JupyterHub and Grafana. Now we have closed the gap on providing an enterprise-grade, easy to manage and easy to use authentication and authorization experience for our containerized platform deployment.
Users will enjoy a true Single Sign-On experience to all deployed components, including Go, Studio, Server, JupyterHub, Grafana, and the revamped platform admin interface.
Administrators will get a handy tool to administer users and groups and to implement fine-grained role-based access control.
Under the hood, we achieve this by integrating and configuring an open source product named KeyCloak.
Here are a few cool things you can do:
- configure federation with any SAML 2.0 or OAuth2/OpenID Connect based identity provider
- configure self-service password resets, password expiration policies, and two-factor authentication, to ensure compliance with security requirements
- create and manage groups and roles for fine-grained access control to various platform components
HDF5 as the new file format
RapidMiner ExampleSets are now written to disk in a new file format: HDF5. This well-established format ensures stability and performance when storing large amounts of data. It also means that Python and RapidMiner Studio can exchange data easier and faster than ever before.
Improvements in our guided machine learning features
- some processes (e.g. SVM, FLM, or weight calculations) now use 'Target Encoding' (a new operator) instead of one-hot encoding, which reduces memory usage and run times
- You can submit multiple Auto Model jobs to RapidMiner Server and use its repository to load the results.
- Repositories on RapidMiner Server and RapidMiner Studio can be used as storage locations for deployed models (also known as "deployment location")
- unused and ID columns are now kept in the results after scoring
- For repositories and projects created in 9.7 and later, it is now possible to have a folder or a process with the same name as a data entry. They can also store any kind of file you may have on your computer (.py, .jpeg, .pdf, etc.).
- Binomial attributes show the positive and negative value in the Statistics view
- Improved handling of whitespace in repository entry names
- Improved cleanup of temp files, to reduce disk space clutter when Studio runs for a long time, e.g. in a Server environment
- Made log tables in Result View behave more like other results, adding more actions and a shortcut to the context menu
- Process background images use a relative path to the image where possible (only applies to background images created with 9.7 or later).
- The operators 'Explain Predictions' and 'Model Simulator' now also support grouped models where arbitrary models have been grouped instead of only preprocessing models
- New operator 'Integrate' to integrate time series with different methods (cumulative sum / left and right Riemann sum / trapezoidal rule)
- Added the option to specify negative lags and a default lag for a set of attributes (selected by an attribute subset selector) to the 'Lag' operator
- Unfortunately, due to parameter key incompatibilities, the old version of the 'Lag' operator had to be deprecated, and a new version with the same name but a different operator key has been added.
- Added options to use padding for Fast Fourier Transformation and calculate the frequency of the amplitude value.
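To make the integration methods named for the 'Integrate' operator more concrete, here is a minimal Python sketch of a cumulative sum, left and right Riemann sums, and the trapezoidal rule. It only illustrates the underlying math. The function names, the `method` strings, and the choice to start the running integral at 0.0 are assumptions made for this example, not the operator's documented behavior.

```python
def cumulative_sum(values):
    """Running total of the values, ignoring the time axis."""
    total, out = 0.0, []
    for value in values:
        total += value
        out.append(total)
    return out


def integrate(times, values, method="trapezoidal"):
    """Running integral of values over times, one entry per sample.

    The first entry is 0.0 by convention (an assumption of this sketch).
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if not values:
        return []
    result = [0.0]
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        if method == "left":
            area = values[i - 1] * dt  # rectangle using the left sample
        elif method == "right":
            area = values[i] * dt  # rectangle using the right sample
        elif method == "trapezoidal":
            area = 0.5 * (values[i - 1] + values[i]) * dt  # average of both
        else:
            raise ValueError("unknown method: " + method)
        result.append(result[-1] + area)
    return result


if __name__ == "__main__":
    t = [0, 1, 2, 3]
    y = [0, 1, 4, 9]  # samples of x squared
    for name in ("left", "right", "trapezoidal"):
        print(name, integrate(t, y, name))
```

For a rising series such as this one, the left sum underestimates and the right sum overestimates the true integral, with the trapezoidal rule falling in between.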
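As a rough illustration of why the switch from one-hot encoding to target encoding, noted in the guided machine learning items above, can reduce memory usage, here is a minimal Python sketch. It uses a plain per-category mean of a numeric target. The actual 'Target Encoding' operator may use a different formula (for example with smoothing), and the function names and sample data are made up for this example.

```python
from collections import defaultdict


def one_hot_encode(categories):
    """One column per distinct category: width grows with the number of levels."""
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]


def target_encode(categories, targets):
    """Replace each category with the mean target observed for it.

    Plain mean without smoothing (an assumption of this sketch); in practice
    the means would be learned on training data only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    means = {c: sums[c] / counts[c] for c in counts}
    return [means[c] for c in categories]


if __name__ == "__main__":
    colors = ["red", "blue", "red", "green", "blue", "red"]
    labels = [1, 0, 1, 0, 1, 0]
    print(len(one_hot_encode(colors)[0]), "one-hot columns")
    print(target_encode(colors, labels))  # a single numeric column
```

The one-hot result needs one column per distinct value, while the target-encoded result is always a single numeric column, which is where the savings come from for attributes with many distinct values.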
Updated H2O library
We made a big jump forward so our H2O based operators now use the latest version of the library under the hood. Aside from a sizable improvement in stability, robustness on input data and performance, we added some new features, too:
- Gradient Boosted Trees now support monotonicity constraints
- Deep Learning now outputs model weights
Force stop button
Sometimes it takes a long time to stop a job that has become unresponsive. The new "force stop" button allows you to immediately kill the job and free the queue.
New Dashboard for the home page
Now the Server home page presents users and admins with useful summary graphs of what's going on in the system:
- Executed and failed jobs
- Disk usage in Projects
- Configured schedules and web services.
- ... and more.
RapidMiner Server mobile app
Especially designed for admins, a new mobile app grants you access to your RapidMiner Server. You can check jobs, schedules and other activities and react as needed. The Server needs to be accessible from the internet.
Become a beta tester (iOS application will follow soon)
Minor improvements only this time
These improvements are for convenience and ease of use:
- Read Database (Radoop) now supports the new Connection framework, so you can easily use your database connections from Studio to get data into your Hadoop cluster
- Fixed the Spark Script operator's example code
RapidMiner Platform Deployment
Identity and security
As described above, we integrated an open source component named KeyCloak to implement a single identity across the platform, along with a Single Sign-On experience across all platform components.
Additional security related improvements at a glance:
- Pre-configured fine-grained role based authorization for all platform components.
- State of the art authentication and authorization architecture greatly reduces the potential attack surface against the platform.
- Base container images updated to CentOS 8, fixing many identified potential vulnerabilities.
We updated to Grafana version 7, which brings two great new tools for dashboarding:
- Customizable, user friendly table panel for dashboards
- Grafana-side data filtering and transformations, which should make your dashboard backend RapidMiner processes way simpler
Learn more about the full list of updates on Grafana's website.
We integrated a Git extension into our JupyterLab instances, and created a handy UI to easily interact with projects created in RapidMiner Server.
You can use the GUI of this Git extension, or familiar Git commands on the Jupyter terminal, to interact with projects (as well as other Git repositories).
We also updated our notebook template on how to use the brand new Project API in our Python library to interact with your data and processes in RapidMiner projects and repositories.
Streamlined admin tools
We merged Python Environment Manager and the Real-Time Scoring Admin UI into a single admin tool named the RapidMiner Platform Admin.
Aside from leveraging the Single Sign-On capabilities of the platform (except for the Scoring endpoint authentication which was kept simple for performance reasons), we also implemented fine-grained role based authorization for each admin feature.
Changelog for RapidMiner Studio BETA3
- Updated JxBrowser to version 7.7. Please also test HTML5 visualizations etc. and report any issues!
- SSO dialogs for projects no longer keep popping up
- SSO dialog should no longer be able to freeze the Studio UI
- The passwords stored in the Password Manager are now also encrypted with up-to-date encryption of AES256GCM. Automatic migration from the old format happens on Studio startup. They are now stored in the credentials.xml file. The secrets.xml has been deprecated and will no longer be used in the future.
- Renamed RapidMiner Server to RapidMiner AI Hub
- In most cases, a duplicate name conflict in the AI Hub repository is now reported before the upload instead of after it (the check is based on a cache, so the information could be outdated in some cases)
- The Process panel now opens when a process is opened (e.g. from the Conflicts UI) while in the Design view, to make it more obvious that something happened
- Added "Connect" button to disconnected projects in the Version History Panel
- Fixed an error that could prevent remote versions from opening during conflict resolution
- .json and .xml are now also diffable in the conflict UI by default
- Added description button to Enterprise SSO checkbox in project/AI Hub repository dialog
- Fixed replacing when moving data entries to a different repository (the entry was duplicated instead of replaced)
- Improved memory usage for Aggregate and Pivot operators for nominal columns with potentially a lot of unused values
- Replaced Send Mail operator with new version which supports file attachments
- Made discard result dialog less scary if nothing was actually discarded
- Improved Materialize performance in certain scenarios
We have set up a beta/RC documentation page where you will find additional information about the new features of this release. It is available at docs-beta.rapidminer.com.
Below you can download the Beta/RC version of RapidMiner 9.7. Please note that your existing licenses will determine the products and functionality you are able to test.
Installation: Extract all contents of the ZIP archive and run RapidMiner-Studio.exe
Note that this release cannot be used to update existing installations!
Installation: Open the disk image and drag the RapidMiner Studio 9.7 Preview App to your Applications folder.
Installation: Extract all contents of the ZIP archive and either run RapidMiner-Studio.sh (Linux) or RapidMiner-Studio.bat (Windows).
Note that this release cannot be used to update existing installations!
RapidMiner Server (All Platforms)
Installation: See our Installation Walkthrough for details. Note that we strongly advise against upgrading existing installations! Please install this preview release separately from any of your production or backup systems. Please contact us with any questions you may have regarding installation.
RapidMiner Radoop (All Platforms)
Installation: Save the JAR file to your .RapidMiner/extensions directory (located in your user directory).
RapidMiner Platform Deployment (All Platforms)
Installation: Download the installation package, and do the following steps:
- Unzip it on a machine which runs Docker.
- Edit the .env file in your favorite editor. Add the public URL of your machine to the SSO_PUBLIC_URL variables. It can be a hostname or an IP address, just make sure to prefix it with
- Optionally, provide your Server license key in the
- On a terminal, first create a docker network needed for the setup using docker network create jupyterhub-user-net-default.
- Run docker-compose up -d rm-init-svc and wait a few minutes for it to complete.
- Run docker-compose up -d.
- Point your browser to the IP address or hostname of your machine that you provided in step 2.
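The terminal steps above can also be scripted. The sketch below is one hypothetical way to do that in Python: the three commands are quoted from the steps, while the `deploy` helper, its `dry_run` flag, and the manual confirmation prompt are assumptions made for illustration. It presumes that Docker and docker-compose are installed and that it is run from the unzipped package directory after the .env file has been edited.

```python
import subprocess

# The three commands from the installation steps, in the order they are listed.
DEPLOY_COMMANDS = [
    ["docker", "network", "create", "jupyterhub-user-net-default"],
    ["docker-compose", "up", "-d", "rm-init-svc"],
    ["docker-compose", "up", "-d"],
]


def deploy(dry_run=True, run=subprocess.run, confirm=input):
    """Print each command and, unless dry_run is set, execute it in order.

    Returns the commands as strings. `run` and `confirm` are injectable so
    the sequence can be exercised without Docker (a choice of this sketch).
    """
    issued = []
    for command in DEPLOY_COMMANDS:
        line = " ".join(command)
        print("$ " + line)
        issued.append(line)
        if dry_run:
            continue
        run(command, check=True)  # raises CalledProcessError on failure
        if command[-1] == "rm-init-svc":
            # The steps say to wait a few minutes; no readiness check is
            # described, so ask the person running the script to confirm.
            confirm("Press Enter once rm-init-svc has finished... ")
    return issued


if __name__ == "__main__":
    deploy(dry_run=True)  # switch to dry_run=False to actually run Docker
```

By default the script only prints the commands, so it is safe to try on a machine without Docker before running it for real.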
Please deploy this beta release separately from any of your production or backup systems. Please contact us with any questions you may have regarding installation.
Your feedback is critical to the success of our beta program and we are looking forward to your comments.
Please send all your feedback – positive or negative – via the “Submit Feedback” button below. Please submit separate reports for each new topic so that we are better able to track and address your comments.
For bugs or errors, please be as specific as possible when submitting your report:
- How can we reproduce the error?
- What UI elements were you interacting with?
- Attach stack trace files that show the error.
- RapidMiner Studio stores log files in .RapidMiner/launcher.log. You can also enable the log view in RapidMiner Studio via View > Show Panel.
- RapidMiner Server logs to .../rapidminer-server-home-directory/log/server.log (relative to its installation directory).
- RapidMiner Radoop can export logs after a connection test using the Extract Logs... button on the Manage Radoop Connections dialog
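When attaching logs to a report, the last part of the file is often enough. The following Python sketch shows one way to extract it. The `tail_lines` helper is hypothetical, and the Studio log path assumes that the .RapidMiner folder sits in your user directory, as it does for the extensions folder mentioned above.

```python
from collections import deque
from pathlib import Path


def tail_lines(log_path, count=200):
    """Return the last `count` lines of a text file without trailing newlines."""
    with open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the most recent lines while reading
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


# Assumed location: .RapidMiner/launcher.log under the user's home directory.
STUDIO_LOG = Path.home() / ".RapidMiner" / "launcher.log"

if __name__ == "__main__" and STUDIO_LOG.exists():
    print("\n".join(tail_lines(STUDIO_LOG, 50)))
```

The same helper can be pointed at the Server log once you have filled in the path to your own Server home directory.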