Predictive Model Markup Language Resources
Developed by the Data Mining Group, an independent, vendor-led committee, PMML provides an open standard for representing data mining models. Models can therefore be shared easily between applications, avoiding proprietary issues and incompatibilities. All major commercial and open source data mining tools already support PMML.
PMML is an XML-based language with an intuitive structure for describing data pre- and post-processing as well as predictive algorithms. It represents not only a wide range of statistical techniques but also the input data and the transformations needed to turn raw data into meaningful features.
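To make this structure concrete, here is a minimal sketch of a PMML 4.2 document and a toy reader for it. The field names (`income`, `income_k`, `score`), the coefficients, and the `score` helper are hypothetical and exist only for illustration. The reader handles just the one transformation and the one model type shown, so it is not a general PMML scoring engine.

```python
import xml.etree.ElementTree as ET

# Namespace used by PMML 4.2 documents.
PMML_NS = {"p": "http://www.dmg.org/PMML-4_2"}

# Hypothetical model: input data (DataDictionary), a pre-processing step
# (TransformationDictionary), and a predictive algorithm (RegressionModel).
PMML_DOC = """
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="Hypothetical linear model for illustration"/>
  <DataDictionary numberOfFields="2">
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TransformationDictionary>
    <DerivedField name="income_k" optype="continuous" dataType="double">
      <Apply function="/">
        <FieldRef field="income"/>
        <Constant>1000</Constant>
      </Apply>
    </DerivedField>
  </TransformationDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="income"/>
      <MiningField name="score" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="income_k" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def score(pmml_text, record):
    """Apply the derived fields, then evaluate the linear regression table."""
    root = ET.fromstring(pmml_text.strip())
    values = dict(record)

    # Pre-processing: only Apply(function="/", FieldRef, Constant) is handled.
    path = "p:TransformationDictionary/p:DerivedField"
    for derived in root.findall(path, PMML_NS):
        apply_node = derived.find("p:Apply", PMML_NS)
        if apply_node.get("function") != "/":
            raise NotImplementedError(apply_node.get("function"))
        field = apply_node.find("p:FieldRef", PMML_NS).get("field")
        constant = float(apply_node.find("p:Constant", PMML_NS).text)
        values[derived.get("name")] = values[field] / constant

    # Model: intercept plus the sum of coefficient * predictor value.
    table = root.find("p:RegressionModel/p:RegressionTable", PMML_NS)
    result = float(table.get("intercept"))
    for predictor in table.findall("p:NumericPredictor", PMML_NS):
        name = predictor.get("name")
        result += float(predictor.get("coefficient")) * values[name]
    return result


print(score(PMML_DOC, {"income": 42000.0}))  # 10.0 + 0.5 * 42.0 = 31.0
```

The raw `income` value is first turned into the derived feature `income_k`, which the regression table then consumes. This is the raw-data-to-feature flow described above.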
As part of the Data Mining Group, Zementis is committed to the continual development of PMML. It is our vision for the community that users will be free to share models among many solutions, benefiting from an environment in which interoperability is truly attainable. For this reason, we have made available to the data mining community the Transformations Generator. It allows users to interactively design data transformations in PMML.
Zementis gave a presentation on PMML and predictive analytics to the ACM Data Mining Bay Area/SF group at the LinkedIn auditorium in Sunnyvale, CA. In this talk, Dr. Alex Guazzelli, Zementis CTO, provides the business rationale behind PMML as well as an overview of its main components. Besides being able to describe the most common modeling techniques, PMML is capable of handling complex pre- and post-processing tasks, including text mining, a functionality introduced in PMML 4.2. PMML also offers the ability to represent model ensemble, segmentation, chaining, and composition within a single language element. This ability to represent an entire predictive solution (from pre-processing to model(s) to post-processing) attests to the language’s refinement and maturity.
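As a rough illustration of the multiple-model support mentioned above, the sketch below assumes the single language element is `MiningModel` with a `Segmentation` child, as defined in the PMML 4.x specification. The two regression segments, their coefficients, and the helper functions are hypothetical. Only the `average` combination method and `<True/>` predicates are handled.

```python
import xml.etree.ElementTree as ET

# Namespace used by PMML 4.2 documents.
NS = {"p": "http://www.dmg.org/PMML-4_2"}

# Hypothetical ensemble: two linear models whose predictions are averaged.
ENSEMBLE_DOC = """
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="Hypothetical two-model ensemble for illustration"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <MiningModel functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <Segmentation multipleModelMethod="average">
      <Segment id="1">
        <True/>
        <RegressionModel functionName="regression">
          <MiningSchema><MiningField name="x"/></MiningSchema>
          <RegressionTable intercept="1.0">
            <NumericPredictor name="x" coefficient="2.0"/>
          </RegressionTable>
        </RegressionModel>
      </Segment>
      <Segment id="2">
        <True/>
        <RegressionModel functionName="regression">
          <MiningSchema><MiningField name="x"/></MiningSchema>
          <RegressionTable intercept="3.0">
            <NumericPredictor name="x" coefficient="1.0"/>
          </RegressionTable>
        </RegressionModel>
      </Segment>
    </Segmentation>
  </MiningModel>
</PMML>
"""


def score_regression(model, record):
    """Evaluate one RegressionModel element: intercept + sum(coef * value)."""
    table = model.find("p:RegressionTable", NS)
    result = float(table.get("intercept"))
    for predictor in table.findall("p:NumericPredictor", NS):
        name = predictor.get("name")
        result += float(predictor.get("coefficient")) * record[name]
    return result


def score_ensemble(pmml_text, record):
    """Average the predictions of every segment in the MiningModel."""
    root = ET.fromstring(pmml_text.strip())
    segmentation = root.find("p:MiningModel/p:Segmentation", NS)
    method = segmentation.get("multipleModelMethod")
    if method != "average":
        raise NotImplementedError(method)

    predictions = []
    for segment in segmentation.findall("p:Segment", NS):
        # Every segment here applies unconditionally via a <True/> predicate.
        if segment.find("p:True", NS) is None:
            raise NotImplementedError("only <True/> predicates are handled")
        model = segment.find("p:RegressionModel", NS)
        predictions.append(score_regression(model, record))
    return sum(predictions) / len(predictions)


print(score_ensemble(ENSEMBLE_DOC, {"x": 3.0}))  # (7.0 + 6.0) / 2 = 6.5
```

In the specification, changing the `multipleModelMethod` attribute (for example to `modelChain` or `selectFirst`) is how the same element expresses chaining or segmentation instead of an ensemble. This sketch does not implement those methods.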
PMML Community Forum
For ongoing discussion and the latest PMML news, we invite you to join the PMML group on LinkedIn.
PMML in Action
The data mining community has derived a broad foundation of statistical algorithms and software solutions that has allowed predictive analytics to become a standard approach used in science and industry.
For many years, much emphasis has been placed on the development of predictive models. As a consequence, the marketplace offers a range of powerful tools, many of them open source, for effective model building. However, once we turn to the operational deployment and practical application of predictive solutions within an existing IT infrastructure, we face a much more limited choice of options. Often it takes months for models to be integrated and deployed via custom code or proprietary processes.
The Predictive Model Markup Language (PMML) standard has reached a significant stage of maturity and has obtained broad industry support, allowing users to develop predictive solutions within one application and use another to execute them. Previously, this was very difficult, but with PMML, the exchange of predictive solutions between compliant applications is now straightforward.
The aim of this book is to present PMML from a practical perspective. It contains a variety of code snippets so that concepts are made clear through the use of examples. Readers are assumed to have a basic knowledge of predictive analytics and its techniques and so the book is intended for data mining movers and shakers: anyone interested in moving predictive analytic solutions between applications, including students and scientists.
PMML in Action is a great way to learn how to represent your predictive solutions through a mature and refined open standard. The 2nd edition includes new chapters and an expanded description of how to represent multiple models in PMML, including model ensemble, segmentation, chaining, and composition. The book is divided into six parts, taking you on a PMML journey in which language elements and attributes are used to represent not only modeling techniques but also data pre- and post-processing.
With PMML, users benefit from a single and concise standard to represent predictive models, thus avoiding the need for custom code and proprietary solutions.
You too can join the PMML movement! Unleash the power of predictive analytics and data mining today!
Available for purchase on Amazon.com
“The very first book that covers the industry standard for transferring and integrating predictive models across systems, this is a milestone for predictive analytics. If you want the long and short on engineering for versatility in how predictive models can be deployed and put to work, get started by curling up with this book.”
Eric Siegel, Ph.D., President, Prediction Impact, Inc., Conference Chair, Predictive Analytics World (Predictive Analytics World)
“Open standards facilitate innovation and progress (web is a great example). PMML (the Predictive Model Markup Language) is an open standard for predictive analytics and data mining, developed over more than 12 years and supported by most industry leaders. This easy to read book covers data transformations, many modeling methods (Associations, Clustering, Decision Trees, Neural Nets, Regression, SVM, and more), model ensembles, and verification. This book is your essential guide to PMML!”
Gregory Piatetsky, Ph.D., Editor KDnuggets, Founder KDD/SIGKDD (KDNuggets.com)
“Next generation enterprises are going to be driven by analytics, especially predictive analytics. Sharing and rapidly deploying predictive analytic models is essential and PMML is the open standard that delivers the interoperability and agility that these predictive enterprises need.”
James Taylor, CEO, Decision Management Solutions, Co-author of “Smart (Enough) Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions” (JTonEDM.com)
“PMML in Action” may be destined to become an analog to the famous Kernighan and Ritchie book, “The C Programming Language”, published in 1978. This book (affectionately known as K&R) became the standard guide for ANSI C programming practice. I expect that “PMML in Action” will function likewise in the burgeoning development of PMML in analytical tools now, and in the future. It is the “cookbook” for PMML programming. Julia Child made French cuisine kiss-simple for housewives to create. Now, programmers can follow the descriptions and practices in this book to implement analytical solutions in PMML as easily and efficiently as Julia enabled a housewife to make a French soufflé.”
Robert A. Nisbet, Ph.D., Co-author of “Handbook of Statistical Analysis & Data Mining Applications”
We have compiled a list of useful PMML links below. Be sure to check them out if you would like to become a PMML pro.
- Book – PMML in Action: Unleashing the Power of Open Standards for Data Mining and Predictive Analytics.
- Data Mining Group (DMG): PMML specification and more
- PMML 4.2 Specification – released February 2014.
- PMML page on Wikipedia
- IBM developerWorks Article – What is PMML? A great introduction to PMML.
- IBM developerWorks Article – Representing Predictive Solutions in PMML: Describes how data pre-processing and model are represented in PMML.
- Zementis Community Forums: Explore our PMML forums. Learn from the pros and share your experience.
- Predictive Analytics and PMML Blog: All things PMML.
- Zementis Blog: Issues and tips on how to export PMML from your favorite modeling tool.
- LinkedIn PMML Group: Join the PMML discussion group on LinkedIn.