Keith McCormick's Blog
April 4, 2022
ODSC East in Boston April 20th
I’m excited to announce that I’ll be speaking at a face-to-face conference for the first time since the pandemic. On April 20th, I’ll be at ODSC in Boston.
There’s a great lineup for the event.
The subject of my talk will be What analytics leaders should know about Human-in-the-Loop. I’ve written a blog post in support of the conference, but the post is not a summary of the talk. It picks up on a related theme.
I hope that you’ll be able to join me at the conference. Be sure to say hello.
New online courses in the Essential Elements of predictive analytics
It started over 10 years ago with this post. It became a popular topic at workshops and worked its way into my consulting and training. Since then, 100,000 learners have seen a much-expanded version of the material in that blog post.
The folks at LinkedIn Learning have worked with me to give the course a completely new look, and I’ve made some great additions since the original online course. One of the new videos that I’m proudest of is the debut of Tom Khabaza’s 10th Law. The chapter on Tom’s 9 Laws of Data Mining is much expanded.
I also have developed this version specifically for analytics leadership. The topics are similar, but it’s not merely an abbreviated version. It shifts the focus to building and managing teams with the Essential Elements concepts.
June 26, 2021
2021 dates for Advanced IBM SPSS MODELER TRAINING in NYC
I’m going to be offering an advanced three-day Modeler training for a very small number of experienced users in New York City in July. There will be more than 3 hours a day of “office hours” style Q&A, in addition to a very full day of lecture and discussion. If you are looking for advanced training please message me privately and I will send you the detailed information.
For this training to be a good fit, you should be a power user and should have some familiarity with the kind of content that we cover in the Modeler Cookbook. If that describes you, I’d be excited to have you join us, but I’m limiting the training to just five participants.
You should already have taken something like Predictive Modeling for Categorical Targets Using IBM SPSS Modeler and/or Predictive Modeling for Continuous Targets Using IBM SPSS Modeler, or have the equivalent experience with predictive models in Modeler.
Dates for 2021 will be July 20, 21, and 23.
March 17, 2021
Analytics Software Overview
There are hundreds of software options in analytics and more are developed every year. Analytics applications can be reasonably organized into four categories:
Open-source languages (R, Python)
Commercial workbenches (SAS Enterprise Miner, IBM SPSS, and many others)
Auto-ML (DataRobot, H2O.ai’s Driverless AI, and many others)
Open-source workbenches (KNIME and many others)
Open-Source Languages
R and Python have completely dominated the conversation in recent years, but they don’t represent the only choices. There is little doubt that a full-time data scientist has to know a little about each of them, but if you lack programming experience, they can seem daunting. Moreover, it’s not clear that they represent the best choices for someone who interacts with analytics on only a part-time basis. This course will not require a knowledge of either one, but if you do choose to learn some R and Python programming, it’s best to start by choosing an appropriate editing environment, such as RStudio, that can help you become acclimated. It’s rarely necessary to start with a blank computer screen!
Commercial Workbenches
In the 90s, when predictive analytics and machine learning software began to take off, two options were dominant: SAS and IBM SPSS. It is valuable to know this because they influenced the design of everything that followed. There is a whole generation of machine learning experts, now in their mid-career years, whose training was influenced by this period. Although these tools are powerful, they are also expensive and have been losing ground to various open-source options for years.
Auto-ML
There are a variety of newer software options that can be used both by business analysts with minimal training and by data science experts. Less-experienced users can rely on the software to automatically select settings and options for conducting an analysis, while more experienced users can fine-tune the parameters directly. This is not unlike a camera with an auto-focus capability. Both rookies and experts can use the camera, but experts can turn off the auto-focus feature and apply manual settings for better artistic control over their photos.
Auto-ML, as this is called, becomes somewhat controversial when the software makes choices that are either opaque or irreversible (or both). These software technologies are evolving rapidly and will likely grow in popularity. However, it is still desirable to have knowledgeable human oversight until the software becomes more sophisticated and reliable. Any tool that saves time is helpful, as long as the results are transparent and validated. Many tools in the Auto-ML toolkit already meet these criteria, but complete start-to-finish automation that doesn’t require human oversight hasn’t yet been achieved.
March 7, 2021
Maximizing ROI for Machine Learning
Join our LinkedIn Live session to hear from Dean Abbott, Chief Data Scientist at SmarterHQ, and learn:
-Why it pays to have people in the loop during ML model deployment
-Tips for deploying people more strategically to ease ML development challenges
-How to design an HITL approach for better model outcomes
Dapper Data: Data is my Science
Why is SPSS so Valuable and Will it Live Forever with Special Guest Keith McCormick.
Keynote: Product Leadership Festival
Institute of Product Leadership
Trends in Data Science Management careers.
Coffee With Entrepreneurs Live! With Marie Incontrera
I talk with Marie Incontrera on Coffee With Entrepreneurs Live! We discuss data science, machine learning, LinkedIn Learning, and the 4th edition of SPSS Statistics For Dummies.
Level Up With Lori: LinkedIn Live
Ep 14: Is It Possible to Do Data Science/Data Analytics Without Statistics?
Keith and Lori discuss…
1 – why some folks think statistics isn’t necessary
2 – what’s required for solid statistical thinking to be embedded in data science/analytics work
3 – where’s the “divide” between the work of data science/statistics and others, such as those in Finance
… and a host of other subjects!
EM360 podcast with Max Kurton
Ask the Expert: The Relevance of Statistics to Contemporary Machine Learning
Max is joined by Keith McCormick. The emphasis on statistics in machine learning has changed over time. It was once assumed that anyone working in machine learning had a stats background, but today it’s not uncommon to find data scientists who have not come across statistics education in their studies.