Last Post 02 Oct 2020 05:21 PM by  Patrick Ng
Highlights on Tackle the Issues - Where should Machine Learning go (and not go)?
 0 Replies
Author Messages
Patrick Ng
Basic Member
Posts:148


--
02 Oct 2020 05:21 PM
    Observations - In their examples, Dr. Lian and speaker Ball showed that simple architectures such as random forest (RF) and support vector machine (SVM) work well in many cases, some as well as deep learning models.

    Q1: What does that tell us about ML models (simplicity vs. complexity) and where the field is heading?

    Recap - Much of it comes down to data and sampling. With lots of data, deep learning does well. With limited data, RF and SVM may perform better (in supervised and unsupervised ML, respectively). Deep learning delivers benefits when we give the input data tender loving care. (Note that Making the Business Case for Geology on Day 2 emphasized that the key to success is bringing cores, geology, petrophysics, etc. together early on, rather than working on data in silos and integrating the information at a late stage. In that setting, RF was deemed the best compared with neural networks and other methods.)
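
    One way to try the limited-data point on your own is to cross-validate RF, SVM and a small neural network on the same small dataset. The sketch below is only an illustration and is not from the panel: it assumes scikit-learn is installed, uses a synthetic 120-sample dataset, and the function name compare_models and every hyperparameter are hypothetical choices. Which model wins depends on the data, so no result is claimed here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(n_samples=120, seed=0):
    """Return mean 5-fold CV accuracy per model on a small synthetic dataset.

    Assumption: the synthetic data stands in for a limited, labeled dataset.
    """
    X, y = make_classification(
        n_samples=n_samples, n_features=10, n_informative=5, random_state=seed
    )
    models = {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        # SVM and the neural net are scale-sensitive, so standardize inside the pipeline
        "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "small_mlp": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=seed),
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=5).mean()
        for name, model in models.items()
    }


scores = compare_models()
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

    Rerunning with a larger n_samples gives a rough feel for how the gap between the models changes as more data becomes available.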

    This echoes Prof. Andrew Ng of Stanford (co-founder of Coursera and founder of deeplearning.ai). Often the success of an ML project hinges not on the sheer size of the data (say lots and lots, bigger-than-Texas big) but on “strong” data, and multidisciplinary data fits that mold. Once again, I highly recommend Coursera’s Machine Learning* by Andrew. He shares real-life project insights from Silicon Valley to Silicon Alley (Beijing). For $49, it was my best investment of time and money.

    *His keystone course led to the launch of Coursera and validated the MOOC as a business.

    Q2: How soon do you think a fully AI-interpreted prospect will be drilled (per the opening Michel T Halbouty lecture)?

    Kathy succinctly pointed out that autonomous drilling on prospects is underway. It is happening, and it is only a matter of time before it becomes routine, once the kinks in the process are ironed out.

    Sashi opined that once we have every bit of data in the cloud, including sensors, the infrastructure is there for that to happen.

    My take - the choice is not binary. Duval’s closing thoughts (from the Michel T Halbouty lecture) put it nicely: “like drilling the first well in a new basin, we bring together different disciplines, adjust our thought process and navigate the new normal.”

    Q3: Given that the half-life* in ML is 3-6 months, how do we catch up on implementing ML in our projects, when it feels like running to catch the ball while still learning how to do it?

    The panel consensus is that we should think in terms of life-long learning (over the life of a project) and be diligent in evaluating data, ML workflows and outcomes, as Renato discussed with DNA / human-centered AI. To make progress, we should embrace the risk of failure (ML may not work in all situations) and learn from it: make changes to the model and to how we prepare data (one audience member suggested ML for data QC). Alec chimed in that we should "not do the same thing over and over, and expect a different outcome. That will be insanity."

    *Per Prof. Alexei Efros (UC Berkeley), the half-life of GANs (generative adversarial networks, the most sophisticated algorithms; used for good in autonomous driving, ugly in creating more Max Headroom, bad in deepfake video) is 3 months, so I'd put the rest of ML at probably 6 months. Open-source work moves fast.
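
    To put rough numbers on those half-lives, here is a small arithmetic sketch. It assumes "half-life" means simple exponential decay of how current a method remains, which is my reading rather than anything stated at the panel, and the function name fraction_current is hypothetical.

```python
def fraction_current(months_elapsed, half_life_months):
    """Fraction still current after months_elapsed, assuming exponential decay."""
    return 0.5 ** (months_elapsed / half_life_months)


# After one year, for the 3-month (GAN) and 6-month (rest of ML) half-lives
for half_life in (3, 6):
    remaining = fraction_current(12, half_life)
    print(f"half-life {half_life} months: {remaining:.4f} still current after 12 months")
# half-life 3 months: 0.0625 still current after 12 months
# half-life 6 months: 0.2500 still current after 12 months
```

    Under that assumption, a year away from the field leaves roughly 6% to 25% of what you learned still current, which is one way to read the panel's push for continuous learning.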


    ---