I'm tasked with defining AWS tools for ML development at a medium-sized company. Assume about a dozen ML engineers plus other DevOps staff familiar with serverless (Lambdas and the framework). The main questions are: a) what is an architecture that supports the main tasks of ML development (creating, training and fitting models, data pre-processing, hyperparameter optimization, job management, wrapping serverless services, gathering model metrics, etc.), b) what are the main tools for packaging and deploying things, and c) what development tools (IDEs, SDKs, 'frameworks') are used for it?
I just want to set Jupyter notebooks aside for a second. Jupyter notebooks are great for proofs of concept and are the closest thing to PowerPoint for management... But I have a problem with notebooks when thinking about deployable units of code.
My intuition points to a preliminary target architecture with 5 parts:
1 - A 'core' with ML models supporting basic model operations (create blank, create pre-trained, train, test/fit, etc). I foresee core Python scripts here - no problem.
2 - (optional) A 'containerized-set-of-things' that performs hyperparameter optimization and/or model versioning.
3 - A 'contained-unit-of-Python-scripts-around-models' that exposes an API, does job management and incorporates data pre-processing. This also reads from and writes to S3 buckets.
4 - A 'serverless layer' with a high-level API (in Python). It talks to #3 and/or #1 above.
5 - Some container or bundling thing that will unpack files from Git and deploy them onto various AWS services, creating things from the previous 3 points.
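To make parts 1, 3 and 4 a bit more concrete, here is a rough sketch of the layering I have in mind. It is only an illustration in plain Python: every name in it (`MeanModel`, `preprocess`, `JobManager`, `handler`) is hypothetical, the 'model' is a toy, and job state lives in memory where a real version would presumably use S3 or a managed service.

```python
import json
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MeanModel:
    """Part 1 ('core'): toy model that predicts the mean of its training targets."""
    mean: Optional[float] = None

    def train(self, targets: List[float]) -> None:
        self.mean = sum(targets) / len(targets)

    def predict(self) -> float:
        if self.mean is None:
            raise RuntimeError("model has not been trained")
        return self.mean


def preprocess(raw: List[str]) -> List[float]:
    """Part 3: data pre-processing; here it only drops blanks and parses floats."""
    return [float(x) for x in raw if x.strip()]


@dataclass
class JobManager:
    """Part 3: in-memory job management (assumption: a real one would persist state)."""
    jobs: Dict[str, dict] = field(default_factory=dict)

    def submit_training(self, raw: List[str]) -> str:
        job_id = str(uuid.uuid4())
        model = MeanModel()
        model.train(preprocess(raw))
        self.jobs[job_id] = {"status": "COMPLETED", "model": model}
        return job_id

    def status(self, job_id: str) -> str:
        return self.jobs[job_id]["status"]


MANAGER = JobManager()


def handler(event: dict, context: object = None) -> dict:
    """Part 4: Lambda-style entry point that only delegates to part 3."""
    body = json.loads(event["body"])
    job_id = MANAGER.submit_training(body["values"])
    prediction = MANAGER.jobs[job_id]["model"].predict()
    return {
        "statusCode": 200,
        "body": json.dumps({"job_id": job_id, "prediction": prediction}),
    }
```

Calling `handler({"body": json.dumps({"values": ["1", "2", "3"]})})` locally exercises all three layers without any AWS dependency, which is the kind of separation I would like part 5 to package and deploy.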
As you can see, my terms are rather fuzzy :) If someone can be specific with terms, that will be helpful. My intuition and my preliminary reading say that the answer will likely include a local IDE like PyCharm or Anaconda, or a cloud-based IDE (what can these be? - don't mention notebooks please). The point that I'm not really clear about is #5. Candidates include Amazon SageMaker Components for Kubeflow Pipelines and/or AWS Step Functions DS SDK for SageMaker. It's unclear to me how they can perform #5, however. Kubeflow looks very interesting, but does it have enough adoption, or will it die in 2 years? Are Amazon SageMaker Components for Kubeflow Pipelines and AWS Step Functions DS SDK for SageMaker mutually exclusive? How can each of them help with 'containerizing things' and with basic provisioning and deployment tasks?