Running
The calibration process runs in Docker. This provides an isolated environment that can run on any machine that has Docker installed (ideal for cloud computing tasks).
By default, Docker runs the image's entrypoint. This is already implemented, so starting the Docker container launches the calibration process.
The provided project can be executed using
docker run ${image_name}
where ${image_name} is the name that was defined for the image:
- if the Makefile was used, the name set in that file
- otherwise, the image name that was specified when running the docker build command.
If no arguments are given, the calibration process starts and uses all the files in the data folder. If you have copied multiple data files, you therefore have to specify manually which one should be used.
Note
It is common to give the container a name so that it can be identified later. You can set the name of the container to be started with
docker run --name ${myname} ${image_name} ...
More options can be found in the official documentation.
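As a sketch of why naming the container is useful: once a named container has finished, its logs and files can be retrieved by name with docker logs and docker cp. The helper name fetch_results, the DOCKER override variable, and the choice of copying the whole /epidemic/project folder are assumptions made for illustration; adjust the path to wherever your results are actually written.

```shell
# Hypothetical helper: collect the logs and project files of a finished,
# named container. This only works if the container was NOT started with --rm.
# DOCKER can be overridden (e.g. DOCKER="echo docker") to preview the commands.
fetch_results() {
  name="$1"   # container name given with --name
  dest="$2"   # local folder that receives the results
  mkdir -p "$dest" || return 1
  # Save everything the container printed (stdout and stderr).
  ${DOCKER:-docker} logs "$name" > "$dest/calibration.log" 2>&1 || return 1
  # Copy the project folder out of the container (assumed location).
  ${DOCKER:-docker} cp "$name:/epidemic/project" "$dest/project"
}
```

For example, after docker run --name calib ${image_name} has finished, fetch_results calib ./results would leave the log and a copy of the project folder under ./results.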
To change the default behaviour, you can pass parameters to docker run.
To see all the available options, run docker run --rm ${image_name} --help [1].
The entrypoint accepts the following commands:
- configure
- run
docker run <image> configure
This is the command that you will usually run. It executes the entire calibration process. You have to specify the data file to use. Note that all the data files will be located at /epidemic/project/data/…, so if you want to calibrate the project using the dataset myCountry.csv, run:
docker run ${image_name} configure /epidemic/project/data/myCountry.csv
You can specify the following optional parameters:
- slots: The number of processes that will be used to calibrate the model.
It should match the number of cores in the machine you plan to use.
- verbose: Select between critical, error, warning, info, debug.
This is the verbosity of PyDGGA.
For example, to run the same dataset as before, but using a machine with 48 cores and setting the verbosity to error, run:
docker run ${image_name} configure /epidemic/project/data/myCountry.csv --verbose error --slots 48
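Since slots should match the number of cores, the value can be read from the machine instead of being typed by hand. The following is a minimal sketch under stated assumptions: the helper name configure_cmd and the default values myimage and myCountry.csv are hypothetical, and getconf _NPROCESSORS_ONLN is assumed to be available (it is on common Linux and macOS systems). The script only prints the command so that it can be reviewed before running it.

```shell
#!/bin/sh
# Placeholder values (assumptions); override them from the environment.
image_name="${image_name:-myimage}"
dataset="${dataset:-myCountry.csv}"

# Number of online cores on this machine; fall back to 1 if unavailable.
slots="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Print the configure command for an image, dataset, verbosity and slot count.
configure_cmd() {
  printf 'docker run %s configure /epidemic/project/data/%s --verbose %s --slots %s\n' \
    "$1" "$2" "$3" "$4"
}

configure_cmd "$image_name" "$dataset" error "$slots"
```

Run the script on the machine that will execute the calibration, since the core count is taken from the host where the script runs.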
docker run <image> run
Sometimes, to verify that the model is working correctly, you may want to run the model inside the container without executing the entire calibration process.
For this purpose, the entrypoint also accepts the command run.
You can run a single execution of the model for an instance and a random seed by executing:
docker run ${image_name} run /epidemic/project/data/${instance} ${seed}
This will execute the entrypoint function in the file model.py.
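A quick sanity check often repeats this single execution with a few different seeds. The loop below is a sketch of that idea; the helper name run_seeds and the DOCKER override variable are assumptions introduced for illustration, and only the docker run ... run <instance> <seed> form comes from this guide.

```shell
# Hypothetical helper: run the model once per seed for one instance file.
# Usage: run_seeds <image> <instance> <seed>...
# DOCKER can be overridden (e.g. DOCKER="echo docker") to preview the commands.
run_seeds() {
  image="$1"
  instance="$2"
  shift 2
  for seed in "$@"; do
    # --rm removes each container when it exits; drop the flag if you
    # need to inspect the container or its logs afterwards.
    ${DOCKER:-docker} run --rm "$image" run "/epidemic/project/data/$instance" "$seed" || return 1
  done
}
```

For example, run_seeds ${image_name} myCountry.csv 1 2 3 runs the model three times and stops at the first failing execution.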
Run in the cloud
In order to run the calibration process in the cloud, you have two options:
- If you have access to the underlying machine that will run the container, upload the template folder (using SSH, FTP…) and then follow the same steps on the remote machine as if it were a local machine, as explained above.
- If you have access to a service that can run Docker images, you must first upload your image to a registry that can be accessed by this service.
Note
The second method also works if you have access to the machine: instead of uploading the template and building the image on the remote machine, you can push the locally built image to a registry and pull it from the remote machine.
Here we explain how you can upload your image to a registry. We present a guide for two common registries, Docker Hub and Amazon ECR, as well as a basic guide on how to run an image in Amazon EC2 virtual machines.
Upload your image to Docker Hub
The first step is to create a user at https://hub.docker.com.
With a user, you can go to https://hub.docker.com/repository/create to start creating your repository. A repository will contain all the versions of an image. Give your repository a descriptive name; usually it is the same as the one given to the image.
Once the repository is created, you have to name the image accordingly. This can be done before creating the image (i.e. by specifying the name when building it), or afterwards by using docker tag.
The name is composed of your user, the image name, and the tag: myUser/myImage:myTag. The :myTag part is optional; if it is not specified, the tag will be latest.
Note
The tag should be specified if you plan to publish, use, or keep multiple versions of your image.
If you created your image earlier without setting the name for this repository, it will usually be named myImage:myTag. You can set another name for the image by using:
docker image tag myImage:myTag myUser/myImage:myTag
Note
This will not delete the old name of the image. Two entries with different names will be listed, both pointing to the same image.
Once you have your image correctly named (you can check that it is listed in docker image ls), you can push it to Docker Hub by using:
docker push myUser/myImage:myTag
More info on the push command can be found in the official Docker documentation.
Note
If you create a private repository, you have to run docker login before being able to push or access the image from the current machine.
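The tag and push steps for Docker Hub can be combined in a small helper. This is only a sketch: the function name hub_push and the DOCKER override variable are assumptions introduced for illustration. Docker requires repository names to be lowercase, so replace placeholders such as myUser/myImage with lowercase names in real commands.

```shell
# Hypothetical helper: tag a local image for Docker Hub and push it.
# Usage: hub_push <local_ref> <user> <image> [tag]
# DOCKER can be overridden (e.g. DOCKER="echo docker") to preview the commands.
hub_push() {
  # The tag is optional and defaults to "latest", as Docker itself does.
  target="$2/$3:${4:-latest}"
  ${DOCKER:-docker} image tag "$1" "$target" || return 1
  ${DOCKER:-docker} push "$target"
}
```

For example, hub_push myimage:v1 myuser myimage v1 tags the local image myimage:v1 as myuser/myimage:v1 and pushes it. For a private repository, run docker login first.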
Upload your image to ECR
The process to upload your image to ECR is similar to the one used to push to Docker Hub. The main difference is that the image name must include the address of the ECR registry.
First, you have to create a repository for your image. You can either use the web interface available at https://aws.amazon.com or use the AWS CLI tool (recommended).
To create the repository in the default registry using the CLI tool, use:
aws ecr create-repository \
--repository-name myImage \
--image-scanning-configuration scanOnPush=true \
--region ${aws_region}
Then, name the image accordingly:
docker tag myImage:myTag ${aws_account_id}.dkr.ecr.${aws_region}.amazonaws.com/myImage:myTag
and push it (log in first to obtain the required permissions):
aws ecr get-login-password --region ${aws_region} | docker login \
--username AWS --password-stdin \
${aws_account_id}.dkr.ecr.${aws_region}.amazonaws.com
docker push ${aws_account_id}.dkr.ecr.${aws_region}.amazonaws.com/myImage:myTag
More details can be found in the Amazon ECR documentation.
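The registry address appears in the tag, login, and push commands, so composing it once avoids typos. The helpers below are a sketch; the names ecr_registry and ecr_ref are hypothetical, and the account id used in the example is a placeholder.

```shell
# Compose the ECR registry host from an account id and a region.
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Compose the full image reference; the tag defaults to "latest".
# Usage: ecr_ref <account_id> <region> <image> [tag]
ecr_ref() {
  printf '%s/%s:%s\n' "$(ecr_registry "$1" "$2")" "$3" "${4:-latest}"
}

# Example use (shown as comments, not executed here):
#   aws_account_id="$(aws sts get-caller-identity --query Account --output text)"
#   target="$(ecr_ref "$aws_account_id" "$aws_region" myimage v1)"
#   docker tag myimage:v1 "$target" && docker push "$target"
```

With these helpers, ecr_ref 123456789012 eu-west-1 myimage v1 prints 123456789012.dkr.ecr.eu-west-1.amazonaws.com/myimage:v1.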
Run in an EC2 virtual machine
To calibrate the model in a virtual machine in the cloud, you can launch an EC2 instance using the AWS CLI tool or the web interface.
When prompted for the instance AMI, choose Amazon Linux 2022, which already has Docker installed. Otherwise, select another AMI and install Docker manually.
Once the machine is running and Docker is available on it, you can:
- Build the image in the virtual machine
Copy the entire template folder to the virtual machine, for example using scp, and build the image there.
- Pull an already built image that is available in a registry
Connect to the remote machine, and then pull the image from your registry with
# From Docker Hub
docker pull myUser/myImage:myTag
# or from ECR
docker pull ${aws_account_id}.dkr.ecr.${aws_region}.amazonaws.com/myImage:myTag
Note
Pulling an image is optional, as docker run will pull images that are not found locally.
Then, you can simply run the calibration process as explained in docker run <image> configure.
Footnotes
- [1] The --rm flag automatically deletes the container once it finishes. Do not use it if you plan to read the logs or pull some files from the container later on. Here, as we only consult the help, we can remove it safely.