Every compute node on Qarnot has resources available for a task to use. These include, but are not limited to, CPU and RAM.
In many cases, it is important for a task to use most, if not all, of the available resources to improve performance. It can also be useful to monitor RAM usage, since a task that runs out of memory may crash.
One way to do so is through our Python SDK.
Using our SDK, you can get semi-live updates of CPU and RAM usage from our API. The script below is a simple example: it launches a task that sleeps for 2 minutes and prints its CPU and RAM usage to your terminal every 10 seconds.
Python
import qarnot
from datetime import datetime

# Connect to the API and create a single-instance task that sleeps for 2 minutes
conn = qarnot.Connection(client_token='<<<MY_SECRET_TOKEN>>>')
task = conn.create_task('CPU-RAM-monitoring', 'docker-batch', 1)
task.constants['DOCKER_CMD'] = 'sleep 120'
task.submit()

last_state = ''
done = False
while not done:
    # Print the task state whenever it changes
    if task.state != last_state:
        last_state = task.state
        print("** {}".format(last_state))

    # Once the task is executing, read the usage of its first running instance
    if task.state == 'FullyExecuting':
        instance_info = task.status.running_instances_info.per_running_instance_info[0]
        cpu = instance_info.cpu_usage
        memory = instance_info.current_memory_mb
        print("\n*******************************\n")
        print('Current Timestamp : ', datetime.now())
        print("Current CPU usage : {:.2f} %".format(cpu))
        print("Current memory usage : {:.2f} MB".format(memory))

    # Wait up to 10 seconds; wait() returns True once the task is complete
    done = task.wait(10)
Bash
#!/bin/bash
# Note: the following assumes that you have installed the JSON parser utility jq (https://jqlang.github.io/jq/manual/#basic-filters)

# =============== Task creation =============== #

# Your info
export QARNOT_CLIENT_TOKEN="<<<MY_SECRET_TOKEN>>>"

# Create and run task
qarnot task create \
    --name "CPU-RAM-monitoring" \
    --shortname "1234567890" \
    --profile docker-batch \
    --instance 1 \
    --constants "DOCKER_CMD=sleep 120"

# =============== Task info processing =============== #

###############################################
# Utility function to fetch the task info
###############################################
get_info () {
    qarnot task info --id "1234567890"
}

last_state=""

while true ; do
    # Fetch task info and extract task state
    info=$(get_info)
    state=$(echo "$info" | jq '.[0].State')

    # Print changes of state to stdout
    if [[ "$state" != "$last_state" ]] ; then
        last_state=$state
        echo "$last_state"
    fi

    # Check if task is done
    completed=$(echo "$info" | jq '.[0].Completed')
    if [[ "$completed" == "true" ]] ; then
        exit
    fi

    # If task is executing, update cpu and memory usage
    # (jq keeps the JSON quotes around the state, hence the escaped quotes)
    if [[ $state = \"FullyExecuting\" ]] ; then
        instance_info=$(echo "$info" | jq '.[0].Status.RunningInstancesInfo.PerRunningInstanceInfo[0]')
        cpu_usage=$(echo "$instance_info" | jq '.CpuUsage')
        memory_usage=$(echo "$instance_info" | jq '.MemoryUsage')
        echo "*******************************"
        echo "Current Timestamp : $(date)"
        echo "Current CPU usage : ${cpu_usage}"
        echo "Current memory usage : ${memory_usage}"
    fi

    # Wait 10 seconds before refreshing info
    sleep 10
done
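This is what the script above does:
- It creates a task named CPU-RAM-monitoring with the docker-batch profile and a single instance, which runs a sleep command.
- It polls the task and prints its state each time the state changes.
- While the task is in the FullyExecuting state, it prints the current timestamp and the CPU and memory usage of the first running instance.
- It waits 10 seconds between two checks and stops once the task is complete.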
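Because the loop takes one sample per iteration, the samples can also be aggregated instead of only printed. The sketch below is one possible way to do that: `UsageTracker` is a hypothetical helper (it is not part of the Qarnot SDK) that keeps the samples and reports the mean and peak values. The sample values are stand-ins so the block runs without a Qarnot connection.

```python
from dataclasses import dataclass, field


@dataclass
class UsageTracker:
    """Accumulates CPU/RAM samples (hypothetical helper, not part of the Qarnot SDK)."""
    cpu_samples: list = field(default_factory=list)
    memory_samples: list = field(default_factory=list)

    def add(self, cpu_percent, memory_mb):
        # Store one sample of each metric
        self.cpu_samples.append(float(cpu_percent))
        self.memory_samples.append(float(memory_mb))

    def summary(self):
        # Nothing to report until at least one sample was recorded
        if not self.cpu_samples:
            return None
        return {
            "samples": len(self.cpu_samples),
            "cpu_mean": sum(self.cpu_samples) / len(self.cpu_samples),
            "cpu_peak": max(self.cpu_samples),
            "memory_peak_mb": max(self.memory_samples),
        }


# Stand-in values; in the monitoring loop these would be
# instance_info.cpu_usage and instance_info.current_memory_mb.
tracker = UsageTracker()
for cpu, memory in [(12.0, 150.0), (48.0, 310.5), (30.0, 280.0)]:
    tracker.add(cpu, memory)

print(tracker.summary())
```

In the monitoring script, `tracker.add(cpu, memory)` would go inside the `FullyExecuting` branch and `tracker.summary()` would be printed once the loop exits.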
You can modify this script to monitor other resources, which can be found in the SDK documentation. For example, you can get access to:
- current_frequency_ghz
- max_memory_mb
- execution_time_sec
As an example, to get the execution time, all you need is to add the following lines to the script above, inside the FullyExecuting block and after the line that reads the memory usage.
Python
execution_time = instance_info.execution_time_sec
print("Current execution time : {:.2f} s".format(execution_time))
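To tie this back to watching for memory exhaustion, the values `current_memory_mb` and `max_memory_mb` can be combined into a usage ratio. The following is a minimal sketch under stated assumptions: it assumes `max_memory_mb` is the total memory available to the instance, `memory_headroom` and its `warn_ratio` threshold are hypothetical names chosen for illustration, and a `SimpleNamespace` stands in for the instance info object so the block runs on its own.

```python
from types import SimpleNamespace


def memory_headroom(instance_info, warn_ratio=0.9):
    """Return (ratio, warning) for one instance's memory use.

    warn_ratio is a hypothetical threshold chosen for illustration.
    Assumes max_memory_mb is the total memory available to the instance.
    """
    current = instance_info.current_memory_mb
    maximum = instance_info.max_memory_mb
    if not maximum:
        # Avoid dividing by zero if no maximum has been reported yet
        return None, False
    ratio = current / maximum
    return ratio, ratio >= warn_ratio


# Stand-in for the instance_info object read in the monitoring loop
fake_info = SimpleNamespace(current_memory_mb=7800.0, max_memory_mb=8192.0)
ratio, warning = memory_headroom(fake_info)
print("Memory used: {:.0%}{}".format(ratio, " (close to the limit)" if warning else ""))
```

In the monitoring loop, the real `instance_info` object would be passed in place of `fake_info`.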