I want to use W&B as a job management database

12/18/2022

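Assumptions

I have a W&B subscription.
I want to run a large number of experiments.
I have several machines.
I want to spread the experiments across those machines efficiently.
I do not want to use RDS (for cost reasons).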

So how do we do it?

Use the W&B run id. Each run id is unique within its project. If you do not specify one, W&B assigns a random id. To choose the id yourself, pass the id argument to wandb.init.
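One way to make the id useful as a job key is to derive it from the experiment's settings, so that every machine computes the same id for the same job. The following is only a minimal sketch under that assumption: make_run_id and start_run are hypothetical helper names of mine, and hashing the config is my own choice, not something W&B requires. The only W&B call used is wandb.init with its project, id, and config arguments.

```python
import hashlib
import json


def make_run_id(config: dict, length: int = 16) -> str:
    """Derive a deterministic id from a config dict (hypothetical helper)."""
    # sort_keys makes the serialization independent of key order,
    # so the same settings always hash to the same id.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    # A hex digest contains only [0-9a-f], which is safe to use as a run id.
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:length]


def start_run(project: str, config: dict):
    """Start a W&B run whose id is derived from its config (hypothetical helper)."""
    import wandb  # imported lazily so make_run_id works without wandb installed

    return wandb.init(project=project, id=make_run_id(config), config=config)


if __name__ == "__main__":
    cfg = {"lr": 0.01, "batch_size": 32}
    print(make_run_id(cfg))
```

With this, the id itself tells you which experiment a run belongs to, without a separate lookup table.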

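In W&B, a Project plays the role of a table. So fetching experiment records from the "table" means retrieving the runs of a Project. The W&B documentation does not explain this fetch operation clearly, so it took me a bit of effort. The following code does the fetch.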

import wandb
from wandb.apis.public import Run

api = wandb.Api()

# Path of the project to query, in the form "entity/project"
what_i_use = "[username]/[project name]"

# use runs API. https://docs.wandb.ai/ref/python/public-api/api#runs
runs = api.runs(what_i_use)

# print the id, name, and state of every run in the project
for run_obj in runs:
  assert isinstance(run_obj, Run)  # https://docs.wandb.ai/ref/python/run
  print(run_obj.id, run_obj.name, run_obj.state)
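Once the ids and states are fetched, each machine can work out which jobs are still open. Here is a rough sketch of that step, not a definitive implementation: pending_job_ids and fetch_existing are hypothetical names, and the state strings "running" and "finished" are what I assume run_obj.state returns for active and completed runs.

```python
from typing import Iterable, List, Tuple

# States treated as "already taken" (assumed W&B state names).
TAKEN_STATES = {"running", "finished"}


def pending_job_ids(
    planned_ids: Iterable[str],
    existing: Iterable[Tuple[str, str]],
) -> List[str]:
    """Return the planned ids that no running or finished run has claimed."""
    taken = {run_id for run_id, state in existing if state in TAKEN_STATES}
    # Preserve the planned order so every machine walks the list the same way.
    return [job_id for job_id in planned_ids if job_id not in taken]


def fetch_existing(project_path: str) -> List[Tuple[str, str]]:
    """Fetch (id, state) pairs for every run in the project (hypothetical helper)."""
    import wandb  # imported lazily; only needed when talking to W&B

    api = wandb.Api()
    return [(run.id, run.state) for run in api.runs(project_path)]


if __name__ == "__main__":
    # Offline example with hand-written records instead of a real project.
    records = [("job-a", "finished"), ("job-b", "crashed")]
    print(pending_job_ids(["job-a", "job-b", "job-c"], records))
```

Note that this is not a lock: two machines that fetch at the same moment can both pick the same job. Also, how W&B treats a reused id (for example, retrying a crashed job) depends on the resume argument of wandb.init, so check that before retrying.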
