OceanDataStore CLI
Mandatory Arguments
| Long version | Short Version | Description |
|---|---|---|
| action | | Specify the action: send_to_zarr, send_to_icechunk, update_zarr, update_icechunk or list. |
| --filepaths | -f | Paths to the files to send or update. |
| --credentials | -c | Path to the JSON file containing the credentials for the object store. |
| --bucket | -b | Bucket name. |
Optional Arguments
| Flag | Short Version | Description |
|---|---|---|
| --prefix | -p | Object prefix (default=None). |
| --append-dim | -ad | Append dimension (default=time_counter). |
| --variables | -v | Variables to send (default=None). The default None sends all variables. |
| --chunk-strategy | -cs | Chunk strategy as a JSON string (default=None). E.g., '{"time_counter": 1, "x": 100, "y": 100}' |
| --dask-configuration | -dc | Path to the JSON file defining the Dask Local Cluster configuration (default=None). |
| --grid-filepath | -gf | File path to model grid file containing domain information (default=None). |
| --update-coords | -uc | Coordinate dimensions to update as a JSON string (default=None). E.g., '{"nav_lon": "glamt", "nav_lat": "gphit"}' |
| --attributes | -at | Attributes to add to the dataset as a JSON string. E.g., '{"title": "my_dataset"}' |
| --zarr-version | -zv | Zarr version used to create the Zarr store (default=3). Options are 2 (v2) or 3 (v3). |
| --branch | -br | Branch of Icechunk repository to commit changes to (default=main). |
| --commit_message | -cm | Commit message to be recorded when committing changes to Icechunk repository (default="Add new data to my Icechunk repository"). |
| --variable-commits | -vc | Flag to send variables to Icechunk repository using independent commits. |
| --icechunk-configuration | -ic | Path to the JSON file defining the Icechunk storage and repository configurations (default=None). |
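The JSON-valued options above (--chunk-strategy, --update-coords, --attributes) are easier to get right when the string is generated rather than typed by hand. The following is a minimal sketch, not part of OceanDataStore itself: the executable name `ods`, the helper `build_command`, and all file, bucket and prefix names are assumptions used for illustration only.

```python
import json
import shlex

# Hypothetical executable name: replace with the command your installation provides.
CLI = "ods"


def build_command(action, filepaths, credentials, bucket, options=None):
    """Return the argument list for one CLI call.

    Dict-valued options are serialised with json.dumps so that they reach
    the CLI as valid JSON strings.
    """
    argv = [CLI, action, "-f", filepaths, "-c", credentials, "-b", bucket]
    for flag, value in (options or {}).items():
        argv.append(flag)
        argv.append(json.dumps(value) if isinstance(value, dict) else str(value))
    return argv


argv = build_command(
    "send_to_zarr",
    "data/ocean_T.nc",      # placeholder file path
    "credentials.json",     # placeholder credentials file
    "my-bucket",            # placeholder bucket name
    options={
        "--prefix": "demo",
        "--chunk-strategy": {"time_counter": 1, "x": 100, "y": 100},
        "--attributes": {"title": "my_dataset"},
    },
)

# shlex.join quotes each argument for a POSIX shell.
command_line = shlex.join(argv)
print(command_line)
```

The list form can also be passed straight to `subprocess.run(argv, check=True)`, which avoids shell quoting altogether.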
OceanDataStore Python API
OceanDataStore.object_store_handler.send_to_zarr
send_to_zarr(file, bucket, object_prefix, store_credentials_json, variables=None, append_dim='time_counter', grid_filepath=None, update_coords=None, rechunk=None, attrs=None, dask_config_kwargs=None, dask_cluster_kwargs=None, zarr_version=3)
Write data to a new Zarr store in cloud object storage, optionally using Dask.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file | list[str] \| str \| Dataset | Regular expression or list of filepaths to netCDF file(s). Users can also pass a single xarray.Dataset directly. | required |
| bucket | str | Name of the bucket in the object store. Bucket names can contain only lowercase letters, numbers, dots (.), and hyphens (-). | required |
| object_prefix | str | Prefix to be added to the object names in the object store. | required |
| store_credentials_json | str | Path to the JSON file containing the object store credentials. | required |
| variables | Optional[list[str]] | List of variables to send. If None, all variables will be sent. | None |
| append_dim | str | Name of the append dimension, by default "time_counter". | 'time_counter' |
| grid_filepath | Optional[str] | Path to file containing model grid parameter. | None |
| update_coords | Optional[dict] | Dictionary of coordinate variables to update. | None |
| rechunk | Optional[dict] | Rechunk strategy dictionary, by default None. | None |
| attrs | Optional[dict] | Attributes to add to the dataset. | None |
| dask_config_kwargs | Optional[dict] | Dask configuration settings passed to dask.config.set(). | None |
| dask_cluster_kwargs | Optional[dict] | Dask cluster configuration settings passed to LocalCluster(). | None |
| zarr_version | int | Zarr version to use. | 3 |
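A minimal sketch of a call using the parameters documented above. All values are placeholders: the file pattern, bucket, prefix, credentials path and variable names are hypothetical, and the Dask settings are ordinary `LocalCluster` keyword arguments chosen for illustration. The import is deferred into `send` so that the arguments can be inspected without the package installed.

```python
# Parameter names taken from the documented send_to_zarr signature.
DOCUMENTED_PARAMS = {
    "file", "bucket", "object_prefix", "store_credentials_json", "variables",
    "append_dim", "grid_filepath", "update_coords", "rechunk", "attrs",
    "dask_config_kwargs", "dask_cluster_kwargs", "zarr_version",
}

send_kwargs = dict(
    file="data/ocean_T_*.nc",                  # placeholder file pattern
    bucket="my-bucket",                        # placeholder bucket name
    object_prefix="demo/T1m",                  # placeholder prefix
    store_credentials_json="credentials.json", # placeholder credentials path
    variables=["thetao", "so"],                # hypothetical variable names
    append_dim="time_counter",
    rechunk={"time_counter": 1, "x": 100, "y": 100},
    attrs={"title": "my_dataset"},
    dask_cluster_kwargs={"n_workers": 4, "threads_per_worker": 1},
    zarr_version=3,
)


def send(kwargs):
    """Write the dataset described by kwargs to a new Zarr store."""
    # Imported here so this sketch can be read and checked without the package.
    from OceanDataStore.object_store_handler import send_to_zarr

    send_to_zarr(**kwargs)
```

Calling `send(send_kwargs)` performs the upload; omitting `variables` sends every variable in the input files.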
Source code in OceanDataStore/object_store_handler.py
OceanDataStore.object_store_handler.update_zarr
update_zarr(file, bucket, object_prefix, store_credentials_json, variables=None, append_dim='time_counter', grid_filepath=None, update_coords=None, rechunk=None, attrs=None, dask_config_kwargs=None, dask_cluster_kwargs=None, zarr_version=3)
Update data in an existing Zarr store in cloud object storage, optionally using Dask.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file | list[str] \| str \| Dataset | Regular expression or list of filepaths to netCDF file(s). Users can also pass a single xarray.Dataset directly. | required |
| bucket | str | Name of the bucket in the object store. Bucket names can contain only lowercase letters, numbers, dots (.), and hyphens (-). | required |
| object_prefix | str | Prefix to be added to the object names in the object store. | required |
| store_credentials_json | str | Path to the JSON file containing the object store credentials. | required |
| variables | Optional[list[str]] | List of variables to send to Zarr stores. If None, all variables will be sent. | None |
| append_dim | str | Name of the dimension to append multifile datasets. | 'time_counter' |
| grid_filepath | Optional[str] | Path to file containing model grid parameter. | None |
| update_coords | Optional[dict] | Dictionary of coordinate variables to update. | None |
| rechunk | Optional[dict] | Rechunk strategy dictionary. | None |
| attrs | Optional[dict] | Attributes to add to the dataset. | None |
| dask_config_kwargs | Optional[dict] | Dask configuration settings passed to dask.config.set(). | None |
| dask_cluster_kwargs | Optional[dict] | Dask cluster configuration settings passed to LocalCluster(). | None |
| zarr_version | int | Zarr version to use. | 3 |
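The bucket-name rule stated in the table can be checked locally before any data is transferred. The sketch below is an illustration only: `is_valid_bucket_name` and `update` are hypothetical helpers, and the pattern covers just the documented character set (a given object store may impose further rules, such as length limits, that are not checked here).

```python
import re

# Documented character set: lowercase letters, numbers, dots and hyphens.
_BUCKET_RE = re.compile(r"[a-z0-9.-]+")


def is_valid_bucket_name(name):
    """Return True if name uses only the documented bucket-name characters."""
    return _BUCKET_RE.fullmatch(name) is not None


def update(files, bucket, object_prefix, credentials):
    """Validate the bucket name, then append files to an existing Zarr store."""
    if not is_valid_bucket_name(bucket):
        raise ValueError(f"Invalid bucket name: {bucket!r}")

    # Imported here so the validation can be used without the package installed.
    from OceanDataStore.object_store_handler import update_zarr

    update_zarr(
        file=files,
        bucket=bucket,
        object_prefix=object_prefix,
        store_credentials_json=credentials,
        append_dim="time_counter",
    )
```

Failing early on a malformed name avoids opening the input files or starting a Dask cluster for a request that cannot succeed.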
Source code in OceanDataStore/object_store_handler.py
OceanDataStore.object_store_handler.send_to_icechunk
send_to_icechunk(file, bucket, object_prefix, store_credentials_json, variables=None, append_dim='time_counter', grid_filepath=None, update_coords=None, rechunk=None, attrs=None, branch='main', commit_message='Add new data to my Icechunk repository', variable_commits=False, dask_config_kwargs=None, dask_cluster_kwargs=None, icechunk_config=None)
Write data to a new Icechunk repository in cloud object storage, optionally using Dask.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file | list[str] \| str \| Dataset | Regular expression or list of filepaths to netCDF file(s). Users can also pass a single xarray.Dataset directly. | required |
| bucket | str | Name of the bucket in the object store. Bucket names can contain only lowercase letters, numbers, dots (.), and hyphens (-). | required |
| object_prefix | str | Prefix to be added to the object names in the object store. | required |
| store_credentials_json | str | Path to the JSON file containing the object store credentials. | required |
| variables | Optional[list[str]] | List of variables to send. If None, all variables will be sent. | None |
| append_dim | Optional[str] | Name of the dimension to append multifile datasets. | 'time_counter' |
| grid_filepath | Optional[str] | Path to file containing model grid parameter. | None |
| update_coords | Optional[dict] | Dictionary of coordinate variables to update. | None |
| rechunk | Optional[dict] | Rechunk strategy dictionary, by default None. | None |
| attrs | Optional[dict] | Attributes to add to the dataset. | None |
| branch | str | Branch on which to write data to IcechunkStore. | 'main' |
| commit_message | str | Commit message when updating the Icechunk repository. | 'Add new data to my Icechunk repository' |
| variable_commits | bool | Whether to write each variable to Icechunk repository using separate commits. | False |
| dask_config_kwargs | Optional[dict] | Dask configuration settings passed to dask.config.set(). | None |
| dask_cluster_kwargs | Optional[dict] | Dask cluster configuration settings passed to LocalCluster(). | None |
| icechunk_config | Optional[dict] | Icechunk repository configuration. | None |
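The Icechunk-specific parameters (branch, commit_message, variable_commits) can be illustrated with a short sketch. It assumes placeholder file, bucket and prefix names, and `make_commit_message` is a hypothetical helper that replaces the default message with one naming the files being written.

```python
from pathlib import Path


def make_commit_message(files):
    """Build a commit message that names the first and last input file."""
    names = sorted(Path(f).name for f in files)
    return f"Add {len(names)} file(s): {names[0]} to {names[-1]}"


files = ["data/T_199002.nc", "data/T_199001.nc"]  # placeholder file paths

icechunk_kwargs = dict(
    file=files,
    bucket="my-bucket",                        # placeholder bucket name
    object_prefix="demo/T1m",                  # placeholder prefix
    store_credentials_json="credentials.json", # placeholder credentials path
    branch="main",
    commit_message=make_commit_message(files),
    variable_commits=True,  # one commit per variable instead of a single commit
)


def send(kwargs):
    """Write the dataset described by kwargs to a new Icechunk repository."""
    # Imported here so this sketch can be read and checked without the package.
    from OceanDataStore.object_store_handler import send_to_icechunk

    send_to_icechunk(**kwargs)
```

With `variable_commits=True` each variable is written in its own commit, as described in the table; how the commit message is applied across those commits is not specified here.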
Source code in OceanDataStore/object_store_handler.py
OceanDataStore.object_store_handler.update_icechunk
update_icechunk(file, bucket, object_prefix, store_credentials_json, variables='all', append_dim='time_counter', grid_filepath=None, update_coords=None, rechunk=None, attrs=None, branch='main', commit_message='Update data in my Icechunk repository', dask_config_kwargs=None, dask_cluster_kwargs=None, icechunk_config=None)
Update data in an existing Icechunk repository in cloud object storage, optionally using Dask.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file | list[str] \| str \| Dataset | Regular expression or list of filepaths to netCDF file(s). Users can also pass a single xarray.Dataset directly. | required |
| bucket | str | Name of the bucket in the object store. Bucket names can contain only lowercase letters, numbers, dots (.), and hyphens (-). | required |
| object_prefix | str | Prefix to be added to the object names in the object store. | required |
| store_credentials_json | str | Path to the JSON file containing the object store credentials. | required |
| variables | list[str] \| str | List of variables to send. With the default 'all', all variables will be sent. | 'all' |
| append_dim | Optional[str] | Name of the dimension to append multifile datasets. | 'time_counter' |
| grid_filepath | Optional[str] | Path to file containing model grid parameter. | None |
| update_coords | Optional[dict] | Dictionary of coordinate variables to update. | None |
| rechunk | Optional[dict] | Rechunk strategy dictionary, by default None. | None |
| attrs | Optional[dict] | Attributes to add to the dataset. | None |
| branch | str | Branch on which to write data to IcechunkStore. | 'main' |
| commit_message | str | Commit message when updating the Icechunk repository. | 'Update data in my Icechunk repository' |
| dask_config_kwargs | Optional[dict] | Dask configuration settings passed to dask.config.set(). | None |
| dask_cluster_kwargs | Optional[dict] | Dask cluster configuration settings passed to LocalCluster(). | None |
| icechunk_config | Optional[dict] | Icechunk repository configuration. | None |
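One possible usage pattern is to append files in batches, giving each batch its own commit message. This is a sketch under stated assumptions rather than a recommended workflow: `plan_updates` and `run_updates` are hypothetical helpers, all names and paths are placeholders, and the sorted file names are assumed to follow the order of the append dimension.

```python
def plan_updates(files, batch_size, **common):
    """Split files into ordered batches and return one kwargs dict per batch."""
    ordered = sorted(files)  # assumes file names sort in append-dimension order
    batches = [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
    return [
        dict(
            common,
            file=batch,
            commit_message=f"Update data: batch {n} of {len(batches)}",
        )
        for n, batch in enumerate(batches, start=1)
    ]


def run_updates(plan):
    """Apply each planned update to the existing Icechunk repository in turn."""
    # Imported here so the planning step can be used without the package installed.
    from OceanDataStore.object_store_handler import update_icechunk

    for kwargs in plan:
        update_icechunk(**kwargs)


plan = plan_updates(
    ["data/T_199003.nc", "data/T_199001.nc", "data/T_199002.nc"],  # placeholders
    batch_size=2,
    bucket="my-bucket",
    object_prefix="demo/T1m",
    store_credentials_json="credentials.json",
    branch="main",
)
```

Calling `run_updates(plan)` then issues one `update_icechunk` call, and so one commit message, per batch.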
Source code in OceanDataStore/object_store_handler.py