Can I ingest the same file twice?
Yes, if you set lamindb.settings.error_on_file_hash_exists to False.
If True (the default), you’ll get an error. If False, you’ll get a warning if a data object with the same hash exists already.
import lamindb as ln
import pytest
ln.track()
✅ Loaded instance: testuser1/mydata
💬 Instance: testuser1/mydata
💬 User: testuser1
✅ Added: Transform(id='ANW20Fr4eZgM', version='0', name='04-ingest-same-file-twice', type=notebook, title='Can I ingest the same file twice?', created_by_id='DzTjkKse', created_at=datetime.datetime(2023, 5, 30, 20, 25, 56))
✅ Added: Run(id='jjV8fShqtvG8ZEm63jgN', transform_id='ANW20Fr4eZgM', transform_version='0', created_by_id='DzTjkKse', created_at=datetime.datetime(2023, 5, 30, 20, 25, 56))
assert ln.settings.error_on_file_hash_exists == True
filepath = ln.dev.datasets.file_fcs()
file = ln.File(filepath)
ln.add(file)
💡 file will be copied to storage upon `ln.add()` using storage key = AkgsTMzjY5kyiUtm6WXZ.fcs
💡 storing object example.fcs with key AkgsTMzjY5kyiUtm6WXZ.fcs
File(id='AkgsTMzjY5kyiUtm6WXZ', name='example.fcs', suffix='.fcs', size=6785467, hash='KCEXRahJ-Ui9Y6nksQ8z1A', run_id='jjV8fShqtvG8ZEm63jgN', transform_id='ANW20Fr4eZgM', transform_version='0', storage_id='nmLvrDUj', created_at=datetime.datetime(2023, 5, 30, 20, 25, 57), created_by_id='DzTjkKse')
with pytest.raises(RuntimeError):
    file = ln.File(filepath)
ln.settings.error_on_file_hash_exists = False
file = ln.File(filepath)
🔶 A file with same hash is already in the DB: [File(id='AkgsTMzjY5kyiUtm6WXZ', name='example.fcs', suffix='.fcs', size=6785467, hash='KCEXRahJ-Ui9Y6nksQ8z1A', run_id='jjV8fShqtvG8ZEm63jgN', transform_id='ANW20Fr4eZgM', transform_version='0', storage_id='nmLvrDUj', created_at=datetime.datetime(2023, 5, 30, 20, 25, 57), created_by_id='DzTjkKse')]
💡 file will be copied to storage upon `ln.add()` using storage key = zRcZY8a1gFR3iZQPHxaW.fcs
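The behavior shown above can be sketched without lamindb. The following is a minimal illustration, not lamindb's implementation: `content_hash`, `Registry`, and `error_on_hash_exists` are hypothetical names, and the choice of an MD5 digest encoded as URL-safe base64 is an assumption made only because it yields 22-character strings similar in shape to the hash in the output above.

```python
import base64
import hashlib
import tempfile
import warnings
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return a URL-safe base64 MD5 digest of the file's bytes, without padding.

    Assumption: MD5 + base64 is used for illustration only.
    """
    digest = hashlib.md5(path.read_bytes()).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


class Registry:
    """Hypothetical in-memory stand-in for the database of ingested files."""

    def __init__(self, error_on_hash_exists: bool = True) -> None:
        self.error_on_hash_exists = error_on_hash_exists
        self.files = {}  # maps content hash -> list of registered file names

    def register(self, path: Path) -> str:
        file_hash = content_hash(path)
        if file_hash in self.files:
            message = f"A file with same hash is already registered: {self.files[file_hash]}"
            if self.error_on_hash_exists:
                # Strict mode: refuse to register the duplicate.
                raise RuntimeError(message)
            # Lenient mode: warn, then register the duplicate anyway.
            warnings.warn(message)
        self.files.setdefault(file_hash, []).append(path.name)
        return file_hash


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.fcs"
        path.write_bytes(b"not a real FCS file")
        registry = Registry()
        registry.register(path)
        registry.error_on_hash_exists = False
        registry.register(path)  # emits a UserWarning instead of raising
```

Because the check is keyed on the content hash rather than the file name, a renamed copy of the same bytes would be flagged in the same way under this sketch.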