While deposit structure is highly flexible in order to accommodate diverse workflows, MorphoSource has preferences and recommendations for what to include, and what not to include, in terms of "raw" and "derivative" data. These recommendations aim to maximize meaningful reproducibility and reuse potential while minimizing deposit storage requirements. When contributors pay for their data storage, MorphoSource is more open to the inclusion of large but rarely critical raw files, though we may still ask contributors to consider whether spending their storage space on such files is in their best interest. Whether or not contributors are paying for storage, when special conditions require deposits that are more extensive and larger, or more limited and derivative, than the preferred structures described below, MorphoSource will accept them if the contributors make a strong case for the deviation.
There is a lot of information on this page. If you are feeling overwhelmed, you may want to first review this page on Example Deposit Structures.
We describe preferences here in terms of an anticipated derivative chain, while remaining somewhat ambiguous about whether data are technically "raw" or "derived":
PRIMARY data: the "least processed" or "closest to raw" data deposited.
SECONDARY data: the first derivative of the primary data.
TERTIARY data: derivatives of the secondary data.
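The three levels above can be read as depths in a derivative chain. The following is a minimal sketch of that reading, not a description of how MorphoSource stores deposits: the function name `chain_levels`, the file names, and the choice to label every file two or more steps from the primary as TERTIARY are all assumptions made for illustration.

```python
from typing import Dict, Optional

LEVELS = ["PRIMARY", "SECONDARY", "TERTIARY"]


def chain_levels(parents: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Label each deposited file by its depth in the derivative chain.

    `parents` maps a file name to the deposited file it was derived from,
    or to None when it is the least processed file in the deposit.
    Every parent must itself be a key of `parents`.
    """
    def depth(name: str) -> int:
        steps = 0
        seen = {name}
        parent = parents[name]
        while parent is not None:
            if parent in seen:
                raise ValueError(f"cycle in derivative chain at {parent!r}")
            seen.add(parent)
            steps += 1
            parent = parents[parent]
        return steps

    # Assumption: anything two or more steps from the primary is TERTIARY.
    return {name: LEVELS[min(depth(name), 2)] for name in parents}


# Hypothetical deposit: each file points at the file it was derived from.
deposit = {
    "ct_image_stack": None,
    "surface_mesh": "ct_image_stack",
    "decimated_mesh": "surface_mesh",
}
print(chain_levels(deposit))
```

Here the image stack is labeled PRIMARY, the mesh derived from it SECONDARY, and the mesh derived from that mesh TERTIARY.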
WHAT ARE ACCEPTABLE DEVIATIONS FROM THE PREFERRED DEPOSIT STRUCTURES?
It is very often the case that there are many intermediates between the data considered preferred-PRIMARY and preferred-SECONDARY in our schema below (see this example). In these scenarios, the file deposited as SECONDARY is, strictly speaking, a more TERTIARY derivative in the workflow, but this is fine. The main concern arises when the preferred primary or secondary datasets are missing, so that a more derived file type fills that role instead of its more appropriate tertiary role.
WHAT WILL HAPPEN IF MY DEPOSITS DEVIATE FROM THE PREFERRED STRUCTURES IN UNACCEPTABLE WAYS?
Most likely, a MorphoSource representative will contact you to request justification or revision.
Reconstructed image stacks that are uncropped with lots of empty space
Reconstructed image stacks that include multiple specimens (forbidden)
Any proprietary formats (e.g., .aim, .vgl) (forbidden)
These raw scanner output files (1) are very large and may be 2-5 times the size of the "preferred" image stacks; (2) often include multiple specimens that were scanned together for efficiency; (3) cannot be visualized three-dimensionally without further processing, and the critical metadata values needed for successful processing are not reliably available; and (4) are not requested by our user communities, who (as far as we know) do not work with them except when first processing scanner output into image stacks.
These files have been processed too much to (1) effectively communicate the quality or limitations of the raw data and more primary derivatives, or (2) have very good reuse potential.
(1) Reconstructed image stacks that include lots of "empty space" waste server space and should be cropped to minimize it prior to upload. (2) Reconstructed image stacks that include multiple specimens break the MorphoSource data model and are forbidden. (3) Proprietary formats have poor preservation and reuse potential; their use also deepens inequities between users who can afford expensive software and those who cannot.
Reconstructed image stacks in lossless raster formats that include a single specimen and one or more objects representing parts of that specimen (cropping to reduce empty space is encouraged)
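Cropping a stack to reduce empty space amounts to finding the bounding box of the specimen and discarding everything outside it. The sketch below illustrates the idea on a stack held as nested Python lists; the function names and the `threshold` parameter are assumptions for illustration, and in real scans the empty space is rarely exactly zero, so a suitable threshold would have to be chosen per dataset.

```python
from typing import List, Tuple

Stack = List[List[List[int]]]  # indexed as stack[z][y][x]


def bounding_box(stack: Stack, threshold: int = 0) -> Tuple[int, int, int, int, int, int]:
    """Return half-open bounds (z0, z1, y0, y1, x0, x1) enclosing every
    voxel whose value is greater than `threshold`."""
    coords = [
        (z, y, x)
        for z, image in enumerate(stack)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if value > threshold
    ]
    if not coords:
        raise ValueError("no voxel exceeds the threshold")
    zs, ys, xs = zip(*coords)
    return min(zs), max(zs) + 1, min(ys), max(ys) + 1, min(xs), max(xs) + 1


def crop_stack(stack: Stack, threshold: int = 0) -> Stack:
    """Return a copy of the stack trimmed to the bounding box."""
    z0, z1, y0, y1, x0, x1 = bounding_box(stack, threshold)
    return [[row[x0:x1] for row in image[y0:y1]] for image in stack[z0:z1]]


# Toy stack: 4 slices of 4 x 5 voxels, with two non-empty voxels.
stack = [[[0] * 5 for _ in range(4)] for _ in range(4)]
stack[1][1][2] = 7
stack[2][2][3] = 9

cropped = crop_stack(stack)
print(bounding_box(stack))  # (1, 3, 1, 3, 2, 4)
print(cropped)              # [[[7, 0], [0, 0]], [[0, 0], [0, 9]]]
```

In this toy case the stack shrinks from 80 voxels to 8 while keeping every voxel above the threshold.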
(1) Uncropped digital photographs in raw format may take up two orders of magnitude more space than "preferred" compressed, cropped images, with no significant improvement in the quality of the resulting 3D models.
These files have been processed too much to (1) effectively communicate the quality or limitations of the raw data and more primary derivatives, or (2) have very good reuse potential, including (3) regenerating 3D models from photo collections with updated algorithms or simply checking reproducibility.
Cropped, compressed photographs.
This means cropping each photograph to image dimensions limited to the focal object of the 3D reconstruction.
Saving photos as .jp2 or another lossless compression format that can be read by photogrammetry modeling software.
Using this cropped, compressed collection as the basis for 3D model creation.
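The cropping step above can be sketched as a small calculation: given the focal object's bounding box in a photograph, compute a crop rectangle and the fraction of pixels it keeps. This is a sketch under stated assumptions, not a MorphoSource tool: the function names, the optional `margin` parameter, and the example dimensions are hypothetical, and how the object's box is found (manually or by masking) is left open. The (left, upper, right, lower) ordering is chosen to match the box convention used by common imaging libraries such as Pillow's `Image.crop`.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower); right/lower exclusive


def padded_crop_box(object_box: Box, image_size: Tuple[int, int], margin: int = 0) -> Box:
    """Expand the object's box by `margin` pixels, clamped to the image."""
    left, upper, right, lower = object_box
    width, height = image_size
    return (
        max(left - margin, 0),
        max(upper - margin, 0),
        min(right + margin, width),
        min(lower + margin, height),
    )


def pixel_fraction_kept(crop_box: Box, image_size: Tuple[int, int]) -> float:
    """Fraction of the original pixels that remain after cropping."""
    left, upper, right, lower = crop_box
    width, height = image_size
    return ((right - left) * (lower - upper)) / (width * height)


# Hypothetical 6000 x 4000 photograph with the object near the centre.
size = (6000, 4000)
box = padded_crop_box((2000, 1200, 4000, 2800), size, margin=100)
print(box)                             # (1900, 1100, 4100, 2900)
print(pixel_fraction_kept(box, size))  # 0.165
```

In this hypothetical case cropping alone keeps about one sixth of the pixels, before any savings from compression.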