Preferred Deposit Structures

Deposit structure is flexible in order to accommodate diverse workflows, but MorphoSource has preferences and recommendations about which "raw" and "derivative" data to include and which to leave out. These recommendations aim to maximize meaningful reproducibility and reuse potential while minimizing deposit storage requirements. When contributors pay for their data storage, MorphoSource is more open to the inclusion of large but rarely critical raw files, though we may still ask contributors to consider whether using their storage space on such files is in their best interest. Whether or not contributors are paying for storage, special conditions may call for deposits that are more extensive and larger, or more limited and more derived, than the preferred structures described below. MorphoSource will accept such deposits if the contributors make a strong case for the deviation.

There is a lot of information on this page. If you are feeling overwhelmed, you may want to first review the page on Example Deposit Structures.

We describe preferences here in terms of an anticipated derivative chain, while remaining somewhat ambiguous about whether data are technically "raw" or "derived":

PRIMARY data: the "least processed" or "closest to raw" data deposited.

SECONDARY data: the first derivative of the primary data.

TERTIARY data: derivatives of the secondary data.

WHAT ARE ACCEPTABLE DEVIATIONS FROM THE PREFERRED DEPOSIT STRUCTURES?

Very often, there will be many intermediates between the data considered preferred-PRIMARY and preferred-SECONDARY in our schema below (see this example). In these scenarios, the SECONDARY file is, strictly speaking, a TERTIARY or later derivative in the workflow, but this is fine. The main concern arises when preferred primary or secondary datasets are missing, so that a more derived file type fills that role instead of its more appropriate tertiary role.
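
The role labels defined above can be illustrated with a minimal sketch. It assumes a simple linear workflow ordered from least to most processed, and it labels only the files that are actually deposited, so undeposited intermediates are skipped. `WorkflowStep`, `assign_roles`, and the file names are hypothetical illustrations, not part of any MorphoSource tool or API.

```python
from dataclasses import dataclass


@dataclass
class WorkflowStep:
    """One file in a linear processing workflow (hypothetical model)."""
    name: str
    deposited: bool


def assign_roles(chain):
    """Label deposited files by their order in the derivative chain.

    `chain` is ordered from least to most processed. The first deposited
    file is PRIMARY, the next is SECONDARY, and all later ones are TERTIARY.
    Files that are not deposited (intermediates) receive no label.
    """
    labels = ["PRIMARY", "SECONDARY"]
    roles = {}
    position = 0
    for step in chain:
        if not step.deposited:
            continue  # undeposited intermediates do not shift the labels
        roles[step.name] = labels[position] if position < len(labels) else "TERTIARY"
        position += 1
    return roles


# Hypothetical CT workflow with several undeposited intermediates.
chain = [
    WorkflowStep("projections", deposited=False),
    WorkflowStep("reconstructed_stack.zip", deposited=True),
    WorkflowStep("segmented_labels", deposited=False),
    WorkflowStep("raw_mesh", deposited=False),
    WorkflowStep("cleaned_mesh.ply", deposited=True),
    WorkflowStep("turntable_video.mp4", deposited=True),
]
print(assign_roles(chain))
```

In this sketch the cleaned mesh is labelled SECONDARY even though two intermediates separate it from the image stack, which mirrors the acceptable deviation described above.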

WHAT WILL HAPPEN IF MY DEPOSITS DEVIATE FROM THE PREFERRED STRUCTURES IN UNACCEPTABLE WAYS? 

Most likely a MorphoSource representative will contact you for justification or revision.


Modality | Non-preferred PRIMARY data | Rationale for non-preferred PRIMARY data | Preferred PRIMARY data | Non-preferred SECONDARY data | Preferred/acceptable SECONDARY data | TERTIARY data
CT/MRI scans

Non-preferred PRIMARY data

Too raw:

  • projection files
  • sinograms

Too derived:

  • mesh files
  • 3D pdfs
  • images
  • videos

Other:

  • reconstructed image stacks that are uncropped with lots of empty space
  • reconstructed image stacks that include multiple specimens (forbidden)
  • any proprietary formats (e.g., .aim, .vgl) (forbidden)

Rationale for non-preferred PRIMARY data

Too raw:

These files (1) are very large and may be 2-5 times the size of the "preferred" image stacks; (2) often include multiple specimens that were scanned together for efficiency; (3) cannot be visualized 3-dimensionally without further processing, and the critical metadata values needed for successful processing are not reliably available; and (4) are not requested by our user communities, who (as far as we know) do not work with them aside from when they first process scanner output into image stacks.

Too derived:

These files have been processed too much to (1) effectively communicate the quality or limitations of the raw data and more primary derivatives, or (2) have very good reuse potential.

Other:

(1) Reconstructed image stacks that include lots of "empty space" waste server space and should be cropped to minimize it prior to upload. (2) Reconstructed image stacks that include multiple specimens break the MorphoSource data model and are forbidden. (3) Proprietary formats have poor preservation and reuse potential. Their use also deepens inequities between users who can afford expensive software and those who cannot.
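
Cropping away empty space before upload can be sketched as follows. This is a minimal illustration that assumes the stack is held in memory as nested lists indexed `[slice][row][column]` and that "empty" means a voxel value at or below a chosen `threshold`. Both assumptions are ours: real scans contain background noise and would normally be handled with an imaging library. `bounding_box` and `crop_stack` are hypothetical helper names.

```python
def bounding_box(stack, threshold=0):
    """Return ((z0, z1), (y0, y1), (x0, x1)) half-open bounds enclosing
    every voxel whose value exceeds `threshold`, or None if there is none."""
    lo = [None, None, None]
    hi = [None, None, None]
    for z, image in enumerate(stack):
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                if value <= threshold:
                    continue  # treated as empty space
                for axis, index in enumerate((z, y, x)):
                    if lo[axis] is None or index < lo[axis]:
                        lo[axis] = index
                    if hi[axis] is None or index >= hi[axis]:
                        hi[axis] = index + 1
    if lo[0] is None:
        return None
    return tuple(zip(lo, hi))


def crop_stack(stack, threshold=0):
    """Return a copy of the stack cropped to the bounding box of its content."""
    box = bounding_box(stack, threshold)
    if box is None:
        return []
    (z0, z1), (y0, y1), (x0, x1) = box
    return [[row[x0:x1] for row in image[y0:y1]] for image in stack[z0:z1]]


# Toy stack: 4 slices of 5 x 6 voxels, mostly empty.
stack = [[[0] * 6 for _ in range(5)] for _ in range(4)]
stack[1][2][3] = 200
stack[2][3][4] = 180

cropped = crop_stack(stack)
print(bounding_box(stack))
print(len(cropped), len(cropped[0]), len(cropped[0][0]))
```

For the toy stack, the 4 x 5 x 6 volume shrinks to the 2 x 2 x 2 region that contains the two non-empty voxels.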

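Preferred PRIMARY data

  • reconstructed image stacks of a single specimen, cropped to minimize empty space, in a non-proprietary format

Non-preferred SECONDARY data

Too derived:

  • 3D pdfs
  • images
  • videos

Other:

  • any proprietary formats (forbidden)

Preferred/acceptable SECONDARY data and TERTIARY data

  • image stacks with bit depth, cropping, resolution, or chirality changes
  • mesh files
  • textured mesh files
    • textures showing false color surface density
    • manually painted
  • additionally modified/processed meshes
  • 3D pdfs
  • images
  • videos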

Photogrammetry

Non-preferred PRIMARY data

Too raw:

  • raw, uncropped digital photographs

Too derived:

  • textured mesh
  • 3D pdfs
  • images
  • videos

Rationale for non-preferred PRIMARY data

Too raw:

Raw-format, uncropped digital photographs may take up 2 orders of magnitude more space than the "preferred" compressed, cropped images, with no significant effect on the quality of 3D models.

Too derived:

These files have been processed too much to (1) effectively communicate the quality or limitations of the raw data and more primary derivatives, or (2) have very good reuse potential, for example for regenerating 3D models from photo collections with updated algorithms or simply for checking reproducibility.

Preferred PRIMARY data

  • cropped, compressed photographs. This means:
    • cropping each photograph to image dimensions limited to the focal object for the 3D reconstruction
    • saving photos as .jp2 or some other lossless compression format that can be read by photogrammetry modeling software
    • using this cropped, compressed collection as the basis for the 3D model creation

Non-preferred SECONDARY data

Too derived:

  • 3D pdfs
  • images
  • videos

Other:

  • any proprietary formats (forbidden)

Preferred/acceptable SECONDARY data and TERTIARY data

  • textured mesh files
  • additionally modified/processed meshes
  • 3D pdfs
  • images
  • videos

Surface scans

Non-preferred PRIMARY data

  • any proprietary formats (forbidden)

Rationale for non-preferred PRIMARY data

Proprietary formats have poor preservation and reuse potential. Their use also deepens inequities between users who can afford expensive software and those who cannot.

Preferred PRIMARY data

  • No preferences: structured light and laser scanning are accomplished with a great diversity of changing devices and workflows, which makes recommendations here premature.
  • Possibilities:
    • zipped collection of point cloud or mesh fragments representing separate scanning passes (note these must be in an accepted format)
    • single mesh or point cloud representing the "rawest" form of the data after alignment and merging of original point cloud or mesh pieces
    • single mesh that has been textured and processed and is ready for study or reuse

Non-preferred SECONDARY data

Too derived:

  • 3D pdfs
  • images
  • videos

Other:

  • any proprietary formats (forbidden)

Preferred/acceptable SECONDARY data and TERTIARY data

  • mesh files
  • textured mesh files
  • additionally modified/processed meshes
  • 3D pdfs
  • images
  • videos
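
As a rough aid for checking a planned deposit, the "Non-preferred PRIMARY data" entries above can be written as a lookup table. This is only a sketch under stated assumptions: the category strings are shortened paraphrases of the entries above, `review_primary` and `has_listed_proprietary_extension` are hypothetical helpers rather than MorphoSource functions, and the extension set holds only the two examples named on this page (.aim, .vgl), so it is not a complete list of proprietary formats.

```python
import os

# Transcribed from the "Non-preferred PRIMARY data" entries for each modality.
NON_PREFERRED_PRIMARY = {
    "CT/MRI scans": {
        "projection files": "too raw",
        "sinograms": "too raw",
        "mesh files": "too derived",
        "3D pdfs": "too derived",
        "images": "too derived",
        "videos": "too derived",
        "uncropped image stacks": "other",
        "multi-specimen image stacks": "forbidden",
        "proprietary formats": "forbidden",
    },
    "Photogrammetry": {
        "raw uncropped photographs": "too raw",
        "textured mesh": "too derived",
        "3D pdfs": "too derived",
        "images": "too derived",
        "videos": "too derived",
    },
    "Surface scans": {
        "proprietary formats": "forbidden",
    },
}

# Only the two example extensions named in the text; NOT a complete list.
PROPRIETARY_EXAMPLES = {".aim", ".vgl"}


def review_primary(modality, kind):
    """Return the listed concern for depositing `kind` as PRIMARY data."""
    listed = NON_PREFERRED_PRIMARY.get(modality, {})
    return listed.get(kind, "not listed as non-preferred")


def has_listed_proprietary_extension(filename):
    """True if the file extension is one of the proprietary examples above."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in PROPRIETARY_EXAMPLES


print(review_primary("CT/MRI scans", "sinograms"))
print(review_primary("Surface scans", "mesh files"))
print(has_listed_proprietary_extension("scan_001.AIM"))
```

A result of "not listed as non-preferred" only means the page raises no objection to that file type as PRIMARY data. It does not replace review by a MorphoSource representative.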