Information for Potential Mirror Sites

Ani says he believes all the SDSS data is available on the DAS site if anyone wants it; it just may not be very convenient to pull all of it quickly. What NCDM distributes is everything needed for a full mirror, i.e. all of the CAS data and all the DAS data that is not contained in the CAS. You don't need, for example, the tsObj and spPlate files that are already ingested into the CAS.

There are no official guidelines on the total disk space needed for a mirror site, partly because no one has a full mirror (CAS + DAS) as yet. How much disk space you provision depends on whether you intend to upgrade your mirror with every future SDSS release. It's each researcher's call.

Per Ani:
I assume that the CAS and DAS will be hosted on different servers, so the 13 TB will be subdivided between them. On the CAS server, I would recommend at least 3 TB to leave enough room for tempdb expansion etc. You do not need to copy the DRx subsets if you are bringing over the full dataset.


The catalog data is currently ~2.4 TB in size. The image data is ~10 TB. The spectroscopic data is ~230 GB. The DR5 subsets that we also have are another ~320 GB.

These figures add up to just under 13 TB, so a 13 TB disk should hold all the data. But that leaves no room for growth as future releases get larger. Plus, there is more data available on request that we don't host here at UIC.
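
For anyone doing their own sizing, the arithmetic above can be sketched in a few lines of Python. The component sizes are the approximate figures quoted on this page; the 25% growth allowance is purely a hypothetical placeholder, since there is no official guideline on headroom, and the function and variable names are made up for illustration.

```python
# Approximate component sizes in TB, taken from the figures quoted on this page.
SIZES_TB = {
    "catalog (CAS)": 2.4,
    "images": 10.0,
    "spectra": 0.23,
    "DR5 subsets": 0.32,
}


def total_tb(sizes, skip=()):
    """Sum the component sizes, leaving out any component named in `skip`."""
    return sum(size for name, size in sizes.items() if name not in skip)


def with_headroom(total, growth_fraction):
    """Scale a total by (1 + growth_fraction) to allow for future releases."""
    return total * (1.0 + growth_fraction)


# Everything listed above.
full = total_tb(SIZES_TB)

# Full dataset without the DR5 subsets (not needed if you copy the full dataset).
no_subsets = total_tb(SIZES_TB, skip=("DR5 subsets",))

# ASSUMPTION: 25% is an arbitrary illustrative allowance, not a recommendation.
GROWTH_FRACTION = 0.25

print(f"all components:      {full:.2f} TB")
print(f"without DR5 subsets: {no_subsets:.2f} TB")
print(f"with headroom:       {with_headroom(no_subsets, GROWTH_FRACTION):.2f} TB")
```

If the CAS and DAS live on separate servers, the same dictionary can be split by component, keeping in mind the suggestion of at least 3 TB on the CAS server for tempdb expansion.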