Although DCM4CHEE can interface with HSM systems using OS-level tools out of the box, you will need to change some configuration parameters to adapt it to your environment and get it running.
This document assumes that you already have DCM4CHEE running, and you can send DICOM images to DCM4CHEE and query-retrieve them.
First you'll have to define a separate directory to hold the files that the HSM tools will migrate. If you plan to handle a large amount of DICOM data, it is advisable to use a separate partition on a physically separate disk or RAID system for HSM migrations. This reduces I/O contention on the disks and increases the throughput of the system.
I will use UNIX notation here to make things clearer and to be consistent with DCM4CHEE's internal representation. Let's say you have a partition mounted as /hsm that will be used for HSM migrations.
- Add this partition as a file system to DCM4CHEE. In the JMX console, open the FileSystemMgt MBean view. Scroll down to the addFileSystem() method and invoke it with the parameters: dirPath => tar:/hsm, aet => YOUR_DCM4CHEE_INSTALLATION_AET, availability => NEARLINE, status => RO, user info => SOME_INFO_FOR_YOUR_REFERENCE. Here dirPath must have the prefix tar: to match the DestinationFileSystem parameter in the FileCopy service. This prefix tells the FileCopy service to pack files into a tar file before copying.
- At the top of the FileSystemMgt view, in the configuration parameters section, adjust the clean-up rules: set StudyAgeForDeletion to a suitable age. In the value, w means weeks; you can use h (hours) and d (days) as well.
FileSystemMgt cleans up the main file system at the interval given by the FreeDiskSpaceInterval parameter. The default is 5m; change it to a longer period to reduce contention on the DB and disks. Files are copied by the FileCopy service. After a file has been copied to the HSM partition, the FileCopy service doesn't touch the original file entry in the database but adds a new entry for the copy. This entry has a different status (see below) and a different file_path, something like <SOME_PATH>.tar!<PATH_TO_FILE_IN_TAR>. The FileSystemMgt service relies on this entry: during a clean-up session it looks for files older than the given age, and if it finds the corresponding copy entry in the database with the status ARCHIVED, it deletes the original file and its original entry from the database.
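To make the age notation, the tar entry path, and the clean-up rule more concrete, here is a minimal Python sketch. It is not DCM4CHEE code: the helper names parse_age, split_tar_path and may_delete_original are hypothetical, and I am assuming that an age value is a whole number followed by one of the unit letters h, d or w, and that the first "!" separates the tar file from the path inside it.

```python
import re
from datetime import timedelta
from typing import Iterable, Tuple

# Unit suffixes named in the text: h = hours, d = days, w = weeks.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_age(value: str) -> timedelta:
    """Turn a string such as '12h' into a timedelta (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([hdw])", value.strip())
    if match is None:
        raise ValueError(f"unsupported age value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def split_tar_path(file_path: str) -> Tuple[str, str]:
    """Split '<tar path>!<path inside tar>' at the first '!' (assumed layout)."""
    tar_path, sep, member = file_path.partition("!")
    if not sep or not tar_path.endswith(".tar"):
        raise ValueError(f"not a tar entry path: {file_path!r}")
    return tar_path, member


def may_delete_original(age: timedelta, threshold: timedelta,
                        copy_statuses: Iterable[str]) -> bool:
    """Clean-up rule from the text: the file is old enough AND a copy is ARCHIVED."""
    return age > threshold and "ARCHIVED" in set(copy_statuses)
```

For example, with made-up values, may_delete_original(parse_age("3w"), parse_age("2w"), ["TO_ARCHIVE"]) is False: the file is old enough, but no copy has reached the ARCHIVED status yet, so the original stays online.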
- In the FileCopy MBean view you'll have to adjust the following parameters: set
true. This will tell the
FileCopy service to pack files into a tar archive, verify the MD5 sums of the copied files, save the tar file under
/hsm, add a copy DB entry for each file in the tar archive, and set its status to TO_ARCHIVE. At this point the HSM tools step in. Depending on your environment, you might have a transparent HSM migration tool or a command-line tool that migrates a given file to tape or other long-term storage. If you have a transparent HSM migration agent, configure it to migrate all files under
/hsm to your long term storage. If you don't have a transparent migration tool, then use
TarOutgoingDirectory config parameters of the
FileCopy service to invoke an HSM migration command after the files have been packed into a tar file.
The FileCopy service will reschedule file copy orders if they fail for some reason. The number and interval of retries can be changed in
- Next you'll have to change the SyncFileStatus MBean configuration. MonitoredFileSystem is your original main file system path, where all files sent to your server are kept. Change Command to match the HSM query tools provided by your environment. Pattern is the regular expression used to check the output of the Command. It also depends on your environment, namely on the response of the HSM query. Change TaskInterval to adjust when and how often you'd like to run file status checks.
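The interplay of Command and Pattern can be pictured with a small Python sketch: run a query command, then test its output against a regular expression. This only illustrates the idea. The function name is_archived and the sample output line are made up, a Python one-liner stands in for a real HSM query tool, and I am assuming the pattern may match anywhere in the output, which may differ from how DCM4CHEE applies it.

```python
import re
import subprocess
import sys
from typing import List


def is_archived(command: List[str], pattern: str) -> bool:
    """Run a query command and check its standard output against a regex."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    # Assumption: a match anywhere in the output counts as success.
    return re.search(pattern, result.stdout) is not None


# Stand-in for an HSM query tool; it prints a made-up status line.
fake_query = [sys.executable, "-c", "print('x.tar: migrated')"]

if __name__ == "__main__":
    print(is_archived(fake_query, r"\bmigrated\b"))
```

Before relying on the scheduled checks, it is worth running your real HSM query tool by hand and trying your Pattern against its actual output, including the output for a file that has not been migrated yet.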
If everything is set up correctly, your server will schedule a file copy order as soon as it receives files. The FileCopy service will put them into a tar archive and trigger an HSM migration. At this point you'll have a copy of the files in long-term storage as well as in your online storage. You'll also have duplicate DB entries for each file, but with a different
status. Depending on the configured intervals, file status checks will run, and the files that pass them will be marked as
FileSystemMgt will take care of cleaning up your online storage and will delete old, successfully archived files. When somebody tries to access the archived files, invoking a
QueryRetrieveScpService will use
TarRetriever to retrieve files. If you have a transparent HSM, the only thing you need to do is tell
TarRetriever which directory to use as a
CacheRoot and leave
NONE. The rest will be transparent: your HSM will bring files back from long-term storage when
TarRetriever tries to access them. If your HSM is not transparent, use
TarFetchCommand to invoke an HSM retrieve command.
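The retrieval path can be sketched in Python as follows: optionally call a fetch command, then unpack the requested file from the tar archive into a cache directory. This is an illustration under my own assumptions, not the TarRetriever implementation. The function name retrieve, the cache layout (cache root, then tar file name, then path inside the tar) and the convention of appending the tar path to the fetch command are all hypothetical.

```python
import subprocess
import tarfile
from pathlib import Path
from typing import List, Optional


def retrieve(tar_path: Path, member: str, cache_root: Path,
             fetch_command: Optional[List[str]] = None) -> Path:
    """Extract one file from a tar archive into cache_root and return its path."""
    if fetch_command is not None:
        # Non-transparent HSM: ask it to bring the tar file back first.
        # Appending the tar path as the last argument is an assumption.
        subprocess.run(fetch_command + [str(tar_path)], check=True)
    # Hypothetical cache layout: <cache_root>/<tar file name>/<path in tar>.
    target = cache_root / tar_path.name / member
    target.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        source = archive.extractfile(member)  # KeyError if the member is missing
        if source is None:
            raise FileNotFoundError(f"{member} is not a regular file in {tar_path}")
        with source:
            target.write_bytes(source.read())
    return target
```

With a transparent HSM you would leave fetch_command as None; opening the tar file is then what makes the HSM bring it back from long-term storage.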
HSM Service Configuration