Parallel IO with MPI

From XdmfWeb
Revision as of 13:17, 13 March 2017 by Burns
Note: As of HDF5 version 1.8.13, HDF5 must be patched as follows to enable this functionality.

In the file /hdf5/src/CMakeLists.txt

change

if (NOT HDF5_INSTALL_NO_DEVELOPMENT)
  install (
      FILES
          ${H5_PUBLIC_HEADERS}
      DESTINATION
          ${HDF5_INSTALL_INCLUDE_DIR}
      COMPONENT
          headers
  )
endif (NOT HDF5_INSTALL_NO_DEVELOPMENT)

to

if (NOT HDF5_INSTALL_NO_DEVELOPMENT)
  install (
      FILES
          ${H5_PUBLIC_HEADERS}
          ${H5_PRIVATE_HEADERS}
      DESTINATION
          ${HDF5_INSTALL_INCLUDE_DIR}
      COMPONENT
          headers
  )
endif (NOT HDF5_INSTALL_NO_DEVELOPMENT)

Distributed Shared Memory

By leveraging the h5fd and HDF5 libraries, Xdmf provides an interface for setting up a DSM server that can be read from and written to in much the same way as an HDF5 file.

Multiple datasets and file names are supported as of version 3.3.0.

Initializing DSM

After starting MPI, the DSM can be started by creating an instance of XdmfHDF5WriterDSM or XdmfHDF5ControllerDSM.

int size, id, dsmSize;
dsmSize = 64; // The total size of the DSM being created
std::string newPath = "dsm";

unsigned int numServersCores = 2;

MPI_Comm comm = MPI_COMM_WORLD;

MPI_Init(&argc, &argv);

MPI_Comm_rank(comm, &id);
MPI_Comm_size(comm, &size);

unsigned int dataspaceAllocated = dsmSize/numServersCores;

// Split MPI_COMM_WORLD so that a communicator containing only the non-server cores exists.

MPI_Comm workerComm;

MPI_Group workers, dsmgroup;

MPI_Comm_group(comm, &dsmgroup);
int * ServerIds = (int *)calloc((numServersCores), sizeof(int));
unsigned int index = 0;
for(int i=size-numServersCores ; i <= size-1 ; ++i)
{
  ServerIds[index++] = i;
}

MPI_Group_excl(dsmgroup, index, ServerIds, &workers);
int testval = MPI_Comm_create(comm, workers, &workerComm);
free(ServerIds); // memory from calloc is released with free (declared in <cstdlib>); cfree is obsolete

// The last two cores in the communicator are dedicated to managing the DSM.
shared_ptr<XdmfHDF5WriterDSM> exampleWriter = 
  XdmfHDF5WriterDSM::New(newPath, comm, dataspaceAllocated, size-numServersCores, size-1);

This creates a DSM buffer and manager, which must be passed to any new DSM objects for those objects to function.
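
The rank arithmetic in the listing above can be checked on its own, without MPI or Xdmf. The following is a minimal sketch under stated assumptions: DsmLayout and planDsmLayout are hypothetical names introduced here for illustration and are not part of Xdmf. They only reproduce the expressions size-numServersCores, size-1 and dsmSize/numServersCores from the listing.

```cpp
#include <cassert>

// Hypothetical helper, not part of Xdmf: describes which ranks serve the DSM
// and how much of the buffer each server rank is asked to hold.
struct DsmLayout {
  int firstServerRank;         // first rank reserved for the DSM
  int lastServerRank;          // last rank reserved for the DSM (inclusive)
  unsigned int sizePerServer;  // dsmSize / numServers, using integer division

  // True if the given rank falls inside the reserved server range.
  bool isServer(int rank) const {
    return rank >= firstServerRank && rank <= lastServerRank;
  }
};

// Reserve the last numServers ranks of a communicator that has worldSize ranks.
inline DsmLayout planDsmLayout(int worldSize,
                               unsigned int numServers,
                               unsigned int dsmSize) {
  assert(numServers > 0);                            // avoid division by zero
  assert(static_cast<int>(numServers) < worldSize);  // keep at least one worker rank

  DsmLayout layout;
  layout.firstServerRank = worldSize - static_cast<int>(numServers);
  layout.lastServerRank = worldSize - 1;
  layout.sizePerServer = dsmSize / numServers;
  return layout;
}
```

For example, planDsmLayout(8, 2, 64) reserves ranks 6 and 7 and assigns each a share of 32. Because the division is integral, a dsmSize that is not a multiple of the server count is rounded down per server.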

When the user is finished with the DSM, the manager must be disposed of. Skipping this step results in an error (a segfault) at exit.

if (id == 0)
{
  exampleWriter->stopDSM();
}

MPI_Barrier(comm);

// The DSM manager must be deleted, or else there will be a segfault.
exampleWriter->deleteManager();

MPI_Finalize();
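
Since the order of the shutdown calls matters, it may help to keep them in one place. The sketch below is an illustration, not part of Xdmf: shutdownDsm and RecordingWriter are hypothetical names, and the only assumption about the writer is that it exposes stopDSM() and deleteManager() as used in the listing above. The barrier is passed in as a callable so that the ordering can be exercised without MPI.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xdmf. Writer is any type that exposes
// stopDSM() and deleteManager(), as the writer in the listing above does.
template <typename Writer>
void shutdownDsm(Writer& writer, int rank, const std::function<void()>& barrier) {
  assert(static_cast<bool>(barrier));  // a callable barrier must be supplied

  if (rank == 0) {
    writer.stopDSM();      // rank 0 alone calls stopDSM(), as in the listing
  }
  barrier();               // with MPI this would be: [&] { MPI_Barrier(comm); }
  writer.deleteManager();  // every rank deletes its manager after the barrier
}

// Stand-in writer that records the calls it receives, used only to show the
// call order without MPI or Xdmf.
struct RecordingWriter {
  std::vector<std::string> calls;
  void stopDSM() { calls.push_back("stopDSM"); }
  void deleteManager() { calls.push_back("deleteManager"); }
};
```

With the real writer, the call would look like shutdownDsm(*exampleWriter, id, [&] { MPI_Barrier(comm); }); followed by MPI_Finalize().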

A full example program can be found in the Xdmf source at: Xdmf/core/dsm/tests/Cxx/DSMLoopTest.cpp