Xdmf::New(): Difference between revisions

Latest revision as of 11:53, 7 January 2010

After some discussion with several interested parties, we've decided to rework Xdmf a bit. Most of the changes involve the API, particularly a move to smart pointers and a more full-featured, customizable API for producing Xdmf in parallel.

The existing XML format will most likely stay the same, with a few additions. If you currently produce Xdmf data outside the API, or only read Xdmf as a file format, the impact should be minimal.

Here's the current running list of proposed changes and additions:

Smart Pointers

 Smart pointers in C++ will help eliminate memory leaks and double-free problems.
 Most likely the implementation will be intrusive smart pointers with reference counting.
 It is expected that moving to smart pointers will have minimal effect on performance while
 greatly enhancing resource management.
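
 As a rough illustration of what "intrusive smart pointers with reference counting" could look like, here is a
 minimal sketch in standard C++. The names RefCounted, IntrusivePtr and DemoGrid are hypothetical and are not part
 of the existing or proposed Xdmf API; the counter is also not thread-safe, which a real implementation would
 need to address.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical base class: the reference count lives inside the object
// itself, which is what makes the scheme "intrusive".
class RefCounted {
public:
    RefCounted() : refCount_(0) {}
    RefCounted(const RefCounted&) = delete;
    RefCounted& operator=(const RefCounted&) = delete;

    void addRef() const { ++refCount_; }
    void release() const {
        assert(refCount_ > 0);
        if (--refCount_ == 0) delete this;  // last owner frees the object
    }
    std::size_t useCount() const { return refCount_; }

protected:
    virtual ~RefCounted() {}  // protected: only release() may destroy

private:
    mutable std::size_t refCount_;  // plain counter, not atomic
};

// Hypothetical smart pointer that forwards ownership to the embedded count.
template <typename T>
class IntrusivePtr {
public:
    explicit IntrusivePtr(T* p = nullptr) : ptr_(p) { if (ptr_) ptr_->addRef(); }
    IntrusivePtr(const IntrusivePtr& other) : ptr_(other.ptr_) { if (ptr_) ptr_->addRef(); }
    ~IntrusivePtr() { if (ptr_) ptr_->release(); }

    // Copy-and-swap: the by-value parameter holds the new reference and
    // releases the old one when it goes out of scope.
    IntrusivePtr& operator=(IntrusivePtr other) {
        std::swap(ptr_, other.ptr_);
        return *this;
    }

    T* get() const { return ptr_; }
    T* operator->() const { return ptr_; }
    T& operator*() const { return *ptr_; }

private:
    T* ptr_;
};

// Hypothetical object used only for illustration; it reports its own
// destruction through an external counter so the behaviour can be observed.
class DemoGrid : public RefCounted {
public:
    explicit DemoGrid(int* destroyed) : destroyed_(destroyed) {}

protected:
    ~DemoGrid() override { if (destroyed_) ++*destroyed_; }

private:
    int* destroyed_;
};
```

 Because the count is stored in the object, an IntrusivePtr is the size of a raw pointer and no separate control
 block is allocated, which is one reason such a design is expected to cost little at run time.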

API changes/additions for sharing and producing data in parallel

  While accessing simple Xdmf is well defined and straightforward in the current API, producing and sharing
  the underlying geometric and computed data can become complex (particularly in a parallel computation).
  Since there can be several ways to represent the same data (a single large uniform grid, collections, etc.)
  it is best to allow the application to determine the proper representation and the most efficient method to
  achieve the result (generate the heavy data and collect the associated XML). What is needed is a more flexible 
  mechanism.

  Will Dicharry (Stellar Science Ltd Co) has proposed an API that looks very promising. If you'd like to take a look, the code is available at:
  SVN Repository at GoogleCode (http://xdm.googlecode.com/svn/trunk)

Remove libxml2 dependencies from base objects - use containers

  Currently, Xdmf is tightly tied to libxml2, actually using the library to store state information. This leads to
  instances where data in the Xdmf objects can become out of sync with the data in the XML representation in libxml2.
  A better approach is probably to let the Xdmf objects store all necessary data internally using C++ containers and 
  decouple the generation of the XML using a visitor pattern. This would also allow for various other representations
  of the light data as they become necessary.
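
  The decoupling described above might be sketched as follows. This is only an illustration under stated
  assumptions: Item, Visitor and XmlWriter are hypothetical names, not classes from Xdmf, and the writer does no
  escaping of attribute values. The object keeps its state in standard containers, and XML is produced only when
  a visitor walks the tree.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

class Item;

// Hypothetical visitor interface: each output representation implements it.
class Visitor {
public:
    virtual ~Visitor() {}
    virtual void visit(const Item& item) = 0;
};

// Hypothetical light-data node that stores its state in C++ containers
// instead of in a libxml2 tree.
class Item {
public:
    explicit Item(const std::string& tag) : tag_(tag) { assert(!tag.empty()); }

    void setProperty(const std::string& key, const std::string& value) {
        properties_[key] = value;
    }
    Item& addChild(const std::string& tag) {
        children_.push_back(std::unique_ptr<Item>(new Item(tag)));
        return *children_.back();
    }

    const std::string& tag() const { return tag_; }
    const std::map<std::string, std::string>& properties() const { return properties_; }
    const std::vector<std::unique_ptr<Item>>& children() const { return children_; }

    void accept(Visitor& visitor) const { visitor.visit(*this); }

private:
    std::string tag_;
    std::map<std::string, std::string> properties_;
    std::vector<std::unique_ptr<Item>> children_;
};

// One possible representation: serialize the tree as XML text.
// Attribute values are written verbatim (no escaping) to keep the sketch short.
class XmlWriter : public Visitor {
public:
    void visit(const Item& item) override {
        out_ << '<' << item.tag();
        for (const auto& kv : item.properties())
            out_ << ' ' << kv.first << "=\"" << kv.second << '"';
        if (item.children().empty()) {
            out_ << "/>";
            return;
        }
        out_ << '>';
        for (const auto& child : item.children())
            child->accept(*this);
        out_ << "</" << item.tag() << '>';
    }
    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};
```

  Since the XML exists only as the output of XmlWriter, there is no second copy of the state to fall out of sync,
  and another representation of the light data would be another Visitor subclass.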

Better Fortran Support

  Xdmf is currently 'C'-centric in that array dimensions are assumed to be row-major and data is accessed in that fashion.
  To properly use Xdmf with arrays in Fortran, which is column-major, the arrays need to be transposed on one end or the other.
  Transposing in memory may not work for huge arrays and transposing during I/O may incur significant overhead. At the very least,
  there should be transposing support with options.
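
  To make the two layouts concrete, here is a small sketch with hypothetical helper names (they are not Xdmf
  functions). The first function does an out-of-place 2-D transpose, which needs a second buffer of the same size
  and is therefore the approach that may not scale to huge arrays. The second relies on the fact that a
  column-major array with dimensions (n1, n2, n3) has the same memory layout as a row-major array with dimensions
  (n3, n2, n1), so in some cases only the dimension list needs to be reversed and no data is moved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy a rows x cols array from row-major (C) order into column-major
// (Fortran) order. Requires a second buffer as large as the input.
std::vector<double> rowToColumnMajor(const std::vector<double>& src,
                                     std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);
    std::vector<double> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Element (r, c): row-major offset r*cols + c,
            // column-major offset c*rows + r.
            dst[c * rows + r] = src[r * cols + c];
        }
    }
    return dst;
}

// Zero-copy alternative: describe the same memory with reversed dimensions.
std::vector<std::size_t> reverseDimensions(const std::vector<std::size_t>& dims) {
    return std::vector<std::size_t>(dims.rbegin(), dims.rend());
}
```

  Which of the two is appropriate depends on whether the reader of the data can accept reversed index order,
  which is the kind of choice the "options" mentioned above would presumably expose.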

Additional Topology Support

  Additional Topology types

Static Geometry, Dynamic Attribute Support

  If the Topology and/or Geometry does not change over time, it would be more efficient to cache that information. The
  current implementation assumes that every <Grid> is independent and re-reads the Topology/Geometry. One idea is to use
  XPointers in XML and mark the <Grid> with an XML attribute that tells Xdmf that it's possible to cache. Another idea
  is to cache the CData of the <DataItem> (from Dominik Szczerba) to flag identical heavy data items. DSM applications where
  the Heavy Data location stays constant, but the underlying data changes, must be considered.
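
  The CData-keyed caching idea could be sketched roughly as below. HeavyDataCache and its loader callback are
  hypothetical and only illustrate the mechanism: identical <DataItem> CData strings map to one shared buffer, and
  a "cacheable" flag (standing in for the proposed XML attribute) lets DSM-style items bypass the cache because
  their location is constant while their contents change.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

typedef std::vector<double> HeavyData;
typedef std::shared_ptr<const HeavyData> HeavyDataPtr;

// Hypothetical cache keyed by the CData text of a <DataItem>.
class HeavyDataCache {
public:
    typedef std::function<HeavyData(const std::string&)> Loader;

    // Return the data for `cdata`. The loader stands in for the real
    // heavy-data read and is called only on a miss or when the item
    // is not marked as cacheable.
    HeavyDataPtr get(const std::string& cdata, bool cacheable, const Loader& loader) {
        assert(loader && "a loader callback is required");
        if (cacheable) {
            auto hit = entries_.find(cdata);
            if (hit != entries_.end()) return hit->second;
        }
        HeavyDataPtr data = std::make_shared<HeavyData>(loader(cdata));
        if (cacheable) entries_[cdata] = data;
        return data;
    }

    // Drop one entry, e.g. when the application knows the data changed.
    void invalidate(const std::string& cdata) { entries_.erase(cdata); }

    std::size_t size() const { return entries_.size(); }

private:
    std::map<std::string, HeavyDataPtr> entries_;
};
```

  With this arrangement, a static Topology or Geometry referenced by every time step is read once, while items
  that are not flagged as cacheable are re-read on each access.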

Active Grid and/or Attribute

  Provide some mechanism to mark a Grid or an Attribute as "active" so that, by default, only that one is read. This would be very
  efficient, particularly during debugging.

Much more testing, example code, and example datasets

  Use CTest as much as possible. Take more advantage of the existing dashboards. Provide many more example datasets and code for things
  such as parallel I/O, ghost cells, Collections, etc.
