REDOC Section I.5 - Data Analysis & Display System

Date: 09 July 2002

This document sets out the long-term, high-level requirements for a flexible library of tools to facilitate processing and analysis of data in the common PRISM data format, together with the requirements of the display system. Existing packages will be surveyed to see whether they are appropriate or could be extended to meet the requirements.

PART A: Data Analysis

    Current Working Assumptions:
  1. The WP2c/4a meeting of 15-16 October 2001 concluded that the CF convention meets our Meta-Data requirements and that netCDF would make a good file format for data exchange. We are seeking comment on these recommendations. Details of netCDF and the CF convention can be found in the following documents: http://www.unidata.ucar.edu/packages/netcdf/index.html and http://www.cgd.ucar.edu/cms/eaton/netcdf/CF-current.htm

  2. AGREE
    Proposed extensions to the CF convention:
  1. The Meta-Data needs to record extra information for a full description of the extent and shape of cells in non-rectilinear grids, for instance cell area. A general method is being added to CF in order to support this.

  2. AGREE
  3. The Meta-Data should be able to support non-spherical grids e.g. for data on a Cartesian plane. No specific support is offered in CF for this, but nor is it disallowed.

  4. AGREE
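
As a minimal sketch of why cell area belongs in the Meta-Data: on a non-rectilinear grid, an unweighted mean over cells is biased towards regions where the cells are small, so an analysis tool needs the per-cell area to form a correct spatial mean. The function and variable names below are hypothetical, and the values are made-up numbers rather than data from any PRISM grid.

```python
def area_weighted_mean(values, cell_areas):
    """Return the mean of `values` weighted by `cell_areas` (same length)."""
    if len(values) != len(cell_areas):
        raise ValueError("values and cell_areas must have the same length")
    total_area = sum(cell_areas)
    if total_area <= 0:
        raise ValueError("total cell area must be positive")
    return sum(v * a for v, a in zip(values, cell_areas)) / total_area


# Hypothetical example: three cells, the last one twice as large as the others.
temperature = [10.0, 20.0, 30.0]
area = [1.0, 1.0, 2.0]  # arbitrary units, illustrative only

weighted = area_weighted_mean(temperature, area)
unweighted = sum(temperature) / len(temperature)
```

Here the two means differ (22.5 against 20.0), which is the kind of discrepancy that recording cell area in the file is meant to avoid.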
    Meta-Data and File Formats:
  1. The CF convention provides Meta-Data to describe source, history, institution, etc. It should, consequently, be possible to have variables with different source, institution, history, etc in the same file.

  2. AGREE
  3. It is desirable that the File Format should support both lossy and lossless compression. (Although netCDF does not itself support lossless compression, this can be worked around by running gzip on the entire file.) It is also desirable that data can be written to any defined accuracy. (NetCDF only offers predefined data types.)

  4. AGREE
  5. There should be no limit to the number of records that can be held in a file, nor to the number of dimensions that can be stored in a record.

  6. AGREE
  7. It should be possible to deal with multiple files as a single logical file.

  8. AGREE
  9. CF Meta-Data is intended to describe physical data at points or in cells. PRISM files may contain other kinds of information, which would need different kinds of Meta-Data to describe them. For example, a formula used to calculate temperature at a certain level in the atmosphere model may need to be passed to the ocean model via the coupler. The PRISM Meta-Data definition needs to be extendable beyond the CF convention to cater for this type of data transfer. Multiple files may be required to cater for data belonging to different conventions, but they should be treated as a single logical file.

  10. AGREE
  11. PRISM will need to agree lists of standard names for data fields and co-ordinates. CF includes a standard name table; the requirements of PRISM can be incorporated into this.

  12. AGREE
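
The requirement that data can be written to a defined accuracy can be illustrated with the packing scheme that the netCDF and CF conventions describe through the `scale_factor` and `add_offset` attributes: values are stored as integers and recovered as `packed * scale_factor + add_offset`. The sketch below is pure Python and does not call any netCDF library; the helper names and the sample values are assumptions chosen for illustration only.

```python
def pack(values, scale_factor, add_offset):
    """Quantise floats to integers: packed = round((v - add_offset) / scale_factor)."""
    return [round((v - add_offset) / scale_factor) for v in values]


def unpack(packed, scale_factor, add_offset):
    """Recover approximate floats: v = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]


# Hypothetical field: temperatures in kelvin, stored to 0.01 K accuracy.
temperatures = [271.3512, 288.1549, 301.9987]
scale_factor = 0.01   # requested accuracy (assumed)
add_offset = 273.15   # keeps the packed integers small (assumed)

packed = pack(temperatures, scale_factor, add_offset)
restored = unpack(packed, scale_factor, add_offset)

# The largest error introduced by packing, at most about half a quantisation step.
max_error = max(abs(a - b) for a, b in zip(temperatures, restored))
```

This form of packing is lossy by design: the error is bounded by roughly half of `scale_factor`, so the accuracy is chosen by whoever writes the file rather than fixed by a predefined data type.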
     

    Processing Library:

  13. There will be a flexible choice of run-time processing and post-processing.

  14. AGREE
  15. The library should be able to deal with time processing of "moving" missing data, e.g. the time mean of cloud-top temperature when cloud points move.

  16. AGREE
  17. It should be possible to run the data processing and display software as separate components. These components should be able to be integrated effectively.

  18. AGREE
  19. A rich set of processing tools that are available through a high level language is needed. The GUI should sit on top of this tool set. Ideally, the tool set should be able to run interactively from the command-line. This software toolkit approach allows the users to develop their own processing algorithms.

  20. AGREE
  21. The toolkit and processing library should be multi-dimensional in design (at least 3 space dimensions, time and other dimensions such as radiation bands), i.e. a single 'structure' within the library will be able to hold multi-dimensional data and its Meta-Data.

  22. AGREE
  23. The toolkit should simultaneously process data and Meta-Data.

  24. AGREE
  25. A program to check that a given file conforms to the CF Meta-Data standard should be included as part of the package.

  26. AGREE
  27. It should be possible to link the toolkit to other software (e.g. PV-WAVE).

  28. AGREE
  29. It will be possible to archive processed data. The processed data should have the correct Meta-Data.

  30. AGREE
  31. The software needs to support efficient remote file access. It must be possible to access subsets of a file, e.g. slices through the file and individual structures in the file.

  32. AGREE
  33. The system must be efficient enough to process bulk data.

  34. AGREE
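
The "moving" missing data requirement in the list above can be sketched as follows. This is a minimal pure-Python illustration, not a proposal for the library's interface: the `MISSING` marker, the function name and the sample values are all assumptions. The point is that the mean at each grid point is taken only over the time steps where that point holds valid data, and a point that is never valid stays missing.

```python
MISSING = None  # stand-in for a file's missing-data value (assumption)


def time_mean(series_by_time):
    """Mean over time at each grid point, ignoring missing values.

    `series_by_time` is a list of time steps, each a list of point values.
    Returns (means, counts); a point with no valid time step stays MISSING.
    """
    n_points = len(series_by_time[0])
    sums = [0.0] * n_points
    counts = [0] * n_points
    for step in series_by_time:
        if len(step) != n_points:
            raise ValueError("all time steps must have the same number of points")
        for i, value in enumerate(step):
            if value is not MISSING:
                sums[i] += value
                counts[i] += 1
    means = [s / c if c else MISSING for s, c in zip(sums, counts)]
    return means, counts


# Hypothetical cloud-top temperature (K) at four points over three time steps.
# The cloud moves, so the set of valid points changes from step to step.
cloud_top_temperature = [
    [220.0, 230.0, MISSING, MISSING],
    [MISSING, 240.0, 250.0, MISSING],
    [MISSING, MISSING, 260.0, MISSING],
]

means, counts = time_mean(cloud_top_temperature)
```

Returning the per-point count of valid time steps alongside the mean lets a user judge how representative each value is; such a count is also a natural candidate for the Meta-Data of the processed field.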

PART B: General

Processing Library & Visualization

Questions Ranking and Comments
The processing library and data display should be non-proprietary.
    Options: AGREE / DISAGREE
    Response: Yes
All UNIX-like platforms should be supported.
    Options: AGREE / DISAGREE
    Response: Yes
Would you be willing to learn a new scripting language?
    Options: yes / Depends on the package / Would like to, but do not have time / no
    Response: Yes

PART C: Visualization

This visualization questionnaire consists of two parts. The first section, 'Requirements For Visualization Of PRISM Output', evaluates future needs for data display of a PRISM coupled simulation. The second section, 'Constraints: Past Experiences', elaborates on how researchers have done their visualization. The results of both questionnaires should lead to a compromise on the work effort to be spent on particular visualization techniques (ARCDI).

Requirements For Visualization Of PRISM Output

Questions Ranking and Comments
Which kinds of standard plots are mandatory (e.g. for model comparison)? Please mark or add further plots!
    horizontal slices: Yes
    vertical slices: Yes
    variables mapped to isopycnal surfaces: Yes
    variables mapped to pressure surfaces: Yes
    Hovmoeller diagrams: Yes
    timeseries of zonal means: Yes
The visualisation package needs to make it easy to do comparisons of equivalent fields from different runs, times etc.
    Options: AGREE / DISAGREE / Any Comments
    Response: Yes
Which standard plots would you like to have additionally? Please discriminate between runtime (online quality control) and postprocessing!
    At runtime: Yes
    Postprocessing: Yes, with no X session opened previously. We currently use Virtual Frame Buffer to do this, but it should be directly part of the graphics package.
Would you like to have a command line and/or scripting interface for data display in order to have better control over the data display? Please mark your preferences!
    scripting interface: To learn what command lines to use
    command line: Mostly preferred
    both: Yes
What type of graphical output format do you need?
    screen: Yes
    jpeg: Not used, but animation tools associated
    postscript: pdf format preferred (smaller than ps), and eps to include in documents
    png: Yes, and also the mng format to be able to make animations. Currently using gif (animated gif) but would like to replace this.
    animation (mpeg, avs): Not used, because oriented to video. Always see some degradation. The svg format could be interesting.
Do you think 3D graphics is necessary/helpful to analyse PRISM data?
    Options: very often / sometimes / never
    Response: Yes, especially with animated isovolumes.
Which functionality should such a 3D package have?
    slices: Yes
    streamlines: Yes
    isosurfaces: Yes
    trajectories: Yes
    particle trace: Yes
    volume rendering: Yes
    multiple datasets, overlays: Yes
    non-regular grids: Of course
Are animations important in order to display temporal evolution of certain complex data relationships? Please rank!
    Options: yes / from time to time, to present my project to others / nice to have / no
    Response: Really important to explore/correct simulations. Yes, to focus on a special phenomenon already known.
Would you like to see such animations online as a way of monitoring the correctness of your simulation? Please comment!
    Options: yes / nice to have / not applicable
    Response: Already the case
Would you like to share your results and to work on them with other scientists by means of an online collaborative visualization? Please rank!
    Options: yes / sometimes / never / collaborative vis., never heard of it
    Response: Yes; if well prepared, it can be useful.
Do you see a need for immersive visualization solutions (multi-channel stereoscopic displays with head tracking devices and data gloves, caves)? Please comment!
    Options: yes, I am interested in such solutions / nice to have / no
    Response: Yes; the French National Computing Center is looking for projects to justify new equipment (bench display in particular).

 

Constraints / Past Experience

Questions Ranking and Comments
What kind of hardware platforms is your modelling group using for visualization of "PRISM-type" models? Give a list ordered by descending frequency of usage.
    Ranking scale: 1 (most preferred) to 4 (least used)
    Response: Linux, Unix
What kind of software packages are you using for visualization? Give a list ordered by descending frequency of usage.
    Ranking scale: 1 (most preferred) to 4 (least used)
    Response: Ferret, IDL, CDAT, GrADS
Are you visualizing or presenting your results by using 3D graphic packages, or have your results been presented that way? Please rank!
    Options: quite often / yes, for the analysis of complex data / from time to time, just to advertise my project to others / never
    Response: Yes, used as technical demo
Do you need high definition display quality, close to reality? (Resolutions close to 3k x 3k pixels, 48-bit RGB, full scene anti-aliasing, various lighting schemes and high definition shading.) Please mark and comment!
    Options: yes / good workstation is fine for me / midrange PC is okay / don't know
    Any other comment you would like to provide?
        PRISM WP4 could inform participants of the minimum equipment to buy.
        Should be enough, especially with graphics cards coming from the games industry.
How large is the greatest set of data on which you envisage working during a visualization session? Please give a rough estimate for the following properties and the amount of data in GB!
    Properties: largest number of grid cells / output frequency / number of timesteps / number of variables / total amount of data (GB)
    Response:
        (720*360*33 (not regular, for the ocean model) * number of variables
         + 180*90*55 (regular, for the atmospheric model and vegetation model) * number of variables)
        * time steps
        Currently 0.1 GB per time step.
        Soon ~4 GB when full resolution is used.
How would you rate the portion of computer resources which is used by manipulating and processing your data for preparing the visualization? Do you feel compute and memory bound during your visualization? Please comment!
    Options: Significant, could be faster / Processing and visualization are equal / Load is due to the display part of the visualization / Feel no constraints during the visualization / Don't know
    Response: Processing is still the main resource consumer in our daily activity. Use of CF conventions with XML encapsulation (see multiple files as a single logical file) will help us a lot. Once that is done, the visualization part should be easier.
Several graphic packages allow functionality to be added by coding scripts or procedures. Are you familiar with the following script or programming languages? Please rank your expertise with high/medium/low/none!
    Unix shell: High
    C: High
    C++: Low
    Java: Low
    Python: Medium
    IDL: High
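
The data-volume estimate given in the response above, (720*360*33 * number of variables + 180*90*55 * number of variables) * time steps, can be turned into a small calculator. This is a sketch under stated assumptions: the grid sizes come from the response, but the 4 bytes per value, the variable counts and the number of time steps are hypothetical inputs that the response does not specify.

```python
def bytes_per_time_step(n_ocean_vars, n_atmos_vars, bytes_per_value=4):
    """Size in bytes of one output time step for the two grids in the response."""
    ocean_cells = 720 * 360 * 33  # ocean grid (not regular), from the response
    atmos_cells = 180 * 90 * 55   # atmosphere/vegetation grid, from the response
    values = ocean_cells * n_ocean_vars + atmos_cells * n_atmos_vars
    return values * bytes_per_value


def total_gigabytes(n_ocean_vars, n_atmos_vars, n_time_steps, bytes_per_value=4):
    """Total volume in GB (10**9 bytes) for a run of `n_time_steps` steps."""
    per_step = bytes_per_time_step(n_ocean_vars, n_atmos_vars, bytes_per_value)
    return per_step * n_time_steps / 1e9


# Hypothetical inputs: one variable on each grid, 4-byte values, 12 time steps.
per_step = bytes_per_time_step(1, 1)
total = total_gigabytes(1, 1, 12)
```

With these assumed inputs a single time step holds 9,444,600 values, or about 0.038 GB; the real figure scales linearly with the number of variables written on each grid.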



Last Modified: 09 July 2002
Patrick Brockmann for IPSL