Working with TUFLOW Outputs#

Most of the common TUFLOW outputs are supported by pytuflow. This includes: TPC, XMDF, and NCGrid. Static, single result outputs such as TIF and ASC are also supported.

Output classes all derive from the same base class, and as such, share a common interface. This makes the accessing data from the outputs consistent and mostly format agnostic. There are some slight differences between a time-series output, which uses IDs, and a map output, which uses coordinates. However, the method names and usage are the same.

The examples below use models from the TUFLOW Example Model Dataset.

TPC#

The TPC class can be loaded by initialising the class with the path to the .tpc file, or by using the TCF.tpc() method. The below example uses results from EG15_001.tcf.

>>> from pytuflow import TPC
>>> tpc = TPC('path/to/results/plot/EG15_001.tpc')

or

>>> from pytuflow import TCF
>>> tcf = TCF('path/to/EG15_001.tcf')
>>> tpc = tcf.context().tpc()

Result Information#

Information can be obtained from the results, such as the IDs, result types, and the time steps using TPC.ids(), TPC.data_types(), and TPC.times() methods respectively.

>>> tpc.ids()
['FC01.1_R.1', 'FC01.1_R.2', 'FC01.2_R.1', ..., 'Pipe8', 'Pipe9']

>>> tpc.data_types()
['water level',
 'mass balance',
 'node flow regime',
 'flow',
 'velocity',
 'channel entry losses',
 'channel additional losses',
 'channel exit losses',
 'channel flow regime']

>>> tpc.times()
[0.0, 0.016666666666666666, 0.03333333333333333, ..., 2.9833333333333334, 3.0]

These methods also accept an optionals filter_by argument, which can be used to filter the return data by a given domain, geometry, data type, or ID. For example, to get the IDs only for 2D domain (PO) results:

>>> tpc.ids(filter_by='po')
[]

In this case, there are no 2D results in the TPC file, so an empty list is returned. We can also filter by geometry, and we can conbine the geometry filter with the domain filter:

>>> tpc.ids('1d/line')
['FC01.1_R', 'FC01.2_R', 'FC04.1_C', 'Pipe0', 'Pipe1', ..., 'Pit8', 'Pit9']

>>> tpc.ids('channel')
['FC01.1_R', 'FC01.2_R', 'FC04.1_C', 'Pipe0', 'Pipe1', ..., 'Pit8', 'Pit9']

The "channel" filter is a shorthand for "1d/line" since channels are only a 1D type. A similar shorthand exists for "1d/point" and "node".

Time-Series#

Time-series data can be accessed using the TPC.time_series() method:

>>> tpc.time_series('Pipe1', 'flow')
          channel/flow/Pipe1
time
0.000000               0.000
0.016667               0.009
0.033333               0.040
0.050000               0.075
0.066667               0.106
...                      ...
2.933333               0.000
2.950000               0.000
2.966667               0.000
2.983334               0.000
3.000000               0.000

Since a given ID could exist in multiple domains, for example, a 1D node, a 2D PO point, and a RL point could all have the same name (TUFLOW allows this), the return DataFrame header will include the domain, result type, and ID in the column name.

It’s also possible to pass in a list of IDs and/or result types to the TPC.time_series() method to get multiple time-series at once:

>>> tpc.time_series(['Pipe1', 'Pipe2'], ['flow', 'velocity'])
          channel/flow/Pipe1  channel/flow/Pipe2  channel/velocity/Pipe1  channel/velocity/Pipe2
time
0.000000               0.000               0.000                   0.000                   0.000
0.016667               0.009               0.005                   0.510                   0.456
0.033333               0.040               0.014                   0.740                   0.567
0.050000               0.075               0.021                   0.875                   0.632
0.066667               0.106               0.029                   0.966                   0.681
...                      ...                 ...                     ...                     ...
2.933333               0.000               0.000                   0.000                   0.000
2.950000               0.000               0.000                   0.000                   0.000
2.966667               0.000               0.000                   0.000                   0.000
2.983334               0.000               0.000                   0.000                   0.000
3.000000               0.000               0.000                   0.000                   0.000

Section#

TPC section data returns a long section from the given channel ID to either the outlet of the connected channels, or if a second channel ID is provided, to that channel.

>>> tpc.section('Pipe1', 'h', 1.)
    branch_id channel      node  offset        h
0           0   Pipe1      Pit2     0.0  43.7653
6           0   Pipe1      Pit3    26.6  43.7654
1           0  Pipe19      Pit3    26.6  43.7654
7           0  Pipe19     Pit16    58.3  43.7652
2           0   Pipe5     Pit16    58.3  43.7652
8           0   Pipe5     Pit15    94.8  43.7652
3           0   Pipe6     Pit15    94.8  43.7652
9           0   Pipe6     Pit14   126.2  43.7654
4           0  Pipe15     Pit14   126.2  43.7654
10          0  Pipe15     Pit13   140.0  43.7653
5           0  Pipe16     Pit13   140.0  43.7653
11          0  Pipe16  Pipe16.2   212.8  43.7648

In the example above, we use the well known short-hand "h" for the "water level" result type. pytuflow accepts well known short-hands for result types, and it’s worth nothing that the column name in the returned DataFrame will be set based on the result type the user provided. For example, in the example above, "h" is provided and the column name is set to "h". If the user provided "water level", then column would be set to "water level". This is also true for the TPC.time_series().

A flow trace downstream could branch into multiple channels that go in different directions, the TPC.section() method will return data for all branches. The branch_id column is used to identify the branch. If the data is used for plotting, the branch_id can be used to group the data.

XMDF#

The XMDF class is used to read TUFLOW XMDF map outputs (.xmdf files). Other map output formats, such as the NetCDF grid format (NCGrid) and the TUFLOW FV NetCDF mesh output (NCMesh) share similar methods and the examples below are transferable to those formats.

Similar to the TPC class, the XMDF class can be loaded by initialising the class with the path to the .xmdf file. Unlike the TPC class, the TCF class does not have a method to load the XMDF automatically. The reason for this, is that the .tpc output is always created by TUFLOW, whereas the .xmdf output is optional. It is very easy to obtain the path to the .xmdf file from your TUFLOW model. The example below uses results from EG00_001.tcf.

>>> from pytuflow import TCF, XMDF
>>> tcf = TCF('path/to/EG00_001.tcf')
>>> xmxdf_path = tcf.context().output_folder_2d() / f'{tcf.context().output_name()}.xmdf'
>>> xmdf = NCGrid(xmxdf_path)

Result Information#

Information, such as result types and time steps, can be obtained using XMDF.data_types() and XMDF.times() methods respectively.

>>> xmdf.data_types()
['water level',
 'depth',
 'velocity',
 'z0',
 'max water level',
 'max depth',
 'max velocity',
 'max z0',
 'tmax water level']

>>> xmdf.times()
[0.0, 0.08333333333333333, 0.16666666666666666, ..., 2.9166666666666665, 3.0]

It’s possible to filter the return data by whether the result type is temporal/static and/or scalar/vector.

>>> xmdf.data_types(filter_by='temporal')
['water level', 'depth', 'velocity', 'z0']

>>> xmdf.data_types(filter_by='vector')
['velocity', 'max velocity']

>>> xmdf.data_types(filter_by='static/scalar')
['max water level', 'max depth', 'max z0', 'tmax water level']

Time-Series#

The XMDF.time_series() method is very similar to the TPC.time_series() method, except that it takes a spatial location (coordinates) instead of an ID. The coordinates can be a tuple (x, y) coordinate, a WKT string "POINT (x y)", or a file path to a GIS point file (e.g. .shp) containing one or more points.

In the example below, we will use the gis\2d_po_EG02_010_P.shp file from the TUFLOW example model dataset.

>>> xmdf.time_series('./gis/2d_po_EG02_010_P.shp', 'water level')
          PO_01/water level  PO_02/water level
time
0.000000                NaN          36.500000
0.083333                NaN          36.483509
0.166667                NaN          36.457958
0.250000                NaN          36.441391
0.333333                NaN          36.431271
0.416667                NaN          36.426140
0.500000                NaN          36.423336
0.583333                NaN          36.421467
0.666667          40.110428          36.420143
...                  ...                   ...
2.833333          42.804726          38.509300
2.916667          42.793350          38.429859
3.000000          42.781895          38.342941

Section#

The XMDF.section() extracts a cross-section from the results at a given time, from a given polyline. The polyline can be a series of coordinates, a WKT string, or a path to a GIS polyline file.

The example below uses the gis\2d_po_EG02_010_L.shp file from the TUFLOW example model dataset.

>>> xmdf.section('./gis/2d_po_EG02_010_L.shp', 'water level', 1.)
        PO_01                   PO_02
           offset water level      offset water level
    0    0.000000         NaN    0.000000         NaN
    1    0.223653         NaN    0.378194         NaN
    2    2.947713         NaN    3.266278         NaN
    3    7.948157         NaN    8.285645         NaN
    4   12.948420         NaN   12.927794         NaN
    ..        ...         ...         ...         ...
    69        NaN         NaN  309.448305         NaN
    70        NaN         NaN  314.467647         NaN
    71        NaN         NaN  319.486920         NaN
    72        NaN         NaN  323.888631         NaN
    73        NaN         NaN  325.780984         NaN

Note, that the returned DataFrame does not use a common index, as the section data comes from different polylines. The printed DataFrame is truncated and does contain valid values within the truncated section. The first PO line PO_01 is shorter than the second PO line PO_02, so the last rows are NaN for the first PO line.

Surface#

It’s also possible to extract the 2D surface from map outputs at a given time step. With the surface data, it’s possible to perform all sorts of custom analyses such as differences between results such as statistics, or histograms.

An example extracting the maximum velocity surface:

>>> df = xmdf.surface('max velocity', time=-1)  # time value can be any value for static datasets such as maximums
>>> df
               x            y  value  active
0      292940.043  6177590.372    0.0   False
1      292938.904  6177585.504    0.0   False
2      292934.036  6177586.643    0.0   False
3      292935.174  6177591.511    0.0   False
4      292944.912  6177589.234    0.0   False
...           ...          ...    ...     ...
20944  293595.735  6178417.774    0.0   False
20945  293590.866  6178418.913    0.0   False
20946  293600.603  6178416.635    0.0   False
20947  293605.472  6178415.496    0.0   False
20948  293610.340  6178414.357    0.0   False

We can count the number of cells in each velocity range as defined by the user:

>>> import numpy as np
>>> bins = [0, 0.5, 1.0, 2.5, 5.0, 20.]
>>> mask = df['active']  # the active (wet) mask
>>> counts, _ = np.histogram(df.loc[mask, 'value'], bins)
>>> counts
array([4368, 1820, 2030,    5,    0])

The above calculation tells us that there are 5 cells with a maximum velocity between 2.5 m/s and 5.0 m/s and no cells with a maximum velocity above 5.0 m/s.

Surface Differences#

A similar calculation could be performed on a difference surface to see how many cells exceed certain thresholds. As an example, using results from the example model dataset EG16_~s1~_001.tcf which contains two scenarios - EXG and D01. It is assumed that both scenarios have been run and the results are available.

First load the results and extract the maximum water level surfaces for both scenarios. Because we are working with maximum water level, which is a static result, we can use any time value when extracting the surface. Here we use time=-1 to indicate that it is a static result.

>>> exg = XMDF('/path/to/EG16_EXG_001.xmdf')
>>> d01 = XMDF('/path/to/EG16_D01_001.xmdf')

>>> exg_wl = exg.surface('max water level', time=-1)
>>> d01_wl = d01.surface('max water level', time=-1)

We also need to consider inactive cells in either scenario. In this case, we will treat inactive cells as having the same elevation as the bed level. So for this, we also need to extract the surface for the bed level and then set the inactive cells to the bed level:

>>> exg_z = exg.surface('bed level', time=-1)
>>> d01_z = d01.surface('bed level', time=-1)

>>> exg_wl.loc[~exg_wl['active'], 'value'] = exg_z.loc[~exg_wl['active'], 'value']
>>> d01_wl.loc[~d01_wl['active'], 'value'] = d01_z.loc[~d01_wl['active'], 'value']

Now we perform the difference and then count the number of cells that exceed certain thresholds. We will exclude every cell that is inactive in both scenarios so that we are only comparing wet cells in at least one scenario.

>>> mask = exg_wl['active'] | d01_wl['active'] # combined mask

>>> diff_wl = d01_wl.loc[mask, 'value'] - exg_wl.loc[mask, 'value'] # difference

>>> bins = [
      min(diff_wl.min() - 0.1, -5),
      -2.5, -1.0, -0.5, -0.25, -0.1, 0, 0.1, 0.25, 0.5, 1.0, 2.5,
      max(diff_wl.max(), 5)
]

>>> counts, _ = np.histogram(diff_wl, bins)
>>> counts
array([   0,    3,    5,   62,  110,  915, 3580, 2425,    6,   13,    0,
    975])

It’s also possible to visualise this with matplotlib:

>>> import matplotlib.pyplot as plt
>>> plt.hist(diff_wl, bins)
>>> plt.show()
Histogram of water level differences

In the above histogram, we can see that there are a significant number of cells with differences above 2 metres. These may be on the development site and should not be considered an impact. We can mask these areas by converting the returned DataFrame to a GeoDataFrame and masking the area using a polygon shapefile of the development site.

>>> import geopandas as gpd
>>> gdf = gpd.GeoDataFrame(
      diff_wl,
      geometry=gpd.points_from_xy(exg_wl.loc[mask, 'x'], exg_wl.loc[mask, 'y'])
)
>>> shp = gpd.read_file('/path/to/site_boundary.shp')
>>> inters = gdf.geometry.interesects(shp.geometry.iloc[0])  # assuming only one polygon in shapefile
>>> masked_diff = diff_wl.loc[~inters]
>>> plt.hist(masked_diff, bins)
>>> plt.show()
Histogram of water level differences off site