Releases: ecmwf-ifs/loki

v0.2.7

04 Oct 13:07
477c56d

What's New

  • Experimental Fortran-to-CUDA transpilation demonstrated on CLOUDSC (#328)
  • A new SplitReadWriteTransformation that allows user-guided GPU optimisation to make loads independent from stores (#329)
  • A new LowerConstantArrayIndices transformation to pass full arrays instead of constant slices in kernel calls (#348)
  • New transformation utilities to introduce loop blocking for driver loops (#362)
  • A new string-based substitution mechanism for expressions (#366)
  • Refactoring of SCC tests (#353) and transformation utilities (#354)
  • And many small improvements and bug fixes (see below)
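
As a conceptual illustration of what loop blocking (#362) means, the following Python sketch splits a single loop into an outer loop over blocks and an inner loop within each block. It does not use Loki's API and is not how the transformation utilities are implemented; the names `blocked_ranges` and `sum_blocked` are hypothetical.

```python
def blocked_ranges(n, block_size):
    """Yield (start, end) index pairs that cover range(n) in blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for start in range(0, n, block_size):
        # The last block may be shorter than block_size
        yield start, min(start + block_size, n)


def sum_blocked(values, block_size):
    """Sum `values` using an outer block loop and an inner element loop."""
    total = 0
    for start, end in blocked_ranges(len(values), block_size):
        for i in range(start, end):
            total += values[i]
    return total
```

The blocked loop visits the same indices in the same order as the original loop, so the result is unchanged; only the loop structure differs.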

All Changes

  • IR: Automatic sanitisation of tuples in IR constructors by @mlange05 in #350
  • Run pytest on macos in GH actions by @reuterbal in #262
  • SCC test reshuffle by @mlange05 in #353
  • Transformations: Move common SCC utility routines to utilities by @mlange05 in #354
  • Transformations: Test and fix corner case in get_local_arrays by @mlange05 in #355
  • Tools: Disable timeout utility test on MacOS due to sporadic failures by @mlange05 in #356
  • Fixed logical evaluation of PRESENT intrinsics on Array variables by @JoeffreyLegaux in #341
  • ecWAM regression tests: switch to develop-1.3 branch by @awnawab in #358
  • Split reads and writes for certain accumulation patterns by @awnawab in #329
  • fix for 'resolve_vector_notation' utility by @MichaelSt98 in #361
  • Transformations: Internalise IdemTransformation by @mlange05 in #360
  • New transformation 'LowerConstantArrayIndices' to allow to … by @MichaelSt98 in #348
  • OMNI: Fix dimension range-indexing in frontend by @mlange05 in #363
  • Loki-transform: Pass cuf option to FilewriteTrafo by @mlange05 in #364
  • Filter out globals in get_local_arrays by @awnawab in #370
  • extend hoist variables functionality by @MichaelSt98 in #357
  • Change/fix pipeline for mode 'scc-raw-stack' by @MichaelSt98 in #371
  • Minimal padding in pool allocator by @awnawab in #365
  • CLOUDSC low-level GPU (transpilation) via Loki (CUF/CUDA) by @MichaelSt98 in #328
  • Loop splitting/blocking of block loops by @wertysas in #362
  • String-based expression substitution and moar expression tests! by @mlange05 in #366
  • SCC: Add vectorisation annotations in SCCRevector and translate in SCCAnnotate by @mlange05 in #359
  • Update VERSION to 0.2.7 by @reuterbal in #381

Full Changelog: v0.2.6...v0.2.7

v0.2.6

26 Jul 13:09
190bdfa

This is a minor release with a number of housekeeping changes and some new features.

What's new

  • Loki has depended on the Pydantic 1.x releases until now; this release adds support for Pydantic 2. The next release will require Pydantic 2. (#349)
  • The InlineTransformation now allows inlining of statement functions (#345)
  • A new LoopUnrollTransformation allows explicit unrolling of pragma-annotated loops (#347)
  • The Loki IR now supports the FORALL statement and construct. However, this feature is fully supported only with the Fparser2 frontend (#210)
  • Cray pointers are now represented in the Loki IR as Intrinsic nodes (#342)
  • Python package installation now also works correctly from tarballs and other non-git-versioned installation sources (#344)
  • The test base has been cleaned up: all regression tests now use publicly available source branches, and all tests should now create temporary files in test-local temporary directories to avoid littering the source tree (#335, #343)

Full Changelog: v0.2.5...v0.2.6

v0.2.5

24 Jun 08:15
a39336b

A minor release adding new transformations and fixing issues in the frontends, handling of derived types, dataflow analysis and transformations.

What's New

  • A general BlockIndexInjectTransformation that injects the block-index into all array subscripts that have a local rank one less than their declared rank (#303)
  • A corresponding, IFS-specific BlockViewToFieldViewTransformation to replace per-block view pointers with full field pointers (#303)
  • A new SCCRawStackPipeline that uses a pool-allocator variant where each use of temporaries is replaced with fixed offsets into a pre-allocated scratch memory (#314, incorporating #201 by @rolfhm)
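
The idea of replacing temporaries with fixed offsets into pre-allocated scratch memory can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the SCCRawStackPipeline implementation: the function `assign_offsets` and the temporary names `ztmp1`, `ztmp2`, `ztmp3` are hypothetical, and alignment and padding are ignored.

```python
def assign_offsets(sizes):
    """Map each temporary name to a fixed offset into one scratch buffer.

    `sizes` maps a (hypothetical) temporary name to its element count.
    Returns the offset table and the total scratch size required.
    """
    offsets = {}
    cursor = 0
    for name, size in sizes.items():
        offsets[name] = cursor
        cursor += size
    return offsets, cursor


sizes = {"ztmp1": 8, "ztmp2": 4, "ztmp3": 16}
offsets, total = assign_offsets(sizes)
scratch = [0.0] * total  # one pre-allocated scratch array

# Writing element 2 of `ztmp2` becomes a write at a fixed scratch offset
scratch[offsets["ztmp2"] + 2] = 1.5
```

Because every offset is known before the kernel runs, no per-temporary allocation is needed at run time.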

All Changes

  • Block-index injection transformations by @awnawab in #303
  • Fix parse failures with REGEX frontend due to white space in declarations by @reuterbal in #323
  • DataFlowAnalysis bug fixes by @awnawab in #320
  • Fix derived type inheritance when parent type is not available (#330) by @reuterbal in #331
  • InlineTransformation: Update Scheduler SGraph if marked_inline is activated by @awnawab in #322
  • HoistVariablesAnalysis: remove unused explicit interfaces after inlining by @awnawab in #319
  • Fix Linter warnings for inline calls with interface block imported from header with func.h suffix by @reuterbal in #332
  • Add transformation generated imports to driver or after inlining by @awnawab in #321
  • Fix wrong classification as StatementFunction in translation to Loki IR by @reuterbal in #327
  • get_pragma_parameters: Fix parsing clauses without parentheses in the tail string by @reuterbal in #324
  • ProgramUnit.resolve_typebound_var: raise error if top-level parent is not declared by @reuterbal in #325
  • Transformations: SCCRawStackPipeline and SCC config-from-file by @mlange05 in #314

Full Changelog: v0.2.4...v0.2.5

v0.2.4

28 May 12:35
f3e7d90

This is a minor maintenance release matching the declaration of Hybrid 2024 Milestone 1.

What's Changed

  • Repo reorganisation: Moving transformations by @mlange05 in #296
  • Fix: import of private symbols affects the type inference by @quepas in #308
  • JIT compilation updates and compatibility with f90wrap v0.2.14 by @reuterbal in #315
  • IR: Fix get_pragma_params for multiline pragmas by @mlange05 in #313
  • Transformations: Remap declaration symbols and adjust imports when inlining by @mlange05 in #311
  • Docs: Update to links from static doc pages by @mlange05 in #312

Full Changelog: v0.2.3...v0.2.4

v0.2.3

30 Apr 15:48
102dafd

This is a minor bugfix/maintenance release to resolve some issues around the Loki installation and version number discovery, particularly when installing from a code version that is not under Git version control.

Full Changelog: v0.2.2...v0.2.3

v0.2.2

26 Apr 07:02
5673795

This is a feature and bugfix release, which adds new functionality and resolves a number of problems.

What's New

  • Loki supports a new, streamlined way of composing transformation pipelines from individual Transformation classes. Transformation arguments are shared among transformations, ensuring consistency, e.g., for Dimension parameters. Pipelines and transformation arguments can even be constructed purely from the config file, which will become the default for the loki-transform.py convert command in the future. See #217 for more details on how this works.
  • The pool allocator transformation has a new option to improve compatibility with Cray Compiler Environment 16 on AMD platforms. For that, the pointer arithmetic is removed and LOC calls are used directly in the kernel to determine the offset of a temporary in the scratch allocation. See #231 for more details.
  • A new RemoveCodeTransformation has been added, replacing the RemoveCallsTransformation and incorporating the dead code removal. Additionally, it provides a new feature to remove pragma-annotated code sections via !$loki remove / !$loki end remove (#276).
  • Loki's JIT functionality, which is used to build and run tests, now honours environment variables and no longer depends exclusively on gfortran. The environment variables CC, FC, F90, and LD are inspected to determine the compile commands to use, and CFLAGS, FCFLAGS, F90FLAGS, and LDFLAGS can be used to set the corresponding flags. Default values are provided for GNU and NVHPC compilers. With this, it is now also possible to run the test suite on macOS after installing gcc and gfortran (e.g., via Homebrew) and setting the environment variables accordingly. Note that NumPy's F2PY, which is used to call Fortran routines from the Python test base, also works with non-GNU compilers (e.g., NVHPC) but requires gcc to compile the C interface routines. Also, not all tests are compatible with NVHPC; these test failures are a known issue that will be resolved in the future (#301). See #294 for more details.
  • The parse_expr utility has been expanded to support derived types and now underpins the get_pragma_parameters utility, providing vastly expanded functionality for expressions in pragma annotations (#292).
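
The environment-variable lookup described for the JIT functionality can be pictured with a short Python sketch. It only illustrates the general pattern of reading CC, FC, F90, LD and the matching flag variables with fallbacks; the function `compiler_settings` and the default values shown are assumptions made for this example and may differ from what Loki actually does.

```python
import os

# Hypothetical defaults for illustration; Loki's actual defaults may differ
DEFAULTS = {"CC": "gcc", "FC": "gfortran", "F90": "gfortran", "LD": "gfortran"}
FLAG_VARS = ("CFLAGS", "FCFLAGS", "F90FLAGS", "LDFLAGS")


def compiler_settings(env=None):
    """Collect compile commands and flags from environment variables."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    for name in FLAG_VARS:
        # Flags are split on whitespace; an unset variable yields an empty list
        settings[name] = env.get(name, "").split()
    return settings
```

Passing a plain dictionary instead of the real environment, for example `compiler_settings({"FC": "nvfortran", "FCFLAGS": "-O2 -g"})`, makes the lookup easy to try out without changing the shell environment.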

What's Changed

  • [CMake] Expose GLOBAL_VAR_OFFLOAD and INCLUDES in loki_transform_target by @awnawab in #264
  • Preserve imported statement functions by @awnawab in #251
  • Fix codecov by adding CODECOV_TOKEN by @reuterbal in #278
  • cgen: multiconditional/switch/select case statement by @MichaelSt98 in #267
  • Introducing the Pipeline class by @mlange05 in #260
  • Alternative stack/pool allocator implementation based on Cray pointers compatible with Cray+AMD stack by @MichaelSt98 in #231
  • improved replace_intrinsics and added rename_variables by @MichaelSt98 in #266
  • Revert "DEPENDENCY TRAFO: statement functions included via c-style imports preserved" (#251) by @reuterbal in #282
  • cgen: return type and var for function(s) by @MichaelSt98 in #269
  • Pipeline configuration from file by @mlange05 in #271
  • Fixing nested associate scope-parentage tracking after inlining by @mlange05 in #281
  • F2C: DeReferenceTrafo by @MichaelSt98 in #273
  • REGEX frontend: white space and nesting bugfix by @reuterbal in #274
  • Preserve import statement functions - take II by @awnawab in #283
  • Skip driver routine in GlobalVariableAnalysis by @awnawab in #265
  • MaskedTransformer: Fix in-place rebuilding of scoped nodes by @mlange05 in #284
  • Avoid variable_map in TypedSymbol.get_derived_type_member and verify type information is derived correctly by @reuterbal in #285
  • SCCHoist: hoist inline call temporaries and don't hoist statically declared arrays by @awnawab in #268
  • Pool allocator: correctly resolve derived type member as block dimension and ignore pointer/allocatable arrays by @awnawab in #249
  • Marked region removal and general code removal transformation by @mlange05 in #276
  • SCC: make vertical dimension optional by @awnawab in #270
  • SCCBaseTransformation.get_integer_variable now also checks module imports by @awnawab in #279
  • Improve performance of pragma-region attach/detach by using transformers by @mlange05 in #286
  • Reorganising test directories by @mlange05 in #287
  • [Bugfix] available_frontends: Import pytest locally to make dependency optional by @reuterbal in #290
  • DataflowAnalysis bugfix: preserve body nesting in visit_MaskedStatement by @awnawab in #288
  • Loki expression parser based on pymbolic parser by @MichaelSt98 in #272
  • F2C: optional case-sensitivity for variables/symbols by @MichaelSt98 in #277
  • Transformation to hoist temporaries in kernel language transpilation by @MichaelSt98 in #291
  • fix scoping for global var hoisting by @MichaelSt98 in #293
  • SCC: Support for bounds aliases and derived type members as bounds by @awnawab in #250
  • Consistent, environment-configurable use of Compiler class in JIT compilation by @reuterbal in #294
  • Derived-type inheritance by @awnawab in #295
  • Improve parse_expr and use in process_dimension_pragmas by @MichaelSt98 in #292

Full Changelog: v0.2.1...v0.2.2

v0.2.1

27 Mar 19:56
0a15267

This is a bugfix release that contains a number of small fixes in transformations and Scheduler.

What's New

  • Utility methods have been added to CallStatement that simplify inspecting and validating keyword arguments and converting them to positional arguments (see #235)
  • The batch-processing module loki.bulk has been renamed to loki.batch
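
Converting keyword arguments to positional arguments amounts to reordering them according to the callee's argument list. The sketch below shows that idea on plain Python data. It does not use the CallStatement methods from #235, whose names and signatures are not given here; `kwargs_to_positional` is a hypothetical helper.

```python
def kwargs_to_positional(arg_names, args, kwargs):
    """Return a purely positional argument tuple for a call.

    `arg_names` lists the callee's dummy argument names in declaration order,
    `args` are the positional arguments already given, and `kwargs` maps
    keyword names to values.
    """
    remaining = arg_names[len(args):]
    unknown = set(kwargs) - set(remaining)
    if unknown:
        raise ValueError(f"Unexpected or duplicate keywords: {sorted(unknown)}")
    result = list(args)
    for name in remaining:
        if name not in kwargs:
            # Stop at the first missing argument: later keywords cannot be
            # made positional without leaving a gap
            break
        result.append(kwargs[name])
    if len(result) != len(args) + len(kwargs):
        raise ValueError("Keyword arguments leave a gap in the argument list")
    return tuple(result)
```

The validation steps reject keywords that do not match a remaining argument and calls where an omitted optional argument would leave a gap in the positional list.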

What's Changed

  • kwargs utilities by @MichaelSt98 in #235
  • Allow to ignore specific dimensions in "shift to zero indexing" by @MichaelSt98 in #236
  • Add 'reverse_traversal=True' to DerivedTypeArgumentsTransformation manifest by @MichaelSt98 in #238
  • Create a pid-specific temporary directory and clean it up at the end by @reuterbal in #261
  • SCC-HOIST: Hoist variables as kwargs (optionally) by @MichaelSt98 in #237
  • GlobalVarHoistTransformation: fix for functions/inline calls by @MichaelSt98 in #240
  • Support colon notation for all dimensions in flatten_arrays by @MichaelSt98 in #239
  • Small CMake layer fixes for SL by @awnawab in #248
  • Rename bulk->batch and create ir sub-package by @mlange05 in #258
  • SingleColumn: Demote arrays that are not used at all in the body by @mlange05 in #259
  • Scheduler: Fix handling of external module procedures by @reuterbal in #263

Full Changelog: v0.2.0...v0.2.1

v0.2.0

22 Mar 15:17
d013229

This release contains a rewrite of Loki's Scheduler, which is responsible for planning and executing batch transformations across complex source trees. Compared to the original implementation, it is more flexible, enables handling of more dependency types (data dependencies, type dependencies as well as control flow dependencies) and is faster. Additional capabilities of pruning the dependency graph have been added as part of this, and the new discovery mechanism may make changes to Scheduler config files necessary. See the expanded documentation section for more details.
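
To make the notions of a dependency graph and graph pruning concrete, here is a small standard-library Python sketch. It is not Loki's Scheduler and does not reflect its configuration format; the routine names, the `prune` helper and the `ignored` set are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical call graph: each routine maps to the routines it depends on
dependencies = {
    "driver": {"kernel_a", "kernel_b"},
    "kernel_a": {"util"},
    "kernel_b": {"util", "logging_mod"},
    "util": set(),
    "logging_mod": set(),
}


def prune(graph, ignored):
    """Drop ignored nodes and any edges that point to them."""
    return {
        node: {dep for dep in deps if dep not in ignored}
        for node, deps in graph.items()
        if node not in ignored
    }


pruned = prune(dependencies, ignored={"logging_mod"})
# static_order() lists dependencies before the routines that use them
order = list(TopologicalSorter(pruned).static_order())
```

Processing items in this order means every routine is handled after the routines it depends on; iterating over the reversed list gives the opposite traversal.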

No other changes are included in this release. In case of problems, we advise reporting them as issues and staying on v0.1.7 in the meantime.

Full Changelog: v0.1.7...v0.2.0

v0.1.7

21 Mar 10:12
3fe6ce2

This is the final release of the v0.1 version of Loki. The new v0.2 will become available soon and use a rewritten Scheduler implementation for batch processing.

Most changes in this release are bugfixes, minor improvements or preparatory work for the new Scheduler integration. See below for the full list.

What's new

  • The file parsing speed has been improved, which should make the processing of large source trees significantly faster (#229, #241, #242, #245).
  • A new set of coding standards checks has been added, corresponding to the new IFS Arpège coding standards. This includes only three rules currently but will be expanded going forward (#247).
  • The pool allocator transformation now inserts the argument for the scratch space as integer variables instead of a dedicated derived type. This was found to avoid an allocation on NVIDIA GPUs and to yield performance improvements (#214).

What's Changed

  • Create separate ModuleWrapTransformation from DependencyTransformation by @reuterbal in #197
  • Transformation configuration and SchedulerConfig update by @mlange05 in #191
  • Pragma-driven subroutine inlining and associated utilities by @mlange05 in #198
  • introduce 'flatten_arrays()' (to overcome pointer hack) by @MichaelSt98 in #199
  • Array shadowing bugfix for inliner by @skarppinen in #202
  • Fix for SCC-HOIST, regarding wrong (hoisted) argument(s) (indexing) i… by @MichaelSt98 in #206
  • Refactored Global Variable Offload by @reuterbal in #207
  • Fixes to minor issues related to SCC HOIST by @skarppinen in #211
  • introduce flag to allow removing all derived types by @MichaelSt98 in #215
  • CMake: Fix DERIVE_ARGUMENT_SHAPE_ARRAY and argument handling by @mlange05 in #217
  • Recursive inlining via InlineTransform and associated fixes by @mlange05 in #205
  • Minor fixes to frontends and IR nodes by @reuterbal in #212
  • Stack/Pool Allocator: pass stack as integer(s) by @MichaelSt98 in #214
  • Frontend: Fix bug for multi-line pragmas and add short test by @mlange05 in #221
  • assumed size array handling for 'normalize_array_shape_and_access' by @MichaelSt98 in #218
  • Nested derived type bug fix by @rolfhm in #192
  • C/C++ De/Reference by @MichaelSt98 in #223
  • allow F2C transpilation using c_ptr (switch for old behaviour or new … by @MichaelSt98 in #219
  • Github actions: Only load SSH_KEY from secrets if available by @reuterbal in #228
  • f2c transpile via convert by @MichaelSt98 in #208
  • Global var hoisting by @MichaelSt98 in #226
  • Allow for limiting resolve_sequence_association to procedures that are inlined in loki-transform convert by @skarppinen in #225
  • Performance optimisations for frontend parsing/sanitising by @mlange05 in #229
  • Enabling SCC-Stack for EC-physics, part 2 by @mlange05 in #222
  • Frontend Transformer optimisation by @mlange05 in #241
  • FParser: Perform in-place scope attachment during parse by @mlange05 in #242
  • Subroutine: Only clone symbol when inferring from allocatable by @mlange05 in #245
  • Loki-lint: First 3 rules for new IFS-Arpège coding standards by @reuterbal in #247

Full Changelog: v0.1.6...v0.1.7

v0.1.6

08 Dec 16:15
e71171a

This release primarily contains bugfix and maintenance changes and a few new features. It is intended as a stable basis before a set of breaking changes is made for the next release. These will primarily relate to Scheduler behaviour and a new, consolidated config file format.

What's new

  • A new utility by @rolfhm allows resolving sequence association (#173)
  • The disable property in the Scheduler config now allows wildcards/simple patterns (#194)
  • A new utility by @skarppinen allows extracting internal subprograms from procedures and converting them into standalone procedures (#181)
  • Transformation classes now have static properties that define how the scheduler should traverse the dependency graph, e.g., forward/reverse traversal, recursion into contained scopes, or file graph traversal (#154)
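
Wildcard matching of the kind now accepted in the disable list (#194) can be illustrated with Python's standard `fnmatch` module. This is a sketch of the general technique only; whether Loki uses `fnmatch` internally is an assumption, and the helper `is_disabled` and the routine names are hypothetical. Names are lower-cased before matching because Fortran identifiers are case-insensitive.

```python
from fnmatch import fnmatchcase


def is_disabled(name, disable_patterns):
    """Return True if `name` matches any pattern, ignoring case."""
    lowered = name.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in disable_patterns)
```

For example, `is_disabled("abor1", ["abor*"])` matches through the trailing wildcard, while a name that fits none of the patterns is left enabled.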

The full list of changes:

  • Fix vector section trimming in driver loop by @awnawab in #169
  • TransformInline: Fix rescoping in expression substitution by @mlange05 in #170
  • Fix deep-cloning of subroutines and modules (fix #174) by @mlange05 in #175
  • Sourcefile does not have a filepath by @joscao in #177
  • InlineMember: rename duplicate locals in the body by @rolfhm in #172
  • Extract/Improve Polyhedron class by @joscao in #178
  • Provide linear algebra utility by @joscao in #179
  • add 'kernel_only=True' to RemoveCallsTransformation by @MichaelSt98 in #185
  • Display dataflow analysis (if attached) in IR graph by @joscao in #183
  • Fix handling of empty files in frontends (fix #186) by @reuterbal in #187
  • Transformation utility to fix sequence association by @rolfhm in #173
  • Minor transformation fixes by @reuterbal in #188
  • Small parsing fixes by @awnawab in #190
  • pool_allocator handles range indices by @rolfhm in #193
  • Generic enrichment process by @reuterbal in #189
  • Allow wildcards in disable list for scheduler by @mlange05 in #194
  • Utility transformation for creating standalone subroutines from contained subroutines by @skarppinen in #181
  • Static Transformation properties (manifest) by @mlange05 in #154

Full Changelog: v0.1.5...v0.1.6